Manylinux1 is obsolete, manylinux2010 is almost EOL, what is next?
The basic installation format for users who install packages via pip is the
wheel format. Wheel names are composed of four parts: a
package-name-and-version tag (which can be further broken down), a Python tag,
an ABI tag, and a platform tag. More information on the tags can be found in
PEP 425. So a package like NumPy will be available on PyPI as
numpy-1.19.2-cp36-cp36m-win_amd64.whl for 64-bit Windows and
numpy-1.19.2-cp36-cp36m-macosx_10_9_x86_64.whl for macOS. Note that only the
platform tag, win_amd64 or macosx_10_9_x86_64, differs.
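To make the naming scheme concrete, here is a minimal sketch that splits a wheel filename into the tags described above. The names WheelName and parse_wheel_name are hypothetical helpers written for this post, not part of pip or any library, and the sketch assumes well-formed filenames, including the optional build tag that PEP 427 allows between the version and the Python tag.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    """The dash-separated components of a wheel filename (hypothetical helper)."""
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelName:
    """Split a wheel filename into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        # name-version-python-abi-platform (no build tag)
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        # name-version-build-python-abi-platform
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return WheelName(name, version, build, python_tag, abi_tag, platform_tag)


for wheel in (
    "numpy-1.19.2-cp36-cp36m-win_amd64.whl",
    "numpy-1.19.2-cp36-cp36m-macosx_10_9_x86_64.whl",
):
    parsed = parse_wheel_name(wheel)
    # Only the last field changes between the two files.
    print(parsed.python_tag, parsed.abi_tag, parsed.platform_tag)
```

For real tooling, the packaging library offers a more careful parser; the sketch only illustrates where each tag sits in the name.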
But what about Linux? There is no single, vendor-controlled "Linux platform": Ubuntu, Red Hat, Fedora, and Debian all package software at slightly different versions. What most Linux distributions do have in common is the glibc runtime library, and a smattering of various additional system libraries. So it is possible to define a least common denominator (LCD) of software expected to be on a Linux platform (exceptions apply, e.g. non-glibc distributions).
The decision to converge on an LCD common platform gave birth to the
manylinux1 standard. Going back to our example, NumPy is available as
numpy-1.19.2-cp36-cp36m-manylinux1_x86_64.whl.
The first manylinux standard, manylinux1, was based on CentOS5, which has been obsolete since March 2017. The subsequent manylinux2010 standard is based on CentOS6, which will hit end-of-life in December 2020. The manylinux2014 standard still has some breathing room: based on CentOS7, it will reach end-of-life in July 2024.
So what is next for manylinux, and which manylinux should users and package
maintainers use?
If manylinux1 is obsolete, why are there still manylinux1 wheels?
Wheels are typically consumed by Pip via pip install. Manylinux wheels are
used for projects that require compilation; otherwise they would ship
pure Python wheels with the "any" platform tag, meaning they are compatible with
any platform. So say you are a library author and want to make it convenient
for users to install your package. If you ship a manylinux2014 wheel, but the
version of Pip your users have is too old to support manylinux2014 wheels, Pip
will happily download the source package and compile it for them. Havoc ensues:
Windows users typically cannot compile, and prerequisites will be missing. Pip
has a --only-binary option to prevent it from downloading source code and
compiling, and a --prefer-binary option to prefer older binary packages over
compiling from source, but neither is on by default.
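As a rough illustration of the two options, the sketch below builds the pip command line for each policy without running it. The function name pip_install_command and its binary_policy parameter are hypothetical; the flags themselves, --only-binary (here with the :all: value, which applies it to every package) and --prefer-binary, are pip's own.

```python
import sys
from typing import List


def pip_install_command(package: str, binary_policy: str = "default") -> List[str]:
    """Build (but do not run) a pip install command for the given policy.

    binary_policy is a hypothetical knob for this sketch:
      "default" - pip's normal behaviour, may fall back to compiling
      "only"    - refuse source distributions entirely
      "prefer"  - take an older wheel rather than a newer source package
    """
    cmd = [sys.executable, "-m", "pip", "install"]
    if binary_policy == "only":
        cmd += ["--only-binary", ":all:"]
    elif binary_policy == "prefer":
        cmd.append("--prefer-binary")
    elif binary_policy != "default":
        raise ValueError(f"unknown binary policy: {binary_policy!r}")
    cmd.append(package)
    return cmd


# Show the arguments after the interpreter path for each policy.
for policy in ("default", "only", "prefer"):
    print(" ".join(pip_install_command("numpy", policy)[1:]))
```

Passing the resulting list to subprocess.run would perform the install; the sketch stops short of that so it has no side effects.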
Pip began supporting manylinux2010 wheels with version 19.0, released in January 2019. The version of Pip that is officially shipped with Python 3.6, via the ensurepip module, is version 18. Python 3.7 ships pip 20. It is easy enough to upgrade, but to be on the safe side and prevent havoc, library authors will ship a manylinux1 wheel for Python 3.6 support.
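The version cutoffs can be sketched as a small lookup. The 19.0 threshold for manylinux2010 comes from the paragraph above; the 8.1 and 19.3 thresholds for manylinux1 and manylinux2014 are not stated in this post and are filled in from pip's release history, so treat them as assumptions. MIN_PIP, version_tuple, and newest_supported are hypothetical names for this sketch.

```python
import re
from typing import Optional, Tuple

# Minimum pip (major, minor) able to install each manylinux flavour,
# ordered from oldest to newest. Only the manylinux2010 entry is taken
# from the text; the other two are assumptions from pip's changelog.
MIN_PIP = {
    "manylinux1": (8, 1),
    "manylinux2010": (19, 0),
    "manylinux2014": (19, 3),
}


def version_tuple(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '20.2.3' or '18'."""
    match = re.match(r"(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return int(match.group(1)), int(match.group(2) or 0)


def newest_supported(pip_version: str) -> Optional[str]:
    """Return the newest manylinux flavour this pip can install, if any."""
    have = version_tuple(pip_version)
    supported = [tag for tag, need in MIN_PIP.items() if have >= need]
    return supported[-1] if supported else None


for pip_version in ("18.1", "19.0", "20.2.3"):
    print(pip_version, "->", newest_supported(pip_version))
```

This is why a pip 18 user who is offered only a manylinux2010 or manylinux2014 wheel ends up with the source package instead.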
What happens now that Python 3.6 is falling out of favor?
Python 3.6 is no longer in active development. In fact, the scientific
Python community has decided to stop actively supporting Python 3.6 from
July 2020. So I would expect to see projects begin to drop the older
manylinux1 format, and drop support for Python 3.6, sooner rather than later,
meaning that manylinux2014 may soon become the only option for new versions.
What about Conda packages?
Conda does not use the same kind of wheel format provided by PyPI and Pip. Conda's
build system is internally consistent, and Conda packagers build a binary
package for each supported OS, thus they are not bound to the manylinux
designation. Conda does not have a declared policy around deprecating Python
3.6 yet. Conda does support pip (but try not to mix conda and pip usage!), and
the Pip provided should be version 20 or later. If needed, conda upgrade pip
should get you a modern version, so here too manylinux2014 will soon become
the only option.
What comes after manylinux2014?
The glibc used in manylinux2014 is defined as the one used by CentOS7. This OS
was released in June 2014. This manylinux standard, for the first time,
declared support for non-x86 hardware systems like ARM64 (aarch64), Power
(ppc64) and S390X. However, the ARM platform has grown greatly since 2014, and
glibc has moved from version 2.17 to 2.31, fixing many bugs. Since the real
driver for platform compatibility is glibc, PEP 600 defined a "perennial
manylinux spec" that is based on the glibc version number. A lot of work has
already taken place to support the next version. Now we need to take the dive:
decide what the base OS for the next manylinux tag will be, roll out a Docker
image and tooling around it, and convince library packaging teams to adopt it.
This is needed to allow libraries like NumPy to confidently use the glibc
routines fixed after 2014. For instance, this issue is preventing NumPy from
properly supporting np.float128 on Power and S390X.
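The glibc-based idea behind PEP 600 can be sketched in a few lines. Under that PEP a platform tag looks like manylinux_2_17_x86_64, and the older names are treated as aliases for fixed glibc versions (2.5, 2.12, and 2.17 for manylinux1, manylinux2010, and manylinux2014). The helpers required_glibc and is_compatible are hypothetical names, and the check is deliberately simplified: a real installer considers more than the glibc version and the architecture.

```python
import re
from typing import Tuple

# Legacy manylinux names and the glibc (major, minor) they map to in PEP 600.
LEGACY_ALIASES = {
    "manylinux1": (2, 5),
    "manylinux2010": (2, 12),
    "manylinux2014": (2, 17),
}


def required_glibc(platform_tag: str) -> Tuple[Tuple[int, int], str]:
    """Return ((glibc_major, glibc_minor), arch) for a manylinux platform tag."""
    match = re.fullmatch(r"manylinux_(\d+)_(\d+)_(.+)", platform_tag)
    if match:
        return (int(match.group(1)), int(match.group(2))), match.group(3)
    for legacy, glibc in LEGACY_ALIASES.items():
        prefix = legacy + "_"
        if platform_tag.startswith(prefix):
            return glibc, platform_tag[len(prefix):]
    raise ValueError(f"not a manylinux platform tag: {platform_tag!r}")


def is_compatible(platform_tag: str, system_glibc: Tuple[int, int],
                  system_arch: str) -> bool:
    """Simplified check: same architecture and a glibc at least as new."""
    needed, arch = required_glibc(platform_tag)
    return arch == system_arch and system_glibc >= needed


# A system with glibc 2.31 on aarch64 can use a manylinux2014 wheel,
# while a glibc 2.17 system cannot use a wheel built against glibc 2.24.
print(is_compatible("manylinux2014_aarch64", (2, 31), "aarch64"))
print(is_compatible("manylinux_2_24_x86_64", (2, 17), "x86_64"))
```

On a glibc-based Linux system, platform.libc_ver() from the standard library reports the local glibc version and could supply the system_glibc argument.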
What about non-x86 machines and Linux?
As mentioned before, starting with manylinux2014, pip and wheel support
non-x86 architectures like ARM64. Many packages are just now starting to roll
out support for these architectures, as the CI systems that support open source
projects (like TravisCI) have only recently made those platforms available.
It might be easier for users to adopt Conda and the conda-forge channel,
since conda-forge has support for non-x86 architectures today.
OK, so what is the bottom line?
- Use pip v20 or later to make it easier on library packagers: modern pip versions will take the latest manylinux package they can support and will be forward-compatible with the PEP 600 perennial manylinux standard.
- Manylinux1 and Python 3.6 are going away. Update your systems.
- For people looking to move PEP 600 forward, the next step is to dive into the auditwheel repo to define and support the next manylinux version.