Moving SciPy to the Meson build system
Published July 25, 2021
Let's start with an announcement: SciPy now builds with Meson on Linux, and the full test suite passes!
This is a pretty exciting milestone, and good news for SciPy maintainers and contributors - they can look forward to much faster builds and a more pleasant development experience. So how fast is it? Currently the build takes about 1min 50s (a ~4x improvement) on my 3 year old 12-core Intel CPU (i9-7920X @ 2.90GHz):
As you can see from the tracing results, building a single C++ file
(bsr.cxx, which implements one of SciPy's sparse matrix formats) takes over 90
seconds. So the 1min 50s build time is close to optimal - the only ways to improve it are major surgery on that C++ code, or buying a faster CPU.
Why move to Meson, and why now?
In Python 3.10 distutils will be deprecated and in Python 3.12 it will be
removed (see PEP 632). SciPy uses numpy.distutils and setuptools as its build
system, so the removal of distutils from the Python standard library would
have a major impact on SciPy - and even more on numpy.distutils, which
directly extends distutils. That PEP was written almost a year ago, and at
the time my first instinct was to wait until setuptools had integrated
distutils (which the setuptools maintainers still plan to do, with a cleaned
up API) and then update numpy.distutils for that change. That would also
require moving parts of numpy.distutils into setuptools, like Fortran
support. It has become clear though that this will be a really slow and
painful process - after almost a year, the vendored distutils still hasn't
been re-enabled inside setuptools.
The other driver was that SciPy development has become more painful over
time. SciPy contains C, C++, Fortran and Cython code, exports C and Cython
APIs, has complex dependencies (see diagram below) and does a lot of code
generation. The growing amount of C++ and Cython code has increased build
times. CI systems frequently time out because build plus test suite run takes
close to 60 minutes (the limit on several free-for-open-source CI offerings).
Working on compiled code is cumbersome and slow, to the point where a couple
of other maintainers have told me it's a blocker for them to work on new
features. Finally, debugging build issues is quite difficult - distutils
doesn't really have a design, so every other extension just monkey patches
things as needed. Earlier in the year I spent half a day hunting a build
problem through four (!) different projects that all modified how extensions
are built.
Those things together made me realize that it's time to move to a better
build system - if we are forced to do a lot of work because of PEP 632
anyway, then now is the time. That left the question: which build system?
Really there were only two viable candidates: CMake and Meson. After some
experimentation, I had a strong preference for Meson for two main reasons: it
has great documentation (unlike CMake), and it's
easy to contribute to (it is ~25,000 lines of clean Python code, compared to
~1,000,000 lines of C/C++ for CMake). Given that there are very few Python
projects using either CMake or Meson, good docs and an easy to understand
code base are very important - we're going to have to contribute! CMake has a
larger user base, as well as better Python integration than Meson via
scikit-build right now. But to me
that was less important. Also, scikit-build still depends on setuptools, and
one of the goals of this exercise is to get away from distutils/setuptools
completely and use only a modern build system and Python packaging standards
(i.e., pyproject.toml-based builds) to interact with pip and other Python
packaging tools.
So in February I wrote an RFC titled "switch to Meson as a build system", and after positive feedback started working on this project.
SciPy's external build and runtime dependencies. Vendored dependencies (e.g., SuperLU, ARPACK, large parts of Boost, Uarray, HiGHS, Qhull, PocketFFT, etc.) aren't shown.
Some of the benefits of building with Meson
Let's start with the biggest benefit: Meson is very fast.
Compare this with the
numpy.distutils based build:
So the Meson build is about 4x faster. The main reason numpy.distutils is
slow is that parallelism is limited - there are hard to fix race conditions
that prevent running build_ext completely in parallel. Another reason is
that it just goes around invoking compilers directly in a fairly ad-hoc
fashion, while Meson uses Ninja as a backend.
Ninja is about as fast as it gets.
What we can also see is that we now get a clean build log. This is part of
the reason it took a while to get to this point - every single compiler
warning was either silenced in the meson.build files or fixed in SciPy's
source code.
Better debugging of build issues
One very nice improvement is that you don't actually have to run the build
anymore to figure out how an extension is built exactly. To look at compiler
flags, include paths, and other information relevant to how a target is
built, simply run
meson introspect build/ -i --targets to obtain a JSON
representation like this:
This helps to quickly pinpoint mistakes in your meson.build files. In
addition, for debugging potential issues in Meson itself, the generated
build.ninja file is fairly readable too - its syntax is simple enough that
missing dependencies can be found easily (e.g., an extension cimports from
scipy.linalg.cython_linalg, and that dependency must be declared).
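To pull specific details out of that JSON programmatically, a short Python sketch along the following lines may help. It assumes the output of meson introspect build/ --targets is a list of target objects, each with a name and a target_sources list whose entries carry language, compiler, parameters and sources. The helper name compile_flags and the sample data are hypothetical, and are not taken from SciPy's build.

```python
import json


def compile_flags(targets, target_name):
    """Return {language: compiler parameters} for one target.

    `targets` is the parsed JSON list printed by `meson introspect --targets`.
    """
    for target in targets:
        if target["name"] == target_name:
            return {
                src["language"]: src["parameters"]
                for src in target.get("target_sources", [])
                if "language" in src
            }
    raise KeyError(f"no target named {target_name!r}")


# Hypothetical, heavily trimmed sample of the introspection output.
SAMPLE = json.loads("""
[
  {
    "name": "_example_ext",
    "type": "shared module",
    "target_sources": [
      {
        "language": "c",
        "compiler": ["cc"],
        "parameters": ["-Iscipy/example", "-O2", "-fPIC"],
        "sources": ["scipy/example/_example_ext.c"]
      }
    ]
  }
]
""")

flags = compile_flags(SAMPLE, "_example_ext")
# Keep only the include directories, stripping the leading "-I".
include_dirs = [p[2:] for p in flags["c"] if p.startswith("-I")]
```

In a real checkout the list would come from running the introspect command (for example through subprocess.run) and passing its standard output to json.loads, rather than from an inline sample.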
Other significant benefits include:
- Cross-compiling will become possible. For years we've told people "sorry, distutils wasn't really made for cross-compiling, let us know if you have any luck". As a result we've completely ignored users on some exotic platforms, and also spent a lot of time fighting with different CI systems to do native builds. For example, we will likely want to cross-compile to aarch64 rather than struggle with underpowered ARM hardware on Travis CI.
- Developers can use multiple builds at the same time. Because Meson builds are out-of-tree by design, it is now easy to have for example a GCC build, a Clang build, and a Python debug build in parallel in the same repo.
- More development tools work out of the box, e.g. ccache will just be picked up if it's installed, no configuration needed. I also managed to get AddressSanitizer to work with only a few small tweaks.
- Build definitions are easier to understand and modify. Not everything is easier, but common tasks like setting a compiler flag depending on some condition (like "the compiler supports this flag") certainly are:
Some key Meson design principles
Meson is a well-designed build system, and both the docs and a number of talks by Meson devs on YouTube do a great job at explaining that design. So I won't try and give a full picture here. However, there are a few things that are particularly important, and that I would have liked to fully grasp when I started on this project. So let's have a look at those.
Builds must be out-of-tree. Meson will not allow building inside your existing
source tree - and that includes code generation. There are good reasons for
this, but when coming from
distutils it may bite you. For example, if you
have a script to generate Cython
.pxd files (quite common in
SciPy), those files cannot be placed next to the template, they must go into
a build directory (the one we chose with
meson setup <builddir>).
Meson is not Turing-complete, and not extensible via APIs. What this means is that if something is not supported in Meson itself, you cannot just write a bit of Python code to hack around the problem. Instead, you must add it in Meson itself. If that takes too long, just fork Meson and use your fork until your feature is merged upstream. This seems painful, but it prevents people from copying hacks around from project to project while long-term maintainability deteriorates. Instead, the philosophy is to fix things once for all users.
All source files and targets must be listed explicitly. This means that if you build a single Python extension from 50 Fortran files, you must list all 50 file names. This is not really a problem in practice, but it can be a little verbose. Using a few helper snippets to generate file listings in IPython saves time:
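One possible helper of this kind is sketched below. It is not necessarily the snippet used for SciPy; the function name meson_source_list, the default *.f pattern and the demo file names are all assumptions made for illustration. It collects matching file names from a directory and formats them as a Meson list that can be pasted into a meson.build file.

```python
import tempfile
from pathlib import Path


def meson_source_list(directory, pattern="*.f", indent="  "):
    """Format the files matching `pattern` in `directory` as a Meson list."""
    names = sorted(p.name for p in Path(directory).glob(pattern))
    if not names:
        return "[]"
    body = "\n".join(f"{indent}'{name}'," for name in names)
    return f"[\n{body}\n]"


# Demo on a throwaway directory holding two Fortran files and one other file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.f", "a.f", "notes.txt"):
        (Path(tmp) / name).touch()
    listing = meson_source_list(tmp)

print(listing)
```

Sorting the names keeps the generated listing stable between runs, which makes later diffs of meson.build easier to review.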
Meson has an imperative DSL with immutable objects. The syntax of the DSL is
inspired by Python, and is easy to read. Objects - which can be dependencies,
generated files, built extensions, etc. - being immutable makes things easy
to debug, but in some cases it can restrict what you can do. For example,
this code for building all
scipy.sparse.csgraph extensions is quite
elegant, but the
foreach pattern will not work if the
.pyx files are
generated. This is because they'd then be objects returned by the targets that generate them,
and there is no syntax to give them unique names within the
foreach loop -
this cost me some time to figure out:
Meson uses pkg-config for discovering dependencies. This means certain
things "just work". E.g.,
blas = dependency('openblas') followed by using
dependencies: blas to build the
scipy.linalg._fblas extension worked on
the first try for me - definitely did not expect that.
There is one escape hatch to do things Meson does not natively support:
custom_target. As long as you list its source files and outputs, you can
invoke Python scripts through custom_target. This is how the SciPy build
invokes cython --cplus (Meson's Cython support is brand new, and only
supports targeting C if you give it a .pyx file directly).
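As an illustration of the kind of script a custom_target could call, here is a minimal Python sketch that assembles a cython --cplus command line. The function names and file names are hypothetical and this is not SciPy's actual wrapper; it only relies on the cython command-line options -3, --cplus and -o.

```python
import subprocess


def cython_command(pyx_path, out_path, cplus=True):
    """Build the argument list for one Cython invocation."""
    cmd = ["cython", "-3"]
    if cplus:
        # Generate C++ instead of C.
        cmd.append("--cplus")
    cmd += ["-o", str(out_path), str(pyx_path)]
    return cmd


def run_cython(pyx_path, out_path, cplus=True):
    """Run Cython and raise CalledProcessError if it fails."""
    subprocess.run(cython_command(pyx_path, out_path, cplus), check=True)


# Hypothetical input and output names, as a custom_target would pass them.
example = cython_command("_tools.pyx", "_tools.cpp")
```

Keeping the command construction separate from the subprocess call makes it easy to check the arguments without having Cython installed.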
The sharp bits
Not everything was smooth sailing. Here are a few of the most important issues I ran into:
- Cython + generated sources. This was by far the most time-consuming topic. Cython's cimport mechanism relies on the complete source tree. For example, not having an __init__.py file two directories up from where the .pyx file that you are building lives changes the C code that Cython generates. The full source tree layout matters for Cython. Cython was largely designed around in-tree builds, and Meson only allows you to generate files out-of-tree. This necessitates hacks like writing scripts or invoking cp to copy over many .pxi/.pxd files. And then more hacks to ensure those files are respected as dependencies by Ninja.
- SciPy is a tangled mess - there are 17 scipy.xxx submodules and almost all of them depend on each other. So we could only start running tests after the Meson build was close to 100% complete. We have had issues in SciPy with import cycles before, and I'm now surprised we didn't have more issues in the past.
- One feature I missed in Meson: installing generated files with py3.install_sources is not allowed. One can request installation through custom_target instead, but it is a bit hacky to make that recognize the correct Python-specific install directory, and it's then not possible to generate say 10 files and only install two of those (I ran into this multiple times, typically with Cython again where we need the .pxi files at build time, and the .pxd files also at runtime).
- You must pass --prefix to meson setup. If you don't, Meson just ignores which Python interpreter you built with and simply installs to the default /usr/local/lib/, asking for elevated permissions if needed. Meson doesn't know it's building a Python package (we've just told it in the project definition that we're building C, C++, Cython and Fortran code) so from the Meson perspective this is normal. It's not a major issue, however I have forgotten to add --prefix=$PWD/installdir often enough that it's a point of friction. The plan is to solve this by writing a dev.py CLI wrapper, similar to SciPy's existing developer scripts.
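To make the --prefix point from the last item concrete, here is a rough Python sketch of what the core of such a wrapper might look like. The function name, defaults and paths are hypothetical, since the dev.py wrapper does not exist yet. The idea is only that the prefix is always filled in, so it cannot be forgotten.

```python
from pathlib import Path


def meson_setup_command(build_dir="build", install_dir="installdir", cwd=None):
    """Return a `meson setup` command that always carries a local --prefix."""
    root = Path(cwd) if cwd is not None else Path.cwd()
    # Equivalent to --prefix=$PWD/installdir when run from the repository root.
    prefix = root / install_dir
    return ["meson", "setup", str(build_dir), f"--prefix={prefix}"]


# Hypothetical repository location, used only to show the resulting command.
example = meson_setup_command(cwd="/src/scipy")
```

A real wrapper would pass this list to subprocess.run and probably add further subcommands for building, installing and running tests.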
Next steps for SciPy and beyond
Right now the Meson support lives in my fork of SciPy. If you'd like to play with it, see rgommers/scipy/MESON_BUILD.md. There's a lot left to do before we can declare victory and make Meson the default build system. The most important topics are:
- support for macOS (should be easy) and Windows (probably tricky),
- implementing support for MKL and other BLAS/LAPACK libraries in Meson,
- making the BLAS/LAPACK library to use user-configurable, and
- building wheels and sdists.
When I started this endeavour, I wasn't yet completely sure using Meson was
going to work out. However, I did know that if building SciPy was successful,
Meson would work for pretty much any other scientific Python project. We have
about two years left before distutils disappears, and my hope is that we
can retire numpy.distutils over that time frame. Projects which don't have
complex build requirements (i.e. just Cython, or maybe a few C extensions)
should be fine to simply use
setuptools. And those projects which do have
complex builds can move to Meson, or to scikit-build if they prefer CMake.
There are a lot of technical reasons to argue for or against using any build
system. However, why I'm convinced Meson is a great choice for both SciPy and
other projects is: Meson is a pleasure to use. It "feels right". That's a
rare thing to say about a build system - I've certainly not heard anyone say
that about setuptools. Nor about CMake. We have had
two other alternative build systems before in NumPy and SciPy: first NumScons
(based on Scons) and then Bento (based on Waf). Both were created by David
Cournapeau; NumScons in 2008 and Bento in 2011.
Bento/Waf was most similar to Meson, and it made working on SciPy so much
better that I maintained it until 2018 - but it never made the transition to
Python 3. Yes, I stayed with Python 2.7 until 2018 specifically so I could
use Bento rather than distutils - nothing in Python 3.x was interesting
enough that it was worth dealing with distutils on a daily basis.
Unfortunately Waf was not a stable enough project to build on, and Bento was
a one-person project. Meson is even better than Waf and is well-maintained,
so we can finally have a single nice build system.
This project has consumed a lot of nights and weekends to get it to this
point. I wouldn't have gotten this far in five months without some very
important contributions though. Dylan Baker,
one of the Meson maintainers, implemented Cython support in Meson
impressively quickly after my "Treating Cython like a
language?" proposal. That
was a real life saver - using
custom_target for every Cython extension
would have been very tedious. My colleague Gagandeep
Singh added Meson support for some of the
largest SciPy modules (including sparse.linalg). Smit Lunagariya, who
started an internship with SciPy earlier this month, added support for the
misc submodules, and is fixing build warnings and bugs in
SciPy that the Meson build made visible. Jussi
Pakkanen (the Meson project leader), offered
encouraging advice early on. Thibault
Saunier implemented preliminary support for
PEP 621 in
mesonpep517. Frederik Rietdijk sent
me a PR to demonstrate building SciPy with Meson and Nix. And many fellow
SciPy maintainers and others gave positive feedback on my proposals to try
adopting Meson. In short, it takes a village to switch build systems - thanks
to everyone who is participating in this effort!