Improving SymPy's Documentation

Published October 31, 2023

asmeurer

Aaron Meurer

The Chan Zuckerberg Initiative (CZI) has funded SymPy as part of Cycle 4 of its Essential Open Source Software for Science (EOSS) grant program. As part of this work, Aaron Meurer worked on improving the SymPy documentation, with a focus on writing new narrative documentation guides.

SymPy is a mature project, and has over 1000 functions and classes. Most of these functions and classes have API reference documentation in the form of docstrings, but the SymPy documentation has historically been lacking in long-form narrative documentation to supplement these reference docs.

In this post, I will go over some of the key documentation improvements that were made over the course of the 2-year grant period. Note that the documentation improvements were only one part of the CZI grant to improve SymPy. Other SymPy developers were funded to improve the performance of SymPy, and to improve its code generation capabilities.

Documentation Survey

To start the project, we ran a short survey of the SymPy community from November 29, 2021 to January 5, 2022 to get a feel for SymPy's documentation needs. We had three takeaways.

One - the main SymPy documentation site (https://docs.sympy.org) is overwhelmingly the most popular resource that people use to get help with SymPy. This is true across all levels of experience, compared to other resources like the SymPy website, StackOverflow, and community sites (note: this survey was given in 2021, before the popularity of LLM tools like ChatGPT, so this was not included as an option for respondents). Consequently, we decided that it would be most impactful to spend efforts on improving the documentation site over those other resources.

Plot showing results of SymPy documentation question "which of the following resources do you use when getting help with SymPy (select all that apply)?" The choice "SymPy Docs Website" has the most responses at 91%.

Two - survey respondents identified many deficiencies in the SymPy docs which made it clear that certain improvements needed to be made to the overall layout and organization of SymPy's documentation site. In particular, we identified 4 major improvements that could be made:

  • Better top-level organization.
  • A better Sphinx theme that provides better sidebar navigation.
  • The docs have many large pages which would benefit from being split into smaller pages.
  • There were several issues with the SymPy Live extension. This was a Sphinx extension that allowed users to execute the example code blocks in the SymPy documentation directly in their browser.

Of these, all except the third, splitting large pages, were done as part of the CZI grant work. Splitting large pages hasn't been done yet due to technical difficulties with the Sphinx autodoc extension, as well as due to the fact that the large pages are now much easier to navigate with the new Sphinx theme.

The SymPy Live extension in the documentation was removed, as it was considered too much of a maintenance burden for the SymPy community. There is a new SymPy Live shell that runs on JupyterLite, which means it runs entirely in the browser using Pyodide. We are hopeful that the JupyterLite community can come up with an equivalent SymPy Live-like extension so that we can re-enable similar functionality in the SymPy documentation.

Three - we were able to identify some primary areas to prioritize for writing new documentation guides.

The full survey results are available for anyone who wishes to read them.

Improved Sphinx Theme - Furo

Prior to this project, the SymPy documentation used the "classic" Sphinx theme. This is the same theme that is used by the official Python documentation, but it is outdated in many ways. It lacks interactive navigation. As can be seen from the screenshot below, the location of the page in the context of the rest of the documentation is only shown by a small breadcrumb at the top of the page. The layout of the subheadings on the page is given by a table of contents on the left side of the page, but this is hard to navigate. SymPy's green color scheme, while giving the docs a distinctive flavor, had poor contrast in some places such as the left sidebar, making it difficult or impossible to read for people with low vision. It cannot be seen from the screenshot, but the classic Sphinx theme does not work well on mobile (the components do not scale to smaller screen sizes at all), and it does not have native support for dark mode.

SymPy 1.7 ODE Module documentation page (May 2021)
Early 2021 SymPy 1.7 documentation page for the ODE submodule (courtesy WayBack Machine)

In order to pick a replacement theme, we ran a second survey from February 5-19, 2022, and the results of that survey have been summarized separately. The candidate themes were Read the Docs, PyData Sphinx Theme, Book, and Furo.

Based on the results of the survey, we decided to use the Furo theme. The Furo theme was ranked the highest by survey respondents. In particular, they liked the improved sidebar navigation, the dark mode, and mobile support. Additionally, Furo has good accessibility and the CSS is easy to customize.
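For readers unfamiliar with Sphinx theming, both the theme switch and the color overrides live in the project's conf.py. The fragment below is a minimal sketch and not SymPy's actual configuration: the hex values are placeholders picked for illustration, and it assumes the furo package is installed. Only html_theme, html_theme_options, the light_css_variables and dark_css_variables keys, and the color-brand-* variable names are taken from Furo's documented options.

```python
# conf.py (Sphinx configuration file) - illustrative sketch only.
# Assumes the theme is installed, e.g. with `pip install furo`.

# Select Furo instead of the "classic" theme.
html_theme = "furo"

# Furo exposes its colors as CSS variables that can be overridden
# separately for light mode and dark mode. The hex values below are
# placeholders, NOT the colors used on docs.sympy.org.
html_theme_options = {
    "light_css_variables": {
        "color-brand-primary": "#3b5526",
        "color-brand-content": "#3b5526",
    },
    "dark_css_variables": {
        "color-brand-primary": "#a8d08d",
        "color-brand-content": "#a8d08d",
    },
}
```

Keeping the overrides in html_theme_options means the light and dark palettes sit side by side, which makes it easier to check that every color has a counterpart in both modes.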

SymPy 1.12 ODE Module documentation page (October 2023)
Now using the Furo theme: ODE submodule page

The result is a documentation site that has navigable sidebars. We spent considerable time retheming the default Furo colors to match the traditional SymPy green theme. This included adding a dark mode set of colors (this can be accessed by clicking the circle icon at the top of a docs page, or by setting your device to use dark mode). We took care to make sure all color combinations used throughout the documentation were at least WCAG level AA color contrast so that text can be perceived by a wide audience of readers, including many with low vision. This included modifying the Pygments syntax highlighting styles to have better color contrast.
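To make the WCAG level AA requirement concrete, here is a small sketch of how a contrast check can be computed. It follows the WCAG 2.x definitions of relative luminance and contrast ratio, with the AA thresholds of 4.5:1 for normal text and 3:1 for large text. The function names are made up for this post, and this is not the tooling that was actually used to check the SymPy color scheme.

```python
def _linearize(channel_8bit):
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a color given as '#rrggbb'."""
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return (0.2126 * _linearize(r)
            + 0.7152 * _linearize(g)
            + 0.0722 * _linearize(b))


def contrast_ratio(foreground, background):
    """WCAG contrast ratio, from 1 (no contrast) up to 21 (black on white)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground, background, large_text=False):
    """True if the color pair meets the WCAG level AA contrast threshold."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold


# Black text on a white background has the maximum ratio of 21:1.
print(round(contrast_ratio("#000000", "#ffffff"), 2))
# Two nearly identical grays on white can fall on opposite sides of AA.
print(passes_aa("#767676", "#ffffff"), passes_aa("#777777", "#ffffff"))
```

A check like this is easy to run over every foreground and background pair in a theme, including the syntax highlighting colors.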

Improved Organization - Diátaxis

A related project was reworking the top-level organization of the documentation. The SymPy documentation main page used to just consist of a long list of every page in the documentation. The new Furo theme makes this list unnecessary, but it also gave us an opportunity to rethink how these pages could be arranged in a more logical way.

We decided to adopt the Diátaxis framework for documentation organization. Diátaxis splits documentation pages into one of four categories, depending on whether the reader is interested in practical or theoretical knowledge, and on whether they have study or work oriented goals. As can be seen from the current documentation main page, the docs are now organized into four categories: tutorials, how-to guides, explanations, and API reference. We additionally added "installation" and "contributing" as separate top-level categories. Installation is important enough to warrant calling out documentation for it separately. Contribution documentation is separate because it serves a separate audience, people who want to contribute to SymPy, rather than people who are interested in using it.

Main page of the SymPy docs (v1.12, October 2023)
New SymPy docs landing page

In addition to this, we reorganized the dozens of API reference pages into eight sub-categories: Basics, Code Generation, Logic, Matrices, Number Theory, Physics, Utilities, and Topics.

Contribution Documentation

One of the most important things an open source project can do to attract new contributors is to have good contributor documentation. SymPy has historically had a wealth of contributor documentation, but much of it was outdated. It was also stored on SymPy's wiki, which made it less accessible and harder to maintain in the context of SymPy's full documentation.

Consequently, we decided to move all contributor documentation from the wiki to the main SymPy docs. Additionally, we rewrote the new contributors guide to be more in line with modern SymPy contribution practices, and to reduce the parts that only explain details on how to use Git and GitHub, which are now explained better in other sources on the internet.

Live Documentation Previews on Pull Requests

A live documentation preview build was added to the SymPy CI so that people can easily view how the documentation changes in a pull request look as rendered HTML. While this is a relatively minor change compared to some of the other things mentioned here, it has made it significantly easier for SymPy developers to review documentation changes.

To view a preview of the documentation, reviewers just need to click the button in the status checks for the pull request:

Link saying "Click here to see a preview of the documentation." from a SymPy pull request CI checks listing

and they will be shown a rendered page like:

header on the page that says "This is a preview build from SymPy pull request #25512. It was built against a9765f6. If you aren't looking for a PR preview, go to the main SymPy documentation."

New Top-level Documentation Guides

In addition to these organizational cleanups, the project involved writing several new top-level guides: a deprecation policy, a guide on writing custom functions, a best practices guide, and a glossary.

New Deprecation Policy

SymPy as a symbolic mathematics system is designed not just as an interactive piece of software, but also as a library, which can be used as a dependency in other Python projects. Consequently, we in the SymPy community take backwards compatibility breakages in our API very seriously. Any time the API changes in a backwards incompatible way, downstream users of that API are forced to update their code before they can update SymPy, which can be disruptive.

Previously, SymPy's actual policies on backwards compatibility breaks were vague, and sometimes developers would make breaks that ended up being unnecessarily disruptive to SymPy's end-users. A new deprecation guide has been written that outlines a deprecation policy. This guide brings three new things to SymPy.

One - a clear policy on when backwards compatibility breaks should be made. The gist is that deprecations should be avoided, and only done if absolutely necessary. There is also now a policy that all such public compatibility breaks should come with a deprecation when possible, and this deprecation should last at least a year before being removed.

Two - a new SymPyDeprecationWarning class for deprecation warnings, which gives much more user-friendly warning messages. For example:


>>> import sympy.core.compatibility
<stdin>:1: SymPyDeprecationWarning:
The sympy.core.compatibility submodule is deprecated.
This module was only ever intended for internal use. Some of the functions
that were in this module are available from the top-level SymPy namespace,
i.e.,
from sympy import ordered, default_sort_key
The remaining were only intended for internal SymPy use and should not be used
by user code.
See https://docs.sympy.org/latest/explanation/active-deprecations.html#deprecated-sympy-core-compatibility
for details.
This has been deprecated since SymPy version 1.10. It
will be removed in a future version of SymPy.

These warning messages give detailed information on what is deprecated, what users can replace their code with, what version the deprecation was added in, and a link to an even more detailed page of deprecation explanations.

Three - all active deprecations are listed in a single page. This page gives more details about each deprecation than would be appropriate to put in the deprecation message, including details on why each deprecation was made. The page also gives helpful information on how to silence deprecation warnings.

Guide on Writing Custom Functions

SymPy comes with hundreds of mathematical functions built-in. But it also provides a standard mechanism for users to define their own custom functions. This is achieved by subclassing sympy.Function and defining various methods to specify the symbolic behavior. For example:


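from sympy import Function

class log(Function):
    """
    Simplified version of sympy.log that supports basic evaluation and
    differentiation.
    """
    @classmethod
    def eval(cls, x):
        # Automatically evaluate log(1) to 0; other inputs stay unevaluated
        if x == 1:
            return 0

    def fdiff(self, argindex=1):
        # Derivative with respect to the (only) argument
        return 1/self.args[0]


>>> import sympy
>>> x = sympy.Symbol('x')
>>> log(1)
0
>>> log(x).diff(x)
1/x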

SymPy now includes an extensive how-to guide on defining custom symbolic functions. The same mechanism is used to define the functions that are included in SymPy itself, so this guide serves both advanced end-users and SymPy developers looking to define or extend one of the functions that come with SymPy.
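To give a flavor of the other hooks a custom function can define, here is a second small sketch that adds a rewrite rule on top of evaluation and differentiation. The versine function, versin(x) = 1 - cos(x), is used as the example. Treat this as an illustration written for this post rather than as an excerpt from the guide; the guide itself covers more methods and more careful handling of special cases.

```python
from sympy import Function, Integer, Symbol, cos, sin


class versin(Function):
    """
    Versine, versin(x) = 1 - cos(x), written as a custom SymPy function.
    """

    @classmethod
    def eval(cls, x):
        # Only evaluate automatically in the trivial case versin(0) = 0.
        # Implicitly returning None leaves the expression unevaluated.
        if x.is_zero:
            return Integer(0)

    def fdiff(self, argindex=1):
        # d/dx versin(x) = sin(x); the chain rule is applied by SymPy.
        return sin(self.args[0])

    def _eval_rewrite_as_cos(self, x, **kwargs):
        # Used by versin(x).rewrite(cos)
        return 1 - cos(x)


x = Symbol("x")
print(versin(0))
print(versin(x).diff(x))
print(versin(x).rewrite(cos))
```

Each method teaches SymPy one fact about the function, so a class like this can start small and grow as more behavior is needed.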

Guide on SymPy Best Practices

SymPy has many pitfalls, both for new users and advanced users. The new guide on best practices goes over some ways to avoid these pitfalls.

For example, one pitfall that many new SymPy users run into is using strings as inputs to SymPy functions, like


>>> from sympy import expand
>>> expand("(x**2 + x)/x")
x + 1

It's much better to define symbolic variables and create expressions directly, like


>>> from sympy import symbols
>>> x = symbols('x')
>>> expand((x**2 + x)/x)
x + 1

The best practices page outlines why one should avoid string inputs, as well as dozens of other best practices surrounding both basic and advanced usage of SymPy.
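One concrete reason is that symbols created by parsing a string are new symbols that know nothing about the ones you have already defined. The sketch below is my own minimal illustration of this, not an excerpt from the best practices page. It defines x as positive and then compares an expression built directly with one parsed from a string.

```python
from sympy import sqrt, symbols, sympify

# A symbol with an assumption attached.
x = symbols("x", positive=True)

# Built directly: uses the x defined above, so sqrt(x**2) simplifies to x.
direct = sqrt(x**2)

# Parsed from a string: creates a *different* symbol named x that has no
# assumptions, so the square root cannot be simplified.
parsed = sympify("sqrt(x**2)")

print(direct)
print(parsed)
print(parsed.free_symbols == {x})
```

The two symbols print identically as x, which is what makes this kind of bug hard to spot when strings and symbolic expressions are mixed.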

Glossary of SymPy Terminology

As a technical library, SymPy makes use of many terms which have a specific meaning. If you are not already familiar with SymPy, you might not know what these specific terms mean, or not realize that they have a specific meaning in the context of SymPy.

For example, the term "solve" is often used generically in mathematics to refer to any sort of problem solving. But in the context of SymPy, "solve" always refers to the act of isolating a variable or set of variables in an equation, like "solve for x in x² = 1."
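In code, this sense of "solve" looks like the following short sketch, which uses SymPy's solve function on the equation from the example above.

```python
from sympy import Eq, solve, symbols

x = symbols("x")

# "Solve for x" in the SymPy sense: isolate x in the equation x**2 = 1.
solutions = solve(Eq(x**2, 1), x)
print(solutions)
```

The result is a list of the values of x that satisfy the equation, here -1 and 1.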

The new glossary page in the SymPy documentation defines various terms as used in the context of SymPy. This is useful as a standalone guide, and it also makes it easy for other pages in the SymPy documentation to cross-reference these terms so that readers can understand what they mean.
