Faster numerical integration in SciPy

What do you actually have to do to speed up numerical integration in python (when using scipy.integrate)? It turns out, if you want to use Cython, the steps are fairly opaque; I’m not sure they’re any easier for other methods (calling external C code directly, or using Numba), but I feel that Cython has the most pythonic look and feel of the three choices.

This article (and the associated GitLab repository) is the result of my constructing a minimal working example to do numerical integration in SciPy with the integrand specified in a compiled Cython function. The content below is reproduced as the README.md file in the repository. All four files referenced below can be found in the repository.

As for whether it is faster, I haven’t tested it on a challenging integral yet…

Calling Cython from SciPy

Perhaps you have noticed that SciPy offers low-level callback functions that let you use compiled code (e.g. from C, or via Numba or Cython) as, for example, the integrand in scipy.integrate.quad().

I was keen to help a colleague accelerate their numerical integrations, while keeping the project as high-level and pythonic as possible. Faced with the above options (C, Numba, Cython), I noticed that none of them were documented particularly well, but Cython was the one that seemed likely to result in the cleanest, most modular code. It also happens that I was familiar with Cython.

It took me a couple of hours to figure out how to get this to work properly, and with no real tutorial or simple advice to follow, I made a starter kit showing a minimal working example.

Note that I don’t fully know all the jargon names given to bits of Cython. Frankly, I didn’t find all the answers I needed in the Cython documentation and had to do some good old-fashioned guesswork and piecing things together. See the further reading below.

How does it work

Long story short, LowLevelCallable.from_cython() needs to be pointed at a compiled, C-style function to work. You can do that with C, Cython, or Numba. Of these, the Cython method seems the easiest, but is strangely the least documented.

The starter kit contains the following files:

  • test.pyx is a Cython file, which contains a C-style function called integrand(), defined using cdef (see Cython’s language basics for an explanation of what cdef is). We want to turn this into C, compile it, and then later on use it to instantiate a scipy.LowLevelCallable instance so scipy.integrate can use that rather than working with interpreted python (which is slow).

  • test.pxd is a Cython “header” file (I don’t know what the correct name is, but the Cython documentation uses the same analogy). When (I think) cythonize() in setup.py detects this, it knows it’s building a Cython file we want to call from C and it should behave itself accordingly, exporting symbols that a C linker can use if need be. Otherwise, Cython does not let c-style functions (those defined with cdef) be called from outside the Cython file, e.g. from other C code.

    The corresponding python module then has the attribute __pyx_capi__, which is also needed for LowLevelCallable.from_cython() to work. To be honest, I couldn’t find useful documentation about this, but I think this means that it has a ‘parallel’ C interface, as it were.

    This subdirectory of a repo by Ashwin Vishnu was useful, as was this article by Pierre de Buyl.

  • setup.py is the usual distutils thing, but it knows it’s building a Cython module because of the call to cythonize() and thus detects the .pxd file (there is otherwise no relationship between them, no cimport or similar); I’m not entirely sure, though.

  • run.py uses the integrand() function from test.pyx to do some numerical integrals.

To try this out you will need: python 3, cython

  1. Run:

    $ python3 setup.py build_ext --inplace
    

    This builds the module from test.pyx (with symbols from test.pxd). Although running setup.py like this does the compilation for you, you can by the way see the C code generated in test.c, the function call starts something like:

    static double __pyx_f_4test_integrand(CYTHON_UNUSED int __pyx_v_n, double *__pyx_v_args)
    

    [CYTHON_UNUSED I think because the argument n is not used in integrand(), but the standard rubric for LowLevelCallable.from_cython() needs it]

    But note that it really churns out doubles, not some wacky higher-level type. This is really C at the end of the day.

  2. To test:

    $ python3 run.py
    Should be 0: 0
    Should be 4: 4
    Should be 21.333...: 21.3333
    

    Uses scipy.integrate.quad() to evaluate three numerical integrals. Note that I’ve tried to make it clear how to pass numerical coefficients from the python to the C code, so things don’t need to be hardcoded.

Outstanding issues

I didn’t quite figure out how to make this work with NumPy data types. But maybe that doesn’t matter.

Further reading