Faster numerical integration in SciPy

What do you actually have to do to speed up numerical integration in python (when using scipy.integrate)? It turns out, if you want to use Cython, the steps are fairly opaque; I'm not sure they're any easier for other methods (calling external C code directly, or using Numba), but I feel that Cython has the most pythonic look and feel of the three choices.

This article (and the associated GitLab repository) is the result of my constructing a minimal working example to do numerical integration in SciPy with the integrand specified in a compiled Cython function. The content below is reproduced as the README.md file in the repository. All four files referenced below can be found in the repository.

As for whether it is faster, I haven't tested it on a challenging integral yet...

Update 7.4.2020: we got an approximately 10x speedup over interpreted Python when using this for an oscillatory integral including Bessel functions: see below.

Calling Cython from SciPy

Perhaps you have noticed that SciPy offers low-level callback functions that let you use compiled code (e.g. from C, or via Numba or Cython) as, for example, the integrand in scipy.integrate.quad().

I was keen to help a colleague accelerate their numerical integrations, while keeping the project as high-level and pythonic as possible. Faced with the above options (C, Numba, Cython), I noticed that none of them were documented particularly well, but Cython was the one that seemed likely to result in the cleanest, most modular code. It also happens that I was familiar with Cython.

It took me a couple of hours to figure out how to get this to work properly, and with no real tutorial or simple advice to follow, I made a starter kit showing a minimal working example.

Note that I don't fully know all the jargon names given to bits of Cython. Frankly, I didn't find all the answers I needed in the Cython documentation and had to do some good old-fashioned guesswork and piecing things together. See the further reading below.

How does it work

Long story short, LowLevelCallable.from_cython() needs to be pointed at a compiled, C-style function to work. You can do that with C, Cython, or Numba. Of these, the Cython method seems the easiest, but is strangely the least documented.

The starter kit contains the following files:

  • test.pyx is a Cython file, which contains a C-style function called integrand(), defined using cdef (see Cython's language basics for an explanation of what cdef is). We want to turn this into C, compile it, and then later on use it to instantiate a scipy.LowLevelCallable instance so scipy.integrate can use that rather than working with interpreted python (which is slow).

  • test.pxd is a Cython "header" file (I don't know what the correct name is, but the Cython documentation uses the same analogy). When (I think) cythonize() in setup.py detects this, it knows it's building a Cython file we want to call from C and it should behave itself accordingly, exporting symbols that a C linker can use if need be. Otherwise, Cython does not let c-style functions (those defined with cdef) be called from outside the Cython file, e.g. from other C code.

    The corresponding python module then has the attribute __pyx_capi__, which is also needed for LowLevelCallable.from_cython() to work. To be honest, I couldn't find useful documentation about this, but I think this means that it has a 'parallel' C interface, as it were.

    This subdirectory of a repo by Ashwin Vishnu was useful, as was this article by Pierre de Buyl.

  • setup.py is the usual distutils thing, but it knows it's building a Cython module because of the call to cythonize() and thus detects the .pxd file (there is otherwise no relationship between them, no cimport or similar); I'm not entirely sure, though.

  • run.py uses the integrand() function from test.pyx to do some numerical integrals.

To try this out you will need: python 3, cython

  1. Run:

    $ python3 setup.py build_ext --inplace
    

    This builds the module from test.pyx (with symbols from test.pxd). Although running setup.py like this does the compilation for you, you can by the way see the C code generated in test.c, the function call starts something like:

    static double __pyx_f_4test_integrand(CYTHON_UNUSED int __pyx_v_n, double *__pyx_v_args)
    

    [CYTHON_UNUSED I think because the argument n is not used in integrand(), but the standard rubric for LowLevelCallable.from_cython() needs it]

    But note that it really churns out doubles, not some wacky higher-level type. This is really C at the end of the day.

  2. To test:

    $ python3 run.py
    Should be 0: 0
    Should be 4: 4
    Should be 21.333...: 21.3333
    

    Uses scipy.integrate.quad() to evaluate three numerical integrals. Note that I've tried to make it clear how to pass numerical coefficients from the python to the C code, so things don't need to be hardcoded.

Outstanding issues

I didn't quite figure out how to make this work with NumPy data types. But maybe that doesn't matter.

Update 7.4.2020: some tips on getting Cython to generate native C code

  1. If you employ float division, use the Cython decorator @cython.cdivision(True) before a cdef function:

    import cython
    
    ...
    
    # The decorator stops division by zero checking
    @cython.cdivision(True)
    cdef double C1(double t, double t_cut, double t_0):
    
        ... etc ...
    

    to avoid Python checking for a DivisionByZeroError. Note that you will need import cython to get access to this decorator.

  2. You can find Cython version of Scipy special functions in scipy.special.cython_special, for the Bessel function normally given by scipy.special.jv we needed to do:

    from scipy.special.cython_special cimport jv
    
  3. Similarly, Cython makes available C versions of many common C library functions (read more here):

    from libc.math cimport sqrt, exp, sin, cos
    
  4. Note that although Cython knows about the C99 complex numbers, I still haven't quite worked out how to do fast mathematics with them directly (but using the real and imaginary parts separately is ok).

  5. Keep checking the yellow-highlighted 'annotation' HTML file Cython generates, and don't panic if something is still yellow (meaning it may be calling interpreted Python still). Click the + to the left of the line number to see what is actually happening.

    Even after doing the above things, we were using

    return result.real
    

    which led to some yellow highlighting. But the underlying code contained the following preprocessor stuff:

    #if CYTHON_CCOMPLEX
      #ifdef __cplusplus
        #define __Pyx_CREAL(z) ((z).real())
        #define __Pyx_CIMAG(z) ((z).imag())
      #else
        #define __Pyx_CREAL(z) (__real__(z))
        #define __Pyx_CIMAG(z) (__imag__(z))
      #endif
    #else
        #define __Pyx_CREAL(z) ((z).real)
        #define __Pyx_CIMAG(z) ((z).imag)
    #endif
    

    Comparing this with the GCC documentation, it seems we are generating native C code after all, but Cython can't tell that.

Further reading