The default Python at NBER is 2.7, which is what the plain python command runs. Python 3.0 is also available as python3.
Python Compilers
The NBER has three Python compilers available. Whenever you have a long-running Python job and are unwilling or unable to convert it to a compiled language such as C, C++, Java or Fortran, it is worthwhile to try compiling your Python code. Python was not designed for compilation, and achieving a dramatic speedup sometimes requires some additional tweaking of your code. However, speedups by factors of 5, 10 or even 50 have been achieved, so it is very often worth the effort. I know that computers are cheap and you are expensive, but your waiting for results is not cheap either.
PyPy
The pypy compiler is very easy to use. If your program is test.py then:
pypy test.py
will compile and execute it. You should visit the PyPy website for compatibility information.
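One rough way to check whether PyPy helps is to time a small CPU-bound script under both interpreters. The sketch below is only an illustration: the file name bench.py and the function count_primes are hypothetical and not part of any NBER setup.

```python
# bench.py (hypothetical name): a pure-Python, CPU-bound loop for rough timing
import time


def count_primes(limit):
    """Count the primes below limit by trial division (deliberately naive)."""
    count = 0
    for n in range(2, limit):
        is_prime = True
        d = 2
        while d * d <= n:
            if n % d == 0:
                is_prime = False
                break
            d += 1
        if is_prime:
            count += 1
    return count


# Time one call so the same file can be run under python and under pypy.
start = time.time()
result = count_primes(20000)
elapsed = time.time() - start
print("primes found: %d (%.3f seconds)" % (result, elapsed))
```

Running python bench.py and then pypy bench.py and comparing the reported times gives a first impression. Real programs may behave quite differently from a toy loop like this one.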
Numba
The numba compiler is also very easy to use. If your program is test.py then:
numba test.py
will compile and execute it. You should visit the Numba website for compatibility information.
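Numba generally compiles only the functions you mark with one of its decorators, so a script often needs a small change before it benefits. The following is a minimal sketch assuming the numba package is importable. The function sum_of_squares is a made-up example, and the fallback decorator is there only so that the script still runs where Numba is missing.

```python
# Minimal sketch: mark a numeric loop for compilation with Numba's njit decorator.
try:
    from numba import njit
except ImportError:
    # Assumption for illustration: fall back to plain Python if Numba is absent.
    def njit(func):
        return func


@njit
def sum_of_squares(n):
    """Return 0*0 + 1*1 + ... + (n-1)*(n-1) using an explicit loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


print(sum_of_squares(1000))
```

The first call includes the compilation time, so any speedup tends to show up on later calls or on larger inputs.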
Cython
Cython is a bit more complicated. The following instructions and testimonial are due to Drew Johnson, who has achieved remarkable results with it.
Cython is a way of converting Python code into C, which can then be compiled to create a much faster program. Cython is just the first step in the process of conversion: it takes a .pyx file and converts it into a .c file, which you then compile into a module with gcc and run from within yet another Python program. It sounds confusing, but it's not too bad once you get the hang of it.
The .pyx file is nothing complicated. It can be a standard Python file with a modified filename, or a file that is no longer valid plain Python because it uses some of Cython's own syntax to make the compiled program even faster. Be sure to use the .pyx extension, though! Although Cython can create .c files from standard .py files, some optimizations are turned off for files lacking the extension, so I would strongly recommend you include it.
Probably the best way to start with Cython is to take an existing program written in normal Python. Start with one that has a straightforward main() function that you're trying to run, perhaps this one (let's call it test.pyx):
def main():
    print("Hello, world!")

if __name__ == '__main__':
    main()
You can just run cython test.pyx now, and cython will spit out a test.c file that you can then compile with gcc. There are some important flags that you should be aware of, though, and my normal command looks something like this:
gcc -shared -pthread -fPIC -fwrapv -O3 -Wall -fno-strict-aliasing -I/usr/local/include/python2.7 -o test.so test.c
The -I option points gcc at the Python header files that the compilation needs; their location may vary from system to system and between Python versions. The ones at NBER are stored in a nonstandard location, so this path may require tweaking on other systems. Another important thing to consider: gcc has a number of optimization settings (the -O flag). It can be extremely helpful to experiment with these settings, and picking the right optimization level can make your program a lot faster.
So, after all of this configuration, gcc will spit out a module file (test.so, in this example) that contains a compiled version of your file. You can't run it quite yet, though: you have to treat this file like a library.
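If you are unsure which directory to pass to -I on another system, the interpreter itself can report where it expects its header files. This is a small sketch using the standard sysconfig module; the variable name include_dir is illustrative, and the headers are only present if the Python development files are installed.

```python
# Ask the running interpreter where its C header files (Python.h) are expected.
import sysconfig

include_dir = sysconfig.get_paths()["include"]

# Print the path in the form that gcc's -I option takes.
print("-I" + include_dir)
```

Run this with the same interpreter you plan to import the compiled module from, since each Python version has its own headers.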
I would run this command to transfer it to a folder that Python normally uses to store libraries:
mv test.so ~/.local/lib/python2.7/site-packages/
I normally put these three commands in a shell (.sh) script so that I can run them repeatedly and with just one or two keystrokes.
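The destination used in the mv command is the per-user site-packages directory for Python 2.7. As a small sketch, the standard site module can report that directory for whichever interpreter you run, which is handy if your Python version differs. The variable name user_site is just illustrative.

```python
# Show the per-user site-packages directory for the interpreter running this script.
import site

user_site = site.getusersitepackages()
print(user_site)
```

The directory may not exist yet on a fresh account, in which case it has to be created before the .so file is moved there.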
After this, you just need to create a small program that can call upon the Cythonized functions in your library. This can be something short and sweet, like:
import test
test.main()
And you're golden! You can now run any function that appears in the file test, and it'll go much faster than before (in almost all cases). If you change your test file, you have to remember to do the whole build process again including running Cython, compiling with gcc, and moving the .so file to the correct folder. This is why it's very useful to create a build script.
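For readers who would rather keep the build steps in Python than in a shell script, the same three commands can be sketched as below. This is an illustration only: the file name build_cython.py and the helpers build_commands and run_build are hypothetical, and the include and site-packages paths are simply copied from the commands above.

```python
# build_cython.py (hypothetical name): the cython / gcc / mv steps as one script.
import os
import subprocess

# Assumed locations, copied from the commands above; adjust for your system.
INCLUDE_DIR = "/usr/local/include/python2.7"
SITE_DIR = os.path.expanduser("~/.local/lib/python2.7/site-packages")


def build_commands(module, include_dir=INCLUDE_DIR, site_dir=SITE_DIR):
    """Return the three build commands for module.pyx as argument lists."""
    return [
        ["cython", module + ".pyx"],
        ["gcc", "-shared", "-pthread", "-fPIC", "-fwrapv", "-O3", "-Wall",
         "-fno-strict-aliasing", "-I" + include_dir,
         "-o", module + ".so", module + ".c"],
        ["mv", module + ".so", site_dir + "/"],
    ]


def run_build(module):
    """Run each build step in order, stopping at the first failure."""
    for command in build_commands(module):
        print(" ".join(command))
        subprocess.check_call(command)
```

Calling run_build("test") from the directory that holds test.pyx would repeat the whole process. Nothing is executed until you call it, so the file is safe to import and inspect first.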
This is Cython in its most basic form. There's a lot more to it, and there are a number of Cython-specific ways to generate more efficient C code from your initial .pyx file. The most important of these is typing your functions and variables. This declares at compile time what is going to be stored in them, which saves the program a lot of time otherwise spent working out what kind of data is in a variable at run time. Cython makes this pretty simple, but it's worth reading a more comprehensive guide than I can offer. A really nice guide can be found here.
Cython also plays nicely with Pandas and NumPy, and you can make your program substantially faster by taking advantage of this. I'd recommend reading this. Take to heart what they say about typing your np.ndarrays. This can often result in a speedup of 5x if your arrays are heavily used!
Typing is incredibly important when using Cython. Compiling plain Python code can speed your program up by 10-25%, but typing heavily used variables can result in even bigger improvements. I've had my code increase in speed by an order of magnitude.
You can also look into some of the Cython decorators if you need even bigger speedups. These disable some aspects of error checking and are good if you have mature code that you know is bug-free. They provide an even larger speedup but can cause unexpected and unexplained errors if your code has hidden bugs! Use them carefully, but they can be very helpful. Examples include:
@cython.boundscheck(False)  # turn off bounds-checking for entire function
@cython.wraparound(False)   # turn off negative index wrapping for entire function
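For context, these two decorators switch off checks that ordinary Python performs on every index operation. The snippet below is plain Python, not Cython, and only shows the behaviour that is being given up; the variable names are illustrative.

```python
# What ordinary Python does on every index operation.
values = [10, 20, 30]

# Negative index wrapping: -1 counts back from the end of the list.
last = values[-1]

# Bounds checking: an index past the end raises IndexError.
try:
    values[3]
    out_of_range_caught = False
except IndexError:
    out_of_range_caught = True

print("last=%s caught=%s" % (last, out_of_range_caught))
```

With the checks disabled in Cython, an out-of-range or negative index into a typed array is no longer caught or translated and may read or write the wrong memory, which is why the code should be known to be correct first.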
Addendum by drf
This script, cylg, will clobber main.py if you have one, but is otherwise an easy way to compile, link and execute a Python program.
#!/bin/csh
# Usage: cylg NAME   (builds NAME.pyx, then runs NAME.main())
cython $1.pyx
gcc -shared -pthread -fPIC -fwrapv -O3 -Wall -fno-strict-aliasing \
    -I/usr/local/include/python2.7 -o $1.so $1.c
/bin/mv $1.so ~/.local/lib/python2.7/site-packages/
# Allow the redirections below to overwrite an existing main.py
unset noclobber
echo >main.py "import $1"
echo >>main.py "$1.main()"
python main.py
Typically you would modify it for your own use.
For support, please email it-support@nber.org.