Python is a versatile and powerful programming language known for its simplicity and readability. However, when it comes to performance-critical tasks, Python’s interpreted nature can sometimes be a limiting factor. While Python offers ease of use and rapid development, it may not always be the best choice for computationally intensive tasks.
This is where Cython comes into play. Cython is a superset of Python that allows you to write Python code that can be compiled to highly efficient C code. By using Cython, we can combine the best of both worlds - the simplicity and expressiveness of Python with the performance of low-level languages like C. Cython allows us to write code that can be compiled to C extensions, resulting in significant performance improvements.
In this article, we will explore how to leverage Cython to write fast, compiled extensions for Python, enabling us to boost performance in critical areas of our code. We will first review some features of Cython, and then we’ll walk through an example to demonstrate how you can write compiled extensions using Cython.
Important Features of Cython
Cython offers several features that make it a powerful tool for writing high-performance extensions for Python:
Static Type Declarations
By declaring static types for variables and function arguments, Cython enables the code to be compiled to efficient C or C++ code. This allows for better optimization and faster execution compared to dynamic typing in Python.
Direct C-Level Interoperability
Cython provides direct access to C libraries and data structures, allowing seamless integration with existing C or C++ codebases. This feature is particularly useful when working with performance-critical tasks that require low-level control.
Efficient Memory Management
Cython includes features for manual memory management, such as C-style memory allocation and deallocation. This allows developers to optimize memory usage and reduce overhead associated with Python’s garbage collector.
Easy Python Integration
Cython can seamlessly call Python functions and use Python libraries, making it easy to leverage existing Python code. This feature allows developers to write performance-critical sections of code in Cython while utilizing the extensive ecosystem of Python libraries.
Support for Python Debugging
Cython supports Python’s debugging capabilities, including exceptions, tracebacks, and profiling tools. This allows developers to benefit from Python’s powerful debugging ecosystem while working with Cython code.
Comparing Speed Between Python and Cython
To illustrate the performance difference between Python and Cython, let’s consider a simple example where we calculate the sum of squares of numbers from 1 to N.
First you need to install the Cython package via the following pip command:
pip install Cython
Next we import the cython module and define two functions, one written in plain Python (sum_of_squares_vanilla) and the other using Cython's static typing (sum_of_squares_cython). We’ll save this code in a .pyx file named sum_of_squares.pyx.
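import cython

def sum_of_squares_vanilla(n):
    # Untyped: every operation works on Python objects.
    result = 0
    for i in range(n):
        result += i ** 2
    return result

cpdef sum_of_squares_cython(int n):
    # long long (at least 64 bits) rather than int: for n = 10000 the sum
    # is about 3.3e11, which does not fit in a 32-bit C int.
    cdef long long result = 0
    cdef long long i
    for i in range(n):
        result += i * i
    return result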
Let’s break down the Cython implementation in the above code and highlight some key aspects.
In the Cython function sum_of_squares_cython, there are a few notable differences compared to the Python implementation:
The cpdef keyword
In Cython, we use the cpdef keyword instead of def to define the function. cpdef generates both a fast C-level entry point, used when the function is called from other Cython code, and a Python wrapper, so the function can still be imported and called from regular Python code.
Static type declarations
Inside the sum_of_squares_cython function, we have static type declarations for the argument n and the local variables result and i. By giving them C integer types, we are informing Cython to treat these variables as C integers rather than Python objects, allowing for more efficient C-level execution.
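The cdef keyword
We use the cdef keyword to declare the local variables result and i as C variables, so Cython stores them as plain C values instead of Python objects.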
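One thing worth checking whenever an accumulator is given a fixed-width C type is whether the result actually fits. The following is a minimal plain-Python sketch, not part of the extension itself, that computes the exact sum for n = 10000 and compares it with the 32-bit and 64-bit signed limits. It assumes a C int is 32 bits wide, which is common but not guaranteed, and the helper name sum_of_squares_exact is made up for this illustration. Signed overflow in C is undefined behaviour, so the wraparound shown here only mimics what typically happens in practice.

```python
import ctypes


def sum_of_squares_exact(n):
    # Closed form for 0**2 + 1**2 + ... + (n - 1)**2, matching range(n).
    return (n - 1) * n * (2 * n - 1) // 6


n = 10000
exact = sum_of_squares_exact(n)

# Cross-check the closed form against a direct summation.
assert exact == sum(i ** 2 for i in range(n))

INT32_MAX = 2**31 - 1
INT64_MAX = 2**63 - 1

# ctypes performs no overflow checking, so c_int32 keeps only the low
# 32 bits, mimicking a wrapped 32-bit accumulator.
wrapped = ctypes.c_int32(exact).value

print("exact sum:         ", exact)
print("fits in 32 bits:   ", exact <= INT32_MAX)
print("fits in 64 bits:   ", exact <= INT64_MAX)
print("32-bit wraparound: ", wrapped)
```

If the exact sum does not fit in 32 bits, a wider type such as long long is the safer choice for the accumulator.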
To convert the above code into a Cython extension, we need to compile it. To do so, create a setup.py file in the same folder as sum_of_squares.pyx. The setup.py file should contain the following script.
# setuptools replaces distutils, which was removed in Python 3.12
from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules=cythonize('sum_of_squares.pyx')
)
Run the above script using the following command:
python setup.py build_ext --inplace
Running this command compiles the Cython code into an extension module. The compiled module has a .pyd file extension on Windows or a .so file extension on Unix-like systems. In addition, Cython generates a C source file (sum_of_squares.c) that contains the C translation of the code above.
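To see which file names the interpreter will accept for the compiled module on your platform, you can list its extension suffixes. This is a small sketch using the standard library's importlib.machinery.EXTENSION_SUFFIXES; the first entry is typically the most specific, version-tagged suffix, and the variable names are chosen just for this example.

```python
import importlib.machinery

# Every file suffix this interpreter recognises for extension modules.
suffixes = importlib.machinery.EXTENSION_SUFFIXES

# File names under which the compiled module could be found on import.
candidates = ["sum_of_squares" + suffix for suffix in suffixes]

for name in candidates:
    print(name)
```

After the build, one of the printed names should match the file that appears next to sum_of_squares.pyx.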
Let’s now test the performance of both functions we defined in sum_of_squares.pyx.
In the following code, we import the functions sum_of_squares_vanilla and sum_of_squares_cython from the compiled extension (sum_of_squares.pyd on Windows, or the corresponding .so file on Unix-like systems). We then use the Python timeit module to measure the execution time of each function, calling each one 10,000 times with the argument 10000.
import timeit
from sum_of_squares import sum_of_squares_vanilla, sum_of_squares_cython

# Measure the execution time of the Python function
python_time = timeit.timeit("sum_of_squares_vanilla(10000)",
                            globals=globals(),
                            number=10000)
print("Python execution time:", python_time)

# Measure the execution time of the Cython function
cython_time = timeit.timeit("sum_of_squares_cython(10000)",
                            globals=globals(),
                            number=10000)
print("Cython execution time:", cython_time)

speedup_factor = python_time / cython_time
print("Speedup factor:", speedup_factor)
When we run the above code, we obtain the following output:
Python execution time: 16.361194699999942
Cython execution time: 0.03972609999982524
Speedup factor: 411.85001044834297
As we can see, the statically typed Cython implementation is significantly faster than the untyped version. In this run, the speedup factor is approximately 411; the exact figure will vary with the machine, the C compiler, and the Python version.
Note that both functions live in the same compiled .pyx module, so the difference does not come from compilation alone. The untyped function still manipulates Python objects on every iteration, while the typed function runs its loop on plain C integers, which the C compiler can optimize aggressively.
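To put the totals in perspective, it can help to convert them into per-call and per-iteration costs. The sketch below only does arithmetic on the two timings reported in the output above; the variable names are chosen for this illustration.

```python
# Totals copied from the timing output reported in this article.
python_total_s = 16.361194699999942
cython_total_s = 0.03972609999982524

calls = 10_000            # the number= argument passed to timeit
loop_iterations = 10_000  # the n passed to each function

speedup = python_total_s / cython_total_s

# Average cost of one function call.
python_per_call_ms = python_total_s / calls * 1e3
cython_per_call_us = cython_total_s / calls * 1e6

# Average cost of one pass through the loop body.
python_per_iter_ns = python_total_s / (calls * loop_iterations) * 1e9
cython_per_iter_ns = cython_total_s / (calls * loop_iterations) * 1e9

print(f"speedup:               {speedup:.1f}x")
print(f"untyped, per call:     {python_per_call_ms:.3f} ms")
print(f"typed, per call:       {cython_per_call_us:.2f} us")
print(f"untyped, per iteration: {python_per_iter_ns:.1f} ns")
print(f"typed, per iteration:   {cython_per_iter_ns:.3f} ns")
```

That works out to roughly 1.6 ms versus 4 µs per call, or about 164 ns versus 0.4 ns per loop iteration. A cost well under one nanosecond per iteration suggests that the C compiler optimized the typed loop heavily, although that is an inference from the numbers rather than something measured here.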
In conclusion, Cython is a valuable tool for writing fast, compiled extensions for Python. It allows developers to combine the simplicity and expressiveness of Python with the performance of compiled languages. By utilizing Cython, developers can optimize critical sections of their code, achieve substantial speed improvements, and seamlessly integrate with existing Python libraries and codebases.