Numba Tutorial: Accelerating Python Code with JIT Compilation (2024)

In today’s data-driven world, performance optimization plays a crucial role in computational tasks. Python, being an interpreted language, may not always provide the desired speed for computationally intensive operations. However, Numba, a powerful library, comes to the rescue by offering just-in-time (JIT) compilation capabilities, significantly accelerating Python code execution. This tutorial aims to provide a comprehensive understanding of Numba’s functionality, installation, and usage, along with the common challenges faced when utilizing it.

Table Of Contents

  1. JIT Compilers: How They Work
  2. Tutorial: What is Numba?
  3. Installing and Importing Numba
  4. Numba Tutorial: Understanding the Inner Workings of Numba
  5. Numba Tutorial: Explicit Type Declaration
  6. Numba Tutorial: Parallelization
    • When to use Numba, and when Not to:
  7. Common Challenges and Solutions with Numba

JIT Compilers: How They Work

Just-in-Time (JIT) compilation is a technique used by certain programming language implementations to improve the runtime performance of code. While traditional compilers translate the entire source code into machine code before execution, JIT compilers take a different approach.

JIT compilers work by combining aspects of both interpretation and compilation. When a program is executed, the JIT compiler analyzes the code at runtime, identifies frequently executed portions (hotspots), and dynamically compiles them into machine code. This process occurs just before the code is executed, hence the term “just-in-time.”

The JIT compilation process involves several steps:

  1. Parsing and Lexical Analysis: The source code is parsed and converted into a syntax tree or an intermediate representation (IR).
  2. IR Optimization: The IR is optimized to eliminate redundancies, perform constant folding, and apply other optimizations.
  3. Just-in-Time Compilation: The optimized IR is translated into machine code specific to the target platform.
  4. Code Execution: The compiled machine code is executed, providing significant performance improvements over interpreted code.

JIT compilers offer the advantage of dynamically optimizing code based on runtime information, enabling them to adapt to specific execution patterns and make tailored optimizations.

Tutorial: What is Numba?

Numba is a just-in-time (JIT) compiler specifically designed for Python. It aims to enhance the performance of Python code by compiling it to efficient machine code, thus eliminating the overhead associated with Python’s interpreted execution.

Numba achieves this by leveraging the LLVM compiler infrastructure. When a Python function decorated with @jit is encountered, Numba analyzes the function’s bytecode and performs type inference to determine the optimal data types to use. Numba then generates optimized machine code for the identified hotspots, replacing the interpreted execution with compiled execution.

Numba offers two compilation modes:

  1. Nopython mode is the default compilation mode in Numba and aims to achieve the highest performance gains. When a function is compiled in nopython mode, Numba generates machine code without relying on the Python runtime. In this mode, the function and its dependencies must be written in a subset of Python that can be fully compiled to machine code.
  2. Object mode, on the other hand, provides more flexibility at the cost of some performance. In this mode, Numba retains the full Python runtime semantics and falls back to Python objects and runtime calls when necessary. Object mode is useful for code that cannot be fully compiled to machine code due to dynamic or unsupported Python features.

Unlike some other JIT compilers, Numba seamlessly integrates with NumPy, allowing efficient execution of NumPy array operations. By combining the power of NumPy’s vectorized operations with Numba’s JIT compilation, developers can achieve significant speedups in numerical computations.

Furthermore, Numba supports parallel execution through the prange function, enabling developers to take advantage of multi-threading. By parallelizing loops or array operations, Numba distributes the workload across multiple threads, further enhancing performance in scenarios where parallelization is applicable.

Installing and Importing Numba

To begin with, let’s install Numba on your system. Numba can be easily installed using pip or conda package managers. Open your terminal and run the appropriate command based on your preferred package manager. For pip, use:

pip install numba

For conda, use:

conda install numba

Once installed, import the necessary modules and functions for Numba in your Python script or notebook. Importing Numba is as simple as:

import numba
from numba import jit

Numba Tutorial: Understanding the Inner Workings of Numba

Numba’s core functionality revolves around the @jit decorator, which stands for “just-in-time.” By applying the @jit decorator to Python functions, Numba compiles them to machine code, resulting in performance improvements. Here’s an example:

from numba import jit

# Use nopython=True where possible for best performance
@jit(nopython=True)
def square(x):
    return x ** 2

In this case, the square function will be compiled by Numba for optimized execution. If we run a quick benchmark comparing Numba against regular Python, we get the following results.

Time taken with Numba JIT:    0.9326467 seconds
Time taken without Numba JIT: 0.0000041 seconds

As you can see here, Numba performs much worse than the regular Python code. Why is this?

We mentioned earlier that Numba compiles a function the first time it is executed. This initial compilation adds significant overhead to a single function call. Is there a way of checking how long the compilation of a Numba function takes? Yes there is!

All we need to do is call the function a second time and measure that second execution, since by then the compiled version is cached. After performing some benchmarks, we were left with the following results.

Numba JIT (compilation + execution) = 0.5045291
Numba JIT (only execution)          = 0.0000022
Normal Python                       = 0.0000033

As we can see here, the execution time of the compiled function is a tiny fraction of the compilation time. It is also almost 50% faster than the native Python code, even though this code is extremely simple and leaves little room for optimization.

We will present a solution to this “compilation time” problem in the next section.

For reference, here is the benchmarking code.

import time
from numba import jit

@jit(nopython=True)
def square(x):
    return x ** 2

def normal_square(x):
    return x ** 2

x = 100

start = time.perf_counter()
square(x)
end = time.perf_counter()
print(f"Numba JIT (compilation + execution) = {end - start:.7f}")

start = time.perf_counter()
square(x)
end = time.perf_counter()
print(f"Numba JIT (only execution) = {end - start:.7f}")

start = time.perf_counter()
normal_square(x)
end = time.perf_counter()
print(f"Normal Python = {end - start:.7f}")

Numba Tutorial: Explicit Type Declaration

In Python, variables are dynamically typed, meaning their types can change at runtime. While this flexibility is convenient, it can also result in performance overhead. Numba mitigates this by allowing developers to explicitly specify the types of variables, enabling the compiler to generate specialized machine code tailored to those types.

Explicitly typing variables in Numba offers several advantages:

  1. Improved Performance: When Numba has type information, it can generate highly optimized machine code that bypasses Python’s dynamic typing system. This leads to significant performance improvements, especially in computationally intensive tasks.
  2. Reduced Overhead: With type information, Numba eliminates the need for runtime type checks and conversions. It also enables eager compilation: when a full signature is supplied, the compilation to machine code happens at the very beginning of program execution, not when the function is first called.
  3. Code Safety: Explicit typing helps catch potential errors at compile time, allowing you to identify and fix type-related issues early in the development process.

Here is a code snippet from the Numba documentation, to which we will apply explicit type hinting and compare the differences.

import numpy as np
from numba import jit

@jit(nopython=True)
def go_fast(a):  # Function is compiled to machine code when called the first time
    trace = 0.0
    for i in range(a.shape[0]):    # Numba likes loops
        trace += np.tanh(a[i, i])  # Numba likes NumPy functions
    return a + trace               # Numba likes NumPy broadcasting

Benchmarking this code produces the following results:

Numba JIT (compilation + execution) = 0.6488045
Numba JIT (only execution)          = 0.0000237
Normal Python                       = 0.0001621

Now let’s add some type information.

@jit('float64[:,:](float64[:,:])', nopython=True)
def go_fast(a):
    trace = 0.0
    for i in range(a.shape[0]):
        trace += np.tanh(a[i, i])
    return a + trace

Benchmarking this updated code produces the following results:

Numba JIT (compilation + execution) = 0.0002219
Numba JIT (only execution)          = 0.0000535
Normal Python                       = 0.0010343

As we can see here, the compilation + execution and execution-only times are now very similar. Run-to-run noise causes their values to differ slightly, but the main point is that the compilation time has disappeared from the first call, leaving only the execution time.

So where did the compilation time go?

The compilation was handled at the very beginning of the program, before this function was ever called. Since Numba had the type information available, it was able to generate the machine code beforehand. Otherwise, it would have to wait until the function was first called, determine the types being used, and then generate the machine code based on that information.

Numba Tutorial: Parallelization

Numba also excels at optimizing loops and array operations. By using Numba’s capabilities to parallelize execution, you can further boost performance by leveraging multiple threads. Here’s an example showcasing the parallelization with Numba:

import numpy as np
from numba import jit, prange

@jit(parallel=True, nopython=True)
def parallel_sum(a):
    result = 0
    for i in prange(len(a)):
        result += a[i]
    return result

x = np.random.randint(0, 1001, size=1000, dtype=np.int64)

By benchmarking the above code, we get the following timings.

Numba JIT (compilation + execution) = 0.8292531
Numba JIT (only execution)          = 0.0000407
Normal Python                       = 0.0003943

As we can see from these benchmarks, Numba proved to be roughly 10x faster than Python. The only downside here is the initial compilation time.

Another important thing to keep in mind is the overhead introduced by parallel code. For example, in the above code, for arrays of roughly 1000 elements or fewer, parallel=False yields better performance. For larger arrays (10,000+ elements), parallel=True performs better.

With an array of size 1,000,000 we observed the following benchmarks:

Numba JIT (parallel + execution only) = 0.0002723
Numba JIT (execution only)            = 0.0005995

When to use Numba, and when Not to:

Numba is typically used when there is a need for accelerating the execution of numerical computations in Python. It is especially useful when working with large arrays, mathematical operations, and loops. Numba achieves performance improvements by just-in-time (JIT) compiling the Python code, resulting in efficient execution on the CPU or GPU.

Don’t forget our lesson from earlier: avoid using Numba on small, one-off operations, as the compilation overhead makes them slower.

However, Numba’s focus is primarily on computation, and it does not provide significant optimizations for I/O operations such as reading or writing files. Therefore, if your code involves extensive I/O operations, other libraries or approaches may be more suitable.

Common Challenges and Solutions with Numba

While Numba provides substantial performance gains, there are a few challenges that users may encounter. One common challenge is compatibility with features of CPython, the reference implementation of Python. Numba’s just-in-time compilation relies on the low-level LLVM infrastructure, which does not support all Python features. In such cases, alternative solutions or workarounds are required.

For instance, Numba does not support certain dynamic Python constructs, such as arbitrary class instances or containers holding mixed types, directly within JIT-compiled functions. To overcome this, it is recommended to refactor the code or use Numba’s object mode, which allows more flexibility but may not offer the same level of optimization.

Additionally, Numba’s performance relies heavily on type inference. Sometimes type inference fails due to complex code logic or unsupported Python constructs, requiring manual type specification. By supplying explicit types, you give Numba the information it needs to generate optimal code, though this does require some extra effort.

This marks the end of the Numba Tutorial. Any suggestions or contributions for CodersLegacy are more than welcome. Questions regarding the tutorial content can be asked in the comments section below.



Article information

Author: Dong Thiel