In this series of articles, I present some Python features that slow down execution and some popular workarounds that make your code run faster.
In the first part, I explain why Python is slower than low-level languages such as C or C++ and than other high-level languages like Java or C#. To understand that, let’s dive into the details of how Python code is compiled and executed. It is important to remember that these processes depend on the implementation of the language. Being ‘interpreted’ or ‘compiled’ isn’t a characteristic of a language at all: it is a property of its implementation. Different Python implementations can execute the same chunk of code differently. This article focuses on program execution in CPython, the most popular Python implementation, which is written in C.
Python is dynamic by design
In CPython, the interpreter implicitly compiles the source code into bytecode when you run a .py file. This differs from C or Java, where you must first run the compiler explicitly (and this difference is why people commonly call Python an interpreted language). Bytecode consists of much simpler instructions that are executed one after another by the Python Virtual Machine (PVM), which is a piece of software, rather than by the CPU itself. The same bytecode can therefore run on any platform for which a PVM is available. The PVM itself is machine code built for a specific CPU, and it interprets each bytecode instruction in turn instead of translating the whole program into machine code ahead of time.
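To make the bytecode step concrete, here is a minimal sketch that uses the standard-library `dis` module to look at the instructions CPython produced for a small function. The function `add` is a made-up example, and the exact opcode names differ between CPython versions, so treat the listing as illustrative rather than fixed.

```python
import dis


def add(a, b):
    """Hypothetical example function whose bytecode we inspect."""
    return a + b


# The compiled bytecode is stored on the function's code object as raw bytes.
raw_bytecode = add.__code__.co_code

# dis.get_instructions decodes those bytes into named instructions.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]

print(len(raw_bytecode), "bytes of bytecode")
dis.dis(add)  # human-readable listing; opcode names vary by CPython version
```

Even for a one-line function, the listing shows several separate instructions (loading the arguments, performing the addition, returning the result), and the PVM executes each of them in turn.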
Why is Python bytecode execution slow? Python is a general-purpose, high-level, and dynamic language by design. Dynamic means that you can change almost everything at runtime. You can assign an int to a variable x and later rebind x to a string. It is also possible to replace methods on objects at runtime. Consequently, the Python Virtual Machine can apply only a few optimizations to bytecode execution: it has to check types on every operation (which involves testing whether an object has the requested attribute defined) and allocate memory accordingly.
Such features make optimization extremely challenging compared to statically typed languages like Java or C#. In these languages, the type of a variable is fixed, so types can be checked once at compilation time rather than at runtime on every execution. As a result, further optimizations are possible because the compiler can assume more about the program.
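The dynamic behaviour described above fits in a few lines. The following sketch uses made-up names (`Greeter`, `shout`) to show a variable being rebound to a different type and a method being replaced while the program runs. These are exactly the things an interpreter cannot rule out in advance.

```python
import types

x = 42           # x is bound to an int ...
x = "forty-two"  # ... and is now rebound to a str; no declaration is needed


class Greeter:
    """Hypothetical class used only for this illustration."""

    def greet(self):
        return "hello"


def shout(self):
    return "HELLO"


greeter = Greeter()
before = greeter.greet()

# Replace the method on this one instance while the program is running.
greeter.greet = types.MethodType(shout, greeter)
after = greeter.greet()

# Replace it on the class: instances without their own override change too.
Greeter.greet = lambda self: "hi"
other = Greeter().greet()

print(type(x).__name__, before, after, other)
```

Because `greet` can point at different code from one call to the next, the interpreter has to look the attribute up again each time instead of binding the call once at compile time.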
Python has the Global Interpreter Lock (GIL)
In CPython, each interpreter process has a single GIL, which prevents more than one thread from executing Python bytecode at the same time. It means that multithreading brings no speedup in certain situations, most notably for CPU-bound Python code.
It is crucial to understand that modern CPUs usually have more than one core (a physical processing unit of your processor), e.g., four cores. The GIL is not tied to a core: it belongs to the interpreter, so the threads of a single Python process cannot execute bytecode on four cores in parallel. To use all four cores, you can start four separate processes, each with its own interpreter and its own GIL. Within one interpreter, however, the GIL allows exactly one thread to run at a time, which can be a bottleneck in many situations. All other threads ‘running’ on the same interpreter need to wait for the currently running thread to release the GIL.
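A rough way to observe this effect is to time a CPU-bound function run twice in a row and then in two threads. The sketch below is only an illustration: the function names and the loop size are arbitrary choices, and the measured times depend on the machine and on the CPython build.

```python
import threading
import time


def count_down(n):
    """CPU-bound work: a pure-Python loop that never waits for I/O."""
    while n > 0:
        n -= 1
    return n


def run_sequential(n, repeats=2):
    """Run the work `repeats` times in the current thread; return seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(n, workers=2):
    """Run the same work in `workers` threads at once; return seconds."""
    threads = [
        threading.Thread(target=count_down, args=(n,)) for _ in range(workers)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000  # arbitrary size, small enough to finish quickly
    print(f"sequential: {run_sequential(n):.3f} s")
    print(f"2 threads:  {run_threaded(n):.3f} s")
```

On a CPython build with the GIL, the two timings are typically close to each other, because the threads take turns instead of running in parallel. For I/O-bound work the picture is different, since a thread releases the GIL while it waits.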
Why has such a concept been introduced to Python? The answer is memory safety. Each object created in CPython has a reference count: the number of references (variables, container entries, and so on) that point to this exact place in memory. When this count reaches zero, the memory occupied by the object is released.
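Reference counting can be observed from Python itself. The sketch below assumes CPython and uses a made-up `Payload` class: `sys.getrefcount` reports the current count (it may include a temporary reference created by the call itself), and a weak reference lets us check that the object is gone once the last strong reference is deleted.

```python
import sys
import weakref


class Payload:
    """Hypothetical class used only to have an object we can track."""


def count_references():
    obj = Payload()
    before = sys.getrefcount(obj)  # may include a temporary reference
    alias = obj                    # one more name points at the same object
    after = sys.getrefcount(obj)
    del alias
    return before, after


def is_freed_after_last_reference():
    obj = Payload()
    probe = weakref.ref(obj)  # a weak reference does not keep the object alive
    del obj                   # the last strong reference disappears
    return probe() is None    # True once the object has been released


before, after = count_references()
print("count grew by", after - before)
print("freed:", is_freed_after_last_reference())
```

The absolute numbers returned by `sys.getrefcount` are an implementation detail, so only the difference between the two readings is meaningful here.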
Multiple threads share the same memory and, in particular, the same reference counts. Letting them run in parallel may lead to a race condition, where two or more threads try to change the reference count of the same object at the same time. Such situations may result in memory that is never released (a leak) or, worse, in memory being released while the object is still in use.
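The kind of race described here can be imitated in pure Python. The following sketch is only an analogy, not CPython's actual C-level reference counting: a hypothetical `Counter` class stands in for a shared count, `unsafe_increment` performs an unprotected read-modify-write, and `safe_increment` guards the same update with a lock, which is roughly the role the GIL plays for the interpreter's internals.

```python
import threading


class Counter:
    """Hypothetical stand-in for a count shared between threads."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def unsafe_increment(self):
        current = self.value      # read
        self.value = current + 1  # write: another thread may have run in between

    def safe_increment(self):
        with self._lock:          # one thread at a time may read-modify-write
            self.value += 1


def hammer(method, workers=4, repeats=10_000):
    """Call `method` repeats times from each of `workers` threads."""

    def work():
        for _ in range(repeats):
            method()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


unsafe, safe = Counter(), Counter()
hammer(unsafe.unsafe_increment)
hammer(safe.safe_increment)
print("unsafe:", unsafe.value, "safe:", safe.value)
```

The locked counter always ends at 40000. The unprotected one can end lower when an update is lost, although whether that actually happens in a given run depends on thread scheduling and the interpreter version.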
Is Python the worst?
Nowadays, in many situations we can add more hardware to make computations run faster, whereas developer time is far more costly. As a result, it is often better to use Python, which is friendlier to write than C: C can perform better but demands considerably more development time. Python syntax is simpler than that of C, C++, or even Java, which makes Python easier to learn and use. The trade-off is that Python offers fewer low-level tools, because it hides problems such as memory management from developers and handles them for us.
To sum up, the design decisions made for the Python language result in speed limitations, which can be hard to mitigate in some scenarios. However, the performance-critical parts of an application can be written in other, more efficient programming languages.
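You can see the benefit of moving hot code into a faster language without writing any C yourself, because CPython's built-in functions are already implemented in C. The sketch below compares a hand-written Python loop (`python_sum`, a made-up helper) with the built-in `sum`. The input size and repetition count are arbitrary, and the timings vary between machines.

```python
import timeit


def python_sum(values):
    """Add the numbers up with an explicit Python-level loop."""
    total = 0
    for value in values:
        total += value
    return total


values = list(range(100_000))

# Time both versions on the same input; number=20 keeps the run short.
loop_time = timeit.timeit(lambda: python_sum(values), number=20)
builtin_time = timeit.timeit(lambda: sum(values), number=20)

print(f"python loop: {loop_time:.4f} s")
print(f"built-in:    {builtin_time:.4f} s")
```

Both versions return the same result. The built-in is usually noticeably faster because its loop runs inside the interpreter's C code rather than as a sequence of bytecode instructions.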
The dynamic nature of Python, the GIL existence, and the overall simplicity tremendously affect its efficiency. Nevertheless, its popularity and ease of coding are substantial benefits that, in many situations, can trump the inconveniences previously mentioned.
In the following articles, we will discuss how to speed up Python code, in particular with the NumPy library.