Introduction to Multiprocessing in Python
Python is a high-level programming language that is widely used in various disciplines of science and technology. It has a vast range of libraries and packages that provide efficient and robust solutions for complex problems.
One of these packages is the multiprocessing package, which allows the simultaneous execution of multiple tasks using multiple central processing units (CPUs). In this article, we will delve into the fundamentals of multiprocessing in Python, including its definition, usage, and various classes within the module.
Understanding Multiprocessing in Python
Multiprocessing in Python refers to using multiple CPUs to execute a program in parallel. It enables the computer to use different CPU cores simultaneously, which speeds up the program’s execution time.
However, it is essential to distinguish multiprocessing from multithreading, which operates differently on the CPU. Multithreading is a technique used to parallelize a program by dividing it into multiple threads that run concurrently on a single CPU core.
It intends to improve the user experience by ensuring that the user interface remains responsive while the program is running in the background. On the other hand, multiprocessing utilizes multiple CPUs to complete the task faster by dividing it into multiple independent processes.
Classes in Multiprocessing Module
The multiprocessing package includes various classes that allow us to implement multiprocessing in our Python programs. We will explore some of these classes below.
Process Class
The Process class is the most fundamental class in the multiprocessing package. It is responsible for forking a new copy of the program as an individual process with a unique process ID (PID).
The Process class can utilize multiple CPUs by creating several processes that run simultaneously. The Process class has two primary methods: start() and join().
The start() method is responsible for initiating the process, whereas the join() method blocks the parent process until the child process completes its execution. The join() method is useful when you want to synchronize the processes’ output.
For example, if you want to print the result of a computation done across different processes, you can use the join() method to wait for each process to complete and return the result.
Lock Class
The Lock class is a synchronization mechanism provided by the multiprocessing package. It helps prevent multiple processes from accessing a shared resource simultaneously.
The Lock class is created using the Lock() function from the multiprocessing module. This class provides two primary methods: acquire() and release().
The acquire() method is used to acquire the lock, while the release() method is used to release the lock after a critical section of code completes its execution. The Lock class is useful when working with concurrent processes to ensure that shared resources are accessed one at a time.
Queue Class
The Queue class is a data structure provided by the multiprocessing package that allows multiple processes to share data more efficiently. The Queue class follows the First-In-First-Out (FIFO) method of managing data, with two primary methods: put() and get().
The put() method adds an item to the queue, while the get() method retrieves an item from the queue. The Queue class is useful when working with dependent processes to pass data between the processes and synchronize their execution.
The Queue class can also help alleviate data loss or waiting time caused by large amounts of data resulting from one process.
Pool Class
The Pool class is a convenient way to perform parallel computing in Python. It allows you to execute a function with multiple input values in parallel across multiple CPUs. The Pool class provides the map() method, which applies the function to each input value and returns the result.
The Pool class can be created using the Pool() function from the multiprocessing module. The Pool class will choose the most efficient use of the CPUs by dividing the input values into smaller sub-tasks that are assigned to different processes.
The result of each task is then collected, and the final result is returned.
Conclusion
The multiprocessing package provides efficient and robust solutions for executing multiple tasks in Python simultaneously. It enables programmers to utilize the power of multiple CPUs to speed up program execution times.
In this article, we explored some of the fundamental classes in the multiprocessing module, including the Process class, Lock class, Queue class, and Pool class. These classes provide various methods and functions to aid with parallel computing in Python.
By mastering the multiprocessing package, programmers can design programs that can handle multiple tasks concurrently, improving program execution time while maintaining performance reliability.
Examples with Code and Output
In our previous section, we have discussed the different classes involved in multiprocessing in Python. In this section, we will explore their implementation and usage through code examples.
We will provide practical examples demonstrating the use of the start() function for the Process class, the Lock class, the Queue class, and the Pool class. Example with start() Function for
Process Class
The start() function is used to initiate the process that executes the target function.
In this example, we will demonstrate the usage of the start() function by implementing a simple function that prints squares of numbers using two processes.
import multiprocessing as mp
def sqr(num):
print(num*num)
if __name__ == '__main__':
p1 = mp.Process(target=sqr, args=(2,))
p2 = mp.Process(target=sqr, args=(3,))
p1.start()
p2.start()
p1.join()
p2.join()
In this example, the sqr() function is used to print the square of numbers. We have used the Process class of the multiprocessing module to create two processes p1 and p2.
The start() function is used to initiate the execution of these processes. The join() function ensures that the parent process waits until the child processes complete their execution.
Output:
4
9
Example with
Lock Class:
The Lock class is used when multiple processes are trying to access shared resources. In this example, we implement a simple printer function that uses the Lock class to synchronize multiple processes trying to print numbers to the console.
import multiprocessing as mp
def printer(lock, nums):
lock.acquire()
for num in nums:
print(num)
lock.release()
if __name__ == '__main__':
lock = mp.Lock()
nums1 = [1, 2, 3, 4, 5]
nums2 = [6, 7, 8, 9, 10]
p1 = mp.Process(target=printer, args=(lock, nums1))
p2 = mp.Process(target=printer, args=(lock, nums2))
p1.start()
p2.start()
p1.join()
p2.join()
In this example, we have implemented a printer function that accepts a lock object and a list of numbers. The lock is acquired by each individual process when accessing the shared printer function.
The lock object prevents the two processes from accessing printer (which would cause overlapping of the printed output) by introducing a mutually exclusive block of code. Output:
1
2
3
4
5
6
7
8
9
10
Example with
Queue Class
The Queue class provides a way to share data between multiple processes. In this example, we demonstrate how the Queue class can be used to implement a parallelized calculation of a series of squares.
import multiprocessing as mp
def sqr(num, queue):
queue.put(num*num)
if __name__ == '__main__':
queue = mp.Queue()
processes = []
for i in range(1, 6):
p = mp.Process(target=sqr, args=(i, queue,))
p.start()
processes.append(p)
for p in processes:
p.join()
result = [queue.get() for p in processes]
print(result)
In this example, we have defined a sqr() function that accepts a number and a queue object. The object is primarily used to store the results following the execution of the sqr() function.
The for loop generates multiple processes to calculate the squares of the values from 1 to 5 simultaneously. The result is then saved in the queue object.
Output:
[1, 4, 9, 16, 25]
Example with
Pool Class
The Pool class provides a simplified way of performing parallel computing in Python. In this example, we have used the Pool class to implement the concurrent execution of a function on multiple parameters.
import multiprocessing as mp
def my_func(num):
return num*num
if __name__ == '__main__':
pool = mp.Pool(mp.cpu_count())
inputs = [1, 2, 3, 4, 5]
result = pool.map(my_func, inputs)
print(result)
In this example, we have implemented a simple my_func() function that accepts a parameter and performs a calculation. The Pool() function initializes a pool of workers that is equal to the CPU count of the system.
A list of inputs is created, and the pool function map() applies the function to the inputs. Output:
[1, 4, 9, 16, 25]
Conclusion
Multiprocessing in Python is an efficient way to utilize CPU cores and improve performance. In this article, we have discussed the fundamental concepts of multiprocessing in Python, including the classes involved, and their implementation with code examples.
The use of the start() function for the Process class, the Lock class, the Queue class, and the Pool class has been demonstrated through practical examples. By understanding and effectively using these classes, we can design Python programs that can handle multiple tasks concurrently, improving program execution time and performance reliability.
In conclusion, this article discussed the importance of multiprocessing in Python to utilize the power of multiple CPU cores, hence improving program execution time and performance. We delved into the fundamental classes with the most common ones being the Process class, Lock class, Queue class, and Pool class.
The classes are essential for synchronous code execution, data synchronization, and parallel computing. By implementing these Python modules, programmers can design effective programs that handle multiple tasks simultaneously, reducing waiting time while ensuring program performance reliability.
This article should have provided readers with the necessary foundation to start applying multiprocessing techniques to their Python projects.