Multithreading & Parallel Computing in Python

If you spend too much time waiting for your program to return the result of a research experiment, or your users keep complaining that your service is too slow, this post may have a helpful tip for you.

You may be wondering what multithreading and parallel computing are, and how they differ.

I’ll use a simple scenario to explain. Say you own a bakery in the city centre. Recently, orders have increased dramatically, so you’re trying to speed up the process.

Multithreading

You look at the baker’s task. There are three steps: prepare the flour, put the cake in the oven, and decorate the cake. A single thread is like a baker who waits beside the oven and does nothing while the cake bakes. Multithreading is different: the baker can prepare the flour for the next cake while the previous one is in the oven. So the makespan (the total time from starting the first cake to finishing the last) is shorter than with a single thread. However, multithreading only reduces the makespan when the task includes a waiting step.

Parallel Computing

Parallel computing is like having two bakers baking cakes at the same time. The makespan can be even shorter than with multithreading. However, it costs you twice the resources, and it only works if your computer has more than one core.

You can check how many cores your computer has in “Task Manager” (on Windows). The machine in this example has 8.
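
If you would rather check from Python, the standard library can report the count too. This is a minimal sketch; the variable name logical_cores is only illustrative, and the number is the count of logical CPUs the operating system reports, which can be higher than the number of physical cores.

```python
import os

# Number of logical CPUs reported by the operating system.
# os.cpu_count() returns None if the count cannot be determined.
logical_cores = os.cpu_count()
print("Logical cores:", logical_cores)
```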

Okay, now you understand what multithreading and parallel computing are. Let’s take a look at some simple Python code.

Python Code

Multithreading

The code is below. There are two functions: one multithreaded and one single-threaded. I use a for loop to stand in for preparing the flour and decorating the cake, which takes around 9 seconds, and time.sleep(3) to stand in for the cake in the oven. As a result, you can see that the multithreaded version finishes about 3 seconds sooner than the single-threaded one, because the baker can do something else while a cake is in the oven.


import asyncio
from concurrent.futures import ThreadPoolExecutor
import time

async def Multithreading(inputs, executor):
    def fun(cake):
        for i in range(100000000):  # prepare flour and decorate cake
            i + 1
        time.sleep(3)  # cake in oven
        return cake
    # run fun for each cake on a worker thread and wait for all the results
    loop = asyncio.get_running_loop()
    return await asyncio.gather(*[loop.run_in_executor(
        executor, fun, cake
    ) for cake in inputs])

def Singlethreading(inputs):
    def fun(cake):
        for i in range(100000000):  # prepare flour and decorate cake
            i + 1
        time.sleep(3)  # cake in oven
        return cake
    return [fun(cake) for cake in inputs]

if __name__ == '__main__':
    executor = ThreadPoolExecutor(max_workers=2)
    # test single thread makespan
    start_time = time.time()
    result = Singlethreading(['Apple Cake', 'Banana Cake'])
    end_time = time.time()
    print("the single thread makespan is {:.2f} seconds".format(end_time - start_time))

    # test multi thread makespan
    start_time = time.time()
    # asyncio.run creates the event loop, runs the coroutine and closes the loop
    result = asyncio.run(Multithreading(['Apple Cake', 'Banana Cake'], executor))
    end_time = time.time()
    print("the multi thread makespan is {:.2f} seconds".format(end_time - start_time))
    executor.shutdown()
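
To see why the saving is about the length of the oven time, here is a rough back-of-the-envelope model of the three setups. It is only a sketch under assumptions that are mine, not measurements: the function names are made up for illustration, I read the "around 9 seconds" as the total hands-on time for two cakes (so 4.5 seconds per cake), and I assume one thread per cake. In a standard CPython build only one thread runs Python code at a time, so the hands-on loops cannot overlap; only the waiting can.

```python
import math


def single_thread_makespan(n_cakes, prep_s, oven_s):
    # One baker who waits by the oven: each cake costs prep + oven, back to back.
    return n_cakes * (prep_s + oven_s)


def multithread_makespan(n_cakes, prep_s, oven_s):
    # One baker who keeps working while cakes bake: the hands-on work is still
    # done one cake at a time, and only the last cake's oven time is left over.
    return n_cakes * prep_s + oven_s


def parallel_makespan(n_cakes, prep_s, oven_s, n_bakers):
    # Several bakers, each making their share of cakes one after another.
    cakes_per_baker = math.ceil(n_cakes / n_bakers)
    return cakes_per_baker * (prep_s + oven_s)


# Assumed numbers: 4.5 s of hands-on work per cake, 3 s in the oven.
single = single_thread_makespan(2, 4.5, 3)
multi = multithread_makespan(2, 4.5, 3)
two_bakers = parallel_makespan(2, 4.5, 3, n_bakers=2)
print(single, multi, two_bakers)
```

With these assumed numbers the model gives 15, 12 and 7.5 seconds. Real timings will differ from machine to machine, but the gap between the first two should stay close to the oven time.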

Parallel Computing

Now, let’s try parallel computing. We compare the makespan of one baker against two bakers. The code is below. The result shows that two bakers make two cakes in about half the time one baker needs.

from concurrent.futures import ProcessPoolExecutor
import time


def make_cake(inputs):
    def fun(cake):
        for i in range(100000000):  # prepare flour and decorate cake
            i + 1
        time.sleep(3)  # cake in oven
        return cake
    return [fun(cake) for cake in inputs]


if __name__ == '__main__':
    # test one baker makespan
    start_time = time.time()
    result = make_cake(['A Cake', 'B Cake'])
    end_time = time.time()
    print("the single baker makespan is {:.2f} seconds".format(end_time - start_time))

    # test two bakers makespan: each worker process gets its own list of cakes
    with ProcessPoolExecutor(max_workers=2) as executor:
        start_time = time.time()
        result = executor.map(make_cake, [['A Cake'], ['B Cake']])
        print(list(result))  # list() waits until both bakers have finished
        end_time = time.time()
    print("the two bakers makespan is {:.2f} seconds".format(end_time - start_time))
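
If you want to check that the cakes really are baked outside your main program, you can have each worker report its process ID. This is a minimal sketch: bake_and_report is a hypothetical helper, and the slow loop and the sleep are left out so it finishes quickly. Because the tasks are so short here, one worker may pick up both cakes, but its ID should still differ from the main process.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def bake_and_report(cake):
    # Hypothetical helper: return the cake together with the ID of the
    # process that handled it.
    return cake, os.getpid()


def main():
    print("main process:", os.getpid())
    with ProcessPoolExecutor(max_workers=2) as executor:
        for cake, pid in executor.map(bake_and_report, ["A Cake", "B Cake"]):
            print(cake, "was baked in process", pid)


if __name__ == "__main__":
    main()
```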

Congratulations! Now you can use these two techniques to improve your program’s performance!