Python Concurrency#

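  1. What is Concurrency?

    • Concurrency means several things making progress over the same period of time. In Python, those things go by different names (thread, task, process), but each is a sequence of instructions that runs in order, like a separate train of thought.

    • Only multiprocessing actually runs these trains of thought at literally the same time. In standard CPython, threading and asyncio both run on a single processor and therefore execute only one at a time; they save time by overlapping periods of waiting, not by computing in parallel.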

  2. What is the difference between threading and asyncio?

    • The way the threads or tasks take turns is the big difference between threading and asyncio.

    • In threading, the operating system actually knows about each thread and can interrupt it at any time to start running a different thread. This is called pre-emptive multitasking since the operating system can pre-empt your thread to make the switch.

    • Asyncio, on the other hand, uses cooperative multitasking. The tasks must cooperate by announcing when they are ready to be switched out. That means that the code in the task has to change slightly to make this happen.

    • Only one CPU processor is used for both threading and asyncio.

  3. What is Parallelism?

    • Parallelism means running code on several processors at literally the same time. In Python this is done with multiprocessing, which creates new processes. A process can be thought of as almost a completely different program, though technically it is usually defined as a collection of resources such as memory and file handles. One way to think about it is that each process runs in its own Python interpreter.

    • Multiple CPU processors are used.

| Concurrency Type | Switching Decision | Number of Processors |
|---|---|---|
| Pre-emptive multitasking (threading) | The operating system decides when to switch tasks external to Python. | 1 |
| Cooperative multitasking (asyncio) | The tasks decide when to give up control. | 1 |
| Multiprocessing (multiprocessing) | The processes all run at the same time on different processors. | Many |
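Running on one processor does not make threads useless: when the work is mostly waiting, the waits can overlap. The sketch below is only an illustration under assumptions. `time.sleep` stands in for a network or disk wait, and `wait_for_io` and `run_threads` are hypothetical helper names, not part of any library.

```python
import threading
import time


def wait_for_io(seconds, done, index):
    # time.sleep is a stand-in for an I/O wait (assumption for this sketch)
    time.sleep(seconds)
    done[index] = True


def run_threads(n=3, seconds=0.2):
    """Start n threads that each wait `seconds`; return (elapsed, done flags)."""
    done = [False] * n
    threads = [
        threading.Thread(target=wait_for_io, args=(seconds, done, i))
        for i in range(n)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start, done


if __name__ == "__main__":
    elapsed, done = run_threads()
    # Three 0.2 s waits run one after another would take about 0.6 s;
    # with threads the waits overlap, so the total is much closer to 0.2 s.
    print(f"all done: {all(done)}, elapsed: {elapsed:.2f} s")
```

The same pattern would not speed up a pure computation such as summing squares, which is where multiprocessing comes in.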

Asyncio#

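  • An event loop keeps track of the state of every task and decides which one runs next. The running task has full control until it explicitly hands control back.

  • Coroutines are defined with the async keyword and hand control back to the event loop with await.

  • Any function that uses await needs to be marked with async. You'll get a syntax error otherwise.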
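A minimal sketch of the points above, using only the standard `asyncio` module. The coroutine name `worker` and the `log` list are hypothetical, chosen for illustration; `asyncio.sleep` stands in for any awaitable I/O operation.

```python
import asyncio


async def worker(name, delay, log):
    log.append(f"{name} started")
    # await hands control back to the event loop until the sleep is over,
    # which lets the other task run in the meantime
    await asyncio.sleep(delay)
    log.append(f"{name} finished")
    return name


async def main():
    log = []
    # gather schedules both coroutines as tasks on the same event loop
    # and returns their results in the order they were passed in
    results = await asyncio.gather(
        worker("slow", 0.2, log),
        worker("fast", 0.1, log),
    )
    return results, log


if __name__ == "__main__":
    results, log = asyncio.run(main())
    print(results)  # ['slow', 'fast']
    print(log)      # slow started, fast started, fast finished, slow finished
```

Both tasks start before either finishes, even though only one of them runs at any instant: the switch happens exactly at the `await`.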

Multi-Threading#

  • A thread is an entity within a process that can be scheduled for execution. It is the smallest unit of processing that an operating system (OS) can schedule.

  • Multithreading is the ability of a processor to execute multiple threads concurrently.

[1]:
# Python program to illustrate the concept
# of threading
# importing the threading module
import threading

def print_cube(num):
    """
    function to print cube of given num
    """
    print("Cube: {}".format(num * num * num))

def print_square(num):
    """
    function to print square of given num
    """
    print("Square: {}".format(num * num))

if __name__ == "__main__":
    # creating thread
    t1 = threading.Thread(target=print_square, args=(10,))
    t2 = threading.Thread(target=print_cube, args=(10,))

    # starting thread 1
    t1.start()
    # starting thread 2
    t2.start()

    # wait until thread 1 is completely executed
    t1.join()
    # wait until thread 2 is completely executed
    t2.join()

    # both threads completely executed
    print("Done!")

Square: 100
Cube: 1000
Done!

Synchronization#

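  • A race condition occurs when two or more threads read and modify shared data at the same time, so the result depends on how their operations happen to interleave.

  • A lock prevents this: a thread calls acquire() before touching the shared data and release() afterwards, so only one thread at a time can be inside the protected section.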

[3]:
import threading

# global variable x
x = 0

def increment():
    """
    function to increment global variable x
    """
    global x
    x += 1

def thread_task():
    """
    task for thread
    calls increment function 100000 times.
    """
    for _ in range(100000):
        increment()

def main_task():
    global x
    # setting global variable x as 0
    x = 0

    # creating threads
    t1 = threading.Thread(target=thread_task)
    t2 = threading.Thread(target=thread_task)

    # start threads
    t1.start()
    t2.start()

    # wait until threads finish their job
    t1.join()
    t2.join()

if __name__ == "__main__":
    for i in range(10):
        main_task()
        print(f"Iteration {i}: x = {x}")

Iteration 0: x = 163865
Iteration 1: x = 189329
Iteration 2: x = 200000
Iteration 3: x = 200000
Iteration 4: x = 200000
Iteration 5: x = 172399
Iteration 6: x = 200000
Iteration 7: x = 200000
Iteration 8: x = 164975
Iteration 9: x = 200000

Using Locks to Synchronize Threads for the Expected Output of 200000#

[5]:
import threading

# global variable x
x = 0

def increment():
    """
    function to increment global variable x
    """
    global x
    x += 1

def thread_task(lock):
    """
    task for thread
    calls increment function 100000 times.
    """
    for _ in range(100000):
        # only one thread at a time can hold the lock
        lock.acquire()
        increment()
        lock.release()

def main_task():
    global x
    # setting global variable x as 0
    x = 0

    # creating a lock
    lock = threading.Lock()

    # creating threads
    t1 = threading.Thread(target=thread_task, args=(lock,))
    t2 = threading.Thread(target=thread_task, args=(lock,))

    # start threads
    t1.start()
    t2.start()

    # wait until threads finish their job
    t1.join()
    t2.join()

if __name__ == "__main__":
    for i in range(10):
        main_task()
        print(f"Iteration {i}: x = {x}")

Iteration 0: x = 200000
Iteration 1: x = 200000
Iteration 2: x = 200000
Iteration 3: x = 200000
Iteration 4: x = 200000
Iteration 5: x = 200000
Iteration 6: x = 200000
Iteration 7: x = 200000
Iteration 8: x = 200000
Iteration 9: x = 200000
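The acquire()/release() pair above can also be written with the lock as a context manager, which releases the lock even if the protected code raises an exception. This is a sketch of that variant, not a change to the example above; `add_many` and `run` are hypothetical names, and the counter is kept in a dict instead of a global purely for illustration.

```python
import threading


def add_many(counter, lock, times):
    for _ in range(times):
        # "with lock" acquires the lock on entry and releases it on exit
        with lock:
            counter["value"] += 1


def run(times=100_000, n_threads=2):
    counter = {"value": 0}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=add_many, args=(counter, lock, times))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


if __name__ == "__main__":
    print(run())  # 200000
```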

Multi-Processing#

  • Multiprocessing speeds up CPU-bound work by using several processors at once. At a high level, it does this by creating a new instance of the Python interpreter to run on each CPU and then farming out part of your program to run on it.

[3]:
import multiprocessing
import time


def cpu_bound(number):
    sum_=sum(i * i for i in range(number))
    # print(sum_)
    return sum_


def find_sums(numbers):
    # Pool() starts one worker process per available CPU by default
    with multiprocessing.Pool() as pool:
        # return the results so the caller can use them
        return pool.map(cpu_bound, numbers)


if __name__ == "__main__":
    numbers = [5_000_000 + x for x in range(20)]

    start_time = time.time()
    sums = find_sums(numbers)
    duration = time.time() - start_time
    print(f"Duration {duration} seconds")

Duration 6.036993503570557 seconds

[9]:
numbers

[9]:
[5000000,
 5000001,
 5000002,
 5000003,
 5000004,
 5000005,
 5000006,
 5000007,
 5000008,
 5000009,
 5000010,
 5000011,
 5000012,
 5000013,
 5000014,
 5000015,
 5000016,
 5000017,
 5000018,
 5000019]
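To make the "separate interpreter per process" idea concrete, the following sketch has each worker report its process ID alongside its result. It assumes it is run as a script (the `if __name__ == "__main__"` guard matters on platforms that start worker processes by spawning); `square_with_pid` is a hypothetical helper and the pool size of 2 is an arbitrary choice for illustration.

```python
import multiprocessing
import os


def square_with_pid(n):
    # os.getpid() identifies the process this call actually ran in
    return n * n, os.getpid()


def main():
    with multiprocessing.Pool(processes=2) as pool:
        results = pool.map(square_with_pid, range(8))

    squares = [square for square, _ in results]
    worker_pids = {pid for _, pid in results}

    print("squares:", squares)
    print("parent PID:", os.getpid())
    # the worker PIDs differ from the parent PID: the work ran elsewhere
    print("worker PIDs:", worker_pids)


if __name__ == "__main__":
    main()
```

How the eight inputs are divided between the two workers is up to the pool, so the set of worker PIDs may contain one or two entries.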