[edu] ProcessPoolExecutor (process pool executor)


burningrizen 2019. 2. 27. 11:29

ProcessPoolExecutor is used in essentially the same way as ThreadPoolExecutor.

Like ThreadPoolExecutor, it is a subclass of the Executor class, so most of its methods are identical.
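
To make the shared interface concrete, here is a minimal sketch in which the same code runs against either executor class. The names square and run_with are made up for this illustration and are not part of concurrent.futures.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def square(x):
    # Defined at module level so worker processes can locate it by name.
    return x * x


def run_with(executor_cls, values, workers=2):
    # executor_cls (hypothetical parameter) is any Executor subclass;
    # the body is identical for thread pools and process pools.
    with executor_cls(max_workers=workers) as executor:
        return list(executor.map(square, values))


if __name__ == "__main__":
    print(run_with(ThreadPoolExecutor, range(5)))
    print(run_with(ProcessPoolExecutor, range(5)))
```

Only the class passed in changes; submit(), map() and shutdown() all come from the common Executor base class.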

[constructor]

Creating a ProcessPoolExecutor works the same way as with ThreadPoolExecutor, except that you import a different class from the concurrent.futures module and instantiate that class, as follows.

from concurrent.futures import ProcessPoolExecutor
import os
import time


def task():
    print(f"executing our task on process {os.getpid()}")
    time.sleep(1)


def main():
    executor = ProcessPoolExecutor(2)
    task1 = executor.submit(task)
    task2 = executor.submit(task)


if __name__ == '__main__':
    main()

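We create a pool with two worker processes and submit two tasks to it.

Each task calls getpid() from the os module, so the output shows that the tasks run under different PIDs (process IDs).

The sleep() call inside task() matters: it keeps each task busy long enough that the work is spread across workers, so you can see the tasks running in different processes.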
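
submit() returns a Future, so instead of printing inside the worker you can also return the PID and read it back in the parent. The following is a rough sketch of that idea; collect_pids is a helper name invented for this example, and the 0.5 second sleep is an arbitrary choice.

```python
from concurrent.futures import ProcessPoolExecutor
import os
import time


def task():
    time.sleep(0.5)  # keep the worker busy so the second task goes elsewhere
    return os.getpid()


def collect_pids(executor, n_tasks=2):
    # Hypothetical helper: submit n_tasks jobs and wait for each result.
    futures = [executor.submit(task) for _ in range(n_tasks)]
    return [future.result() for future in futures]  # result() blocks


if __name__ == "__main__":
    executor = ProcessPoolExecutor(2)
    print(collect_pids(executor))
    executor.shutdown()  # release the worker processes explicitly
```

With two workers and a sleep in each task, the two PIDs will usually differ, although the scheduling is not guaranteed.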

[context manager]

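The executor can be used as a context manager with the with statement.

A context manager handles acquiring and releasing a resource, which makes it the better way to work with a process pool.

Compared with calling the other methods by hand, this syntax is less error-prone and produces much more readable code.

ProcessPoolExecutor already defines __enter__ and __exit__.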
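
Leaving the with block shuts the executor down as if shutdown(wait=True) had been called. Written out by hand, that is roughly the try/finally pattern sketched below; run_manually and executor_factory are hypothetical names used only for this illustration.

```python
from concurrent.futures import ProcessPoolExecutor
import os


def task():
    return os.getpid()


def run_manually(executor_factory):
    # executor_factory (hypothetical) is any callable returning an Executor.
    executor = executor_factory()
    try:
        futures = [executor.submit(task) for _ in range(2)]
    finally:
        # This is what the with statement does for you on exit:
        # wait for pending work, then free the workers.
        executor.shutdown(wait=True)
    return [future.result() for future in futures]


if __name__ == "__main__":
    print(run_manually(lambda: ProcessPoolExecutor(2)))
```

The with form is shorter and cannot forget the shutdown call, which is why it is usually preferred.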

from concurrent.futures import ProcessPoolExecutor
import os
import time


def task():
    print(f"executing our task on process {os.getpid()}")
    time.sleep(1)


def main():
    with ProcessPoolExecutor(2) as executor:
        task1 = executor.submit(task)
        task2 = executor.submit(task)
    print("all tasks complete")

if __name__ == '__main__':
    main()

[performance]

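Let's look at how a thread pool and a process pool differ and which workloads each one suits.

We compare three cases: no pool at all, a thread pool, and a process pool.

timeit.default_timer is used to measure the elapsed time.

The test function counts primes using a nested helper function.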

import math
import timeit
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_prime(n):
    def is_prime(x):
        # x is prime when no i in [2, sqrt(x)] divides it
        return not [i for i in range(2, int(math.sqrt(x)) + 1) if x % i == 0]
    return len([i for i in range(2, n+1) if is_prime(i)])


def main():
    values = [i for i in range(9900, 10000)]

    start = timeit.default_timer()
    print([count_prime(value) for value in values])
    end = timeit.default_timer()
    print(f"normal case: delay={end-start}")

    start = timeit.default_timer()
    with ThreadPoolExecutor(4) as executor:
        results = executor.map(count_prime, values)
        print(list(results))
    end = timeit.default_timer()
    print(f"ThreadPoolExecutor case: delay={end-start}")

    start = timeit.default_timer()
    with ProcessPoolExecutor(4) as executor:
        results = executor.map(count_prime, values)
        print(list(results))
    end = timeit.default_timer()
    print(f"ProcessPoolExecutor case: delay={end-start}")


if __name__ == '__main__':
    main()

Single thread: 3.7 s

Thread pool: 3.65 s

Process pool: 2.08 s

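The thread pool takes about as long as the single-threaded run, and it can be slightly slower.

In standard CPython builds the global interpreter lock (GIL) lets only one thread execute Python bytecode at a time, so the threads cannot run this CPU-bound work in parallel, and creating the pool adds a little overhead on top.

Threads pay off for I/O-bound work; for anything else, it is better to use them for concurrency rather than parallelism.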
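
For contrast, here is a small sketch of the I/O-bound case where threads do help. time.sleep stands in for a network or disk wait (an assumption made for this example, not a real I/O call), and fake_io, timed, sequential and threaded are names invented here.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(delay):
    time.sleep(delay)  # simulated wait; the GIL is released while sleeping
    return delay


def timed(fn):
    # Return how long a zero-argument callable takes, in seconds.
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def sequential(delays):
    return [fake_io(d) for d in delays]


def threaded(delays):
    # One thread per wait, so the waits overlap instead of adding up.
    with ThreadPoolExecutor(max_workers=len(delays)) as executor:
        return list(executor.map(fake_io, delays))


if __name__ == "__main__":
    delays = [0.2] * 4
    print(f"sequential: {timed(lambda: sequential(delays)):.2f}s")
    print(f"threaded:   {timed(lambda: threaded(delays)):.2f}s")
```

Because the waits overlap, the threaded run should finish in roughly the time of one wait rather than the sum of all four.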

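In this run the process pool cut the elapsed time from 3.7 s to 2.08 s, roughly a 1.8x speedup.

The gain depends on the input data and the algorithm, and in some cases a process pool can even be slower.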
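
One common case where a process pool loses is many tiny tasks, because every item has to be pickled and sent to a worker. The chunksize argument of Executor.map batches items to reduce that cost. The sketch below shows one way to use it; tiny and map_in_chunks are illustrative names, and the chunk size of 500 is an arbitrary value, not a recommendation.

```python
from concurrent.futures import ProcessPoolExecutor


def tiny(x):
    # So little work that inter-process overhead dominates the runtime.
    return x + 1


def map_in_chunks(executor, values, chunksize):
    # With a ProcessPoolExecutor, chunksize sends items to the workers in
    # batches; with a ThreadPoolExecutor the argument has no effect.
    return list(executor.map(tiny, values, chunksize=chunksize))


if __name__ == "__main__":
    values = list(range(10_000))
    with ProcessPoolExecutor(4) as executor:
        results = map_in_chunks(executor, values, chunksize=500)
    print(results[:5], len(results))
```

Whether batching helps enough to beat a plain loop still depends on the workload, so it is worth measuring both.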

Still, when there is a lot of data and heavy computation, a process pool is worth trying for better performance.

