multiprocessing.Pool explained in detail
2022-06-27 06:33:00 【Startled】
Because of Python's global interpreter lock (GIL), you have to use the multiprocessing module if you want to use multiple cores. The module has many pitfalls, though. This article records how to use it and the pitfalls I have fallen into.
1. map, apply, and apply_async compared
First, a comparison table, quoted from multiprocessing.pool:
                  | Multi-args   Concurrence    Blocking     Ordered-results
---------------------------------------------------------------------
Pool.map          | no           yes            yes          yes
Pool.apply        | yes          no             yes          yes
Pool.apply_async  | yes          yes            no           no
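Multi-args means whether the task function can be called with multiple arguments (with apply and apply_async, each call can also name a different function). Ordered-results means whether the results come back in the same order as the arguments.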
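The "no" for Pool.map in the Multi-args column reflects that map passes each item of the iterable to the function as a single argument. As a minimal sketch, assuming Python 3.3 or later, Pool.starmap unpacks each tuple into separate arguments while keeping the ordered results of map. The names power and powers are hypothetical and exist only for this illustration.

```python
import multiprocessing


def power(base, exponent):
    return base ** exponent


def powers(pairs):
    # starmap() unpacks each tuple into positional arguments and
    # returns the results in the order of the input.
    with multiprocessing.Pool(processes=2) as pool:
        return pool.starmap(power, pairs)


if __name__ == "__main__":
    print(powers([(2, 3), (3, 2), (10, 0)]))
```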
Here is how each one is used:
apply()
import multiprocessing
import os
import time
import datetime

# task
def square(n):
    print(f'child process id is {os.getpid()}')
    # n == 5 simulates a slow task
    if n == 5:
        time.sleep(5)
    else:
        time.sleep(1)
    return n * n

def _apply():
    pool = multiprocessing.Pool()
    for i in l:
        # apply() blocks until this one task has returned
        res = pool.apply(square, args=(i,))
        print(res)

if __name__ == '__main__':
    start_time = datetime.datetime.now()
    l = [5, 0, 1, 2, 3, 4]
    print(f'main process id is {os.getpid()}')
    _apply()
    end_time = datetime.datetime.now()
    print('Elapsed: ', end_time - start_time)
Output:
main process id is 585033
child process id is 585034
25
child process id is 585035
0
child process id is 585036
1
child process id is 585037
4
child process id is 585038
9
child process id is 585039
16
Elapsed:  0:00:11.024689
The whole run took 11 s, about the same as sequential execution, and the results come back in the order the arguments were passed. We can therefore conclude:
- pool.apply() is blocking: the main process is blocked until the child process returns.
- The child processes run one after another.
Going further:
- pool.apply() cannot achieve concurrency, because at any given moment only one child process is actually running a task. The function is therefore of little practical value, and I cannot think of a scenario that calls for it.
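One way to see why pool.apply() runs tasks one at a time: it behaves like calling apply_async() and waiting on the result straight away. The sketch below illustrates that equivalence. The helper names run_with_apply and run_with_async_get are hypothetical, and the pool size of 2 is an arbitrary choice.

```python
import multiprocessing


def square(n):
    return n * n


def run_with_apply(values):
    # Each apply() call submits one task and waits for its result
    # before the loop moves on to the next value.
    with multiprocessing.Pool(processes=2) as pool:
        return [pool.apply(square, args=(v,)) for v in values]


def run_with_async_get(values):
    # Calling get() right after apply_async() waits in the same way,
    # so the tasks still run one at a time.
    with multiprocessing.Pool(processes=2) as pool:
        return [pool.apply_async(square, args=(v,)).get() for v in values]


if __name__ == "__main__":
    data = [5, 0, 1, 2, 3, 4]
    print(run_with_apply(data))
    print(run_with_async_get(data))
```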
apply_async()
# continues the script above (same imports)
def division(n):
    print(f'child process id is {os.getpid()}')
    time.sleep(1)
    res = 10 / n
    return res

def _apply_async():
    # close() + join() are required: otherwise the main process may finish
    # while the child processes are still running, and the program fails
    pool = multiprocessing.Pool()
    for i in l:
        # proc_lst.append(pool.apply_async(square, args=(i,)))
        pool.apply_async(division, args=(i,), callback=print)
    pool.close()
    pool.join()

if __name__ == '__main__':
    start_time = datetime.datetime.now()
    l = [5, 0, 1, 2, 3, 4]
    print(f'main process id is {os.getpid()}')
    # _apply()
    _apply_async()
    end_time = datetime.datetime.now()
    print('Elapsed: ', end_time - start_time)
Output:
main process id is 586731
child process id is 586732
child process id is 586733
child process id is 586734
child process id is 586735
child process id is 586736
child process id is 586737
10.0
2.0
5.0
3.3333333333333335
2.5
Elapsed:  0:00:01.016798
At first glance, the total time is about 1 s, which shows that concurrency really is achieved. But l contains 6 arguments, so why is one result missing from the output? This is where the pitfall of apply_async() lies. Closer study shows that the function has the following characteristics:
- As the name suggests, it is asynchronous. "Asynchronous" is relative to the main process: the main process does not have to wait for the child process's result and can carry on. This is achieved by making apply_async() non-blocking. Calling apply_async() immediately returns an AsyncResult object. At that point the child process may not have finished, but the main process continues regardless.
- The callback parameter of apply_async() names a function that is called automatically with the result once the task has finished. In the example above it is print, so the results are printed. The example also illustrates what a callback is. If you do not pass callback, how do you get the result? Call get() on the object returned by apply_async(). That call is blocking, i.e. the main process is blocked until the task has finished, so to keep the concurrency it is best to submit all tasks first and only then call get() on each result.
- What are pool.close() and pool.join() for? The former closes the pool (no new tasks are accepted, but tasks already submitted are unaffected). The latter blocks until all child processes have ended. Why is join needed? As noted above, apply_async() is non-blocking. Without join, the main process may finish before the child processes do; those child processes then cannot be reaped and the program reports an error. Why must close come before join? This is the standard pattern and the two go together: join() raises ValueError unless close() or terminate() has been called first.
- Relative to the order of the arguments, the results are unordered.
- Last question: why is one result missing in the example above? On closer inspection, the missing result is the one for 0, because 10/0 raises ZeroDivisionError. But why was no error shown? This is one of the pitfalls: the main process is not told about an exception raised in a child process of apply_async() unless it calls get() on the result or passes an error_callback. So when debugging, do not assume everything is fine just because no error was reported. There may be a hidden pitfall waiting for you!
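The points above about get(), ordering, and swallowed exceptions can be sketched as follows. This is an illustration under stated assumptions rather than the author's code: the helper names collect_in_order and run_with_callbacks, the pool size of 2, and the 10 second timeout are all hypothetical choices. It relies on two documented behaviours of apply_async(): get() re-raises an exception from the child process, and the error_callback parameter receives the exception instance.

```python
import multiprocessing


def division(n):
    return 10 / n


def collect_in_order(values):
    """Submit every task first, then call get() in submission order."""
    results, errors = [], []
    with multiprocessing.Pool(processes=2) as pool:
        # Submitting does not block; each call returns an AsyncResult.
        pending = [(v, pool.apply_async(division, args=(v,))) for v in values]
        for value, async_result in pending:
            try:
                # get() blocks until this task is done and re-raises
                # any exception that occurred in the child process.
                results.append(async_result.get(timeout=10))
            except ZeroDivisionError as exc:
                errors.append((value, exc))
    return results, errors


def run_with_callbacks(values):
    """Use callback and error_callback instead of calling get()."""
    results, errors = [], []
    pool = multiprocessing.Pool(processes=2)
    for v in values:
        pool.apply_async(
            division,
            args=(v,),
            callback=results.append,      # called with each result
            error_callback=errors.append, # called with each exception
        )
    pool.close()
    pool.join()
    # Callback order follows completion order, so sort for stable output.
    return sorted(results), errors


if __name__ == "__main__":
    print(collect_in_order([5, 0, 1, 2, 3, 4]))
    print(run_with_callbacks([5, 0, 1, 2, 3, 4]))
```

With either variant the failure for 0 becomes visible in the main process instead of silently disappearing.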
map()
# continues the script above (uses square() and the same imports)
def _map():
    pool = multiprocessing.Pool()
    # map() blocks until every item has been processed
    res = pool.map(square, l)
    print(res)

if __name__ == '__main__':
    start_time = datetime.datetime.now()
    l = [5, 0, 1, 2, 3, 4]
    print(f'main process id is {os.getpid()}')
    # _apply()
    # _apply_async()
    _map()
    end_time = datetime.datetime.now()
    print('Elapsed: ', end_time - start_time)
Output:
main process id is 588059
child process id is 588060
child process id is 588061
child process id is 588062
child process id is 588063
child process id is 588064
child process id is 588065
[25, 0, 1, 4, 9, 16]
Elapsed:  0:00:06.018487
The run took about 6 s, and the results were returned all at once. The following conclusions can be drawn:
- map distributes the items of the iterable across the pool's worker processes, so concurrency is achieved. Note that the number of workers is set by the pool (by default one per CPU), not by the number of items. In the run above there happened to be one worker available for each of the six items.
- The function is blocking: the main process can continue only after all the tasks have finished.
- The results are ordered.
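To check how map spreads work over processes, the sketch below returns each worker's PID together with its result. It is only an illustration: square_with_pid and map_with_workers are hypothetical names, and the pool is deliberately limited to 2 workers so that six items have to share them.

```python
import multiprocessing
import os


def square_with_pid(n):
    # Return the worker's PID alongside the result so the caller can
    # see which process handled each item.
    return os.getpid(), n * n


def map_with_workers(values, workers):
    with multiprocessing.Pool(processes=workers) as pool:
        pairs = pool.map(square_with_pid, values)
    pids = {pid for pid, _ in pairs}
    squares = [sq for _, sq in pairs]
    return pids, squares


if __name__ == "__main__":
    pids, squares = map_with_workers([5, 0, 1, 2, 3, 4], workers=2)
    print(f"{len(pids)} worker process(es) handled {len(squares)} items")
    print(squares)
```

The squares come back in input order regardless of which worker computed them, and the number of distinct PIDs never exceeds the pool size.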
2. Sharing data between processes: Manager
Manager is another big pitfall, so use it with caution! I will fill in this section when I have time.