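This article explores how variables are shared between processes in Python multiprocessing programs. Multiprocessing is an important topic for anyone moving beyond the basics of Python, and readers who need it can use this article as a reference.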
1. The problem:
Someone in a chat group posted the following code and asked why the list printed at the end is empty.
    from multiprocessing import Process, Manager
    import os

    manager = Manager()
    vip_list = []  # plain list: every child process works on its own copy
    #vip_list = manager.list()

    def testFunc(cc):
        vip_list.append(cc)
        print('process id: %d' % os.getpid())

    if __name__ == '__main__':
        threads = []

        for ll in range(10):
            t = Process(target=testFunc, args=(ll,))
            t.daemon = True
            threads.append(t)

        for i in range(len(threads)):
            threads[i].start()

        for j in range(len(threads)):
            threads[j].join()

        print("------------------------")
        print('process id: %d' % os.getpid())
        print(vip_list)

If you understand Python's threading model and the GIL, and know how multithreading and multiprocessing work, this question is not hard to answer. If you don't, no matter: run the code above and the problem becomes apparent.
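    python aa.py
    process id: 632
    process id: 635
    process id: 637
    process id: 633
    process id: 636
    process id: 634
    process id: 639
    process id: 638
    process id: 641
    process id: 640
    ------------------------
    process id: 619
    []

The parent's list is still empty: each child is a separate process with its own copy of vip_list, so the appends made in the children never reach the parent. Now uncomment line 6 of the listing (vip_list = manager.list()) and you will see the following result: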
    process id: 32074
    process id: 32073
    process id: 32072
    process id: 32078
    process id: 32076
    process id: 32071
    process id: 32077
    process id: 32079
    process id: 32075
    process id: 32080
    ------------------------
    process id: 32066
    [3, 2, 1, 7, 5, 0, 6, 8, 4, 9]

2. Several ways to share variables between processes in Python:
(1) Shared memory:
Data can be stored in a shared memory map using Value or Array. The following example is taken from the Python documentation:
http://docs.python.org/2/library/multiprocessing.html#sharing-state-between-processes
    from multiprocessing import Process, Value, Array

    def f(n, a):
        # Both writes go to shared memory, so the parent sees them.
        n.value = 3.1415927
        for i in range(len(a)):
            a[i] = -a[i]

    if __name__ == '__main__':
        num = Value('d', 0.0)        # 'd': double-precision float
        arr = Array('i', range(10))  # 'i': signed integer

        p = Process(target=f, args=(num, arr))
        p.start()
        p.join()

        print(num.value)
        print(arr[:])

Result:
    3.1415927
    [0, -1, -2, -3, -4, -5, -6, -7, -8, -9]

(2) Server process:
A manager object returned by Manager() controls a server process which holds Python objects and allows other processes to manipulate them using proxies.
A manager returned by Manager() will support types list, dict, Namespace, Lock, RLock, Semaphore, BoundedSemaphore, Condition, Event, Queue, Value and Array.
For code, see the example at the beginning of this article.
http://docs.python.org/2/library/multiprocessing.html#managers
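As a further illustration of manager proxies, here is a minimal sketch, assuming Python 3 (the listings in this article target Python 2). The names record and run_demo are hypothetical helpers invented for this example. Unlike the opening example, the proxies are passed to the workers as arguments rather than read from a module-level global, which should also work with start methods that do not fork.

```python
from multiprocessing import Manager, Process


def record(shared_dict, shared_list, key):
    # Hypothetical worker: both writes go through proxies to the
    # manager's server process, which holds the real dict and list.
    shared_dict[key] = key * key
    shared_list.append(key)


def run_demo(n=4):
    # Hypothetical driver: start n workers, then copy the results out.
    with Manager() as manager:
        shared_dict = manager.dict()
        shared_list = manager.list()
        workers = [
            Process(target=record, args=(shared_dict, shared_list, k))
            for k in range(n)
        ]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        # Copy into plain objects before the manager shuts down;
        # the proxies stop working once the server process exits.
        return dict(shared_dict), sorted(shared_list[:])


if __name__ == '__main__':
    squares, keys = run_demo()
    print(squares)
    print(keys)
```

The order in which the workers append is not deterministic, which is why the sketch sorts the list before returning it.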
3. The problems with multiprocessing don't end there: data synchronization
Consider a short piece of code, a simple counter:
    from multiprocessing import Process, Manager
    import os

    manager = Manager()
    sum = manager.Value('tmp', 0)  # shared counter held by the manager process (the name shadows the built-in sum)

    def testFunc(cc):
        sum.value += cc  # a read followed by a write through the proxy, not atomic

    if __name__ == '__main__':
        threads = []

        for ll in range(100):
            t = Process(target=testFunc, args=(1,))
            t.daemon = True
            threads.append(t)

        for i in range(len(threads)):
            threads[i].start()

        for j in range(len(threads)):
            threads[j].join()

        print("------------------------")
        print('process id: %d' % os.getpid())
        print(sum.value)

Result:
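    ------------------------
    process id: 17378
    97

You may well ask: WTF? One hundred processes each added 1, yet the total is 97. The statement sum.value += cc reads the value and then writes it back as two separate steps, so processes running at the same time can overwrite each other's updates. This problem already existed in the multithreading era and simply replays itself, tragically, in the multiprocessing era. The remedy is the same: Lock!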
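To make the idea of a lock concrete, here is a minimal sketch, assuming Python 3. It deliberately uses a different technique from this article's own listing: a shared-memory Value and the lock that Value creates by default, instead of a Manager with a separate Lock. The names add_one and run_counter are hypothetical helpers invented for this example.

```python
from multiprocessing import Process, Value


def add_one(counter, repeats):
    # Hypothetical worker: increment the shared counter `repeats` times.
    for _ in range(repeats):
        # get_lock() returns the lock that Value creates by default.
        # Holding it turns the read and the write into one critical
        # section, so no other process can slip in between them.
        with counter.get_lock():
            counter.value += 1


def run_counter(workers=4, repeats=1000):
    # Hypothetical driver: run the workers and return the final total.
    counter = Value('i', 0)  # shared C int, initial value 0
    procs = [
        Process(target=add_one, args=(counter, repeats))
        for _ in range(workers)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return counter.value


if __name__ == '__main__':
    print(run_counter())
```

Because every increment happens while the lock is held, the total should always equal workers * repeats. Removing the with statement brings back the lost updates seen above.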
    from multiprocessing import Process, Manager, Lock
    import os

    lock = Lock()
    ma