Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
915 views
in Technique[技术] by (71.8m points)

locking - python multiprocessing lock issue

I want to add a list of dicts together with python multiprocessing module.

Here is a simplified version of my code:

#!/usr/bin/python2.7
# -*- coding: utf-8 -*-

import multiprocessing
import functools
import time

def merge(lock, d1, d2):
    time.sleep(5) # some time consuming stuffs
    with lock:
        for key in d2.keys():
            if d1.has_key(key):
                d1[key] += d2[key]
            else:
                d1[key] = d2[key]

l = [{ x % 10 : x } for x in range(10000)]
lock = multiprocessing.Lock()
d = multiprocessing.Manager().dict()

partial_merge = functools.partial(merge, d1 = d, lock = lock)

pool_size = multiprocessing.cpu_count()
pool = multiprocessing.Pool(processes = pool_size)
pool.map(partial_merge, l)
pool.close()
pool.join()

print d
  1. I get this error when running this script. How shall I resolve this?

    RuntimeError: Lock objects should only be shared between processes through inheritance

  2. is the lock in merge function needed in this condition? or python will take care of it?

  3. I think what's map supposed to do is to map something from one list to another list, not dump all things in one list to a single object. So is there a more elegant way to do such things?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The following should run cross-platform (i.e. on Windows, too) in both Python 2 and 3. It uses a process pool initializer to set the manager dict as a global in each child process.

FYI:

  • Using a lock is unnecessary with a manager dict.
  • The number of processes in a Pool defaults to the CPU count.
  • If you're not interested in the result, you can use apply_async instead of map.
import multiprocessing
import time

def merge(d2):
    time.sleep(1) # some time consuming stuffs
    for key in d2.keys():
        if key in d1:
            d1[key] += d2[key]
        else:
            d1[key] = d2[key]

def init(d):
    global d1
    d1 = d

if __name__ == '__main__':

    d1 = multiprocessing.Manager().dict()
    pool = multiprocessing.Pool(initializer=init, initargs=(d1, ))

    l = [{ x % 5 : x } for x in range(10)]

    for item in l:
        pool.apply_async(merge, (item,))

    pool.close()
    pool.join()

    print(l)
    print(d1)

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

57.0k users

...