Disclaimer: I will help you and provide a working example, but I am not an expert in this topic.
Point 1 has been answered here to some extent.
Point 2 has been answered here to some extent.
I have used different options in the past for CPU-bound tasks in Python, and here is a toy example for you to follow:
from multiprocessing import Process, Queue
import time

def do_something(n_order, x, queue):
    # Simulate a slow task, then report the result together with
    # its original position so the caller can restore the order.
    time.sleep(5)
    queue.put((n_order, x))

def main():
    data = [1, 2, 3, 4, 5]
    queue = Queue()
    processes = [Process(target=do_something, args=(n, x, queue))
                 for n, x in enumerate(data)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    # Results arrive in completion order, so sort by the index we attached.
    unsorted_result = [queue.get() for _ in processes]
    result = [x for _, x in sorted(unsorted_result)]
    print(result)

if __name__ == "__main__":
    main()
You can write the same thing as a plain loop instead of using processes and a queue (a rough sketch of that loop version is below), and compare the time consumed (in this silly case the work is just the sleep, for testing purposes). You will realize that the total time is shortened roughly by a factor of the number of processes you run, as expected.
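As a minimal sketch of that sequential comparison (the loop body just reuses the same simulated sleep as the worker above; the actual work and any timing mechanism are up to you):

import time

def main_sequential():
    data = [1, 2, 3, 4, 5]
    result = []
    for x in data:
        time.sleep(5)      # the same simulated work, done one item at a time
        result.append(x)
    print(result)

if __name__ == "__main__":
    main_sequential()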
In fact, these are the results on my computer for the exact script I provided (first the multiprocessing version, then the loop):
[1, 2, 3, 4, 5]
real 0m5.240s
user 0m0.397s
sys 0m0.260s
[1, 4, 9, 16, 25]
real 0m25.104s
user 0m0.051s
sys 0m0.030s
With respect to read_only or read-and-write objects, I will need more information to help. What type of objects are those? Are they indexed?
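If it turns out that the objects need to be both read and written across processes, one common option is a manager-backed proxy object. This is only a sketch under that assumption (the worker function and the squaring are made up for illustration), not a recommendation for your specific case:

from multiprocessing import Process, Manager

def worker(shared, key, value):
    # Each process writes its result into the shared, manager-backed dict.
    shared[key] = value * value

def main_shared():
    with Manager() as manager:
        shared = manager.dict()   # proxy object, safe to read/write from several processes
        processes = [Process(target=worker, args=(shared, i, x))
                     for i, x in enumerate([1, 2, 3, 4, 5])]
        for p in processes:
            p.start()
        for p in processes:
            p.join()
        print(dict(shared))

if __name__ == "__main__":
    main_shared()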