Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python : Multiprocess and requests

This post follows up on my previous post.

In the previous post, I explained that I have a class called Item. This Item class has a method called make_request, which makes a GET request to a server. I have now created X Item objects that call make_request. Each Item object will call the method every X minutes, but these make_request calls must run independently of each other.

Example with 3 Items:

  • 14:00 - Item0.make_request
  • 14:01 - Item1.make_request
  • 14:02 - Item2.make_request
  • 14:03 - Item0.make_request
  • 14:04 - Item1.make_request
  • 14:05 - Item2.make_request
  • 14:06 - Item0.make_request
  • 14:07 - Item1.make_request
  • 14:08 - Item2.make_request
  • 14:09 - Item0.make_request
  • 14:10 - Item1.make_request
  • 14:11 - Item2.make_request
  • 14:12 - Item0.make_request
  • 14:13 - Item1.make_request
  • 14:14 - Item2.make_request ... etc

The principle is simple: the make_request call of Item_X must run independently of the previous call, made by Item_X-1. Each make_request must start at a fixed point in its minute, for example at second 30 of every minute. If a call takes more than 30 seconds to return its result, it mustn't delay the next make_request (multiprocessing and a queue?).

The answer to my previous post works but isn't robust enough :)

What I need are possible solutions. Do you have an idea of how to do this in Python 3? Can you give me some advice (which modules to use, for example)?



1 Reply


You can use apscheduler to achieve this. Several setups are possible, depending on your exact use case. For example, you can schedule each item as its own job:

from datetime import datetime, timedelta

from apscheduler.schedulers.blocking import BlockingScheduler
from apscheduler.triggers.interval import IntervalTrigger


class Item:
    def __init__(self, name):
        self.name = name

    def make_request(self):
        print(f"{datetime.now()}: {self.name}")


items = [
    Item(name="Item0"),
    Item(name="Item1"),
    Item(name="Item2"),
]

scheduler = BlockingScheduler()  # start() blocks the main thread
for i, item in enumerate(items):
    scheduler.add_job(item.make_request,
                      IntervalTrigger(minutes=len(items)),  # each item runs once every len(items) minutes
                      next_run_time=datetime.now() + timedelta(minutes=i),  # stagger the first runs one minute apart
                      )
scheduler.start()

Sample output:

2021-01-19 23:27:27.926250: Item0
2021-01-19 23:28:27.928920: Item1
2021-01-19 23:29:27.924316: Item2
2021-01-19 23:30:27.927307: Item0
2021-01-19 23:31:27.927305: Item1
2021-01-19 23:32:27.931385: Item2

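If make_request can take longer than len(items) minutes, an item's next run becomes due while its previous run is still going. By default apscheduler allows only one running instance per job and skips the overlapping run, so you can either re-trigger the job once it finishes or allow concurrent executions of the same job with max_instances. Another possibility is to schedule the jobs with cron triggers instead of staggering them with next_run_time.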
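
As a rough sketch of the cron-trigger variant, assuming APScheduler 3.x and a number of items that divides 60 evenly: each item gets its own minute expression, and every run starts at second 30 as in the question's example. The helper names cron_minute_field and build_scheduler are made up for this illustration, and max_instances=2 is just one possible choice for letting a slow run overlap the next one.

```python
from datetime import datetime


class Item:
    def __init__(self, name):
        self.name = name

    def make_request(self):
        print(f"{datetime.now()}: {self.name}")


def cron_minute_field(offset, period):
    """Cron minute expression for offset, offset + period, ... within each hour.

    Hypothetical helper for this sketch, not part of APScheduler.
    """
    if period <= 0 or 60 % period != 0:
        raise ValueError("period must divide 60 so runs stay evenly spaced across hours")
    if not 0 <= offset < period:
        raise ValueError("offset must be in range(period)")
    return f"{offset}-59/{period}"


def build_scheduler(items):
    # Imported here so the helper above can be used without APScheduler installed.
    from apscheduler.schedulers.blocking import BlockingScheduler
    from apscheduler.triggers.cron import CronTrigger

    scheduler = BlockingScheduler()
    for i, item in enumerate(items):
        scheduler.add_job(
            item.make_request,
            # Item i fires at minutes i, i + n, i + 2n, ... and always at second 30.
            CronTrigger(minute=cron_minute_field(i, len(items)), second=30),
            max_instances=2,  # assumption: let one slow run overlap the next run
        )
    return scheduler


# To run (start() blocks the main thread):
# build_scheduler([Item("Item0"), Item("Item1"), Item("Item2")]).start()
```

Unlike the next_run_time approach, the run times here are tied to the wall clock, so they stay the same after a restart.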

If you absolutely want the items to be synchronised, you can have a single job running every minute with max_instances=len(items) and maintain flags to track which item is currently running. However, this may be prone to race conditions, and in general more bookkeeping code is required.
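
A minimal sketch of that single-job setup could look like the following. RoundRobinDispatcher and build_scheduler are hypothetical names for this example, not apscheduler classes. A lock guards the "currently running" flags, and skipping an item whose previous run hasn't finished is an assumption about the desired behaviour.

```python
import threading
from datetime import datetime


class Item:
    def __init__(self, name):
        self.name = name

    def make_request(self):
        print(f"{datetime.now()}: {self.name}")


class RoundRobinDispatcher:
    """On every tick, run the next item in turn unless it is still running."""

    def __init__(self, items):
        self._items = list(items)
        self._next_index = 0
        self._busy = set()  # indexes of items whose make_request is in progress
        self._lock = threading.Lock()  # protects _next_index and _busy

    def tick(self):
        with self._lock:
            index = self._next_index
            self._next_index = (index + 1) % len(self._items)
            if index in self._busy:
                return None  # previous run of this item has not finished: skip it
            self._busy.add(index)
        item = self._items[index]
        try:
            item.make_request()  # runs outside the lock so other ticks are not blocked
        finally:
            with self._lock:
                self._busy.discard(index)
        return item


def build_scheduler(items):
    # Imported here so the dispatcher can be tried without APScheduler installed.
    from apscheduler.schedulers.blocking import BlockingScheduler
    from apscheduler.triggers.interval import IntervalTrigger

    dispatcher = RoundRobinDispatcher(items)
    scheduler = BlockingScheduler()
    scheduler.add_job(dispatcher.tick,
                      IntervalTrigger(minutes=1),
                      max_instances=len(items))  # ticks may overlap when requests are slow
    return scheduler


# To run (start() blocks the main thread):
# build_scheduler([Item("Item0"), Item("Item1"), Item("Item2")]).start()
```

Calling dispatcher.tick() by hand a few times is a quick way to check the Item0, Item1, Item2, Item0 order without waiting for the scheduler.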

