I am running into a problem when I run differential evolution with multiple workers. Normally this can be solved with `if __name__ == '__main__':`, but that is not possible here because of how the project is structured. A simplified version of the problem follows.

The class `GlobalRestrictedSUR` lives in `functions.py`, which is in the same folder as `main.py`. It has a method `gfit()` that fits a system of regressions by minimizing the objective function, the sum of squared residuals, using differential evolution (the real problem is not convex).
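To make the objective concrete, here is a sketch of the stacked system on synthetic data. It assumes `lhs` and `rhs` are plain arrays and that `lhs.unstack().values` stacks the dependent variables column by column; the sizes and the random data are hypothetical. With unrestricted coefficients, the stacked sum of squared residuals should equal the sum of the per-equation ones:

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, R = 6, 2, 3                      # hypothetical sizes: periods, equations, regressors
rhs = rng.normal(size=(T, R))          # stand-in for the regressor matrix
lhs = rng.normal(size=(T, N))          # stand-in for the dependent variables

# Assumption: column-by-column stacking, mirroring lhs.unstack().values
y = lhs.flatten(order="F")
# Block-diagonal design: one copy of rhs per equation, shape (T*N, N*R)
x = np.kron(np.eye(N), rhs)


def ssr(params, y, x):
    # Sum of squared residuals of the stacked system
    res = y - x @ params
    return np.sum(res ** 2)


# Equation-by-equation OLS coefficients, stacked in the same order as y
beta_ols = np.linalg.lstsq(rhs, lhs, rcond=None)[0]    # shape (R, N)
params = beta_ols.flatten(order="F")

# Per-equation residual sums, for comparison with the stacked objective
per_equation = sum(
    np.sum((lhs[:, j] - rhs @ beta_ols[:, j]) ** 2) for j in range(N)
)
print(np.isclose(ssr(params, y, x), per_equation))
```

The parameter vector has `N * R` entries, which is why `gfit()` builds `N * R` bounds.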
When I import the class in `main.py` and call the `gfit()` method, which runs `differential_evolution`, like this:
#### Estimate GlobalRestrictedSUR
from functions import GlobalRestrictedSUR

lhs = ...
rhs = ...
grmodel = GlobalRestrictedSUR(lhs, rhs,
                              beta_bound=(-100, 100),
                              maxiter=int(1e3))
grmodel.gfit(ftol=1e-9)
it works only for `workers=1`. For `workers>1`, I get the error:
Process SpawnPoolWorker-210:
Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/opt/anaconda3/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/anaconda3/lib/python3.8/multiprocessing/pool.py", line 114, in worker
    task = get()
  File "/opt/anaconda3/lib/python3.8/multiprocessing/queues.py", line 358, in get
    return _ForkingPickler.loads(res)
ModuleNotFoundError: No module named 'functions'
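One way to read the traceback: the worker fails inside `_ForkingPickler.loads`, that is, while unpickling the task it was sent. Pickle stores a class by reference (module name plus class name) rather than by value, so whoever unpickles a bound method must be able to import the module that defines the class. The sketch below illustrates this with a hypothetical `Model` class standing in for `GlobalRestrictedSUR`. It does not reproduce the error, and whether this is the actual cause here is an assumption.

```python
import pickle


class Model:
    # Hypothetical stand-in for GlobalRestrictedSUR
    def __init__(self, scale):
        self.scale = scale

    def ssr(self, params):
        # Toy objective: sum of squared, scaled parameters
        return sum((self.scale * p) ** 2 for p in params)


model = Model(2.0)

# A bound method is pickled as "look up 'ssr' on this instance", and the
# instance itself is pickled as a reference to its module and class name.
payload = pickle.dumps(model.ssr)

print(Model.__module__.encode() in payload)   # module name travels in the payload
print(b"Model" in payload)                    # so does the class name

# Unpickling succeeds here only because the defining module is importable
# in this process; a worker without access to it would fail at this step.
restored = pickle.loads(payload)
print(restored([1.0, 2.0]))
```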
If everything were in `main.py`, I could solve the issue by using `if __name__ == '__main__':`. But that is not possible here because the class is in `functions.py`. I tried `if __name__ == 'functions':` instead, but that did not help. For completeness, here is the code for `GlobalRestrictedSUR`.
import numpy as np
from scipy.optimize import differential_evolution


class GlobalRestrictedSUR():

    def __init__(self, lhs, rhs, beta_bound=(-20, 20), maxiter=100):
        # To be filled with results
        self.result = None
        # Declare data
        self.lhs = lhs
        self.rhs = rhs
        self.beta_bound = beta_bound
        self.maxiter = maxiter
        # Declare dimensions
        self.T_dep, self.N = self.lhs.shape  # T is time, N is the number of dependent variables
        self.T_ind, self.R = self.rhs.shape  # R is the number of regressors
        # Convenient shapes for optimization
        self.y = lhs.unstack().values
        self.x = np.kron(np.eye(self.N), self.rhs)

    def ssr(self, params, y, x):
        # Get residuals
        res = y - x @ params
        return np.sum(res ** 2)

    def gfit(self, ftol=1e-6):
        # if __name__ == '__main__':
        # Get a DE solution
        bounds = [self.beta_bound] * (self.N * self.R)
        result = differential_evolution(self.ssr, bounds,
                                        args=(self.y, self.x),
                                        maxiter=self.maxiter,
                                        tol=ftol,
                                        workers=2,
                                        seed=0,
                                        disp=True)
        self.result = result
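For reference, the guard pattern mentioned above normally sits in the script that launches the optimization, not in the imported module. This is a minimal sketch on a toy least-squares problem, not the project's code: `toy_ssr`, `run_fit`, and the synthetic data are hypothetical.

```python
import numpy as np
from scipy.optimize import differential_evolution


def toy_ssr(params, y, x):
    # Module-level objective, so worker processes can look it up by name
    res = y - x @ params
    return np.sum(res ** 2)


def run_fit(workers=2):
    # Synthetic regression data (hypothetical, for illustration only)
    rng = np.random.default_rng(0)
    x = rng.normal(size=(50, 2))
    y = x @ np.array([1.5, -0.5]) + 0.1 * rng.normal(size=50)
    bounds = [(-100, 100)] * x.shape[1]
    # updating="deferred" is the update mode used for parallel evaluation
    return differential_evolution(toy_ssr, bounds, args=(y, x),
                                  workers=workers, updating="deferred",
                                  seed=0)


if __name__ == "__main__":
    # Only the process that launches the script reaches this branch;
    # spawned workers re-import the module without restarting the fit.
    result = run_fit()
    print(result.x)
```

In the setup described here, the equivalent would be wrapping the `grmodel.gfit(...)` call in `main.py` in such a guard, which is exactly what the project structure is said to rule out.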
question from:
https://stackoverflow.com/questions/65626818/how-to-run-scipy-differential-evolution-in-parallel-when-loaded-from-a-module-c