Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
504 views
in Technique[技术] by (71.8m points)

python - Fitting a curve to a power-law distribution with curve_fit does not work

I am trying to find a curve fitting my data that visually seem to have a power law distribution.

enter image description here

I hoped to utilize scipy.optimize.curve_fit, but no matter what function or data normalization I try, I am getting either a RuntimeError (parameters not found or overflow) or a curve that does not fit my data even remotely. Please help me to figure out what I am doing wrong here.

%matplotlib inline
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

df = pd.DataFrame({
            'x': [ 1000, 3250, 5500, 10000, 32500, 55000, 77500, 100000, 200000 ],
            'y': [ 1100, 500, 288, 200, 113, 67, 52, 44, 5 ]
        })
df.plot(x='x', y='y', kind='line', style='--ro', figsize=(10, 5))

def func_powerlaw(x, m, c, c0):
    return c0 + x**m * c

target_func = func_powerlaw

X = df['x']
y = df['y']

popt, pcov = curve_fit(target_func, X, y)

plt.figure(figsize=(10, 5))
plt.plot(X, target_func(X, *popt), '--')
plt.plot(X, y, 'ro')
plt.legend()
plt.show()

Output

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-243-17421b6b0c14> in <module>()
     18 y = df['y']
     19 
---> 20 popt, pcov = curve_fit(target_func, X, y)
     21 
     22 plt.figure(figsize=(10, 5))

/Users/evgenyp/.virtualenvs/kindle-dev/lib/python2.7/site-packages/scipy/optimize/minpack.pyc in curve_fit(f, xdata, ydata, p0, sigma, absolute_sigma, check_finite, bounds, method, **kwargs)
    653         cost = np.sum(infodict['fvec'] ** 2)
    654         if ier not in [1, 2, 3, 4]:
--> 655             raise RuntimeError("Optimal parameters not found: " + errmsg)
    656     else:
    657         res = least_squares(func, p0, args=args, bounds=bounds, method=method,

RuntimeError: Optimal parameters not found: Number of calls to function has reached maxfev = 800.
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

As the traceback states, the maximum number of function evaluations was reached without finding a stationary point (to terminate the algorithm). You can increase the maximum number using the option maxfev. For this example, setting maxfev=2000 is large enough to successfully terminate the algorithm.

However, the solution is not satisfactory. This is due to the algorithm choosing a (default) initial estimate for the variables, which, for this example, is not good (the large number of iterations required is an indicator of this). Providing another initialization point (found by simple trial and error) results in a good fit, without the need to increase maxfev.

The two fits and a visual comparison with the data is shown below.

x = np.asarray([ 1000, 3250, 5500, 10000, 32500, 55000, 77500, 100000, 200000 ])
y = np.asarray([ 1100, 500, 288, 200, 113, 67, 52, 44, 5 ])

sol1 = curve_fit(func_powerlaw, x, y, maxfev=2000 )
sol2 = curve_fit(func_powerlaw, x, y, p0 = np.asarray([-1,10**5,0]))

enter image description here


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...