It is desired to fit a non-linear multivariate function of the form y' = f(x, c0, c1,..., c_k) through the points (x, y)
by least squares with the unknown parameters $$c_0, ... c_{k-1}$$ such that
$$T = \sum_{i=0}^{n-1}([y_i- f(x, c_0, c_1, ..., c_{k-1})]^2$$
is minimized. Scipy has an easy to use function from the optimize module to find optimal values of the
fitting parameters by the Levenberg-Marquardt algorithm. It has the following calling format:
curve_fit(f, xdata, ydata, p0=None, sigma=None, **kw)
where the parameters are:
f - The model function, f(x, ...). xdata - an array of predetermined values of length n. ydata : an array of dependentdent data. p0 : None, scalar, or M-length sequence Initial guess for the parameters with a default value of 1 if not given. sigma : standard deviation of dependent data which is used as weighting f. If not specified, uniform weights will be applied.
The function returns a pair, (popt, pcov) where
popt : output array of optimal fitted parameters pcov : variance-covariance matrix of fitted parameters. the variances of the parameters are along the diagonal.
Here is the built-in example (with known parameters! but with added normal errors) for an exponential function $$y =a e^{-b x} +c$$
with three fitting parameters. It is interesting to see how the curve_fit() function can recover the parameters (not exactly the original
values because of random deviations added to the data).
import numpy as np from scipy.optimize import curve_fit def func(x, a, b, c): return a*np.exp(-b*x) + c if __name__ == "__main__": x = np.linspace(0,4,50) y = func(x, 2.5, 1.3, 0.5) yn = y + 0.2*np.random.normal(size=len(x)) popt, pcov = curve_fit(func, x, yn) print "Fitted parameters approximations:", popt print "Variance-covariance matrix:" print pcov
When the above example is run,it prints the following:
$ python curvefit.py
Fitted parameters approximations: [ 2.29258582 1.34085286 0.58836951]
Variance-covariance matrix:
[[ 0.01242794 0.00585958 -0.00061056]
[ 0.00585958 0.01945042 0.00477259]
[-0.00061056 0.00477259 0.00225576]]
~/Blogs/scipy/fitting$
But there is a catch for Ubuntu 10.04 users. The version of scipy installed is only at 0.7.0-2 while the curve_fit function
is available only with a later version 0.8.0. One has to temporarily remove the scipy-numpy and associated modules
and reinstall them again. I have difficulty installing matplotlib, so let's hope that Ubuntu package maintainers will be able to incorporate the latest statble versions of scipy, numpy and matplotlib in the next Maverick Meerkat version which is available on Monday!