Note: There was failure in running the example in my system consisting of Python 2.7 and the 0.3.1 version of AlgoPy.
Introduction
I had a problem determining the gradient of the ordinal logistic model and I could not find online references on the formula for the gradient. You see it takes around 6 seconds to compute the parameters of model using the BFGS (Broyden, Fletcher, Goldfarb, Shanno) optimizer.
I have heard of ADOL-C before for automatic differentiation, in fact I was browsing the second hand Book Sale and I found a reference book with Java-Forte programs on a CD. That turned me off, since Forte is relative old IDE and I decided NOT to buy it. So lets try an automatic differentiation package for Python.
The home page is at http://packages.python.org/algopy/ homepage.
The documentation at the homepage states the purpose of AlgoPy nicely:
The purpose of AlgoPy is the evaluation of higher-order derivatives in the forward and reverse mode of Algorithmic Differentiation (AD) of functions that are implemented as Python programs. Particular focus are functions that contain numerical linear algebra functions as they often appear in statistically motivated functions. The intended use of AlgoPy is for easy prototyping at reasonable execution speeds. More precisely, for a typical program a directional derivative takes order 10 times as much time as time as the function evaluation. This is approximately also true for the gradient.
Capabilities of AlgoPy
AlgoPy promises to compute the following if possible the following:- evaluation of derivatives useful for nonlinear continuous optimization
- gradient
- Jacobian
- Hessian
- Jacobian vector product
- vector Jacobian product
- Hessian vector product
- vector Hessian vector product
- higher-order tensors
- Taylor series evaluation
Installation
I am using the latest Ubuntu OS (Oneiric Ocelot) on my laptop which serves as my main computer. Try to see the version of your installed default Python by issuing python --version. I see that I have 2.7.2+ Now try this:
sudo easy_install-2.7 algopyThat's all! Here are the terminal output messages
Actually easy_install has three versions in my machine, namely easy_install easy_install-2.6 easy_install-2.7 I got the following error for instance by typing easy_install only:sudo easy_install-2.7 algopy Searching for algopy Reading http://pypi.python.org/simple/algopy/ Reading http://packages.python.org/algopy Reading http://www.github.com/b45ch1/algopy Best match: algopy 0.3.1 Downloading http://pypi.python.org/packages/source/a/algopy/algopy-0.3.1.zip#md5=84edf4722e0f67ae254d663906893ff5 Processing algopy-0.3.1.zip Running algopy-0.3.1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-SvuPun/algopy-0.3.1/egg-dist-tmp-8pSEhA .dev warning: manifest_maker: MANIFEST.in, line 1: 'recursive-include' expects... zip_safe flag not set; analyzing archive contents... algopy.__init__: module references __file__ algopy.tracer.tests.environment: module references __file__ Adding algopy 0.3.1 to easy-install.pth file Installed /usr/local/lib/python2.7/dist-packages/algopy-0.3.1-py2.7.egg Processing dependencies for algopy Finished processing dependencies for algopy
It is much better to install easy_install to your system than downloading the tarred package from the homepage and unpacking and installing it via python setup.py installsudo easy_install algopy [sudo] password for toto: Traceback (most recent call last): File "/usr/local/bin/easy_install", line 5, infrom pkg_resources import load_entry_point File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 2676, in parse_requirements(__requires__), Environment() File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 552, in resolve raise DistributionNotFound(req)
Trying out an example
When the above is executed in my laptop, it prints to the console or terminal the following:import numpy, algopy from algopy import UTPM, exp def eval_f(x): """ some function """ return x[0]*x[1]*x[2] + exp(x[0])*x[1] # forward mode without building the computational graph # ----------------------------------------------------- x = UTPM.init_jacobian([3,5,7]) y = eval_f(x) algopy_jacobian = UTPM.extract_jacobian(y) print 'jacobian = ',algopy_jacobian # reverse mode using a computational graph # ---------------------------------------- # STEP 1: trace the function evaluation cg = algopy.CGraph() x = algopy.Function([1,2,3]) y = eval_f(x) cg.trace_off() cg.independentFunctionList = [x] cg.dependentFunctionList = [y] # STEP 2: use the computational graph to evaluate derivatives print 'gradient =', cg.gradient([[3.,5,7]]) print 'Jacobian =', cg.jacobian([[3.,5,7]]) print 'Hessian =', cg.hessian([[3.,5.,7.]]) print 'Hessian vector product =', cg.hess_vec([[3.,5.,7.]],[[4,5,6]])
. >>> import numpy, algopy >>> from algopy import UTPM, exp >>> >>> def eval_f(x): ... """ some function """ ... return x[0]*x[1]*x[2] + exp(x[0])*x[1] ... >>> # forward mode without building the computational graph ... # ----------------------------------------------------- ... x = UTPM.init_jacobian([3,5,7]) >>> y = eval_f(x) >>> algopy_jacobian = UTPM.extract_jacobian(y) >>> print 'jacobian = ',algopy_jacobian jacobian = [ 135.42768462 41.08553692 15. ] >>> >>> # reverse mode using a computational graph ... # ---------------------------------------- ... >>> # STEP 1: trace the function evaluation ... cg = algopy.CGraph() >>> x = algopy.Function([1,2,3]) >>> y = eval_f(x) >>> cg.trace_off()>>> cg.independentFunctionList = [x] >>> cg.dependentFunctionList = [y] >>> >>> # STEP 2: use the computational graph to evaluate derivatives ... print 'gradient =', cg.gradient([[3.,5,7]]) gradient = [array([ 135.42768462, 41.08553692, 15. ])] >>> print 'Jacobian =', cg.jacobian([[3.,5,7]]) Jacobian = Traceback (most recent call last): File " ", line 1, in File "/usr/local/lib/python2.7/dist-packages/algopy-0.3.1-py2.7.egg/algopy/tracer/tracer.py", line 262, in jacobian self.pushforward(utpm_x_list) File "/usr/local/lib/python2.7/dist-packages/algopy-0.3.1-py2.7.egg/algopy/tracer/tracer.py", line 103, in pushforward f.__class__.pushforward(f.func, f.args, Fout = f) File "/usr/local/lib/python2.7/dist-packages/algopy-0.3.1-py2.7.egg/algopy/tracer/tracer.py", line 637, in pushforward out = func(*args) File "/usr/local/lib/python2.7/dist-packages/algopy-0.3.1-py2.7.egg/algopy/utpm/utpm.py", line 73, in __getitem__ tmp = self.data.__getitem__((slice(None),slice(None)) + tuple(sl)) IndexError: invalid index >>> print 'Hessian =', cg.hessian([[3.,5.,7.]]) Hessian = Traceback (most recent call last): File " ", line 1, in File "/usr/local/lib/python2.7/dist-packages/algopy-0.3.1-py2.7.egg/algopy/tracer/tracer.py", line 349, in hessian self.pushforward(utpm_x_list) File "/usr/local/lib/python2.7/dist-packages/algopy-0.3.1-py2.7.egg/algopy/tracer/tracer.py", line 103, in pushforward f.__class__.pushforward(f.func, f.args, Fout = f) File "/usr/local/lib/python2.7/dist-packages/algopy-0.3.1-py2.7.egg/algopy/tracer/tracer.py", line 637, in pushforward out = func(*args) File "/usr/local/lib/python2.7/dist-packages/algopy-0.3.1-py2.7.egg/algopy/utpm/utpm.py", line 73, in __getitem__ tmp = self.data.__getitem__((slice(None),slice(None)) + tuple(sl)) IndexError: invalid index >>> print 'Hessian vector product =', cg.hess_vec([[3.,5.,7.]],[
Ouch, we have fatal errors! We do have the latest version of AlgoPy (0.3.1). But you may be luckier than me.
We are a little sorry for this post. But we will come back soon, we need to visit the mailing list or the algopy group (if there is). We will come back soon!
Wednesday, March 7, 2012
Its about time Python get serious with Statistics! Introducing Scikits.statsmodels (Linux Ubuntu)), Part 1
You can of course perform powerful statistical routines using Python, Rpy module and R. But greater joy is now coming to pure Python users with the Scikits.statsmodels package.
Download the software scikits.statsmodels module
In a terminal type wget -ct 0 http://pypi.python.org/packages/source/s/scikits.statsmodels/scikits.statsmodels-0.3.1.tar.gz#md5=1f55b53d161544b95ca2709c9731c00c Untar it,then descend to the directory. Test the installed module
Lets try this from ipython(Again, do a sudo apt-get install ipython if it is not yet installed. Additional resources
Home page of pystatsmodels: http://statsmodels.sourceforge.net/ Subscribe to the Google group: http://groups.google.com/group/pystatsmodels
The regression example from the homepage
It is time to show what statsmodels can do. Here is a regression example from the home page: http://statsmodels.sourceforge.net/
- Install the scikits.statsmodels module Be sure that your versions of scipy and numpy are the latest or later than the following:
Python >= 2.5 NumPy >= 1.4.0 SciPy >= 0.7You can install the latest versions by typing sudo apt-get install python python-setuptools python-numpy python-scipy
tar -xvvf scikits.statsmodels-0.3.1.tar.gz sudo python setup.py install
import scikits.statsmodels.api as sm help(sm)The following will be printed:
Help on module scikits.statsmodels.api in scikits.statsmodels: NAME scikits.statsmodels.api - Statistical models FILE /home/toto/testsoftware/scikits.statsmodels-0.3.1/scikits/statsmodels/api.py DESCRIPTION - standard `regression` models - `GLS` (generalized least squares regression) - `OLS` (ordinary least square regression) - `WLS` (weighted least square regression) - `GLASAR` (GLS with autoregressive errors model) - `GLM` (generalized linear models) - robust statistical models - `RLM` (robust linear models using M estimators) - `robust.norms` estimates - `robust.scale` estimates (MAD, Huber's proposal 2). - sandbox models - `mixed` effects models - `gam` (generalized additive modelWhat is available by importing sm? Type dir(sm) and the following functions or attibutes will be listed:
In [17]: dir(sm) Out[17]: ['GLM', 'GLS', 'GLSAR', 'Logit', 'MNLogit', 'OLS', 'Poisson', 'Probit', 'RLM', 'WLS', '__builtins__', '__doc__', '__file__', '__name__', '__package__', 'add_constant', 'categorical', 'datasets', 'families', 'iolib', 'nonparametric', 'regression', 'robust', 'test', 'tools', 'tsa', 'version']
import numpy as np import scikits.statsmodels.api as sm # get data nsample = 100 x = np.linspace(0,10, 100) X = sm.add_constant(np.column_stack((x, x**2))) beta = np.array([1, 0.1, 10]) y = np.dot(X, beta) + np.random.normal(size=nsample) # run the regression results = sm.OLS(y, X).fit() # look at the results print results.summary()Here is the results by running the above published example:
Summary of Regression Results ======================================= | Dependent Variable: ['y']| | Model: OLS| | Method: Least Squares| | Date: Wed, 07 Mar 2012| | Time: 17:58:02| | # obs: 100.0| | Df residuals: 97.0| | Df model: 2.0| ============================================================================== | coefficient std. error t-statistic prob. | ------------------------------------------------------------------------------ | x1 1.060 0.1353 7.8344 0.0000 | | x2 0.09398 0.01309 7.1802 0.0000 | | const 9.848 0.2927 33.6490 0.0000 | ============================================================================== | Models stats Residual stats | ------------------------------------------------------------------------------ | R-squared: 0.9729 Durbin-Watson: 1.803 | | Adjusted R-squared: 0.9724 Omnibus: 3.108 | | F-statistic: 1742. Prob(Omnibus): 0.2114 | | Prob (F-statistic): 9.776e-77 JB: 2.791 | | Log likelihood: -139.9 Prob(JB): 0.2477 | | AIC criterion: 285.8 Skew: -0.4176 | | BIC criterion: 293.6 Kurtosis: 2.927 | ------------------------------------------------------------------------------So far so good. We hope that the statsmodels module be more robust, easier to use and inspiring enough for Python users users to contribute code!
Subscribe to:
Posts (Atom)