Thursday, August 19, 2010

Installing matplotlib on a web server.

Matplotlib is the most complete 2D plotting package available for Python at the moment.
You will not encounter many problems installing and running matplotlib on a desktop, but on a remote server you might run into issues, such as missing or misconfigured tz (timezone) packages.

So we will document installing the matplotlib plotting package from source.

Step 1. Get the sources!
wget 'http://downloads.sourceforge.net/project/matplotlib/matplotlib/matplotlib-1.0/matplotlib-1.0.0.tar.gz?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Fmatplotlib%2Ffiles%2F&ts=1282255399&mirror=nchc'


Resolving downloads.sourceforge.net... 216.34.181.59
Connecting to downloads.sourceforge.net|216.34.181.59|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://surfnet.dl.sourceforge.net/project/matplotlib/matplotlib/matplotlib-1.0/matplotlib-1.0.0.tar.gz [following]
--2010-08-19 22:11:56-- http://surfnet.dl.sourceforge.net/project/matplotlib/matplotlib/matplotlib-1.0/matplotlib-1.0.0.tar.gz
Resolving surfnet.dl.sourceforge.net... 130.59.138.21, 2001:620:0:1b::21
Connecting to surfnet.dl.sourceforge.net|130.59.138.21|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 12918842 (12M) [application/x-gzip]
Saving to: `matplotlib-1.0.0.tar.gz'

100%[==========================================================================================================================================>] 12,918,842 3.26M/s in 6.3s

2010-08-19 22:12:03 (1.95 MB/s) - `matplotlib-1.0.0.tar.gz' saved [12918842/12918842]

Step 2. Untar the package.

tar xzvf matplotlib-1.0.0.tar.gz. This will create the matplotlib-1.0.0 subdirectory.

Step 3. Install and test.

cd to the matplotlib-1.0.0 subdirectory and issue sudo python setup.py install.
Don't be alarmed by the many C/C++ warnings you may see.
Now run python and import the matplotlib module. On my system, there are no errors.
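A quick check from the shell (assuming the python on your PATH is the same interpreter you installed against) is this one-liner, which should print 1.0.0 if the install above succeeded:

python -c "import matplotlib; print matplotlib.__version__"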

I resorted to installing from source after spending almost half a day trying to do it with
apt-get and even with easy_install. I guess there are some destination directory mismatches!

Now if you get a display not defined error after installation, issue export DISPLAY=:1.0 and reinstall.
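On a headless server, an alternative that avoids the DISPLAY problem altogether is to select a non-interactive backend before importing pyplot. A minimal sketch (the filename test.png is just an example):

import matplotlib
matplotlib.use('Agg')    # raster backend that renders to files; no X display needed
import matplotlib.pyplot as plt

plt.plot([1, 2, 3, 4], [1, 4, 9, 16])
plt.savefig('test.png')  # write the figure to disk instead of opening a window

The backend can also be set permanently with the backend entry in the matplotlibrc file.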

We will post our attempts to generate plots and graphs with matplotlib on a web server.
Stay tuned!

For finer details on the installation of matplotlib in a server environment, with an example of generating a line graph, please visit http://linux.byexamples.com/archives/404/python-generating-graphs-with-matplotlib/

Wednesday, August 18, 2010

NLopt on the Rosenbrock function

Here is our first attempt at using NLopt on the Rosenbrock banana function, which we
minimized last time using the Nelder-Mead solver implemented in the scipy.optimize.fmin function.



import nlopt
from numpy import *

# The Rosenbrock function. NLopt passes a gradient array as the second
# argument; derivative-free algorithms such as Nelder-Mead leave it empty,
# so we can ignore it here.
def nlopt_rosen_fg(x, grad):
    return sum(100.0*(x[1:]-x[:-1]**2.0)**2.0 + (1-x[:-1])**2.0)

x0 = [1.3, 0.7, 0.8, 1.9, 1.2]

opt = nlopt.opt(nlopt.LN_NELDERMEAD, len(x0))
opt.set_min_objective(nlopt_rosen_fg)
opt.set_xtol_rel(1.0e-4)
opt.set_maxeval(500)
xopt = opt.optimize(x0)    # returns the optimized vector; x0 is not modified in place
opt_val = opt.last_optimum_value()
opt_result = opt.last_optimize_result()
print
print opt_val
print xopt
print opt_result

When this program was first run with print x0 in place of print xopt, the following output was obtained:

python rosenbrock-nlopt.py 

1.22270197534e-08
[1.3, 0.69999999999999996, 0.80000000000000004, 1.8999999999999999, 1.2]
4

The return code of 4 means that xtol has been achieved. At first the printed vector is puzzling: it is far from all ones even though the function value is near zero. The explanation is that the nlopt Python bindings do not modify x0 in place; opt.optimize() returns the optimized vector, so printing x0 merely shows the initial guess. Capturing the return value as xopt, as in the listing above, gives the actual minimizer, which is consistent with the near-zero optimum value.

In the docs, it is not stated how to find out how many function evaluations were actually taken at the point of termination.
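For reference, nlopt exposes its return codes as named constants in the Python module, so the integer 4 can be checked symbolically. A small sketch, assuming the nlopt 2.x Python bindings:

if opt_result == nlopt.XTOL_REACHED:
    print 'stopped: xtol_rel tolerance satisfied'
elif opt_result == nlopt.MAXEVAL_REACHED:
    print 'stopped: maxeval limit reached'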

Scipy optimize

Scipy


Scipy is a big Python library often compared with Matlab. Its homepage is at http://www.scipy.org.
It is sponsored by the company Enthought, but the software itself is open source. Its major fields of application are numerical computing in mathematics, science, and engineering.

Scipy depends on NumPy (the successor to Numeric), which provides fast array manipulation. Scipy is not a pure Python library: rather than rewriting from scratch in pure Python the reliable algorithms already implemented in Fortran and C in the EISPACK, LAPACK, ATLAS, and FFTW libraries, Scipy offers Python interfaces to these compiled native routines. The result is that Scipy is fast! Scipy installs and runs on most architectures and, what's more, it is free. More and more scientific computing people are using Scipy each day. Many users have encountered problems installing Scipy, and these are compounded if you don't have enough memory and disk space, or if your chosen OS distribution does not have a good package manager.
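As a small illustration of what this wrapping buys, a call like scipy.linalg.eig hands the heavy lifting to compiled LAPACK routines (the matrix here is just a made-up example):

import numpy as np
from scipy import linalg

a = np.array([[2.0, 1.0],
              [1.0, 3.0]])
w, v = linalg.eig(a)   # eigenvalue decomposition, done by LAPACK under the hood
print w                # eigenvalues
print v                # eigenvectors, one per column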

People wonder why Scipy is not as well known or as widely adopted as R. I guess we can blame it on freedom of choice: there is no standard GUI for Scipy, for example, and there is no single plotting package for it. You can use matplotlib, Chaco, and others with Scipy.

Scipy Optimize optimization library

Well, in this article we shall focus on the optimization library called "optimize". Various algorithms are available, from Nelder-Mead to BFGS to Newton conjugate gradient and others.
Let's try the example in the tutorial which uses the Nelder-Mead algorithm and requires function evaluations only.


The Rosenbrock function and fmin (Nelder-Mead)

The definition of the Rosenbrock function is given by
$$f(x) = \sum_{i=1}^{n-1} 100\,(x_i - x_{i-1}^2)^2 + (1 - x_{i-1})^2$$
Here is the implementation code from the tutorial:

from numpy import *
from scipy.optimize import fmin

def rosen(x):
    # The Rosenbrock function, written in vector form over the whole array.
    return sum(100.0*(x[1:]-x[:-1]**2.0)**2.0 + (1-x[:-1])**2.0)

def rosen_der(x):
    # Analytic gradient of the Rosenbrock function.
    xm = x[1:-1]
    xm_m1 = x[:-2]
    xm_p1 = x[2:]
    der = zeros(x.shape, x.dtype)   # x.dtype replaces the old Numeric x.typecode()
    der[1:-1] = 200*(xm-xm_m1**2) - 400*(xm_p1 - xm**2)*xm - 2*(1-xm)
    der[0] = -400*x[0]*(x[1]-x[0]**2) - 2*(1-x[0])
    der[-1] = 200*(x[-1]-x[-2]**2)
    return der


x0 = [1.3, 0.7, 0.8, 1.9, 1.2]
xopt = fmin(rosen, x0)
print xopt
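
As a quick sanity check of the vectorized definition (using the rosen and x0 defined above), the function should be exactly zero at the all-ones global minimizer and large at the starting point:

print rosen(array([1.0, 1.0, 1.0, 1.0, 1.0]))   # 0.0 at the global minimum
print rosen(array(x0))                          # about 848.22 at the starting point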

Function rosen() evaluates the famous Rosenbrock "Banana" function. The dimension of the problem is implied by the size of the input numpy array x. Notice that the function is written in vector form, which may be unfamiliar to Python programmers outside of scientific computing.
Function rosen_der() returns the derivative, or gradient, vector, which we shall use in the future. Again, vector processing results in more compact code, which we hope becomes easier to read as one gains exposure to this style of programming. When the above code is run, it prints the following output on my computer:

python rosenbrock.py
Optimization terminated successfully.
Current function value: 0.000066
Iterations: 141
Function evaluations: 243
[ 0.99910115 0.99820923 0.99646346 0.99297555 0.98600385]

Contrast this with the output from the tutorial:

x0 = [1.3, 0.7, 0.8, 1.9, 1.2]
>>> xopt = fmin(rosen, x0)
Optimization terminated successfully.
Current function value: 0.000000
Iterations: 516
Function evaluations: 825
>>> print xopt
[ 1.  1.  1.  1.  1.]

Why the difference? We think a number of things have changed since the tutorial was written, such as the default values for the input arguments. If you look at the help for fmin under IPython:

Documentation for fmin

fmin(func, x0, args=(), xtol=0.0001, ftol=0.0001, maxiter=None, maxfun=None, full_output=0, disp=1, retall=0, callback=None)
    Minimize a function using the downhill simplex algorithm.
    
    :Parameters:
    
      func : callable func(x,*args)
          The objective function to be minimized.
      x0 : ndarray
          Initial guess.
      args : tuple
          Extra arguments passed to func, i.e. ``f(x,*args)``.
      callback : callable
          Called after each iteration, as callback(xk), where xk is the
          current parameter vector.
    
    :Returns: (xopt, {fopt, iter, funcalls, warnflag})
    
      xopt : ndarray
          Parameter that minimizes function.
      fopt : float
          Value of function at minimum: ``fopt = func(xopt)``.
      iter : int
          Number of iterations performed.
      funcalls : int
          Number of function calls made.
      warnflag : int
          1 : Maximum number of function evaluations made.
          2 : Maximum number of iterations reached.
      allvecs : list
          Solution at each iteration.
    
    *Other Parameters*:
    
      xtol : float
          Relative error in xopt acceptable for convergence.
      ftol : number
          Relative error in func(xopt) acceptable for convergence.
      maxiter : int
          Maximum number of iterations to perform.
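The docstring also shows how to recover the function value, iteration count, and number of function calls programmatically rather than reading them from the printout. A small sketch using full_output (disp=0 suppresses the convergence message):

xopt, fopt, niter, funcalls, warnflag = fmin(rosen, x0, full_output=1, disp=0)
print fopt, niter, funcalls, warnflag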


Calling the fmin function with both xtol and ftol set to 1.0e-5 resulted in output that matches the tutorial much more closely.
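The modified call, replacing the fmin line in the script above, is:

xopt = fmin(rosen, x0, xtol=1.0e-5, ftol=1.0e-5)
print xopt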

python rosenbrock.py
Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 267
         Function evaluations: 444
[ 1.00000002  0.99999975  0.99999956  0.9999988   0.9999975 ]