Friday, February 11, 2011

Hooray for Libre Office 3.3!

Take a look at the impressive list of new features in LibreOffice, the open-source and still-free fork of the OpenOffice.org (OO) package. OO is now under the control of Oracle, the new owner of Sun Microsystems and its software products, and we are glad the fork will develop into a more open and inviting office productivity suite.

The full list of features and fixes (relative to the last OO version) is available at
http://www.libreoffice.org/download/new-features-and-fixes/

We are still downloading the software as we write this, and should be able to add more to this blog post once it is installed.


Update: You don't have to download any .deb packages manually if you are using the latest Ubuntu! I wasted time downloading them and installing manually via dpkg -i.
Instead, consult https://wiki.ubuntu.com/LibreOffice; LibreOffice is part of the latest Ubuntu release! The first step is to add the package repository so we can receive the latest updates.

Here are the steps:

  1. Add the LibreOffice package repository:
     sudo add-apt-repository ppa:libreoffice/ppa
  2. Update the package index:
     sudo apt-get update
  3. Install the package:
     sudo apt-get install libreoffice
  4. At this point, however, LibreOffice is not yet available from the Applications menu! You still have to install the GNOME interface package:
     sudo apt-get install libreoffice-gnome

Wednesday, September 22, 2010

Scipy curve_fit for easy nonlinear least squares fitting.

Assume we are given an input or independent vector x and a response or dependent vector y.
We wish to fit a nonlinear multivariate function of the form y' = f(x, c_0, c_1, ..., c_{k-1}) through the points (x, y)
by least squares in the unknown parameters $$c_0, ..., c_{k-1}$$ such that

$$T = \sum_{i=0}^{n-1}[y_i - f(x_i, c_0, c_1, ..., c_{k-1})]^2$$

is minimized. Scipy has an easy-to-use function in the optimize module that finds optimal values of the
fitting parameters by the Levenberg-Marquardt algorithm. It has the following calling format:

curve_fit(f, xdata, ydata, p0=None, sigma=None, **kw)

where the parameters are:

f : callable
    The model function, f(x, ...).
xdata : array
    The independent data, an array of length n.
ydata : array
    The dependent data.
p0 : None, scalar, or M-length sequence
    Initial guess for the parameters; each defaults to 1 if not given.
sigma : array, optional
    Standard deviations of the dependent data, used as weights in the fit.
    If not specified, uniform weights are applied.

The function returns a pair (popt, pcov), where:

popt : array
    Optimal values of the fitted parameters.
pcov : 2-D array
    The variance-covariance matrix of the fitted parameters; the variances
    of the parameters lie along the diagonal.

Here is the built-in example (with known parameters, but with added normal errors) for an exponential function $$y = a e^{-b x} + c$$
with three fitting parameters. It is interesting to see how the curve_fit() function recovers the parameters (not exactly the original
values, because of the random deviations added to the data).

import numpy as np
from scipy.optimize import curve_fit

def func(x, a, b, c):
    # exponential model with three fitting parameters
    return a*np.exp(-b*x) + c

if __name__ == "__main__":
    x = np.linspace(0, 4, 50)
    y = func(x, 2.5, 1.3, 0.5)                   # exact parameter values
    yn = y + 0.2*np.random.normal(size=len(x))   # add normal errors
    popt, pcov = curve_fit(func, x, yn)

    print "Fitted parameters approximations:", popt
    print "Variance-covariance matrix:"
    print pcov

When the above example is run, it prints the following:

$ python curvefit.py
Fitted parameters approximations: [ 2.29258582 1.34085286 0.58836951]
Variance-covariance matrix:
[[ 0.01242794 0.00585958 -0.00061056]
[ 0.00585958 0.01945042 0.00477259]
[-0.00061056 0.00477259 0.00225576]]
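Since the variances of the fitted parameters sit along the diagonal of pcov, approximate one-sigma standard errors can be pulled out with np.sqrt(np.diag(pcov)). Here is a small sketch re-running the fit above (with a fixed random seed, so the exact numbers differ from the run shown):

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, a, b, c):
    # same exponential model as above
    return a * np.exp(-b * x) + c

x = np.linspace(0, 4, 50)
y = func(x, 2.5, 1.3, 0.5)
np.random.seed(0)                 # fixed seed so the run is repeatable
yn = y + 0.2 * np.random.normal(size=len(x))

popt, pcov = curve_fit(func, x, yn)
perr = np.sqrt(np.diag(pcov))     # one-sigma standard errors

for name, val, err in zip("abc", popt, perr):
    print("%s = %.4f +/- %.4f" % (name, val, err))
```

The recovered values should land within a standard error or two of the true parameters (2.5, 1.3, 0.5).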




But there is a catch for Ubuntu 10.04 users. The version of scipy packaged is only 0.7.0-2, while the curve_fit function
is available only from version 0.8.0 onwards. One has to temporarily remove the packaged scipy and numpy modules
and reinstall newer versions. I had difficulty installing matplotlib this way, so let's hope the Ubuntu package maintainers will incorporate the latest stable versions of scipy, numpy and matplotlib in the next Maverick Meerkat release, which is available on Monday!

Thursday, August 19, 2010

Matplotlib installing in a web server.

Matplotlib is the most complete 2D plotting package in Python you can find at the moment.
You will not encounter many problems installing and running matplotlib on a desktop, but on a remote server you might run into problems, like missing or unconfigured tz packages.

So we will document installing the matplotlib plotting package from source.

Step 1. Get the sources!
wget "http://downloads.sourceforge.net/project/matplotlib/matplotlib/matplotlib-1.0/matplotlib-1.0.0.tar.gz?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Fmatplotlib%2Ffiles%2F&ts=1282255399&mirror=nchc"

(Note that the URL is quoted, so the shell does not interpret the & characters.)


Resolving downloads.sourceforge.net... 216.34.181.59
Connecting to downloads.sourceforge.net|216.34.181.59|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://surfnet.dl.sourceforge.net/project/matplotlib/matplotlib/matplotlib-1.0/matplotlib-1.0.0.tar.gz [following]
--2010-08-19 22:11:56-- http://surfnet.dl.sourceforge.net/project/matplotlib/matplotlib/matplotlib-1.0/matplotlib-1.0.0.tar.gz
Resolving surfnet.dl.sourceforge.net... 130.59.138.21, 2001:620:0:1b::21
Connecting to surfnet.dl.sourceforge.net|130.59.138.21|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 12918842 (12M) [application/x-gzip]
Saving to: `matplotlib-1.0.0.tar.gz'

100%[==========================================================================================================================================>] 12,918,842 3.26M/s in 6.3s

2010-08-19 22:12:03 (1.95 MB/s) - `matplotlib-1.0.0.tar.gz' saved [12918842/12918842]

Step 2. Untar the package.

tar xzvf matplotlib-1.0.0.tar.gz

This will create the matplotlib-1.0.0 subdirectory.

Step 3. Install and test.

cd into the matplotlib-1.0.0 subdirectory and issue sudo python setup.py install.
Don't be alarmed by the many C/C++ warnings you may see.
Now run python and import the matplotlib module. On my system, there are no errors.

I resorted to installing from source after spending almost half a day trying to do it via
apt-get and even easy_install. I guess there are some destination-directory mismatches!

Now, if you get a "display not defined" error after installation, issue export DISPLAY=:1.0 and reinstall.
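On a headless server, another option worth trying (rather than pointing DISPLAY at an X server) is to select matplotlib's non-interactive Agg backend before importing the plotting interface, and write the figure straight to a file. A minimal sketch:

```python
import matplotlib
matplotlib.use('Agg')             # select a backend that needs no X display
import matplotlib.pyplot as plt

plt.plot([0, 1, 2, 3], [0, 1, 4, 9])
plt.xlabel('x')
plt.ylabel('x squared')
plt.savefig('plot.png')           # written to disk; no window is opened
```

This sidesteps display errors entirely, which is usually what you want when generating plots from a web server.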

We will add our attempts to generate plots and graphs on a web server from matplotlib.
Stay tuned!

For finer details on installing matplotlib in a server environment, with an example of generating a line graph, please visit http://linux.byexamples.com/archives/404/python-generating-graphs-with-matplotlib/

Wednesday, August 18, 2010

NLopt on the Rosenbrock function

Here is our first attempt at using nlopt on the Rosenbrock banana function, which we
minimized last time using the Nelder-Mead solver implemented in the scipy.optimize.fmin function.



import nlopt
from numpy import *

def nlopt_rosen_fg(x, grad):
    # nlopt objective signature: grad is ignored here because
    # LN_NELDERMEAD is a derivative-free algorithm
    return sum(100.0*(x[1:]-x[:-1]**2.0)**2.0 + (1-x[:-1])**2.0)

x0 = [1.3, 0.7, 0.8, 1.9, 1.2]

opt = nlopt.opt(nlopt.LN_NELDERMEAD, len(x0))
opt.set_min_objective(nlopt_rosen_fg)
opt.set_xtol_rel(1.0e-4)
opt.set_maxeval(500)
opt.optimize(x0)    # note: the optimized vector is the return value; x0 is not modified
opt_val = opt.last_optimum_value()
opt_result = opt.last_optimize_result()
print
print opt_val
print x0
print opt_result

When the above program runs, the following output is obtained:

python rosenbrock-nlopt.py 

1.22270197534e-08
[1.3, 0.69999999999999996, 0.80000000000000004, 1.8999999999999999, 1.2]
4

The return code of 4 means that xtol has been achieved. We notice that the printed vector is far from all ones although the function value is near zero. The explanation is that opt.optimize(x0) returns the optimized vector rather than modifying x0 in place, so the print x0 line above merely shows the unchanged initial guess; capturing the result as xopt = opt.optimize(x0) and printing xopt would show the actual minimizer. Doubtless we still have much to learn in using nlopt for this simple derivative-free optimization, and we welcome any insightful remarks.

The docs also do not state how many function evaluations were actually taken at the point of termination.
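As a cross-check (a sketch, not part of the original nlopt run), feeding the same start vector to scipy's fmin, the Nelder-Mead implementation used in the earlier post, does reach the all-ones minimizer when the tolerances are tightened:

```python
import numpy as np
from scipy.optimize import fmin

def rosen(x):
    # Rosenbrock function, vectorized as in the scipy tutorial
    return np.sum(100.0*(x[1:] - x[:-1]**2.0)**2.0 + (1 - x[:-1])**2.0)

x0 = [1.3, 0.7, 0.8, 1.9, 1.2]
xopt = fmin(rosen, x0, xtol=1e-8, ftol=1e-8,
            maxiter=10000, maxfun=10000, disp=0)
print(xopt)   # all components should be close to 1
```

So the near-zero function value reported by nlopt is consistent with the solver having found the same minimizer.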

Scipy optimize

Scipy


Scipy is a big Python library often compared with Matlab. Its homepage is at http://www.scipy.org.
It is sponsored by the company Enthought, but the software itself is open-source. Its major field of application is numerical computing in mathematics, science and engineering.

Scipy depends on NumPy (a successor of Numeric), which provides fast array manipulation. Yet it is not a pure Python library. Rather than rewriting from scratch, in pure Python, the reliable algorithms already implemented in the EISPACK, LAPACK, ATLAS and FFTW libraries, Scipy offers Python interfaces to these compiled native routines. The result is that Scipy is fast! Scipy installs and runs on most architectures and, what's more, it is free. More and more scientific computing people are using Scipy each day. Many users have encountered problems installing Scipy, and these are compounded if you don't have enough memory and hard disk space, or your chosen OS distribution does not have a good package manager.

People wonder why Scipy is not as well known or as widely adopted as R. I guess we can blame it on freedom of choice: there is no standard GUI for Scipy, for example, and there is no single plotting package for it. You can use matplotlib, Chaco, and others with Scipy.

Scipy Optimize optimization library

Well, in this article we shall focus on the optimization library called "optimize". Various algorithms are available, from Nelder-Mead to BFGS to Newton conjugate gradient and others.
Let's try the example in the tutorial, which uses the Nelder-Mead algorithm and requires function evaluations only.


The Rosenbrock function and fmin (Nelder-Mead)

The definition of the Rosenbrock function is given by
$$f(x) = \sum_{i=1}^{n-1} 100\,(x_i - x_{i-1}^2)^2 + (1 - x_{i-1})^2$$
Here is the implementation code from the tutorial:

from scipy.optimize import fmin

def rosen(x): # The Rosenbrock function
   return sum(100.0*(x[1:]-x[:-1]**2.0)**2.0 + (1-x[:-1])**2.0)



from numpy import zeros_like

def rosen_der(x):
   # gradient of the Rosenbrock function
   xm = x[1:-1]
   xm_m1 = x[:-2]
   xm_p1 = x[2:]
   der = zeros_like(x)   # x.typecode() is Numeric-era; zeros_like works with numpy
   der[1:-1] = 200*(xm-xm_m1**2) - 400*(xm_p1 - xm**2)*xm - 2*(1-xm)
   der[0] = -400*x[0]*(x[1]-x[0]**2) - 2*(1-x[0])
   der[-1] = 200*(x[-1]-x[-2]**2)
   return der


x0 = [1.3, 0.7, 0.8, 1.9, 1.2]
xopt = fmin(rosen, x0)
print xopt

Function rosen() evaluates the famous Rosenbrock "banana" function. The dimension of the problem is implied by the size of the input numpy array x. Notice that the function is written in vector form, which may not be familiar to Python programmers outside scientific computing.
Function rosen_der() returns the derivative or gradient vector, which we shall use in the future. Again, vector processing results in more compact code, and, we hope, more understandable code as one gets more exposure to this style of programming. When the above code is run, it prints on my computer the following output:

python rosenbrock.py
Optimization terminated successfully.
Current function value: 0.000066
Iterations: 141
Function evaluations: 243
[ 0.99910115 0.99820923 0.99646346 0.99297555 0.98600385]

Contrast this with the output from the tutorial:

x0 = [1.3, 0.7, 0.8, 1.9, 1.2]
>>> xopt = fmin(rosen, x0)
Optimization terminated successfully.
Current function value: 0.000000
Iterations: 516
Function evaluations: 825
>>> print xopt
[ 1. 1. 1. 1.
1.]

Why the difference? We think a lot of things, such as the default values of the input arguments, have changed since the tutorial was written. If you look at the help for fmin under IPython:

Documentation for fmin

fmin(func, x0, args=(), xtol=0.0001, ftol=0.0001, maxiter=None, maxfun=None, full_output=0, disp=1, retall=0, callback=None)
    Minimize a function using the downhill simplex algorithm.
    
    :Parameters:
    
      func : callable func(x,*args)
          The objective function to be minimized.
      x0 : ndarray
          Initial guess.
      args : tuple
          Extra arguments passed to func, i.e. ``f(x,*args)``.
      callback : callable
          Called after each iteration, as callback(xk), where xk is the
          current parameter vector.
    
    :Returns: (xopt, {fopt, iter, funcalls, warnflag})
    
      xopt : ndarray
          Parameter that minimizes function.
      fopt : float
          Value of function at minimum: ``fopt = func(xopt)``.
      iter : int
          Number of iterations performed.
      funcalls : int
          Number of function calls made.
      warnflag : int
          1 : Maximum number of function evaluations made.
          2 : Maximum number of iterations reached.
      allvecs : list
          Solution at each iteration.
    
    *Other Parameters*:
    
      xtol : float
          Relative error in xopt acceptable for convergence.
      ftol : number
          Relative error in func(xopt) acceptable for convergence.
      maxiter : int
          Maximum number of iterations to perform.


Calling the fmin function with both xtol and ftol set to 1.0e-5 resulted in much closer matching output:

python rosenbrock.py
Optimization terminated successfully.
         Current function value: 0.000000
         Iterations: 267
         Function evaluations: 444
[ 1.00000002  0.99999975  0.99999956  0.9999988   0.9999975 ]
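The extra counters quoted in the help text above (fopt, iter, funcalls, warnflag) can also be captured directly by passing full_output=1; a short sketch using the same Rosenbrock setup:

```python
import numpy as np
from scipy.optimize import fmin

def rosen(x):
    # Rosenbrock function, vectorized as in the tutorial
    return np.sum(100.0*(x[1:] - x[:-1]**2.0)**2.0 + (1 - x[:-1])**2.0)

x0 = [1.3, 0.7, 0.8, 1.9, 1.2]
xopt, fopt, niter, funcalls, warnflag = fmin(
    rosen, x0, xtol=1e-5, ftol=1e-5, full_output=1, disp=0)

print("f(xopt) = %g" % fopt)              # function value at the minimum
print("iterations: %d" % niter)
print("function evaluations: %d" % funcalls)
```

A warnflag of 0 confirms the run converged rather than hitting the iteration or evaluation limits.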

Tuesday, July 27, 2010

PyEphem: an ephemeris computation library

I wanted sunrise-sunset information for the day to be posted on my weather page at Digital Explorations and on the Life-Research-Education blog. I originally wanted to implement some of the algorithms used by, say, the US Naval Observatory, but on second thought there might already be Python libraries for doing that or something similar!

PyEphem is a Python library for Python versions < 3.0 (Python 3.0 users should install ephem instead). The homepage is at http://rhodesmill.org/pyephem/. Here is our attempt to install pyephem on our 64-bit Ubuntu Lucid Lynx:
$ sudo easy_install pyephem
/usr/lib/python2.6/dist-packages/setuptools/package_index.py:156: UserWarning: Unbuilt egg for setuptools [unknown version] (/usr/lib/python2.6/dist-packages)
  Environment.__init__(self,*args,**kw)
/usr/lib/python2.6/dist-packages/setuptools/command/easy_install.py:216: UserWarning: Unbuilt egg for setuptools [unknown version] (/usr/lib/python2.6/dist-packages)
  self.local_index = Environment(self.shadow_path+sys.path)
Searching for pyephem
Reading http://pypi.python.org/simple/pyephem/
Reading http://rhodesmill.org/pyephem/
Best match: pyephem 3.7.3.4
Downloading http://pypi.python.org/packages/source/p/pyephem/pyephem-3.7.3.4.tar.gz
Processing pyephem-3.7.3.4.tar.gz
Running pyephem-3.7.3.4/setup.py -q bdist_egg --dist-dir /tmp/easy_install-aGswju/pyephem-3.7.3.4/egg-dist-tmp-ZGk2Ek
libastro-3.7.3/magdecl.c: In function ‘E0000’:
libastro-3.7.3/magdecl.c:150: warning: ignoring return value of ‘fgets’, declared with attribute warn_unused_result
libastro-3.7.3/magdecl.c:153: warning: ignoring return value of ‘fgets’, declared with attribute warn_unused_result
In file included from libastro-3.7.3/deep.c:5:
libastro-3.7.3/satspec.h:16: warning: function declaration isn’t a prototype
In file included from libastro-3.7.3/sdp4.c:7:
libastro-3.7.3/satspec.h:16: warning: function declaration isn’t a prototype
In file included from libastro-3.7.3/sgp4.c:6:
libastro-3.7.3/satspec.h:16: warning: function declaration isn’t a prototype
zip_safe flag not set; analyzing archive contents...
ephem.tests.test_jpl: module references __file__
ephem.tests.test_usno: module references __file__
ephem.tests.test_rst: module references __file__
Adding pyephem 3.7.3.4 to easy-install.pth file
Installed /usr/local/lib/python2.6/dist-packages/pyephem-3.7.3.4-py2.6-linux-x86_64.egg


Note the warnings.


$ python
Python 2.6.5 (r265:79063, Apr 16 2010, 13:57:41) 
[GCC 4.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyephem
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named pyephem
>>> 

Crazy!! Why is pyephem not installed?

To be continued ......

Ah, OK. It should be import ephem, not import pyephem! Nuts!

OK. Here is an example from http://scienceoss.com/calculate-sunrise-and-sunset-with-pyephem/
that graphically displays the day length (the difference between sunset and sunrise times) for the whole of 2008.

import ephem
import datetime

obs = ephem.Observer()
obs.lat = '38.8'
obs.long= '-75.2'

start_date = datetime.datetime(2008, 1, 1)
end_date = datetime.datetime(2008, 12, 31)
td = datetime.timedelta(days=1)

sun = ephem.Sun()

sunrises = []
sunsets = []
dates = []

date = start_date
while date < end_date:
    date += td
    dates.append(date)
    obs.date = date

    # next rising/setting of the Sun after the current date
    rise_time = obs.next_rising(sun).datetime()
    sunrises.append(rise_time)

    set_time = obs.next_setting(sun).datetime()
    sunsets.append(set_time)

from pylab import *
daylens = []
for i in range(len(sunrises)):
    timediff = sunsets[i] - sunrises[i]
    hours = timediff.seconds / 60. / 60.  # to get it in hours
    daylens.append(hours)

plot(dates, daylens)

# if you have an older version of matplotlib, you may need
# to convert dates into numbers before plotting:
# dates = [date2num(i) for i in dates]

xlabel('Date')
ylabel('Hours')
title('Day length in 2008')
show()
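One small pitfall in the day-length loop above: timedelta.seconds only carries the seconds part of the difference and silently drops whole days, so total_seconds() (available from Python 2.7) is the safer call for general spans. A stdlib-only sketch, with made-up sunrise and sunset times:

```python
import datetime

# hypothetical times for illustration only
sunrise = datetime.datetime(2008, 6, 21, 5, 42)
sunset = datetime.datetime(2008, 6, 21, 20, 31)

diff = sunset - sunrise
hours = diff.total_seconds() / 3600.0   # robust even if diff spans days
print("day length: %.2f hours" % hours)
```

For day lengths under 24 hours the two approaches agree, so the loop above is fine as written.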

And here is the nice output courtesy of matplotlib:

Saturday, July 17, 2010

Exploring NLOPT, Part 1

To our readers: the NLOPT software actually has a fairly complete Python reference at

http://ab-initio.mit.edu/wiki/index.php/NLopt_Python_Reference

I don't know why I missed that reference and the other documentation when I first visited the ab-initio site, hence this article has been severely revised.

The original article stressed the use of IPython to quickly explore what is available in the library.
If in doubt, refer to the reference and official documentation.

We will be back.