Tuesday, December 30, 2008

Sip vs Swig?....Swig

Yannick suggested that I try sip before getting too far with swig. I followed the doc religiously, but I haven't been able to make it work with my lib. I stopped at the same level of complexity I had reached with swig (<=1.5 hrs to make it work with flayers).
sip -c . pytrainer.sip
sip: Trainer is undefined

Basically, sip can't find my class Trainer, and I haven't found any documentation about it or any clear answers on the mailing list.

I tried configure.py too, but I get an error that I don't get with their example (in both cases the .sbf file doesn't exist).

Error: Unable to open "pytrainer.sbf": [Errno 2] No such file or directory:

Swig is my choice at this point because it is simple to use, well documented and simply works.

Friday, December 26, 2008

Why did I switch to Python?

In 2005, while I was working as a "Machine Learning Specialist" at Coradiant, I had to do a lot of preprocessing. At that time, I was doing more data mining than machine learning. The fundamental difference is that machine learning uses models to make predictions, while data mining mostly uses models to understand links between variables (no prediction is required). To do my work, I chose Matlab because I love its visualization functionality and because it contains a machine learning package with a Decision Tree module that perfectly suited my need to do root cause analysis. Unfortunately, Matlab is a disaster for preprocessing, which represented the biggest part of my work on that project, and C++ isn't better. Doug Eck, a Music and Machine Research consultant, introduced me to Python to help me with the preprocessing work. What a discovery! A year later, I released the first alpha version of mlboost.
Later, I worked on a real machine learning task: building a bot detector that gives a bot probability based on session information. Without Python and mlboost, our real-time bot detector prototype would never have become a reality on such a tight schedule.

Later, with the discovery of numpy, scipy, matplotlib (with the pylab interface, you get much the same API as Matlab), python-mysql and ipython, I no longer need Matlab. With the pydev Eclipse plugin, I have a powerful Python development environment.

So basically, I switched to Python because:
  • I can do extreme prototyping at a speed that I have never seen before
  • It has an amazing community (matplotlib, pydev, scipy, etc.)
  • The language's amazing syntax lets me write code in a minimal number of lines, in an elegant and readable way (ex: list comprehensions, *args, **kwargs, mocking, etc.)
  • It is simple to write wrappers for non-Python libraries (ex: swig)
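
To make the syntax point concrete, here is a minimal sketch of a list comprehension combined with *args and **kwargs. The function name describe and the values passed to it are made up for illustration only.

```python
def describe(*args, **kwargs):
    """Accept any positional and keyword arguments and summarize them."""
    positional = ", ".join(str(a) for a in args)
    # Sort the keyword names so the summary does not depend on call order.
    keywords = ", ".join("%s=%s" % (k, kwargs[k]) for k in sorted(kwargs))
    return "args: [%s] kwargs: {%s}" % (positional, keywords)


# List comprehension: squares of the even numbers below 10, in one line.
even_squares = [n * n for n in range(10) if n % 2 == 0]

# The * operator unpacks the list into separate positional arguments.
summary = describe(*even_squares, model="tree", depth=3)
print(summary)
```

The same loop written with an explicit for statement and append calls would take three or four lines, which is the kind of saving I mean above.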

Here are some interesting links:
Mathematica vs Matlab vs Python (same fct comparison code)
How to Install Python as a Replacement to Matlab
pythonxy package Python(x,y)
I like this figure, which represents the community around Python.

Tuesday, December 23, 2008

Swig is just awesome!

Today, it only took me 2 hrs to integrate the load and train functions into my new flayers Python interface. I am just impressed that it works so easily. I am convinced that the swig guys have done awesome work! Thanks guys!

svn co https://mlboost.svn.sourceforge.net/svnroot/mlboost/flayers flayers
cd flayers

sudo python setup.py build_ext --inplace
In [1]: import flayers
In [2]: trainer = flayers.loadTrainer('test.save')
In [3]: trainer.train(10)

python, matplotlib, ipython, scipy and now swig are making my life much easier.

Sunday, December 21, 2008

How I created a Python module from a C++ lib with swig

Today, I completed the first step of creating a Python module from my C++ library. The first function I want to call from flayers is the main wrapper (fexp.cpp:main(int argc, char** argv)). From the new flayers Python module, I expect to be able to call it this way:

import flayers

Unfortunately, it is a little more complicated than expected. To make it work, I looked at this great documentation (doc). Here is a summary of the issues I encountered.
1) The Extension module doesn't work automatically with C++. You have to add the "-c++" option to the swig call (ex: swig -python -c++ flayers.i).
2) I added the .cxx file generated by swig, instead of the .i file, to the sources of the Extension instance in setup.py (a variation on my previous blog post).
3) I added the flayers dependency classes to the extension sources to make sure I don't get errors like "ImportError: ./_flayers.so: undefined symbol: XXXX" and to ensure the lib is recompiled at installation.
4) How to pass the list of options without argc and not get an error like "in method 'fexp', argument 2 of type 'char **'": see section "30.9.2 Expanding a Python object into multiple arguments" and look at my setup.py.
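
To illustrate what point 4 is about, here is a small sketch of the conversion that has to happen somewhere: a Python list of strings becomes the (argc, argv) pair that a C main-style function expects. In the swig solution this is done in C by a typemap inside the generated wrapper; the version below uses ctypes from the standard library only to show the idea, and it is not the code from my flayers.i. The helper name to_argc_argv and the option strings are placeholders.

```python
import ctypes


def to_argc_argv(options):
    """Turn a list of str into (argc, argv) as C's main() would receive them."""
    encoded = [opt.encode("utf-8") for opt in options]
    argc = len(encoded)
    # Allocate one extra slot; ctypes zero-fills it, giving the NULL
    # terminator that C code expects at argv[argc].
    argv_type = ctypes.c_char_p * (argc + 1)
    argv = argv_type(*encoded)
    return argc, argv


# Placeholder options, not real fexp flags.
argc, argv = to_argc_argv(["fexp", "option1", "option2"])
print(argc, [argv[i] for i in range(argc)])
```

The point of the typemap is that the Python caller only passes the list, and argc is derived from its length instead of being passed by hand.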

You can take a look at my files here (flayers.i, setup.py, flayers.h, flayers.cpp).
I am impressed; swig and Python are just awesome.

If you want to try it, download flayers and do:
svn co https://mlboost.svn.sourceforge.net/svnroot/mlboost/flayers flayers
cd flayers
python setup.py build_ext --inplace (or python setup.py install)
ipython (or python)
import flayers

Sunday, December 14, 2008

swig and python with Extension module

There is a much simpler way to create an interface to C/C++ code: simply use the Extension module from distutils.core (setup.py).
You just have to type sudo python setup.py install and you can access your module in Python (import example; example.cube(3)).
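
For reference, a setup.py along these lines could look like the sketch below. It is a minimal version written under assumptions, not my actual file: the file names example.i and example.c, the version number, and the two dictionary names are illustrative. The build_ext command runs swig on .i sources by itself. The guard on sys.argv is only there so the file can be imported or run without a command and do nothing.

```python
import sys

# Assumed file names: example.i is the swig interface file and
# example.c holds the C implementation of cube().
EXTENSION_ARGS = {
    "name": "_example",  # the swig-generated example.py imports _example
    "sources": ["example.i", "example.c"],
}

SETUP_ARGS = {
    "name": "example",
    "version": "0.1",
    "py_modules": ["example"],  # example.py is generated by swig
}

if __name__ == "__main__" and len(sys.argv) > 1:
    # Imported only when a command such as build_ext or install is given.
    try:
        from setuptools import Extension, setup
    except ImportError:
        from distutils.core import Extension, setup
    setup(ext_modules=[Extension(**EXTENSION_ARGS)], **SETUP_ARGS)
```

For a C++ library, Extension also accepts a swig_opts list, so adding "swig_opts": ["-c++"] to EXTENSION_ARGS should pass the -c++ option to swig.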

Thanks to PA for the recommendation.
My next steps are to create a Python interface to flayers (my C++ neural network library) and to convert arguments. PA suggests boost.python for argument conversion.

I have also experimented with the setup function from setuptools to create the mlboost setup.py file. setuptools indirectly uses easy_install, which doesn't work as well as aptitude on Ubuntu. Unfortunately, the matplotlib dependency doesn't work, but it is included in scipy through the mlab interface.

Saturday, December 13, 2008

Swig and Python first experimentation

Today I tried swig to interface Python with my C++ library.
According to this example it seems very simple, but of course it doesn't work.
I followed the proposed steps exactly and ran into 3 problems:
  1. can't find Python.h: solution->sudo aptitude install python-dev
  2. ld -shared -o _example.so example.o example_wrap.o
    ld: example.o: relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object; recompile with -fPIC: solution -> do it
  3. ld -bundle -flat_namespace -undefined suppress -o _example.so example.o example_wrap.o
    ld: -f may not be used without -shared: solution -> remove -bundle and add the -shared option
Here is my config:
AMD Athlon X2 5400+ Dual-Core Processor
Ubuntu 8.4.10
Python 2.5.2

Fast track solution:
svn co https://mlboost.svn.sourceforge.net/svnroot/mlboost/demo/swig swig-example
cd swig-example
source create_example_python_interface.sh

Now here is exactly what you should do to make it work:
1) create example.c and example.i

2) create the Python interface to the example module, which contains the cube function
sudo aptitude install swig
sudo aptitude install python-dev
swig -python example.i
gcc -c example.c example_wrap.c -I/usr/include/python2.5 -I/usr/lib/python2.5 -fPIC
ld -shared -o _example.so example.o example_wrap.o
ld -shared -flat_namespace -undefined suppress -o _example.so example.o example_wrap.o

3) try the Python interface

fpieraut@fraka7:~/swig$ python
Python 2.5.2 (r252:60911, Oct 5 2008, 19:29:17)
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import example
>>> example.cube(3)
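
The build commands from step 2 can also be driven from a small Python script. The sketch below is one possible way to do it, not the content of create_example_python_interface.sh: the function names build_commands and build are made up, and it asks sysconfig for the Python include directory instead of hardcoding /usr/include/python2.5, so it should follow whatever Python version runs it. It assumes swig, gcc and ld are installed and that example.c and example.i are in the current directory.

```python
import subprocess
import sysconfig


def build_commands(module="example"):
    """Return the swig, compile and link commands for a swig C module."""
    include_dir = sysconfig.get_paths()["include"]  # where Python.h lives
    wrap = module + "_wrap"
    return [
        # Generate example_wrap.c and example.py from the interface file.
        ["swig", "-python", module + ".i"],
        # -fPIC is required for objects that go into a shared library.
        ["gcc", "-fPIC", "-c", module + ".c", wrap + ".c", "-I" + include_dir],
        # Link both objects into the _example.so extension module.
        ["ld", "-shared", "-o", "_" + module + ".so", module + ".o", wrap + ".o"],
    ]


def build(module="example"):
    """Run each command in order and stop at the first failure."""
    for command in build_commands(module):
        subprocess.check_call(command)


for command in build_commands():
    print(" ".join(command))
```

Calling build() runs the three commands; the loop at the end only prints them, which is handy for checking the include path before compiling.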