Friday, May 29, 2009

Is Python really slow? A practical comparison with C++

The common perception is that Python's implementation is slow, but you can often write fast Python if you know how to profile your code effectively.

I have tried it. I compared a highly CPU-intensive algorithm: training a simple neural network with one hidden layer. To do so, I used my old C++ neural network library (flayers) and an implementation in Python with NumPy. I wrote a simple neural net in Python and optimized all the loops with NumPy, as suggested in a profiling presentation I saw at PyCon 2009.
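
To give an idea of what "optimizing the loops with NumPy" means, here is a minimal sketch, not the actual bpnn.py code. The function names, the tanh activation and the random weights are assumptions made up for the illustration; only the 16 inputs and 100 hidden neurons come from the experiment.

```python
import math
import numpy as np

def hidden_layer_loops(x, weights, bias):
    """Forward pass of one layer with explicit Python loops (the slow way)."""
    outputs = []
    for j in range(len(bias)):            # one pass per hidden neuron
        total = bias[j]
        for i in range(len(x)):           # one multiply-add per input
            total += weights[j][i] * x[i]
        outputs.append(math.tanh(total))  # tanh is an assumed activation
    return outputs

def hidden_layer_numpy(x, weights, bias):
    """Same computation written as a single matrix-vector product."""
    return np.tanh(np.dot(weights, x) + bias)

# Hypothetical data with the shapes used in the post: 16 inputs, 100 hidden neurons.
rng = np.random.default_rng(0)
x = rng.standard_normal(16)
weights = rng.standard_normal((100, 16))
bias = rng.standard_normal(100)

slow = hidden_layer_loops(x, weights, bias)
fast = hidden_layer_numpy(x, weights, bias)
```

Both functions return the same values. The NumPy version simply moves the two nested loops out of the interpreter and into compiled code.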

I compared the training time of a simple fully connected neural network with 100 hidden neurons for 10 iterations on the letters dataset (cost function = mean squared error).
Here is the time to do 10 iterations with flayers (C++):
./fexp / -h 100 -l 0.01 --oh -e 10
...
Optimization: Standard
Creating Connector [16|100] [inputs | hiddens]
Creating Connector [100|26] [hiddens | outputs]
...

real 0m11.187s
user 0m10.837s
sys 0m0.012s
Here is the time to do 10 iterations on the full letters dataset with Python:

time ./bpnn.py -e 10 --h 100 -f letters.dat -n
Creation of an NN <16:100:26>
...
real 85m48.646s
user 85m9.163s
sys 0m1.632s
Here is the time to do 10 iterations on the full letters dataset with Python and NumPy:

time ./bpnn.py -e 10 --h 100 -f letters.dat
Creation of an NN <16:100:26>
...
real 1m37.066s
user 1m36.026s
sys 0m0.100s

So if you do the math:
  • The NumPy implementation is about 53 times faster than the basic Python implementation.
  • My C++ implementation is about 9 times faster than my simple Python NumPy implementation.
The NumPy implementation is definitely worth it because it reduces the amount of code and has a significant performance impact. C++ might be required for extreme performance, but the trade-off in code complexity and development time may not be worth it. Now that I have the choice, I will still use my C++ lib.
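
For anyone who wants to redo the math, here is a small sketch that computes the ratios from the wall-clock ("real") times reported above. The helper name to_seconds and the variable names are only for illustration.

```python
def to_seconds(minutes, seconds):
    """Convert the 'XmY.YYYs' format printed by time into seconds."""
    return 60 * minutes + seconds

# "real" times copied from the three runs above
cpp_time = to_seconds(0, 11.187)      # flayers (C++)
python_time = to_seconds(85, 48.646)  # bpnn.py, basic Python
numpy_time = to_seconds(1, 37.066)    # bpnn.py with NumPy

numpy_vs_python = python_time / numpy_time
cpp_vs_numpy = numpy_time / cpp_time

print("NumPy vs basic Python: %.1fx" % numpy_vs_python)
print("C++ vs NumPy: %.1fx" % cpp_vs_numpy)
```

These are single runs on one machine, so the ratios are only rough indications.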


Sunday, May 10, 2009

Short Essay: Engineer vs Scientist

The difference between engineers and scientists is obscure for many people, me included at the beginning.
Back in 2001, during a chat with a research professor about the competence war between the engineering and science departments of the University of Montreal, I got an initial hint about it. He told me that part of the computer science department's tension with software engineers was about placement rates: the engineers' rate was much higher than the computer scientists'. I should have asked for an explanation. What is this war about? The computer science department requires engineers to take some leveling courses even if they have great grades, and vice versa.

In my short career, I have observed a major difference that I will try to illustrate with examples.
After my engineering degree, I took most of my graduate courses with scientists. In my first final exam, after some discussion with classmates, I realized that I was the only one who had used an estimate to get past a bottleneck calculation in one problem. Most of the others lost more than half an hour solving it exactly and couldn't finish the exam.
Several times, I had to step into projects and set a schedule in order to force people to cut some corners. Perfection is an endless path. If you have to fill a vase with rocks, sand and water, you will most likely start with the rocks, then the sand, and then the water, if the effort is inversely proportional to the size, no? Yes, it might be more interesting to optimize cool things and advanced features, but most likely they aren't part of the basic blocks required to move your project to the next step.

What is the point of having half of a perfect solution if you could have had a complete, imperfect one? I am not saying that every scientist's need for perfection makes them lose sight of the big picture, or that engineers aren't perfectionists, but simply that engineers' bias toward a system-level view helps them better gauge when it is time to look for perfection. Time kills projects, and over-perfection can eat up too much of it... but perfection should remain the goal.
Yes, I am generalizing and simplifying, but you get the big picture. Define your priorities and assign resources accordingly. So, am I an engineer or a scientist, or a little of both? Since my manager told me I was more of a scientist, I must be a little of both...