The common perception is that Python's implementation is slow, but you can often write fast Python if you know how to profile your code effectively.
I have tried it. I compared a highly CPU-intensive algorithm: the training of a simple neural network with one hidden layer. To do so, I used my old C++ NeuralNetwork library (flayers) and an implementation in python with Numpy. I wrote a simple neural net in python and optimized all the loops with numpy, as suggested in a profiling presentation I saw at Pycon2009.
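To give an idea of how profiling points at the loops worth optimizing, here is a minimal sketch using the standard library's cProfile. It is not the code from bpnn.py or from the presentation: `slow_dot`, `train_step` and `profile_report` are hypothetical names, and the loop is only a stand-in for the inner loop of a pure python neural net.

```python
import cProfile
import io
import pstats


def slow_dot(a, b):
    """Dot product written as an explicit Python loop (hypothetical hot spot)."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


def train_step(n_hidden=100, n_inputs=16):
    """Stand-in for one forward pass: one dot product per hidden neuron."""
    inputs = [0.5] * n_inputs
    weights = [[0.01] * n_inputs for _ in range(n_hidden)]
    return [slow_dot(row, inputs) for row in weights]


def profile_report(func, *args, top=10):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func(*args)
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    # The report lists slow_dot with 100 calls, which marks it as the loop
    # to replace with a numpy operation.
    print(profile_report(train_step))
```

A function that is called many times and dominates the cumulative time is usually the first candidate for a numpy rewrite.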
I compared the training time of a simple fully connected NeuralNetwork with 100 hidden neurons for 10 iterations on the letters dataset (cost function = mean squared error).
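To illustrate what "optimizing the loops with numpy" means for a network of this shape (16 inputs, 100 hidden neurons, 26 outputs), here is a small sketch of a forward pass written both ways. It is not the actual bpnn.py code: the function names, the tanh activation, the linear output layer and the absence of biases are all assumptions made for the example.

```python
import math

import numpy as np


def forward_loops(x, w_hidden, w_out):
    """Forward pass with explicit Python loops (plain lists in, list out)."""
    n_in, n_hidden, n_out = len(x), len(w_hidden[0]), len(w_out[0])
    hidden = []
    for j in range(n_hidden):
        s = 0.0
        for i in range(n_in):
            s += x[i] * w_hidden[i][j]
        hidden.append(math.tanh(s))  # assumed activation
    output = []
    for k in range(n_out):
        s = 0.0
        for j in range(n_hidden):
            s += hidden[j] * w_out[j][k]
        output.append(s)  # assumed linear output layer
    return output


def forward_numpy(x, w_hidden, w_out):
    """Same computation, with each double loop replaced by a matrix product."""
    hidden = np.tanh(x @ w_hidden)
    return hidden @ w_out


def mse(output, target):
    """Mean squared error between an output vector and its target."""
    diff = np.asarray(output, dtype=float) - np.asarray(target, dtype=float)
    return float(np.mean(diff ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=16)
    w_hidden = rng.normal(size=(16, 100)) * 0.1
    w_out = rng.normal(size=(100, 26)) * 0.1
    slow = forward_loops(x.tolist(), w_hidden.tolist(), w_out.tolist())
    fast = forward_numpy(x, w_hidden, w_out)
    print(np.allclose(slow, fast))
```

Both versions compute the same values; the numpy one simply moves the inner loops out of the interpreter and into compiled code.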
Here is the time to do 10 iterations with flayers (c++):
./fexp / -h 100 -l 0.01 --oh -e 10
...
Optimization: Standard
Creating Connector [16|100] [inputs | hiddens]
Creating Connector [100|26] [hiddens | outputs]
...
real 0m11.187s
user 0m10.837s
sys 0m0.012s
Here is the time to do 10 iterations on the full letters dataset with python:
time ./bpnn.py -e 10 --h 100 -f letters.dat -n
Creation of an NN <16:100:26>...
real 85m48.646s
user 85m9.163s
sys 0m1.632s
Here is the time to do 10 iterations on the full letters dataset with python and numpy:
./bpnn.py -e 10 --h 100 -f letters.dat
Creation of an NN <16:100:26>...
real 1m37.066s
user 1m36.026s
sys 0m0.100s
So if you do the math:
- The numpy implementation is about 53 times faster than the basic python implementation (85m48s vs. 1m37s).
- My C++ implementation is about 9 times faster than my simple python numpy implementation (1m37s vs. 11.2s).
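For anyone who wants to redo the math, here is a short sketch that converts the `real` times reported above into seconds and computes the ratios. The helper name `to_seconds` and the dictionary keys are just illustrative.

```python
import re


def to_seconds(stamp):
    """Convert a `time`-style stamp such as '85m48.646s' into seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time stamp: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# `real` times copied from the three runs in this post.
REAL_TIMES = {
    "cpp": "0m11.187s",
    "python": "85m48.646s",
    "numpy": "1m37.066s",
}

seconds = {name: to_seconds(stamp) for name, stamp in REAL_TIMES.items()}

numpy_vs_python = seconds["python"] / seconds["numpy"]
cpp_vs_numpy = seconds["numpy"] / seconds["cpp"]

print(f"numpy vs. pure python: {numpy_vs_python:.1f}x")  # 53.0x
print(f"C++ vs. numpy: {cpp_vs_numpy:.1f}x")  # 8.7x
```

These are single runs measured with `time`, so the ratios should be read as orders of magnitude rather than precise benchmarks.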
The numpy implementation is definitely worth it because it reduces the amount of code and has a significant performance impact. C++ might be required for extreme performance, but the trade-off in code complexity and development time may not be worth it. Now that I have the choice, I will still use my C++ lib.