- Core: Theano http://deeplearning.net/tutorial/ (Torch vs Theano)
- Framework:
- Keras: Theano-based Deep Learning library http://keras.io/
- Blocks and Fuel: Frameworks for deep learning (article) mila-udem · GitHub
- theanets (numpy+sklearn+theano)
- NervanaSystems/neon (built on top of numpy + YAML config file like Caffe + leverages nervana cpu)
- uaca/deepy · GitHub (built on top of Theano)
Some thoughts of a Machine Learning Practitioner on Software Development, Management, Team Building, Startups, Python, Agile Development, Data visualization... that will distract you from your end goals by making you less efficient but are critical to manage in order to succeed. Don't forget that long time adaptation to inefficient approaches can become your enemy. Let's try to empower others by sharing knowledge & personal experiences.
Showing posts with label python. Show all posts
Thursday, October 15, 2015
Summary of deeplearning libs available in python
Saturday, April 4, 2015
How to deploy a python datascience app on Heroku? numpy+scipy+pandas+sklearn+matplotlib
- time out
- numpy and scipy incompatibilities
After several iterations, here is a script to do it:
https://github.com/fraka6/trading-with-python/blob/master/create_heroku_datascience.sh
Here is what you will get:
Btw, I am currently experimenting with my datascience stuff on sense.io (Sense.io is a collaborative platform to accelerate data science from exploration to production.)
Friday, May 2, 2014
Looking for a simple way to add columns based on the other ones?
Are you looking for a simple way to add columns based on some other ones?
Let's say you want to add a column C where C=A+B.
You can run: cat *.tsv | ./coladd.py -a C=A+B
To add more fields, separate your equations with ',' and quote the argument so the shell leaves the parentheses alone:
cat *.tsv | ./coladd.py -a "C=A+B,D=(A/C)"
eval() and csv.DictReader are used to achieve this.
code: coladd.py
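The original coladd.py is not reproduced on this page, so what follows is only a minimal sketch of the eval() + csv.DictReader idea in Python 3. The function name `add_columns` is hypothetical, every input column is assumed to be numeric, and the equations are assumed to come as one comma-separated string. Since eval() executes arbitrary expressions, this only makes sense for equations you wrote yourself.

```python
import csv
import io


def add_columns(tsv_text, equations):
    """Return tsv_text with one new column per 'NAME=EXPR' equation (hypothetical helper)."""
    # "C=A+B,D=(A/C)" -> [["C", "A+B"], ["D", "(A/C)"]]
    parsed = [eq.strip().split("=", 1) for eq in equations.split(",")]
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    fieldnames = list(reader.fieldnames) + [name for name, _ in parsed]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Assumption: every column holds a number, so it can be used in an equation.
        env = {key: float(value) for key, value in row.items()}
        for name, expr in parsed:
            # Column names act as variables; builtins are hidden from the expression.
            env[name] = eval(expr, {"__builtins__": {}}, env)
            row[name] = env[name]
        writer.writerow(row)
    return out.getvalue()


result = add_columns("A\tB\n1\t3\n2\t6\n", "C=A+B,D=(A/C)")
print(result)
```

Equations are evaluated left to right, which is why D can reuse the freshly computed C.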
Monday, July 15, 2013
Grepping zip/bz2 files is annoying: -H option doesn't work
grep is a quite useful command-line tool, but some options don't play well with compressed files, like:
-H, --with-filename print the filename for each match
Neither bzgrep nor zgrep solves the problem; the effect is the same as:
bzcat *.bz2 | grep -H "something"
which generates this:
(standard input):
not:
filename:
So here is my zgrep.py (example: ls *.gz | zgrep.py "a <.*> (.*)")
#!/usr/bin/env python
''' allow grep -H option of bzip & zip files '''
import sys
import gzip
import bz2
import re

exp = sys.argv[1]
expExtractor = re.compile(exp)

# file names are read from stdin, one per line
for filename in sys.stdin:
    filename = filename.strip()
    # open in text mode so lines are str, not bytes (Python 3)
    if filename.endswith('.gz'):
        freader = gzip.open(filename, 'rt')
    elif filename.endswith('.bz2'):
        freader = bz2.open(filename, 'rt')
    else:
        freader = open(filename, 'r')
    for i, line in enumerate(freader):
        line = line.strip()
        if expExtractor.search(line):
            print("%s:%i <%s>" % (filename, i, line))
    freader.close()
Thursday, May 16, 2013
The simplest python server example ;)
Today, I was looking for a simple python server template but couldn't find a good one so here is what I was looking for (yes the title is a little arrogant ;):
#!/usr/bin/env python
''' simple python server example;
output format supported = html, raw or json '''
import sys
import json
# Python 3 module (it was BaseHTTPServer on Python 2)
from http.server import BaseHTTPRequestHandler, HTTPServer

FORMATS = ('html', 'json', 'raw')
format = FORMATS[0]

class Handler(BaseHTTPRequestHandler):
    # handle GET command
    def do_GET(self):
        if format == 'html':
            ctype, body = 'text/html', "body"
        elif format == 'json':
            ctype, body = 'application/json', json.dumps({'path': self.path})
        else:
            ctype, body = 'text/plain', "%s\t%s" % ('path', self.path)
        # send a single Content-type header, then the encoded body
        self.send_response(200)
        self.send_header('Content-type', ctype)
        self.end_headers()
        self.wfile.write(body.encode('utf-8'))

def run(port=8000):
    print('http server is starting...')
    # ip and port of server
    server_address = ('127.0.0.1', port)
    httpd = HTTPServer(server_address, Handler)
    print('http server is running...listening on port %s' % port)
    httpd.serve_forever()

if __name__ == '__main__':
    from optparse import OptionParser
    op = OptionParser(__doc__)
    op.add_option("-p", default=8000, type="int", dest="port",
                  help="port #")
    op.add_option("-f", default='json', dest="format",
                  help="format available %s" % str(FORMATS))
    op.add_option("--no_filter", default=True, action='store_false',
                  dest="filter", help="don't filter")
    opts, args = op.parse_args(sys.argv)
    format = opts.format
    run(opts.port)
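A quick way to check a handler like this without opening a browser is to start it on a throwaway port in a background thread and query it from the same process. The sketch below assumes Python 3; `JsonHandler` and `fetch_once` are hypothetical names introduced for the illustration, and only the json format is exercised.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class JsonHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with the requested path as JSON."""

    def do_GET(self):
        body = json.dumps({'path': self.path}).encode('utf-8')
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the check quiet


def fetch_once(path):
    # Port 0 asks the OS for any free port.
    httpd = HTTPServer(('127.0.0.1', 0), JsonHandler)
    port = httpd.server_address[1]
    thread = threading.Thread(target=httpd.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection('127.0.0.1', port, timeout=5)
        conn.request('GET', path)
        response = conn.getresponse()
        status = response.status
        payload = json.loads(response.read().decode('utf-8'))
        conn.close()
    finally:
        httpd.shutdown()
        httpd.server_close()
    return status, payload


status, payload = fetch_once('/hello?x=1')
print(status, payload)
```

The query string is part of `self.path`, so it comes back in the payload as well.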
Saturday, April 6, 2013
Finding the optimal K in k-means: an incremental k-means in python
I was looking for a good implementation of an incremental k-means where I don't have to set the optimal K. There are interesting papers (x-means, g-means etc.) but I couldn't find any python implementation.
I have decided to write an incremental version on top of sklearn.
The idea is simple:
- Start at K=x
- identify worst cluster based on an unsupervised measure (ex: silhouette)
- Split the worst cluster into 2 clusters
- measure the global improvement with the new clusters
- if you get an improvement continue adding clusters
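The steps above can be sketched as follows. This is not the author's implementation, only one possible reading of the list: the worst cluster is taken to be the one with the lowest mean silhouette, the split is a 2-means run on that cluster alone, and the loop stops at the first split that does not improve the global silhouette score. The function name `incremental_kmeans` and its parameters are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_samples, silhouette_score


def incremental_kmeans(X, k_start=2, k_max=10, seed=0):
    """Grow K by splitting the worst cluster while the silhouette improves."""
    labels = KMeans(n_clusters=k_start, n_init=10,
                    random_state=seed).fit_predict(X)
    best = silhouette_score(X, labels)
    # silhouette needs fewer clusters than samples
    while len(np.unique(labels)) < min(k_max, len(X) - 1):
        per_sample = silhouette_samples(X, labels)
        # only clusters with at least 2 points can be split
        candidates = [c for c in np.unique(labels) if np.sum(labels == c) >= 2]
        if not candidates:
            break
        # worst cluster = lowest mean silhouette (assumption of this sketch)
        worst = min(candidates, key=lambda c: per_sample[labels == c].mean())
        mask = labels == worst
        halves = KMeans(n_clusters=2, n_init=10,
                        random_state=seed).fit_predict(X[mask])
        trial = labels.copy()
        trial[mask] = np.where(halves == 0, worst, labels.max() + 1)
        score = silhouette_score(X, trial)
        if score <= best:
            break  # no global improvement: keep the previous clustering
        labels, best = trial, score
    return labels, best


# Three well-separated blobs, starting deliberately at K=2.
rng = np.random.RandomState(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + 0.5 * rng.randn(30, 2) for c in centers])
labels, score = incremental_kmeans(X, k_start=2)
print(len(np.unique(labels)), round(score, 2))
```

Other unsupervised measures could replace the silhouette here; the stopping rule is the part most worth tuning on real data.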
Special thanks to the scikit-learn library for letting me prototype this version so quickly.
Friday, November 23, 2012
Real-time face recognition experiment packages

In Autumn 2009, I was a lecturer at ETS for an introductory Machine Learning class. To give the class a real feel for machine learning, I repackaged the digipy demo used for my presentation "Machine Learning Empowered by Python" for their final project. The latest code is here: https://bitbucket.org/fraka6/digiface
The digiface package was their recommended starting point. It is a real-time face recognition package, so they could focus on extracting the best features, easily train a single neural net and experiment live or on the dataset pictures.
The idea was simple: they would compete on the best real-time live face recognition of the students' own faces. Every student had to sit in front of each other's face recognition system. The best system had to be quite robust to handle changes in light, background and hair.
We had to hold a picture session to build the dataset.
One team built their own package called digijava. Here is a snapshot.

It was quite an interesting teaching experiment. I am glad to see that some of my students have followed my path and joined Yoshua Bengio's great LISA lab.
If you are looking for a great talk about the state of the art in machine learning, look at Hinton's "Brains, Sex, and Machine Learning" YouTube video and Yoshua Bengio's slides "Deep Learning of Representations" (Google talk).
Labels: face recognitions, java, machine learning, python, teaching
Monday, January 10, 2011
How to create standalone python apps?
You might have to run your applications on your customer's infrastructure without giving away your recipes (python source code), so here are the alternatives depending on your OS:
- windows = py2exe
- mac = py2app
- linux = pyinstaller (freeze doesn't work->compile errors*)
On linux, pyinstaller works quite well, but you have to build the executable on the same distribution as the one it will run on.
Here are the steps:
- download latest version
- python Configure.py
- python Makespec.py /path/to/yourscript.py
- python Build.py /path/to/yourscript.spec
- start app: yourscript/dist/yourscript/yourscript(binary executable)
(*) Freeze instructions:
- svn checkout http://svn.python.org/projects/python/trunk/Tools/freeze/
- python freeze/freeze.py yourscript.py
- make
Wednesday, October 13, 2010
simple multivariate classifier example using python & numpy
I was wondering how long it could take to write a multivariate classifier in python.
With python and numpy it isn't long. We simply need to be able to compute the covariance matrix, its determinant and its inverse. Even if the covariance matrix is singular, which means it can't be inverted, you can easily compute the pseudo-inverse (Moore-Penrose) instead (i.e.: numpy.linalg.pinv).
As expected, assuming too much about the data leads to poor classification.
You can find a simple python program of 75 lines here.
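The linked program is not reproduced on this page, so here is only a minimal sketch of the same ingredients: one multivariate Gaussian per class, with numpy.linalg.pinv standing in for the inverse of the covariance matrix. The class name `GaussianClassifier` is hypothetical, and the log-determinant term assumes a non-singular covariance; a singular one would need separate care for that term.

```python
import numpy as np


class GaussianClassifier:
    """One multivariate Gaussian per class; predict the most likely class."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.params_ = []
        for c in self.classes_:
            Xc = X[y == c]
            mean = Xc.mean(axis=0)
            cov = np.cov(Xc, rowvar=False)
            # pseudo-inverse: equals the inverse when cov is invertible
            inv = np.linalg.pinv(cov)
            # log|cov| (assumes a positive determinant in this sketch)
            _, logdet = np.linalg.slogdet(cov)
            logprior = np.log(len(Xc) / len(X))
            self.params_.append((mean, inv, logdet, logprior))
        return self

    def predict(self, X):
        scores = []
        for mean, inv, logdet, logprior in self.params_:
            diff = X - mean
            # squared Mahalanobis distance of every row to the class mean
            maha = np.einsum('ij,jk,ik->i', diff, inv, diff)
            scores.append(logprior - 0.5 * (logdet + maha))
        return self.classes_[np.argmax(np.array(scores), axis=0)]


X0 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2]])
X = np.vstack([X0, X0 + 5.0])          # class 1 is class 0 shifted by (5, 5)
y = np.array([0] * 5 + [1] * 5)
pred = GaussianClassifier().fit(X, y).predict(np.array([[0.4, 0.6], [5.5, 5.2]]))
print(pred)
```

The constant term of the Gaussian log-density is dropped because it is the same for every class and does not change the argmax.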
Wednesday, July 21, 2010
patching class function in python
Today we had to patch a class function in production. Monkey patching can become tricky if references are kept in several places, like pointers in C and C++.
Here is a simple example of how to make sure all references will use the new definition.
class Foo:
    def f(self):
        print("default f")

def newf(self):
    print("newf")

# swap the code object in place (Python 2 spelling: Foo.f.im_func.func_code = newf.func_code)
Foo.f.__code__ = newf.__code__
This is another reason why interpreted languages like python are so powerful.
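To see why replacing the code object matters, the sketch below (Python 3, with a second class `Bar` added purely for comparison) captures a bound method before patching and then compares the in-place patch with a plain reassignment of the class attribute.

```python
class Foo:
    def f(self):
        return "default f"


class Bar:
    def f(self):
        return "default f"


def newf(self):
    return "newf"


foo, bar = Foo(), Bar()
foo_ref = foo.f  # bound method captured before patching
bar_ref = bar.f

# In-place patch: the existing function object gets new code.
Foo.f.__code__ = newf.__code__
# Plain rebinding: only the class attribute points to the new function.
Bar.f = newf

in_place = (foo_ref(), foo.f())
rebound = (bar_ref(), bar.f())
print(in_place, rebound)
```

The reference kept from `Foo` picks up the new behaviour, while the one kept from `Bar` still runs the old function, which is exactly the stale-reference problem described above.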
Friday, January 8, 2010
matplotlib & python for powerful data visualization
Here is an example of data that isn't obvious to analyze:
What is the gain-and-loss effect of counting by percentage of seats, from the point of view of proportional representation? Percentage of seats is what usually decides power in legislative assemblies. It is the process used in Canadian and Québec elections.
Powerful visualization lets you see the effect easily. Python & matplotlib is an amazing combination for this. It took me 20 minutes to visualize the effect in the federal and Quebec elections of 2008.
Upper graph (seats vs votes) shows the loss of proportional vote % if you use a seats approach. As an example, the Liberals gain ~11% and the ADQ loses ~11%.
Lower graph (lost seats vs votes). The real impact on a party is the ratio of this loss to its real vote proportion. In this example, it is a gain of ~25% for each Liberal vote (11/(66/125)) and a loss of 66% for the ADQ and ~88% for QS.
Basically:
- In the Canadian election, the PC & BQ gain power, the BQ far more in proportion, and the Greens lose everything
- In the Quebec election, QS & ADQ lose a lot of power and the PQ and LIB gain it: it might explain why they aren't talking about changing the election formula
- Matplotlib and python is an amazing combination to automate data visualization
svn co https://mlboost.svn.sourceforge.net/svnroot/ mlboost/elections
python elections/seats_vs_prop.py
Gerrymandering Explained (youtube; Gerrymandering - another reason why rep democracy is fundamentally corrupt http://bit.ly/qO4mpH)
