EC2 setup and SSH access from and to Ubuntu 10.04 boxes


step 0: X.509 certs

  • AWS console >> Security Credentials >> “X.509 Certificates” tab
  • Save private key and certs in ~/.ec2
  • $ chmod go-rwx ~/.ec2/*.pem # remove group/other access to the key files
  • Get account id @ page bottom

step 1: install EC2 terminal tools

  • $ sudo apt-add-repository ppa:awstools-dev/awstools # add the awstools PPA (Personal Package Archive)
  • $ sudo apt-get update # refresh the package lists
  • $ sudo apt-get install ec2-api-tools # install tools
  • create new ~/.ec2 dir
  • AWS console >> create new private key >> save in above dir

step 2: setup paths

step 3: fire up instance

  • http://uec-images.ubuntu.com/releases/10.04/release/ # list of published Ubuntu AMIs
  • $ ec2-run-instances ami-xxxxx -k ${EC2_KEYPAIR} -t <instance type> --region <aws region>
  • $ ec2-run-instances ami-349b495d -k ${EC2_KEYPAIR} --instance-type t1.micro --region us-east-1
  • $ ec2-run-instances ami-809a48e9 -k ${EC2_KEYPAIR} --instance-type m1.small --region us-east-1

step 4: Log in!

  • $ ec2-describe-instances # get the public DNS name of your instance from here
  • $ ec2-authorize default -p 22 # open the default SSH port (22) in the default security group
  • $ ssh -i ~/.ec2/ec2.pem ubuntu@ec2-23-20-123-214.compute-1.amazonaws.com

step 5: Install shit!

step 6: Kill the server

from: https://help.ubuntu.com/community/EC2StartersGuide

 

Tastypie API for Django


STEP 0: Installing pre-requisites (setup on python 2.6.5)

Note: Don’t name your project directory ‘api’

STEP 1: Django-Tastypie ( dev style because you’re a good contributing programmer )

  • $ cd ~/python_projects/
  • $ git clone https://github.com/toastdriven/django-tastypie.git
  • $ cp -r django-tastypie/ ~/django_projects/api/

STEP 2: Implement

STEP 3: Add the api

  • $ python manage.py startapp sequence_api # app names must be valid Python identifiers, so no hyphens
  • add this code to sequence_api/models.py https://gist.github.com/1770411
  • $ python manage.py sql sequence_api
  • $ python manage.py syncdb

STEP 4: Verify installation

  • $ python manage.py shell
  • >>> from sequence_api.models import Gene
  • >>> Gene.objects.all()

 

Python, Ruby and Simple Montecarlo Simulations


The above post raises a slight issue with the Ruby script: the function being called returns a random float between 0 inclusive and 1 exclusive. With Heads decided by the set [0, 0.5] and Tails by (0.5, 1), the script is very slightly skewed towards Heads, since Heads owns both the 0 endpoint and the 0.5 boundary.

With random.uniform(0, 1) in Python we can get a little closer, because the upper endpoint may be returned depending on floating-point rounding, but we’re still not guaranteed a true random draw from [0, 1]. An additional workaround is to scale the random range according to the number of possible outcomes, so that every outcome gets an equal share of the values that can be drawn; the simplest form is to pick an integer index over the outcome set. Scale it to your set size and you’re good to go!
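
The skew discussed above can be checked empirically. What follows is a minimal sketch rather than the script from the referenced post: the function names are made up for illustration, and it assumes a fair coin is the goal. One flip splits [0, 1) into the two equal halves [0, 0.5) and [0.5, 1); the other skips floats entirely and picks straight from the two-element outcome set.

```python
import random


def flip_threshold(rng):
    """Heads if a float drawn from [0, 1) lands in the lower half [0, 0.5)."""
    return "H" if rng.random() < 0.5 else "T"


def flip_discrete(rng):
    """Heads or tails picked directly from the two-element outcome set."""
    return rng.choice("HT")


def simulate(flip, n, seed=None):
    """Run n flips with a seeded generator and return the fraction of heads."""
    rng = random.Random(seed)
    heads = sum(1 for _ in range(n) if flip(rng) == "H")
    return heads / float(n)


if __name__ == "__main__":
    for name, flip in (("threshold", flip_threshold), ("discrete", flip_discrete)):
        print("{0}: {1:.4f}".format(name, simulate(flip, 100000, seed=42)))
```

Both versions should hover around 0.5. Any skew from a single boundary value is far smaller than the sampling noise of a simulation this size, so the choice between them is mostly about stating the outcome set clearly.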

 

Learning Statistics from ground zero: Python + Khan Academy


I think this is going to be a cool way to go about learning statistics. Started on the first few problems at Khan Academy and they’re doing an awesome job explaining the breakdown of statistics!

So far I’ve written some short programs for sample mean, sample standard deviation, population mean, and population standard deviation:

http://bpaste.net/show/22474/
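
The pasted programs are not reproduced here. As a rough illustration of the four quantities, here is a minimal sketch using the textbook definitions; the function names are assumptions and may not match the paste. The only difference between the sample and population versions of the standard deviation is the divisor: n - 1 for a sample, n for a population.

```python
import math


def mean(values):
    """Arithmetic mean; the same formula serves sample and population."""
    return sum(values) / float(len(values))


def population_variance(values):
    """Average squared deviation from the mean, dividing by n."""
    mu = mean(values)
    return sum((x - mu) ** 2 for x in values) / float(len(values))


def sample_variance(values):
    """Squared deviation from the mean, dividing by n - 1."""
    if len(values) < 2:
        raise ValueError("sample variance needs at least two values")
    xbar = mean(values)
    return sum((x - xbar) ** 2 for x in values) / float(len(values) - 1)


def population_std(values):
    return math.sqrt(population_variance(values))


def sample_std(values):
    return math.sqrt(sample_variance(values))


if __name__ == "__main__":
    data = [2, 4, 4, 4, 5, 5, 7, 9]
    print("mean: {0}".format(mean(data)))
    print("population std: {0}".format(population_std(data)))
    print("sample std: {0}".format(sample_std(data)))
```

The sample standard deviation always comes out a little larger than the population one for the same data, because it divides by the smaller number.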

 

A good app is a FAST app


Getting Django on Heroku prancing 8 times faster.

 

Troubleshooting RESTful API calls on HMMER


STEP 0: Attempt to use example API call:

  • $ curl -L -H 'Expect:' -H 'Accept:text/xml' -F seqdb=pdb -F algo=phmmer -F seq='
  • curl: (26) failed creating formpost data

So we’re working with a broken API call here. First off, Google that error and we get back a suggestion to run curl under strace, which logs the system calls the command makes.

STEP 1: Run curl under strace

  • $ strace curl -L -H 'Expect:' -H 'Accept:text/xml' -F seqdb=pdb -F algo=phmmer -F seq='
  • returns: http://pastie.org/3036474

Substantial amount of text there, but let’s work up from the bottom:

STEP 2: Problem?

  • If I had to guess: line 448
  • open("test.seq", O_RDONLY) = -1 ENOENT (No such file or directory)

STEP 3: The call expects a test.seq file as input, so create it in the working directory

STEP 4: Run it again!

  • $ curl -L -H 'Expect:' -H 'Accept:text/xml' -F seqdb=pdb -F algo=phmmer -F seq='
  • Beautiful! We get the expected output!

Note: additionally, the sequence can be appended directly to the curl HTTP request: http://pastie.org/3036798
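
The missing-file problem found in the strace output can be caught before curl is ever run. Below is a minimal sketch of such a pre-flight check. The helper name is hypothetical, and the field value seq=<test.seq is an assumption inferred from the strace line, since the original command is cut off. It relies on curl's -F convention that a value starting with < (read content from file) or @ (upload file) names a local file.

```python
import os


def missing_form_files(form_fields, base_dir="."):
    """Return local files named by curl-style -F values that do not exist.

    A value beginning with '<' or '@' tells curl to read from a file;
    anything else is sent as a literal string and is ignored here.
    """
    missing = []
    for field in form_fields:
        name, sep, value = field.partition("=")
        if sep and value[:1] in ("<", "@"):
            # Drop trailing curl options such as ';type=text/plain'.
            path = value[1:].split(";", 1)[0]
            if not os.path.isfile(os.path.join(base_dir, path)):
                missing.append(path)
    return missing


if __name__ == "__main__":
    # Hypothetical field list; the seq value is assumed, not taken from the post.
    fields = ["seqdb=pdb", "algo=phmmer", "seq=<test.seq"]
    print(missing_form_files(fields))
```

An empty list means every referenced file is present; otherwise the list names the files to create before retrying the call.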

 

Haml, Sass and Compass for Django development on Ubuntu


STEP 0: Install HamlPy

  • $ sudo easy_install hamlpy

STEP 1: Install Sass

  • $ cd ~/
  • $ git clone git://github.com/nex3/sass.git
  • $ cd sass/
  • $ rake install
  • $ cd ~/
  • $ rm -rf sass/

STEP 2: Install Compass ( for this directory structure ) NOTE: Ubuntu doesn’t require quotes around the dir names

  • $ gem install compass
  • $ cd django_projects/mysite/static/
  • $ compass install compass . --syntax sass --sass-dir sass --css-dir css --javascripts-dir javascripts --images-dir images
  • you’ll get a console output like this: http://pastie.org/3028330
  • Haml and Pythonize: http://pastie.org/3028383

STEP 3: Run Haml watcher

  • $ cd django_projects/mysite/
  • $ hamlpy-watcher templates/

STEP 4: Run Sass/Compass watcher ( in new terminal )

  • $ cd ~/django_projects/mysite/
  • $ compass watch

Notes:

  • With this setup, all files need to be explicitly generated from their .haml and .sass formats using the watchers
  • Unlike a typical Ruby deployment, run-time compilation of the CSS won’t happen unless you’re running Ruby on your production servers
  • Using two code bases: Compass/Sass with Django and HamlPy

 

Install NumPy 1.6.1 on Ubuntu 10.04 via command line


STEP 0: Only run this if you’ve attempted previous installations ( this will clean out the needed dirs )

  • sudo rm -rf /usr/local/lib/python2.6/dist-packages/matplotlib*
  • sudo rm -rf /usr/local/lib/python2.6/dist-packages/pylab*
  • sudo rm -rf /usr/local/lib/python2.6/dist-packages/mpl_toolkits/mplot3d
  • sudo rm -rf /usr/local/lib/python2.6/dist-packages/mpl_toolkits/axes_grid
  • sudo rm -rf /usr/local/lib/python2.6/dist-packages/mpl_toolkits/axes_grid1
  • sudo rm -rf /usr/local/lib/python2.6/dist-packages/mpl_toolkits/axisartist
  • sudo rm /usr/local/lib/python2.6/dist-packages/mpl_toolkits/*.py

STEP 1: Tricky workaround to install the dependencies

  • sudo apt-get build-dep python-matplotlib

STEP 2: Now we remove the old NumPy, while the installed dependencies remain

  • sudo apt-get remove python-numpy

STEP 3: Download NumPy 1.6.1

  • cd ~/
  • wget http://downloads.sourceforge.net/project/numpy/NumPy/1.6.1/numpy-1.6.1.tar.gz

STEP 4: Install NumPy 1.6.1

  • tar xzvf numpy-1.6.1.tar.gz
  • cd numpy-1.6.1
  • python setup.py build
  • sudo python setup.py install
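
To confirm which NumPy the interpreter actually picks up after the install, a quick check along these lines may help. It is a minimal sketch with a made-up function name, and it only reports what is importable; it is best run from outside the numpy-1.6.1 source directory so that the installed copy is the one imported.

```python
def installed_numpy_version():
    """Return the version string of the importable NumPy, or None if absent."""
    try:
        import numpy
    except ImportError:
        return None
    return numpy.__version__


if __name__ == "__main__":
    version = installed_numpy_version()
    if version is None:
        print("NumPy is not importable from this interpreter")
    else:
        print("NumPy {0}".format(version))
```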

… And you’re all done!

Thanks to the original post on the Ubuntu forums for this clever little workaround, and now to the hacking!

http://ubuntuforums.org/showthread.php?t=1573925

 

Python GHMM on Ubuntu 10.04 installation


STEP 0: Install prerequisite packages

  • $ sudo apt-get install python-dev libxml2-dev swig libtool

STEP 1: Install GHMM

  • $ svn co https://ghmm.svn.sourceforge.net/svnroot/ghmm/trunk/ghmm
  • $ cd ghmm/
  • $ sh autogen.sh
  • $ ./configure
  • $ make
  • $ sudo make install

additional troubleshooting here:
http://www.linuxquestions.org/questions/linux-software-2/ghmm-library-277690/

original library:
http://ghmm.sourceforge.net/installation.html

Keywords: General Hidden Markov Model, Python, Artificial intelligence

 

The evolution of the high-throughput biology lab


The evolution of the biology lab:
- entirely automated
- no human interaction with physical equipment
- run from a UI or an API
- remove human error through compiling / error checking before running a protocol

Think of it in terms of a computer: to interact with it and use it to some end, you don’t need to be working INSIDE of it. Humans in a lab only create errors.

The above I see as being clear; if you have some say on how I might be wrong, please voice it.

However, my primary assertion is this:

The evolution of the most powerful systems in biological research ( and thus the individuals who do the most damage ) will abstract scientific protocols to a programming language.

Yes, scientists today like pretty interfaces ( they don’t want to have to think… and this is a self-detrimental post coming from a designer ). However, factor in the complex nature of biological research and the need for multiplexing: why do we have 96-well microplates, which are now being replaced with 384- or even 1536-well plates?

Multiplexing.

The systems as they are engineered today exist to make small steps towards automation, and slowly automate the research process.  The moment we have a system capable of cross-automating all of the biological research done in a lab…

The very best biologists will be programmers.

Current downfall:

Entirely cross-automating / designing all systems to work together seamlessly.

http://images.sciencedaily.com/2009/04/090402143451-large.jpg