view python/cheeseshop.txt @ 280:c738405d0d6c

mkdir -p
author Jeff Hammel <jhammel@mozilla.com>
date Thu, 02 May 2013 11:21:30 -0700
parents a599feff7c93

Making your own cheeseshop

It is a common misconception that the cheeseshop (pypi.python.org) is
*the* place to go for python packages.  This is only true by
convention.  I won't go into the history of the
distutils/setuptools/distribute/pip packaging fiasco.  Suffice it to
say that (I hope) things are slowly converging and that pip and
easy_install, the two major ways of installing python software, both
support a -i option to specify the URL of the package index (which is,
by default, http://pypi.python.org/simple).

In their basic form, easy_install and pip just crawl links.  They look
at <base URL>/<package>/ (see above for the base URL), find a link
named like <package>-<version>.<extension>, download it, unpack it,
run `python setup.py install` on it and you're done.  So if you want to
make a cheeseshop, there are two essential tasks:

 1. Generating e.g. tarballs for a package and all of its dependencies
 2. Putting these tarballs on the web with some appropriate parent
    directory

Not rocket science...barely computer science, really.
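
To make the link crawling concrete, here is a minimal sketch of how an
installer might pick download candidates off a package page.  This is
not the actual easy_install or pip code; the names LinkCollector and
candidate_downloads, the example.com URL, and the "example" package are
all made up for illustration, and it assumes plain links with no URL
fragments.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def candidate_downloads(index_url, package, page_html):
    """Return absolute URLs on a package page that look like source archives."""
    # The package page lives at <base URL>/<package>/
    page_url = index_url.rstrip("/") + "/" + package + "/"
    collector = LinkCollector()
    collector.feed(page_html)
    extensions = (".tar.gz", ".tgz", ".zip")
    found = []
    for href in collector.links:
        filename = href.rsplit("/", 1)[-1]
        # Keep only links named <package>-<something>.<archive extension>
        if filename.startswith(package + "-") and filename.endswith(extensions):
            found.append(urljoin(page_url, href))
    return found


# Hypothetical page, similar to what a generated directory listing contains
page = ('<a href="../">Parent Directory</a> '
        '<a href="example-1.0.tar.gz">example-1.0.tar.gz</a>')
print(candidate_downloads("http://example.com/packages", "example", page))
```

A real installer also has to pick the best version and handle more link
formats, but the core idea is about this simple.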

For generating packages and their dependencies, I used pip.  pip is
really great for this.  I only used the command line interface, though
if I were smarter, I would have looked at the API, figured out what
pip is doing internally, and avoided a few steps.  Basically, using the
--no-install option downloads the package and its dependencies for you
and lets you do what you want with them.

I made a program for this, see http://k0s.org/hg/stampit . It's a
python package, but it doesn't really do anything python-y.  It was
just easier to write than a shell script for my purposes.  Basically
it makes a virtualenv (probably overkill already), downloads the
packages and their dependencies into it, runs `python setup.py sdist`
on each package so that you have a source distribution, and prints out
the location of each tarball.
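
The sdist step can be sketched roughly as follows.  This is not the
stampit source, just an illustration of the idea; the function name
build_sdists and the assumption that the downloaded packages sit
unpacked, one per subdirectory, under a single build directory are
mine.

```python
import os
import subprocess
import sys


def build_sdists(build_dir, run=subprocess.check_call):
    """Run `python setup.py sdist` in every unpacked package under build_dir.

    Returns the paths of the files left in each package's dist/ directory.
    The `run` argument exists only so the command runner can be swapped out.
    """
    tarballs = []
    for name in sorted(os.listdir(build_dir)):
        package_dir = os.path.join(build_dir, name)
        if not os.path.isfile(os.path.join(package_dir, "setup.py")):
            continue  # not an unpacked package
        # Build the source distribution with the current interpreter
        run([sys.executable, "setup.py", "sdist"], cwd=package_dir)
        dist_dir = os.path.join(package_dir, "dist")
        if os.path.isdir(dist_dir):
            for filename in sorted(os.listdir(dist_dir)):
                tarballs.append(os.path.join(dist_dir, filename))
    return tarballs
```

Calling build_sdists on the directory holding the downloaded packages
and printing each returned path gives the same kind of output: one
tarball location per line.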

The source distribution is very important as we want packages that
will work independent of platform.  Source distributions should.  If
one doesn't, we can fix the package so that it does.

So problem #1 solved.  Let's move on to problem #2: putting them
somewhere on the web.

Mozilla is so kind as to have given me a URL space on
people.mozilla.org. Since easy_install and pip are really dumb and
basically just crawl links, and since Apache is smart enough to
generate index pages for directories that don't have index.html files
in them, the hard part is already solved.  I will note that
people.mozilla.org is not intended as a permanent place for these
tarballs, just an interim instance until we decide where we really
want to put them.

Since I like to write scripts, I wrote a script that will run stampit
and copy the resulting tarballs to a place appropriate to a
cheeseshop.  You can see the code here:

http://k0s.org/mozilla/package-it.sh

The variables are pretty specialized to my setup, but of course that's fixable.
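
For the curious, the copying step amounts to something like the sketch
below.  This is not the contents of package-it.sh; the function name
publish and the web_root argument are hypothetical, and it assumes
tarballs named <package>-<version>.<extension> with no hyphen in the
version.

```python
import os
import shutil


def publish(tarballs, web_root):
    """Copy each tarball to <web_root>/<package>/<tarball name>."""
    published = []
    for tarball in tarballs:
        filename = os.path.basename(tarball)
        # Assumes <package>-<version>.<extension> naming, with no hyphen
        # in the version, so everything before the last hyphen is the name
        package = filename.rsplit("-", 1)[0]
        package_dir = os.path.join(web_root, package)
        os.makedirs(package_dir, exist_ok=True)
        destination = os.path.join(package_dir, filename)
        shutil.copy(tarball, destination)
        published.append(destination)
    return published
```

Point web_root at a directory the web server exposes and the generated
index pages take care of the rest.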

Does it really work?

Yes!  You can try it for yourself. Try:

``easy_install -i http://people.mozilla.org/~jhammel/packages/ mozmill``

Watch where the links come from.  Surprise!  They're all from
http://people.mozilla.org/~jhammel/packages/ !
I would *highly advise* doing this (and just about everything else in
python) in a virtualenv so that you don't pollute your global
site-packages.

Why am I doing this?

The Firefox buildslaves are supposed to fetch data only from mozilla
URLs for stability.  So, if python packages need to be installed, they
need to be available internal to Mozilla.  If packages didn't have
dependencies, this would be a no-brainer.  But packages do have
dependencies.  Mozmill depends on jsbridge, simplejson, and mozrunner.
While this is a lot of work for just one package, if we want more
python stuff in our buildbot tests, we'll need to do more of this, and
I'd rather have a good solid methodology to do so.  I also imagine
this growing as a place to put all of our python packages for internal
Mozilla needs.

I will note that I did this in a few hours from basically knowing the
problem space but never having actually done it.  None of this is
supposed to be a clean and polished solution.  But really, it's not
bad.  We did something similar but less functional at my last job, The
Open Planning Project, for similar reasons, so it's not like I tackled
this blindly.  This is not as fully functional as the cheeseshop.  A
maintainer needs to run the package-it.sh script for each package (and
its deps) they want installed.  There are no accounts or any of the
other features the cheeseshop has.  But for a simple prototype and a
way to move the discussion forward, it's actually not that bad of a
solution.  There are more robust ways of really doing the cheeseshop,
such as http://github.com/ask/chishop , but for a package dumping
ground, this solution works and it's really not even that hacky (in my
opinion anyway).