It is a common misconception that the cheeseshop (pypi.python.org) is the place to go for python packages. This is only true by convention. I won't go into the history of the distutils/setuptools/distribute/pip packaging fiasco. Suffice it to say that (I hope) things are slowly converging, and that pip and easy_install, the two major ways of installing python software, both support a -i option to specify the URL of the package index (which is, by default, http://pypi.python.org/simple).
In their basic form, easy_install and pip just crawl links. They look under the base URL (see above) for <package>/<package>-<version>.<extension>, download it, unpack it, run python setup.py install on it, and you're done. So if you want to make a cheeseshop, there are two essential tasks:
- Generating e.g. tarballs for a package and all of its dependencies
- Putting these tarballs on the web with some appropriate parent directory
Not rocket science...barely computer science, really.
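To make the "just crawl links" idea concrete, here is a minimal sketch of what an installer does with one package page. It is not the actual easy_install or pip code. The function name candidate_downloads, the example.com URL, the sample page, and the 1.0 version number are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an index page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def candidate_downloads(index_url, package, html):
    """Return absolute URLs on the page that look like sdists of `package`."""
    collector = LinkCollector()
    collector.feed(html)
    # The package page lives at <index_url>/<package>/
    page_url = urljoin(index_url, package + "/")
    prefix = package + "-"
    extensions = (".tar.gz", ".zip")
    found = []
    for href in collector.links:
        filename = href.rsplit("/", 1)[-1]
        if filename.startswith(prefix) and filename.endswith(extensions):
            found.append(urljoin(page_url, href))
    return found


# A hypothetical Apache-style directory listing for one package.
SAMPLE_PAGE = """
<html><body>
<a href="../">Parent Directory</a>
<a href="mozmill-1.0.tar.gz">mozmill-1.0.tar.gz</a>
<a href="README.txt">README.txt</a>
</body></html>
"""

urls = candidate_downloads("http://example.com/packages/", "mozmill", SAMPLE_PAGE)
print(urls)
```

Real installers do more (version comparison, following extra links), but the core is about this simple.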
For generating packages and their dependencies, I used pip. pip is really great for this. I only used the command line interface, though if I were smarter, I probably would have looked at the API, figured out what pip does internally, and avoided a few steps. Basically, the --no-install option downloads the package and its dependencies for you and lets you do what you want with them.
I made a program for this, see http://k0s.org/hg/stampit . It's a python package, but it doesn't really do anything python-y. It was just easier to write than a shell script for my purposes. Basically it makes a virtualenv (probably overkill already), downloads the packages and their dependencies into it, runs python setup.py sdist on each package so that you have a source distribution, and prints out the location of each tarball.
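The "run python setup.py sdist on each package" step can be sketched roughly as follows. This is not stampit's code, just an illustration under some assumptions: the downloaded sources sit unpacked in one directory per package, and the helper names find_source_trees and sdist_plan, the /tmp/dist output directory, and the demo directory names are all hypothetical. The sketch only builds the commands; nothing is executed.

```python
import os
import sys
import tempfile


def find_source_trees(build_dir):
    """Return the subdirectories of build_dir that contain a setup.py, sorted."""
    trees = []
    for name in sorted(os.listdir(build_dir)):
        tree = os.path.join(build_dir, name)
        if os.path.isfile(os.path.join(tree, "setup.py")):
            trees.append(tree)
    return trees


def sdist_plan(build_dir, dist_dir, python=sys.executable):
    """Return (argv, cwd) pairs; running each argv in its cwd builds one sdist."""
    argv = [python, "setup.py", "sdist", "--dist-dir", dist_dir]
    return [(list(argv), tree) for tree in find_source_trees(build_dir)]


# Demo on a throwaway directory with two fake packages and one stray folder.
with tempfile.TemporaryDirectory() as build_dir:
    for name in ("mozrunner", "jsbridge"):
        os.mkdir(os.path.join(build_dir, name))
        open(os.path.join(build_dir, name, "setup.py"), "w").close()
    os.mkdir(os.path.join(build_dir, "not-a-package"))

    plan = sdist_plan(build_dir, "/tmp/dist")
    for argv, cwd in plan:
        # To actually build: subprocess.check_call(argv, cwd=cwd)
        print(os.path.basename(cwd), "->", " ".join(argv[1:]))
```

Passing an absolute --dist-dir collects every tarball in one place, which makes the later copy step easier.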
The source distribution is very important, as we want packages that work independent of platform. Source distributions should. If one doesn't, we can fix it.
So problem #1 solved. Let's move on to problem #2: putting them somewhere on the web.
Mozilla is so kind as to have given me a URL space on people.mozilla.org. Since easy_install and pip are really dumb and basically just crawl links, and since Apache is smart enough to generate index pages for directories that don't have index.html files in them, the hard part is already solved. I will note that people.mozilla.org is not intended as a permanent place for these tarballs, just an interim home until we decide where we really want to put them.
Since I like to write scripts, I wrote a script that will run stampit and copy the resulting tarballs to a place appropriate to a cheeseshop. You can see the code here:
http://k0s.org/mozilla/package-it.sh
The variables are pretty specialized to my setup, but of course that's fixable.
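For a feel of what "a place appropriate to a cheeseshop" means, here is a rough Python sketch of the copy step. It is not a translation of package-it.sh. It assumes the <package>/<package>-<version>.tar.gz layout described earlier and assumes the version contains no hyphen. The names package_name and publish, and the file names in the demo, are hypothetical.

```python
import os
import shutil
import tempfile


def package_name(tarball):
    """Guess the package name from a '<package>-<version>.tar.gz' filename."""
    filename = os.path.basename(tarball)
    if not filename.endswith(".tar.gz"):
        raise ValueError("not a .tar.gz sdist: %s" % filename)
    stem = filename[: -len(".tar.gz")]
    # Assumption: the version has no hyphen, so split on the last one.
    name, sep, version = stem.rpartition("-")
    if not (name and sep and version):
        raise ValueError("cannot split name and version: %s" % filename)
    return name


def publish(tarballs, web_root):
    """Copy each tarball into web_root/<package>/ and return the new paths."""
    copied = []
    for tarball in tarballs:
        target_dir = os.path.join(web_root, package_name(tarball))
        os.makedirs(target_dir, exist_ok=True)
        copied.append(shutil.copy(tarball, target_dir))
    return copied


# Demo with empty placeholder files standing in for real tarballs.
with tempfile.TemporaryDirectory() as dist_dir, tempfile.TemporaryDirectory() as web_root:
    tarballs = []
    for filename in ("mozmill-1.0.tar.gz", "simplejson-2.0.tar.gz"):
        path = os.path.join(dist_dir, filename)
        open(path, "w").close()
        tarballs.append(path)

    copied = publish(tarballs, web_root)
    layout = sorted(os.path.relpath(path, web_root) for path in copied)
    print(layout)
```

Point web_root at a directory Apache serves with auto-indexing and the result is something -i can be aimed at.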
Does it really work?
Yes! You can try it for yourself. Try:
easy_install -i http://people.mozilla.org/~jhammel/packages/ mozmill
Watch where the links come from. Surprise! They're all from http://people.mozilla.org/~jhammel/packages/ ! I would highly advise doing this (and just about everything else in python) in a virtualenv so that you don't pollute your global site-packages.
Why am I doing this?
The Firefox buildslaves are supposed to fetch data only from mozilla URLs for stability. So, if python packages need to be installed, they need to be available internal to Mozilla. If a package didn't have dependencies, this would be a no-brainer. But packages do have dependencies. Mozmill depends on jsbridge, simplejson, and mozrunner. While this is a lot of work for just one package, if we want more python stuff in our buildbot tests, we'll need to do more of this, and I'd rather have a good solid methodology to do so. I also imagine this growing as a place to put all of our python packages for internal Mozilla needs.
I will note that I did this in a few hours, basically knowing the problem space but never having actually done it. None of this is supposed to be a clean and polished solution. But really, it's not bad. We did something similar but less functional at my last job, The Open Planning Project, for similar reasons, so it's not like I tackled this blindly. This is not as fully functional as the cheeseshop. A maintainer needs to run the package-it.sh script for each package (and its deps) they want installed. There are no accounts or any of the other features the cheeseshop has. But for a simple prototype and a way to move the discussion forward, it's actually not that bad of a solution. There are more robust ways of really doing the cheeseshop, such as http://github.com/ask/chishop , but for a package dumping ground, this solution works and it's really not even that hacky (in my opinion anyway).
A few existing pypi servers worth noting:
And related tools: