Python Packaging

python is often criticized for its module loading procedure and, relatedly, its packaging mechanism (indeed, these issues are often conflated). While there are several moving parts, and not one obvious way to do it, python's module importing, at least as far as what generally gets inserted in sys.path, is fairly easy to understand, and while the installation story is a bit more confused, it is fairly easy to package and install python software in a manner appropriate to your deployment story. This document attempts to present the basics of where python looks for modules and the various installation options in a thorough yet actionable way. In my opinion, once you understand the basics, python has several good installation models, and I also find that most people who criticize python for doing this-or-that are usually the ones that have given up before actually understanding what it does.

sys.path

python consults the directories in sys.path, in order, to determine what modules to load. While the order may be changed, the usual case is to look for module imports in the following order:

  1. in the current working directory
  2. in PYTHONPATH
  3. in site packages loaded by .pth files

You can also alter sys.path programmatically, but this in general is not a good packaging solution.
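
For illustration, here is a minimal sketch of the (discouraged) programmatic approach; the lib directory and somemodule names here are hypothetical:

import os, sys

# prepend a directory relative to this file; brittle, since the importing
# module now hard-codes knowledge of the filesystem layout
here = os.path.dirname(os.path.abspath(__file__))
sys.path.insert(0, os.path.join(here, 'lib'))

import somemodule  # hypothetical module living in ./lib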


PYTHONPATH

PYTHONPATH is an environment variable specifying an ordered (colon-separated on unix) list of directories in which to look for imports:

> echo "print 'how are you gentlemen?!?'" >> /tmp/allyourbase.py
> PYTHONPATH=/tmp python -c "import allyourbase"
how are you gentlemen?!?
        

Subdirectories relative to each element of PYTHONPATH are also recursed into if they contain an __init__.py file:

> mkdir /tmp/foo
> echo "import bar" >> /tmp/foo/__init__.py
> echo "print 'i am a tomato'" >> /tmp/foo/bar.py
> PYTHONPATH=/tmp python -c "import foo"
i am a tomato
        

Example of sys.path differences:

echo $PYTHONPATH; diff <(PYTHONPATH=/tmp python -c "import sys; print '\n'.join(sys.path)") <(python -c "import sys; print '\n'.join(sys.path)")
/home/jhammel/python:
2c2,3
< /tmp
---
> /home/jhammel/python
> /home/jhammel
        

.pth files

site.py searches its distribution directory for .pth files and adds the packages that it finds to sys.path. site.py and the associated .pth files are responsible for loading most packaged code (that is, installed python packages that aren't part of python's standard library).

site.py is imported automatically upon python initialization (unless the -S switch is passed). site.py is searched for in lib/python<version>/site.py relative to sys.prefix and sys.exec_prefix (if different). While any code can be put in site.py, generally this module tells python to load the packages in sys.prefix + 'lib/python<version>/site-packages' and sys.prefix + 'lib/python<version>/dist-packages', or other paths depending on the OS vendor.

python looks for its lib directory and site.py relative to PYTHONHOME if it is set, and otherwise relative to the path of the python binary. virtualenv relies on the latter behaviour. Since PYTHONHOME is honoured in python at the C level, virtualenv will not function correctly with PYTHONHOME set.

Example of a .pth file:

> cat python-support.pth
/usr/lib/pymodules/python2.6
gtk-2.0
/usr/lib/pymodules/python2.6/gtk-2.0
        
The specified directories are added to sys.path in order. If a line is not an absolute path, it is relative to the location of the .pth file.
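
The behaviour described above can be approximated in a few lines of python. This is only a rough sketch of the mechanism, not the standard library's actual logic (see site.addsitedir for the real implementation):

import os
import sys

def add_pth_entries(sitedir, pth_name):
    # rough approximation of how site.py processes a single .pth file
    for line in open(os.path.join(sitedir, pth_name)):
        line = line.rstrip()
        if not line or line.startswith('#'):
            continue
        if line.startswith('import '):
            exec line  # .pth files may contain executable import lines
            continue
        # os.path.join leaves the entry unchanged if it is already absolute;
        # otherwise it is taken relative to the .pth file's directory
        path = os.path.join(sitedir, line)
        if os.path.exists(path) and path not in sys.path:
            sys.path.append(path)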

PYTHONHOME

The PYTHONHOME environment variable controls where python looks for its lib directory and site.py (and therefore site-packages), as well as its include and bin directories. So unlike PYTHONPATH, which only affects sys.path, PYTHONHOME is also respected when installing software by any of the installers, making it much more pervasive.

PYTHONHOME is respected at the C level in the (standard) CPython implementation, so it is a high-level override of the standard behaviour. Typically, PYTHONHOME is used to maintain parallel python installations as an alternative to virtualenv.

Example of PYTHONHOME:

> python -c "import sys; print sys.path"
['', '/home/jhammel/python', '/home/jhammel/mozilla/src/mozilla-central', '/usr/lib/python2.6', '/usr/lib/python2.6/plat-linux2', '/usr/lib/python2.6/lib-tk', '/usr/lib/python2.6/lib-old', '/usr/lib/python2.6/lib-dynload', '/usr/lib/python2.6/dist-packages', '/usr/lib/python2.6/dist-packages/PIL', '/usr/lib/python2.6/dist-packages/gst-0.10', '/usr/lib/pymodules/python2.6', '/usr/lib/python2.6/dist-packages/gtk-2.0', '/usr/lib/pymodules/python2.6/gtk-2.0', '/usr/lib/python2.6/dist-packages/wx-2.8-gtk2-unicode', '/usr/local/lib/python2.6/dist-packages']
> PYTHONHOME=/tmp python -c "import sys; print sys.path"
'import site' failed; use -v for traceback
['', '/home/jhammel/python', '', '/tmp/lib/python2.6/', '/tmp/lib/python2.6/plat-linux2', '/tmp/lib/python2.6/lib-tk', '/tmp/lib/python2.6/lib-old', '/tmp/lib/python2.6/lib-dynload']
        
The null string in sys.path represents the current working directory. Note that, because the standard site.py isn't found relative to PYTHONHOME when it is set, the paths site.py would normally load are not included in sys.path in that case.

Python Module Installers

Several packages provide functionality to install python packages. While they differ in features, the basic model is common to all of them:

All the standard python installers make use of a setup.py file which calls a setup() function providing the package metadata and detailing what should be installed; setup() performs the installation. The standard way of installing python packages is to run python setup.py install in the package's directory. The package's python files will be copied to python's lib directory, from which they will be importable. By convention, the python files for the package are kept in subdirectories relative to setup.py. If other initialization or checks are required for installation of the python package, these may (and should) be done in setup.py as well.

distutils

distutils is the only installation module that is part of the python standard library. distutils is a basic packaging system: it does not provide for dependencies, web installation, or the like. The main reason to use distutils is that it will be present on any platform running python. Documentation on the distutils setup function may be read via python -c 'from distutils.core import setup; help(setup)'
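
For reference, a minimal distutils setup.py looks something like the following (the name and layout are illustrative, assuming the package's python files live in a foo subdirectory containing an __init__.py):

from distutils.core import setup

setup(name='foo',
      version='0.1',
      description='the foo package',
      packages=['foo'],  # directories (with __init__.py) to install
      )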

setuptools

setuptools improves over distutils in a number of ways, among them:

  • dependencies may be declared (via install_requires) and fetched automatically at install time
  • packages may be installed over the web from an index such as pypi (see easy_install below)
  • packages within a source tree may be discovered automatically with find_packages
  • console scripts and other plugins may be declared through entry points

However, setuptools is not part of the standard library (and never will be). Automagic installation over the network is available using ez_setup.py.

easy_install

easy_install is a widely-used program for installing python software over the web. Instead of the traditional way of navigating to the package on pypi, downloading the software (and dependencies, if any), unpacking, and running python setup.py install, easy_install allows package installation from the command line: easy_install <package-name>. This will install <package-name> and its dependencies to python's lib location. easy_install will look to pypi for its package index by default, but this may be overridden by specifying an index with the -i switch.

Deficiencies of setuptools

  • setuptools may not be installed: although setuptools is the de facto standard for python packaging, it may not be present on a user's system. While ez_setup.py may be used to ensure correct behaviour, this methodology is not only somewhat invasive, it also depends on python setup.py being run on a computer with net access. [Note: I personally do not use ez_setup.py]
  • often ill-maintained: while right now (May, 2010) setuptools seems to work more or less correctly, setuptools (particularly the web package retrieval portions) has historically often been brittle (for instance, when SVN 1.5 was released).
  • difficult to debug: when there is a problem with package installation, it is often quite hard to figure out what exactly went wrong, since setuptools does many things behind the scenes and its error messages are cryptic. pip helps with this (but is neither part of the standard library nor of setuptools; it is yet another third-party package).
  • find_links does not recurse; this means that the parent package has to know about its children's dependencies

distribute

distribute is a fork of setuptools, created when setuptools lacked maintenance, in order to have a maintained and forward-facing package management solution for python. While setuptools is still more commonly used, distribute is likely to be included in a future release of python and has received Guido's blessing. distribute otherwise behaves like setuptools, but aims at compatibility and streamlining going forward.

Example package

Directory Structure

.
|-- foo
|   |-- bar.py
|   `-- __init__.py
`-- setup.py

          

[Note: I removed the setup.cfg and the egg-info that PasteScript creates. The former is unnecessary and the latter should not be versioned.]

Importing foo

Once it is installed (via python setup.py install), you can import foo in the usual way independent of your current directory (as its location is provided by .pth files automatically loaded by site.py):

import foo
import foo.bar
from foo import bar
          

setup.py

The setup.py is a bit more complicated than it needs to be since it was created using a template. But it works and all the basic pieces are there:
from setuptools import setup, find_packages
import sys, os

version = '0.0'

setup(name='foo',
      version=version,
      description="the foo package",
      long_description="""\
""",
      classifiers=[], # Get strings from http://pypi.python.org/pypi?%3Aaction=list_classifiers
      keywords='',
      author='Jeff Hammel',
      author_email='jhammel@example.com',
      url='',
      license='MPL',
      packages=find_packages(exclude=['ez_setup', 'examples', 'tests']),
      include_package_data=True,
      zip_safe=False,
      install_requires=[
          # -*- Extra requirements: -*-
      ],
      entry_points="""
      # -*- Entry points: -*-
      """,
      )
          
This is a setuptools setup.py file, though a distutils or distribute setup.py would look similar.

Creating the foo package

> paster create foo
Selected and implied templates:
  PasteScript#basic_package  A basic setuptools-enabled package

Variables:
  egg:      foo
  package:  foo
  project:  foo
Enter version (Version (like 0.1)) ['']:   
Enter description (One-line description of the package) ['']: the foo package
Enter long_description (Multi-line description (in reST)) ['']: 
Enter keywords (Space-separated keywords/tags) ['']: 
Enter author (Author name) ['']: Jeff Hammel
Enter author_email (Author email) ['']: jhammel@example.com
Enter url (URL of homepage) ['']: 
Enter license_name (License name) ['']: MPL
Enter zip_safe (True/False: if the package can be distributed as a .zip file) [False]: 
Creating template basic_package
Creating directory ./foo
  Recursing into +package+
    Creating ./foo/foo/
    Copying __init__.py to ./foo/foo/__init__.py
  Copying setup.cfg to ./foo/setup.cfg
  Copying setup.py_tmpl to ./foo/setup.py
Running /home/jhammel/stage/bin/python setup.py egg_info
> echo 'print "hello world"' > foo/foo/bar.py
> cd foo; python setup.py develop > /dev/null # install in place
running develop
running egg_info
writing foo.egg-info/PKG-INFO
writing top-level names to foo.egg-info/top_level.txt
writing dependency_links to foo.egg-info/dependency_links.txt
writing entry points to foo.egg-info/entry_points.txt
reading manifest file 'foo.egg-info/SOURCES.txt'
writing manifest file 'foo.egg-info/SOURCES.txt'
running build_ext
Creating /home/jhammel/stage/lib/python2.6/site-packages/foo.egg-link (link to .)
foo 0.0dev is already the active version in easy-install.pth

Installed /home/jhammel/stage/foo
Processing dependencies for foo==0.0dev
Finished processing dependencies for foo==0.0dev
> python -c 'from foo import bar' # your directory location is unimportant
          
If you want to repeat this experiment, I recommend using virtualenv so as to not pollute your global site packages.

virtualenv

virtualenv is a virtual python implementation: a virtualenv provides a separate environment for the installation of python packages. This is a big boon for python development, as it allows easy installation of various pieces of python code without modification of the system site-packages.

On running virtualenv <directory>, activation scripts are created in <directory>/bin. These scripts change PATH to add this bin directory and provide a deactivate() function. However, giving the full path to executables in $VIRTUAL_ENV/bin will work without having to use the activate scripts (unless PYTHONHOME is set, in which case the scripts must be run, or PYTHONHOME otherwise unset, before using the virtualenv). Since the activate scripts set environment variables, they must be sourced in bash rather than run in a subshell: . <directory>/bin/activate.

virtualenv works by copying the system python binary (the one used to invoke virtualenv, unless otherwise specified) to the bin subdirectory of the given target, as well as symlinking the standard library into the lib/python<version> subdirectory. By default, the system site-packages will be included as well, but this may be overridden with the --no-site-packages switch. Because python looks for its lib directory relative to the binary location, the virtual environment may be used to install and load packages without impacting the system python. Since PYTHONHOME is examined before resolving relative to the binary, a virtualenv will not work when PYTHONHOME is set: unset PYTHONHOME or run the activate scripts before attempting to use the virtualenv's executables.

Example usage of virtualenv:

> virtualenv.py foo
New python executable in foo/bin/python
Installing setuptools............done.
> tree foo -L 2
foo
|-- bin
|   |-- activate
|   |-- activate_this.py
|   |-- easy_install
|   |-- easy_install-2.6
|   |-- pip
|   `-- python
|-- include
|   `-- python2.6 -> /usr/include/python2.6
`-- lib
    `-- python2.6

> foo/bin/python -c "import sys; print sys.prefix; print sys.path"
/home/jhammel/foo
['', '/home/jhammel/foo/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg', '/home/jhammel/foo/lib/python2.6/site-packages/pip-0.7.1-py2.6.egg', '/home/jhammel/python', '/home/jhammel', '/home/jhammel/foo/lib/python2.6', '/home/jhammel/foo/lib/python2.6/plat-linux2', '/home/jhammel/foo/lib/python2.6/lib-tk', '/home/jhammel/foo/lib/python2.6/lib-old', '/home/jhammel/foo/lib/python2.6/lib-dynload', '/usr/lib/python2.6', '/usr/lib/python2.6/plat-linux2', '/usr/lib/python2.6/lib-tk', '/home/jhammel/foo/lib/python2.6/site-packages', '/usr/local/lib/python2.6/site-packages', '/usr/local/lib/python2.6/dist-packages', '/usr/lib/python2.6/dist-packages', '/usr/lib/python2.6/dist-packages/PIL', '/usr/lib/python2.6/dist-packages/gst-0.10', '/usr/lib/pymodules/python2.6', '/usr/lib/python2.6/dist-packages/gtk-2.0', '/usr/lib/pymodules/python2.6/gtk-2.0', '/usr/lib/python2.6/dist-packages/wx-2.8-gtk2-unicode']
        

virtualenv comes bundled with setuptools, distribute, and pip (and of course has distutils since that is part of the standard library).

virtualenv.py may be used as a single file to create a virtual environment. Download virtualenv.py (from http://bitbucket.org/ianb/virtualenv/) and run it directly:

python virtualenv.py myenvironment
        
When used in this way, virtualenv.py will download setuptools, etc., from the network. If the whole virtualenv directory is downloaded, even if it is not installed, virtualenv.py will not touch the network.

pip

pip is a python installer. It is compatible with setuptools and provides additional functionality. pip is intended as a more intelligent version of easy_install. Like easy_install, pip looks by default to http://pypi.python.org/simple as its package index, but this may be overridden with the pip install -i switch.

In addition to being a standalone package, pip (like setuptools and distribute) comes bundled with virtualenv.

Takeaways

When deciding how to treat an interdependent set of .py files, especially with respect to deployment, there are several options:

Package them

That is to say, add a setup.py appropriate to one (or more) of the installers, whose function is to install your software to site-packages (either the system's, a virtualenv's, or a location specified by PYTHONHOME or by a prefix argument given to your installer). This has the advantage that modules in your package will be importable within python's sys.path and that console scripts and other components set up by setup.py will be installed correctly.

There are several pieces of software that make this easy. For example, the basic_package paste template will create a package skeleton including a setup.py file where you can put your code:

paster create MyPackage; cp mypythonfiles/* MyPackage/mypackage
        

paster will ask for various bits of metadata concerning the package (these can be saved in a config file for ease of reuse). See a more verbose example if desired.

The normal case is to have python files in a subdirectory with respect to setup.py (since setup.py is not part of your package, as it shouldn't be installed). This creates a directory structure one level different from unpackaged python, so for version control systems, going from unpackaged to packaged is a radical rearrangement of the tree. This does not have to be the case: the py_modules keyword argument to setup() may be used to specify modules that live alongside setup.py.
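
For instance, a hypothetical setup.py that installs two top-level modules living right next to setup.py itself might look like:

from distutils.core import setup

setup(name='mytools',  # illustrative name
      version='0.1',
      py_modules=['foo', 'bar'],  # foo.py and bar.py sit alongside setup.py
      )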

As Ian Bicking correctly pointed out, for single files there is considerable overhead in packaging with any of the existing python packaging solutions. However, with this overhead comes the ability for your package to be shared on pypi, to be used as a dependency for other packages, and to be installed in the usual way. [Note: while perhaps python is below average in this respect, it is hardly unique. For example, most version control systems demand a directory structure instead of allowing versioning of a single file.]

Put them in the same directory

.py files will look for imports relative to their location on the filesystem:

> echo 'import bar' > /tmp/foo.py
> for i in foo bar; do echo "print '$i'" >> /tmp/$i.py; done
> python /tmp/foo.py 
bar
foo
        

Keeping all your .py files in the same directory allows these relative imports from any of them. This is the simplest solution and also the least appropriate for complex software: there is no package namespace, there is no way to express dependencies or other metadata, and the modules are only importable when python is invoked from (or pointed at) that directory.

In addition, putting all your .py files in the same directory tends towards vague and ambiguous software whose purpose becomes "stuff I need to do". While people care to differing degrees about writing modular and reusable code, it is generally agreed (and observed in practice) that overly specific code becomes brittle and difficult to maintain. Software is a manifestation of intent, and without asking the question "what do I really want to do?", software tends towards the perceived need at the time. Since solutions to this problem tend to be very specific, and since needs evolve over time, this approach demands frequent and pervasive rewrites; and since each of these rewrites tends to take the least short-term effort, the code quickly becomes spaghetti. Before solving any problem, software or otherwise, ask yourself: what do I really want to do? [Note: this is not a defense of unnecessary abstraction or an overly top-down approach. Like any generalization, going too far the other way is also bad. You can meditate indefinitely on what the real problem is without writing a single line of code; then, instead of spending all your time maintaining unmaintainable code, you've spent all your time not writing anything. There must be a balance.]

Fix PYTHONPATH

For the case where you want multiple directories (for the purposes of this section, "packages") but don't want a setup.py file, you may run separate, uninstalled directories out of the same root directory, as long as you set PYTHONPATH to point to this root directory.

Example

Let's say you have packages foo and bar that you want to run unpackaged (which is to say, having the .py files and associated resources in directories named foo and bar). If they are both subdirectories of /path/to/my/python, you can write a shell script that sets up PYTHONPATH:

#!/bin/bash
# export so that the exec'd command inherits the variable
export PYTHONPATH=/path/to/my/python:$PYTHONPATH
exec "$@" # the arguments to the script are the command to run
          

If you don't know the absolute location of /path/to/my/python, you can put a more clever script in the parent directory of foo and bar:

#!/bin/bash
cd $(dirname $0)
export PYTHONPATH=$(pwd):$PYTHONPATH
cd - > /dev/null
exec "$@"
          

Another approach is to have a script that is sourced, rather than having the script execute the desired python:

# find the path by which this script was sourced (bash won't tell us directly)
path_to_this=$(history | tail -1 | awk '{print $3}')
directory=$(dirname $path_to_this)

# bash won't tilde-expand in a variable, so do it manually (optional)
directory=${directory/'~'/$HOME}

cd $directory
directory=$(pwd)
cd - > /dev/null
export PYTHONPATH=${directory}:$PYTHONPATH
          

The script is then used like

. /path/to/activate && command-you-want --and arguments
          

The more complicated example is used where the absolute path of the script is unknown. In the case where ${directory} is known, a one-liner export PYTHONPATH=${directory}:$PYTHONPATH suffices. You could conceivably fix up $PATH in this script too, if desired. Note that this method duplicates the intent of virtualenv's activate scripts.

In this approach, you must provide an __init__.py file in each directory containing your python files (that is, in all the subdirectories of the PYTHONPATH entries that you wish to import from). If this is not found, python will not recurse into these directories to locate importable modules. __init__.py may be a blank file.

This approach has the same disadvantage as putting all files in the same directory, but at least you can separate files by common intent and functionality.

Hybrid Approach

For various reasons, it may be desirable (or perceived as desirable) to run uninstalled python code as part of a deployment. If it is also desirable to package and share the code, it is possible to have the same code-base usable both packaged and unpackaged.

The usual way of doing installable packages is to have python code in a subdirectory relative to setup.py. The subdirectory name is (again, by convention) the top-level namespace for the package, as in the hypothetical layout below. In this case, when you want to deploy uninstalled, you may take only this subdirectory of the package for your installation, fixing PYTHONPATH if necessary (e.g. if you have multiple packages, or if you need to import these modules when your current working directory is other than the location of this subdirectory).
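
A hypothetical layout (mypackage and stuff.py are illustrative names): the outer directory holds setup.py for packaged installation, while the inner mypackage subdirectory is what you would ship for an uninstalled deployment, putting its parent directory on PYTHONPATH:

mypackage
|-- setup.py
`-- mypackage
    |-- __init__.py
    `-- stuff.py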

When taking a hybrid approach, the chief nuance to keep in mind is that anything done by setup.py (console scripts, dependency installation, package metadata) happens only in the packaged deployment; the code itself must therefore be importable and usable without it.

The type of packaging you decide on depends on a number of factors: whether the code will be shared or depended on by other packages, how it will be deployed, and how much isolation the deployment requires.

Use virtualenv

For any serious python developer, I advocate putting as little as possible/practical in the system site-packages directory. Since the usual case is to be experimenting with several different versions of several different (often inter-dependent) code bases, using a single site-packages does not allow the isolation necessary to transparently develop and test software. While it is a deficiency of python (or at least of its existing installers) that no more than one version of a package can live in any given site-packages, it is not the normal case of software development in any language to install software and libraries globally during the development process. virtualenv provides an easy solution to this problem. Other strategies may be used instead, but in my experience these mostly go some way towards reinventing virtualenv.

Python Packages vs. the System Packages

There is an ongoing debate on whether to use the system packaging solution (e.g. apt for debian and ubuntu) or python's solution. While I won't attempt to give a concrete solution to a debate in which there are many gray areas, I will give my approach.

For software that is part of the system, I use the system installer. If I'm using, say, synaptic to install software, it depends on the needed python packages being installed in the system way. No need to fight this. I also occasionally do this for packages that depend on C libraries, such as python-ldap or lxml, where I want a stable, dependable installation. If any of my software depended on, say, a development version of lxml, I probably wouldn't do this.

Other than that, I don't put anything in the system site packages. Other software is almost always software I want to experiment with or develop on. In these cases, I don't want to pollute the system's site packages with my works in progress. So I use virtualenv.

This works well for me, being a software developer. If I were purely on the user side (that is, if I never wanted to edit any python files and did not demand or care about encapsulation of the software I used), I would probably be more lax about putting software into the system site-packages.

For servers or shared development boxes, it may be more appropriate to put needed shared packages in the global site-packages so that all users will be using these (presumably known-good) pieces of software. For instance, if a deployment strategy depends on virtualenv, then it is probably a good idea to install virtualenv globally.

Single File Packages

It may be desirable to have a single file that is also a python package. This need usually arises when you are dealing with multiple design constraints:

  1. You are dealing with a module which may be depended on by other python packages (or depend on them)
  2. Your module must be in a single file

Constraint 2 normally occurs as a constraint on deployment, as it is easier, clearer, or otherwise important to move a single file around versus an entire module.

This can be done.

However, you must be aware that there is an additional constraint due to (e.g.) setuptools: in order for a package to be easy_install-able, the installer must be named setup.py.

Note that I'm not imposing the constraint that all auxiliaries must live in the same file. It is assumed that, as part of good development practices, the canonical version of the file lives in version control, and alongside the module of interest may live a README, tests, and other supporting files. The requirement is that the module be able to install itself into site-packages. In this case, the solution is to maintain an auxiliary setup.py that invokes the module of interest.
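
A minimal sketch of what such an auxiliary setup.py might look like, assuming (as with manifestparser below) that the single-file module exposes a command-line main() with a setup handler; mymodule is an illustrative name:

# hypothetical auxiliary setup.py that delegates to the module itself
import sys
import mymodule  # the single-file module of interest (illustrative name)

if __name__ == '__main__':
    # rewrite argv so the module sees its 'setup' command followed by the
    # setuptools arguments: `python setup.py develop` becomes the
    # equivalent of `mymodule.py setup develop`
    sys.argv.insert(1, 'setup')
    mymodule.main()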

Example: ManifestDestiny

As an example, consider ManifestDestiny: http://hg.mozilla.org/automation/ManifestDestiny . The mercurial repository contains tests, a README, etc., but the entirety of the code logic lives in manifestparser.py . The reason it is desirable to have manifestparser as a single file is that it needs to be synchronized with a copy in mozilla-central (the code that makes Firefox) for the various test harnesses to consume, while the ManifestDestiny repository remains the canonical location.

It also has to be a python package that works with setuptools and lives on pypi.python.org, as it is depended on by another setuptools package, Mozmill. (Mozmill is also synced to mozilla-central, but because the existing test harnesses modify PYTHONPATH to import modules, versus putting them in site-packages, it is more convenient and sensible to deploy manifestparser to mozilla-central as a single file.)

As said before, this can be done. manifestparser works on a command syntax:

 manifestparser [options] [command] [command-arguments]
        

where command is one of several handlers. So we add a new handler, SetupCLI, and put the setup code there:

class SetupCLI(CLICommand):
    """
    setup using setuptools
    """
    usage = '%prog [options] setup [setuptools options]'

    def __call__(self, options, args):
        sys.argv = [sys.argv[0]] + args
        assert setup is not None, "You must have setuptools installed to use SetupCLI"
        here = os.path.dirname(os.path.abspath(__file__))
        try:
            filename = os.path.join(here, 'README.txt')
            description = file(filename).read()
        except:
            description = ''
        os.chdir(here)

        setup(name='ManifestDestiny',
              version=version,
              description="universal reader for manifests",
              long_description=description,
              classifiers=[], # Get strings from http://pypi.python.org/pypi?%3Aaction=list_classifiers
              keywords='mozilla manifests',
              author='Jeff Hammel',
              author_email='jhammel@mozilla.com',
              url='https://wiki.mozilla.org/Auto-tools/Projects/ManifestDestiny',
              license='MPL',
              zip_safe=False,
              py_modules=['manifestparser'],
              install_requires=[
                  # -*- Extra requirements: -*-
                  ],
              entry_points="""
              [console_scripts]
              manifestparser = manifestparser:main
              """,
              )
        

The various attributes and function signatures follow from the CLI API in manifestparser.py (see the source for details). setuptools is imported conditionally at the top of the file so that it is not a hard requirement. The result: you can run manifestparser.py setup develop with the same effect as the usual python setup.py develop.

This isn't the entirety of the story, however. Uploading this package to pypi (e.g. using python manifestparser.py setup egg_info -RDb "" sdist register upload) will not result in a viable package! You can download the resultant tarball, unpack it, and install it in the usual way, fine. But for an upstream package -- that is, Mozmill -- that depends on ManifestDestiny, the package will be downloaded successfully but it will not successfully install, because (e.g.) easy_install will complain that a setup.py is not found.