distributing python (and more!)

distutils, setuptools, and you

dan buch

Credentials

why you should pay attention to Dan Buch:


in short: you probably shouldn't listen to Dan Buch

A Brief History of Distributing Python

some projects with which you should be familiar:

distutils

this has been in the standard library for quite a while now...

http://docs.python.org/dist/intro.html

setuptools

distutils done right or whatever...

http://peak.telecommunity.com/DevCenter/setuptools

Paste

so what doesn't Paste do?

http://pythonpaste.org

things of which we'll speak

what you can expect to get out of this talk

distutils

only as much as necessary

setuptools

not as much as the official documentation covers, but a healthy dose

NOT Paste

well, at least not in any meaningful way...

moving along

ONWARD

reasons

metadata

How is metadata included? In your setup.py file, of course!

Here's a really minimal example in which I'm using the setup function from distutils.core:

from distutils.core import setup

setup(name='FrizzleFry', version='0.1.0',
      py_modules=['frizzlefry'])

metadata

what does setup.py do?

$ python setup.py bdist

( ... bunch of output)

  $ ls -F
  build/ dist/ frizzlefry.py setup.py

  $ ls dist/
  FrizzleFry-0.1.0.linux-i686.tar.gz

metadata where?

so far, so good... but where's my metadata?

distutils decided to bury it in the distributable package

  $ # first we decompress...
  $ tar xzf FrizzleFry-0.1.0.linux-i686.tar.gz

  $ # take a peek...
  $ ls -RF .

(bunch of empty dirs...)

  ./usr/lib/python2.5/site-packages:
  FrizzleFry-0.1.0.egg-info  frizzlefry.py  frizzlefry.pyc

metadata mystery

what's this egg-info file all about?

  $ ls ./usr/lib/python2.5/site-packages
  FrizzleFry-0.1.0.egg-info  frizzlefry.py  frizzlefry.pyc
  
  $ file FrizzleFry-0.1.0.egg-info
  FrizzleFry-0.1.0.egg-info: ASCII text

metadata files

let's take a peek

  $ cat ./usr/lib/python2.5/site-packages/FrizzleFry-0.1.0.egg-info
  Metadata-Version: 1.0
  Name: FrizzleFry
  Version: 0.1.0
  Summary: UNKNOWN
  Home-page: UNKNOWN
  Author: UNKNOWN
  Author-email: UNKNOWN
  License: UNKNOWN
  Description: UNKNOWN
  Platform: UNKNOWN

metadata UNKNOWN??

So aside from most of our metadata being UNKNOWN, having the output buried inside the distributable package is not terribly useful...

setuptools fixes the setup function:

from setuptools import setup

setup(name='FrizzleFry', version='0.1.0',
      py_modules=['frizzlefry'])

make better metadata

Running the setup script now produces an additional directory:

$ python setup.py bdist

(bunch of output...)

  $ ls -F
  build/  dist/  FrizzleFry.egg-info/  frizzlefry.py  setup.py
  
  $ ls -F FrizzleFry.egg-info/
  dependency_links.txt  PKG-INFO  SOURCES.txt  top_level.txt

metadata via PKG-INFO

lo and behold:

  $ cat FrizzleFry.egg-info/PKG-INFO
  Metadata-Version: 1.0
  Name: FrizzleFry
  Version: 0.1.0
  Summary: UNKNOWN
  Home-page: UNKNOWN
  Author: UNKNOWN
  Author-email: UNKNOWN
  License: UNKNOWN
  Description: UNKNOWN
  Platform: UNKNOWN
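
PKG-INFO uses the same "Key: value" header layout as an email message, so the standard library's email parser can read it. Here's a minimal sketch, assuming the file contents from this slide have been pasted into a string (the PKG_INFO name is only for illustration):

```python
from email.parser import Parser

# Assumption: PKG_INFO stands in for the contents of FrizzleFry.egg-info/PKG-INFO.
PKG_INFO = """\
Metadata-Version: 1.0
Name: FrizzleFry
Version: 0.1.0
Summary: UNKNOWN
Home-page: UNKNOWN
Author: UNKNOWN
Author-email: UNKNOWN
License: UNKNOWN
Description: UNKNOWN
Platform: UNKNOWN
"""

metadata = Parser().parsestr(PKG_INFO)

# Collect every field that was never given a value in setup().
unknown_fields = [key for key, value in metadata.items() if value == "UNKNOWN"]

print(metadata["Name"], metadata["Version"])
print(unknown_fields)
```

Each field that prints as UNKNOWN corresponds to a keyword argument that the setup() call did not supply.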

huh??

So what's the deal with that??

whatsthedealwiththat

distutils==busted

For the most part, distutils is old and busted.

The mythical Distutils2 has been on the horizon for a few years. (see http://wiki.python.org/moin/DistUtils20)

setuptools, while certainly not anything like new hotness, does quite a bit to make distutils work like one would hope.

installation

moving on to installation as a reason to distribute, as well as more reasons why setuptools fixes distutils

installation

Did I forget to mention the concept of distutils commands?

I think I did...

distutils commands

enter the distutils command:

$ python setup.py --help-commands

...which will show you a slew of standard commands as well as any number of additional commands supplied via entry_points, the concept of which we will touch on in a bit...

distutils commands

standard distutils commands (may differ by platform)

  build             build everything needed to install
  build_py          "build" pure Python modules (copy to build directory)
  build_ext         build C/C++ extensions (compile/link to build directory)
  build_clib        build C/C++ libraries used by Python extensions
  build_scripts     "build" scripts (copy and fixup #! line)
  clean             clean up temporary files from 'build' command
  install           install everything from build directory
  install_lib       install all Python modules (extensions and pure Python)
  install_headers   install C/C++ header files
  install_scripts   install scripts (Python or otherwise)
  install_data      install data files
  sdist             create a source distribution (tarball, zip file, etc.)
  register          register the distribution with the Python package index
  bdist             create a built (binary) distribution
  bdist_dumb        create a "dumb" built distribution
  bdist_rpm         create an RPM distribution
  bdist_wininst     create an executable installer for MS Windows

installation

okay, so the only command we're worried about right now is install:

  $ python setup.py install

(...bunch of output)

and presto! the package is built and installed in your sys.path, by default in the site-packages subdirectory of your python library
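
To see where that default lives for the interpreter you're running, the standard library's sysconfig module can report it. A small sketch (sysconfig isn't covered in this talk; it's used here only for illustration):

```python
import sys
import sysconfig

# "purelib" is the directory where pure-Python modules get installed.
site_packages = sysconfig.get_paths()["purelib"]

print(site_packages)
# Usually True, though some environments arrange sys.path differently.
print(site_packages in sys.path)
```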

installation: why??

Okay, so what's so great about that??

For starters, this means that you don't have to fret about hardcoding things like your executable path, library path, etc.

Essentially, the install command works like one would expect an installer to work.

installation: are you nutty?

seriously.... what is so great about that???

what is so great about that?

installation - what about eggs?

Wait a second! I just unzipped this here '.egg' file and I don't see a setup.py file anywhere!

That's right.

Python eggs are created with the bdist_egg command, which setuptools adds to the core distutils commands.

Eggs are special in that they only contain metadata and the files you wish to distribute.
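
An egg is a zip archive with an EGG-INFO directory sitting alongside the code. Here's a rough sketch of that layout, built by hand in memory rather than by a real build (the file contents are made up for illustration):

```python
import io
import zipfile

# Assumption: a hand-built stand-in for a FrizzleFry egg, not real bdist_egg output.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as egg:
    egg.writestr("frizzlefry.py", "def main():\n    return 'frizzled'\n")
    egg.writestr(
        "EGG-INFO/PKG-INFO",
        "Metadata-Version: 1.0\nName: FrizzleFry\nVersion: 0.1.0\n",
    )

# Reopen the archive and look at what is (and isn't) inside.
with zipfile.ZipFile(buffer) as egg:
    names = egg.namelist()
    has_setup_py = "setup.py" in names

print(names)
print(has_setup_py)
```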

Let's not get ahead of ourselves...

installation - ACK!

Isn't this a bit too much trouble for what it's worth??

All I want to do is make my little bitty script available.

Why would I mess with an installer when I can just copy the script to my /usr/bin dir?

installation - example

Not a bad question...

Here's one good reason

While writing this presentation, which is in the mostly-awesome S5 format, the author got reeeal tired reeeal quickly of writing the same markup repeatedly and making the same changes in multiple places.

installation - example

Rather than finish writing the presentation, the author decided to get sidetracked and write a little script that used his favorite templating language, Mako.

The end result, MakoPrint, contains a single entry point in the console_scripts group.

Upon installation, setuptools writes an executable script named after the entry point into a directory on the exec path.

makoPrint setup.py

import sys
from setuptools import setup

VERSION = '0.1.0'
ENTRY_POINTS = \
"""[console_scripts]
makoprint = makoprint:main
"""

def main():
    setup(name='MakoPrint', version=VERSION,
      description="tiny script to render mako templates",
      long_description="", #TODO
      author='Dan Buch',
      author_email='daniel.buch@gmail.com',
      url='', #TODO
      license='MIT',
      py_modules=['makoprint'],
      include_package_data=True,
      zip_safe=False,
      install_requires=['Mako'],
      entry_points=ENTRY_POINTS)
    return 0


if __name__ == '__main__':
    sys.exit(main())

generated executable

#!/usr/bin/python
# EASY-INSTALL-ENTRY-SCRIPT: 'MakoPrint==0.1.0dev','console_scripts','makoprint'
__requires__ = 'MakoPrint==0.1.0dev'
import sys
from pkg_resources import load_entry_point

sys.exit(
   load_entry_point('MakoPrint==0.1.0dev', 'console_scripts', 'makoprint')()
)
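
What load_entry_point ultimately hands back is the object named on the right-hand side of the entry point line. Here's a simplified sketch of that lookup, not the pkg_resources implementation (resolve is a hypothetical helper, and a standard-library function stands in for makoprint:main):

```python
import importlib


def resolve(entry_point_line):
    """Turn 'name = module:attr' into (name, the object it points at)."""
    name, target = (part.strip() for part in entry_point_line.split("=", 1))
    module_name, attr_path = target.split(":", 1)
    obj = importlib.import_module(module_name)
    # Walk dotted attribute paths such as 'SomeClass.method'.
    for attr in attr_path.split("."):
        obj = getattr(obj, attr)
    return name, obj


# Assumption: 'json:dumps' stands in for 'makoprint:main', since MakoPrint
# is unlikely to be installed wherever this snippet runs.
name, func = resolve("to_json = json:dumps")
print(name, func([1, 2, 3]))
```

The generated script does the same kind of lookup and then calls the result, passing its return value to sys.exit.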

not too shabby

Yes, the idea that one can define an entry point and automatically have an executable installed is quaint.

Tell me more about these "entry points".

entry points==wonderful magic

Our first encounter with entry points in this presentation was actually well before that used with MakoPrint.

setuptools and many, many others (like Paste) make extensive use of entry points, and setuptools uses them to plug new commands into distutils.

entry points

Remember the install command?

$ python setup.py install

...and the bdist command?

$ python setup.py bdist

Well that's only the beginning....

entry points gone wild

setuptools uses entry points to make distutils commands pluggable, which is much of the reason why it is able to work its magic.

Take, for instance the following command:

$ python setup.py bdist_egg 

The bdist_egg command is defined inside setuptools as an entry point of type distutils.commands.

entry points for all

Perhaps one of the best-known entry points among Python developers is easy_install.

One peek inside the source code (not recommended) will show that easy_install began life as a lowly distutils.commands entry point. The easy_install script one can find in the executable path is a console_scripts entry point which performs a bit of sorcery on sys.argv, artificially inserting the first few arguments one would otherwise type by hand (on next slide).

easy_install growed up

old:

$ python setup.py easy_install SomeDist

new:

$ easy_install SomeDist

big difference, eh?
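
The argv sorcery boils down to putting the command name in front of whatever the user typed. This is an illustrative sketch only: as_setup_args is a made-up helper, and the real script inserts a few more arguments than this.

```python
def as_setup_args(argv):
    """Prefix the user's arguments with the command name, as if typed after setup.py."""
    # Assumption: only the command name is added here; the real script adds more.
    return ["easy_install"] + list(argv)


print(as_setup_args(["SomeDist"]))
```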

entry points who?

So where and how are these entry points found and used??

Short answer: pkg_resources

Okay, so this is cheating... because pkg_resources is actually part of setuptools rather than distutils.

entry points made simple

Remember our example FrizzleFry app?

from setuptools import setup, find_packages

setup(name='FrizzleFry', version='0.1.0',
      description="frizzles your fries for you",
      author='Jeebus Takethewheel', author_email='jeebus@porkfeets.com',
      url='http://porkfeets.com/projects/FrizzleFry/',
      license='MIT', packages=find_packages(exclude=['ez_setup', 'tests']),
      entry_points="""
[console_scripts]
frizzle = frizzlefry.base:main

[frizzlers]
frizzbase = frizzlefry.base:frizzler
      """)

entry points made happy

Let's look closer at the entry_points kwarg:


[console_scripts]
frizzle = frizzlefry.base:main

[frizzlers]
frizzbase = frizzlefry.base:frizzler
      

While the console_scripts section will be acted upon by setuptools, the frizzlers section contains no magic (of which I am aware) and will need to be acted upon by pkg_resources.
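
The entry_points text is INI-style: each section is a group, and each line inside it maps a name to a module:attribute target. Here's a sketch that reads it with the standard library's configparser, which is an assumption made for illustration and not how setuptools itself parses the string:

```python
from configparser import ConfigParser

ENTRY_POINTS = """
[console_scripts]
frizzle = frizzlefry.base:main

[frizzlers]
frizzbase = frizzlefry.base:frizzler
"""

# Split on '=' only, so the ':' inside each target is left alone.
parser = ConfigParser(delimiters=("=",))
parser.optionxform = str  # keep entry point names case-sensitive
parser.read_string(ENTRY_POINTS)

# Map each group name to its {entry point name: target} pairs.
groups = {section: dict(parser.items(section)) for section in parser.sections()}
print(groups)
```

Nothing stops you from inventing your own group names, which is exactly what the frizzlers section does.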

pkg_resources pour vous

So, considering these entry points:


[console_scripts]
frizzle = frizzlefry.base:main

[frizzlers]
frizzbase = frizzlefry.base:frizzler
      

here's how we grab frizzbase:

>>> import pkg_resources

>>> frizzler = pkg_resources.load_entry_point('FrizzleFry', 'frizzlers', 'frizzbase')

>>> frizzler('some text go here')
<generator object at 0x82cac8c>

entry points > imports

How are entry points better than plain ol' imports??

how better, huh?

entry point eat import

BECUZ IT IS BETTR OKAYS

cat mad at import

entry point != panacea

no no no...

You aren't supposed to use entry points instead of imports.

BUT.

They can be awfully handy in certain scenarios.

got versions?

Take the following snippet from a listing of one's site-packages directory:

FrizzleFry-0.1.0-py2.5.egg
FrizzleFry-0.1.0dev-py2.5.egg
FrizzleFry-0.1.1dev-py2.5.egg
FrizzleFry-0.1.4dev-py2.5.egg
FrizzleFry-0.2.0-py2.5.egg
FrizzleFry-0.2.0dev-py2.5.egg
FrizzleFry-0.2.8dev-py2.5.egg

This is rather messy, yes? ...but there isn't yet an easy_uninstall command...

i wants this one

Having a ton of different egg versions present could make for a rather confusing sys.path were it not for the magic performed by setuptools.

Inside one's site-packages directory, there are special files ending in .pth which dynamically alter the sys.path when the python interpreter spins up.

  $ ls /usr/lib/python2.5/site-packages/ | grep '\.pth$'
  easy-install.pth
  setuptools.pth

pth-ing oneself off

Just for kicks, here's part of the guts of a .pth file.

import sys; sys.__plen = len(sys.path)
./setuptools-0.6c8-py2.5.egg
./virtualenv-1.0-py2.5.egg
./ipython-0.8.2-py2.5.egg
./cElementTree-1.0.5_20051216-py2.5-linux-i686.egg
./feedparser-4.1-py2.5.egg
./BeautifulSoup-3.0.5-py2.5.egg
./PasteScript-1.6.1.1-py2.5.egg
./PasteDeploy-1.3.1-py2.5.egg
./Paste-1.6-py2.5.egg
./CowsayMOTD-0.1.0dev-py2.5.egg
./buildutils-0.3-py2.5.egg
./SQLAlchemy-0.4.3-py2.5.egg
./MakoPrint-0.1.0dev-py2.5.egg
./Mako-0.1.10-py2.5.egg
./Beaker-0.9.3-py2.5.egg
import sys; new=sys.path[sys.__plen:]; del sys.path[sys.__plen:]; p=getattr(sys,'__egginsert',0); sys.path[p:p]=new; sys.__egginsert = p+len(new)
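
Lines in a .pth file that name a path get appended to sys.path, and lines that begin with import are executed. The path half of that can be tried out safely in a throwaway directory. Here's a sketch using site.addsitedir from the standard library (the directory and file names are invented for the demo):

```python
import os
import site
import sys
import tempfile

original_path = list(sys.path)

with tempfile.TemporaryDirectory() as site_dir:
    # Assumption: an empty directory stands in for a real egg.
    egg_name = "FrizzleFry-0.2.0-py2.5.egg"
    os.mkdir(os.path.join(site_dir, egg_name))

    # A .pth file lists one path per line, relative to the directory it lives in.
    with open(os.path.join(site_dir, "demo.pth"), "w") as pth:
        pth.write("./" + egg_name + "\n")

    # Process the directory the way site-packages is processed at startup.
    site.addsitedir(site_dir)
    added = [p for p in sys.path if p not in original_path]

# Put sys.path back the way we found it.
sys.path[:] = original_path

print(added)
```

Paths that don't exist on disk are skipped, which is why the demo creates the directory before writing the .pth file.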

pth no repete

You noticed that there were no repeated distributions on that list, yes?

Upon easy_install-ing, setuptools rewrites the .pth file to keep one's sys.path relatively clean.

re-entry

Getting back to pkg_resources, you will see that the three-argument function "load_entry_point" takes as its first argument a "Requirement String".

>>> frizzler = pkg_resources.load_entry_point(\
    'FrizzleFry>=0.1.1', 'frizzlers', 'frizzbase')

>>> #           ^_____  lookie here

If the required version is not present, pkg_resources raises DistributionNotFound (or VersionConflict when an incompatible version is already active), which is (IMNSHO) waaaay better than having one's app work halfway decently until some corner case gets borked because of version incompatibility.
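
To make the idea concrete, here's a toy version of a requirement check. It's a deliberately simplified sketch, not the pkg_resources parser: it handles a single comparison, numeric versions only, and knows nothing about suffixes such as dev (parse_requirement and satisfies are hypothetical names):

```python
import operator
import re

_COMPARATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}
# A name, optionally followed by one comparison operator and a dotted version.
_REQUIREMENT = re.compile(
    r"^\s*([A-Za-z0-9_.-]+)\s*(?:(>=|<=|==|>|<)\s*([0-9.]+))?\s*$"
)


def parse_requirement(text):
    """Split 'Name>=1.2' into (name, operator, version); the last two may be None."""
    match = _REQUIREMENT.match(text)
    if match is None:
        raise ValueError("cannot parse requirement: %r" % text)
    return match.groups()


def version_tuple(version):
    """Turn '0.1.1' into (0, 1, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def satisfies(requirement, installed_version):
    """Report whether installed_version meets the requirement string."""
    _name, op, wanted = parse_requirement(requirement)
    if op is None:
        return True  # no version constraint at all
    return _COMPARATORS[op](version_tuple(installed_version), version_tuple(wanted))


print(satisfies("FrizzleFry>=0.1.1", "0.2.0"))
print(satisfies("FrizzleFry>=0.1.1", "0.1.0"))
```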

entry point good yes

Another good reason to use entry points:

Requirement Strings are strings rather than eval-able code, and carry with them more meaning than simple import statements.

Consider the following:

[server:main]
use = egg:Paste#http
host = 0.0.0.0
port = 5000

[app:main]
use = egg:FrizzleServ

enter these dragons

See the something-like-a-Requirement String for app:main.use?

 use = egg:FrizzleServ 

Okay, this isn't a true Requirement String, but Paste would understand what to do with it if it contained a version requirement such as:

 use = egg:FrizzleServ>=0.2.1 
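
Here's a sketch of how such a use line can be pulled apart into a distribution requirement and an entry point name. This isn't Paste's own code: split_egg_spec is a hypothetical helper, and the config is read with the standard library's configparser for illustration.

```python
from configparser import ConfigParser

CONFIG = """
[server:main]
use = egg:Paste#http
host = 0.0.0.0
port = 5000

[app:main]
use = egg:FrizzleServ
"""


def split_egg_spec(spec):
    """Split 'egg:Dist#name' into (requirement, entry point name)."""
    scheme, _, rest = spec.partition(":")
    if scheme != "egg":
        raise ValueError("not an egg: spec: %r" % spec)
    requirement, _, name = rest.partition("#")
    # Assumption: 'main' is the name used when no '#name' is given.
    return requirement, name or "main"


# Split on '=' only, so the ':' in section values is left alone.
parser = ConfigParser(delimiters=("=",))
parser.read_string(CONFIG)

specs = {
    section: split_egg_spec(parser.get(section, "use"))
    for section in parser.sections()
}
print(specs)
```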

you confuse me

This is getting ugly.

ugly on outside

escape!

We're about to evacuate... but before we do...

It's worth touching on Python Extensions

surrrpriiissssse

extensions

How to distribute a Python Extension:

The old way:

Use the "ext_modules" keyword argument along with the Extension type from distutils.extension in the following manner:

from setuptools import setup, find_packages
from distutils.extension import Extension

setup(name='FrizzleFry', version='0.2.2',
      ext_modules=[Extension('speedfrizzle', ['speedfrizzle.c'])])

extensions evolved

How to distribute a Python Extension:

The new way:

Do just as with the old way, but instead of distutils.extension.Extension, use setuptools.extension.Extension, which also allows one to specify Pyrex files as the extension sources:

from setuptools import setup, find_packages
from setuptools.extension import Extension

setup(name='FrizzleFry', version='0.2.3',
    ext_modules=[Extension('speedfrizzle', ['speedfrizzle.pyx'])])

moving along

Here are links of interest:

Python Build Utilities (buildutils)

http://buildutils.lesscode.org/

nifty little extensions to distutils.commands

Yolk command line utility

http://tools.assembla.com/yolk/

distribution inspection utility

Pyrex extension language

http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/

make an extension without having to write loads of C crap

things left out

never gonna die

outta here okays

complaints department:

mailto:daniel.buch@gmail.com