1 of 63

Python Packaging and Releasing

2 of 63

Outline

  • Some terminology
  • Why?
  • A brief history of… (a.k.a. why Python packaging is confusing)
  • How to (with hands on)
  • Versioning and making releases / uploading to PyPI (more hands on)

3 of 63

Some terms...

  • Software Packaging
    • Very broad term
    • Typically involves bundling a single piece of software (a program or library) such that a user can easily install it on their system in some standard way
    • Often also involves grouping together multiple pieces of software that depend on each other in such a way that the user can build a system of many disparate software packages that all work together (i.e. don’t have version conflicts)

4 of 63

More terms… (Python specific)

  • Package (horribly overloaded)
    • A Python package, in the sense of a directory that's also a module containing other modules
    • An archive file used by a package management system to install a piece of software
  • Distribution
    • A collection of Python packages/modules as well as other source/non-source files distributed together as a single product, along with a setup.py (commonly referred to as a “package”, but that name is already taken in Python)

5 of 63

More terms...

  • Project/Product
    • My preferred terms for a single software product with its own name, distributed by itself (though not necessarily without dependencies) (e.g. Astropy, NumPy, imexam)
    • Typically synonymous with “distribution”, though a “distribution” may sometimes include multiple products bundled together

6 of 63

And more terms...

  • Release
    • A specific version of a product, as the result of a development cycle (e.g. Astropy v1.0.2), that has been packaged and released for public consumption
    • May have multiple source and/or binary distributions

7 of 63

Why package?

  • Python (like other languages and OSes) has standard places to install software
  • Mindless installation of your software for users
    • No “copy this file to /usr/bin, then copy this file to /usr/lib/python2.7/site-packages, no not that site-packages, the other one, wait you already have a file called that in /usr/bin? Just put it somewhere under your home directory and add it to your PYTHONPATH”

8 of 63

~$ echo $PYTHONPATH

/usr/local/lib/python2.7:/usr/local/lib/python2.7/site-packages:/usr/lib/python2.7/mystuff:~/program_from_bob:~/temp/numpy‑1.8.1:/home/embray/my_code/test_program:/home/embray/my_code/astropy:/home/embray/mycode/astropy_old:/opt/python3.4/lib/python3.4:/opt/python3.3/lib/python3.4/site-packages…

9 of 63

Why package?

  • The previous problem is alleviated in large part by using one or more custom Python environments (outside the scope of this tutorial)
    • Virtualenv
    • Conda
  • Also works with --user site-packages
  • Hence easier reproduction of production environments

10 of 63

More reasons to package?

  • Python-specific:
    • By far the easiest way to build Python C modules
    • By extension, this includes Cython—Cython has distutils extensions for easily building Cython modules

11 of 63

But I just write code for myself / one-off scripts

  • Packaging provides some useful conveniences even for single-file scripts that you use frequently
    • Also nice to have already set up if/when you do need to share with someone
  • Mindless system-wide “installation” of your own scripts without manual $PATH, $PYTHONPATH entries
  • Enables “setup.py develop” (more on this later)
  • See for example astropy-tools

12 of 63

But I just write code for myself / one-off scripts

  • As your personal library of code grows, some routines will be used over and over in different code—packaging makes it easier to gather frequently used routines together in one place to use in all your code

13 of 63

History: distutils

  • The original Python packaging/distribution system library
  • Included in the stdlib
    • Stagnant
    • “The stdlib is where packages go to die”
  • Familiar—implements the core of setup.py scripts
  • Still the basis for most Python packaging

14 of 63

setuptools

  • Third-party packaging system built on top of distutils
    • Has more rapid development cycle
  • Looks mostly like distutils, but setup.py uses:
      from setuptools import setup
    instead of:
      from distutils.core import setup
  • Includes easy_install for automatic installation of distributions from the Cheeseshop (now known formally as PyPI)

15 of 63

setuptools (cont.)

  • Adds support for 'eggs'—a type of binary distribution of a Python release meant for easy drop-in installation (never quite lived up to their promise, though are useful in some areas)
  • ./setup.py install and easy_install create eggs out of all Python distributions (even ones that don't use setuptools by default)

16 of 63

easy_install

  • The original Python package installer
  • Capable of downloading and installing packages from the internet, and their dependencies
  • Exists as both a “stand-alone” executable and a setup.py command
  • Ad-hoc design; not favored anymore for the vast majority of use cases

17 of 63

pip

  • Not a packaging system, just a downloader/installer for Python distributions
  • Replacement for easy_install, has some advantages:
    • Ensures all dependencies can be installed before installing a package; can roll back on failure
    • Built in uninstall support
    • Doesn't install eggs
    • Development is more community/standards driven

18 of 63

pip (cont.)

  • More advantages:
    • Somewhat cleaner/less noisy output than easy_install
    • Support for installation from more VCSs
    • Now also the only officially supported installer for Python
  • Again, just an installer—projects must still be built with distutils/setuptools

19 of 63

Tumult

  • Distribute
    • Was a fork of setuptools created out of frustration/uncooperativeness
    • For a time was more popular than setuptools
    • At PyCon NA 2013 finally agreed to merge back into setuptools
    • Now setuptools is endorsed as the de facto standard—use setuptools, even over plain distutils
    • s/distribute/setuptools/

20 of 63

More tumult

  • Distutils2 a.k.a. “packaging” (confusingly)
    • Failed effort to replace distutils, setuptools, distribute, easy_install, pip entirely
    • Didn’t seek enough community feedback
    • Design was somewhat ad-hoc, not standard-driven
    • Had some good ideas but didn’t learn from past mistakes

21 of 63

Other Python packaging projects

  • d2to1
  • Bento
  • numpy.distutils
  • (Anything I’m forgetting that people use…?)
  • All of the above are still based around, or at least compatible with, setuptools and the user interface presented by the ./setup.py command

22 of 63

Summary

  • Python build / packaging
    • Use setuptools
  • Python distribution installers
    • Use pip, except when installing from source code repository/developing
      • Even then, using pip isn’t a bad idea—won’t do egg install

23 of 63

How to

  • In Python, “packaging” mostly means putting your code in a directory, and putting a (working) setup.py script at the root of that directory
  • Useful for:
    • Single Python modules
    • Single Python packages
    • Stand-alone executable scripts
    • Combinations of all of the above

24 of 63

How to (cont.)

  • When putting code together in a distribution, it is also good to include
    • A license
    • A readme file
    • Possibly a changelog or other history
    • Documentation
    • Tests
    • Sometimes copies of third-party software (“vendoring”)

25 of 63

Start by making a directory for your distribution

  • Yes, even if it’s a single Python module, because we’re going to add a setup.py, so now there will be two files that need to be grouped together
  • Even if you just have a bunch of one-off scripts, group them into a single directory and make a single setup.py for all of them (a la astropy-tools)
  • May want to make into a VCS repo one day

26 of 63

Make a directory

  • $ mkdir ~/name-of-your-project
  • $ cd ~/name-of-your-project

27 of 63

Make an empty setup.py in the directory, set it executable

  • $ touch setup.py
  • $ chmod +x setup.py
    • Makes it clear that this is a script, not an importable module
    • Convenient—allows `./setup.py`

28 of 63

Bare minimum setup.py

  • Open setup.py in your favorite editor and add:

#!/usr/bin/env python

from setuptools import setup

setup()

29 of 63

Try it out

  • setup() is like the main() function for our setup.py script—since we import it from setuptools it comes with most of the functionality we need already baked in:
  • Try
    • $ ./setup.py --help
    • $ ./setup.py --help-commands
    • $ ./setup.py --name
    • $ ./setup.py --version

30 of 63

First we need some actual code to install

  • Go to:
    • https://github.com/embray/pyastro-packaging-tutorial
    • Grab simcluster.py
      • Make sure to use the “Raw” download
      • Place this in the directory you made for this project

31 of 63

Let’s add some bare minimum metadata to our setup

  • Metadata about our project / distribution is provided as arguments to setup()
  • Most arguments are optional, and only show up as information on PyPI (the Python Package Index née “The Cheese Shop”)

32 of 63

Essential metadata

  • name: The name of your project—this is what people will use when they type `pip install your-project`
    • Often the same as the name of the Python package/module included in this distribution, (ex: astropy)
    • Doesn’t have to be though
  • version: So that users (including yourself) can be sure they’re installing the right version

33 of 63

Go ahead and put a name and version in your setup.py

  • Edit the setup.py and change it to read:

#!/usr/bin/env python

from setuptools import setup

setup(
    name='MakeUpAUniqueName',
    version='0.1'
)

34 of 63

Now try

  • $ ./setup.py --name
  • $ ./setup.py --version
  • Now setuptools will know, for example, how to name a source tarball for your project, and what to register it under on PyPI, among other things

35 of 63

Other metadata

  • There’s a bunch of other metadata we can add to our distributions—most of it is optional and will only show up on PyPI; examples include
    • Author
    • Author contact (e-mail address)
    • License
    • Description
    • “Classifiers”
  • See distutils docs for more
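
A minimal sketch of how the optional metadata listed above might be collected before being handed to setup(). The author, e-mail, description, and license values are placeholders rather than values from this tutorial, and the classifier strings are assumed to match entries in PyPI's classifier list.

```python
# Hypothetical metadata for illustration; replace every value with your own.
METADATA = dict(
    name="MakeUpAUniqueName",
    version="0.1",
    author="Your Name",                       # placeholder
    author_email="you@example.com",           # placeholder
    license="BSD",                            # placeholder license name
    description="One-line summary of the project",  # placeholder
    classifiers=[
        "Programming Language :: Python",
        "License :: OSI Approved :: BSD License",
    ],
)

# In setup.py these would be passed as keyword arguments:
#     from setuptools import setup
#     setup(**METADATA)
```

Keeping the metadata in one dict is only a presentation choice here; passing the same keywords directly to setup() is equivalent.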

36 of 63

Now tell setup() what actual code is part of our distribution

  • setuptools does not automatically assume any Python files it finds in this directory are a part of your project that should be installed; we need to explicitly tell it what code is important
    • We can list individual Python modules that are part of our distribution
    • Or we can list entire packages
    • Also scripts (we’ll get to this)
    • Non-code data files (probably not today)

37 of 63

Adding single modules

  • When distributing single Python modules, pass setup():

py_modules=[
    'name_of_module',
    'another_module'
]
# without the .py

38 of 63

Adding packages

  • For adding one or more Python packages (directories containing multiple modules and __init__.py):

packages=['name_of_package']

  • Convenience:

from setuptools import setup, find_packages

setup(
    ...,
    packages=find_packages()
)
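
To make the idea behind find_packages() concrete, here is a rough standard-library approximation: walk a directory tree and report every directory that contains an __init__.py. This is a simplified sketch, not setuptools' actual implementation, and the function name and the demo directory layout (mypkg, docs) are made up for illustration.

```python
import os
import tempfile


def find_packages_sketch(where="."):
    """Return dotted names of directories under `where` that contain __init__.py."""
    found = []
    for root, dirs, files in os.walk(where):
        if root == where:
            # The project root itself is not a package; just descend into it.
            continue
        if "__init__.py" not in files:
            # Not a package, so nothing below it is treated as one either.
            dirs[:] = []
            continue
        rel = os.path.relpath(root, where)
        found.append(rel.replace(os.sep, "."))
    return sorted(found)


# Demo on a throwaway directory tree (hypothetical layout).
with tempfile.TemporaryDirectory() as tmp:
    for path in ("mypkg/__init__.py", "mypkg/sub/__init__.py", "docs/index.rst"):
        full = os.path.join(tmp, *path.split("/"))
        os.makedirs(os.path.dirname(full), exist_ok=True)
        open(full, "w").close()
    demo_packages = find_packages_sketch(tmp)

print(demo_packages)
```

In the demo, docs/ is skipped because it has no __init__.py, while mypkg and its sub-package are both reported.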

39 of 63

Adding scripts

  • Where “scripts” means executables that should be copied to the same bin/ as the Python being used to do the installation (not Python specific)
  • Two methods
    • Old: scripts=['name_of_script.py']
      • Simple copy
    • New: console_scripts entry point
      • Setuptools-specific

40 of 63

console_scripts

  • setuptools automatically generates the actual executable “script” file, with any name you want to give it
  • Ensures when running the script that the correct version of the underlying Python package is installed
  • Also generates executables on Windows

41 of 63

console_scripts syntax

  • In setup.py:

setup(
    ...
    entry_points={
        'console_scripts': [
            'simcluster = simcluster:main'
        ]
    },
    ...
)
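
The string 'simcluster = simcluster:main' reads as "script name = module:function". The sketch below shows one way to interpret that syntax using only the standard library; it illustrates the idea and is not the code setuptools generates. The helper names are hypothetical, and the demo loads a function from os.path because simcluster may not be importable where this runs.

```python
import importlib


def parse_entry_point(spec):
    """Split 'script_name = module.path:function' into its three parts."""
    script_name, _, target = spec.partition("=")
    module_name, _, func_name = target.partition(":")
    return script_name.strip(), module_name.strip(), func_name.strip()


def load_entry_point(spec):
    """Import the module named in `spec` and return the object it points to."""
    _, module_name, func_name = parse_entry_point(spec)
    module = importlib.import_module(module_name)
    return getattr(module, func_name)


print(parse_entry_point("simcluster = simcluster:main"))

# Demonstrate loading with a standard-library target instead of simcluster.
join = load_entry_point("join = os.path:join")
print(callable(join))
```

Conceptually, the generated executable does the equivalent of loading that object and calling it.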

42 of 63

__version__ convention

  • In addition to the version= in setup.py, it is conventional (sometimes proposed as a PEP) to put the package/module version in a __version__ variable
  • Should only be one __version__; sub-packages should not have their own (except in case of namespace packages)
  • However, leads to unfortunate duplication

43 of 63

Version de-duplication

  • There are a few common techniques to maintain distribution version in one place:
    • from package_name import __version__ in setup.py (we’ll cover this)
    • pkg_resources.get_distribution()
    • version.py (we’ll also look at this when we cover astropy_helpers, as it’s what Astropy uses)
    • version.txt

44 of 63

Put version in __init__

  • If you have a single module distribution just put __version__ in that module. If a package, put __version__ in the __init__ of that package
  • In setup.py do:
    • from package_name import __version__
    • setup(..., version=__version__, ...)
  • Tricky part: package must be importable at build-time (especially tricky with extension modules)
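
One way around the importability problem, shown here as an alternative to the import shown above rather than something this tutorial prescribes, is to read __version__ out of the source text without importing it. The sketch assumes the version is assigned as a plain string literal at the start of a line; read_version is a hypothetical helper name.

```python
import re

# Matches a line such as:  __version__ = "0.1"
_VERSION_RE = re.compile(r"""^__version__\s*=\s*['"]([^'"]+)['"]""", re.MULTILINE)


def read_version(source_text):
    """Return the string assigned to __version__ in `source_text`."""
    match = _VERSION_RE.search(source_text)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group(1)


example_source = '"""Example module."""\n__version__ = "0.1"\n'
print(read_version(example_source))

# In setup.py this might be used as (assuming simcluster.py defines __version__):
#     with open("simcluster.py") as f:
#         version = read_version(f.read())
```

The trade-off is that this only works for simple literal assignments, whereas the import approach works for any expression.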

45 of 63

How to manage version number

  • Don’t change the version for every single change; you can if you make changes infrequently, but in general that’s what version control is for
  • In general, a version number is associated with a release—a specific build of your project that has been packaged and made available for public consumption

46 of 63

Two broad categories of releases

  • Patch / bug fix releases
    • Don’t change anything except to fix known bugs. In other words don’t change any expected behavior of the software (so long as that behavior is assumed “correct”)
  • Feature releases
    • Contains new features or APIs relative to the previous feature release
    • May be backwards-incompatible

47 of 63

Version schemes

  • Semver
    • Based on existing common practices
    • Version string is X.Y.Z
      • X = “major version”
      • Y = “minor version”
      • Z = “patch version”
    • A new minor version stays backwards-compatible with the previous minor version
    • The major version is bumped whenever backwards compatibility breaks
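
As a small illustration of the X.Y.Z scheme, the sketch below bumps one component and resets the ones to its right. It assumes plain three-part numeric versions only; bump is a hypothetical helper, not part of any packaging library.

```python
def bump(version, part):
    """Return `version` ('X.Y.Z') with the named part incremented."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"      # incompatible change
    if part == "minor":
        return f"{major}.{minor + 1}.0"  # new, backwards-compatible features
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"  # bug fixes only
    raise ValueError(f"unknown part: {part!r}")


print(bump("1.0.2", "patch"))
print(bump("1.0.2", "minor"))
print(bump("1.0.2", "major"))
```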

48 of 63

X.Y scheme

  • Some big projects (ex: Firefox) have found it easier to stick to an “X.Y” scheme where X marks a new feature release and Y a patch release.
  • Python PEP 440
    • Version string standard for all Python distributions
    • You should make sure to adhere to this
    • Is compatible with semver and other common version update schemes

49 of 63

Dev versions

  • Common, between releases, to append something like “.dev” to version string
  • During development, between releases, helps distinguish from actual released version
  • Ex: After releasing “1.1.0” set the version to “1.1.1.dev”. Then set the version to “1.1.1” just before making a 1.1.1 release.
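
The workflow in the example above can be sketched as two small helpers: one that produces the next ".dev" string after a release, and a sort key that places a dev version before the release it leads up to. This handles only the "X.Y.Z" and "X.Y.Z.dev" forms used on this slide and is not a full PEP 440 implementation; both function names are hypothetical.

```python
def next_dev_version(released):
    """After releasing 'X.Y.Z', return the dev version for the next patch release."""
    major, minor, patch = (int(piece) for piece in released.split("."))
    return f"{major}.{minor}.{patch + 1}.dev"


def sort_key(version):
    """Order 'X.Y.Z' and 'X.Y.Z.dev' strings so a dev build precedes its release."""
    is_dev = version.endswith(".dev")
    release = version[: -len(".dev")] if is_dev else version
    numbers = tuple(int(piece) for piece in release.split("."))
    # False sorts before True, so the dev version comes first.
    return numbers, not is_dev


print(next_dev_version("1.1.0"))
print(sorted(["1.1.1", "1.1.0", "1.1.1.dev"], key=sort_key))
```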

50 of 63

Releasing Python distributions

  • Mostly easy—basic process is
    • Prepare for release (set correct version, add a changelog, etc.)
    • Tag the release in version control, if applicable
    • Register the release on PyPI, if desired
    • Build a source tarball
    • setuptools does the latter two for you

51 of 63

Prepare for release

  • If using a .dev version, remove that from the version string—make sure whatever version is passed to setup() is the one you want for your release
  • A changelog is a nice thing to have for projects with more than a few users—gives a higher level view of what changed between versions (VCS log is usually too fine-grained)

52 of 63

Changelog format

  • There is no standard for this
  • Usually a list of versions (release dates are nice to have too), each with a list of its changes
  • Can sub-categorize in whatever ways are appropriate, if at all
  • If using Sphinx, nice to put in rst format for rendering as part of the Sphinx docs too

53 of 63

Test source dist

  • Don’t necessarily always need to do this if you know your project is set up correctly, but it can’t hurt to test your source tarball before tagging the release or registering it
  • The setup.py script has a sdist command which will automatically bundle your project into a source tarball
  • Try it: $ ./setup.py sdist

54 of 63

sdist output

  • Creates a directory called dist/ containing your source tarball (on some systems a different format like .zip is used; you can check available formats with ./setup.py sdist --help-formats)
  • Check the contents to ensure that they are everything you wanted:
    • $ tar tf dist/YourDistributionName-X.Y.Z.tar.gz
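
The same check can be scripted. The sketch below lists a tarball's members with the standard tarfile module and reports which expected files are absent. The helper name is hypothetical, and the demo builds a tiny stand-in archive (Demo-0.1.tar.gz with empty files) rather than a real sdist.

```python
import os
import tarfile
import tempfile


def missing_from_sdist(tarball_path, expected):
    """Return the entries of `expected` that no member path in the tarball matches."""
    with tarfile.open(tarball_path, "r:gz") as archive:
        names = archive.getnames()
    return [
        want for want in expected
        if not any(name == want or name.endswith("/" + want) for name in names)
    ]


# Demo: build a stand-in archive containing two empty files.
with tempfile.TemporaryDirectory() as tmp:
    tarball = os.path.join(tmp, "Demo-0.1.tar.gz")
    with tarfile.open(tarball, "w:gz") as archive:
        for name in ("Demo-0.1/setup.py", "Demo-0.1/simcluster.py"):
            archive.addfile(tarfile.TarInfo(name))  # zero-length member
    demo_missing = missing_from_sdist(
        tarball, ["setup.py", "simcluster.py", "CHANGELOG.rst"]
    )

print(demo_missing)
```

A non-empty result is a hint that something still needs to be listed in setup.py or MANIFEST.in.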

55 of 63

sdist testing

  • For bigger, more complicated projects a manual inspection of the contents may not be enough
  • You can test installing from your source tarball using pip: Just run “pip install dist/YourPackageName-X.Y.Z.tar.gz”
  • It’s good to use a virtualenv for this
  • Run your project’s tests (or manual test)

56 of 63

Make a MANIFEST.in

  • setuptools doesn’t automatically know about all files you want to include in your distribution—for example “CHANGELOG.rst”
  • For unfortunate historic reasons this does not go in setup.py; instead must create MANIFEST.in file
  • You can also use this to exclude files you might not want in your sdist (generated files for example)
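
A MANIFEST.in is a plain text file with one command per line. The sketch below writes a small one from Python so the contents can be inspected; include, recursive-include, and global-exclude are standard MANIFEST.in commands, while the docs/ directory and the helper name are assumptions made for illustration.

```python
import os
import tempfile

MANIFEST_LINES = [
    "include CHANGELOG.rst",          # add a single extra file to the sdist
    "recursive-include docs *.rst",   # hypothetical docs/ directory
    "global-exclude *.pyc",           # keep compiled bytecode out
]


def write_manifest(project_dir):
    """Write MANIFEST.in into `project_dir` and return its path."""
    path = os.path.join(project_dir, "MANIFEST.in")
    with open(path, "w") as handle:
        handle.write("\n".join(MANIFEST_LINES) + "\n")
    return path


# Demo in a throwaway directory; in practice this file sits next to setup.py.
with tempfile.TemporaryDirectory() as tmp:
    with open(write_manifest(tmp)) as handle:
        demo_text = handle.read()

print(demo_text)
```

Most people simply create the file by hand in an editor; generating it is only a convenient way to show the syntax here.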

57 of 63

Register on PyPI

  • First we need accounts on PyPI
  • But actually we will use TestPyPI, which lets us play around without cluttering the real Python package index
  • Go to testpypi.python.org and register an account

58 of 63

PyPI authentication still sucks

  • Put a file called “.pypirc” in your home directory (don’t forget the leading “.”)
  • This file is going to contain our PyPI password in plain text, so make sure to set appropriate permissions on it:
    • $ chmod 600 ~/.pypirc

59 of 63

.pypirc contents (for testpypi)

[distutils]
index-servers=
    pypitest

[pypitest]
repository = https://testpypi.python.org/pypi
username = erik
password = <your password goes here>
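
Because .pypirc uses INI syntax, its layout can be sanity-checked with the standard configparser module. The sketch below parses a sample string shaped like the file above and maps each listed index server to its repository URL. The helper name and the placeholder password are made up, and this is not how distutils itself reads the file.

```python
import configparser

SAMPLE_PYPIRC = """\
[distutils]
index-servers =
    pypitest

[pypitest]
repository = https://testpypi.python.org/pypi
username = erik
password = not-a-real-password
"""


def index_servers(pypirc_text):
    """Map each server named under [distutils] to its repository URL."""
    # RawConfigParser avoids '%' interpolation, which could trip on passwords.
    parser = configparser.RawConfigParser()
    parser.read_string(pypirc_text)
    servers = parser.get("distutils", "index-servers").split()
    return {name: parser.get(name, "repository") for name in servers}


print(index_servers(SAMPLE_PYPIRC))
```

Note that the server names under index-servers must be indented so they are read as a continuation of that value.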

60 of 63

Now run the register command

  • $ ./setup.py register -r https://testpypi.python.org/pypi
  • Yes, unfortunately for anything but the default PyPI you have to manually include the testpypi URL to register there instead
  • Now go to testpypi.python.org and you should find your project there

62 of 63

Build dist and upload

  • Registering a release on PyPI simply reserves a distribution name / version number, and uploads the metadata for your release
  • You can upload multiple files for your release—most commonly just the source tarball, but there are other possibilities too (ex: wheels)
  • Run $ ./setup.py sdist upload -r https://testpypi.python.org/pypi

63 of 63

Congrats!

  • You are now the proud owner of a distribution / release on PyPI!
  • Other people can install your code by simply running “pip install YourDistributionName”
  • You can release future versions on PyPI while still keeping the old versions available—if you want you can hide / show any number of old releases on the PyPI index, but the downloads are always available unless explicitly removed