How to embed data files in Python using setuptools

Within a Python package, it is often useful to provide data files. These files are not Python modules, so you do not want to place them alongside your modules. Let us put them in a directory share/data. Suppose you have the following structure:

|– share
|   |– data
|– src
|   `– yourpackage
|       |–

Now, the setup file should include the directory share/data and its contents. There are different ways of doing this. We chose to ship the data files within the distribution (not in a global share directory). The setup file should look like this:

import os
from setuptools import setup, find_packages
import metainfo # a file with relevant information

# Non-recursive variant (requires "import glob"):
#datadir = os.path.join('share','data')
#datafiles = [(datadir, [f for f in glob.glob(os.path.join(datadir, '*'))])]
# Based on Jeremy's comment, we can also use os.walk for recursion
datadir = os.path.join('share','data')
datafiles = [(d, [os.path.join(d,f) for f in files])
    for d, folders, files in os.walk(datadir)]

# assumed mapping: the packages live under src/
package_dir = {'': 'src'}

setup(
    name             = 'yourpackage',
    version          = metainfo.version,
    maintainer       = metainfo.maintainer,
    maintainer_email = metainfo.maintainer_email,
    author           = metainfo.authors,
    author_email     = metainfo.authors,
    description      = metainfo.description,
    keywords         = metainfo.keywords,
    long_description = metainfo.long_description,
    # package installation
    packages = find_packages('src'),
    package_dir  = package_dir,
    data_files = datafiles,
)
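
To see what the os.walk recipe produces, here is a minimal sketch that runs it on a throwaway directory tree. The helper name collect_data_files and the file names table.csv and images/logo.png are made up for the illustration; only the list comprehension itself comes from the setup file above.

```python
import os
import tempfile

def collect_data_files(datadir):
    """Return one (directory, [file paths]) pair per directory under datadir."""
    return [(d, [os.path.join(d, f) for f in sorted(files)])
            for d, _folders, files in os.walk(datadir)]

# Build a temporary share/data tree so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as root:
    datadir = os.path.join(root, 'share', 'data')
    os.makedirs(os.path.join(datadir, 'images'))
    for rel in ('table.csv', os.path.join('images', 'logo.png')):
        open(os.path.join(datadir, rel), 'w').close()

    entries = collect_data_files(datadir)
    # Express the paths relative to the temporary root for readability.
    relative = sorted(
        (os.path.relpath(d, root), [os.path.relpath(f, root) for f in fs])
        for d, fs in entries
    )

for directory, files in relative:
    print(directory, files)
```

Each sub-directory gets its own entry, which is what lets the recipe handle nested data directories where the glob version only sees the top level.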

You can then write a small helper that builds the full path to any file in ./share/data:

import os
from os.path import join as pj
import yourpackage

def get_data(filename):
    # directory of the package, e.g. .../src/yourpackage
    packagedir = yourpackage.__path__[0]
    # go up from src/ to the root, then down into share/data
    dirname = pj(os.path.dirname(packagedir), '..', 'share','data')
    fullname = pj(dirname, filename)
    return fullname
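
The path arithmetic in get_data can be checked without installing anything. The sketch below takes the package directory as an argument instead of importing yourpackage; the name data_path and the example paths are hypothetical, and it assumes the source-tree layout shown at the top of the post (an installed layout may differ).

```python
import os

def data_path(package_dir, filename):
    """Build the share/data path for a package located at package_dir."""
    # package_dir plays the role of yourpackage.__path__[0]
    src_dir = os.path.dirname(package_dir)
    # normpath collapses the 'src/..' step into the project root
    project_root = os.path.normpath(os.path.join(src_dir, '..'))
    return os.path.join(project_root, 'share', 'data', filename)

example = data_path(os.path.join('project', 'src', 'yourpackage'), 'table.csv')
print(example)
```

Normalizing the path is optional, but it makes the result easier to read in error messages than a path that still contains '..'.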

8 Responses to How to embed data files in Python using setuptools

  1. Stu says:

    This doesn't seem to work with virtualenv + Python 2.7, unfortunately.

  2. JeremyM says:

    I would recommend something like

    [ (d, [os.path.join(d, f) for f in files]) for d,folders,files in os.walk(datadir)]

    over glob for general cases as it’ll handle the recursion

    • Thomas Cokelaer says:

      Just rediscovered my post… and your comment on such a case (recursion). Indeed, os.walk and your recipe worked. I’ve edited the post. Thanks.

  3. Pingback: Building a binary into your Python package | Fahhem's Blog

  4. Craig says:

    data_files doesn’t preserve your relative directory structure, it will just dump every file into the .egg directory. 🙁

    • Hi,
      It seems to work for me under Linux/Python 2.7, but it is never easy to tune the setup as wanted, so there may be slight differences between systems or setups that prevent the structure from being kept.

  5. ChePazzo says:

    To install cleanly into a virtualenv, I do this:

    import os
    ROOT_DIR = os.getenv('VIRTUAL_ENV')
    (ROOT_DIR+'/etc', ['config/app.conf']),

  6. Sak says:

    Thanks dude!
    That was cool
