Pythonic Code Organization

Making happy little modules

by Mike Johnson / @IndentError

Purpose

There are only two hard things in Computer Science: cache invalidation and naming things.

- Phil Karlton

What is good organization?

Organized Code

Real World Examples

Python 3!

PEP 3000

Guido van Rossum, guido at python.org
Tue Dec 19 01:52:28 CET 2006
Py3k release schedule worries

Ok, so be it. Let this be a pronouncement -- the only stdlib reorg we're doing will be (a) deleting silly old stuff; (b) rename modules that don't conform to the current module/package naming convention, like StringIO, cPickle or UserDict.
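To make the renames in the pronouncement concrete, here is a minimal sketch that uses each of the three modules under its Python 3 name. It assumes a Python 3 interpreter; the `Registry` subclass is a made-up name used only for illustration.

```python
# Python 2 name        -> Python 3 location
# StringIO.StringIO    -> io.StringIO
# cPickle              -> pickle (the C accelerator is picked up automatically)
# UserDict.UserDict    -> collections.UserDict
import io
import pickle
from collections import UserDict

# In-memory text buffer, formerly StringIO.StringIO.
buffer = io.StringIO()
buffer.write('happy little module')

# Serialization, formerly cPickle.
payload = pickle.dumps({'renamed': True})


class Registry(UserDict):
    """Hypothetical dict-like class, for illustration only."""


registry = Registry(name='stdlib')
print(buffer.getvalue(), pickle.loads(payload), registry['name'])
```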

UserDict

TK

Observation 1:
Combine similar things

Observation 2:
If you are torn between two, make a new name

Other renamings

“Modules should have short, all-lowercase names. Underscores can be used in the module name if it improves readability.”

- PEP 8

Modules with underscores

~/src/cpython/Lib $ ls -1 [a-z]*_*.py
dummy_threading.py
py_compile.py
sre_compile.py
sre_constants.py
sre_parse.py

Modules with underscores

~/src/cpython/Lib $ ls -1 lib2to3/fixes
fix_apply.py
fix_asserts.py
fix_basestring.py
fix_buffer.py
fix_callable.py
fix_dict.py
fix_except.py
fix_execfile.py
fix_exec.py
fix_exitfunc.py
fix_filter.py
...

Observation 3:
Actually read PEP 8

Observation 4:
Avoid underscores and use lower case
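One way to apply this observation to your own project is a small name check. The sketch below is only an illustration: `name_issues` is a hypothetical helper, and it checks case and underscores only, not whether a name is "short".

```python
def name_issues(module_name):
    """Return the reasons a module name strays from the observation above."""
    issues = []
    if module_name != module_name.lower():
        issues.append('uses upper case')
    # Leading/trailing underscores (private modules) are ignored here;
    # only underscores inside the name are flagged.
    if '_' in module_name.strip('_'):
        issues.append('contains an underscore')
    return issues


for name in ['StringIO', 'sre_parse', 'configparser']:
    print(name, name_issues(name) or 'ok')
```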

Digging Deeper

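import glob
from difflib import SequenceMatcher
from pathlib import Path

# Python 2 module -> Python 3 dotted name. Partial: only the renames
# visible in the output below; the full table is not shown.
MAPPING = {
    'commands': 'subprocess',
    'StringIO': 'io',
    'ConfigParser': 'configparser',
    'htmlentitydefs': 'html.entities',
    'SimpleHTTPServer': 'http.server',
    'CGIHTTPServer': 'http.server',
}

totals = []

for filename in glob.glob('/usr/lib/python2.7/*.py'):
    source = Path(filename)
    mod = source.stem
    if mod.startswith('_') and mod not in MAPPING:
        continue

    new_name = MAPPING.get(mod, mod).replace('.', '/') + '.py'
    dest = Path('/usr/lib/python3.4') / new_name
    if not dest.exists():
        continue

    seq = SequenceMatcher(a=source.open(encoding='utf-8').readlines(),
                          b=dest.open(encoding='utf-8').readlines())
    # SequenceMatcher has no source/dest attributes, so keep the paths with the ratio.
    totals.append((seq.ratio(), str(source), str(dest)))

# Least similar files first.
for ratio, source, dest in sorted(totals):
    print('{:40} -> {:40} {:.2f}'.format(source, dest, ratio))

/usr/lib/python2.7/commands.py           -> /usr/lib/python3.4/subprocess.py         0.01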
/usr/lib/python2.7/StringIO.py           -> /usr/lib/python3.4/io.py                 0.05
/usr/lib/python2.7/plistlib.py           -> /usr/lib/python3.4/plistlib.py           0.14
/usr/lib/python2.7/ConfigParser.py       -> /usr/lib/python3.4/configparser.py       0.18
/usr/lib/python2.7/htmlentitydefs.py     -> /usr/lib/python3.4/html/entities.py      0.19
/usr/lib/python2.7/functools.py          -> /usr/lib/python3.4/functools.py          0.19
/usr/lib/python2.7/dis.py                -> /usr/lib/python3.4/dis.py                0.23
/usr/lib/python2.7/SimpleHTTPServer.py   -> /usr/lib/python3.4/http/server.py        0.26
/usr/lib/python2.7/types.py              -> /usr/lib/python3.4/types.py              0.26
/usr/lib/python2.7/socket.py             -> /usr/lib/python3.4/socket.py             0.27
/usr/lib/python2.7/nntplib.py            -> /usr/lib/python3.4/nntplib.py            0.27
/usr/lib/python2.7/base64.py             -> /usr/lib/python3.4/base64.py             0.28
/usr/lib/python2.7/CGIHTTPServer.py      -> /usr/lib/python3.4/http/server.py        0.31
/usr/lib/python2.7/struct.py             -> /usr/lib/python3.4/struct.py             0.33
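The last column above is `SequenceMatcher.ratio()`, which is 2 * M / T, where M is the number of matching elements and T is the total number of elements in both sequences. A tiny sketch with made-up file contents shows how one changed line out of four moves the score:

```python
from difflib import SequenceMatcher

# Made-up "before" and "after" files, one list element per line.
old_lines = ['import os\n', 'import sys\n', 'print "hi"\n', 'x = 1\n']
new_lines = ['import os\n', 'import sys\n', 'print("hi")\n', 'x = 1\n']

matcher = SequenceMatcher(a=old_lines, b=new_lines)

# 3 lines match, 8 lines in total: 2 * 3 / 8
similarity = matcher.ratio()
print('{:.2f}'.format(similarity))  # 0.75
```

A ratio near 0 therefore means almost no line survived unchanged, and 1.0 means the files are identical line for line.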

Observation 6:
Moving stuff defeats the purpose

Digging Deeper

Observation 5:
Say what you mean

urllib, same as it ever was

Defining Packages

How does the stdlib write __init__.py?

$ find . -name '__init__.py' ! -path './test*' | xargs wc -l
     1 ./http/__init__.py
    48 ./asyncio/__init__.py
     0 ./urllib/__init__.py
    23 ./sqlite3/__init__.py
  1161 ./collections/__init__.py
  3838 ./tkinter/__init__.py
   132 ./html/__init__.py
  1950 ./logging/__init__.py
   210 ./ensurepip/__init__.py
   449 ./venv/__init__.py
... snip ...
 10593 total

Defining Packages - Collections

~/src/cpython/Lib $ ls -1 collections/
abc.py
__init__.py
__main__.py

Defining Packages

How many symbols are exported?

Symbols

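#!/usr/bin/env python
from pathlib import Path
import importlib

lib = Path('/home/mrj/src/cpython/Lib')
counts = []
# Top-level packages only: package.name alone is not importable for nested ones.
for init in lib.glob('*/__init__.py'):
    package = init.parent
    try:
        mod = importlib.import_module(package.name)
    except ImportError:
        continue  # e.g. packages that are unavailable on this platform

    # Count the names that do not start with an underscore.
    symbols = [n for n in dir(mod) if not n.startswith('_')]
    counts.append((len(symbols), package.name))

# Largest packages first.
for count, name in sorted(counts, reverse=True):
    print(count, '\t', name)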

Defining Packages - Symbols

$ python package_symbols.py
286 curses
136 tkinter
84 asyncio
76 sqlite3
67 ctypes
64 logging
43 multiprocessing
30 unittest
26 distutils
26 collections
10 venv
9 json
9 importlib
8 dbm
7 encodings
4 email
3 html
1 urllib
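The counts above come from `dir()`, which also picks up submodules and anything the package happens to import. A possible refinement, sketched here with a hypothetical `public_names` helper, is to prefer a package's `__all__` when it declares one and fall back to the underscore rule otherwise:

```python
import importlib


def public_names(module_name):
    """Names a module declares public, falling back to the underscore rule."""
    mod = importlib.import_module(module_name)
    declared = getattr(mod, '__all__', None)
    if declared is not None:
        # The author stated the public interface explicitly.
        return sorted(declared)
    # No __all__: treat every name without a leading underscore as public.
    return sorted(n for n in dir(mod) if not n.startswith('_'))


names = public_names('json')
print(len(names), names)
```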

Defining Packages - asyncio

# This relies on each of the submodules having an __all__ variable.
from .coroutines import *
from .events import *
from .futures import *
from .locks import *
from .protocols import *
from .queues import *
from .streams import *
from .subprocess import *
from .tasks import *
from .transports import *
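The comment at the top of that file is the key point: `from module import *` copies only the names listed in `__all__` when a module defines it. The sketch below demonstrates this with a throwaway in-memory module; `demo_events`, `get_event_loop` and `helper` are stand-in names, not asyncio's actual code.

```python
import sys
import types

# Build a throwaway module in memory instead of writing a file to disk.
demo = types.ModuleType('demo_events')
exec(
    "__all__ = ['get_event_loop']\n"
    "def get_event_loop():\n"
    "    return 'loop'\n"
    "def helper():\n"
    "    return 'internal'\n",
    demo.__dict__,
)
sys.modules['demo_events'] = demo

# Run the star import in a fresh namespace and see what arrives.
namespace = {}
exec('from demo_events import *', namespace)
exported = sorted(n for n in namespace if not n.startswith('__'))
print(exported)  # ['get_event_loop']
```

`helper` has no leading underscore, yet it stays out of the importing namespace because `__all__` does not list it.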

Defining Packages - Public Interface

Observation 7:
Create a public interface

Requests

Observation 9:
If it has state, it may be a class
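As a rough illustration of this observation, the sketch below keeps shared settings on an object rather than passing them to every function call. `ApiSession` and its methods are hypothetical and are not the Requests API; the URL is a placeholder.

```python
class ApiSession:
    """Hypothetical client that remembers settings between calls."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip('/')
        self.headers = {}

    def build_request(self, path):
        # The stored base URL and headers are applied to every request.
        url = '{}/{}'.format(self.base_url, path.lstrip('/'))
        return {'url': url, 'headers': dict(self.headers)}


session = ApiSession('https://example.invalid/api/')
session.headers['Authorization'] = 'token abc'
request = session.build_request('/users')
print(request['url'])
```

Without the class, every caller would have to carry the base URL and headers around and pass them in each time.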

Summary

Thanks!

by Mike Johnson / @IndentError
