The Hitchhiker's Guide to Python
Greetings, Earthling! Welcome to The Hitchhiker's Guide to Python.
This is a living, breathing guide. If you'd like to contribute, fork us on GitHub!
This handcrafted guide exists to provide both novice and expert Python developers with a best-practice handbook for the installation, configuration, and usage of Python on a daily basis.
This guide is opinionated in a way that is almost, but not quite, entirely unlike Python's official documentation. You won't find a list of every Python web framework available here. Rather, you'll find a nice concise list of highly recommended options.
Python 3 is the standard. If you are using Python 3, congratulations — you are indeed a person of excellent taste. — Kenneth Reitz
Let's get started! But first, let's make sure you know where your towel is.
Source: docs.python-guide.org — Licensed under CC BY-NC-SA 3.0
Choosing a Python Interpreter
Python is not just a language — it's a specification with multiple implementations. Choosing the right interpreter depends on your project's needs: compatibility, performance, and ecosystem.
CPython (Recommended)
CPython is the reference implementation of Python, written in C. It compiles Python code to intermediate bytecode which is then interpreted by a virtual machine. CPython provides the highest level of compatibility with Python packages and C extension modules.
Use CPython unless you have a specific reason not to. It's the standard, the most widely tested, and what the vast majority of the Python ecosystem targets.
- If you are writing open source Python code and want to reach the widest possible audience, targeting CPython is best.
- To use packages which rely on C extensions to function, CPython is your only practical option.
- All versions of the Python language are defined by CPython's behavior since it is the reference implementation.
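The bytecode compilation described above can be observed directly with the standard library's dis module, which disassembles a function into the instructions CPython's virtual machine executes (the add function here is just an example):

```python
import dis

def add(a, b):
    return a + b

# Disassemble to see the bytecode CPython's VM interprets.
# Exact opcodes vary by Python version, but you'll see loads of
# a and b, an add instruction, and a return.
dis.dis(add)
```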
Current recommended version: Python 3.12 or 3.13. Each new version brings improved standard library modules, performance gains, security fixes, and better error messages. Python 3.13 introduced a free-threaded build (no-GIL) as an experimental feature.
Install the latest version with uv:
$ uv python install 3.13
Or use pyenv if you need to manage multiple versions:
$ pyenv install 3.13
PyPy
PyPy is a Python interpreter implemented in RPython, a restricted statically-typed subset of the Python language. Its defining feature is a tracing just-in-time (JIT) compiler.
PyPy aims for maximum compatibility with CPython while significantly improving performance. On its benchmark suite, it averages several times faster than CPython, with CPU-bound pure-Python code benefiting the most.
If you have a CPU-bound Python application and need a performance boost without rewriting code, PyPy is worth trying. It targets modern Python 3.x.
Caveat: Some C extension packages may not work with PyPy. Test your dependencies first.
GraalPy
GraalPy is a Python implementation built on Oracle's GraalVM. It provides high performance through GraalVM's JIT compiler and offers seamless interoperability with Java, JavaScript, Ruby, and other GraalVM-supported languages.
GraalPy is a good choice when:
- You need to embed Python in a JVM application
- You want polyglot interoperability between Python and other languages
- You're running in a GraalVM environment already
It targets Python 3.x compatibility and is actively developed.
Jython
Jython compiles Python code to Java bytecode which is then executed by the JVM. It can import and use any Java class like a Python module.
Jython is useful if you need to interface with an existing Java codebase or have other reasons to write Python code for the JVM. However, note that Jython only supports Python 2.7 and development has slowed considerably. For new JVM-based Python work, consider GraalPy instead.
IronPython
IronPython is an implementation of Python for the .NET framework. It can use both Python and .NET framework libraries, and can expose Python code to other .NET languages.
IronPython supports Python 2.7, with IronPython 3 under development. It remains useful for specific .NET integration scenarios.
Python for .NET (pythonnet)
Python for .NET takes a different approach from IronPython — it provides near-seamless integration of a natively installed CPython with the .NET Common Language Runtime (CLR). This lets you use standard CPython with full package ecosystem access while also calling into .NET libraries.
It is compatible with Python 3.x and can run alongside IronPython without conflict.
Which Should You Use?
| Implementation | Best For | Python 3 Support |
|---|---|---|
| CPython | General purpose — use by default | ✅ Latest |
| PyPy | CPU-bound workloads needing speed | ✅ Yes |
| GraalPy | JVM/polyglot environments | ✅ Yes |
| Jython | Legacy Java integration | ❌ Python 2.7 only |
| IronPython | .NET integration | ⚠️ Python 2.7 (v3 in dev) |
| pythonnet | .NET integration with CPython packages | ✅ Yes |
When in doubt, use CPython. It's what the vast majority of Python developers use, what CI systems expect, and what package authors test against.
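If you ever need to confirm which implementation and version a given interpreter is, you can query it at runtime from the standard library:

```python
import platform
import sys

print(platform.python_implementation())  # "CPython", "PyPy", ...
print(sys.implementation.name)           # lowercase: "cpython", "pypy", ...
print(sys.version_info[:3])              # e.g. (3, 13, 0)

# Branch on the implementation only when truly necessary
if sys.implementation.name == "cpython":
    pass  # rely on CPython-specific behavior here
```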
Installing Python

There's a good chance you already have Python on your operating system. If so, you may not need to install anything else to use Python. That said, I strongly recommend installing the tools described in the guides below before you start building Python applications for real-world use.
The Easiest Way: uv
uv is a fast, all-in-one Python tool that manages Python installations, virtual environments, and packages. It's the recommended way to get started.
Install uv:
macOS / Linux:
$ curl -LsSf https://astral.sh/uv/install.sh | sh
Windows:
$ powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
With pip:
$ pip install uv
Then install the latest Python:
$ uv python install 3.13
uv can install and manage multiple Python versions, create virtual environments, install packages, and run scripts — all without needing separate tools.
Installation by Platform
Python 3 on macOS
Install via Homebrew:
$ brew install python@3.13
Or use uv:
$ uv python install 3.13
Or download the installer from python.org.
Python 3 on Windows
Install via the Microsoft Store (search "Python"), or download the installer from python.org. During installation, check "Add Python to PATH".
Or use uv:
$ uv python install 3.13
Python 3 on Linux
Most Linux distributions include Python 3. Install it with your package manager if it's not already present:
Debian / Ubuntu:
$ sudo apt update
$ sudo apt install python3 python3-pip python3-venv
Fedora:
$ sudo dnf install python3 python3-pip
Arch:
$ sudo pacman -S python python-pip
Or use uv for a version-independent install:
$ uv python install 3.13
Managing Multiple Versions
If you need multiple Python versions side by side:
With uv:
$ uv python install 3.12
$ uv python install 3.13
$ uv python list # see installed versions
With pyenv (macOS/Linux):
$ brew install pyenv # or see pyenv docs for Linux
$ pyenv install 3.13
$ pyenv install 3.12
$ pyenv global 3.13 # set default
Verifying Your Installation
$ python3 --version
Python 3.13.x
$ pip3 --version
pip 24.x from ... (python 3.13)
You're ready to go. Next up: setting up virtual environments.
Package & Virtual Environment Management

This tutorial walks you through installing and using Python packages.
It will show you how to install and use the necessary tools and make strong recommendations on best practices. Keep in mind that Python is used for a great many different purposes, and precisely how you want to manage your dependencies may change based on how you decide to publish your software. The guidance presented here is most directly applicable to the development and deployment of network services (including web applications), but is also very well suited to managing development and testing environments for any kind of project.
Make sure you've got Python
Before you go any further, make sure you have Python and that it's available from your command line:
$ python3 --version
You should get output like Python 3.13.x. If you do not have Python, see the Installing Python section of this guide.
If you're a newcomer and you get an error like
NameError: name 'python3' is not defined, it's because this command is meant to be run in a shell (also called a terminal or console), not inside the Python interpreter.
uv (Recommended)
uv is a fast, all-in-one Python package manager and project tool written in Rust. It replaces pip, virtualenv, pip-tools, pipx, poetry, pyenv, and more — with a single binary that's 10-100x faster than the tools it replaces.
If you're familiar with Node.js's npm or Rust's cargo, it's similar in spirit to those tools.
Installing uv
macOS / Linux:
$ curl -LsSf https://astral.sh/uv/install.sh | sh
Windows:
$ powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
With pip:
$ pip install uv
Creating a Project
$ mkdir myproject && cd myproject
$ uv init
This creates a pyproject.toml and a .python-version file. Now add a dependency:
$ uv add requests
uv will:
- Create a virtual environment automatically (.venv/)
- Resolve and lock all dependencies (uv.lock)
- Install everything
Using Installed Packages
Create a main.py file:
import requests
response = requests.get('https://httpbin.org/ip')
print(f'Your IP is {response.json()["origin"]}')
Run it with uv:
$ uv run python main.py
Using uv run ensures your script runs within the project's virtual environment with all dependencies available.
Adding and Removing Dependencies
$ uv add httpx # add a dependency
$ uv add --dev pytest # add a development dependency
$ uv remove httpx # remove a dependency
All changes are reflected in pyproject.toml and uv.lock.
Running Commands in the Environment
$ uv run pytest # run any command in the virtual environment
$ uv run python app.py # run a script with dependencies available
$ source .venv/bin/activate # or activate the environment directly in your shell
uv tool: Install Global CLI Tools
uv's tool interface (a modern replacement for pipx) installs and runs Python CLI applications in isolated environments. Use it for tools you want available system-wide:
$ uv tool install black # install a tool globally
$ uv tool install ruff
$ uv tool run ruff check . # run without installing
Lower Level: venv
Python 3 includes the built-in venv module for creating virtual environments. This is useful when you don't need a full project setup.
Basic Usage
- Create a virtual environment:
$ python3 -m venv .venv
- Activate it:
macOS / Linux:
$ source .venv/bin/activate
Windows:
> .venv\Scripts\activate
The name of the active environment appears in your prompt (e.g., (.venv)).
- Install packages:
$ pip install requests
- Save dependencies:
$ pip freeze > requirements.txt
- Install from a requirements file:
$ pip install -r requirements.txt
- Deactivate:
$ deactivate
To delete a virtual environment, simply remove its folder:
$ rm -rf .venv
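From inside Python you can check whether a virtual environment is active: in a venv, sys.prefix points at the environment while sys.base_prefix points at the base interpreter:

```python
import sys

# True inside a virtual environment, False in the base interpreter
in_venv = sys.prefix != sys.base_prefix
print(sys.prefix)  # path of the active environment (or base install)
print(in_venv)
```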
Remember to exclude the virtual environment folder from source control by adding .venv/ to your .gitignore.
direnv
direnv automatically activates a virtual environment when you cd into a project directory. Add this to your project's .envrc:
source .venv/bin/activate
Install on macOS:
$ brew install direnv
Then follow the setup instructions for your shell.
Your Development Environment

Editors & IDEs
Just about anything that can edit plain text will work for writing Python code; however, using a more powerful editor will make your life much easier.
VS Code (Recommended)
Visual Studio Code is the most popular editor for Python development today. It's free, lightweight, and has excellent Python support via the official Python extension.
Key features:
- IntelliSense (autocompletion with type checking)
- Integrated debugging and testing
- Jupyter notebook support
- Linting with Ruff, Pylance type checking
- Remote development (SSH, containers, WSL)
- Integrated terminal
Install the Python extension and optionally Ruff for fast linting and formatting.
Cursor
Cursor is a VS Code-based editor with built-in AI assistance. It supports all VS Code extensions and adds AI-powered code completion, editing, and chat. A strong choice if you want AI integrated into your workflow.
PyCharm
PyCharm by JetBrains is a full-featured Python IDE. It has two versions: Professional (paid) and Community (free, Apache 2.0). PyCharm provides excellent refactoring, debugging, testing, and database tools out of the box.
Most of PyCharm's features are also available in IntelliJ IDEA via the free Python plugin.
Zed
Zed is a fast, modern editor built in Rust with native Python support, AI-assisted editing, and collaborative features. Still maturing but already excellent for Python development.
Vim / Neovim
Vim and Neovim remain powerful choices for developers who prefer keyboard-driven editing. A good starting configuration for Python in your ~/.vimrc or Neovim config:
set textwidth=79
set shiftwidth=4
set tabstop=4
set expandtab
set softtabstop=4
set shiftround
set autoindent
For a modern Neovim Python setup, consider these plugins:
- nvim-lspconfig — LSP client (for pyright/pylsp)
- none-ls — formatting and linting (for ruff)
- nvim-cmp — autocompletion
Emacs
Emacs is another powerful, programmable editor. See Python Programming in Emacs at EmacsWiki for setup guides.
Sublime Text
Sublime Text is a fast, polished editor with Python support and a vibrant plugin ecosystem. The LSP plugin enables modern language server integration.
Linting & Formatting
Ruff (Recommended)
Ruff is a fast Python linter and formatter written in Rust. It replaces over a dozen tools (pycodestyle, flake8, isort, autopep8, yapf, pydocstyle, and more) with a single binary that's 10-100x faster.
$ uv tool install ruff
$ ruff check . # lint
$ ruff format . # format
Configuration goes in pyproject.toml:
[tool.ruff]
line-length = 88
target-version = "py312"
[tool.ruff.lint]
select = ["E", "F", "I", "N", "W", "UP"]
Black
Black is the uncompromising Python code formatter. If you're not using Ruff, Black is the standard choice:
$ uv tool install black
$ black .
Type Checkers
Static type checking catches bugs before runtime:
- pyright / basedpyright — fast, VS Code's default
- mypy — the original, thorough but slower
$ uv add --dev pyright
$ uv run pyright
Interpreter Tools
Virtual Environments
Virtual environments isolate project package dependencies. See the Package & Virtual Environment Management chapter for full details.
pyenv
pyenv lets you install and switch between multiple Python versions. It works by inserting shims into your PATH:
$ pyenv install 3.13
$ pyenv install 3.12
$ pyenv global 3.13 # set default
pyenv installs CPython, PyPy, Anaconda, miniconda, and other interpreters.
For managing virtual environments with pyenv, use pyenv-virtualenv.
Note: If you use uv, it can manage Python installations too (uv python install), making pyenv optional.
Interactive Shells
IPython
IPython provides a rich interactive Python shell with syntax highlighting, autocompletion, magic commands, and more:
$ uv tool install ipython
$ ipython
bpython
bpython is an alternative interface with in-line syntax highlighting, readline-like autocomplete, and a "rewind" feature:
$ uv tool install bpython
$ bpython
ptpython
ptpython is a REPL built on prompt_toolkit with syntax highlighting, multiline editing, and Vi/Emacs modes:
$ uv tool install ptpython
$ ptpython
Package Manager Configuration

uv Configuration
uv is the recommended package manager. Most configuration lives in pyproject.toml and uv.lock, but some behaviors can be configured via environment variables.
Global Cache
uv caches downloaded packages by default in ~/.cache/uv/ (macOS/Linux) or %LOCALAPPDATA%\uv\cache (Windows). This means packages are downloaded once and shared across all projects and virtual environments.
To clear the cache:
$ uv cache clean
To set a custom cache directory:
$ export UV_CACHE_DIR=/path/to/cache
Requiring an Active Environment
When using pip directly (not uv), it's easy to accidentally install packages globally. To prevent this, add to your ~/.bashrc or ~/.zshrc:
export PIP_REQUIRE_VIRTUALENV=true
Then pip will refuse to install outside a virtual environment:
$ pip install requests
Could not find an activated virtualenv (required).
Override it when you need a global install:
gpip() {
    PIP_REQUIRE_VIRTUALENV=false pip "$@"
}
pip Configuration
If you use pip alongside uv, you can configure it via pip.conf (macOS/Linux) or pip.ini (Windows):
macOS / Linux: ~/.config/pip/pip.conf (the legacy ~/.pip/pip.conf is also honored)
Windows: %APPDATA%\pip\pip.ini
Example configuration:
[global]
require-virtualenv = true
Caching with pip
Modern pip (6.0+) has built-in caching enabled by default — no configuration needed. If you're on an older version, upgrade:
$ pip install --upgrade pip
Summary: uv vs pip
| Task | uv | pip |
|---|---|---|
| Install package | uv add requests | pip install requests |
| Install (one-time) | uv pip install requests | pip install requests |
| Lock dependencies | Automatic (uv.lock) | Manual (pip freeze > requirements.txt) |
| Create venv | Automatic or uv venv | python3 -m venv .venv |
| Run in env | uv run <command> | Activate venv first |
| Global tools | uv tool install <pkg> | pipx install <pkg> |
| Cache | Built-in, fast | Built-in |
Use uv for projects. Use uv pip when you need pip-compatible commands. Use uv tool for global CLI tools.
Structuring Your Project

By "structure" we mean the decisions you make concerning how your project best meets its objective. We need to consider how to best leverage Python's features to create clean, effective code. In practical terms, "structure" means making clean code whose logic and dependencies are clear as well as how the files and folders are organized in the filesystem.
Which functions should go into which modules? How does data flow through the project? What features and functions can be grouped together and isolated? By answering questions like these you can begin to plan, in a broad sense, what your finished product will look like.
In this section, we take a closer look at Python's modules and import systems as they are the central elements to enforcing structure in your project. We then discuss various perspectives on how to build code which can be extended and tested reliably.
Structure of the Repository
It's Important
Just as Code Style, API Design, and Automation are essential for a healthy development cycle, repository structure is a crucial part of your project's architecture.
When a potential user or contributor lands on your repository's page, they see a few things:
- Project Name
- Project Description
- Bunch O' Files
Only when they scroll below the fold will the user see your project's README.
If your repo is a massive dump of files or a nested mess of directories, they might look elsewhere before even reading your beautiful documentation.
Dress for the job you want, not the job you have.
Of course, first impressions aren't everything. You and your colleagues will spend countless hours working with this repository, eventually becoming intimately familiar with every nook and cranny. The layout is important.
Sample Repository (Modern)
Here's a modern Python project using the src layout and pyproject.toml:
README.md
LICENSE
pyproject.toml
uv.lock
src/
    sample/
        __init__.py
        core.py
        helpers.py
docs/
    conf.py
    index.rst
tests/
    test_basic.py
    test_advanced.py
Let's get into some specifics.
pyproject.toml
The pyproject.toml file is the modern standard for Python project configuration (defined by PEP 517, PEP 518, and PEP 621). It replaces setup.py, setup.cfg, and requirements.txt.
A minimal pyproject.toml:
[project]
name = "sample"
version = "1.0.0"
description = "A sample project"
requires-python = ">=3.12"
dependencies = [
    "requests>=2.31",
]

[project.optional-dependencies]
dev = [
    "pytest>=8.0",
    "ruff>=0.4",
]

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"
Use uv to manage this file:
$ uv init # create a new project
$ uv add requests # add a dependency
$ uv add --dev pytest # add a dev dependency
$ uv sync # install all dependencies
The src Layout
Placing your package inside a src/ directory is now considered best practice. This prevents accidental imports from the project root during development and ensures you're testing the installed package, not the source directly.
src/
    sample/
        __init__.py
        core.py
License
Place the full license text and copyright claims in ./LICENSE. If you aren't sure which license to use, check out choosealicense.com.
You are free to publish code without a license, but this prevents many people from using or contributing to your code.
Documentation
Documentation belongs in ./docs/. There's little reason for it to exist elsewhere.
Test Suite
For advice on writing your tests, see Testing Your Code.
Tests belong in ./tests/. Starting out, a small test suite might exist in a single file:
./test_sample.py
Once the suite grows:
tests/test_basic.py
tests/test_advanced.py
With the src layout and a properly configured pyproject.toml, tests can import your package directly — no path hacks needed:
# tests/test_basic.py
from sample.core import something

def test_something():
    assert something() == "expected"
Install the package in development mode:
$ uv sync
# or
$ uv pip install -e .
Makefile or Just
A Makefile (or Justfile) is useful for defining common tasks:
init:
	uv sync

test:
	uv run pytest tests

lint:
	uv run ruff check .
	uv run ruff format --check .

.PHONY: init test lint
Regarding Django Applications
When starting a Django project, run the command from within your repository:
$ django-admin startproject samplesite .
Note the . — this avoids unnecessary nesting. The resulting structure:
README.md
manage.py
samplesite/
    settings.py
    wsgi.py
    sampleapp/
        models.py
Structure of Code is Key
Thanks to the way imports and modules are handled in Python, it is relatively easy to structure a Python project. Easy, here, means that you do not have many constraints and that the module importing model is easy to grasp. Therefore, you are left with the pure architectural task of crafting the different parts of your project and their interactions.
Easy structuring means it's also easy to do it poorly. Some signs of a poorly structured project include:
- Multiple and messy circular dependencies: If the classes Table and Chair in furn.py need to import Carpenter from workers.py, and the class Carpenter also needs to import Table and Chair, you have a circular dependency requiring fragile hacks.
- Hidden coupling: Each change in Table's implementation breaks 20 tests in unrelated test cases because it breaks Carpenter's code.
- Heavy usage of global state: Instead of explicitly passing (height, width, type, wood), Table and Carpenter rely on global variables that can be modified on the fly by different agents.
- Spaghetti code: Multiple pages of nested if clauses and for loops with lots of copy-pasted procedural code and no proper segmentation.
- Ravioli code: Hundreds of similar little pieces of logic without proper structure. If you can never remember whether to use FurnitureTable, AssetTable, or Table for your task, you might be swimming in ravioli code.
Modules
Python modules are one of the main abstraction layers available and probably the most natural one. Abstraction layers allow separating code into parts holding related data and functionality.
For example, a layer of a project can handle interfacing with user actions, while another would handle low-level manipulation of data. The most natural way to separate these two layers is to regroup all interfacing functionality in one file, and all low-level operations in another file.
As soon as you use import statements, you use modules. These can be either built-in modules such as os and sys, third-party modules you've installed, or your project's internal modules.
To keep in line with the style guide, keep module names short, lowercase, and avoid using special symbols like the dot (.) or question mark (?). A file name like my.spam.py should be avoided — it will interfere with the way Python looks for modules.
If you like, you could name your module my_spam.py, but even the underscore should not be seen that often in module names. Using other characters (spaces or hyphens) in module names will prevent importing. Try to keep module names short so there is no need to separate words. And, most of all, don't namespace with underscores; use submodules instead.
# OK
import library.plugin.foo
# not OK
import library.foo_plugin
The import modu statement will look for modu.py in the same directory as the caller, then search the Python path recursively. When found, the interpreter executes the module in an isolated scope. Then, the module's variables, functions, and classes are available through the module's namespace.
It is possible to use from modu import *, but this is generally considered bad practice. Using import * makes the code harder to read and makes dependencies less compartmentalized.
Very bad
from modu import *
x = sqrt(4) # Is sqrt part of modu? A builtin? Defined above?
Better
from modu import sqrt
x = sqrt(4) # sqrt may be part of modu, if not redefined in between
Best
import modu
x = modu.sqrt(4) # sqrt is visibly part of modu's namespace
Packages
Python provides a straightforward packaging system, which is simply an extension of the module mechanism to a directory.
Any directory with an __init__.py file is considered a Python package. A file modu.py in the directory pack/ is imported with import pack.modu.
Leaving an __init__.py file empty is considered normal and even good practice, if the package's modules and sub-packages do not need to share any code.
A convenient syntax is available for importing deeply nested packages: import very.deep.module as mod. This allows you to use mod in place of the verbose repetition of very.deep.module.
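To see the mechanics end to end, the snippet below builds a throwaway package in a temporary directory and imports it (the pack/modu names mirror the example above; the temporary path and ANSWER value are purely illustrative):

```python
import pathlib
import sys
import tempfile

root = pathlib.Path(tempfile.mkdtemp())
pkg = root / "pack"
pkg.mkdir()
(pkg / "__init__.py").write_text("")            # marks pack/ as a package
(pkg / "modu.py").write_text("ANSWER = 42\n")   # a module inside the package

sys.path.insert(0, str(root))                   # make the package importable

import pack.modu
print(pack.modu.ANSWER)  # 42
```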
Object-Oriented Programming
In Python, everything is an object. Functions, classes, strings, and even types are objects: they have a type, they can be passed as function arguments, and they may have methods and properties.
However, unlike Java, Python does not impose object-oriented programming as the main programming paradigm. It is perfectly viable for a Python project to not be object-oriented.
Moreover, the way Python handles modules and namespaces gives the developer a natural way to ensure encapsulation and separation of abstraction layers, both being the most common reasons to use object-orientation.
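A quick illustration that functions and types are ordinary objects (the names here are made up for the example):

```python
def shout(text):
    return text.upper()

# Functions are objects: bind another name to the same function
speak = shout
print(speak("hello"))  # HELLO

# Types are objects too, with a type of their own
print(type(42))               # <class 'int'>
print(isinstance(int, type))  # True

# Functions can be passed as arguments like any other value
print(list(map(shout, ["a", "b"])))  # ['A', 'B']
```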
There are some reasons to avoid unnecessary object-orientation. Defining custom classes is useful when we want to glue some state and some functionality together. The problem comes from the "state" part of the equation.
In some architectures, typically web applications, multiple instances of Python processes are spawned as a response to simultaneous requests. Holding state in instantiated objects is prone to concurrency problems or race conditions.
This and other issues led to the idea that using stateless functions is a better programming paradigm for many cases. A function's implicit context is made up of any global variables or items in the persistence layer that are accessed from within the function. Side-effects are the changes that a function makes to its implicit context.
Carefully isolating functions with context and side-effects from functions with logic (called pure functions) allows the following benefits:
- Pure functions are deterministic: given a fixed input, the output will always be the same.
- Pure functions are much easier to change or replace if they need to be refactored or optimized.
- Pure functions are easier to test with unit tests: less need for complex context setup and data cleaning afterwards.
- Pure functions are easier to manipulate, decorate, and pass around.
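A minimal sketch of the contrast, with illustrative names:

```python
# Impure: reads and mutates implicit context (a global), so repeated
# calls with the same argument give different results
total = 0

def add_to_total(n):
    global total
    total += n
    return total

# Pure: the result depends only on the inputs
def add(a, b):
    return a + b

print(add_to_total(5))  # 5
print(add_to_total(5))  # 10 — same input, different output
print(add(2, 3))        # 5, every time
```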
Decorators
The Python language provides a simple yet powerful syntax called "decorators". A decorator is a function or class that wraps (or decorates) a function or method. Because functions are first-class objects in Python, this can be done "manually", but using the @decorator syntax is clearer and thus preferred.
def foo():
    # do something
    pass

def decorator(func):
    # manipulate func
    return func

foo = decorator(foo)  # manually decorate

@decorator
def bar():
    # do something
    pass
# bar is decorated
This mechanism is useful for separating concerns and avoiding external unrelated logic "polluting" the core logic of the function or method. A good example is memoization or caching: you want to store the results of an expensive function and use them directly instead of recomputing.
Python's standard library includes @functools.cache (and @functools.lru_cache) for this purpose:
from functools import cache

@cache
def expensive_function(n):
    # compute_result is a placeholder for an expensive computation;
    # repeated calls with the same n return the cached result
    return compute_result(n)
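As a self-contained illustration (the Fibonacci function here is purely an example), caching turns a naive exponential call tree into one call per distinct argument:

```python
from functools import cache

calls = 0

@cache
def fib(n):
    global calls
    calls += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(20))  # 6765
print(calls)    # 21 — one call per distinct n, instead of thousands
```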
Context Managers
A context manager provides extra contextual information to an action using the with statement. The most well-known example:
with open('file.txt') as f:
    contents = f.read()
This ensures f.close() is called. There are two ways to create your own:
Using a class:
class CustomOpen:
    def __init__(self, filename):
        self.file = open(filename)

    def __enter__(self):
        return self.file

    def __exit__(self, ctx_type, ctx_value, ctx_traceback):
        self.file.close()

with CustomOpen('file') as f:
    contents = f.read()
Using a generator with contextlib:
from contextlib import contextmanager

@contextmanager
def custom_open(filename):
    f = open(filename)
    try:
        yield f
    finally:
        f.close()

with custom_open('file') as f:
    contents = f.read()
Use the class approach when there's considerable logic to encapsulate. Use the function approach for simple actions.
Dynamic Typing
Python is dynamically typed — variables do not have a fixed type. Variables are "tags" or "names" pointing to objects. This flexibility can lead to hard-to-debug code if abused.
Guidelines:
- Avoid using the same variable name for different things.
- Use descriptive names that reflect the type/purpose.
Bad
a = 1
a = 'a string'
def a():
    pass
Good
count = 1
msg = 'a string'
def func():
    pass
Using short functions or methods helps reduce the risk of using the same name for unrelated things.
Python supports optional type hints via PEP 484, which can be checked with tools like mypy or pyright. Type hints don't change runtime behavior but catch bugs during development:
def greet(name: str) -> str:
    return f"Hello, {name}"
Mutable and Immutable Types
Python has two kinds of built-in types.
Mutable types allow in-place modification: lists, dictionaries, sets.
Immutable types provide no method for changing their content: integers, strings, tuples.
my_list = [1, 2, 3]
my_list[0] = 4
print(my_list) # [4, 2, 3] — the same list has changed
x = 6
x = x + 1 # The new x is a different object
Mutable types cannot be used as dictionary keys. Use the immutable equivalent — tuples — instead.
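For example, a tuple of coordinates works as a key while the equivalent list does not (the coordinate values are illustrative):

```python
# Immutable tuple: hashable, valid as a dictionary key
locations = {(40.7, -74.0): "New York"}
print(locations[(40.7, -74.0)])  # New York

# Mutable list: unhashable, raises TypeError
try:
    bad = {[40.7, -74.0]: "New York"}
except TypeError as exc:
    print(exc)  # unhashable type: 'list'
```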
Strings are immutable. When constructing a string from parts, use join() rather than concatenation:
Bad
nums = ""
for n in range(20):
    nums += str(n)  # slow, creates a new string each time
Best
nums = [str(n) for n in range(20)]
print("".join(nums))
For simple concatenation of a few strings, + is fine. For building strings from many parts, use join(). And for string interpolation, use f-strings:
name = "World"
greeting = f"Hello, {name}!"
Further Reading
Code Style

If you ask Python programmers what they like most about Python, they will often cite its high readability. Indeed, a high level of readability is at the heart of the design of the Python language, following the recognized fact that code is read much more often than it is written.
One reason for the high readability of Python code is its relatively complete set of Code Style guidelines and "Pythonic" idioms.
When a veteran Python developer (a Pythonista) calls portions of code not "Pythonic", they usually mean that these lines of code do not follow the common guidelines and fail to express their intent in what is considered the best (read: most readable) way.
General Concepts
Explicit Code
While any kind of black magic is possible with Python, the most explicit and straightforward manner is preferred.
Bad
def make_complex(*args):
    x, y = args
    return dict(**locals())
Good
def make_complex(x, y):
    return {'x': x, 'y': y}
In the good code above, x and y are explicitly received from the caller, and an explicit dictionary is returned. The developer using this function knows exactly what to do by reading the first and last lines.
One Statement Per Line
While compound statements such as list comprehensions are appreciated for their brevity, it is bad practice to have two disjointed statements on the same line.
Bad
print('one'); print('two')
if x == 1: print('one')
if <complex comparison> and <other complex comparison>:
    # do something
Good
print('one')
print('two')
if x == 1:
    print('one')
cond1 = <complex comparison>
cond2 = <other complex comparison>
if cond1 and cond2:
    # do something
Function Arguments
Arguments can be passed to functions in four different ways:
- Positional arguments are mandatory and have no default values. For instance: send(message, recipient) or point(x, y).
- Keyword arguments are not mandatory and have default values. When a function has more than two or three positional parameters, keyword arguments with default values are helpful. For instance: send(message, to, cc=None, bcc=None). As a side note, following the YAGNI principle, it's often harder to remove an optional argument that was added "just in case" than to add one when needed.
- Arbitrary argument list (*args): if the function's intention is better expressed with an extensible number of positional arguments. However, if a function receives a list of arguments of the same nature, it's often clearer to define it as a function of one list argument.
- Arbitrary keyword argument dictionary (**kwargs): for an undetermined series of named arguments. Use these powerful techniques only when there's a proven necessity.
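All four styles can appear in one signature. A minimal sketch (send here is a made-up function, not a real API):

```python
def send(message, to, cc=None, *attachments, **headers):
    """message and to are positional; cc is a keyword argument with a
    default; *attachments collects extra positional arguments; **headers
    collects extra named arguments."""
    return {"message": message, "to": to, "cc": cc,
            "attachments": attachments, "headers": headers}

result = send("Hi", "alice", "bob", "report.pdf", priority="high")
# cc is filled positionally with "bob"; "report.pdf" lands in attachments
# as a tuple; priority ends up in the headers dictionary.
```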
Avoid the Magical Wand
Python comes with a very rich set of hooks and tools for tricky tricks — changing how objects are created, how the interpreter imports modules, even embedding C routines. However, these options have drawbacks and readability suffers greatly.
Knowing how and particularly when not to use them is very important. Like a kung fu master, a Pythonista knows how to kill with a single finger, and never to actually do it.
We Are All Responsible Users
Python allows many tricks, and some are potentially dangerous. Any client code can override an object's properties and methods — there is no "private" keyword. This philosophy is expressed by: "We are all responsible users."
The main convention for private properties and implementation details is to prefix all "internals" with an underscore. If client code breaks this rule, any misbehavior is the client code's responsibility.
Using this convention generously is encouraged: any method or property not intended for client code should be prefixed with an underscore.
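A minimal sketch of the convention (Queue here is a made-up class):

```python
class Queue:
    def __init__(self):
        self._items = []  # leading underscore: implementation detail

    def _compact(self):
        """Internal helper; client code should not call or rely on this."""
        self._items = [item for item in self._items if item is not None]

    def push(self, item):
        """Public API: no underscore prefix."""
        self._items.append(item)

    def pop(self):
        """Public API: remove and return the oldest item."""
        return self._items.pop(0)
```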
Returning Values
When a function grows in complexity, it's common to use multiple return statements. However, to keep a clear intent:
- Return early for error cases (returning None or raising an exception)
- Keep a single exit point for the main result
def complex_function(a, b, c):
    if not a:
        return None  # Raising an exception might be better
    if not b:
        return None  # Raising an exception might be better
    x = a * b * c  # Stand-in for some complex code computing x from a, b and c
    if not x:
        x = a + b + c  # Stand-in for some Plan-B computation of x
    return x  # One single exit point for the returned value
Idioms
A programming idiom is a way to write code. Idiomatic Python code is often referred to as being Pythonic. Some common Python idioms follow:
Unpacking
If you know the length of a list or tuple, you can assign names to its elements:
for index, item in enumerate(some_list):
    # do something with index and item
Swap variables:
a, b = b, a
Nested unpacking:
a, (b, c) = 1, (2, 3)
Extended unpacking:
a, *rest = [1, 2, 3]
# a = 1, rest = [2, 3]
a, *middle, c = [1, 2, 3, 4]
# a = 1, middle = [2, 3], c = 4
Create an Ignored Variable
If you need to assign something but won't use it, use __:
filename = 'foobar.txt'
basename, __, ext = filename.rpartition('.')
Many style guides recommend _ for throwaway variables, but _ conflicts with gettext.gettext and the interactive prompt's last result. Double underscore avoids these conflicts.
Create a Length-N List of the Same Thing
four_nones = [None] * 4
Create a Length-N List of Lists
Because lists are mutable, * creates N references to the same list. Use a list comprehension:
four_lists = [[] for __ in range(4)]
Create a String from a List
letters = ['s', 'p', 'a', 'm']
word = ''.join(letters)
Searching for an Item in a Collection
Sets and dictionaries use hashtables for O(1) lookups. Lists require O(n) scanning:
s = {'s', 'p', 'a', 'm'}
l = ['s', 'p', 'a', 'm']
def lookup_set(s):
    return 's' in s # Fast — O(1)

def lookup_list(l):
    return 's' in l # Slow for large lists — O(n)
Use sets or dictionaries instead of lists when:
- The collection will contain many items
- You will repeatedly search for items
- You don't have duplicate items
Use f-strings for String Formatting
F-strings (introduced in Python 3.6 by PEP 498) are the preferred way to format strings:
name = "World"
age = 30
# Best — f-strings (Python 3.6+)
greeting = f"Hello, {name}! You are {age} years old."
# OK — .format()
greeting = "Hello, {}! You are {} years old.".format(name, age)
# Avoid — % formatting (old style)
greeting = "Hello, %s! You are %d years old." % (name, age)
F-strings support expressions:
f"2 + 2 = {2 + 2}" # "2 + 2 = 4"
f"{'hello'.upper()}" # "HELLO"
f"{name!r}" # "'World'" (repr)
f"{age:>10}" # " 30" (right-aligned)
Note: for logging, prefer lazy %s-style formatting, because f-strings are evaluated immediately even when the log level means the message will never be emitted.
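To make the difference concrete, here is a small sketch. Both calls below are discarded by the level check, but only the f-string pays the formatting cost up front:

```python
import logging

logger = logging.getLogger("demo")
logger.setLevel(logging.INFO)  # DEBUG records will be dropped

value = list(range(1000))

# Eager: the f-string is formatted before logger.debug() even runs.
logger.debug(f"value: {value}")

# Lazy: the %s substitution is deferred until the record is emitted,
# which never happens here because DEBUG is below the logger's level.
logger.debug("value: %s", value)
```

Note that in both cases the arguments themselves are still evaluated; only the string formatting is deferred.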
Type Hints
Python supports optional type hints (PEP 484) which improve code clarity and enable static analysis:
def greet(name: str, age: int = 0) -> str:
    return f"Hello, {name}! You are {age} years old."
from typing import Optional
def find_user(user_id: int) -> Optional[dict]:
    """Returns user dict or None if not found."""
    ...
For modern Python (3.10+), use the cleaner syntax:
def find_user(user_id: int) -> dict | None:
    ...
names: list[str] = []
config: dict[str, int] = {}
Check types with pyright or mypy:
$ uv add --dev pyright
$ uv run pyright
Zen of Python
Also known as PEP 20, the guiding principles for Python's design:
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
PEP 8
PEP 8 is the de facto code style guide for Python. A high-quality, easy-to-read version is also available at pep8.org.
The entire Python community does their best to adhere to these guidelines. Conforming your Python code to PEP 8 is generally a good idea and helps make code more consistent when working with other developers.
Linting & Formatting Tools
Ruff (Recommended)
Ruff is a fast Python linter and formatter written in Rust. It replaces pycodestyle, flake8, isort, autopep8, yapf, and dozens of other tools with a single binary:
$ uv tool install ruff
$ ruff check . # lint for issues
$ ruff format . # auto-format
Configuration in pyproject.toml:
[tool.ruff]
line-length = 88
target-version = "py312"
[tool.ruff.lint]
select = ["E", "F", "I", "N", "W", "UP"]
Ruff can also auto-fix issues:
$ ruff check --fix .
Black
Black is the uncompromising Python code formatter. It takes the formatting debate out of your hands:
$ uv tool install black
$ black .
Black's formatting is deterministic — everyone on your team gets the same result.
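Like Ruff, Black reads its configuration from pyproject.toml (a minimal example; the values shown are illustrative):

```toml
[tool.black]
line-length = 88
target-version = ["py312"]
```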
Conventions
Check if a Variable Equals a Constant
You don't need to explicitly compare a value to True, None, or 0:
Bad
if attr == True:
    print('True!')

if attr == None:
    print('attr is None!')
Good
# Just check the value
if attr:
    print('attr is truthy!')

# Check for the opposite
if not attr:
    print('attr is falsey!')

# Explicitly check for None
if attr is None:
    print('attr is None!')
Access a Dictionary Element
Use in or dict.get():
Bad
d = {'hello': 'world'}
if d.has_key('hello'): # has_key was removed in Python 3
    print(d['hello'])
Good
d = {'hello': 'world'}
print(d.get('hello', 'default_value')) # prints 'world'
print(d.get('thingy', 'default_value')) # prints 'default_value'
if 'hello' in d:
    print(d['hello'])
Short Ways to Manipulate Lists
List comprehensions provide a powerful, concise way to work with lists. Generator expressions follow similar syntax but return a generator instead of creating a list.
Bad
# Allocates a full list in memory
valedictorian = max([(student.gpa, student.name) for student in graduates])
Good
valedictorian = max((student.gpa, student.name) for student in graduates)
Never use a list comprehension just for side effects:
Bad
[print(x) for x in sequence]
Good
for x in sequence:
    print(x)
Filtering a List
Bad — never remove items from a list while iterating:
a = [3, 4, 5]
for i in a:
    if i > 4:
        a.remove(i)
Good — use a list comprehension or generator expression:
filtered_values = [value for value in sequence if value != x]
# or as a generator:
filtered_values = (value for value in sequence if value != x)
Modifying Values in a List
Good — create a new list to avoid side effects:
a = [3, 4, 5]
b = a
a = [i + 3 for i in a] # b is unchanged
Use enumerate to track position:
a = [3, 4, 5]
for i, item in enumerate(a):
    print(i, item)
Read From a File
Use the with open syntax:
Bad
f = open('file.txt')
a = f.read()
print(a)
f.close()
Good
with open('file.txt') as f:
    for line in f:
        print(line)
Line Continuations
Use parentheses for line continuations instead of backslashes:
Bad
my_very_big_string = """For a long time I used to go to bed early. Sometimes, \
when I had put out my candle, my eyes would close so quickly that I had not even \
time to say "I'm going to sleep.""""
from some.deep.module.inside.a.module import a_nice_function, another_nice_function, \
yet_another_nice_function
Good
my_very_big_string = (
    "For a long time I used to go to bed early. Sometimes, "
    "when I had put out my candle, my eyes would close so quickly "
    "that I had not even time to say \"I'm going to sleep.\""
)

from some.deep.module.inside.a.module import (
    a_nice_function, another_nice_function, yet_another_nice_function
)
Having to split a long logical line is often a sign you're trying to do too many things at once.
Reading Great Code

One of the secrets of becoming a great Python programmer is to read, understand, and comprehend excellent code.
Excellent code typically follows the guidelines outlined in Code Style, and does its best to express a clear and concise intent to the reader.
Here is a list of recommended Python projects for reading. Each one is a paragon of Python coding.
- Howdoi — a code search tool written in Python.
- Flask — a microframework for Python based on Werkzeug and Jinja2.
- Requests — an Apache2 Licensed HTTP library for human beings.
- Tablib — a format-agnostic tabular dataset library.
- httpx — a next-generation HTTP client for Python.
- Rich — a library for rich text and beautiful formatting in the terminal.
- Pydantic — data validation using Python type hints.
Documentation

Readability is a primary focus for Python developers, in both project and code documentation. Following some simple best practices can save both you and others a lot of time.
Project Documentation
A README file at the root directory should give general information to both users and maintainers. It should be written in Markdown or reStructuredText and contain:
- A few lines explaining the purpose of the project
- The URL of the main source
- Basic credit and license information
- Installation instructions
An INSTALL section is less necessary with modern Python. Installation is often reduced to one command:
$ uv add mypackage
# or
$ uv pip install mypackage
A LICENSE file should always be present and specify the license under which the software is made available.
A CHANGELOG file or section should compile a short overview of changes for the latest versions.
Project Publication
Depending on the project, your documentation might include:
- An introduction — a very short overview with one or two simplified use cases. The thirty-second pitch.
- A tutorial — primary use cases in detail, step-by-step.
- An API reference — typically generated from docstrings. Lists all public interfaces, parameters, and return values.
- Developer documentation — for potential contributors. Code conventions and design strategy.
Sphinx
Sphinx is the most popular Python documentation tool. It converts reStructuredText into HTML, LaTeX, man pages, and more.
Free hosting is available at Read the Docs. Configure it with commit hooks for automatic rebuilds.
MkDocs
MkDocs is a fast, simple static site generator for building project documentation with Markdown. It's a popular alternative to Sphinx, especially for projects already using Markdown.
reStructuredText
Most Python documentation is written with reStructuredText. It's like Markdown but with more built-in extensions.
See the reStructuredText Primer for syntax help.
Code Documentation Advice
Comments clarify the code and are added to make the code easier to understand. In Python, comments begin with #.
Docstrings
In Python, docstrings describe modules, classes, and functions:
def square_and_rooter(x):
    """Return the square root of self times self."""
    ...
Follow PEP 8 for comments and PEP 257 for docstring conventions.
Commenting Sections of Code
Do not use triple-quote strings to comment code. Line-oriented tools like grep won't be aware the code is inactive. Use # at the proper indentation level. Your editor has a comment/uncomment toggle — learn it.
Docstrings and Magic
Tools like Sphinx parse docstrings as reStructuredText and render them as HTML. This makes it easy to embed example code.
Doctest reads embedded docstrings that look like Python REPL sessions and runs them, verifying the output matches:
def my_function(a, b):
    """
    >>> my_function(2, 3)
    6
    >>> my_function('a', 3)
    'aaa'
    """
    return a * b
Writing Docstrings
For simple functions, a one-line docstring is appropriate:
def add(a, b):
    """Add two numbers and return the result."""
    return a + b
For more complex code, the NumPy style is popular:
def random_number_generator(arg1, arg2):
    """
    Summary line.

    Extended description of function.

    Parameters
    ----------
    arg1 : int
        Description of arg1
    arg2 : str
        Description of arg2

    Returns
    -------
    int
        Description of return value
    """
    return 42
The sphinx.ext.napoleon plugin lets Sphinx parse this style.
Google style is another popular option:
def function_with_types(arg1: str, arg2: int) -> bool:
    """Summary line.

    Args:
        arg1: Description of arg1.
        arg2: Description of arg2.

    Returns:
        Description of return value.

    Raises:
        ValueError: If arg2 is negative.
    """
    ...
At the end of the day, it doesn't matter which style you use — as long as it's correct, understandable, and gets the relevant points across.
Other Tools
- Pycco — literate-programming-style documentation generator
- MkDocs — fast, simple static site generator for Markdown documentation
- Jupyter Book — build books and docs from Markdown and notebooks
Testing Your Code

Testing your code is very important.
Getting used to writing testing code and running it in parallel is now considered a good habit. Used wisely, this method helps define your code's intent more precisely and leads to a more decoupled architecture.
Some general rules of testing:
- A testing unit should focus on one tiny bit of functionality and prove it correct.
- Each test unit must be fully independent. Each test must be able to run alone and within the test suite, regardless of order. Use setUp() and tearDown() (or setup/teardown fixtures) to manage test data.
- Try hard to make tests that run fast. If a single test needs more than a few milliseconds, development will slow down. Keep heavier tests in a separate suite.
- Learn your tools and learn how to run a single test or test case.
- Always run the full test suite before and after a coding session.
- Implement a hook that runs all tests before pushing code.
- If you're in the middle of development and have to interrupt, write a broken unit test about what you want to develop next. You'll have a pointer when you return.
- The first step when debugging is to write a new test pinpointing the bug.
- Use long and descriptive names for testing functions: test_square_of_number_2(), test_square_negative_number().
- Testing code is read as much as running code. A unit test with unclear purpose is not very helpful.
- Testing code serves as an introduction to new developers. Running and reading tests is often the best way to learn a codebase.
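As an illustration of the naming advice above, a test file might look like this (square is a stand-in function under test):

```python
def square(x):
    return x * x

# Long, descriptive names: the failure report alone tells you what broke.
def test_square_of_number_2():
    assert square(2) == 4

def test_square_negative_number():
    assert square(-2) == 4
```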
The Basics
unittest
unittest is the batteries-included test module in the Python standard library.
import unittest
def fun(x):
    return x + 1

class MyTest(unittest.TestCase):
    def test(self):
        self.assertEqual(fun(3), 4)
Doctest
The doctest module searches for interactive Python sessions in docstrings and verifies they work as shown.
Doctests are useful as expressive documentation of main use cases, though they're less detailed than proper unit tests:
def square(x):
    """Return the square of x.

    >>> square(2)
    4
    >>> square(-2)
    4
    """
    return x * x

if __name__ == '__main__':
    import doctest
    doctest.testmod()
Tools
pytest (Recommended)
pytest is a no-boilerplate alternative to unittest. It's the most popular Python testing framework:
$ uv add --dev pytest
Creating a test suite is as easy as writing a module with functions:
# content of test_sample.py
def func(x):
    return x + 1
def test_answer():
    assert func(3) == 5
Run with:
$ uv run pytest
=========================== test session starts ============================
platform darwin -- Python 3.13.0 -- pytest-8.3.4
collected 1 item
test_sample.py F
================================= FAILURES =================================
_______________________________ test_answer ________________________________
def test_answer():
> assert func(3) == 5
E assert 4 == 5
E + where 4 = func(3)
test_sample.py:5: AssertionError
========================= 1 failed in 0.02s =========================
Far less work than the equivalent with unittest!
Hypothesis
Hypothesis lets you write tests parameterized by a source of examples. It generates simple, comprehensible examples that make your tests fail:
$ uv add --dev hypothesis
from hypothesis import given, strategies as st
@given(st.lists(st.floats(allow_nan=False, allow_infinity=False), min_size=1))
def test_mean(xs):
    mean = sum(xs) / len(xs)
    assert min(xs) <= mean <= max(xs)
Hypothesis will find bugs that escape all other forms of testing.
tox
tox automates test environment management and testing against multiple Python versions:
$ uv add --dev tox
Configure in pyproject.toml or tox.ini. Modern alternative: nox.
unittest.mock
unittest.mock is in the standard library since Python 3.3 — no installation needed.
It allows you to replace parts of your system under test with mock objects:
from unittest.mock import MagicMock
thing = ProductionClass()
thing.method = MagicMock(return_value=3)
thing.method(3, 4, 5, key='value')
thing.method.assert_called_with(3, 4, 5, key='value')
Use the patch decorator to mock classes or objects during a test:
from unittest.mock import patch
def mock_search(self):
    class MockSearchQuerySet(SearchQuerySet):
        def __iter__(self):
            return iter(["foo", "bar", "baz"])
    return MockSearchQuerySet()

@patch('myapp.SearchForm.search', mock_search)
def test_new_watchlist_activities(self):
    self.assertEqual(len(myapp.get_search_results(q="fish")), 3)
Logging

The logging module has been a part of Python's standard library since the early days of Python. It is succinctly described in PEP 282. The documentation is notoriously hard to read, except for the basic logging tutorial.
As an alternative, loguru provides an approach to logging that is nearly as simple as using a plain print statement.
Logging serves two purposes:
- Diagnostic logging records events related to the application's operation. If a user calls in to report an error, for example, the logs can be searched for context.
- Audit logging records events for business analysis. A user's transactions can be extracted and combined with other user details for reports or to optimize a business goal.
... or Print?
The only time that print is a better option than logging is when the goal is to display a help statement for a command line application. Other reasons why logging is better than print:
- The log record, which is created with every logging event, contains readily available diagnostic information such as the file name, full path, function, and line number of the logging event.
- Events logged in included modules are automatically accessible via the root logger to your application's logging stream, unless you filter them out.
- Logging can be selectively silenced by using the method logging.Logger.setLevel or disabled by setting the attribute logging.Logger.disabled to True.
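For example, to quiet one chatty logger subtree while leaving the rest of the application at a verbose level (noisy_library is a made-up logger name):

```python
import logging

logging.basicConfig(level=logging.DEBUG)

# Selectively silence: raise the threshold for one subtree only.
logging.getLogger("noisy_library").setLevel(logging.WARNING)

# Or disable a logger outright.
logging.getLogger("noisy_library.internal").disabled = True
```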
Logging in a Library
Notes for configuring logging for a library are in the logging tutorial. Because the user, not the library, should dictate what happens when a logging event occurs, one admonition bears repeating:
Note: it is strongly advised that you do not add any handlers other than NullHandler to your library's loggers.
Best practice when instantiating loggers in a library is to only create them using the __name__ global variable: the logging module creates a hierarchy of loggers using dot notation, so using __name__ ensures no name collisions.
Here is an example of the best practice from the requests source -- place this in your __init__.py:
import logging
logging.getLogger(__name__).addHandler(logging.NullHandler())
Logging in an Application
The twelve factor app, an authoritative reference for good practice in application development, contains a section on logging best practice. It emphatically advocates for treating log events as an event stream, and for sending that event stream to standard output to be handled by the application environment.
There are at least three ways to configure a logger:
Using an INI-formatted file:
- Pro: possible to update configuration while running, using the function logging.config.listen to listen on a socket.
- Con: less control (e.g. custom subclassed filters or loggers) than possible when configuring a logger in code.

Using a dictionary or a JSON-formatted file:

- Pro: in addition to updating while running, it is possible to load from a file using the json module, in the standard library.
- Con: less control than when configuring a logger in code.

Using code:

- Pro: complete control over the configuration.
- Con: modifications require a change to the source code.
Example Configuration via an INI File
Let us say that the file is named logging_config.ini. More details for the file format are in the logging configuration section of the logging tutorial.
[loggers]
keys=root
[handlers]
keys=stream_handler
[formatters]
keys=formatter
[logger_root]
level=DEBUG
handlers=stream_handler
[handler_stream_handler]
class=StreamHandler
level=DEBUG
formatter=formatter
args=(sys.stderr,)
[formatter_formatter]
format=%(asctime)s %(name)-12s %(levelname)-8s %(message)s
Then use logging.config.fileConfig in the code:
import logging
from logging.config import fileConfig
fileConfig('logging_config.ini')
logger = logging.getLogger()
logger.debug('often makes a very good meal of %s', 'visiting tourists')
Example Configuration via a Dictionary
You can use a dictionary with configuration details. PEP 391 contains a list of the mandatory and optional elements in the configuration dictionary.
import logging
from logging.config import dictConfig
logging_config = dict(
    version=1,
    formatters={
        'f': {'format':
              '%(asctime)s %(name)-12s %(levelname)-8s %(message)s'}
    },
    handlers={
        'h': {'class': 'logging.StreamHandler',
              'formatter': 'f',
              'level': logging.DEBUG}
    },
    root={
        'handlers': ['h'],
        'level': logging.DEBUG,
    },
)
dictConfig(logging_config)
logger = logging.getLogger()
logger.debug('often makes a very good meal of %s', 'visiting tourists')
Example Configuration Directly in Code
import logging
logger = logging.getLogger()
handler = logging.StreamHandler()
formatter = logging.Formatter(
    '%(asctime)s %(name)-12s %(levelname)-8s %(message)s')
handler.setFormatter(formatter)
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
logger.debug('often makes a very good meal of %s', 'visiting tourists')
Common Gotchas

For the most part, Python aims to be a clean and consistent language that avoids surprises. However, there are a few cases that can be confusing for newcomers.
Some of these cases are intentional but can be potentially surprising. Some could arguably be considered language warts. In general, what follows is a collection of potentially tricky behavior that might seem strange at first glance, but is generally sensible once you're aware of the underlying cause for the surprise.
Mutable Default Arguments
Seemingly the most common surprise new Python programmers encounter is Python's treatment of mutable default arguments in function definitions.
What You Wrote
def append_to(element, to=[]):
    to.append(element)
    return to
What You Might Have Expected to Happen
my_list = append_to(12)
print(my_list)
my_other_list = append_to(42)
print(my_other_list)
A new list is created each time the function is called if a second argument isn't provided, so that the output is:
[12]
[42]
What Actually Happens
[12]
[12, 42]
A new list is created once when the function is defined, and the same list is used in each successive call.
Python's default arguments are evaluated once when the function is defined, not each time the function is called (as they are in, say, Ruby). This means that if you use a mutable default argument and mutate it, you will have mutated that object for all future calls to the function as well.
What You Should Do Instead
Create a new object each time the function is called, by using a default arg to signal that no argument was provided (None is often a good choice).
def append_to(element, to=None):
    if to is None:
        to = []
    to.append(element)
    return to
Remember that a caller can still pass its own list as the second argument; in that case the caller's list is used and mutated.
When the Gotcha Isn't a Gotcha
Sometimes you can specifically "exploit" (read: use as intended) this behavior to maintain state between calls of a function. This is often done when writing a caching function.
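A sketch of that caching pattern: the default dictionary is created once, at definition time, and deliberately shared across calls:

```python
def fib(n, _cache={}):
    """Memoized Fibonacci; _cache persists between calls on purpose."""
    if n in _cache:
        return _cache[n]
    result = n if n < 2 else fib(n - 1) + fib(n - 2)
    _cache[n] = result
    return result

print(fib(30))  # 832040, computed quickly thanks to the shared cache
```

In modern code, functools.lru_cache achieves the same effect more explicitly.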
Late Binding Closures
Another common source of confusion is the way Python binds its variables in closures (or in the surrounding global scope).
What You Wrote
def create_multipliers():
    return [lambda x: i * x for i in range(5)]
What You Might Have Expected to Happen
for multiplier in create_multipliers():
    print(multiplier(2))
A list containing five functions that each have their own closed-over i variable that multiplies their argument, producing:
0
2
4
6
8
What Actually Happens
8
8
8
8
8
Five functions are created; instead, all of them just multiply x by 4.
Python's closures are late binding. This means that the values of variables used in closures are looked up at the time the inner function is called.
Here, whenever any of the returned functions are called, the value of i is looked up in the surrounding scope at call time. By then, the loop has completed and i is left with its final value of 4.
What's particularly nasty about this gotcha is the seemingly prevalent misinformation that this has something to do with lambdas in Python. Functions created with a lambda expression are in no way special; the exact same behavior is exhibited by using an ordinary def:
def create_multipliers():
    multipliers = []
    for i in range(5):
        def multiplier(x):
            return i * x
        multipliers.append(multiplier)
    return multipliers
What You Should Do Instead
The most general solution is arguably a bit of a hack. Due to Python's aforementioned behavior concerning evaluating default arguments to functions (see Mutable Default Arguments above), you can create a closure that binds immediately to its arguments by using a default argument like so:
def create_multipliers():
    return [lambda x, i=i: i * x for i in range(5)]
Alternatively, you can use the functools.partial function:
from functools import partial
from operator import mul
def create_multipliers():
    return [partial(mul, i) for i in range(5)]
When the Gotcha Isn't a Gotcha
Sometimes you want your closures to behave this way. Late binding is good in lots of situations. Looping to create unique functions is unfortunately a case where they can cause hiccups.
Bytecode (.pyc) Files Everywhere!
By default, when executing Python code from files, the Python interpreter will automatically write a bytecode version of that file to disk, e.g. __pycache__/module.cpython-313.pyc.
These .pyc files should not be checked into your source code repositories.
Theoretically, this behavior is on by default for performance reasons. Without these bytecode files, Python would re-generate the bytecode every time the file is loaded.
Disabling Bytecode (.pyc) Files
Luckily, the process of generating the bytecode is extremely fast, and isn't something you need to worry about while developing your code.
Those files are annoying, so let's get rid of them!
$ export PYTHONDONTWRITEBYTECODE=1
With the $PYTHONDONTWRITEBYTECODE environment variable set, Python will no longer write these files to disk, and your development environment will remain nice and clean.
I recommend setting this environment variable in your ~/.profile.
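The same switch is also available at runtime through the sys module, scoped to the current process:

```python
import sys

# Equivalent to setting PYTHONDONTWRITEBYTECODE, but only for this process.
sys.dont_write_bytecode = True
```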
Removing Bytecode (.pyc) Files
Here's a nice trick for removing all of these files, if they already exist:
$ find . -type f -name "*.py[co]" -delete -or -type d -name "__pycache__" -delete
Run that from the root directory of your project, and all .pyc files will suddenly vanish. Much better.
Version Control Ignores
If you still need the .pyc files for performance reasons, you can always add them to the ignore files of your version control repositories. Popular version control systems have the ability to use wildcards defined in a file to apply special rules.
An ignore file will make sure the matching files don't get checked into the repository. Git uses .gitignore while Mercurial uses .hgignore.
At the minimum your ignore files should look like this.
syntax:glob # This line is not needed for .gitignore files.
*.py[cod] # Will match .pyc, .pyo and .pyd files.
__pycache__/ # Exclude the whole folder
You may wish to include more files and directories depending on your needs. The next time you commit to the repository, these files will not be included.
Choosing a License

Your source publication needs a license. In the US, unless a license is specified, users have no legal right to download, modify, or distribute the product. Furthermore, people can't contribute to your code unless you tell them what rules to play by. Choosing a license is complicated, so here are some pointers:
Open source. There are plenty of open source licenses available to choose from.
In general, these licenses tend to fall into one of two categories:
- licenses that focus more on the user's freedom to do with the software as they please (these are the more permissive open source licenses such as the MIT, BSD, and Apache)
- licenses that focus more on making sure that the code itself --- including any changes made to it and distributed along with it --- always remains free (these are the less permissive free software licenses such as the GPL and LGPL)
The latter are less permissive in the sense that they don't permit someone to add code to the software and distribute it without also including the source code for their changes.
To help you choose one for your project, there's a license chooser; use it.
More Permissive:
- PSFL (Python Software Foundation License) -- for contributing to Python itself
- MIT / BSD / ISC
- MIT (X11)
- New BSD
- ISC
- Apache
Less Permissive:
- LGPL
- GPL
- GPLv2
- GPLv3
A good overview of licenses with explanations of what one can, cannot, and must do using a particular software can be found at tl;drLegal.
Network Applications

HTTP
The Hypertext Transfer Protocol (HTTP) is an application protocol for distributed, collaborative, hypermedia information systems. HTTP is the foundation of data communication for the World Wide Web.
Requests
Python's standard library provides urllib.request for HTTP, but the API is verbose. For most use cases, a third-party library is recommended.
Requests takes all of the work out of Python HTTP --- making your integration with web services seamless. There's no need to manually add query strings to your URLs, or to form-encode your POST data. Keep-alive and HTTP connection pooling are 100% automatic, powered by urllib3, which is embedded within Requests.
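The automatic query-string handling, for instance, can be seen without even sending a request, using Requests' prepared-request machinery (the httpbin.org URL here is only a placeholder and is never contacted):

```python
import requests

# Build (but don't send) a GET request; Requests encodes the params for us.
req = requests.Request('GET', 'https://httpbin.org/get', params={'q': 'python'})
prepared = req.prepare()
print(prepared.url)  # → https://httpbin.org/get?q=python
```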
Distributed Systems
ZeroMQ
ØMQ (also spelled ZeroMQ, 0MQ or ZMQ) is a high-performance asynchronous messaging library aimed at use in scalable distributed or concurrent applications. It provides a message queue, but unlike message-oriented middleware, a ØMQ system can run without a dedicated message broker. The library is designed to have a familiar socket-style API.
RabbitMQ
RabbitMQ is an open source message broker software that implements the Advanced Message Queuing Protocol (AMQP). The RabbitMQ server is written in the Erlang programming language and is built on the Open Telecom Platform framework for clustering and failover. Client libraries to interface with the broker are available for all major programming languages.
Web Applications & Frameworks

As a powerful scripting language adapted to both fast prototyping and bigger projects, Python is widely used in web application development.
Context
WSGI
The Web Server Gateway Interface (or "WSGI" for short) is a standard interface between web servers and Python web application frameworks. By standardizing behavior and communication between web servers and Python web frameworks, WSGI makes it possible to write portable Python web code that can be deployed in any WSGI-compliant web server. WSGI is documented in PEP 3333.
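The interface itself is small enough to show in full. Here is a minimal sketch of a complete WSGI application, exercised directly with the standard library's wsgiref helpers instead of a real server:

```python
from wsgiref.util import setup_testing_defaults

def application(environ, start_response):
    """A complete WSGI application: a callable that takes the CGI-style
    environ dict and a start_response callback, and returns the body."""
    start_response('200 OK', [('Content-Type', 'text/plain; charset=utf-8')])
    return [b'Hello, WSGI!']

# Exercise the app directly, without any server:
environ = {}
setup_testing_defaults(environ)
status_holder = []
body = application(environ, lambda status, headers: status_holder.append(status))
print(status_holder[0], b''.join(body))
```

Any WSGI-compliant server can host such a callable, e.g. `gunicorn mymodule:application` (where `mymodule` is a hypothetical module name).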
Frameworks
Broadly speaking, a web framework consists of a set of libraries and a main handler within which you can build custom code to implement a web application (i.e. an interactive web site). Most web frameworks include patterns and utilities to accomplish at least the following:
URL Routing
: Matches an incoming HTTP request to a particular piece of Python code to be invoked
Request and Response Objects
: Encapsulates the information received from or sent to a user's browser
Template Engine
: Allows for separating Python code implementing an application's logic from the HTML (or other) output that it produces
Development Web Server
: Runs an HTTP server on development machines to enable rapid development; often automatically reloads server-side code when files are updated
Django
Django is a "batteries included" web application framework, and is an excellent choice for creating content-oriented websites. By providing many utilities and patterns out of the box, Django aims to make it possible to build complex, database-backed web applications quickly, while encouraging best practices in code written using it.
Django has a large and active community, and many pre-built re-usable modules that can be incorporated into a new project as-is, or customized to fit your needs.
There are annual Django conferences in the United States, Europe, and Australia.
The majority of new Python web applications today are built with Django.
Flask
Flask is a "microframework" for Python, and is an excellent choice for building smaller applications, APIs, and web services.
Building an app with Flask is a lot like writing standard Python modules, except some functions have routes attached to them. It's really beautiful.
Rather than aiming to provide everything you could possibly need, Flask implements the most commonly-used core components of a web application framework, like URL routing, request and response objects, and templates.
If you use Flask, it is up to you to choose other components for your application, if any. For example, database access or form generation and validation are not built-in functions of Flask.
This is great, because many web applications don't need those features. For those that do, there are many Extensions available that may suit your needs. Or, you can easily use any library you want yourself!
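To make that concrete, a complete Flask application can be a single short module (a sketch; the routes are invented for illustration):

```python
from flask import Flask

app = Flask(__name__)

@app.route('/')
def index():
    return 'Hello, World!'

@app.route('/greet/<name>')
def greet(name):
    # The <name> URL segment is passed in as a function argument.
    return f'Hello, {name}!'
```

During development you would typically run this with `flask run` or `app.run()`.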
Flask is the default choice for any Python web application that isn't a good fit for Django.
Falcon
Falcon is a good choice when your goal is to build RESTful API microservices that are fast and scalable.
It is a reliable, high-performance Python web framework for building large-scale app backends and microservices. Falcon encourages the REST architectural style of mapping URIs to resources, trying to do as little as possible while remaining highly effective.
Falcon highlights four main focuses: speed, reliability, flexibility, and debuggability. It implements HTTP through "responders" such as on_get(), on_put(), etc. These responders receive intuitive request and response objects.
Tornado
Tornado is an asynchronous web framework for Python that has its own event loop. This allows it to natively support WebSockets, for example. Well-written Tornado applications are known to have excellent performance characteristics.
I do not recommend using Tornado unless you think you need it.
Pyramid
Pyramid is a very flexible framework with a heavy focus on modularity. It comes with a small number of libraries ("batteries") built in, and encourages users to extend its base functionality. A set of provided cookiecutter templates helps users make decisions when starting a new project. Pyramid powers one of the most important pieces of Python infrastructure: PyPI, the Python Package Index.
Pyramid does not have a large user base, unlike Django and Flask. It's a capable framework, but not a very popular choice for new Python web applications today.
Masonite
Masonite is a modern and developer centric, "batteries included", web framework.
The Masonite framework follows the MVC (Model-View-Controller) architecture pattern and is heavily inspired by frameworks such as Rails and Laravel, so if you are coming to Python from a Ruby or PHP background then you will feel right at home!
Masonite comes with a lot of functionality out of the box including a powerful IOC container with auto resolving dependency injection, craft command line tools, and the Orator active record style ORM.
Masonite is perfect for beginners or experienced developers alike and works hard to be fast and easy from install through to deployment. Try it once and you'll fall in love.
FastAPI
FastAPI is a modern web framework for building APIs with Python 3.6+.
It has very high performance as it is based on Starlette and Pydantic.
FastAPI takes advantage of standard Python type declarations in function parameters to declare request parameters and bodies, perform data conversion (serialization, parsing), data validation, and automatic API documentation with OpenAPI 3 (including JSON Schema).
It includes tools and utilities for security and authentication (including OAuth2 with JWT tokens), a dependency injection system, automatic generation of interactive API documentation, and other features.
Web Servers
Nginx
Nginx (pronounced "engine-x") is a web server and reverse-proxy for HTTP, SMTP, and other protocols. It is known for its high performance, relative simplicity, and compatibility with many application servers (like WSGI servers). It also includes handy features like load-balancing, basic authentication, streaming, and others. Designed to serve high-load websites, Nginx is gradually becoming quite popular.
WSGI Servers
Stand-alone WSGI servers typically use fewer resources than traditional web servers and provide top performance.
Gunicorn
Gunicorn (Green Unicorn) is a pure-Python WSGI server used to serve Python applications. Unlike other Python web servers, it has a thoughtful user interface, and is extremely easy to use and configure.
Gunicorn has sane and reasonable defaults for configurations. However, some other servers, like uWSGI, are tremendously more customizable, and therefore, are much more difficult to effectively use.
Gunicorn is the recommended choice for new Python web applications today.
Waitress
Waitress is a pure-Python WSGI server that claims "very acceptable performance". Its documentation is not very detailed, but it does offer some nice functionality that Gunicorn doesn't have (e.g. HTTP request buffering).
Waitress is gaining popularity within the Python web development community.
uWSGI
uWSGI is a full stack for building hosting services. In addition to process management, process monitoring, and other functionality, uWSGI acts as an application server for various programming languages and protocols -- including Python and WSGI. uWSGI can either be run as a stand-alone web router, or be run behind a full web server (such as Nginx or Apache). In the latter case, a web server can configure uWSGI and an application's operation over the uwsgi protocol. uWSGI's web server support allows for dynamically configuring Python, passing environment variables, and further tuning. For full details, see uWSGI magic variables.
I do not recommend using uWSGI unless you know why you need it.
Server Best Practices
The majority of self-hosted Python applications today are hosted with a WSGI server such as Gunicorn, either directly or behind a lightweight web server such as Nginx.
The WSGI servers serve the Python applications while the web server handles tasks better suited for it such as static file serving, request routing, DDoS protection, and basic authentication.
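That split of responsibilities might look like this in an Nginx site configuration (a hedged sketch; the domain, paths, and the upstream port are all assumptions):

```nginx
server {
    listen 80;
    server_name example.com;

    # Let Nginx serve static files directly...
    location /static/ {
        alias /var/www/myapp/static/;
    }

    # ...and proxy everything else to the WSGI server (e.g. Gunicorn).
    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```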
Hosting
Platform-as-a-Service (PaaS) is a type of cloud computing infrastructure which abstracts and manages infrastructure, routing, and scaling of web applications. When using a PaaS, application developers can focus on writing application code rather than needing to be concerned with deployment details.
Heroku
Heroku offers first-class support for Python applications.
Heroku supports all types of Python web applications, servers, and frameworks. You can start with a small application and upgrade to larger dyno types as it grows toward production.
Heroku maintains detailed articles on using Python with Heroku, as well as step-by-step instructions on how to set up your first application.
Heroku is the recommended PaaS for deploying Python web applications today.
Templating
Most WSGI applications are responding to HTTP requests to serve content in HTML or other markup languages. Instead of directly generating textual content from Python, the concept of separation of concerns advises us to use templates. A template engine manages a suite of template files, with a system of hierarchy and inclusion to avoid unnecessary repetition, and is in charge of rendering (generating) the actual content, filling the static content of the templates with the dynamic content generated by the application.
As template files are sometimes written by designers or front-end developers, it can be difficult to handle increasing complexity.
Some general good practices apply to the part of the application passing dynamic content to the template engine, and to the templates themselves.
- Template files should be passed only the dynamic content that is needed for rendering the template. Avoid the temptation to pass additional content "just in case": it is easier to add some missing variable when needed than to remove a likely unused variable later.
- Many template engines allow for complex statements or assignments in the template itself, and many allow some Python code to be evaluated in the templates. This convenience can lead to uncontrolled increase in complexity, and often make it harder to find bugs.
- It is often necessary to mix JavaScript templates with HTML templates. A sane approach to this design is to isolate the parts where the HTML template passes some variable content to the JavaScript code.
Jinja2
Jinja2 is a very well-regarded template engine.
It uses a text-based template language and can thus be used to generate any type of markup, not just HTML. It allows customization of filters, tags, tests, and globals. It features many improvements over Django"s templating system.
Here are some important tags in Jinja2:
{# This is a comment #}
{# The next tag is a variable output: #}
{{title}}
{# Tag for a block, can be replaced through inheritance with other html code #}
{% block head %}
<h1>This is the head!</h1>
{% endblock %}
{# Output of an array as an iteration #}
{% for item in list %}
<li>{{ item }}</li>
{% endfor %}
The next listings are an example of a web site in combination with the Tornado web server. Tornado is not very complicated to use.
# import Jinja2
from jinja2 import Environment, FileSystemLoader
# import Tornado
import tornado.ioloop
import tornado.web

# Load template file templates/site.html
TEMPLATE_FILE = "site.html"
templateLoader = FileSystemLoader(searchpath="templates/")
templateEnv = Environment(loader=templateLoader)
template = templateEnv.get_template(TEMPLATE_FILE)

# List for famous movie rendering
movie_list = [[1, "The Hitchhiker's Guide to the Galaxy"],
              [2, "Back to the Future"],
              [3, "The Matrix"]]

# template.render() returns a string containing the rendered HTML
html_output = template.render(list=movie_list,
                              title="Here is my favorite movie list")

# Handler for the main page
class MainHandler(tornado.web.RequestHandler):
    def get(self):
        # Return the rendered template string to the browser
        self.write(html_output)

# Assign the handler to the server root (127.0.0.1:PORT/)
application = tornado.web.Application([
    (r"/", MainHandler),
])
PORT = 8884

if __name__ == "__main__":
    # Set up the server
    application.listen(PORT)
    tornado.ioloop.IOLoop.instance().start()
The base.html file can be used as base for all site pages which are for example implemented in the content block.
<!DOCTYPE html>
<html lang="en">
<head>
<link rel="stylesheet" href="style.css" />
<title>{{title}} - My Webpage</title>
</head>
<body>
<div id="content">
{# In the next line the content from the site.html template will be added #}
{% block content %}{% endblock %}
</div>
<div id="footer">
{% block footer %}
© Copyright 2013 by <a href="http://domain.invalid/">you</a>.
{% endblock %}
</div>
</body>
</html>
The next listing is our site page (site.html) loaded in the Python app which extends base.html. The content block is automatically set into the corresponding block in the base.html page.
{% extends "base.html" %}
{% block content %}
<div id="content">
    <h2>{{ title }}</h2>
    <ul>
    {% for item in list %}
        <li>{{ item[0] }} : {{ item[1] }}</li>
    {% endfor %}
    </ul>
</div>
{% endblock %}
Jinja2 is the recommended templating library for new Python web applications.
Chameleon
Chameleon Page Templates are an HTML/XML template engine implementation of the Template Attribute Language (TAL), TAL Expression Syntax (TALES), and Macro Expansion TAL (Metal) syntaxes.
Chameleon is available for Python 3.x and PyPy, and is commonly used by the Pyramid Framework.
Page Templates add within your document structure special element attributes and text markup. Using a set of simple language constructs, you control the document flow, element repetition, text replacement, and translation. Because of the attribute-based syntax, unrendered page templates are valid HTML and can be viewed in a browser and even edited in WYSIWYG editors. This can make round-trip collaboration with designers and prototyping with static files in a browser easier.
The basic TAL language is simple enough to grasp from an example:
<html>
<body>
<h1>Hello, <span tal:replace="context.name">World</span>!</h1>
<table>
<tr tal:repeat="row 'apple', 'banana', 'pineapple'">
<td tal:repeat="col 'juice', 'muffin', 'pie'">
<span tal:replace="row.capitalize()" /> <span tal:replace="col" />
</td>
</tr>
</table>
</body>
</html>
The [<span tal:replace="expression" />] pattern for text insertion is common enough that if you do not require strict validity in your unrendered templates, you can replace it with a more terse and readable syntax that uses the pattern [${expression}], as follows:
<html>
<body>
<h1>Hello, ${world}!</h1>
<table>
<tr tal:repeat="row 'apple', 'banana', 'pineapple'">
<td tal:repeat="col 'juice', 'muffin', 'pie'">
${row.capitalize()} ${col}
</td>
</tr>
</table>
</body>
</html>
But keep in mind that the full [<span tal:replace="expression">Default Text</span>] syntax also allows for default content in the unrendered template.
Chameleon comes from the Pyramid world and is not widely used outside it.
Mako
Mako is a template language that compiles to Python for maximum performance. Its syntax and API are borrowed from the best parts of other templating languages like Django and Jinja2 templates. It is the default template language included with the Pylons and Pyramid web frameworks.
An example template in Mako looks like:
<%inherit file="base.html"/>
<%
rows = [[v for v in range(0,10)] for row in range(0,10)]
%>
<table>
% for row in rows:
${makerow(row)}
% endfor
</table>
<%def name="makerow(row)">
<tr>
% for name in row:
<td>${name}</td>\
% endfor
</tr>
</%def>
To render a very basic template, you can do the following:
from mako.template import Template
print(Template("hello ${data}!").render(data="world"))
Mako is well respected within the Python web community.
HTML Scraping

Web Scraping
Web sites are written using HTML, which means that each web page is a structured document. Sometimes it would be great to obtain some data from them and preserve the structure while we're at it. Web sites don't always provide their data in comfortable formats such as CSV or JSON.
This is where web scraping comes in. Web scraping is the practice of using a computer program to sift through a web page and gather the data that you need in a format most useful to you while at the same time preserving the structure of the data.
lxml and Requests
lxml is a pretty extensive library written for parsing XML and HTML documents very quickly, even handling messed up tags in the process. We will also be using the Requests module as it is more intuitive and feature-rich. Install both with uv add lxml requests.
Let's start with the imports:
from lxml import html
import requests
Next we will use requests.get to retrieve the web page with our data, parse it using the html module, and save the results in tree:
page = requests.get('http://econpy.pythonanywhere.com/ex/001.html')
tree = html.fromstring(page.content)
(We need to use page.content rather than page.text because html.fromstring implicitly expects bytes as input.)
tree now contains the whole HTML file in a nice tree structure which we can go over two different ways: XPath and CSSSelect. In this example, we will focus on the former.
XPath is a way of locating information in structured documents such as HTML or XML documents. A good introduction to XPath is on W3Schools .
There are also various tools for obtaining the XPath of elements, such as the developer tools built into Firefox and Chrome. In Chrome, you can right click an element, choose "Inspect", then right click the highlighted markup and choose "Copy > Copy XPath".
After a quick analysis, we see that in our page the data is contained in two elements -- one is a div with title "buyer-name" and the other is a span with class "item-price":
<div title="buyer-name">Carson Busses</div>
<span class="item-price">$29.95</span>
Knowing this we can create the correct XPath query and use the lxml xpath function like this:
# This will create a list of buyers:
buyers = tree.xpath('//div[@title="buyer-name"]/text()')
# This will create a list of prices:
prices = tree.xpath('//span[@class="item-price"]/text()')
Let's see what we got exactly:
print('Buyers: ', buyers)
print('Prices: ', prices)
Buyers: ['Carson Busses', 'Earl E. Byrd', 'Patty Cakes',
'Derri Anne Connecticut', 'Moe Dess', 'Leda Doggslife', 'Dan Druff',
'Al Fresco', 'Ido Hoe', 'Howie Kisses', 'Len Lease', 'Phil Meup',
'Ira Pent', 'Ben D. Rules', 'Ave Sectomy', 'Gary Shattire',
'Bobbi Soks', 'Sheila Takya', 'Rose Tattoo', 'Moe Tell']
Prices: ['$29.95', '$8.37', '$15.26', '$19.25', '$19.25',
'$13.99', '$31.57', '$8.49', '$14.47', '$15.86', '$11.11',
'$15.98', '$16.27', '$7.50', '$50.85', '$14.26', '$5.68',
'$15.00', '$114.07', '$10.09']
Congratulations! We have successfully scraped all the data we wanted from a web page using lxml and Requests. We have it stored in memory as two lists. Now we can do all sorts of cool stuff with it: we can analyze it using Python or we can save it to a file and share it with the world.
Some more cool ideas to think about are modifying this script to iterate through the rest of the pages of this example dataset, or rewriting this application to use threads for improved speed.
Command-line Applications

Command-line applications, also referred to as Console Applications, are computer programs designed to be used from a text interface, such as a shell. Command-line applications usually accept various inputs as arguments, often referred to as parameters or sub-commands, as well as options, often referred to as flags or switches.
Some popular command-line applications include:
- grep - A plain-text data search utility
- curl - A tool for data transfer with URL syntax
- httpie - A command-line HTTP client, a user-friendly cURL replacement
- Git - A distributed version control system
- Mercurial - A distributed version control system primarily written in Python
Click
click is a Python package for creating command-line interfaces in a composable way with as little code as possible. This "Command-Line Interface Creation Kit" is highly configurable but comes with good defaults out of the box.
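A small sketch of a Click command (the option and argument names are invented):

```python
import click

@click.command()
@click.option('--count', default=1, help='Number of greetings.')
@click.argument('name')
def hello(count, name):
    """Greet NAME one or more times."""
    for _ in range(count):
        click.echo(f'Hello, {name}!')
```

Running `hello --count 2 Ford` would print the greeting twice; Click generates `--help` output from the decorators automatically.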
docopt
docopt is a lightweight, highly Pythonic package that allows creating command-line interfaces easily and intuitively, by parsing POSIX-style usage instructions.
Plac
Plac is a simple wrapper over the Python standard library argparse, which hides most of its complexity by using a declarative interface: the argument parser is inferred rather than written down imperatively. This module targets unsophisticated users, programmers, sysadmins, scientists, and in general people writing throw-away scripts for themselves, who choose to create a command-line interface because it is quick and simple.
Cliff
Cliff is a framework for building command-line programs. It uses setuptools entry points to provide subcommands, output formatters, and other extensions. The framework is meant to be used to create multi-level commands such as svn and git, where the main program handles some basic argument parsing and then invokes a sub-command to do the work.
Typer
Typer is a library for building CLI applications based on Python type hints. It automatically generates help text and shell completion, and validates input. Built on top of Click, it is the modern recommended way to build CLI apps in Python.
Cement
Cement is an advanced CLI Application Framework. Its goal is to introduce a standard and feature-full platform for both simple and complex command line applications as well as to support rapid development needs without sacrificing quality. Cement is flexible, and its use cases span from the simplicity of a micro-framework to the complexity of a mega-framework.
Python Fire
Python Fire is a library for automatically generating command-line interfaces from absolutely any Python object. It can help debug Python code more easily from the command line, create CLI interfaces to existing code, allow you to interactively explore code in a REPL, and simplify transitioning between Python and Bash (or any other shell).
GUI Applications

Alphabetical list of GUI Applications.
Camelot
Camelot provides components for building applications on top of Python, SQLAlchemy, and Qt. It is inspired by the Django admin interface.
The main resource for information is the website: http://www.python-camelot.com and the mailing list https://groups.google.com/forum/#!forum/project-camelot.
Cocoa
Note: The Cocoa framework is only available on macOS. Don't pick this if you're writing a cross-platform application!
GTK
Note: PyGTK provides Python bindings for the GTK+ toolkit. However, it has been superseded by PyGObject. PyGTK should not be used for new projects, and existing projects should be ported to PyGObject.
PyGObject (aka PyGI)
PyGObject provides Python bindings which give access to the entire GNOME software platform. It is fully compatible with GTK+ 3. To get started, see the Python GTK+ 3 Tutorial.
Kivy
Kivy is a Python library for development of multi-touch enabled media rich applications. The aim is to allow for quick and easy interaction design and rapid prototyping, while making your code reusable and deployable.
Kivy is written in Python, based on OpenGL, and supports different input devices such as: Mouse, Dual Mouse, TUIO, WiiMote, WM_TOUCH, HIDtouch, Apple's products, and so on.
Kivy is actively being developed by a community and is free to use. It operates on all major platforms (Linux, OS X, Windows, Android).
The main resource for information is the website: https://kivy.org
PyObjC
Note: PyObjC is only available on macOS. Don't pick this if you're writing a cross-platform application.
PySide
PySide is a Python binding of the cross-platform GUI toolkit Qt. The package name depends on the major Qt version ([PySide] for Qt4, [PySide2] for Qt5, and [PySide6] for Qt6). This set of bindings is developed by The Qt Company.
$ pip install pyside6
PyQt
Note: If your software does not fully comply with the GPL, you will need a commercial license!
PyQt provides Python bindings for the Qt Framework (see below).
http://www.riverbankcomputing.co.uk/software/pyqt/download
Pyjs Desktop (formerly Pyjamas Desktop)
Pyjs Desktop is an application widget set for the desktop and a cross-platform framework. It allows the exact same Python web application source code to be executed as a standalone desktop application.
The main website: pyjs.
Qt
Qt is a cross-platform application framework that is widely used for developing software with a GUI but can also be used for non-GUI applications.
PySimpleGUI
PySimpleGUI is a wrapper for Tkinter and Qt (others on the way). The amount of code required to implement custom GUIs is much shorter using PySimpleGUI than if the same GUI were written directly using Tkinter or Qt. PySimpleGUI code can be "ported" between GUI frameworks by changing import statements.
$ pip install pysimplegui
PySimpleGUI is contained in a single PySimpleGUI.py file. Should pip installation be impossible, copying the PySimpleGUI.py file into a project's folder is all that's required to import and begin using it.
Toga
Toga is a Python native, OS native, cross platform GUI toolkit. Toga consists of a library of base components with a shared interface to simplify platform-agnostic GUI development.
Toga is available on macOS, Windows, Linux (GTK), and mobile platforms such as Android and iOS.
Tk
Tkinter is a thin object-oriented layer on top of Tcl/Tk. It has the advantage of being included with the Python standard library, making it the most convenient and compatible toolkit to program with.
Both Tk and Tkinter are available on most Unix platforms, as well as on Windows and Macintosh systems. Starting with the 8.0 release, Tk offers native look and feel on all platforms.
There's a good multi-language Tk tutorial with Python examples at TkDocs. There's more information available on the Python Wiki.
wxPython
wxPython is a GUI toolkit for the Python programming language. It allows Python programmers to create programs with a robust, highly functional graphical user interface, simply and easily. It is implemented as a Python extension module (native code) that wraps the popular wxWidgets cross platform GUI library, which is written in C++.
Install (Stable) wxPython go to https://www.wxpython.org/pages/downloads/ and download the appropriate package for your OS.
Databases

DB-API
The Python Database API (DB-API) defines a standard interface for Python database access modules. It's documented in PEP 249. Nearly all Python database modules such as [sqlite3], [psycopg], and [mysql-python] conform to this interface.
Tutorials that explain how to work with modules that conform to this interface can be found here and here.
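The shape of the interface is the same across conforming modules. Here it is with the standard library's sqlite3 module and an in-memory database (the table and rows are invented for illustration):

```python
import sqlite3

# connect/cursor/execute/fetch is the common DB-API 2.0 shape.
conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT)')
cur.executemany('INSERT INTO movies (title) VALUES (?)',
                [('Brazil',), ('Blade Runner',)])
conn.commit()
cur.execute('SELECT title FROM movies ORDER BY id')
rows = cur.fetchall()
print(rows)  # → [('Brazil',), ('Blade Runner',)]
conn.close()
```

Note the `?` placeholder: parameterized queries are how DB-API modules avoid SQL injection.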
SQLAlchemy
SQLAlchemy is a commonly used database toolkit. Unlike many database libraries it not only provides an ORM layer but also a generalized API for writing database-agnostic code without SQL.
$ pip install sqlalchemy
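A minimal Core-level sketch against an in-memory SQLite database (the table and data are invented for illustration):

```python
from sqlalchemy import create_engine, text

# 'sqlite://' is an in-memory SQLite database.
engine = create_engine('sqlite://')
with engine.connect() as conn:
    conn.execute(text('CREATE TABLE users (name VARCHAR)'))
    # Bound parameters (:n) instead of string interpolation.
    conn.execute(text('INSERT INTO users (name) VALUES (:n)'), {'n': 'ada'})
    names = [row[0] for row in conn.execute(text('SELECT name FROM users'))]
print(names)
```

Swapping the engine URL (e.g. to a PostgreSQL DSN) is all it takes to target a different database, which is the database-agnostic part of the pitch.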
Records
Records is a minimalist SQL library, designed for sending raw SQL queries to various databases. Data can be used programmatically or exported to a number of useful data formats.
$ pip install records
Also included is a command-line tool for exporting SQL data.
PugSQL
PugSQL is a simple Python interface for organizing and using parameterized, handwritten SQL. It is an anti-ORM that is philosophically lo-fi, but it still presents a clean interface in Python.
$ pip install pugsql
Django ORM
The Django ORM is the interface used by Django to provide database access.
It's based on the idea of models, an abstraction that makes it easier to manipulate data in Python.
The basics:
- Each model is a Python class that subclasses django.db.models.Model.
- Each attribute of the model represents a database field.
- Django gives you an automatically-generated database-access API; see Making queries.
peewee
peewee is another ORM with a focus on being lightweight. It supports SQLite, MySQL, and PostgreSQL out of the box, and a collection of add-ons is available for other databases and extra features. The model layer is similar to that of the Django ORM, and it has SQL-like methods to query data.
PonyORM
PonyORM is an ORM that takes a different approach to querying the database. Instead of writing an SQL-like language or boolean expressions, Python's generator syntax is used. There's also a graphical schema editor that can generate PonyORM entities for you. It can connect to SQLite, MySQL, PostgreSQL, and Oracle.
SQLObject
SQLObject is yet another ORM. It supports a wide variety of databases: common database systems like MySQL, PostgreSQL, and SQLite and more exotic systems like SAP DB, SyBase, and Microsoft SQL Server.
Networking

Twisted
Twisted is an event-driven networking engine. It can be used to build applications around many different networking protocols, including HTTP servers and clients, applications using SMTP, POP3, IMAP, or SSH protocols, instant messaging, and much more.
PyZMQ
PyZMQ is the Python binding for ZeroMQ, which is a high-performance asynchronous messaging library. One great advantage of ZeroMQ is that it can be used for message queuing without a message broker. The basic patterns for this are:
- request-reply: connects a set of clients to a set of services. This is a remote procedure call and task distribution pattern.
- publish-subscribe: connects a set of publishers to a set of subscribers. This is a data distribution pattern.
- push-pull (or pipeline): connects nodes in a fan-out/fan-in pattern that can have multiple steps and loops. This is a parallel task distribution and collection pattern.
For a quick start, read the ZeroMQ guide.
gevent
gevent is a coroutine-based Python networking library that uses greenlets to provide a high-level synchronous API on top of the libev event loop.
Systems Administration

Fabric
Fabric is a library for simplifying system administration tasks. While Chef and Puppet tend to focus on managing servers and system libraries, Fabric is more focused on application level tasks such as deployment.
Install Fabric:
$ pip install fabric
The following code will create two tasks that we can use: memory_usage and deploy. The former will output the memory usage on each machine. The latter will SSH into each server, cd to our project directory, activate the virtual environment, pull the newest codebase, and restart the application server.
from fabric.api import cd, env, prefix, run, task

env.hosts = ['my_server1', 'my_server2']

@task
def memory_usage():
    run('free -m')

@task
def deploy():
    with cd('/var/www/project-env/project'):
        with prefix('. ../bin/activate'):
            run('git pull')
            run('touch app.wsgi')
With the previous code saved in a file named fabfile.py, we can check memory usage with:
$ fab memory_usage
[my_server1] Executing task 'memory_usage'
[my_server1] run: free -m
[my_server1] out: total used free shared buffers cached
[my_server1] out: Mem: 6964 1897 5067 0 166 222
[my_server1] out: -/+ buffers/cache: 1509 5455
[my_server1] out: Swap: 0 0 0
[my_server2] Executing task 'memory_usage'
[my_server2] run: free -m
[my_server2] out: total used free shared buffers cached
[my_server2] out: Mem: 1666 902 764 0 180 572
[my_server2] out: -/+ buffers/cache: 148 1517
[my_server2] out: Swap: 895 1 894
and we can deploy with:
$ fab deploy
Additional features include parallel execution, interaction with remote programs, and host grouping.
Salt
Salt is an open source infrastructure management tool. It supports remote command execution from a central point (master host) to multiple hosts (minions). It also supports system states which can be used to configure multiple servers using simple template files.
Salt can be installed via pip:
$ pip install salt
After configuring a master server and any number of minion hosts, we can run arbitrary shell commands or use pre-built modules of complex commands on our minions.
The following command lists all available minion hosts, using the ping module.
$ salt '*' test.ping
The host filtering is accomplished by matching the minion id or using the grains system. The grains system uses static host information like the operating system version or the CPU architecture to provide a host taxonomy for the Salt modules.
The following command lists all available minions running CentOS using the grains system:
$ salt -G 'os:CentOS' test.ping
Salt also provides a state system. States can be used to configure the minion hosts.
For example, when a minion host is ordered to read the following state file, it will install and start the Apache server:
apache:
  pkg:
    - installed
  service:
    - running
    - enable: True
    - require:
      - pkg: apache
State files can be written using YAML, the Jinja2 template system, or pure Python.
Psutil
Psutil is an interface to different system information (e.g. CPU, memory, disks, network, users, and processes).
Here is an example that warns of possible server overload. If either of the checks (network traffic, CPU usage) exceeds its threshold repeatedly, it sends an email.
# Functions to get system values:
from psutil import cpu_percent, net_io_counters
# Functions to take a break:
from time import sleep
# Package for email services:
import smtplib

MAX_NET_USAGE = 400000  # bytes per second
MAX_ATTACKS = 4
attack = 0

while attack <= MAX_ATTACKS:
    sleep(4)
    # Check the net usage with named tuples
    neti1 = net_io_counters().bytes_recv
    neto1 = net_io_counters().bytes_sent
    sleep(1)
    neti2 = net_io_counters().bytes_recv
    neto2 = net_io_counters().bytes_sent
    # Calculate the bytes per second
    net = ((neti2 + neto2) - (neti1 + neto1)) / 2
    # Check the net and cpu usage
    if (net > MAX_NET_USAGE) or (cpu_percent(interval=1) > 70):
        attack += 1
    elif attack > 1:
        attack -= 1

# Write a very important email once attack exceeds MAX_ATTACKS
TO = "you@your_email.com"
FROM = "webmaster@your_domain.com"
SUBJECT = "Your domain is out of system resources!"
text = "Go and fix your server!"
BODY = "\r\n".join(("From: %s" % FROM, "To: %s" % TO,
                    "Subject: %s" % SUBJECT, "", text))
server = smtplib.SMTP('127.0.0.1')
server.sendmail(FROM, [TO], BODY)
server.quit()
A full terminal application similar to an extended top is Glances, which is based on psutil and has client-server monitoring capability.
Ansible
Ansible is an open source system automation tool. Its biggest advantage over Puppet or Chef is that it does not require an agent on the client machine. Playbooks are Ansible's configuration, deployment, and orchestration language and are written in YAML with Jinja2 for templating.
Ansible can be installed via pip:
$ pip install ansible
Ansible requires an inventory file that describes the hosts to which it has access. Below is an example of a host and playbook that will ping all the hosts in the inventory file.
Here is an example inventory file: hosts.yml
[server_name]
127.0.0.1
Here is an example playbook: ping.yml
---
- hosts: all
  tasks:
    - name: ping
      action: ping
To run the playbook:
$ ansible-playbook ping.yml -i hosts.yml --ask-pass
The Ansible playbook will ping all of the servers in the hosts.yml file. You can also select groups of servers using Ansible. For more information about Ansible, read the Ansible Docs.
An Ansible tutorial is also a great and detailed introduction to getting started with Ansible.
Chef
Chef is a systems and cloud infrastructure automation framework that makes it easy to deploy servers and applications to any physical, virtual, or cloud location. In case this is your choice for configuration management, you will primarily use Ruby to write your infrastructure code.
Chef clients run on every server that is part of your infrastructure and these regularly check with your Chef server to ensure your system is always aligned and represents the desired state. Since each individual server has its own distinct Chef client, each server configures itself and this distributed approach makes Chef a scalable automation platform.
Chef works by using custom recipes (configuration elements), implemented in cookbooks. Cookbooks, which are basically packages for infrastructure choices, are usually stored in your Chef server. Read the DigitalOcean tutorial series on Chef to learn how to create a simple Chef Server.
To create a simple cookbook, the knife command is used:
knife cookbook create cookbook_name
Getting started with Chef is a good starting point for Chef beginners, and the Chef Supermarket hosts many community-maintained cookbooks that can serve as good references or be tweaked to fit your infrastructure configuration needs.
Puppet
Puppet is IT Automation and configuration management software from Puppet Labs that allows System Administrators to define the state of their IT Infrastructure, thereby providing an elegant way to manage their fleet of physical and virtual machines.
Puppet is available both as an Open Source and an Enterprise variant. Modules are small, shareable units of code written to automate or define the state of a system. Puppet Forge is a repository for modules written by the community for Open Source and Enterprise Puppet.
Puppet Agents are installed on nodes whose state needs to be monitored or changed. A designated server known as the Puppet Master is responsible for orchestrating the agent nodes.
Agent nodes send basic facts about the system such as the operating system, kernel, architecture, IP address, hostname, etc. to the Puppet Master. The Puppet Master then compiles a catalog with information provided by the agents on how each node should be configured and sends it to the agent. The agent enforces the change as prescribed in the catalog and sends a report back to the Puppet Master.
Facter is an interesting tool that ships with Puppet that pulls basic facts about the system. These facts can be referenced as a variable while writing your Puppet modules.
$ facter kernel
Linux
$ facter operatingsystem
Ubuntu
Writing modules in Puppet is pretty straightforward: Puppet manifests together form Puppet modules. Puppet manifests end with the extension .pp. Here is an example of "Hello World" in Puppet.
notify { 'This message is getting logged into the agent node':
    # As nothing is specified in the body, the resource title
    # becomes the notification message by default.
}
Here is another example with system based logic. Note how the operating system fact is being used as a variable prepended with the $ sign. Similarly, this holds true for other facts such as hostname which can be referenced by $hostname.
notify { 'Mac Warning':
    message => $operatingsystem ? {
        'Darwin' => 'This seems to be a Mac.',
        default  => 'I am a PC.',
    },
}
There are several resource types for Puppet but the package-file-service paradigm is all you need for undertaking the majority of the configuration management. The following Puppet code makes sure that the OpenSSH-Server package is installed in a system and the sshd service is notified to restart every time the sshd configuration file is changed.
package { 'openssh-server':
    ensure => installed,
}

file { '/etc/ssh/sshd_config':
    source  => 'puppet:///modules/sshd/sshd_config',
    owner   => 'root',
    group   => 'root',
    mode    => '640',
    notify  => Service['sshd'],  # sshd will restart
                                 # whenever you edit this
                                 # file
    require => Package['openssh-server'],
}

service { 'sshd':
    ensure     => running,
    enable     => true,
    hasstatus  => true,
    hasrestart => true,
}
For more information, refer to the Puppet Labs Documentation.
Blueprint
::: todo Write about Blueprint
Buildout
Buildout is an open source software build tool, written in Python. It implements a principle of separation of configuration from the scripts that do the setting up. Buildout is primarily used to download and set up dependencies (in Python egg format) of the software being developed or deployed. Recipes for build tasks in any environment can be created, and many are already available.
Continuous Integration

Note: For advice on writing your tests, see /writing/tests.
Why?
Martin Fowler, who first wrote about Continuous Integration (short: CI) together with Kent Beck, describes CI as follows:
Continuous Integration is a software development practice where members of a team integrate their work frequently, usually each person integrates at least daily - leading to multiple integrations per day. Each integration is verified by an automated build (including test) to detect integration errors as quickly as possible. Many teams find that this approach leads to significantly reduced integration problems and allows a team to develop cohesive software more rapidly.
Jenkins
Jenkins CI is an extensible Continuous Integration engine. Use it.
Buildbot
Buildbot is a Python system to automate the compile/test cycle to validate code changes.
Tox
tox is an automation tool providing packaging, testing, and deployment of Python software right from the console or CI server. It is a generic virtualenv management and test command line tool which provides the following features:
- Checking that packages install correctly with different Python versions and interpreters
- Running tests in each of the environments, configuring your test tool of choice
- Acting as a front-end to Continuous Integration servers, reducing boilerplate and merging CI and shell-based testing
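As a sketch, a minimal tox.ini might look like the following (the environment list and the use of pytest are illustrative, not prescribed by tox):

```ini
# tox.ini, at the root of the project
[tox]
envlist = py310, py311, py312

[testenv]
# install the test dependencies, then run the test suite
deps = pytest
commands = pytest tests/
```

Running [tox] from the project root then builds the package, installs it into each listed virtualenv, and runs the commands in every environment.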
Travis-CI
Travis-CI is a distributed CI server which builds tests for open source projects for free. It provides multiple workers to run Python tests on and seamlessly integrates with GitHub. You can even have it comment on your Pull Requests whether this particular changeset breaks the build or not. So, if you are hosting your code on GitHub, Travis-CI is a great and easy way to get started with Continuous Integration.
In order to get started, add a .travis.yml file to your repository with this example content:
language: python
python:
  - "2.6"
  - "2.7"
  - "3.2"
  - "3.3"
# command to run tests
script: python tests/test_all_of_the_units.py
branches:
  only:
    - master
This will get your project tested on all the listed Python versions by running the given script, and will only build the master branch. There are a lot more options you can enable, like notifications, before and after steps, and much more. The Travis-CI docs explain all of these options, and are very thorough.
In order to activate testing for your project, go to the Travis-CI site and log in with your GitHub account. Then activate your project in your profile settings and you're ready to go. From now on, your project's tests will be run on every push to GitHub.
Speed

CPython, the most commonly used implementation of Python, is slow for CPU bound tasks. PyPy is fast.
Using a slightly modified version of David Beazley's CPU-bound test code (adding a loop for multiple tests), you can see the difference between CPython's and PyPy's processing.
# PyPy
$ ./pypy -V
Python 2.7.1 (7773f8fc4223, Nov 18 2011, 18:47:10)
[PyPy 1.7.0 with GCC 4.4.3]
$ ./pypy measure2.py
0.0683999061584
0.0483210086823
0.0388588905334
0.0440690517426
0.0695300102234
# CPython
$ ./python -V
Python 2.7.1
$ ./python measure2.py
1.06774401665
1.45412397385
1.51485204697
1.54693889618
1.60109114647
Context
The GIL
The GIL (Global Interpreter Lock) is how Python allows multiple threads to operate at the same time. Python's memory management isn't entirely thread-safe, so the GIL is required to prevent multiple threads from running the same Python code at once.
David Beazley has a great guide on how the GIL operates. He also covers the new GIL in Python 3.2. His results show that maximizing performance in a Python application requires a strong understanding of the GIL, how it affects your specific application, how many cores you have, and where your application bottlenecks are.
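To see the GIL's effect for yourself, here is a small sketch in the spirit of Beazley's countdown benchmark (the function and iteration count are illustrative; absolute timings vary by machine, so none are claimed here):

```python
import time
from threading import Thread

def countdown(n):
    # Pure-Python CPU-bound loop; the GIL serializes its bytecode execution
    while n > 0:
        n -= 1

N = 2_000_000

# Sequential: one thread does all the work
start = time.perf_counter()
countdown(N)
sequential = time.perf_counter() - start

# Two threads splitting the same work; on CPython with a GIL this is
# usually no faster (and often slower) than the sequential version
threads = [Thread(target=countdown, args=(N // 2,)) for _ in range(2)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"sequential: {sequential:.3f}s, threaded: {threaded:.3f}s")
```

On a GIL-free build or with a ProcessPoolExecutor, the split version can actually run in parallel; under the standard CPython GIL it cannot.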
C Extensions
The GIL
Special care must be taken when writing C extensions to make sure you register your threads with the interpreter.
C Extensions
Cython
Cython implements a superset of the Python language with which you are able to write C and C++ modules for Python. Cython also allows you to call functions from compiled C libraries. Using Cython allows you to take advantage of static typing of variables and operations.
Here's an example of static typing with Cython:
def primes(int kmax):
    """Calculation of prime numbers with additional
    Cython keywords"""
    cdef int n, k, i
    cdef int p[1000]
    result = []
    if kmax > 1000:
        kmax = 1000
    k = 0
    n = 2
    while k < kmax:
        i = 0
        while i < k and n % p[i] != 0:
            i = i + 1
        if i == k:
            p[k] = n
            k = k + 1
            result.append(n)
        n = n + 1
    return result
This implementation of an algorithm to find prime numbers has some additional keywords compared to the next one, which is implemented in pure Python:
def primes(kmax):
    """Calculation of prime numbers in standard Python syntax"""
    p = [0] * 1000  # a list, so elements can be assigned
    result = []
    if kmax > 1000:
        kmax = 1000
    k = 0
    n = 2
    while k < kmax:
        i = 0
        while i < k and n % p[i] != 0:
            i = i + 1
        if i == k:
            p[k] = n
            k = k + 1
            result.append(n)
        n = n + 1
    return result
Notice that in the Cython version you declare integers and integer arrays to be compiled into C types while also creating a Python list:
def primes(int kmax):
    """Calculation of prime numbers with additional
    Cython keywords"""
    cdef int n, k, i
    cdef int p[1000]
    result = []

def primes(kmax):
    """Calculation of prime numbers in standard Python syntax"""
    p = [0] * 1000
    result = []
What is the difference? In the upper Cython version you can see the declaration of the variable types and the integer array in a similar way as in standard C, for example [cdef int n, k, i]. This additional type declaration (i.e. integer) allows the Cython compiler to generate more efficient C code from the first version. While standard Python code is saved in *.py files, Cython code is saved in *.pyx files.
What"s the difference in speed? Let"s try it!
import time
# Activate pyx compiler
import pyximport
pyximport.install()
import primesCy # primes implemented with Cython
import primes # primes implemented with Python
print("Cython:")
t1 = time.time()
print(primesCy.primes(500))
t2 = time.time()
print("Cython time: %s" % (t2 - t1))
print("")
print("Python")
t1 = time.time()
print(primes.primes(500))
t2 = time.time()
print("Python time: %s" % (t2 - t1))
These lines both need a remark:
import pyximport
pyximport.install()
The [pyximport] module allows you to import *.pyx files (e.g., primesCy.pyx) with the Cython-compiled version of the [primes] function. The [pyximport.install()] command allows the Python interpreter to start the Cython compiler directly to generate C code, which is automatically compiled to a *.so C library. Cython is then able to import this library for you in your Python code, easily and efficiently. With the [time.time()] function you are able to compare the time between these two different calls to find 500 prime numbers. On a standard notebook (dual core AMD E-450 1.6 GHz), the measured values are:
Cython time: 0.0054 seconds
Python time: 0.0566 seconds
And here is the output of an embedded ARM beaglebone machine:
Cython time: 0.0196 seconds
Python time: 0.3302 seconds
Pyrex
Shedskin?
Concurrency
Concurrent.futures
The concurrent.futures module is a module in the standard library that provides a "high-level interface for asynchronously executing callables". It abstracts away a lot of the more complicated details about using multiple threads or processes for concurrency, and allows the user to focus on accomplishing the task at hand.
The concurrent.futures module exposes two main classes, the [ThreadPoolExecutor] and the [ProcessPoolExecutor]. The ThreadPoolExecutor will create a pool of worker threads that a user can submit jobs to. These jobs will then be executed in another thread when the next worker thread becomes available.
The ProcessPoolExecutor works in the same way, except instead of using multiple threads for its workers, it will use multiple processes. This makes it possible to side-step the GIL; however, because of the way things are passed to worker processes, only picklable objects can be executed and returned.
Because of the way the GIL works, a good rule of thumb is to use a ThreadPoolExecutor when the task being executed involves a lot of blocking (i.e. making requests over the network) and to use a ProcessPoolExecutor executor when the task is computationally expensive.
There are two main ways of executing things in parallel using the two Executors. One way is with the [map(func, iterables)] method. This works almost exactly like the builtin [map()] function, except it will execute everything in parallel.
from concurrent.futures import ThreadPoolExecutor
import requests
def get_webpage(url):
page = requests.get(url)
return page
pool = ThreadPoolExecutor(max_workers=5)
my_urls = ['http://google.com/']*10 # Create a list of urls
for page in pool.map(get_webpage, my_urls):
# Do something with the result
print(page.text)
For even more control, the [submit(func, *args, **kwargs)] method will schedule a callable to be executed (as [func(*args, **kwargs)]) and returns a Future object that represents the execution of the callable.
The Future object provides various methods that can be used to check on the progress of the scheduled callable. These include:
cancel()
: Attempt to cancel the call.
cancelled()
: Return True if the call was successfully cancelled.
running()
: Return True if the call is currently being executed and cannot be cancelled.
done()
: Return True if the call was successfully cancelled or finished running.
result()
: Return the value returned by the call. Note that this call will block until the scheduled callable returns by default.
exception()
: Return the exception raised by the call. If no exception was raised then this returns None. Note that this will block just like [result()].
add_done_callback(fn)
: Attach a callback function that will be executed (as [fn(future)]) when the scheduled callable returns.
from concurrent.futures import ProcessPoolExecutor, as_completed
def is_prime(n):
    if n < 2:
        return n, False
    if n == 2:
        return n, True
    if n % 2 == 0:
        return n, False
    sqrt_n = int(n**0.5)
    for i in range(3, sqrt_n + 1, 2):
        if n % i == 0:
            return n, False
    return n, True
PRIMES = [
    112272535095293,
    112582705942171,
    112272535095293,
    115280095190773,
    115797848077099,
    1099726899285419]

futures = []
with ProcessPoolExecutor(max_workers=4) as pool:
    # Schedule the ProcessPoolExecutor to check if a number is prime
    # and add the returned Future to our list of futures
    for p in PRIMES:
        fut = pool.submit(is_prime, p)
        futures.append(fut)

# As the jobs are completed, print out the results
for fut in as_completed(futures):
    number, result = fut.result()  # as_completed yields Futures, not results
    if result:
        print("{} is prime".format(number))
    else:
        print("{} is not prime".format(number))
The concurrent.futures module contains two helper functions for working with Futures. The [as_completed(futures)] function returns an iterator over the list of futures, yielding the futures as they complete.
The [wait(futures)] function will simply block until all futures in the list of futures provided have completed.
For more information, on using the concurrent.futures module, consult the official documentation.
threading
The standard library comes with a threading module that allows a user to work with multiple threads manually.
Running a function in another thread is as simple as passing a callable and its arguments to the [Thread] constructor (via its target and args parameters) and then calling `start()`:
from threading import Thread
import requests

def get_webpage(url):
    page = requests.get(url)
    return page

some_thread = Thread(target=get_webpage, args=('http://google.com/',))
some_thread.start()
To wait until the thread has terminated, call `join()`:
some_thread.join()
After calling [join()], it is always a good idea to check whether the thread is still alive (because the join call timed out):
if some_thread.is_alive():
print("join() must have timed out.")
else:
print("Our thread has terminated.")
Because multiple threads have access to the same section of memory, sometimes there might be situations where two or more threads are trying to write to the same resource at the same time or where the output is dependent on the sequence or timing of certain events. This is called a data race or race condition. When this happens, the output will be garbled or you may encounter problems which are difficult to debug. A good example is this Stack Overflow post.
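To make the race concrete, the sketch below has four threads increment a shared counter; guarding each read-modify-write with a threading.Lock keeps the total exact (without the lock, updates could be lost):

```python
from threading import Lock, Thread

counter = 0
counter_lock = Lock()

def increment(times):
    global counter
    for _ in range(times):
        # Without the lock, this read-modify-write could interleave
        # with another thread's and lose updates
        with counter_lock:
            counter += 1

threads = [Thread(target=increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000: no updates lost
```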
The way this can be avoided is by using a Lock that each thread needs to acquire before writing to a shared resource. Locks can be acquired and released through either the contextmanager protocol ([with] statement), or by using [acquire()] and [release()] directly. Here is a (rather contrived) example:
from threading import Lock, Thread

file_lock = Lock()

def log(changes):
    with file_lock:
        with open('website_changes.log', 'a') as f:
            f.write(changes)

def monitor_website(some_website):
    """
    Monitor a website and then if there are any changes,
    log them to disk.
    """
    while True:
        changes = check_for_changes(some_website)
        if changes:
            log(changes)

websites = ['http://google.com/', ... ]
for website in websites:
    t = Thread(target=monitor_website, args=(website,))
    t.start()
Here, we have a bunch of threads checking for changes on a list of sites and whenever there are any changes, they attempt to write those changes to a file by calling [log(changes)]. When [log()] is called, it will wait to acquire the lock with [with file_lock:]. This ensures that at any one time, only one thread is writing to the file.
Spawning Processes
Multiprocessing
Scientific Applications

Context
Python is frequently used for high-performance scientific applications. It is widely used in academia and scientific projects because it is easy to write and performs well.
Because scientific computing demands high performance, Python code in this domain typically relies on external libraries written in faster languages (like C, or Fortran for matrix operations). The main libraries used are NumPy, SciPy, and Matplotlib. Going into detail about these libraries is beyond the scope of this guide. However, a comprehensive introduction to the scientific Python ecosystem can be found in the Python Scientific Lecture Notes.
Tools
IPython
IPython is an enhanced version of the Python interpreter, which provides features of great interest to scientists. The [inline mode] allows graphics and plots to be displayed in the terminal (Qt based version). Moreover, the [notebook] mode supports literate programming and reproducible science, generating a web-based Python notebook. This notebook allows you to store chunks of Python code alongside the results and additional comments (HTML, LaTeX, Markdown). The notebook can then be shared and exported in various file formats.
Libraries
NumPy
NumPy is a low level library written in C (and Fortran) for high level mathematical functions. NumPy cleverly overcomes the problem of running slower algorithms on Python by using multidimensional arrays and functions that operate on arrays. Any algorithm can then be expressed as a function on arrays, allowing the algorithms to be run quickly.
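For instance, an elementwise computation written as a Python loop can be replaced by a single array expression (a small sketch; requires NumPy to be installed):

```python
import numpy as np

# A Python loop over a list...
values = [1.0, 2.0, 3.0, 4.0]
loop_result = [v * v + 1.0 for v in values]

# ...becomes one expression on a NumPy array, executed in compiled C code
arr = np.array(values)
array_result = arr * arr + 1.0

print(loop_result)            # [2.0, 5.0, 10.0, 17.0]
print(array_result.tolist())  # [2.0, 5.0, 10.0, 17.0]
```

On large arrays the vectorized form is typically orders of magnitude faster than the equivalent Python loop.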
NumPy is part of the SciPy project, and is released as a separate library so people who only need the basic requirements can use it without installing the rest of SciPy.
Recent NumPy releases support only Python 3; consult the NumPy release notes for the minimum supported Python version.
Numba
Numba is a NumPy aware Python compiler (just-in-time (JIT) specializing compiler) which compiles annotated Python (and NumPy) code to LLVM (Low Level Virtual Machine) through special decorators. Briefly, Numba uses a system that compiles Python code with LLVM to code which can be natively executed at runtime.
SciPy
SciPy is a library that uses NumPy for more mathematical functions. SciPy uses NumPy arrays as the basic data structure, and comes with modules for various commonly used tasks in scientific programming, including linear algebra, integration (calculus), ordinary differential equation solving, and signal processing.
Matplotlib
Matplotlib is a flexible plotting library for creating interactive 2D and 3D plots that can also be saved as manuscript-quality figures. The API in many ways reflects that of MATLAB, easing transition of MATLAB users to Python. Many examples, along with the source code to recreate them, are available in the matplotlib gallery.
Pandas
Pandas is a data manipulation library based on NumPy which provides many useful functions for accessing, indexing, merging, and grouping data easily. The main data structure (DataFrame) is close to what could be found in the R statistical package; that is, heterogeneous data tables with name indexing, time series operations, and auto-alignment of data.
xarray
xarray is similar to Pandas, but it is intended for wrapping multidimensional scientific data. By labelling the data with dimensions, coordinates, and attributes, it makes complex multidimensional operations clearer and more intuitive. It also wraps matplotlib for quick plotting, and can apply most operations in parallel using dask.
Rpy2
Rpy2 is a Python binding for the R statistical package allowing the execution of R functions from Python and passing data back and forth between the two environments. Rpy2 is the object oriented implementation of the Rpy bindings.
PsychoPy
PsychoPy is a library for cognitive scientists allowing the creation of cognitive psychology and neuroscience experiments. The library handles presentation of stimuli, scripting of experimental design, and data collection.
Resources
Installation of scientific Python packages can be troublesome, as many of these packages are implemented as Python C extensions which need to be compiled. This section lists various so-called scientific Python distributions which provide precompiled and easy-to-install collections of scientific Python packages.
Unofficial Windows Binaries for Python Extension Packages
Many people who do scientific computing are on Windows, yet many of the scientific computing packages are notoriously difficult to build and install on this platform. Christoph Gohlke, however, has compiled a list of Windows binaries for many useful Python packages. The list of packages has grown from a mainly scientific Python resource to a more general list. If you're on Windows, you may want to check it out.
Anaconda
The Anaconda Python Distribution includes all the common scientific Python packages as well as many packages related to data analytics and big data. Anaconda itself is free, and a number of proprietary add-ons are available for a fee. Free licenses for the add-ons are available for academics and researchers.
Canopy
Canopy is another scientific Python distribution, produced by Enthought. A limited "Canopy Express" variant is available for free, but Enthought charges for the full distribution. Free licenses are available for academics.
Image Manipulation

Most image processing and manipulation techniques can be carried out effectively using two libraries: Python Imaging Library (PIL) and Open Source Computer Vision (OpenCV).
A brief description of both is given below.
Python Imaging Library
The Python Imaging Library, or PIL for short, is one of the core libraries for image manipulation in Python. Unfortunately, its development has stagnated, with its last release in 2009.
Luckily for you, there"s an actively-developed fork of PIL called Pillow -- it"s easier to install, runs on all major operating systems, and supports Python 3.
Installation
Before installing Pillow, you"ll have to install Pillow"s prerequisites. Find the instructions for your platform in the Pillow installation instructions.
After that, it"s straightforward:
$ pip install Pillow
Example
from PIL import Image, ImageFilter

# Read image
im = Image.open('image.jpg')
# Display image
im.show()

# Applying a filter to the image
im_sharp = im.filter(ImageFilter.SHARPEN)
# Saving the filtered image to a new file
im_sharp.save('image_sharpened.jpg', 'JPEG')

# Splitting the image into its respective bands, i.e. Red, Green,
# and Blue for RGB
r, g, b = im_sharp.split()

# Viewing EXIF data embedded in image
exif_data = im._getexif()
print(exif_data)
There are more examples of the Pillow library in the Pillow tutorial.
Open Source Computer Vision
Open Source Computer Vision, more commonly known as OpenCV, is a more advanced image manipulation and processing software than PIL. It has been implemented in several languages and is widely used.
Installation
In Python, image processing using OpenCV is implemented using the cv2 and NumPy modules. The installation instructions for OpenCV should guide you through configuring the project for yourself.
NumPy can be downloaded from the Python Package Index (PyPI):
$ pip install numpy
Example
import cv2

# Read Image
img = cv2.imread('testimg.jpg')
# Display Image
cv2.imshow('image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

# Applying Grayscale filter to image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Saving filtered image to new file
cv2.imwrite('graytest.jpg', gray)
There are more Python-implemented examples of OpenCV in this collection of tutorials.
Data Serialization

What is data serialization?
Data serialization is the process of converting structured data to a format that allows sharing or storage of the data in a form that allows recovery of its original structure. In some cases, the secondary intention of data serialization is to minimize the data's size, which then reduces disk space or bandwidth requirements.
Flat vs. Nested data
Before beginning to serialize data, it is important to decide how the data should be structured during serialization: flat or nested. The differences between the two styles are shown in the examples below.
Flat style:
{ "Type" : "A", "field1": "value1", "field2": "value2", "field3": "value3" }
Nested style:
{"A"
{ "field1": "value1", "field2": "value2", "field3": "value3" } }
For more reading on the two styles, please see the discussions on the Python mailing list, the IETF mailing list, and on Stack Exchange.
Serializing Text
Simple file (flat data)
If the data to be serialized is located in a file and contains flat data, Python offers two methods to serialize data.
repr
The repr function in Python takes a single object and returns a printable representation of it:
# input as a flat dict
a = {"Type": "A", "field1": "value1", "field2": "value2", "field3": "value3"}
# repr returns a printable representation of the input;
print(repr(a))
# the output can be written to a file as well
with open('/tmp/file.py', 'w') as f:
    f.write(repr(a))
ast.literal_eval
The ast.literal_eval function safely parses and evaluates an expression containing a Python literal. Supported data types are: strings, numbers, tuples, lists, dicts, booleans, and None.
import ast

with open('/tmp/file.py', 'r') as f:
    inp = ast.literal_eval(f.read())
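The two functions pair naturally: repr writes a literal representation and ast.literal_eval reads it back. A minimal round-trip sketch (the dict contents are just an illustration):

```python
import ast

# Serialize a flat dict with repr(), then recover it with ast.literal_eval()
original = {"Type": "A", "field1": "value1", "field2": "value2"}
text = repr(original)          # a string such as "{'Type': 'A', ...}"
restored = ast.literal_eval(text)
assert restored == original
```

Unlike eval, literal_eval refuses anything that is not a plain literal, so untrusted input cannot execute code.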
CSV file (flat data)
The csv module in Python implements classes to read and write tabular data in CSV format.
Simple example for reading:
# Reading CSV content from a file
import csv
with open('/tmp/file.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)
Simple example for writing:
# Writing CSV content to a file
import csv
rows = [['Alice', 89], ['Bob', 72]]
with open('/tmp/file.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(rows)
The module's contents, functions, and examples can be found in the Python documentation.
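The module can also map rows to dictionaries keyed by the header row, which is often more readable than positional indexing. A small sketch, using an in-memory buffer in place of a real file (the column names are made up):

```python
import csv
import io

# csv.DictReader maps each data row to a dict keyed by the header row;
# io.StringIO stands in for an open file here
data = "name,grade\nAlice,89\nBob,72\n"
rows = list(csv.DictReader(io.StringIO(data)))
print(rows[0])   # {'name': 'Alice', 'grade': '89'}
```

Note that all values come back as strings; any numeric conversion is up to the caller.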
YAML (nested data)
There are many third party modules to parse and read/write YAML file structures in Python. One such example is below.
# Reading YAML content from a file using the safe_load method
import yaml

with open('/tmp/file.yaml', 'r') as f:
    try:
        print(yaml.safe_load(f))
    except yaml.YAMLError as ymlexcp:
        print(ymlexcp)
Documentation on the third party module can be found in the PyYAML Documentation.
JSON file (nested data)
Python's json module can be used to read and write JSON files. Example code is below.
Reading:
# Reading JSON content from a file
import json
with open('/tmp/file.json', 'r') as f:
    data = json.load(f)
Writing:
# Writing JSON content to a file using the dump method
import json
with open('/tmp/file.json', 'w') as f:
    json.dump(data, f, sort_keys=True)
XML (nested data)
XML parsing in Python is possible using the xml package.
Example:
# reading XML content from a file
import xml.etree.ElementTree as ET
tree = ET.parse('country_data.xml')
root = tree.getroot()
More documentation on using the xml.dom and xml.sax packages can be found in the Python XML library documentation.
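Once parsed, the tree can be walked directly: elements are iterable, attributes come from get(), and text from the .text attribute. A self-contained sketch using fromstring instead of a file (the country data here is invented for illustration):

```python
import xml.etree.ElementTree as ET

# Parse XML from a string rather than a file
xml_data = """<data>
    <country name="Liechtenstein"><rank>1</rank></country>
    <country name="Singapore"><rank>4</rank></country>
</data>"""
root = ET.fromstring(xml_data)

# Iterate over child elements, reading attributes and nested text
names = [child.get("name") for child in root]
ranks = [int(child.find("rank").text) for child in root]
```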
Binary
NumPy Array (flat data)
NumPy arrays can be used to serialize and deserialize data to and from byte representation.
Example:
import numpy as np

# Converting a NumPy array to byte format
byte_output = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]).tobytes()
# Converting byte format back to a NumPy array;
# the dtype must be supplied, since tobytes() does not store it
array_format = np.frombuffer(byte_output, dtype=int)
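Because tobytes() stores only the raw values, frombuffer cannot recover the dtype or shape on its own. np.save/np.load keep that metadata alongside the data; a sketch using an in-memory buffer in place of a .npy file:

```python
import io
import numpy as np

# np.save writes dtype and shape together with the raw bytes,
# so the round trip needs no extra bookkeeping
arr = np.array([[1, 2, 3], [4, 5, 6]])
buf = io.BytesIO()
np.save(buf, arr)
buf.seek(0)
restored = np.load(buf)
assert restored.shape == (2, 3) and restored.dtype == arr.dtype
```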
Pickle (nested data)
The native data serialization module for Python is called pickle.

Here's an example:
import pickle

# An example dict
grades = {'Alice': 89, 'Bob': 72, 'Charles': 87}
# Use dumps to convert the object to a serialized string
serial_grades = pickle.dumps(grades)
# Use loads to de-serialize an object
received_grades = pickle.loads(serial_grades)
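Besides dumps/loads on byte strings, pickle.dump and pickle.load work with any binary file object. A minimal sketch, with a BytesIO buffer standing in for a real file:

```python
import io
import pickle

# dump writes the pickled bytes to a binary file object; load reads them back
grades = {'Alice': 89, 'Bob': 72, 'Charles': 87}
buf = io.BytesIO()
pickle.dump(grades, buf)
buf.seek(0)
restored = pickle.load(buf)
assert restored == grades
```

As with any pickle usage, only unpickle data you trust: loading a malicious pickle can execute arbitrary code.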
Protobuf
If you're looking for a serialization module that has support in multiple languages, Google's Protobuf library is an option.
XML parsing

untangle
untangle is a simple library which takes an XML document and returns a Python object which mirrors the nodes and attributes in its structure.
For example, an XML file like this:
<?xml version="1.0"?>
<root>
    <child name="child1"/>
</root>
can be loaded like this:
import untangle
obj = untangle.parse('path/to/file.xml')
and then you can get the child element's name attribute like this:
obj.root.child['name']
untangle also supports loading XML from a string or a URL.
xmltodict
xmltodict is another simple library that aims at making XML feel like working with JSON.
An XML file like this:
<mydocument has="an attribute">
<and>
<many>elements</many>
<many>more elements</many>
</and>
<plus a="complex">
element as well
</plus>
</mydocument>
can be loaded into a Python dict like this:
import xmltodict
with open('path/to/file.xml') as fd:
doc = xmltodict.parse(fd.read())
and then you can access elements, attributes, and values like this:
doc['mydocument']['@has'] # == u'an attribute'
doc['mydocument']['and']['many'] # == [u'elements', u'more elements']
doc['mydocument']['plus']['@a'] # == u'complex'
doc['mydocument']['plus']['#text'] # == u'element as well'
xmltodict also lets you roundtrip back to XML with the unparse function, has a streaming mode suitable for handling files that don't fit in memory, and supports XML namespaces.
xmlschema
xmlschema provides support for using XSD schemas in Python. Unlike other XML libraries, automatic type parsing is available, so, for example, if the schema defines an element to be of type int, the parsed dict will contain an int value for that element. Moreover, the library supports automatic and explicit validation of XML documents against a schema.
from xmlschema import XMLSchema, etree_tostring
# load an XSD schema file
schema = XMLSchema("your_schema.xsd")
# validate against the schema
schema.validate("your_file.xml")
# or
schema.is_valid("your_file.xml")
# decode a file
data = schema.decode("your_file.xml")
# encode to string
s = etree_tostring(schema.encode(data))
JSON

The json library can parse JSON from strings or files. The library parses JSON into a Python dictionary or list. It can also convert Python dictionaries or lists into JSON strings.
Parsing JSON
Take the following string containing JSON data:
json_string = '{"first_name": "Guido", "last_name":"Rossum"}'
It can be parsed like this:
import json
parsed_json = json.loads(json_string)
and can now be used as a normal dictionary:
print(parsed_json['first_name'])
Guido
You can also convert the following to JSON:
d = {
    'first_name': 'Guido',
    'last_name': 'Rossum',
    'titles': ['BDFL', 'Developer'],
}
print(json.dumps(d))
'{"first_name": "Guido", "last_name": "Rossum", "titles": ["BDFL", "Developer"]}'
Cryptography

cryptography
cryptography is an actively developed library that provides cryptographic recipes and primitives. It supports Python 3.8+ and PyPy.
cryptography is divided into two layers: recipes and hazardous materials (hazmat). The recipes layer provides a simple API for proper symmetric encryption, and the hazmat layer provides low-level cryptographic primitives.
Installation
$ pip install cryptography
Example
Example code using high level symmetric encryption recipe:
from cryptography.fernet import Fernet
key = Fernet.generate_key()
cipher_suite = Fernet(key)
cipher_text = cipher_suite.encrypt(b"A really secret message. Not for prying eyes.")
plain_text = cipher_suite.decrypt(cipher_text)
GPGME bindings
The GPGME Python bindings provide Pythonic access to GPG Made Easy, a C API for the entire GNU Privacy Guard suite of projects, including GPG, libgcrypt, and gpgsm (the S/MIME engine). It supports Python 3.8+, and depends on the SWIG C interface for Python as well as the GnuPG software and libraries.
A more comprehensive GPGME Python Bindings HOWTO is available with the source, and an HTML version is available at https://files.au.adversary.org. Python 3 sample scripts from the examples in the HOWTO are also provided with the source and are accessible at gnupg.org.
Available under the same terms as the rest of the GnuPG Project: GPLv2 and LGPLv2.1, both with the "or any later version" clause.
Installation
Included by default when compiling GPGME if the configure script locates a supported Python version (which it will if one is in $PATH during configuration).
Example
import gpg

# Encryption to the public key specified in rkey.
a_key = input("Enter the fingerprint or key ID to encrypt to: ")
filename = input("Enter the filename to encrypt: ")
with open(filename, "rb") as afile:
    text = afile.read()
c = gpg.core.Context(armor=True)
rkey = list(c.keylist(pattern=a_key, secret=False))
ciphertext, result, sign_result = c.encrypt(text, recipients=rkey,
                                            always_trust=True,
                                            add_encrypt_to=True)
with open("{0}.asc".format(filename), "wb") as bfile:
    bfile.write(ciphertext)

# Decryption with the corresponding secret key
# invokes gpg-agent and pinentry.
with open("{0}.asc".format(filename), "rb") as cfile:
    plaintext, result, verify_result = gpg.Context().decrypt(cfile)
with open("new-{0}".format(filename), "wb") as dfile:
    dfile.write(plaintext)

# Matching the data.
# Running a diff on filename and the new filename should also match.
if text == plaintext:
    print("Hang on ... did you say *all* of GnuPG? Yep.")
else:
    pass
Machine Learning

Python has a vast number of libraries for data analysis, statistics, and Machine Learning itself, making it a language of choice for many data scientists.
Some widely used packages for Machine Learning and other data science applications are listed below.
SciPy Stack
The SciPy stack consists of a set of core packages used in data science for statistical analysis and visualising data. Because of its wide range of functionality and ease of use, the Stack is considered a must-have for most data science applications.
The Stack consists of the following packages:
- NumPy
- SciPy library
- Matplotlib
- pandas
- SymPy
- IPython
- nose
The stack also comes with Python bundled in, but Python has been excluded from the above list.
Installation
For installing the full stack, or individual packages, you can refer to the instructions given here.
NB: Anaconda is highly preferred and recommended for installing and maintaining data science packages seamlessly.
scikit-learn
scikit-learn is a free and open source machine learning library for Python. It offers off-the-shelf functions to implement many algorithms like linear regression, classifiers, SVMs, k-means, Neural Networks, etc. It also has a few sample datasets which can be directly used for training and testing.
Because of its speed, robustness, and ease of use, it's one of the most widely used libraries for many Machine Learning applications.
Installation
Through PyPI:
pip install -U scikit-learn
Through conda:
conda install scikit-learn
scikit-learn also comes shipped with Anaconda (mentioned above). For more installation instructions, refer to this link.
Example
For this example, we train a simple classifier on the Iris dataset, which comes bundled with scikit-learn.
The dataset takes four features of flowers: sepal length, sepal width, petal length, and petal width, and classifies them into three flower species (labels): setosa, versicolor, or virginica. The labels have been represented as numbers in the dataset: 0 (setosa), 1 (versicolor), and 2 (virginica).
We shuffle the Iris dataset and divide it into separate training and testing sets, keeping the last 10 data points for testing and rest for training. We then train the classifier on the training set and predict on the testing set.
from sklearn.datasets import load_iris
from sklearn import tree
from sklearn.metrics import accuracy_score
import numpy as np
#loading the iris dataset
iris = load_iris()
x = iris.data #array of the data
y = iris.target #array of labels (i.e answers) of each data entry
#getting label names i.e the three flower species
y_names = iris.target_names
#taking random indices to split the dataset into train and test
test_ids = np.random.permutation(len(x))
#splitting data and labels into train and test
#keeping last 10 entries for testing, rest for training
x_train = x[test_ids[:-10]]
x_test = x[test_ids[-10:]]
y_train = y[test_ids[:-10]]
y_test = y[test_ids[-10:]]
#classifying using decision tree
clf = tree.DecisionTreeClassifier()
#training (fitting) the classifier with the training set
clf.fit(x_train, y_train)
#predictions on the test dataset
pred = clf.predict(x_test)
print(pred)      # predicted labels i.e. flower species
print(y_test)    # actual labels
print(accuracy_score(pred, y_test) * 100)  # prediction accuracy
Since we're splitting randomly and the classifier trains on every iteration, the accuracy may vary. Running the above code gives:
[0 1 1 1 0 2 0 2 2 2]
[0 1 1 1 0 2 0 2 2 2]
100.0
The first line contains the labels (i.e. flower species) of the testing data as predicted by our classifier, and the second line contains the actual flower species as given in the dataset. We thus get an accuracy of 100% this time.
More on scikit-learn can be read in the documentation.
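The manual shuffle-and-slice split above can also be done with scikit-learn's own train_test_split helper, which shuffles and splits in one call. A sketch of the same experiment (random_state is fixed here only to make the run reproducible):

```python
from sklearn import tree
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

iris = load_iris()
# test_size=10 mirrors the "keep the last 10 entries for testing" split above
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=10, random_state=0)

clf = tree.DecisionTreeClassifier(random_state=0)
clf.fit(x_train, y_train)
accuracy = accuracy_score(y_test, clf.predict(x_test)) * 100
```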
Interfacing with C/C++ Libraries

C Foreign Function Interface
CFFI provides a simple to use mechanism for interfacing with C from both CPython and PyPy. It supports two modes: an inline ABI compatibility mode (example provided below), which allows you to dynamically load and run functions from executable modules (essentially exposing the same functionality as LoadLibrary or dlopen), and an API mode, which allows you to build C extension modules.
ABI Interaction
from cffi import FFI
ffi = FFI()
ffi.cdef("size_t strlen(const char*);")
clib = ffi.dlopen(None)
length = clib.strlen(b"String to be evaluated.")
# prints: 23
print(f"{length}")
ctypes
ctypes is the de facto standard library for interfacing with C/C++ from CPython. It provides not only full access to the native C interface of most major operating systems (e.g., kernel32 on Windows, or libc on *nix), but also support for loading and interfacing with dynamic libraries, such as DLLs or shared objects, at runtime. It brings along a whole host of types for interacting with system APIs, and allows you to rather easily define your own complex types, such as structs and unions, and to modify things such as padding and alignment if needed. It can be a bit crufty to use, but in conjunction with the struct module, you gain essentially full control over how your data types get translated into something usable by a pure C/C++ method.
Struct Equivalents
MyStruct.h
struct my_struct {
int a;
int b;
};
MyStruct.py
import ctypes

class my_struct(ctypes.Structure):
    _fields_ = [("a", ctypes.c_int),
                ("b", ctypes.c_int)]
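To see the correspondence with the C struct, the class can be instantiated and its raw memory inspected; on common platforms the layout matches struct.pack with two native ints. A minimal sketch:

```python
import ctypes
import struct

class my_struct(ctypes.Structure):
    _fields_ = [("a", ctypes.c_int),
                ("b", ctypes.c_int)]

s = my_struct(a=1, b=2)
# bytes() exposes the raw memory layout of the structure
raw = bytes(s)
# "ii" is two native-order C ints, matching the _fields_ above
assert raw == struct.pack("ii", 1, 2)
```

This byte-level equivalence is what lets a my_struct instance be passed by pointer into a C function expecting struct my_struct *.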
SWIG
SWIG, though not strictly Python focused (it supports a large number of scripting languages), is a tool for generating bindings for interpreted languages from C/C++ header files. It is extremely simple to use: the consumer simply needs to define an interface file (detailed in the tutorial and documentation), include the requisite C/C++ headers, and run the build tool against them. While it does have some limits (it currently seems to have issues with a small subset of newer C++ features, and getting template-heavy code to work can be a bit verbose), it provides a great deal of power and exposes lots of features to Python with little effort. Additionally, you can easily extend the bindings SWIG creates (in the interface file) to overload operators and built-in methods, effectively recast C++ exceptions to be catchable by Python, etc.
Example: Overloading __repr__
MyClass.h
#include <string>
class MyClass {
private:
    std::string name;
public:
    std::string getName();
};
myclass.i
%include "string.i"
%module myclass
%{
#include <string>
#include "MyClass.h"
%}
%extend MyClass {
    std::string __repr__()
    {
        return $self->getName();
    }
}
%include "MyClass.h"
Boost.Python
Boost.Python requires a bit more manual work to expose C++ object functionality, but it is capable of providing all the same features SWIG does and then some, to include providing wrappers to access PyObjects in C++, extracting SWIG wrapper objects, and even embedding bits of Python into your C++ code.
Publishing Your Code
A healthy open source project needs a place to publish its code and project management stuff so other developers can collaborate with you. This lets your users gain a better understanding of your code, keep up with new developments, report bugs, and contribute code.
This development web site should include the source code history itself, a bug tracker, a patch submission (aka "Pull Request") queue, and possibly additional developer-oriented documentation.
There are several free open source project hosting sites (aka "forges"). These include GitHub, SourceForge, Bitbucket, and GitLab. GitHub is currently the best. Use GitHub.
Creating a Project Repo on GitHub
To publish your Python project on GitHub:
- Create a GitHub account if you don't already have one.
- Create a new repo for your project.
- Click on the "+" menu next to your avatar in the upper right of the page and choose "New repository".
- Name it after your project and give it an SEO-friendly description.
- If you don't have an existing project repo, choose the settings to add a README, .gitignore, and license. Use the Python .gitignore option.
- On the newly created repo page, click "Manage topics" and add the tags "python" and "python3" and/or "python2" as appropriate.
- Include a link to your new GitHub repo in your project's README file so people who just have the project distribution know where to find it.
If this is a brand new repo, clone it to your local machine and start working:
$ git clone https://github.com/<username>/<projectname>
Or, if you already have a project Git repo, add your new GitHub repo as a remote:
$ cd <projectname>
$ git remote add origin https://github.com/<username>/<projectname>
$ git push --tags
When Your Project Grows
For more information about managing an open source software project, see the book Producing Open Source Software.
Packaging Your Code

Package your code to share it with other developers. For example, to share a library for other developers to use in their application, or for development tools like "py.test".
An advantage of this method of distribution is its well established ecosystem of tools such as PyPI and pip, which make it easy for other developers to download and install your package either for casual experiments, or as part of large, professional systems.
It is a well-established convention for Python code to be shared this way. If your code isn't packaged on PyPI, then it will be harder for other developers to find it and to use it as part of their existing process. They will regard such projects with substantial suspicion of being either badly managed or abandoned.
The downside of distributing code like this is that it relies on the recipient understanding how to install the required version of Python, and being able and willing to use tools such as pip to install your code's other dependencies. This is fine when distributing to other developers, but makes this method unsuitable for distributing applications to end-users.
The Python Packaging Guide provides an extensive guide on creating and maintaining Python packages.
Alternatives to Packaging
To distribute applications to end-users, you should freeze your application <freezing-your-code-ref>.
On Linux, you may also want to consider creating a Linux distro package <packaging-for-linux-distributions-ref> (e.g. a .deb file for Debian or Ubuntu.)
For Python Developers
If you're writing an open source Python module, PyPI, also known as The Cheeseshop, is the place to host it.
Pip vs. easy_install
Personal PyPI
If you want to install packages from a source other than PyPI (say, if your packages are proprietary), you can do it by hosting a simple HTTP server, running from the directory which holds those packages which need to be installed.
An example is always beneficial. Say you want to install a package called MyPackage.tar.gz, and this is your directory structure:
archive/
    MyPackage/
        MyPackage.tar.gz
Go to your command prompt and type:
$ cd archive
$ python -m http.server 9000
This runs a simple HTTP server on port 9000 that lists all packages (like MyPackage). Now you can install MyPackage using any Python package installer. Using pip, you would do it like:
$ pip install --extra-index-url=http://127.0.0.1:9000/ MyPackage
Having a folder with the same name as the package name is crucial here. I got fooled by that, one time. But if you feel that creating a folder called MyPackage and keeping MyPackage.tar.gz inside that is redundant, you can still install MyPackage using:
$ pip install http://127.0.0.1:9000/MyPackage.tar.gz
pypiserver
pypiserver is a minimal PyPI compatible server. It can be used to serve a set of packages to easy_install or pip. It includes helpful features like an administrative command (-U) which will update all its packages to their latest versions found on PyPI.
S3-Hosted PyPI
One simple option for a personal PyPI server is to use Amazon S3. A prerequisite for this is that you have an Amazon AWS account with an S3 bucket.
- Install all your requirements from PyPI or another source
- Install pip2pi:
pip install git+https://github.com/wolever/pip2pi.git
- Follow the pip2pi README for the pip2tgz and dir2pi commands:
pip2tgz packages/ YourPackage (or pip2tgz packages/ -r requirements.txt)
dir2pi packages/
- Upload the new files:
  - Use a client like Cyberduck to sync the entire packages folder to your S3 bucket.
  - Make sure you upload packages/simple/index.html as well as all new files and directories.
- Fix new file permissions:
  - By default, when you upload new files to the S3 bucket, they will have the wrong permissions set.
  - Use the Amazon web console to set the READ permission of the files to EVERYONE.
  - If you get HTTP 403 when trying to install a package, make sure you've set the permissions correctly.
- All done. You can now install your package with:
pip install --index-url=http://your-s3-bucket/packages/simple/ YourPackage
For Linux Distributions
Creating a Linux distro package is arguably the "right way" to distribute code on Linux.
Because a distribution package doesn"t include the Python interpreter, it makes the download and install about 2-12 MB smaller than freezing your application <freezing-your-code-ref>.
Also, if a distribution releases a new security update for Python, then your application will automatically start using that new version of Python.
The bdist_rpm command makes producing an RPM file for use by distributions like Red Hat or SuSE trivially easy.
However, creating and maintaining the different configurations required for each distribution's format (e.g. .deb for Debian/Ubuntu, .rpm for Red Hat/Fedora, etc.) is a fair amount of work. If your code is an application that you plan to distribute on other platforms, then you'll also have to create and maintain the separate config required to freeze your application for Windows and OS X. It would be much less work to simply create and maintain a single config for one of the cross platform freezing tools <freezing-your-code-ref>, which will produce stand-alone executables for all distributions of Linux, as well as Windows and OS X.
Creating a distribution package is also problematic if your code is for a version of Python that isn't currently supported by a distribution. Having to tell some versions of Ubuntu end-users that they need to add the "deadsnakes" PPA using sudo add-apt-repository commands before they can install your .deb file makes for an extremely hostile user experience. Not only that, but you'd have to maintain a custom equivalent of these instructions for every distribution, and worse, have your users read, understand, and act on them.
Having said all that, here's how to do it:
Useful Tools
- fpm
- alien
- dh-virtualenv (for APT/DEB omnibus packaging)
Freezing Your Code

"Freezing" your code means creating a single executable file to distribute to end-users, one that contains all of your application code as well as the Python interpreter.
Applications such as "Dropbox", "Eve Online", "Civilization IV", and BitTorrent clients do this.
The advantage of distributing this way is that your application will "just work", even if the user doesn"t already have the required version of Python (or any) installed. On Windows, and even on many Linux distributions and OS X, the right version of Python will not already be installed.
Besides, end-user software should always be in an executable format. Files ending in .py are for software engineers and system administrators.
One disadvantage of freezing is that it will increase the size of your distribution by about 2-12 MB. Also, you will be responsible for shipping updated versions of your application when security vulnerabilities in Python are patched.
Alternatives to Freezing
Packaging your code <packaging-your-code-ref> is for distributing libraries or tools to other developers.
On Linux, an alternative to freezing is to create a Linux distro package <packaging-for-linux-distributions-ref> (e.g. .deb files for Debian or Ubuntu, or .rpm files for Red Hat and SuSE.)
::: todo Fill in "Freezing Your Code" stub
Comparison of Freezing Tools
Date of this writing: Oct 5, 2019. Solutions and platforms/features supported:

Solution     Windows  Linux  OS X  Python 3  License  One-file mode  Zipfile import  Eggs  pkg_resources support  Latest release date
bbFreeze     yes      yes    yes   no        MIT      no             yes             yes   yes                    Jan 20, 2014
py2exe       yes      no     no    yes       MIT      yes            yes             no    no                     Oct 21, 2014
pyInstaller  yes      yes    yes   yes       GPL      yes            no              yes   no                     Jul 9, 2019
cx_Freeze    yes      yes    yes   yes       PSF      no             yes             yes   no                     Aug 29, 2019
py2app       no       no     yes   yes       MIT      no             yes             yes   yes                    Mar 25, 2019
Note: Freezing Python code on Linux into a Windows executable was once supported in PyInstaller, but was later dropped.

Note: All solutions need Microsoft Visual C++ to be installed on the target machine, except py2app. Only PyInstaller makes a self-executable exe that bundles the appropriate DLL when passing --onefile to Configure.py.
Windows
bbFreeze
Prerequisite is to install Python, Setuptools and pywin32 dependency on Windows <install-windows>.
- Install bbfreeze:
$ pip install bbfreeze
- Write a most basic bb_setup.py:
from bbfreeze import Freezer

freezer = Freezer(distdir='dist')
freezer.addScript('foobar.py', gui_only=True)
freezer()
Note: This will work for the most basic one-file scripts. For more advanced freezing you will have to provide include and exclude paths like so:
freezer = Freezer(distdir='dist', includes=['my_code'], excludes=['docs'])
- (Optionally) include an icon:
freezer.setIcon('my_awesome_icon.ico')
- Provide the Microsoft Visual C++ runtime DLL for the freezer. It might be possible to append your sys.path with the Microsoft Visual Studio path, but I find it easier to drop msvcp90.dll in the same folder where your script resides.
- Freeze!
$ python bb_setup.py
py2exe
Prerequisite is to install Python on Windows <install-windows>. The last release of py2exe is from 2014; there is no active development.
- Download and install https://sourceforge.net/projects/py2exe/files/py2exe/
- Write setup.py (list of configuration options):
from distutils.core import setup
import py2exe

setup(
    windows=[{'script': 'foobar.py'}],
)
- (Optionally) include an icon
- (Optionally) use one-file mode
- Generate the .exe into the dist directory:
$ python setup.py py2exe
- Provide the Microsoft Visual C++ runtime DLL. Two options: globally install the DLL on the target machine, or distribute the DLL alongside the .exe.
PyInstaller
Prerequisite is to have installed Python, Setuptools and pywin32 dependency on Windows <install-windows>.
OS X
py2app
PyInstaller
PyInstaller can be used to build Unix executables and windowed apps on Mac OS X 10.6 (Snow Leopard) or newer.
To install PyInstaller, use pip:
$ pip install pyinstaller
To create a standard Unix executable, from say script.py, use:
$ pyinstaller script.py
This creates:
- a script.spec file, analogous to a makefile
- a build folder, that holds some log files
- a dist folder, that holds the main executable script, and some dependent Python libraries
all in the same folder as script.py. PyInstaller puts all the Python libraries used in script.py into the dist folder, so when distributing the executable, distribute the whole dist folder.
The script.spec file can be edited to customise the build, with options such as:
- bundling data files with the executable
- including run-time libraries (.dll or .so files) that PyInstaller can't infer automatically
- adding Python run-time options to the executable
Now script.spec can be run with pyinstaller (instead of using script.py again):
$ pyinstaller script.spec
To create a standalone windowed OS X application, use the --windowed option:
$ pyinstaller --windowed script.spec
This creates a script.app in the dist folder. Make sure to use GUI packages in your Python code, like PyQt or PySide, to control the graphical parts of the app.
There are several options in script.spec related to Mac OS X app bundles here. For example, to specify an icon for the app, use the icon='path/to/icon.icns' option.
Linux
bbFreeze
Warning: bbFreeze will ONLY work in a Python 2.x environment, since it's no longer being maintained, as stated by its former maintainer. If you're interested in it, check the repository here.

bbFreeze can be used with all distributions that have Python installed along with pip2 and/or easy_install.
For pip2, use the following:
$ pip2 install bbfreeze
Or, for easy_install:
$ easy_install bbfreeze
With bbFreeze installed, you're ready to freeze your applications.

Let's assume you have a script, say "hello.py", and a module called "module.py" with a function that's used in your script. No need to worry: you can just freeze the main entry point of your script and the rest is frozen with it:
$ bbfreeze hello.py
With this, it creates a folder called dist/, which contains the executable of the script and the required .so (shared object) files linked against libraries used within the Python script.
Alternatively, you can create a script that does the freezing for you. An API for the freezer is available from the library within:
from bbfreeze import Freezer
freezer = Freezer(distdir='dist')
freezer.addScript('script.py', gui_only=True) # Enable gui_only kwarg for app that uses GUI packages.
freezer()
PyInstaller
PyInstaller can be used in a similar fashion as in OS X. The installation goes in the same manner as shown in the OS X section.
Don't forget to have dependencies such as Python and pip installed for usage.
Introduction

From the official Python website:
Python is a general-purpose, high-level programming language similar to Tcl, Perl, Ruby, Scheme, or Java. Some of its main key features include:

- Very clear, readable syntax

  Python's philosophy focuses on readability, from code blocks delineated with significant whitespace to intuitive keywords in place of inscrutable punctuation.

- Extensive standard libraries and third party modules for virtually any task

  Python is sometimes described with the words "batteries included" because of its extensive standard library, which includes modules for regular expressions, file IO, fraction handling, object serialization, and much more.

  Additionally, the Python Package Index is available for users to submit their packages for widespread use, similar to Perl's CPAN. There is a thriving community of very powerful Python frameworks and tools like the Django web framework and the NumPy set of math routines.

- Integration with other systems

  Python can integrate with Java libraries, enabling it to be used with the rich Java environment that corporate programmers are used to. It can also be extended by C or C++ modules when speed is of the essence.

- Ubiquity on computers

  Python is available on Windows, *nix, and Mac. It runs wherever the Java virtual machine runs, and the reference implementation CPython can help bring Python to wherever there is a working C compiler.

- Friendly community

  Python has a vibrant and large community <the-community> which maintains wikis, conferences, countless repositories, mailing lists, IRC channels, and so much more. Heck, the Python community is even helping to write this guide!
About This Guide
Purpose
The Hitchhiker's Guide to Python exists to provide both novice and expert Python developers a best practice handbook for the installation, configuration, and usage of Python on a daily basis.
By the Community
This guide is architected and maintained by Kenneth Reitz in an open fashion. This is a community-driven effort that serves one purpose: to serve the community.
For the Community
All contributions to the Guide are welcome, from Pythonistas of all levels. If you think there's a gap in what the Guide covers, fork the Guide on GitHub and submit a pull request.
Contributions are welcome from everyone, whether they're an old hand or a first-time Pythonista, and the authors of the Guide will gladly help if you have any questions about the appropriateness, completeness, or accuracy of a contribution.
To get started working on The Hitchhiker's Guide, see the /notes/contribute page.
The Community

BDFL
Guido van Rossum, the creator of Python, is often referred to as the BDFL --- the Benevolent Dictator For Life. (He stepped down from this role in 2018; Python is now governed by an elected Steering Council.)
Python Software Foundation
The mission of the Python Software Foundation is to promote, protect, and advance the Python programming language, and to support and facilitate the growth of a diverse and international community of Python programmers.
PEPs
PEPs are Python Enhancement Proposals. They describe changes to Python itself, or the standards around it.
There are three different types of PEPs (as defined by PEP 1):
Standards
   Describes a new feature or implementation.

Informational
   Describes a design issue, general guidelines, or information to the community.

Process
   Describes a process related to Python.
Notable PEPs
There are a few PEPs that could be considered required reading:
PEP 8: The Python Style Guide.
   Read this. All of it. Follow it.

PEP 20: The Zen of Python.
   A list of 19 statements that briefly explain the philosophy behind Python.

PEP 257: Docstring Conventions.
   Gives guidelines for semantics and conventions associated with Python docstrings.
You can read more at The PEP Index.
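As a small illustration of the PEP 257 conventions mentioned above — a one-line summary, a blank line, then further detail — here is a hypothetical function written in that style:

```python
def answer_to_everything():
    """Return the Ultimate Answer.

    Deep Thought took seven and a half million years to compute
    this value, so we skip the wait and simply return it.
    """
    return 42

# The first docstring line is the short summary tools display.
print(answer_to_everything.__doc__.splitlines()[0])
```

Tools such as help() and pydoc pick up that summary line automatically.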
Submitting a PEP
PEPs are peer-reviewed and accepted/rejected after much discussion. Anyone can write and submit a PEP for review.
Here's an overview of the PEP acceptance workflow:

Python Conferences
The major events for the Python community are developer conferences. The two most notable conferences are PyCon, which is held in the US, and its European sibling, EuroPython.
A comprehensive list of conferences is maintained at pycon.org.
Python User Groups
User Groups are where a bunch of Python developers meet to present or talk about Python topics of interest. A list of local user groups is maintained at the Python Software Foundation Wiki.
Online Communities
PythonistaCafe is an invite-only, online community of Python and software development enthusiasts helping each other succeed and grow. Think of it as a club of mutual improvement for Pythonistas where a broad range of programming questions, career advice, and other topics are discussed every day.
Python Job Boards
Python Jobs HQ is a Python job board, by Python Developers for Python Developers. The site aggregates Python job postings from across the web and also allows employers to post Python job openings directly on the site.
Learning Python

Beginner
The Python Tutorial
This is the official tutorial. It covers all the basics, and offers a tour of the language and the standard library. Recommended for those who need a quick-start guide to the language.
Real Python
Real Python is a repository of free and in-depth Python tutorials created by a diverse team of professional Python developers. At Real Python you can learn all things Python from the ground up. Everything from the absolute basics of Python, to web development and web scraping, to data visualization, and beyond.
Python Basics
pythonbasics.org is an introductory tutorial for beginners. The tutorial includes exercises. It covers the basics and there are also in-depth lessons like object oriented programming and regular expressions.
Python for Beginners
thepythonguru.com is a tutorial focused on beginner programmers. It covers many Python concepts in depth. It also teaches you some advanced constructs of Python like lambda expressions and regular expressions. Lastly, it finishes off with the tutorial "How to access MySQL db using Python".
Learn Python Interactive Tutorial
Learnpython.org is an easy non-intimidating way to get introduced to Python. The website takes the same approach used on the popular Try Ruby website. It has an interactive Python interpreter built into the site that allows you to go through the lessons without having to install Python locally.
Python for You and Me
If you want a more traditional book, Python For You and Me is an excellent resource for learning all aspects of the language.
Learn Python Step by Step
Techbeamers.com provides step-by-step tutorials to teach Python. Each tutorial is supplemented with logically added coding snippets and a follow-up quiz on the subject learned. There is a section of Python interview questions to help job seekers. You can also read essential Python tips and learn best coding practices for writing quality code. Here, you'll get the right platform to learn Python quickly.
Learn Python Basic to Advanced
Online Python Tutor
Online Python Tutor gives you a visual step-by-step representation of how your program runs. Python Tutor helps people overcome a fundamental barrier to learning programming by understanding what happens as the computer executes each line of a program's source code.
Invent Your Own Computer Games with Python
This beginner's book is for those with no programming experience at all. Each chapter has the source code to a small game, using these example programs to demonstrate programming concepts to give the reader an idea of what programs "look like".
Hacking Secret Ciphers with Python
This book teaches Python programming and basic cryptography for absolute beginners. The chapters provide the source code for various ciphers, as well as programs that can break them.
Learn Python the Hard Way
This is an excellent beginner programmer's guide to Python. It covers "hello world" from the console to the web.
Crash into Python
Also known as Python for Programmers with 3 Hours, this guide gives experienced developers from other languages a crash course on Python.
Dive Into Python 3
Dive Into Python 3 is a good book for those ready to jump in to Python 3. It's a good read if you are moving from Python 2 to 3 or if you already have some experience programming in another language.
Think Python: How to Think Like a Computer Scientist
Think Python attempts to give an introduction to basic concepts in computer science through the use of the Python language. The focus was to create a book with plenty of exercises, minimal jargon, and a section in each chapter devoted to the subject of debugging.
While exploring the various features available in the Python language the author weaves in various design patterns and best practices.
The book also includes several case studies which have the reader explore the topics discussed in the book in greater detail by applying those topics to real-world examples. Case studies include assignments in GUI programming and Markov Analysis.
Python Koans
Python Koans is a port of Edgecase's Ruby Koans. It uses a test-driven approach to provide an interactive tutorial teaching basic Python concepts. By fixing assertion statements that fail in a test script, it provides sequential steps to learning Python.
For those used to languages and figuring out puzzles on their own, this can be a fun, attractive option. For those new to Python and programming, having an additional resource or reference will be helpful.
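In the spirit of Python Koans (this snippet is invented for illustration, not taken from the project), a koan is a failing assertion that the learner edits until it passes:

```python
# In a real koan the right-hand side would be a blank placeholder
# (e.g. __) for the learner to fill in; here each one is already solved.
def test_string_concatenation():
    assert "Don't " + "Panic" == "Don't Panic"

def test_list_slicing():
    assert [1, 2, 3, 4][1:3] == [2, 3]

test_string_concatenation()
test_list_slicing()
print("All koans pass")
```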
More information about test driven development can be found at these resources:
A Byte of Python
A free introductory book that teaches Python at the beginner level, it assumes no previous programming experience.
- A Byte of Python for Python 2.x
- A Byte of Python for Python 3.x
Computer Science Path on Codecademy
A Codecademy course for the absolute Python beginner. This free and interactive course teaches the basics (and beyond) of Python programming while testing the user's knowledge along the way. It also features a built-in interpreter for receiving instant feedback on your learning.
Code the blocks
Code the blocks provides free and interactive Python tutorials for beginners. It combines Python programming with a 3D environment where you "place blocks" and construct structures. The tutorials teach you how to use Python to create progressively more elaborate 3D structures, making the process of learning Python fun and engaging.
Intermediate
Python Tricks: The Book
Discover Python's best practices with simple examples and start writing even more beautiful + Pythonic code. Python Tricks: The Book shows you exactly how.
You'll master intermediate and advanced-level features in Python with practical examples and a clear narrative.
Effective Python
This book contains 59 specific ways to improve writing Pythonic code. At 227 pages, it is a very brief overview of some of the most common adaptations programmers need to make to become efficient intermediate level Python programmers.
Advanced
Pro Python
This book is for intermediate to advanced Python programmers who are looking to understand how and why Python works the way it does and how they can take their code to the next level.
Expert Python Programming
Expert Python Programming deals with best practices in programming Python and is focused on the more advanced crowd.
It starts with topics like decorators (with caching, proxy, and context manager case studies), method resolution order, using super() and meta-programming, and general PEP 8 best practices.
It has a detailed, multi-chapter case study on writing and releasing a package and eventually an application, including a chapter on using zc.buildout. Later chapters detail best practices such as writing documentation, test-driven development, version control, optimization, and profiling.
A Guide to Python"s Magic Methods
This is a collection of blog posts by Rafe Kettler which explain "magic methods" in Python. Magic methods are surrounded by double underscores (e.g. ``__init__``) and can make classes and objects behave in different and magical ways.
.. note::
   Rafekettler.com is currently down; you can go to their GitHub version directly. A PDF version is available in the repository: A Guide to Python's Magic Methods (repo on GitHub).
For Engineers and Scientists
A Primer on Scientific Programming with Python
A Primer on Scientific Programming with Python, written by Hans Petter Langtangen, mainly covers Python's usage in the scientific field. In the book, examples are chosen from mathematics and the natural sciences.
Numerical Methods in Engineering with Python
Numerical Methods in Engineering with Python, written by Jaan Kiusalaas, puts the emphasis on numerical methods and how to implement them in Python.
Miscellaneous Topics
Problem Solving with Algorithms and Data Structures
Problem Solving with Algorithms and Data Structures covers a range of data structures and algorithms. All concepts are illustrated with Python code along with interactive samples that can be run directly in the browser.
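As a taste of the kind of material such a book covers, a stack (last in, first out) can be sketched directly on top of a Python list — a hypothetical minimal version, not code from the book:

```python
class Stack:
    """A minimal LIFO stack backed by a Python list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)   # O(1) amortized

    def pop(self):
        return self._items.pop()   # O(1); raises IndexError if empty

    def __len__(self):
        return len(self._items)

s = Stack()
s.push("first")
s.push("second")
print(s.pop())   # last in, first out: prints "second"
```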
Programming Collective Intelligence
Programming Collective Intelligence introduces a wide array of basic machine learning and data mining methods. The exposition is not very mathematically formal, but rather focuses on explaining the underlying intuition and shows how to implement the algorithms in Python.
Transforming Code into Beautiful, Idiomatic Python
Transforming Code into Beautiful, Idiomatic Python is a video by Raymond Hettinger. Learn to take better advantage of Python's best features and improve existing code through a series of code transformations: "When you see this, do that instead."
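A typical transformation of that kind (this particular pair is a generic example, not taken from the talk): when you see manual index arithmetic, reach for enumerate instead.

```python
names = ["Arthur", "Ford", "Trillian"]

# When you see index-based looping...
pairs_old = []
for i in range(len(names)):
    pairs_old.append((i, names[i]))

# ...do this instead: let enumerate produce the (index, item) pairs.
pairs_new = list(enumerate(names))

assert pairs_old == pairs_new
print(pairs_new)  # [(0, 'Arthur'), (1, 'Ford'), (2, 'Trillian')]
```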
Fullstack Python
Fullstack Python offers a complete top-to-bottom resource for web development using Python, from setting up the web server to designing the front-end, choosing a database, and optimizing/scaling.
As the name suggests, it covers everything you need to build and run a complete web app from scratch.
PythonistaCafe
PythonistaCafe is an invite-only, online community of Python and software development enthusiasts helping each other succeed and grow. Think of it as a club of mutual improvement for Pythonistas where a broad range of programming questions, career advice, and other topics are discussed every day.
References
Python in a Nutshell
Python in a Nutshell, written by Alex Martelli, covers most cross-platform Python usage, from its syntax to built-in libraries to advanced topics such as writing C extensions.
The Python Language Reference
This is Python's reference manual. It covers the syntax and the core semantics of the language.
Python Essential Reference
Python Essential Reference, written by David Beazley, is the definitive reference guide to Python. It concisely explains both the core language and the most essential parts of the standard library. It covers both Python 3 and Python 2.6.
Python Pocket Reference
Python Pocket Reference, written by Mark Lutz, is an easy to use reference to the core language, with descriptions of commonly used modules and toolkits. It covers both Python 3 and Python 2.6.
Python Cookbook
Python Cookbook, written by David Beazley and Brian K. Jones, is packed with practical recipes. This book covers the core Python language as well as tasks common to a wide variety of application domains.
Writing Idiomatic Python
Writing Idiomatic Python, written by Jeff Knupp, contains the most common and important Python idioms in a format that maximizes identification and understanding. Each idiom is presented as a recommendation of a way to write some commonly used piece of code, followed by an explanation of why the idiom is important. It also contains two code samples for each idiom: the "Harmful" way to write it and the "Idiomatic" way.
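In the same before/after spirit (this example is ours, not from the book), compare a manual tally with the idiomatic ``collections.Counter``:

```python
from collections import Counter

words = ["towel", "guide", "towel", "panic", "towel"]

# "Harmful": manual key bookkeeping.
counts = {}
for word in words:
    if word not in counts:
        counts[word] = 0
    counts[word] += 1

# "Idiomatic": Counter does the bookkeeping for you.
idiomatic_counts = Counter(words)

assert counts == dict(idiomatic_counts)
print(idiomatic_counts.most_common(1))  # [('towel', 3)]
```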
Documentation

Official Documentation
The official Python Language and Library documentation can be found here:
Read the Docs
Read the Docs is a popular community project that hosts documentation for open source software. It holds documentation for many Python modules, both popular and exotic.
pydoc
pydoc is a utility that is installed when you install Python. It allows you to quickly retrieve and search for documentation from your shell. For example, if you needed a quick refresher on the time module, pulling up documentation would be as simple as:
$ pydoc time
The above command is essentially equivalent to opening the Python REPL and running:
>>> import time
>>> help(time)
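The same documentation can also be fetched as a string from within a program via the pydoc module (a small sketch; exact output formatting may vary between Python versions):

```python
import pydoc
import time

# render_doc returns the text that help(time) would print.
text = pydoc.render_doc(time)

# The first line names the documented object.
print(text.splitlines()[0])
```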
News

PyCoder's Weekly
PyCoder's Weekly is a free weekly Python newsletter for Python developers by Python developers (Projects, Articles, News, and Jobs).
Real Python
At Real Python you can learn all things Python from the ground up, with weekly free and in-depth tutorials. Everything from the absolute basics of Python, to web development and web scraping, to data visualization, and beyond.
Planet Python
This is an aggregate of Python news from a growing number of developers.
/r/python
/r/python is the Reddit Python community where users contribute and vote on Python-related news.
Talk Python Podcast
The #1 Python-focused podcast covering the people and ideas in Python.
Python Bytes Podcast
A short-form Python podcast covering recent developer headlines.
Python Weekly
Python Weekly is a free weekly newsletter featuring curated news, articles, new releases, jobs, etc. related to Python.
Python News
Python News is the news section in the official Python web site (www.python.org). It briefly highlights the news from the Python community.
Import Python Weekly
A weekly Python newsletter containing Python articles, projects, videos, and tweets delivered to your inbox, to keep your Python programming skills up to date.
Awesome Python Newsletter
A weekly overview of the most popular Python news, articles, and packages.
Contribute

Python-guide is under active development, and contributors are welcome.
If you have a feature request, suggestion, or bug report, please open a new issue on GitHub. To submit patches, please send a pull request on GitHub. Once your changes get merged back in, you'll automatically be added to the Contributors List.
Style Guide
For all contributions, please follow The Guide Style Guide.
Todo List
If you'd like to contribute, there's plenty to do. Here's a short todo list.
- Establish "use this" vs "alternatives are..." recommendations

.. todolist::
License

The Guide is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported license.
The Guide Style Guide

As with all documentation, having a consistent format helps make the document more understandable. In order to make The Guide easier to digest, all contributions should fit within the rules of this style guide where appropriate.
The Guide is written in reStructuredText.
.. note::
   Parts of The Guide may not yet match this style guide. Feel free to update those parts to be in sync with The Guide Style Guide.

.. note::
   On any page of the rendered HTML you can click "Show Source" to see how authors have styled the page.
Relevancy
Strive to keep any contributions relevant to the purpose of The Guide.

- Avoid including too much information on subjects that don't directly relate to Python development.
- Prefer to link to other sources if the information is already out there. Be sure to describe what and why you are linking.
- Cite references where needed.
- If a subject isn't directly relevant to Python, but useful in conjunction with Python (e.g., Git, GitHub, Databases), reference by linking to useful resources, and describe why it's useful to Python.
- When in doubt, ask.
Headings
Use the following styles for headings.
Chapter title:
#########
Chapter 1
#########
Page title:
*******************
Time is an Illusion
*******************
Section headings:
Lunchtime Doubly So
===================
Sub section headings:
Very Deep
---------
Prose
Wrap text lines at 78 characters. Where necessary, lines may exceed 78 characters, especially if wrapping would make the source text more difficult to read.
Use Standard American English, not British English.
Use of the serial comma (also known as the Oxford comma) is 100% non-optional. Any attempt to submit content with a missing serial comma will result in permanent banishment from this project, due to complete and total lack of taste.
Banishment? Is this a joke? Hopefully we will never have to find out.
Code Examples
Wrap all code examples at 70 characters to avoid horizontal scrollbars.
Command line examples:
.. code-block:: console

   $ run command --help
   $ ls ..
Be sure to include the $ prefix before each line for Unix console examples.
For Windows console examples, use doscon or powershell instead of console, and omit the $ prefix.
Python interpreter examples:
Label the example:

.. code-block:: python

   >>> import this
Python examples:
Give the example a descriptive title:

.. code-block:: python

   def get_answer():
       return 42
Externally Linking
- Prefer labels for well known subjects (e.g. proper nouns) when linking::

     Sphinx_ is used to document Python.

     .. _Sphinx: https://www.sphinx-doc.org

- Prefer to use descriptive labels with inline links instead of leaving bare links::

     Read the `Sphinx Tutorial <https://www.sphinx-doc.org/en/master/usage/quickstart.html>`_

- Avoid using labels such as "click here", "this", etc., preferring descriptive labels (SEO worthy) instead.
Linking to Sections in The Guide
To cross-reference other parts of this documentation, use the :ref: keyword and labels.
To make reference labels more clear and unique, always add a ``-ref`` suffix::

   .. _some-section-ref:

   Some Section
   ------------
Notes and Warnings
Make use of the appropriate admonition directives when making notes.
Notes:
.. note::
   The Hitchhiker's Guide to the Galaxy has a few things to say
   on the subject of towels. A towel, it says, is about the most
   massively useful thing an interstellar hitch hiker can have.
Warnings:
.. warning:: DON'T PANIC
TODOs
Please mark any incomplete areas of The Guide with a todo directive. To avoid cluttering the todo list, use a single todo for stub documents or large incomplete sections.
.. todo::
   Learn the Ultimate Answer to the Ultimate Question
   of Life, The Universe, and Everything