Relational databases in a packaged Python application without permanent scaring: Using alembic seamlessly

Use alembic in a database-requiring application without users noticing

Aug 26, 2021

In a previous post, we described how to write a Python application which uses a relational database in such a way that users of your application do not need to know SQL. The tool which makes this possible is Alembic. Alembic is part of SQLAlchemy. Alembic makes your application accessible to people who do not know SQL by managing the creation of database tables on their behalf. But using Alembic comes at a cost: you must tell your users to edit alembic.ini and to run alembic upgrade head. Paying this cost is worthwhile. Running alembic once is easier for a user than learning SQL and manually issuing CREATE TABLE commands. But wouldn’t it be nice if the user didn’t have to run alembic at all?

This note describes how to use alembic in a database-requiring application in such a way that the user does not have to configure alembic. In other words, this note describes how to use alembic seamlessly. Instead of editing an alembic.ini and running alembic upgrade head, the user now assigns a value to DATABASE_URL, an environment variable, and runs a script, myapp-initdb. This script only needs to be run once.

Let’s walk through the steps.

In an empty directory, let’s add content to myapp/app.py and populate myapp/alembic. The contents of myapp/app.py file and the myapp/alembic directory will familiar from the previous post. Populate myapp/alembic by running alembic init myapp/alembic. For the moment, hang on to the alembic.ini that alembic init creates. Now put the following in myapp/app.py (note the use of DATABASE_URL):

import os import sqlalchemy as sa DATABASE_URL = os.environ["DATABASE_URL"] metadata = sa.MetaData() pastes = sa.Table( "pastes", metadata, sa.Column("id", sa.Integer, primary_key=True, autoincrement=True), sa.Column("text", sa.String(length=2000), nullable=False), ) if __name__ == "__main__": engine = sa.create_engine(DATABASE_URL, future=True) with engine.connect() as conn: stmt = sa.select(pastes) rows = conn.execute(stmt) html = "\n".join(f"<p>paste {id} text: <pre>{text}</pre>" for id, text in rows) print(html)

Create a revision, as before, starting with alembic revision -m"create pastes table". In the file the command creates, make sure the upgrade function looks like this:
def upgrade(): op.create_table( "pastes", sa.Column("id", sa.Integer, primary_key=True, autoincrement=True), sa.Column("text", sa.String(length=2000), nullable=False), )
Now is where we depart from our previous instructions. First we

delete alembic.ini and
remove the line fileConfig(config.config_file_name) from myapp/alembic/env.py.

Then we put the following in a new file named myapp-initdb which we mark as executable (run chmod u+x myapp-initdb):

#!/usr/bin/env python3 """ Create the initial database schema and stamp the `head` revision. The environment variable `DATABASE_URL` must be defined. """ import logging.config import os import alembic.command import alembic.config import sqlalchemy as sa import myapp.alembic import myapp.app # typically set in `alembic.ini` logging.config.dictConfig( { "root": {"level": "WARN", "handlers": ["console"]}, "loggers": { "sqlalchemy": {"level": "WARN", "handlers": []}, "alembic": {"level": "INFO", "handlers": []}, }, "handlers": { "console": { "class": "logging.StreamHandler", "level": "NOTSET", "stream": "ext://sys.stderr", "formatter": "generic", } }, "formatters": { "generic": { "format": "%(levelname)-5.5s [%(name)s] %(message)s", "datefmt": "%H:%M:%S", } }, "version": 1, } ) connection_string = os.environ["DATABASE_URL"] alembic_path = list(myapp.alembic.__path__)[0] engine = sa.create_engine(connection_string) myapp.app.metadata.create_all(bind=engine) config = alembic.config.Config() config.set_main_option("sqlalchemy.url", connection_string) config.set_main_option("script_location", alembic_path) alembic.command.stamp(config, "head")

Make myapp-initdb executable. You’re done.

Now users of your software install your package. They run the myapp-initdb script once. Then they can run DATABASE_URL={database-url} python3 -m myapp.app to run the app, replacing {database-url} with your database URL, the value you would have assigned to sqlalchemy.url.

There are a variety of benefits to using Alembic this way. Allowing users to specify the database connection string using an environment variable lets them skip editing alembic.ini, a configuration file written using an uncommon syntax (based on the Microsoft Windows INI format). Using an environment variable is likely something they are more familiar with. It’s standard practice in the Python world and elsewhere (see Twelve-Factor app). Asking users run a script with an understandable name, myapp-initdb, is less likely to generate confusion than asking users to run a mysterious program with odd arguments (i.e., alembic upgrade head). Perhaps one day SQLAlchemy will change the behavior of Alembic to resemble this approach. Until then, you know how to use Alembic seamlessly.

Postscript

We didn’t discuss migrations. A similar approach can be used to simplify applying schema changes when the user upgrades the software. Instead of myapp-initdb, the script might be called myapp-migrate.

This is episode 4 of season 1 of Polyglot Python.

This is version 1, published on 2021-08-26. The first version appeared on 2021-08-26.

If you read this episode and value it, consider subscribing. Doing so helps you by supporting the development of new episodes. There are also subscriber-only perks.

If you’re reading this in your inbox, know that there’s a version on the web.

Polyglot Python

Relational databases in a packaged Python application without permanent scaring: Using alembic seamlessly

Use alembic in a database-requiring application without users noticing

Postscript

Discussion about this post