Relational databases in a packaged Python application without permanent scaring: Using alembic seamlessly
Use alembic in a database-requiring application without users noticing
In a previous post, we described how to write a Python application which uses a relational database in such a way that users of your application do not need to know SQL. The tool which makes this possible is Alembic. Alembic is part of SQLAlchemy. Alembic makes your application accessible to people who do not know SQL by managing the creation of database tables on their behalf. But using Alembic comes at a cost: you must tell your users to edit alembic.ini
and to run alembic upgrade head
. Paying this cost is worthwhile. Running alembic
once is easier for a user than learning SQL and manually issuing CREATE TABLE
commands. But wouldn’t it be nice if the user didn’t have to run alembic
at all?
This note describes how to use alembic in a database-requiring application in such a way that the user does not have to configure alembic. In other words, this note describes how to use alembic seamlessly. Instead of editing an alembic.ini
and running alembic upgrade head
, the user now assigns a value to DATABASE_URL
, an environment variable, and runs a script, myapp-initdb
. This script only needs to be run once.
Let’s walk through the steps.
In an empty directory, let’s add content to myapp/app.py
and populate myapp/alembic
. The contents of myapp/app.py
file and the myapp/alembic
directory will familiar from the previous post. Populate myapp/alembic
by running alembic init myapp/alembic
. For the moment, hang on to the alembic.ini
that alembic init
creates. Now put the following in myapp/app.py
(note the use of DATABASE_URL
):
import os
import sqlalchemy as sa
DATABASE_URL = os.environ["DATABASE_URL"]
metadata = sa.MetaData()
pastes = sa.Table(
"pastes",
metadata,
sa.Column("id", sa.Integer, primary_key=True, autoincrement=True),
sa.Column("text", sa.String(length=2000), nullable=False),
)
if __name__ == "__main__":
engine = sa.create_engine(DATABASE_URL, future=True)
with engine.connect() as conn:
stmt = sa.select(pastes)
rows = conn.execute(stmt)
html = "\n".join(f"<p>paste {id} text: <pre>{text}</pre>" for id, text in rows)
print(html)
Create a revision, as before, starting with alembic revision -m"create pastes table". In the file the command creates, make sure the upgrade function looks like this:def upgrade():
op.create_table(
"pastes",
sa.Column("id", sa.Integer, primary_key=True, autoincrement=True),
sa.Column("text", sa.String(length=2000), nullable=False),
)
Now is where we depart from our previous instructions. First we
delete
alembic.ini
andremove the line
fileConfig(config.config_file_name)
frommyapp/alembic/env.py
.
Then we put the following in a new file named myapp-initdb
which we mark as executable (run chmod u+x myapp-initdb
):
#!/usr/bin/env python3
"""
Create the initial database schema and stamp the `head` revision.
The environment variable `DATABASE_URL` must be defined.
"""
import logging.config
import os
import alembic.command
import alembic.config
import sqlalchemy as sa
import myapp.alembic
import myapp.app
# typically set in `alembic.ini`
logging.config.dictConfig(
{
"root": {"level": "WARN", "handlers": ["console"]},
"loggers": {
"sqlalchemy": {"level": "WARN", "handlers": []},
"alembic": {"level": "INFO", "handlers": []},
},
"handlers": {
"console": {
"class": "logging.StreamHandler",
"level": "NOTSET",
"stream": "ext://sys.stderr",
"formatter": "generic",
}
},
"formatters": {
"generic": {
"format": "%(levelname)-5.5s [%(name)s] %(message)s",
"datefmt": "%H:%M:%S",
}
},
"version": 1,
}
)
connection_string = os.environ["DATABASE_URL"]
alembic_path = list(myapp.alembic.__path__)[0]
engine = sa.create_engine(connection_string)
myapp.app.metadata.create_all(bind=engine)
config = alembic.config.Config()
config.set_main_option("sqlalchemy.url", connection_string)
config.set_main_option("script_location", alembic_path)
alembic.command.stamp(config, "head")
Make myapp-initdb
executable. You’re done.
Now users of your software install your package. They run the myapp-initdb
script once. Then they can run DATABASE_URL={database-url} python3 -m myapp.app
to run the app, replacing {database-url}
with your database URL, the value you would have assigned to sqlalchemy.url
.
There are a variety of benefits to using Alembic this way. Allowing users to specify the database connection string using an environment variable lets them skip editing alembic.ini
, a configuration file written using an uncommon syntax (based on the Microsoft Windows INI format). Using an environment variable is likely something they are more familiar with. It’s standard practice in the Python world and elsewhere (see Twelve-Factor app). Asking users run a script with an understandable name, myapp-initdb
, is less likely to generate confusion than asking users to run a mysterious program with odd arguments (i.e., alembic upgrade head
). Perhaps one day SQLAlchemy will change the behavior of Alembic to resemble this approach. Until then, you know how to use Alembic seamlessly.
Postscript
We didn’t discuss migrations. A similar approach can be used to simplify applying schema changes when the user upgrades the software. Instead of myapp-initdb
, the script might be called myapp-migrate
.
This is episode 4 of season 1 of Polyglot Python.
This is version 1, published on 2021-08-26. The first version appeared on 2021-08-26.
If you read this episode and value it, consider subscribing. Doing so helps you by supporting the development of new episodes. There are also subscriber-only perks.
If you’re reading this in your inbox, know that there’s a version on the web.