As a data engineer, you often need to connect to a database. While you probably know how to write SQL, every SQL dialect is a bit different. In this article, you'll see a way of interacting with a database purely based on Python code. This not only prevents you from having to learn a new dialect but also allows you to migrate to a different database without having to change any code.
In this article, you'll learn how to connect to a database with SQLAlchemy. Furthermore, you'll see some examples, which should get you comfortable with using the API.
Connect to a database table
Without a database connection, you can't do anything. So let's start with the database connection. SQLAlchemy makes this relatively easy for you.
SQLAlchemy turns your database into a detail. You can specify and change the dialect as you please. Whether you want to use standard SQL or something exotic, it is just a matter of configuration.
That makes it easy. Let's look together at a code snippet that is sufficient to make a connection.
url = 'mysql://username:firstname.lastname@example.org/dbname'
We start with the database URL. First, you specify the dialect. A dialect can be MySQL, for example. Then you set the username and password.
Never store this username and password in your code. In fact, it is best to save the whole configuration string as an environment variable. Don't forget that if you ever save this URL in git, it can be retrieved forever. Be careful.
Finally, you have the hostname or IP address of the database and the database name. These data are all you need to establish a connection.
The port is optional, but SQLAlchemy is smart enough to know the MySQL database resides at port 3306.
engine = create_engine(url, echo=True) connection = engine.connect()
Finally, you create the connection object and invoke the connect method.
The echo option is a way to log your SQL queries. This is important to understand what SQLAlchemy can do for you. If you really don't need it, omit it.
It's as simple as that. And that's how you connected to the database. In the next section, you're going to see how to do some simple queries.
Some example queries
Everyone in the software world knows a bit of SQL. You may be more versed in SQL than I am. But no doubt you don't remember every database dialect. And you don't need to.
An ORM or Object-Relational Mapping tool makes this unnecessary. It writes the queries for you, as long as you give a good description. An ORM makes it super easy to switch databases and has many other advantages, but let's not dwell on that too much.
SELECT * FROM book
book_table = sqlalchemy.Table('book', sqlalchemy.MetaData(), autoload=True, autoload_with=engine) query = sqlalchemy.select(book) cursor_result: CursorResult = connection.execute(query) books: List[Book] = legacy_cursor_result.fetchall()
In our first example, we are going to query all records. You can use SQLAlchemy to load a table and all its info. So you don't need to know the fields to write this query.
You create the query and execute it. Finally, you can consult the results by running the fetchall method.
As you can see, you did not have to write a single letter of SQL. So if you change the database to another database type at the top of your code, your code will still work.
DELETE FROM book WHERE ISBN = 1234
query = sqlalchemy.delete(book).where(book.columns.isbn == 1234) connection.execute(query)
In our following example, we will delete a record based on a condition. In the code above, we delete the book with ISBN 1234.
Behind, SQLAlchemy translates our instruction into the SQL query.
INSERT INTO book (ISBN, TITLE) VALUES (1234, ‘A Beautiful World')
query = sqlalchemy.insert(book).values(isbn=1234, title='A Beautiful World') connection.execute(query)
The last example shows how to add a record. You add a record with ISBN 1234 and the title ‘A Beautiful World'.
What do you think? If I've piqued your interest, be sure to give it a go yourself. This is the quickest way to find out whether you like to use something or not.
Sometimes people say that an ORM is slow. But that is not the whole story. Sure, the abstraction layer above SQL adds a little overhead. It does. But this is negligible. You can write fast code with an ORM, don't worry about that. Much more important are database design choices.
My advice is always the same. Keep it simple and try to understand what you are doing. It is always recommended to leave the echo option (see the previous section) set to true. Here you can always see which queries are generated. If the query is slow, chances are high you are doing something wrong.
You have learned in this article how to connect to the database using SQLAlchemy. Then you have seen some example queries.
Now you should understand what an ORM can do for you. SQLAlchemy allows you to switch between different database dialects easily. It makes interacting with a database a pleasure.