Overview of Databases
Contents
Overview
Databases are ubiquitous, which means, they are everywhere. There are databases on your smartphone. There are databases behind every social media platform (Facebook, Twitter, Snapchat, etc.). There are databases in nearly every modern computer game. There are databases used in nearly every business that exists today. And the list goes on and on where you will find databases. It is critical for programmers to gain an understanding of what databases are, how to work with them, and how software and businesses use them to manage their operations. In this section, we are going to learn how to connect to a database from Python, how to read data from the database, and how to add data to a database from Python as well.
Databases were invented over 50 years ago. Today, there are many different types of databases and companies that provide database products and solutions. But what is a database?
This is a very generic definition of a database. It is important to understand that there are many types of databases, which means it is difficult to provide a single all-encompassing definition. So, rather than attempt to arrive at a detailed definition, we will focus on relational databases which are very common in business applications. Keep in mind though that while we will focus on relational, you may find yourself working with any of the other types in your career.
Relational databases are called relational because they store data in structures called tables and the data in those tables are (usually) related in some way. For example, in a system that tracks students at a college, that might have a table called students that stores student information (name, address, etc.). We might also have another table called majors that stores majors that each students declare (majors here plural because a student can major in more than one field of study). We could also have a table called registration that tracks what classes each student has registered for each semester. The full system would contain many tables, all of which are related to each other in various ways. Those tables would be related by ID values (called keys) so that we can easily find any student, all of their majors, the classes they have registered for and may other detailed relationships. We can visualize this example using a diagramming method called an Entity Relationship Diagram (ERD) as seen below in Figure 1.

Relational databases use a programming language called Structured Query Language (SQL) to create queries (questions) that we ask to obtain results from the database, like the example I used above of the customer and their orders. Inside Python (and other programming languagess), we can connect to a relational database, send it a SQL statement (query, update, insert, etc.) and the database will run that SQL statement and return any results to our program. This process is very common in software development for desktop applications, mobile applications, web applications, etc. That process looks like this:

In this example, the grading system data is stored in the grading_system.db SQLite database. In the Python code, the program makes a connection to the database and passes a SELECT statement (a query) to the database asking for all students with a last name of 'Parks'. The database processes the query and returns a result set, that is, a set of all records that match the query specification (last_name = 'Parks'). In this example, there is only one student with that last name so the result set contains one record.
Don't worry about understanding the above code yet, we will cover programmatically connecting to and querying database in detail over the next several sections.
So far all of the references above to databases have been generic, but where do these databases come from? A database resides in a database engine (a.k.a. "database server") which is software that usually runs on a hardware server or in a cloud-based service. There are numerous vendors that produce database server software. Here is a short list of common database vendors and their database products. There are many other vendors and databases available, those listed below have versions available for free that you can download and practice with as you learn more about programming.
Vendor | Product(s) * |
---|---|
Microsoft |
SQL Server
SQL Server is Microsoft's flagship database system very widely used in business. They make a developer edition available for free download. You can find it here--be sure to choose the Developer Edition from that download page. Also, they provide an Integrated Development Environment (IDE) for interactively working with the database server called SQL Server Management Studio (SSMS) which you can download for free here Also, to interact with SQL Server from Python you'll want to download and install the Python SQL Driver as well. ![]() |
Open Source |
MariaDB
MariaDB is an open source database system that is widely used and a good choice for developers and students. You can download MariaDB from here. For an IDE, you can use MySQL Workbench or dbForge Studio. Also, to interact with MariaDB from Python you'll want to download the Python connector as well. ![]() |
Open Source |
PostgreSQL
PostgreSQL is also a popular open source database system that is widely used and well supported. You can download PostgreSQL from here. The most common IDE for PostgreSQL is pgAdmin which you can download for free here. ![]() |
Open Source |
SQLite
SQLite is a database engine that is very commonly used by developers to embed database functionality into their applications. SQLite is not an application in the traditional sense that you download and install, rather you download a set of library files and then use those libraries from within a programming language. You can also use an additional tool called SQLite Studio to interact with a SQLite database using a GUI front end. You can download SQLite from here and SQLite Studio from here. Also, Note: I have placed more detailed documentation on this site for SQLite & SQLite Studio on the Tools Page here. ![]() |
Oracle |
MySQL
MySQL is a very popular database system, particularly for web applications. Most hosting companies include MySQL databases as a component part of the tools they make available when hosting websites on their platforms. You can download MySQL Community Edition (free) from here. Also, I would recommend using MySQL Workbench for working with MySQL datatabases, which you can download from here. ![]() |
* Note: The links provided above were accurate as of April 2023.
The word environment is often used to refer to the combination of hardware and software working together for a particular purpose. A production environment is a term used to indicate the hardware and software used during the actual operations of a business or system. In production systems, databases are installed and configured on one or more servers on a network or in cloud-based systems. The database engine then usually runs 24/7 so that it is available to processes, users, and administrators at all times. Those systems are usually configured with tight security and access to it is restricted by Database Administrators to keep the data secure.
With the database engine running, applications can then interact with the database engine as needed and configured. Some examples of applications that might interact with the database include an accounting clerk using Microsoft Excel to calculate summaries of sales and generate graphs. Another may be a data analyst using Tableau to generate a management dashboard to display real-time trends based on transaction data in the operational database. A web developer writes Pythonscript code to process web form data entered by users that are added to the database. A Software Developer writes an application in Python used inside the company by the Marketing Department to allow sales agents to enter data about potential clients into the database. The company also maintains a data warehouse that is populated once per night by an automation process that collects daily transaction data from the operational database and loads the analytical data warehouse. Each morning an automated reporting process generates summary reports using Microsoft PowerBI that are automatically emailed to executives for review. As a programmer, you may be involved in the development of applications and processes similar to any of the examples given above, or others. Here is a visual representation of the scenarios described above:

In contrast to a production environment, a development environment is a combination of hardware and software used by software developers to write and test applications that they are creating. Development environments are often set up to simulate the production environment where the software developer's application will run once it is complete. We will create a development environment that will combine your computer (laptop or desktop) with a database engine and a SQL editor in which you will work with the database in conjunction with your Python code. All of this will be combined in your computer to create your development environment and with that configuration you will be able to create databases and interact with them, simulate a production environment similar to the diagram above. Development environments look something like these:

In this example, the development environment includes the Python programming language, Oracle MySQL database, its Workbench (GUI) tool, the PyCharm IDE, Tableau, Excel and LucidChart.

In this example, the development environment includes the Python programming language, Microsoft SQL Server database, its Management Studio (GUI) tool, the PyCharm IDE, Tableau, Excel and LucidChart.