In today’s technology-driven workplace, data processing is a necessary skill for new professionals, programmers, and managers. In this post, we’ll go over how to retrieve data from a database in Python and introduce you to some basic concepts.
Extraction of Data from a Python Database
Data extraction involves retrieving information from a variety of databases, processing it as required, and transferring it to repositories for further study. Some kind of data transformation therefore occurs during the process. Python is one of the most common programming languages for data science. This general-purpose scripting language is used by about 8.2 million people around the world.
In the following guide, we’ll go over how to extract data using PostgreSQL, an open-source relational database system. It has a ROW_TO_JSON function that returns each row of a result set as a JSON object, enclosed in curly braces. The JSON format can make query results easier to manipulate. But first, make sure you have a virtual environment activated with a PostgreSQL driver such as psycopg2-binary installed.
Basics of Python Databases
Assume you have a PostgreSQL database of the American National Football League (NFL). It provides statistics on the players, coaches, and standings of the various teams. Take note of the following details about the data:
- The athlete id, which is the primary key, as well as the players’ first and last names, jersey numbers, weight (in kg), height (in m), and country of origin are all stored in the players’ data table. It also contains the team id, a foreign key that indicates which team each athlete belongs to.
- Coach id (primary key), along with first and last names, and team id (a foreign key referencing the teams’ table) make up the data table on coaches.
- Finally, there’s the teams’ table, which lists each football team’s name, city, conference, conference rank, and overall wins and losses (divided into “home” and “away” categories). The primary key in this table is team id, which is also included in the tables above.
Now that you’re familiar with the dataset, let’s look at how to write a SQL query to get a list of teams. For example, you may need the football teams sorted by conference and rank. You may also want the number of athletes on each roster, as well as the coaches’ names. You may also be interested in the number of wins the teams have had at home and away.
To get started, use the query below:
SELECT f.name, f.city, f.conference, f.conference_rank,
       COUNT(a.athlete_id) AS number_of_athletes,
       CONCAT(c.first_name, ' ', c.last_name) AS coach,
       f.home_wins, f.away_wins
FROM athletes a, teams f, coaches c
WHERE a.team_id = f.team_id
  AND c.team_id = f.team_id
GROUP BY f.name, c.first_name, c.last_name, f.city, f.conference,
         f.conference_rank, f.home_wins, f.away_wins
ORDER BY f.conference, f.conference_rank
After that, you can use the JSON function mentioned earlier (ROW_TO_JSON) to wrap the query. Save the wrapped query, shown below, to a file named query.sql in your current directory.
SELECT ROW_TO_JSON(team_info) FROM (
    SELECT f.name, f.city, f.conference, f.conference_rank,
           COUNT(a.athlete_id) AS number_of_athletes,
           CONCAT(c.first_name, ' ', c.last_name) AS coach,
           f.home_wins, f.away_wins
    FROM athletes a, teams f, coaches c
    WHERE a.team_id = f.team_id
      AND c.team_id = f.team_id
    GROUP BY f.name, c.first_name, c.last_name, f.city, f.conference,
             f.conference_rank, f.home_wins, f.away_wins
    ORDER BY f.conference, f.conference_rank
) AS team_info
Each row of the result has the form of a Python dictionary. The keys are simply the names of the fields returned by your query.
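To make the shape of a row concrete, here is a minimal sketch that parses one JSON object of the kind ROW_TO_JSON produces. The values in the sample string are made up for illustration only; the keys match the column names and aliases in the query above.

```python
import json

# Made-up sample row; only the keys are taken from the query above.
sample_row = (
    '{"name": "Example Team", "city": "Example City", '
    '"conference": "AFC", "conference_rank": 1, '
    '"number_of_athletes": 53, "coach": "Jane Doe", '
    '"home_wins": 5, "away_wins": 3}'
)

# Parsing the JSON text gives an ordinary Python dictionary.
team = json.loads(sample_row)

# The dictionary keys are the field names returned by the query.
for field, value in team.items():
    print(f"{field}: {value}")
```

Once a row is a dictionary, individual fields can be read by name, for example team["coach"].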
You should also avoid exposing your database credentials in plain sight in your code by storing them in environment variables. Depending on your system, you can use either of the following methods:
- For Windows users: Control Panel – System – Advanced System Settings – Advanced tab – Environment Variables
- For a Unix-like environment: add two lines to your shell initialization file, one exporting your username and one exporting your password.
With this, you’re ready to start writing Python code. Start by importing the modules and functions you’ll need: os for reading the environment variables, psycopg2 for talking to PostgreSQL, and its Error class for handling failures. These statements do so:
import os
import psycopg2 as p
from psycopg2 import Error
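With os imported, the credentials stored in environment variables can be read back in Python. The following is only a sketch: the variable names NFL_DB_USER and NFL_DB_PASSWORD and the helper read_credentials are hypothetical, so substitute whatever names you used in your own environment.

```python
import os


def read_credentials(env=None):
    """Return (user, password) read from environment variables.

    NFL_DB_USER and NFL_DB_PASSWORD are hypothetical names chosen
    for this example; they are not defined by PostgreSQL or psycopg2.
    """
    # Default to the real process environment; a plain dict can be
    # passed in instead, which is handy for testing.
    env = os.environ if env is None else env
    user = env.get("NFL_DB_USER")
    password = env.get("NFL_DB_PASSWORD")
    if not user or not password:
        raise RuntimeError("Set NFL_DB_USER and NFL_DB_PASSWORD first")
    return user, password
```

Failing early with a clear message is a design choice here: a missing variable is easier to diagnose at startup than as a rejected login later.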
Then, after loading the contents of query.sql, we’ll create the connection. Use the open and read functions to load the SQL file, and use the connect function (p.connect, given the alias above) to connect to the NFL database by specifying the database name, user, password, host, and port number.
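The two steps just described might look roughly like this. The helper names load_query and connect_to_nfl are hypothetical, and the database name "nfl", the host "localhost", and the port 5432 (PostgreSQL's default) are assumptions; adjust them to your setup.

```python
from pathlib import Path


def load_query(path="query.sql"):
    """Read the SQL text saved earlier and return it as a string."""
    return Path(path).read_text(encoding="utf-8")


def connect_to_nfl(user, password, host="localhost", port=5432, dbname="nfl"):
    """Open a connection to the (assumed) NFL database."""
    # Imported here so that load_query can still be used on a machine
    # where the psycopg2 driver is not installed.
    import psycopg2

    return psycopg2.connect(
        dbname=dbname,
        user=user,
        password=password,
        host=host,
        port=port,
    )
```

A typical call would be query = load_query() followed by conn = connect_to_nfl(user, password), with the credentials coming from your environment variables.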
How to Fetch Data From a Database in Python?
After you’ve established a database connection, you can start running queries. You’ll need a cursor object. It’s as simple as calling cursor = conn.cursor() and then cursor.execute(query). Fetching the results then gives you a list of one-element tuples, each containing a dictionary:
result = cursor.fetchall()
At this stage, try iterating over the result. You can manipulate the contents however you like, for example by feeding them into spreadsheets, HTML tables, and other applications. When you’re finished, don’t forget to clean up by closing the cursor and the connection. You can do this by wrapping your code in a try-except block and adding a finally clause.
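Putting the cursor, the fetch, and the cleanup together could look like the sketch below. The function name fetch_rows is hypothetical. It is written against the generic Python DB-API, so it accepts a psycopg2 connection but does not depend on the driver; for that reason it catches Exception broadly, whereas with psycopg2 you would normally catch psycopg2.Error.

```python
def fetch_rows(conn, query):
    """Run query and return the single value held in each one-element row.

    The cursor and the connection are closed when done, even on failure.
    """
    cursor = None
    try:
        cursor = conn.cursor()
        cursor.execute(query)
        result = cursor.fetchall()
        # Each row is a one-element tuple, so unpack its first item.
        return [row[0] for row in result]
    except Exception as error:  # with psycopg2, catch psycopg2.Error instead
        print(f"Query failed: {error}")
        return []
    finally:
        # Runs whether or not the query succeeded.
        if cursor is not None:
            cursor.close()
        conn.close()
```

Because the function closes the connection it was given, open a fresh connection for each call, or move conn.close() to the caller if you plan to run several queries.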
When working with large databases, whether relational or not, you’ll need some simple tools to query the tables, particularly if you’ll be manipulating the results. Python makes this kind of data transformation easy.
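As one small illustration of such a transformation, a list of row dictionaries can be turned into CSV text that a spreadsheet can open. This is a sketch using only the standard library; rows_to_csv is a hypothetical helper, and it assumes every dictionary has the same keys as the first one.

```python
import csv
import io


def rows_to_csv(rows):
    """Return CSV text for a list of dictionaries sharing the same keys."""
    if not rows:
        return ""
    buffer = io.StringIO()
    # The header comes from the keys of the first row.
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

The returned string can be written to a file with a .csv extension and opened in a spreadsheet application.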
In this Python database tutorial, we learned how to connect to a relational database, run queries, and fetch the results. Python allows you to do a lot more and customise your code to do whatever you want.