Explore the future of Web Scraping. Request a free invite to ScrapeCon 2024

Import a CSV file into an Oracle database with Python

Using iris data, we will learn how to import a CSV file into an Oracle database using Python

Image source: hip

Prerequisites

Python 3.8.3, Oracle 11.2.0.2.0, cx_Oracle

Installing the cx_Oracle Python Package

python -m pip install **cx_Oracle**— upgrade

To perform this task, we will need to:

  1. Prepare or identify your data

  2. Connect to Oracle database and create a table

  3. Import the CSV data into the Oracle database

Step 1. Prepare or identify your data

To begin, prepare or identify the CSV file that you’d like to import to Oracle database. For example, I loaded iris data from GitHub.

import pandas as pd
irisData = pd.read_csv(‘https://github.com/Muhd-Shahid/Write-Raw-File-into-Database-Server/raw/main/iris.csv',index_col=False)
irisData.head()

image

Where, data is the comma delimiter. Each line of the CSV file is terminated by a newline character.

Step 2. Connect to the Oracle and create a table

Now let’s create connection object to connect to Oracle. The connect() constructor creates a connection to the Oracle server and returns a Connection object conn.

First, we will insure that column and datatype parameters in the Oracle table that we will create must match the number of columns and data types of the CSV file.

Data types of iris Data

Data types of iris Data

import cx_Oracle as orcCon
from cx_Oracle import DatabaseError
try:
    conn = orcCon.connect('shahid/Oracle@123@localhost/')
    if conn:
        print("cx_Oracle version:", orcCon.version)
        print("Database version:", conn.version)
        print("Client version:", orcCon.clientversion())

        # Now execute the sqlquery
        cursor = conn.cursor()
        print("You're connected.................")

        # Drop table if exists
        print('Droping iris table if exists............')
        cursor.execute("BEGIN EXECUTE IMMEDIATE 'DROP TABLE iris'; EXCEPTION WHEN OTHERS THEN NULL; END;")

        print('Creating table iris............')
        cursor.execute("CREATE TABLE iris (sepal_length number(3,1) NOT NULL, sepal_width number(3,1) NOT NULL, petal_length number(3,1) NOT NULL, petal_width number(3,1),species varchar2(10) NOT NULL)")
        print("iris table is created..............")
except DatabaseError as e:
    err, = e.args
    print("Oracle-Error-Code:", err.code)
    print("Oracle-Error-Message:", err.message)

finally:
    cursor.close()
    conn.close()

Step 3. Import the CSV data into the Oracle database

3.1. Inserting Pandas DataFrames Into Databases Using INSERT

import cx_Oracle as orcCon
from cx_Oracle import DatabaseError
try:
    #orcCon.connect('username/password@localhost')
    conn = orcCon.connect('shahid/Oracle@123@localhost/')
    if conn:
        print("cx_Oracle version:", orcCon.version)
        print("Database version:", conn.version)
        print("Client version:", orcCon.clientversion())
        cursor = conn.cursor()
        print("You're connected: ")
        print('Inserting data into table....')
        for i,row in irisData.iterrows():
            sql = "INSERT INTO iris(sepal_length,sepal_width,petal_length,petal_width,species) VALUES(:1,:2,:3,:4,:5)"
            cursor.execute(sql, tuple(row))
        # the connection is not autocommitted by default, so we must commit to save our changes
        conn.commit()
        print("Record inserted succesfully")
except DatabaseError as e:
    err, = e.args
    print("Oracle-Error-Code:", err.code)
    print("Oracle-Error-Message:", err.message)

finally:
    cursor.close()
    conn.close()

3.1.1. Query the database to check our work

let’s query the database to make sure that our inserted data has been saved correctly.

conn = orcCon.connect('shahid/Oracle@123@localhost/')
cursor = conn.cursor()
# Execute query
sql = "SELECT * FROM iris"
cursor.execute(sql)

# Fetch all the records
result = cursor.fetchall()
for i in result:
    print(i)

cursor.close()
conn.close()

image

3.2. Inserting Pandas DataFrames into a Database Using the to_sql() Function

# import the module
from sqlalchemy import create_engine
# create sqlalchemy engine
engine = create_engine('oracle://shahid:Oracle@123@localhost')
# Insert whole DataFrame into Oracle DB
irisData.to_sql('iris', con = engine, if_exists = 'append', chunksize = 1000,index=False)
print("Record inserted successfully")

What’s happening here?

  • iris is the name of table into which we want to insert our DataFrame.

  • con = engine provides the connection details (recall that we created engine using our authentication details in the previous step).

  • if_exists = 'append'checks whether the table we specified already exists or not, and then appends the new data (if it does exist) or creates a new table (if it doesn’t).

  • chunksize writes records in batches of a given size at a time. By default, all rows will be written at once.

3.2.1. Query the database to check our work

Again, let’s query the database to make sure that our inserted data has been saved correctly.

from sqlalchemy import create_engine
import pandas as pd
engine = create_engine('oracle://shahid:Oracle@123@localhost')
conn = engine.connect()
data = conn.execute("SELECT * FROM iris")
irisdf = pd.DataFrame(data.fetchall())
irisdf.columns = data.keys()
print(irisdf.head())
conn.close()

image

Conclusion

In this tutorial, We have discussed how to import Pandas DataFrames into Oracle databases using two different methods, including the highly efficient to_sql() method.

All the code for this article is available as a Jupyter Notebook on GitHub.

Thanks for reading!

Further reading: Pandas Cheatsheet.




Continue Learning