How to use Pandas-Profiling on Google Colab

pandas-profiling is an amazing open-source library that builds on pandas. Generally, EDA starts with df.describe(), df.info(), and so on, each of which has to be run separately. pandas-profiling extends the general data frame report using a single line of code, df.profile_report(), which interactively describes the statistics; you can read more about it here.

However, pandas-profiling cannot be used straightforwardly on Colab. The code will result in an error, as below:

TypeError: concat() got an unexpected keyword argument 'join_axes'

This is because Google Colab comes with an older version of pandas-profiling (v1) pre-installed, and that version calls concat() with the join_axes argument, which has been removed from the pandas version installed on Google Colab.
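To confirm which versions a given Colab runtime actually ships with, a minimal sketch along these lines can be run in a cell. The helper name installed_version is hypothetical; it only reads package metadata through the standard library and does not import pandas or pandas-profiling.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Print the versions of the two packages involved in the join_axes error.
for name in ("pandas", "pandas-profiling"):
    print(name, installed_version(name))
```

If pandas-profiling reports a 1.x version, the installation steps in this article should apply.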

The two main commands for Google Colab are:

!pip install https://github.com/pandas-profiling/pandas-profiling/archive/master.zip

profile.to_notebook_iframe()

STEPS: Install Pandas Profiling on Google Colab.

1. Run the command below; you can also visit the link on GitHub.

!pip install https://github.com/pandas-profiling/pandas-profiling/archive/master.zip

2. Restart the kernel

3. Re-import the libraries

4. Import and read your data set

5. Define your profile report:

from pandas_profiling import ProfileReport

profile = ProfileReport(df, title='Heart Disease', html={'style': {'full_width': True}})


6. However, profile.to_widgets() will not work properly, as it is not yet fully supported on Google Colab.

7. Instead, use profile.to_notebook_iframe().

and here’s your output:

Gif by Author
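Steps 3 to 7 above can be sketched as a single cell, run after the install and kernel restart. This is only an illustration: the function name build_profile and the file name heart.csv are placeholders, not part of the original notebook. The imports sit inside the function so the cell that defines it still runs before the upgraded package is available.

```python
def build_profile(csv_path: str, title: str = "Heart Disease"):
    """Read a CSV file and define a profile report for it (steps 3 to 5)."""
    # Step 3: re-import the libraries (done lazily, inside the function).
    import pandas as pd
    from pandas_profiling import ProfileReport

    # Step 4: import and read the data set.
    df = pd.read_csv(csv_path)

    # Step 5: define the profile report with a full-width layout.
    return ProfileReport(df, title=title, html={"style": {"full_width": True}})


# Usage in a Colab cell ("heart.csv" is a placeholder path):
# profile = build_profile("heart.csv")
# profile.to_notebook_iframe()  # step 7: render inline instead of to_widgets()
```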

pandas-profiling displays a descriptive overview of the data set, showing the number of variables, observations, total missing cells, duplicate rows, memory used, and the variable types. It then generates a detailed analysis of each variable, along with class distributions, interactions, correlations, missing values, samples, and duplicated rows, which you can explore by clicking each tab.

I hope this will help you to play around with Pandas profiling.

UPDATE !!

There will be an error when you try to re-run your notebook, as below:

TypeError: load() missing 1 required positional argument: 'Loader'

This is because PyYAML 6.0 makes the Loader argument of yaml.load() mandatory, which breaks code that still calls it without one, as happens when the packages are imported on Google Colab. Hence, you'll need to change PyYAML back to the previous version by running the code below.

!pip install pyyaml==5.4.1
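Before pinning, it may be worth checking whether the runtime really has PyYAML 6.0 or newer. The sketch below is one way to do that with the standard library only; the helper name needs_pyyaml_pin is hypothetical, and it assumes the version string starts with a numeric major version.

```python
from importlib import metadata


def needs_pyyaml_pin(version: str) -> bool:
    """Return True for PyYAML 6.0 or newer, where yaml.load() requires a Loader."""
    major = int(version.split(".")[0])
    return major >= 6


try:
    current = metadata.version("PyYAML")
    print(current, "-> pin needed:", needs_pyyaml_pin(current))
except metadata.PackageNotFoundError:
    print("PyYAML is not installed")
```

If the check reports that a pin is needed, the pip command above downgrades to 5.4.1; restart the kernel afterwards so the older version is the one that gets imported.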

Thanks to Stack Overflow for this one; you can find the explanation here.

I hope this will help you to play around with Pandas profiling. Happy exploring!

8. Save your output file in HTML format so you can share it as a web page.
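A minimal sketch of this step, assuming profile is the ProfileReport object defined in step 5. The wrapper name save_report and the file name report.html are placeholders; the actual work is done by the report's to_file() method.

```python
def save_report(profile, path: str = "report.html") -> str:
    """Write the profile report to an HTML file and return the file path."""
    profile.to_file(path)
    return path


# Usage in a Colab cell:
# saved = save_report(profile)
# from google.colab import files
# files.download(saved)  # optional: download the HTML file to your machine
```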
