
Using pandas to plot data in Python

In this series of articles on Python-based plotting libraries, we’re going to look at an example of making plots using pandas, the hugely popular Python data manipulation library. Pandas is a standard tool in Python for scalably transforming data, and it has also become a popular way to import and export data in CSV and Excel formats.

On top of all that, it also contains a very nice plotting API. This is extremely convenient: you already have your data in a pandas DataFrame, so why not use the same library to plot it?

In this series, we’ll be making the same multi-bar plot in each library so we can compare how they work. The data we’ll use is UK election results from 1966 to 2020:

Data that plots itself

We’ve seen some impressively simple APIs in this series of articles, but pandas has to take the crown.

To draw a bar plot with a group for each party and the year on the x-axis, I simply need to do this:

    import matplotlib.pyplot as plt
    from votes import wide as df

    ax = df.plot.bar(x='year')

    plt.show()

Four lines: definitely the tersest multi-bar plot we’ve created in this series.

I’m using my data in wide form, meaning there’s one column per political party:

        year  conservative  labour  liberal  others
0       1966           253     364       12       1
1       1970           330     287        6       7
2   Feb 1974           297     301       14      18
..       ...           ...     ...      ...     ...
12      2015           330     232        8      80
13      2017           317     262       12      59
14      2019           365     202       11      72

This means pandas automatically knows how I want my bars grouped, and if I wanted them grouped differently, pandas makes it easy to restructure my DataFrame.
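As a rough illustration of that restructuring, here is a minimal sketch that reshapes the data from wide form to long form and back again with a different grouping. The small DataFrame is typed in by hand from the first three rows of the table above, and the names `long`, `by_party`, `party`, and `seats` are my own choices for this example, not anything defined by the original votes module.

```python
import pandas as pd

# First three rows of the table above, typed in by hand (assumption:
# the real data lives in the author's votes module, not shown here).
wide = pd.DataFrame({
    'year': ['1966', '1970', 'Feb 1974'],
    'conservative': [253, 330, 297],
    'labour': [364, 287, 301],
    'liberal': [12, 6, 14],
    'others': [1, 7, 18],
})

# Wide -> long: one row per (year, party) pair.
long = wide.melt(id_vars='year', var_name='party', value_name='seats')

# Long -> wide again, but with one column per year, so a bar plot
# of this frame would group the bars by party instead of by year.
by_party = long.pivot(index='party', columns='year', values='seats')
```

Calling `by_party.plot.bar()` on the result should then draw one group of bars per party, with one bar per election in each group.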

As with Seaborn, pandas’ plotting feature is an abstraction on top of Matplotlib, which is why you call Matplotlib’s plt.show() function to actually produce the plot.
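One way to convince yourself of this layering is to inspect what the plotting call returns. The sketch below uses a tiny hand-typed DataFrame (an assumption, standing in for the real election data) and the non-interactive Agg backend so it can run without a display.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
from matplotlib.axes import Axes
import pandas as pd

# Two rows and two parties from the table above, typed in by hand.
df = pd.DataFrame({
    'year': ['1966', '1970'],
    'conservative': [253, 330],
    'labour': [364, 287],
})

ax = df.plot.bar(x='year')

# The return value is an ordinary Matplotlib Axes object...
is_axes = isinstance(ax, Axes)
# ...holding one container of bars per party column.
n_groups = len(ax.containers)

plt.close(ax.figure)  # tidy up the figure once we are done inspecting it
```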

Here’s what it looks like:

Looks great, especially considering how easy it was! Let’s style it to look just like the Matplotlib example.

Styling it

We can easily tweak the styling by accessing the underlying Matplotlib methods.

Firstly, we can color our bars by passing a Matplotlib colormap into the plotting function:

    from matplotlib.colors import ListedColormap
    cmap = ListedColormap(['#0343df', '#e50000', '#ffff14', '#929591'])
    ax = df.plot.bar(x='year', colormap=cmap)

And we can set up axis labels and titles using the return value of the plotting function, which is simply a Matplotlib Axes object.

    ax.set_xlabel(None)
    ax.set_ylabel('Seats')
    ax.set_title('UK election results')

Here’s what it looks like now:

That’s pretty much identical to the Matplotlib version shown above but in 8 lines of code rather than 16! My inner code golfer is very pleased.
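For readers who want to try the styled plot without the votes module, here is a self-contained sketch. It assumes a hand-typed subset of the table above in place of the full dataset, and it swaps plt.show() for the Agg backend plus an in-memory PNG so that it runs in a headless environment; everything else follows the snippets above.

```python
import io

import matplotlib
matplotlib.use('Agg')  # assumption: headless run, so no plt.show()
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
import pandas as pd

# Subset of the table above, typed in by hand for illustration.
df = pd.DataFrame({
    'year': ['1966', '1970', 'Feb 1974'],
    'conservative': [253, 330, 297],
    'labour': [364, 287, 301],
    'liberal': [12, 6, 14],
    'others': [1, 7, 18],
})

# One color per party column, in column order.
cmap = ListedColormap(['#0343df', '#e50000', '#ffff14', '#929591'])
ax = df.plot.bar(x='year', colormap=cmap)

ax.set_xlabel('')  # empty string hides the x-axis label
ax.set_ylabel('Seats')
ax.set_title('UK election results')

# Render to an in-memory PNG instead of opening a window.
buf = io.BytesIO()
ax.figure.savefig(buf, format='png')
png_size = buf.getbuffer().nbytes
plt.close(ax.figure)
```

To write the image to disk instead, pass a filename to `savefig` in place of the buffer.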

Abstractions have to be escapable

As with Seaborn, the ability to drop down and access Matplotlib APIs to do the detailed tweaking was really helpful. This is a great example of giving an abstraction escape hatches to make it powerful as well as simple.
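As a small illustration of those escape hatches, the sketch below applies a few tweaks through the returned Axes that go beyond the labels and title set earlier. The specific tweaks (horizontal tick labels, a legend title, a faint horizontal grid) are my own examples rather than part of the original article, and the data is again a hand-typed subset.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so this runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Two rows and two parties from the table above, typed in by hand.
df = pd.DataFrame({
    'year': ['1966', '1970'],
    'conservative': [253, 330],
    'labour': [364, 287],
})

ax = df.plot.bar(x='year')

# Everything below is plain Matplotlib, called on the Axes pandas returned.
ax.tick_params(axis='x', labelrotation=0)   # keep year labels horizontal
legend = ax.legend(title='Party', frameon=False)
ax.grid(axis='y', alpha=0.3)                # faint horizontal grid lines
ax.set_axisbelow(True)                      # draw the grid behind the bars

legend_title = legend.get_title().get_text()
legend_labels = [t.get_text() for t in legend.get_texts()]
plt.close(ax.figure)
```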


This article is based on "How to make plots using Pandas" on Anvil’s blog and is reused with permission.
