Adventures in Machine Learning

Mastering Horizontal Bar Charts with Matplotlib and Pandas

Creating Horizontal Bar Chart using Matplotlib and Pandas

Data visualization is an essential part of data analysis: representing data graphically makes it easier to understand and interpret. Bar charts are among the most widely used tools for this, because they make it easy to compare values across categories.

Bar charts come in several forms, one of which is the horizontal bar chart. In this article, we will explore how to create horizontal bar charts using two popular Python libraries: Matplotlib and Pandas. Pandas' plotting methods are built on top of Matplotlib, so the two work well together.

Creating a Horizontal Bar Chart using Matplotlib

The first step in creating a horizontal bar chart using Matplotlib is gathering the data to be plotted. Matplotlib can handle different types of data, such as integers, floats, and categorical data.

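In this example, we will use categorical labels (names) paired with numeric values (marks).

import matplotlib.pyplot as plt
# data: one label and one value per bar
names = ['John', 'Mary', 'Peter', 'Lucy', 'Chris']
marks = [90, 87, 68, 77, 80]
# plot the horizontal bar chart: labels go on the y-axis, values set the bar lengths
plt.barh(names, marks)
plt.show()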

In the code above, we imported the Matplotlib library, defined our data, created a horizontal bar chart, and displayed the chart using the show() method.

The barh() function plots the horizontal bar chart. Its first two arguments are the positions of the bars on the y-axis (in this case, the names of the individuals) and the widths of the bars along the x-axis (in this case, the marks scored).

Styling the chart is optional, but it usually makes the chart more visually appealing and easier to understand. We could change the color of the bars, add a title and axis labels, or add a grid to the chart.
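The styling options just mentioned can be sketched as follows. This is a minimal illustration rather than a recommended design: the colour, title, label text, and grid settings are arbitrary choices made for the example.

```python
import matplotlib.pyplot as plt

names = ['John', 'Mary', 'Peter', 'Lucy', 'Chris']
marks = [90, 87, 68, 77, 80]

fig, ax = plt.subplots()
# color sets the fill colour of every bar
bars = ax.barh(names, marks, color='steelblue')
# title and axis labels
ax.set_title('Marks scored in the exam')
ax.set_xlabel('Marks')
ax.set_ylabel('Names')
# grid lines along the value axis only, drawn behind the bars
ax.grid(axis='x', linestyle='--', alpha=0.5)
ax.set_axisbelow(True)
plt.show()
```

Working with the Axes object (ax) instead of the plt functions keeps every styling call tied to one specific chart, which helps once a script draws more than one figure.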

Creating a Horizontal Bar Chart using Pandas

Importing the necessary libraries is the first step in creating a horizontal bar chart using Pandas.

import pandas as pd
import matplotlib.pyplot as plt
# data
data = {'names': ['John', 'Mary', 'Peter', 'Lucy', 'Chris'], 'marks': [90, 87, 68, 77, 80]}
# dataframe
df = pd.DataFrame(data)
# plot the horizontal bar chart
df.plot.barh(x='names', y='marks')
plt.show()

In the code above, we first import the Pandas and Matplotlib libraries, define our data, and create a data frame from it. We then plot the chart with df.plot.barh(), where barh selects the chart type and the x and y arguments name the columns to use: x supplies the category labels and y supplies the bar lengths. Pandas draws the chart with Matplotlib, which is why plt.show() is still used to display it.
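The same chart can also be requested through the general plot() method by passing kind='barh'. The sketch below assumes the same data as above; both spellings return a Matplotlib Axes object, which can be kept for later customisation.

```python
import pandas as pd
import matplotlib.pyplot as plt

data = {'names': ['John', 'Mary', 'Peter', 'Lucy', 'Chris'],
        'marks': [90, 87, 68, 77, 80]}
df = pd.DataFrame(data)

# equivalent to df.plot.barh(x='names', y='marks')
ax = df.plot(kind='barh', x='names', y='marks')
plt.show()
```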

Conclusion

Bar charts are effective data visualisation tools that summarise and represent data in graphical form. Creating horizontal bar charts with Matplotlib and Pandas is straightforward, and the two libraries offer complementary approaches that suit different stages of data analysis.

Using Matplotlib, we can customize the chart to suit our preferences and style, while Pandas offers easy integration with other Pandas methods and an intuitive way of working with data frames. Explore these different possibilities to create insightful and informative charts for your data analysis tasks.

Adding an additional variable to the horizontal bar chart

To create more informative horizontal bar charts, we sometimes need to include additional data or variables to the chart. For example, a sales chart may include additional data such as profits or the percentage increase/decrease in sales.

In this section, we will explore how to add an additional variable to our horizontal bar chart using Pandas.
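When the additional variable is numeric, as in the sales and profits example, one option is to draw it as a second set of bars. The sketch below is only an illustration: the product names and figures are invented for the example, and passing a list of columns as y is the assumption that produces one bar per column for each category.

```python
import pandas as pd
import matplotlib.pyplot as plt

# hypothetical figures, invented purely for illustration
sales = pd.DataFrame({
    'product': ['A', 'B', 'C'],
    'sales': [120, 95, 140],
    'profit': [30, 18, 42],
})

# a list of columns for y draws grouped bars: one bar per column for each product
ax = sales.plot.barh(x='product', y=['sales', 'profit'])
ax.set_xlabel('Amount')
plt.show()
```

The rest of this section deals with a different case: an additional variable that is text rather than a number, which is shown as a label next to each bar.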

Capturing additional data for the chart

We will extend our previous example by adding another column to our data that represents a performance rating of each person.

import pandas as pd
import matplotlib.pyplot as plt
# data
data = {'names': ['John', 'Mary', 'Peter', 'Lucy', 'Chris'], 'marks': [90, 87, 68, 77, 80], 'rating': ['Excellent', 'Good', 'Fair', 'Satisfactory', 'Good']}
# dataframe
df = pd.DataFrame(data)
# plot the horizontal bar chart
df.plot.barh(x='names', y='marks')
plt.show()

In the code above, we added a new column, 'rating', to our data, which represents the performance rating of each individual. We then created a data frame from the data and plotted the horizontal bar chart as before, with 'names' as the category column (x) and 'marks' as the value column (y). The rating column is not passed to the plotting call, so it does not appear on the chart yet.

Modifying the Pandas dataframe to include additional data

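If the data frame was created without the rating column, we can add the column to the existing data frame. We then plot the chart and write each rating next to its bar.

# add rating column to dataframe (overwrites the column if it already exists)
df['rating'] = pd.Series(['Excellent', 'Good', 'Fair', 'Satisfactory', 'Good'])
# plot the updated horizontal bar chart and keep the Axes it returns
ax = df.plot.barh(x='names', y='marks', color='b', legend=False)
# add rating to each bar: x is just past the end of the bar, y is the bar's position
for index, (mark, rating) in enumerate(zip(df['marks'], df['rating'])):
    ax.text(mark + 2, index, rating, va='center')
plt.show()

We built the new column with the pd.Series() constructor and assigned it to df['rating']. In the plotting call, the color argument sets the colour of the bars and legend=False removes the legend.

The for loop then goes over the marks and ratings together. For each bar, text() writes the rating at an x position two units past the end of the bar (mark + 2) and at the y position of that bar (index). The position must be computed from the numeric mark, not from the rating, because the rating is a string and cannot be used as a coordinate.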
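As an alternative to placing the text by hand, Matplotlib 3.4 and later provide Axes.bar_label(), which attaches one label to each bar in a bar container. The sketch below assumes such a Matplotlib version and the same data as above; the padding value and the widened x-axis limit are arbitrary choices for the example.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'names': ['John', 'Mary', 'Peter', 'Lucy', 'Chris'],
    'marks': [90, 87, 68, 77, 80],
    'rating': ['Excellent', 'Good', 'Fair', 'Satisfactory', 'Good'],
})

ax = df.plot.barh(x='names', y='marks', color='b', legend=False)
# ax.containers[0] holds the bars that were just drawn;
# labels are matched to the bars in order
texts = ax.bar_label(ax.containers[0], labels=df['rating'].tolist(), padding=3)
# widen the x-axis so the labels have room beside the longest bar
ax.set_xlim(0, df['marks'].max() * 1.3)
plt.show()
```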

Styling the horizontal bar chart

Styling the horizontal bar chart is an important step in making it visually appealing and easy to read. Matplotlib and Pandas offer different ways to style the chart.

Adding style to the chart

Matplotlib comes with a plethora of style options that we can use to style the chart. A simple example is to add a title and axis labels to the chart.

import matplotlib.style as style
# set the style (this style was named 'seaborn' before Matplotlib 3.6)
style.use('seaborn-v0_8')
# plot the horizontal bar chart with title and axis labels
ax = df.plot.barh(x='names', y='marks', color='b', legend=False)
ax.set_title('Marks scored in the exam')
ax.set_xlabel('Marks')
ax.set_ylabel('Names')
# add rating to each bar, positioned just past the end of the bar
for index, (mark, rating) in enumerate(zip(df['marks'], df['rating'])):
    ax.text(mark + 2, index, rating, va='center')
plt.show()

In the code above, we used matplotlib.style to set the style of the chart to 'seaborn-v0_8', a popular style option. Matplotlib 3.6 renamed this style from 'seaborn', and the old name no longer works in recent releases. We then used the set_title(), set_xlabel(), and set_ylabel() methods to add a title and axis labels to the chart.
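Because the name of the seaborn-like style differs between Matplotlib releases ('seaborn' in older ones, 'seaborn-v0_8' in newer ones), a script that has to run on several versions can check which name is installed before using it. The following is a small sketch of that idea; the candidate list and the fallback to the 'default' style are choices made for this example.

```python
import matplotlib.pyplot as plt

# try the newer name first, then the older one
candidates = ['seaborn-v0_8', 'seaborn']
# fall back to Matplotlib's default style if neither name is installed
style_name = next((s for s in candidates if s in plt.style.available), 'default')
plt.style.use(style_name)
print(style_name)
```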

Using different styles in Matplotlib and Pandas

Because Pandas draws its charts with Matplotlib, any Matplotlib style also applies to charts created from a data frame. The style only needs to be set before the plotting call.

# set the style
plt.style.use('ggplot')
# plot the horizontal bar chart with different style
df.plot.barh(x='names', y='marks', color='g', legend=False)
plt.title('Marks scored in the exam')
plt.xlabel('Marks')
plt.ylabel('Names')
# add rating to each bar, positioned just past the end of the bar
for index, (mark, rating) in enumerate(zip(df['marks'], df['rating'])):
    plt.text(mark + 2, index, rating, va='center')
plt.show()

In the code above, we set the style of the chart to 'ggplot' using the plt.style.use() function and plotted the horizontal bar chart with green bars. We then used the plt.title(), plt.xlabel(), and plt.ylabel() functions to add a title and axis labels to the chart.
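To see which style names are installed, and to apply a style to a single chart without changing the global settings, something like the following can be used. It is a sketch under the assumption that the 'ggplot' style is present, as it is in standard Matplotlib installations.

```python
import matplotlib.pyplot as plt

# list the style names that plt.style.use() accepts on this installation
print(sorted(plt.style.available))

names = ['John', 'Mary', 'Peter', 'Lucy', 'Chris']
marks = [90, 87, 68, 77, 80]

# the style applies only to charts created inside the with-block
with plt.style.context('ggplot'):
    fig, ax = plt.subplots()
    ax.barh(names, marks, color='g')
    ax.set_title('Marks scored in the exam')
plt.show()
```

Unlike plt.style.use(), which changes the settings for everything drawn afterwards, the context manager restores the previous settings when the block ends.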

Conclusion

Adding additional variables to our horizontal bar charts can help make the charts more informative and insightful by providing context to our data. We can do this by making modifications to our data frames in Pandas and then incorporating the additional variables into our charts.

Styling the charts is also an essential step in creating visually appealing and easy-to-understand charts. By using the different styling options available to us in Matplotlib and Pandas, we can create unique and professional-looking charts for our data analysis.

Conclusion

In this article, we explored how to create horizontal bar charts using two popular data visualization libraries in Python – Matplotlib and Pandas. We discussed the steps involved in creating a horizontal bar chart using each library – gathering data, plotting the chart, adding an additional variable, and styling the chart.

Horizontal bar charts are an effective way to represent categorical data. By using Python’s data visualization libraries, we can create horizontal bar charts that effectively communicate the information that is present in the data.

We started by using Matplotlib to create a simple horizontal bar chart. Matplotlib is an excellent choice when we need complete control over the chart’s appearance.

We can customize every aspect of the chart using the numerous methods available in the Matplotlib library.

We then moved on to using Pandas to create our horizontal bar chart.

Pandas provides users with an intuitive and straightforward way to work with data frames, which makes it easy to create charts and integrate them with other Pandas methods. By creating data frames from our data, we can easily explore and visualize the data using the Pandas library.

We also looked at how we could add an additional variable to our horizontal bar chart using Pandas. This additional variable can provide more information about the data being represented in the chart, making it more informative and insightful.

Styling our horizontal bar chart is also crucial in making it appealing and easy to read. We discussed the different styling options that are available in Matplotlib and Pandas.

In conclusion, horizontal bar charts are simple yet powerful data visualization tools. Whether we use Matplotlib or Pandas, we can easily create horizontal bar charts that help us communicate the data’s information in a way that is intuitive and straightforward.

As data continues to grow in size and complexity, creating effective data visualizations is becoming increasingly essential. By mastering these tools and techniques, we can not only explore the data on a deeper level but also communicate our findings more effectively to others.

In summary, this article explored how to create horizontal bar charts using Matplotlib and Pandas. We covered the steps involved, including gathering data, plotting the chart, adding an additional variable, and styling the chart.

Horizontal bar charts are powerful data visualization tools that provide a quick and easy way to represent categorical data. By mastering the tools and techniques discussed in this article, we can create compelling visualizations that communicate data insights effectively to others.

As such, creating effective data visualizations will continue to be an important skill for data analysts and scientists.
