Adventures in Machine Learning

Mastering Treemaps: A Comprehensive Guide to Visualizing Hierarchical Data in Python

Introduction to Treemaps in Python

Treemaps are an effective way of visualizing hierarchical structures of data. They can be used as an alternative to pie-charts, which are limited in their display of hierarchical data.

Treemaps offer a more organized and informative way of representing data when compared to pie-charts. In this article, we will be discussing Treemaps, their advantages, and how to use the Squarify library to plot Treemaps in Python.

Advantages of Treemaps

There are several advantages of using Treemaps over other methods of visualization:

1. Hierarchical display of data

One of the most significant advantages of Treemaps is their ability to show the hierarchy of data.

They help to display the information in a logical, organized manner that is easy to understand.

2. Better representation than pie-charts

Treemaps provide a better representation of data than pie-charts. Pie-charts are limited in their ability to show hierarchical data, while Treemaps show both the hierarchy and distribution at the same time.

3. Efficient use of space

Treemaps use space effectively by organizing the data compactly.

They enable us to fit large amounts of data into a single visualization.

Using Squarify library to plot Treemaps

The Squarify library is a popular library used to plot Treemaps in Python. Installing and importing the library is the first step to plot Treemaps using Python.

You can install the library from the terminal with pip (pip install squarify); plotting also requires Matplotlib. You can then import the Squarify library using the following line of code:

import squarify

Plotting a Basic Treemap

To start plotting a Treemap, it is necessary to generate values for the rectangles. The values generated represent the sizes of the rectangles that will be displayed in the Treemap.

You can generate the sizes from any data source, such as a Pandas data frame or a Python list. The sizes themselves must be positive numbers, so categorical data first has to be aggregated, for example into a count per category.
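When the raw data is categorical, the rectangle sizes are usually the counts per category. A minimal sketch using collections.Counter from the standard library; the fruit list is made-up illustration data and does not come from any dataset in this article:

```python
from collections import Counter

# Hypothetical categorical observations (illustration only)
observations = ["apple", "banana", "apple", "cherry", "apple", "banana"]

# Count how often each category occurs, largest count first
counts = Counter(observations).most_common()

labels = [name for name, _ in counts]
sizes = [count for _, count in counts]

print(labels)  # ['apple', 'banana', 'cherry']
print(sizes)   # [3, 2, 1]
```

The resulting sizes and labels lists line up position by position, which is the shape squarify.plot() expects.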

Once you have generated the sizes, you can use the squarify.plot() function from the Squarify library to visualize the Treemap. The only required argument is sizes; the optional color and label arguments set the color and label of each rectangle, and extra keyword arguments such as alpha are passed on to Matplotlib when the rectangles are drawn.

Example:

import matplotlib.pyplot as plt

import squarify

sizes = [50, 25, 20, 5]

colors = ['#4285F4', '#DB4437', '#F4B400', '#0F9D58']

labels = ['Label 1', 'Label 2', 'Label 3', 'Label 4']

plt.figure(figsize=(6, 6))

squarify.plot(sizes=sizes, color=colors, label=labels, alpha=.7)

plt.axis('off')

plt.show()

In the above example, we have generated sizes for four rectangles and assigned colors and labels to each rectangle. Then we have plotted the Treemap using the Squarify library and visualized it using the matplotlib library.

Conclusion

Treemaps are a powerful tool for visualizing hierarchical data. They offer a better representation of data than pie-charts and help to display data in a hierarchical manner.

The Squarify library is a popular library used to plot Treemaps in Python. It is easy to install and use, and it helps to create customizable and attractive graphics.

With the help of the Squarify library, it is possible to generate informative Treemaps that can provide invaluable insights into a dataset.

Adding Labels to the Treemap

Adding labels to the rectangles in the Treemap helps to give context to the data and makes it easier to understand at a glance. The squarify library allows for labels to be added to the rectangles very easily.

Labels are passed through the label parameter of the same squarify.plot() call that draws the rectangles.

Example:

import matplotlib.pyplot as plt

import squarify

sizes = [50, 25, 20, 5]

labels = ['Label 1', 'Label 2', 'Label 3', 'Label 4']

plt.figure(figsize=(6, 6))

squarify.plot(sizes=sizes, label=labels, alpha=.7)

plt.axis('off')

plt.show()

In the above example, we have introduced labels to the Treemap. We have generated the sizes of the rectangles and assigned them labels.

We then plot the Treemap using the squarify library and add the labels using the label parameter. The alpha parameter sets the transparency of the rectangles, which helps keep the label text readable.
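Labels are ordinary strings, so they can carry the value and its share of the total as well as the name. A minimal sketch that builds such labels with plain Python; the two-line layout is just one possible choice:

```python
sizes = [50, 25, 20, 5]
names = ["Label 1", "Label 2", "Label 3", "Label 4"]

total = sum(sizes)

# One label per rectangle: name on the first line, value and share on the second
labels = [f"{name}\n{size} ({size / total:.0%})" for name, size in zip(names, sizes)]

print(labels[0])
```

The resulting list can be passed as label=labels in the squarify.plot() call shown above.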

Random Selection of Rectangle Colors

In some cases, it is not necessary to assign specific colors to the rectangles in the Treemap. Instead, we can assign colors randomly to the rectangles.

With a small addition of code, we can build a list of colors and shuffle it before passing it to Squarify. Random colors can make for a visually appealing Treemap.

Example:

import matplotlib.pyplot as plt

import squarify

import random

sizes = [50, 25, 20, 5]

labels = ['Label 1', 'Label 2', 'Label 3', 'Label 4']

colors = [plt.cm.Spectral(i/float(len(sizes))) for i in range(len(sizes))]

random.shuffle(colors)

plt.figure(figsize=(6, 6))

squarify.plot(sizes=sizes, label=labels, alpha=.7, color=colors)

plt.axis('off')

plt.show()

In the above example, we have created a color scheme by sampling the plt.cm.Spectral colormap at evenly spaced points, one for each rectangle. We then shuffle the list of colors so that each rectangle receives a randomly chosen color from that scheme.

Finally, we pass the color list to the plot function as a parameter to create the Treemap with random rectangle colors.
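If the colors do not need to come from a colormap, fully random hex colors are another option. The sketch below uses only the standard library; random_hex_colors is a hypothetical helper written for this illustration, and the seed makes the result reproducible between runs:

```python
import random

def random_hex_colors(n, seed=None):
    """Return n random colors as '#RRGGBB' strings (hypothetical helper)."""
    rng = random.Random(seed)  # a fixed seed gives the same colors on every run
    return [f"#{rng.randrange(0x1000000):06X}" for _ in range(n)]

sizes = [50, 25, 20, 5]
colors = random_hex_colors(len(sizes), seed=42)
print(colors)
```

The list can be passed as color=colors to squarify.plot(), exactly like the shuffled colormap list above.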

Changing Colors in the Treemap

There are times when we need to specify the colors in the Treemap manually. Different color schemes can be created depending on the type of data displayed.

The following code defines a list of hex color codes that we can use to set custom colors for the rectangles in the Treemap.

Example:

import matplotlib.pyplot as plt

import squarify

sizes = [50, 25, 20, 5]

labels = ['Label 1', 'Label 2', 'Label 3', 'Label 4']

colors = ['#4285F4', '#DB4437', '#F4B400', '#0F9D58']

plt.figure(figsize=(6, 6))

squarify.plot(sizes=sizes, label=labels, alpha=.7, color=colors)

plt.axis('off')

plt.show()

In the above example, we have created a list of colors and passed it to the plot function using the color parameter. Each rectangle receives the color at the same position in the list as its size and label.

This technique is useful when the Treemap has to match a specific color theme, for example on a website or a presentation slide.
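When several rectangles belong to the same group, a fixed palette keyed by category keeps the colors consistent. A minimal sketch in plain Python; the category names, the palette, and the fallback color are all assumptions made up for illustration:

```python
# Hypothetical categories, one per rectangle (illustration only)
categories = ["Sales", "Sales", "Support", "Research"]

# Hypothetical theme palette: one hex color per category
palette = {
    "Sales": "#4285F4",
    "Support": "#DB4437",
    "Research": "#0F9D58",
}
fallback = "#BDBDBD"  # used for any category missing from the palette

colors = [palette.get(category, fallback) for category in categories]
print(colors)  # ['#4285F4', '#4285F4', '#DB4437', '#0F9D58']
```

Rectangles that share a category then share a color, which makes groups easy to spot.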

Conclusion

In this article, we have covered three topics related to Treemaps – adding labels to the rectangles, randomly selecting rectangle colors and changing colors in a Treemap manually. The Squarify library makes plotting Treemaps a simple and efficient process, and the examples shown above provide a good starting point for anyone looking to use Treemaps as an alternative to pie charts.

By using Python, users can create attractive and informative Treemaps with just a few lines of code. With the right combination of size, color and label, Treemaps can provide valuable insights into data that might be otherwise challenging to analyze.

Turning off the Plot Axis

By default, when we plot a Treemap using Matplotlib, we get the axis displayed on the plot window. In most cases, the axis may not be necessary, and it could interfere with the readability of the Treemap.

To remove the axis, we can add the `plt.axis('off')` statement to the code. It can also be useful to increase the figure size to make the Treemap more visible when the axis is turned off.

Example:

import matplotlib.pyplot as plt

import squarify

sizes = [50, 25, 20, 5]

labels = ['Label 1', 'Label 2', 'Label 3', 'Label 4']

plt.figure(figsize=(8, 8))

squarify.plot(sizes=sizes, label=labels, alpha=.7)

plt.axis('off')

plt.show()

In the above example, we have increased the figure size to 8 by 8 inches and added the `plt.axis('off')` statement to remove the axis. This makes the Treemap larger and keeps the focus on the rectangles and their layout.

Plotting Treemap for a Dataset

Now, let’s look at how we can use the Squarify library to plot a Treemap for a real dataset. We will be using the Titanic dataset from the Seaborn library.

The dataset contains information about the passengers of the Titanic, including their survival status, age, gender, and class. First, we need to import the Seaborn library and the Titanic dataset.

Once we have the data, we can prepare it for plotting. We will be using the number of passengers that survived or died, grouped by gender and class, to plot the Treemap.

Example:

import seaborn as sns

import pandas as pd

import squarify

titanic = sns.load_dataset('titanic')

titanic = titanic.groupby(['sex', 'class', 'alive'], as_index=False).count()

titanic = titanic[['sex', 'class', 'alive', 'survived']]

titanic.columns = ['Gender', 'Class', 'Alive', 'Count']

print(titanic)

In the above code, we have loaded the Titanic dataset using the `sns.load_dataset('titanic')` statement. We then grouped the data by sex, class, and survival status, counted the rows in each group, and assigned the results to the `titanic` variable.
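Note that count() counts the non-null values of each column, so reading the group size from one column only works when that column has no missing values. An alternative sketch uses size(), which counts rows directly; the small data frame below is made-up toy data, not the Titanic dataset:

```python
import pandas as pd

# Made-up toy rows (illustration only, not the Titanic data)
passengers = pd.DataFrame({
    "sex":   ["female", "female", "male", "male", "male"],
    "alive": ["yes",    "no",     "no",   "no",   "yes"],
})

# size() counts the rows in each group; reset_index() turns the result
# back into ordinary columns and names the new column "Count"
counts = passengers.groupby(["sex", "alive"]).size().reset_index(name="Count")
print(counts)
```

This gives one row per group with the grouping columns and a Count column, the same shape as the prepared Titanic table.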

We then selected only the columns we need for plotting, renamed the columns, and displayed the resulting data to confirm everything is prepared. Next, we generate the sizes, labels, and colors that we will use to plot the Treemap using the Squarify library.

Example:

import matplotlib.pyplot as plt

sizes = titanic['Count'].tolist()

# 'Class' is a categorical column, so cast it to str before concatenating

labels = titanic['Gender'] + ' - ' + titanic['Class'].astype(str) + ' - ' + titanic['Alive']

colors = [plt.cm.Spectral(i/float(len(sizes))) for i in range(len(sizes))]

plt.figure(figsize=(12, 10))

squarify.plot(sizes=sizes, label=labels, color=colors, alpha=.7)

plt.axis('off')

plt.show()

In the above code, we have generated the sizes of the rectangles from the ‘Count’ column of the dataset and assigned values to the ‘labels’ variable. We have also assigned colors to each rectangle and increased the plot area to get a better view of the Treemap.

The resulting Treemap shows the passenger counts by gender, class, and survival status in the Titanic dataset. Each rectangle's area is proportional to the number of passengers in that group, so the largest rectangles mark the largest groups, whether those passengers survived or not.
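The example in the squarify documentation sorts its values in descending order before computing the layout, and grouped counts like the ones above are not sorted by size. A minimal sketch of sorting sizes and labels together in plain Python; the input lists are hypothetical stand-ins for the grouped data:

```python
# Hypothetical unsorted input (illustration only)
sizes = [20, 50, 5, 25]
labels = ["C", "A", "D", "B"]

# Sort the (size, label) pairs by size, largest first,
# so that every label stays attached to its value
pairs = sorted(zip(sizes, labels), key=lambda pair: pair[0], reverse=True)

sizes = [size for size, _ in pairs]
labels = [label for _, label in pairs]

print(sizes)   # [50, 25, 20, 5]
print(labels)  # ['A', 'B', 'C', 'D']
```

Any list of colors should be reordered the same way so that it still matches the rectangles.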

Conclusion

In this article, we have discussed how to turn off the plot axis and how to plot a Treemap using real data from a dataset. Treemaps are powerful visualization tools that allow us to display hierarchical data in a clear and organized manner.

By using Python and a few lines of code, we can create informative Treemaps that can provide invaluable insights into a dataset. With the Squarify library and Matplotlib, users can customize the appearance, size, and color of the Treemap to showcase data in a way that is visually appealing and easy to understand.

Treemaps are a powerful way of visualizing hierarchical data, offering users a better representation of data than pie-charts. In this article, we have covered various aspects of Treemaps, including their advantages, plotting Treemaps using the Squarify library, adding labels and different color schemes, and plotting Treemaps with real datasets.

With the help of Squarify and Python, users can create informative and visually appealing Treemaps with just a few lines of code. The flexibility and customization offered by Treemaps make them an essential tool for any data analyst or scientist who needs to display hierarchical data in a clear and organized manner.
