Box Plot in Python using Matplotlib

What is a Box Plot?

A box plot visualizes the distribution of data using a box and lines, called whiskers, that extend from it. It is also known as a box-and-whisker plot. The distribution is summarized by five key values, which are as follows:

  1. Minimum: Q1 - 1.5*IQR (lower whisker limit)
  2. 1st quartile (Q1): 25th percentile
  3. Median: 50th percentile
  4. 3rd quartile (Q3): 75th percentile
  5. Maximum: Q3 + 1.5*IQR (upper whisker limit)

Here IQR is the interquartile range, the distance between the first quartile (Q1) and the third quartile (Q3), that is, IQR = Q3 - Q1. Note that "minimum" and "maximum" here are the whisker limits, not necessarily the smallest and largest observations in the data.
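The five values above can be computed directly with numpy. The following is a minimal sketch; the sample data and variable names are assumptions chosen only for illustration.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
data = np.array([2, 4, 4, 5, 6, 7, 8, 9, 10, 30])

# Quartiles and median (numpy's default linear interpolation).
q1, median, q3 = np.percentile(data, [25, 50, 75])

# Interquartile range and the two whisker limits.
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

# Points outside the limits are treated as outliers.
outliers = data[(data < lower_limit) | (data > upper_limit)]

print("Q1:", q1, "Median:", median, "Q3:", q3, "IQR:", iqr)
print("Limits:", lower_limit, upper_limit)
print("Outliers:", outliers)
```

For this sample, only the value 30 lies beyond the upper limit.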

Box Plot visualization

[Figure: box plot visualization]

In a box plot, the points that fall outside the whisker limits are called outliers. A box plot of a data set helps us determine the following:

  • The number of outliers in a dataset
  • Whether the data is skewed
  • The range of the data
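These three properties can also be read off numerically. The sketch below uses matplotlib's cbook.boxplot_stats() helper, which computes the statistics behind a box plot without drawing it. The sample data and the simple skew check (comparing the two halves of the box) are assumptions made for illustration, not a formal skewness test.

```python
import numpy as np
from matplotlib.cbook import boxplot_stats

# Hypothetical sample data, chosen only for illustration.
data = np.array([1, 2, 2, 3, 3, 4, 5, 7, 9, 25])

# boxplot_stats returns one dictionary of statistics per data set.
stats = boxplot_stats(data)[0]

# 1. Number of outliers: points beyond the whisker limits ("fliers").
n_outliers = len(stats["fliers"])

# 2. Rough skew check: compare the upper and lower halves of the box.
upper_half = stats["q3"] - stats["med"]
lower_half = stats["med"] - stats["q1"]
skew = "right" if upper_half > lower_half else "left or symmetric"

# 3. Range covered by the whiskers, and the full range of the data.
whisker_range = (stats["whislo"], stats["whishi"])
full_range = (data.min(), data.max())

print("Outliers:", n_outliers, stats["fliers"])
print("Skew hint:", skew)
print("Whisker range:", whisker_range, "Full range:", full_range)
```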

The span from the minimum to the maximum defined above is the range covered by the whiskers. In Python, we will use the pyplot submodule of matplotlib, which provides a built-in function named boxplot() that can create a box plot of any data set.

Syntax:

The boxplot() function has many parameters which can be used to customize the box plot of a data set.

  • data: The data to be plotted, given as an array or a sequence of arrays (Matplotlib names this first parameter x).
  • notch: This parameter accepts a Boolean value. If it is set to True, the box is drawn with a notch around the median; otherwise, a rectangular box is drawn.
  • vert: This parameter accepts a Boolean value. If it is set to True, the boxes are drawn vertically; otherwise, they are drawn horizontally.
  • positions: It accepts an array of numbers which defines the position of each box.
  • widths: It accepts a single number or an array of numbers which defines the width of each box.
  • patch_artist: This optional parameter accepts a Boolean value. If it is set to True, the boxes are drawn as filled patches whose face color can be set; otherwise, they are drawn as lines.
  • labels: It accepts a sequence of strings which define the label for each data set.
  • meanline: This optional parameter accepts a Boolean value. If it is set to True (together with showmeans), the mean is drawn as a line across the box.
  • zorder: It sets the drawing order (z-order) of the boxplot.
  • bootstrap: It accepts an integer value, which specifies the number of bootstrap resamples used to compute the confidence interval around the median in a notched boxplot.

Example 1:

We will create a random data set as a numpy array and draw its box plot.
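A minimal sketch of such an example is given here. The seed, the distribution parameters, and the figure size are assumptions and can be changed freely.

```python
import numpy as np
import matplotlib.pyplot as plt

# Create a random data set (assumed: 200 normally distributed values).
np.random.seed(10)
data = np.random.normal(100, 20, 200)

# Create the figure and draw the box plot.
fig = plt.figure(figsize=(10, 7))
result = plt.boxplot(data)

# plt.show()  # uncomment to display the figure
```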

Output:

[Figure: box plot of the random data set]

Explanation:

In the above code, we first imported the numpy and matplotlib libraries. Then we created a random data set and plotted its box plot using the boxplot() function.

Example 2:

We can draw multiple box plots side by side in the same figure.
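A minimal sketch of this, assuming four normally distributed data sets with arbitrarily chosen means and spreads:

```python
import numpy as np
import matplotlib.pyplot as plt

# Create four random data sets (parameters are illustrative assumptions).
np.random.seed(10)
data_1 = np.random.normal(100, 10, 200)
data_2 = np.random.normal(90, 20, 200)
data_3 = np.random.normal(80, 30, 200)
data_4 = np.random.normal(70, 40, 200)

# Passing a list of arrays draws one box per data set.
data = [data_1, data_2, data_3, data_4]

fig, ax = plt.subplots(figsize=(10, 7))
result = ax.boxplot(data)

# plt.show()  # uncomment to display the figure
```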

Output:

[Figure: four box plots drawn in one figure]

Explanation:

In the above code, we have created four data sets using the random functions of numpy. Then we placed the four data sets in a list and passed this list to the boxplot() function.

Example 3:

We can use some parameters of the boxplot() function to customize the plot.
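A minimal sketch of a customized plot follows. The data, the color names, the line width, and the label strings are assumptions made for illustration. The labels are set on the axis rather than through the labels parameter, and newer Matplotlib releases also offer orientation="horizontal" as an alternative to vert.

```python
import numpy as np
import matplotlib.pyplot as plt

# Create four random data sets (parameters are illustrative assumptions).
np.random.seed(10)
params = [(100, 10), (90, 20), (80, 30), (70, 40)]
data = [np.random.normal(loc, scale, 200) for loc, scale in params]

fig, ax = plt.subplots(figsize=(10, 7))

# patch_artist=True makes the boxes fillable; vert=False draws them horizontally.
result = ax.boxplot(data, patch_artist=True, vert=False)

# Give each box its own face color and a thicker outline.
colors = ["tomato", "gold", "lightgreen", "skyblue"]
for box, color in zip(result["boxes"], colors):
    box.set_facecolor(color)
    box.set_linewidth(2)

# Label each box on the y-axis (the boxes are horizontal).
ax.set_yticklabels(["data_1", "data_2", "data_3", "data_4"])

# plt.show()  # uncomment to display the figure
```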

Output:

[Figure: four colored, labeled horizontal box plots]

Explanation:

In the above code, we have created four data sets using random functions and placed them in a list. We then set a different color for each box using a list of colors and the set_facecolor() function.

We have also set the line width and the label of each box plot. We have set the parameter vert=0, which means all the boxes are drawn horizontally.





