Last Updated on: 30th May 2024, 03:27 pm
Matplotlib in Python offers a versatile toolkit for creating various types of visualizations. Line plots are ideal for showcasing trends over time, while scatter plots reveal relationships between variables.
Bar charts excel in comparing discrete categories, histograms display data distributions, and box plots identify central tendency and outliers.
Pie charts provide a visual representation of proportions, and area plots depict cumulative data or compositions. Violin plots combine box plots and kernel density plots to show data distribution.
3D plots visualize complex relationships in three-dimensional space, and heatmaps represent data using color gradients.
Each visualization serves distinct purposes, from exploring trends and relationships to analyzing distributions and compositions, catering to diverse data analysis needs.
Line Plot
Use Case: Line plots are commonly used to visualize trends over time or to show the relationship between two variables. They are useful for displaying continuous data points.

    import matplotlib.pyplot as plt

    # Sample data
    x = [1, 2, 3, 4, 5]
    y = [2, 3, 5, 7, 11]

    # Plot
    plt.plot(x, y)
    plt.title('Line Plot')
    plt.xlabel('x')
    plt.ylabel('y')
    plt.show()

Explanation:
- import matplotlib.pyplot as plt: Imports Matplotlib's pyplot module under the alias plt.
- x and y: Define the data to be plotted.
- plt.plot(x, y): Draws a line through the points given by x and y.
- plt.title('Line Plot'): Sets the title of the plot.
- plt.xlabel('x') and plt.ylabel('y'): Label the x and y axes.
- plt.show(): Displays the plot.
Scatter Plot
Use Case: Scatter plots are useful for visualizing the relationship between two variables. They help identify correlations or patterns in the data and are particularly effective when dealing with a large number of data points.

    import matplotlib.pyplot as plt

    # Sample data
    x = [1, 2, 3, 4, 5]
    y = [2, 3, 5, 7, 11]

    # Scatter plot
    plt.scatter(x, y)
    plt.title('Scatter Plot')
    plt.xlabel('x')
    plt.ylabel('y')
    plt.show()

Explanation:
- import matplotlib.pyplot as plt: Imports Matplotlib's pyplot module under the alias plt.
- x and y: Define the data to be plotted.
- plt.scatter(x, y): Creates a scatter plot with the x and y data points.
- plt.title('Scatter Plot'): Sets the title of the plot.
- plt.xlabel('x') and plt.ylabel('y'): Label the x and y axes.
- plt.show(): Displays the plot.
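Scatter points can also be styled and given a legend entry. The sketch below reuses the sample data and passes the color, marker, and label keyword arguments; the red color and the 'Data Points' label text are illustrative choices, not requirements.

```python
import matplotlib.pyplot as plt

# Same sample data as in the scatter plot above
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

fig, ax = plt.subplots()
# color sets the point color, marker the point shape,
# and label the text shown for these points in the legend
points = ax.scatter(x, y, color='red', marker='o', label='Data Points')
ax.set_title('Scatter Plot')
ax.set_xlabel('x')
ax.set_ylabel('y')
legend = ax.legend()  # Displays the legend built from the label above
```

Finish with plt.show() to display the figure.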
Bar Chart
Use Case: Bar charts are effective for comparing categorical data. They are often used to display discrete data points and show the relative sizes of different categories.

    import matplotlib.pyplot as plt

    # Data
    x = ['A', 'B', 'C', 'D', 'E']
    y = [10, 15, 7, 10, 12]

    # Plot
    plt.bar(x, y, color='skyblue')
    plt.xlabel('Categories')
    plt.ylabel('Values')
    plt.title('Bar Chart')
    plt.show()

Explanation:
- plt.bar(x, y, color='skyblue'): Plots the bar chart with the given x categories and corresponding y values. color specifies the color of the bars.
- Other lines have the same meaning as in the Line Plot.
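Relative sizes are often easier to compare when the bars are ordered by value. The following is a minimal sketch that sorts the same categories from largest to smallest before plotting; the variable names are illustrative.

```python
import matplotlib.pyplot as plt

# Same categories and values as in the bar chart above
categories = ['A', 'B', 'C', 'D', 'E']
values = [10, 15, 7, 10, 12]

# Pair each category with its value and sort from largest to smallest.
# Python's sort is stable, so ties (A and D) keep their original order.
pairs = sorted(zip(categories, values), key=lambda pair: pair[1], reverse=True)
sorted_categories = [category for category, _ in pairs]
sorted_values = [value for _, value in pairs]

fig, ax = plt.subplots()
bars = ax.bar(sorted_categories, sorted_values, color='skyblue')
ax.set_xlabel('Categories')
ax.set_ylabel('Values')
ax.set_title('Bar Chart (sorted by value)')
```

Finish with plt.show() to display the figure.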
Nested Bar Plot Comparison
Use Case: This plot visualizes nested categorical data, showing the relationship between main and sub-categories through bar heights.

    import matplotlib.pyplot as plt
    import numpy as np

    # Data
    categories = ['A', 'B', 'C', 'D']
    outer_values = [20, 35, 30, 25]  # Outer bars
    inner_values = [10, 20, 15, 10]  # Inner bars

    # Plot
    fig, ax = plt.subplots()
    index = np.arange(len(categories))  # Index for the categories
    width = 0.4  # Width of the outer bars

    # Plot the outer bars
    ax.bar(index, outer_values, width, label='Outer Bars')
    # Plot the inner bars at half the width so they sit inside the outer bars
    ax.bar(index, inner_values, width / 2, label='Inner Bars')

    # Add labels, title, and legend
    ax.set_xlabel('Categories')
    ax.set_ylabel('Values')
    ax.set_title('Nested Bar Plot')
    ax.set_xticks(index)
    ax.set_xticklabels(categories)
    ax.legend()

    # Add value labels on top of each bar
    for i, (outer, inner) in enumerate(zip(outer_values, inner_values)):
        ax.text(i, outer + 1, str(outer), ha='center', va='bottom')
        ax.text(i, inner + 1, str(inner), ha='center', va='bottom')

    plt.show()

Explanation:
- Define categories and values for outer and inner bars.
- Create a figure and axes for plotting.
- Create an index for the categories and specify the width of the outer bars.
- Plot the outer bars on the axes.
- Plot the inner bars at the same positions with half the width, so each inner bar is drawn inside its outer bar rather than covering it edge to edge.
- Set labels for the x-axis and y-axis.
- Set a title for the plot.
- Set positions and labels for the x-axis ticks.
- Add a legend to the plot.
- Add value labels on top of each bar.
- Display the plot.
Histogram
Use Case: Histograms are used to visualize the distribution of a single continuous variable. They display the frequency or count of data points within predefined intervals or bins.

    import matplotlib.pyplot as plt
    import numpy as np

    # Data
    data = np.random.randn(1000)

    # Plot
    plt.hist(data, bins=30, color='cyan', edgecolor='black')
    plt.xlabel('Value')
    plt.ylabel('Frequency')
    plt.title('Histogram')
    plt.show()

Explanation:
- np.random.randn(1000): Generates 1000 random data points from a standard normal distribution.
- plt.hist(data, bins=30, color='cyan', edgecolor='black'): Plots the histogram with the given data, dividing it into 30 bins. color specifies the color of the bars, and edgecolor specifies the color of the edges of the bars.
- Other lines have the same meaning as in the Line Plot.
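To see what the bars represent, it can help to inspect the values the histogram call returns: the count in each bin and the bin edges. This is a small sketch; the seeded generator is an assumption added only to make the run repeatable.

```python
import matplotlib.pyplot as plt
import numpy as np

# Seeded generator (an assumption for repeatability; the seed value is arbitrary)
rng = np.random.default_rng(0)
data = rng.standard_normal(1000)

fig, ax = plt.subplots()
# hist returns the per-bin counts, the bin edges, and the drawn bar patches
counts, edges, patches = ax.hist(data, bins=30, color='cyan', edgecolor='black')

print('Number of bins:', len(counts))       # one count per bin
print('Number of bin edges:', len(edges))   # always one more edge than bins
print('Total points counted:', int(counts.sum()))
```

Each bar's height is the corresponding entry in counts, and the bar spans the interval between two neighboring entries in edges.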
Box Plot
Use Case: Box plots are useful for visualizing the distribution of a continuous variable and identifying outliers. They display key statistical measures such as median, quartiles, and potential outliers.

    import matplotlib.pyplot as plt
    import numpy as np

    # Data
    data = np.random.randn(100)

    # Plot
    plt.boxplot(data)
    plt.xlabel('Data')
    plt.ylabel('Values')
    plt.title('Box Plot')
    plt.show()

Explanation:
- np.random.randn(100): Generates 100 random data points from a standard normal distribution.
- plt.boxplot(data): Plots the box plot of the given data.
- Other lines have the same meaning as in the Line Plot.
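The statistics a box plot summarizes can be computed directly with NumPy. The sketch below assumes Matplotlib's default whisker setting (whis=1.5), under which points more than 1.5 times the interquartile range beyond the quartiles are drawn as outliers; the seed is arbitrary.

```python
import numpy as np

# Seeded generator (an assumption for repeatability)
rng = np.random.default_rng(0)
data = rng.standard_normal(100)

# The box spans the first to third quartile; the line inside is the median
q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1  # interquartile range

# Points beyond these fences are what the plot marks as potential outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = data[(data < lower_fence) | (data > upper_fence)]

print(f'Q1={q1:.2f}, median={median:.2f}, Q3={q3:.2f}')
print('Number of potential outliers:', len(outliers))
```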
Pie Chart
Use Case: Pie charts are effective for showing the composition of a whole. They are commonly used to represent proportions or percentages of different categories within a dataset.

    import matplotlib.pyplot as plt

    # Data
    labels = ['A', 'B', 'C', 'D']
    sizes = [15, 30, 45, 10]

    # Plot
    plt.pie(sizes, labels=labels, autopct='%1.1f%%', startangle=140)
    plt.axis('equal')  # Equal aspect ratio ensures that pie is drawn as a circle.
    plt.title('Pie Chart')
    plt.show()

Explanation:
- plt.pie(sizes, labels=labels, autopct='%1.1f%%', startangle=140): Plots the pie chart with the given sizes and labels. autopct specifies the format for displaying the percentages, and startangle rotates the start of the pie chart by the specified angle.
- plt.axis('equal'): Ensures that the pie chart is drawn as a circle.
- Other lines have the same meaning as in the Line Plot.
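The percentage shown on each slice is that slice's share of the total, formatted with the autopct string. The short sketch below reproduces the labels by hand for the same data, without drawing anything.

```python
# Same data as in the pie chart above
labels = ['A', 'B', 'C', 'D']
sizes = [15, 30, 45, 10]

autopct = '%1.1f%%'  # one decimal place followed by a literal percent sign
total = sum(sizes)

# Apply the format string to each slice's percentage of the total
formatted = [autopct % (100 * size / total) for size in sizes]

for label, text in zip(labels, formatted):
    print(f'{label}: {text}')
```

Because the sizes are normalized by their sum, they do not need to add up to 100 themselves.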
Donut Plot
Use Case: Donut plots are similar to pie charts but with a hole in the center. They are used to represent the composition of a whole and are effective for comparing the proportions of different categories within a dataset.

    import matplotlib.pyplot as plt

    # Data
    labels = ['A', 'B', 'C', 'D']
    sizes = [20, 30, 40, 10]

    # Plot
    fig, ax = plt.subplots()
    ax.pie(sizes, labels=labels, autopct='%1.1f%%', startangle=90,
           colors=['lightblue', 'lightgreen', 'lightcoral', 'lightsalmon'])
    ax.axis('equal')  # Equal aspect ratio ensures that pie is drawn as a circle.

    # Draw a white circle at the center to create the donut
    centre_circle = plt.Circle((0, 0), 0.7, color='white', linewidth=0)
    ax.add_artist(centre_circle)

    plt.title('Donut Plot')
    plt.show()

Explanation:
- labels and sizes: Define the labels and corresponding sizes of the sectors in the donut plot.
- fig, ax = plt.subplots(): Creates a figure and axis object.
- ax.pie(sizes, labels=labels, autopct='%1.1f%%', startangle=90, colors=['lightblue', 'lightgreen', 'lightcoral', 'lightsalmon']): Plots the pie chart with the given sizes, labels, percentage format, start angle, and colors.
- ax.axis('equal'): Ensures that the pie chart is drawn as a circle.
- centre_circle = plt.Circle((0, 0), 0.7, color='white', linewidth=0): Creates a white circle at the center to form the donut.
- ax.add_artist(centre_circle): Adds the white circle to the plot.
- Other lines have the same meaning as in the Line Plot.
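An alternative to covering the center with a white circle is to give the wedges themselves a limited width through the wedgeprops argument. This is a sketch of that variant, not the approach used above; the width of 0.3 is chosen to leave a hole of radius 0.7, and pctdistance=0.85 is an illustrative value that places the percentages inside the ring.

```python
import matplotlib.pyplot as plt

# Same data as in the donut plot above
labels = ['A', 'B', 'C', 'D']
sizes = [20, 30, 40, 10]

fig, ax = plt.subplots()
wedges, texts, autotexts = ax.pie(
    sizes,
    labels=labels,
    autopct='%1.1f%%',
    startangle=90,
    pctdistance=0.85,            # move the percentage text into the ring
    wedgeprops=dict(width=0.3),  # ring thickness, measured inward from the edge
)
ax.axis('equal')  # Keep the ring circular
ax.set_title('Donut Plot')
```

Because no circle is painted over the center, this variant also works on a figure whose background is not white. Finish with plt.show() to display the figure.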
Area Plot
Use Case: Area plots are similar to line plots but with the area below the line filled in. They are useful for visualizing cumulative data or for highlighting the magnitude of change over time.

    import matplotlib.pyplot as plt

    # Data
    x = [1, 2, 3, 4, 5]
    y1 = [1, 2, 3, 4, 5]
    y2 = [5, 4, 3, 2, 1]

    # Plot
    plt.fill_between(x, y1, y2, color='skyblue', alpha=0.5)
    plt.xlabel('X-axis')
    plt.ylabel('Y-axis')
    plt.title('Area Plot')
    plt.show()

Explanation:
- plt.fill_between(x, y1, y2, color='skyblue', alpha=0.5): Fills the area between the two lines defined by y1 and y2 along the x-axis. color specifies the color of the filled area, and alpha controls the transparency.
- Other lines have the same meaning as in the Line Plot.
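For cumulative data or compositions, the series can be stacked on top of each other rather than filled between. The sketch below uses stackplot on the same data; the legend labels are illustrative.

```python
import matplotlib.pyplot as plt

# Same data as in the area plot above
x = [1, 2, 3, 4, 5]
y1 = [1, 2, 3, 4, 5]
y2 = [5, 4, 3, 2, 1]

fig, ax = plt.subplots()
# Each series is drawn on top of the previous one,
# so the upper boundary shows the running total y1 + y2
layers = ax.stackplot(x, y1, y2, labels=['y1', 'y2'], alpha=0.5)
ax.legend(loc='upper left')
ax.set_xlabel('X-axis')
ax.set_ylabel('Y-axis')
ax.set_title('Stacked Area Plot')

totals = [a + b for a, b in zip(y1, y2)]  # height of the top edge at each x
```

Finish with plt.show() to display the figure.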
Violin Plot
Use Case: Violin plots are used to visualize the distribution of a continuous variable across different categories. They combine aspects of box plots and kernel density estimation plots to provide insights into the data distribution.

    import matplotlib.pyplot as plt
    import numpy as np

    # Data
    data = [np.random.normal(0, std, 100) for std in range(1, 4)]

    # Plot
    plt.violinplot(data)
    plt.xlabel('Data')
    plt.ylabel('Values')
    plt.title('Violin Plot')
    plt.show()

Explanation:
- np.random.normal(0, std, 100): Generates 100 data points from a normal distribution with mean 0 and standard deviation std. The list comprehension does this for std = 1, 2, and 3, producing three datasets.
- plt.violinplot(data): Plots one violin for each dataset in data.
- Other lines have the same meaning as in the Line Plot.
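To bring out the box-plot side of a violin plot, a median marker can be added to each violin with showmedians=True. This is a minimal sketch; the seed and the tick labels are illustrative assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np

# Seeded generator (an assumption for repeatability)
rng = np.random.default_rng(0)
data = [rng.normal(0, std, 100) for std in range(1, 4)]

fig, ax = plt.subplots()
# showmedians adds a horizontal line at the median of each dataset
parts = ax.violinplot(data, showmedians=True)

# Violins are placed at x = 1, 2, 3 by default; label them by their std
ax.set_xticks([1, 2, 3])
ax.set_xticklabels(['std=1', 'std=2', 'std=3'])
ax.set_ylabel('Values')
ax.set_title('Violin Plot')
```

The returned dictionary holds the drawn parts, such as the violin bodies and the median lines, which can be restyled afterwards. Finish with plt.show() to display the figure.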
Heatmap
Use Case: Heatmaps are effective for visualizing the magnitude of relationships between two variables. They are commonly used in fields such as finance, biology, and social sciences to identify patterns or correlations in large datasets.

    import matplotlib.pyplot as plt
    import numpy as np

    # Data
    data = np.random.rand(10, 10)

    # Plot
    plt.imshow(data, cmap='hot', interpolation='nearest')
    plt.colorbar()
    plt.xlabel('X-axis')
    plt.ylabel('Y-axis')
    plt.title('Heatmap')
    plt.show()

Explanation:
- np.random.rand(10, 10): Generates a 10×10 array of random numbers.
- plt.imshow(data, cmap='hot', interpolation='nearest'): Displays the heatmap of the data using a colormap (hot in this case). interpolation specifies the interpolation method.
- plt.colorbar(): Adds a color bar to indicate the mapping of values to colors.
- Other lines have the same meaning as in the Line Plot.
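A common use of heatmaps is showing a correlation matrix. The sketch below builds a small synthetic dataset (an assumption made purely for illustration), computes pairwise correlations with np.corrcoef, and fixes the color scale to the range -1 to 1 so colors are comparable between plots.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data: 200 observations of 4 variables (illustrative only)
rng = np.random.default_rng(0)
samples = rng.standard_normal((200, 4))
samples[:, 1] += samples[:, 0]  # make variable 1 depend on variable 0

# rowvar=False treats each column as a variable, giving a 4x4 matrix
corr = np.corrcoef(samples, rowvar=False)

fig, ax = plt.subplots()
# A diverging colormap with a fixed range centers zero correlation
image = ax.imshow(corr, cmap='coolwarm', vmin=-1, vmax=1)
fig.colorbar(image, ax=ax)
ax.set_title('Correlation Heatmap')
```

Finish with plt.show() to display the figure.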
Hexbin Plot
Use Case: Hexbin plots are used to visualize the distribution of a large number of points in a two-dimensional space. They are particularly useful when dealing with dense datasets and help identify patterns or clusters in the data.

    import matplotlib.pyplot as plt
    import numpy as np

    # Data
    x = np.random.randn(1000)
    y = np.random.randn(1000)

    # Plot
    plt.hexbin(x, y, gridsize=30, cmap='inferno')
    plt.colorbar()
    plt.xlabel('X-axis')
    plt.ylabel('Y-axis')
    plt.title('Hexbin Plot')
    plt.show()

Explanation:
- np.random.randn(1000): Generates 1000 random data points from a standard normal distribution.
- plt.hexbin(x, y, gridsize=30, cmap='inferno'): Plots the hexbin plot with the given x and y values. gridsize determines the number of hexagons in the x-direction.
- plt.colorbar(): Adds a color bar to indicate the mapping of values to colors.
- Other lines have the same meaning as in the Line Plot.
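With sparse regions, many hexagons contain no points at all. One option, sketched below, is to pass mincnt=1 so that only hexagons holding at least one point are drawn; the per-hexagon counts can then be read back from the returned collection. The seed is an arbitrary assumption for repeatability.

```python
import matplotlib.pyplot as plt
import numpy as np

# Seeded generator (an assumption for repeatability)
rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
y = rng.standard_normal(1000)

fig, ax = plt.subplots()
# mincnt=1 hides hexagons that contain no points
hb = ax.hexbin(x, y, gridsize=30, cmap='inferno', mincnt=1)
fig.colorbar(hb, ax=ax, label='Points per hexagon')
ax.set_title('Hexbin Plot')

counts = hb.get_array()  # number of points in each drawn hexagon
print('Hexagons drawn:', len(counts))
print('Largest count in one hexagon:', int(counts.max()))
```

Finish with plt.show() to display the figure.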
Each of these plots serves a specific purpose and can be chosen based on the type of data being visualized and the insights you want to derive from it.