Data Science with Python: Intro to Data Visualization with Matplotlib

Also, there is an awesome interactive version of this chart available here, in which you can play historic time series, search for a certain country, change the data on the axes, and so on.

Those rules help us make nice and informative plots instead of confusing ones. The first step is to choose the appropriate plot type. Second, consider the axis labels. When there are no axis labels, we can try to look at the code to see what data is used, and if we're lucky we'll understand the plot. But what if we show this plot to our boss, who doesn't know how to make plots in Python? Third, we can add a title to make our plot more informative. Fourth, we can add labels for different categories when needed. Fifth, we can optionally add text or an arrow at interesting data points. Sixth, in some cases we can use the size and color of the data points to make the plot more informative.

Types of Visualizations and Examples with Matplotlib

There are many types of visualizations. Some of the most famous are: line plot, scatter plot, histogram, box plot, bar chart, and pie chart.

For example, plt.xticks([1, 2, 3, 4, 5], ["1M", "2M", "3M", "4M", "5M"]) sets the labels 1M, 2M, 3M, 4M, 5M on the x-axis. plt.yticks() works the same as plt.xticks(), but for the y-axis.

Line plot: a type of plot which displays information as a series of data points called "markers" connected by straight lines. In this type of plot, we need the measurement points to be ordered (typically by their x-axis values). This type of plot is often used to visualize a trend in data over intervals of time, that is, a time series.

To make a line plot with Matplotlib, we call plt.plot(). We can also customize the result. For example, we might want to add labels to the axes and a title to the plot.

[Example: Simple Line Plot]

Scatter plot: this type of plot shows all individual data points. It can be used to display trends or correlations. In data science, it shows how two variables compare.

To make a scatter plot with Matplotlib, we can use the plt.scatter() function. Again, the first argument is used for the data on the horizontal axis, and the second for the vertical axis.

[Example: Simple Scatter Plot]

Histogram: an accurate representation of the distribution of numeric data. The default value for the bins argument is 10.

[Example: Simple Histogram]

[Figure: Simple Histogram Output]

We can see from the histogram above that there are 5 values between 0 and 3, 3 values between 3 and 6 (inclusive), and 2 values between 6 (exclusive) and 9.

Box plot, also called the box-and-whisker plot: a way to show the distribution of values based on the five-number summary: minimum, first quartile, median, third quartile, and maximum. The minimum and the maximum are just the min and max values from our data. The median is the value that separates the higher half of the data from the lower half.
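To make the histogram counts and the five-number summary more concrete, here is a minimal sketch. The data list and the helper names (bin_counts, five_number_summary, draw_histogram_and_boxplot) are hypothetical and not taken from the original examples. The sample values were chosen only so that three bins give counts of 5, 3, and 2. The counting helper is a plain-Python illustration of equal-width binning, not Matplotlib's internal code.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
data = [0, 1, 1, 2, 2, 4, 5, 5, 7, 9]


def bin_counts(values, bins=10):
    """Count values in equal-width bins spanning min(values)..max(values).

    Assumes the values are not all identical (the bin width must be > 0).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for value in values:
        # The maximum value would land one past the end, so clamp it
        # into the last bin.
        index = min(int((value - lo) / width), bins - 1)
        counts[index] += 1
    return counts


def five_number_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    # method="inclusive" interpolates linearly between the sorted data points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


def draw_histogram_and_boxplot(values, bins=10):
    """Draw a histogram and a box plot of the same values side by side."""
    # Imported here so the helpers above also run without Matplotlib installed.
    import matplotlib.pyplot as plt

    fig, (left, right) = plt.subplots(1, 2, figsize=(8, 3))
    left.hist(values, bins=bins)
    left.set_title("Histogram")
    left.set_xlabel("Value")
    left.set_ylabel("Count")
    right.boxplot(values)
    right.set_title("Box plot")
    right.set_ylabel("Value")
    plt.show()


print(bin_counts(data, bins=3))
print(five_number_summary(data))
```

Calling draw_histogram_and_boxplot(data, bins=3) should show the same three bars next to the box plot. Note that Matplotlib's default whiskers extend to the farthest points within 1.5 times the interquartile range of the box, so they reach the minimum and maximum only when the data has no outliers, as is the case for this sample.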
