In this post we will learn about the most commonly used charts in data visualization. After reading it, you should find it much easier to understand the graphs and charts you meet during data analysis tasks.
Human eyes are drawn to colors, figures and patterns. We can quickly identify yellow, red, a square or a triangle. Our culture is visual, including everything from art and movies to WhatsApp, emojis, video games and TV advertisements. When we see a graph or chart, we quickly identify the trends and outliers behind it.
Data visualization is the graphical representation of data using interactive graphs, charts and maps. Data visualization tools provide an accessible way to see and analyze trends, outliers and patterns in data.
There are many data visualization tools, but in this post we will learn the most common methods and tools used in data visualization. We will start by importing the required libraries (note: we will mainly use matplotlib and seaborn for data visualization).
How to install matplotlib (matplotlib install)
pip install matplotlib
# Importing libraries
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
1. Simple plot (matplotlib line plot)
Let's start by plotting a simple line (matplotlib line plot), which is used to represent the relationship of one variable with another. Say we have two variables, X and Y.
Code: simple line plot using Python and matplotlib
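A minimal sketch of a simple line plot could look like the following. The sample values in `x` and `y` are assumptions chosen only for illustration; any two sequences of equal length would work.

```python
import matplotlib.pyplot as plt

# Hypothetical sample data: Y is simply the square of X.
x = [1, 2, 3, 4, 5]
y = [value ** 2 for value in x]

plt.figure()
# plt.plot joins consecutive (x, y) points with straight line segments
# and returns a list of Line2D objects, one per line drawn.
lines = plt.plot(x, y)
plt.show()
```

In a notebook with `%matplotlib inline`, the figure appears directly below the cell.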
We can also customize our graph. Here we will discuss some elementary customizations, like:
- How to add a matplotlib title?
- How to name the matplotlib axes?
- How to add colour to a matplotlib plot?
- How to set limits on the matplotlib axes?
- How to use markers in a matplotlib plot?
Code: all of the above customizations using matplotlib pyplot
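The customizations listed above can be sketched roughly as follows. The data, the title text, the axis names, the colour and the limits are all assumptions picked for illustration.

```python
import matplotlib.pyplot as plt

# Hypothetical sample data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

plt.figure()
# color sets the line colour; marker draws a symbol at every data point;
# linestyle="--" makes the connecting line dashed.
lines = plt.plot(x, y, color="green", marker="o", linestyle="--")

plt.title("A customized line plot")  # title shown above the axes
plt.xlabel("X values")               # name of the horizontal axis
plt.ylabel("Y values")               # name of the vertical axis
plt.xlim(0, 6)                       # visible range of the X axis
plt.ylim(0, 12)                      # visible range of the Y axis

ax = plt.gca()  # handle to the current axes, kept so the settings can be inspected
plt.show()
```

Each of these pyplot functions acts on the current figure, so they can be called in any order before `plt.show()`.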
2. Plotting two or more lines on the same plot
Code: plotting multiple lines using matplotlib
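A small sketch with three lines on one graph might look like this. The three data series and their label texts are assumptions for illustration, and `loc="lower right"` is only one possible legend position.

```python
import matplotlib.pyplot as plt

# Hypothetical sample data: three straight lines with different slopes.
x = [1, 2, 3, 4, 5]

plt.figure()
# Calling plt.plot several times before plt.show() draws on the same axes.
# The label argument gives each line a name that the legend will display.
plt.plot(x, [v for v in x], label="y = x")
plt.plot(x, [2 * v for v in x], label="y = 2x")
plt.plot(x, [3 * v for v in x], label="y = 3x")

# plt.legend() collects the labels; loc chooses the corner of the axes.
legend = plt.legend(loc="lower right")
plt.show()
```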
- Here we plotted three matplotlib line plots on the same graph. We can differentiate between them by giving each a name (label), which is passed as an argument to the plt.plot() function.
- The small rectangular box in the bottom corner that shows which color belongs to which line is called the legend. We can add a legend to our plot using matplotlib legend ( plt.legend() ).
3. Bar chart (Matplotlib bar chart)
A bar chart is one of the most frequently used data visualization techniques in machine learning. A bar chart represents categorical data with horizontal or vertical rectangular bars, where the height (or length) of each bar represents the value of the corresponding category.
A survey of 135 people asked them "Which is the nicest fruit?":
Code: bar chart using Python and matplotlib
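A bar chart for a survey like this could be sketched as below. Only the total of 135 respondents is stated in this post, so the fruit names and the individual vote counts are hypothetical values chosen to add up to 135.

```python
import matplotlib.pyplot as plt

# Hypothetical categories and vote counts (they sum to 135 respondents).
fruits = ["Apple", "Orange", "Banana", "Kiwifruit", "Blueberry"]
votes = [35, 30, 10, 25, 35]

plt.figure()
# plt.bar draws one vertical bar per category; the bar height is the vote count.
bars = plt.bar(fruits, votes, color="skyblue")
plt.title("Which is the nicest fruit?")
plt.xlabel("Fruit")
plt.ylabel("Number of votes")
plt.show()
```

For horizontal bars, `plt.barh(fruits, votes)` takes the same two arguments.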
4. Histogram (Matplotlib hist)
A histogram represents the distribution of a continuous variable over a given interval or period of time. It plots the data by dividing its range into intervals called 'bins' and counting how many values fall into each bin. A histogram is used to inspect the underlying frequency distribution (e.g. a normal distribution). Below is a simple example to illustrate the histogram.
Code: histogram using Python and matplotlib
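As a rough sketch, a histogram of normally distributed data could be drawn like this. The data is randomly generated, and the mean, spread, sample size and number of bins are assumptions picked for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: 1000 samples from a normal distribution
# with mean 50 and standard deviation 10 (seeded for reproducibility).
rng = np.random.default_rng(seed=42)
data = rng.normal(loc=50, scale=10, size=1000)

plt.figure()
# bins=20 splits the range from min(data) to max(data) into 20 equal intervals.
# plt.hist returns the count per bin, the bin edges and the drawn bar patches.
counts, bin_edges, patches = plt.hist(data, bins=20, edgecolor="black")
plt.title("Histogram of a normally distributed variable")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.show()
```

Changing `bins` changes how coarse or fine the picture of the distribution is.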
The key differences between a bar chart and a histogram are given below.
Histogram vs bar graph:
- A histogram shows the distribution of continuous (non-discrete) variables, whereas a bar chart compares discrete variables.
- A histogram represents quantitative data, whereas a bar chart represents categorical data.
- In a histogram the elements are grouped together, so they are treated as ranges, whereas in a bar chart each element is treated as an individual item.
5. Scatter plot (matplotlib scatter)
The scatter plot is another commonly used visualization. It is closely related to a simple line graph, but rather than joining the points with a line, each point is drawn individually as a dot, circle or any other shape.
Code: scatter plot using Python and matplotlib
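A scatter plot could be sketched as follows. The data is randomly generated, with `y` assumed to depend roughly linearly on `x` plus some noise, purely for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: 50 points where y grows with x, plus random noise.
rng = np.random.default_rng(seed=0)
x = rng.uniform(0, 10, size=50)
y = 2 * x + rng.normal(0, 2, size=50)

plt.figure()
# plt.scatter draws each (x, y) pair as its own marker without joining them.
points = plt.scatter(x, y, marker="o", color="tab:red")
plt.title("A simple scatter plot")
plt.xlabel("X values")
plt.ylabel("Y values")
plt.show()
```

The `marker` argument accepts other shapes too, for example `"s"` for squares or `"^"` for triangles.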
That's all for this post. I hope it helps you. Thank you!