## Introduction

Hello everyone, and welcome back to another blog post, where we will explore different ideas and concepts one can apply while performing exploratory data analysis (EDA). In simple words, this blog is a walk-through of a typical EDA process, which might include (in top-down order):

• Data Loading: From various sources (remote, local) and in various formats (Excel, CSV, SQL, etc.).
• Data Check: This is a very important task where we check the data types (numerical, categorical, binary, etc.) of the fields in the data. We often also look at the number of missing values.
• Data Transformation: This includes filling in null values or removing them from the table. We also do some data type conversions if required.
• Descriptive Analysis: This is the heart of any EDA because here we do lots of statistical tasks like finding the mean, median, quartiles, mode, distributions, and relationships between fields. We also draw different plots to support the analysis. This is sometimes enough to give insights about the data. If the data is rich and we need to find more insights and make assumptions, we have to do Inferential Analysis.
• Inferential Analysis: This task is sometimes treated as part of EDA, but most of the time we do inferential analysis along with model development. However, we do perform different tests (e.g. the Chi-Square test) to gauge feature importance. Here we often run hypothesis tests on samples drawn from the population.
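
To make the steps above concrete, here is a minimal sketch of one pass through them with pandas. The sample data, column names (`order_date`, `city`, `channel`, `amount`, `returned`) and the choice of filling missing values with the median are all assumptions made for illustration, not part of any real dataset. The last step computes an uncorrected Pearson chi-square statistic by hand; a library routine such as `scipy.stats.chi2_contingency` would also give a p-value and applies a continuity correction to 2x2 tables by default, so its number can differ.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical sample data standing in for a real CSV, Excel or SQL source.
RAW_CSV = """order_date,city,channel,amount,returned
2023-01-01,Pokhara,online,120.0,no
2023-01-01,Kathmandu,store,80.0,no
2023-01-02,Kathmandu,online,,yes
2023-01-02,Pokhara,store,60.0,no
2023-01-03,Kathmandu,online,200.0,yes
2023-01-03,Pokhara,online,150.0,yes
2023-01-04,Kathmandu,store,90.0,no
2023-01-04,Pokhara,store,70.0,no
"""

# 1. Data loading: read_csv accepts a local path, a URL or a file-like object.
df = pd.read_csv(io.StringIO(RAW_CSV))

# 2. Data check: data type and number of missing values per column.
column_types = df.dtypes
missing_counts = df.isna().sum()
print(missing_counts)

# 3. Data transformation: fill the missing amount with the median (an
#    illustrative choice) and convert the date column from text to datetime.
df["amount"] = df["amount"].fillna(df["amount"].median())
df["order_date"] = pd.to_datetime(df["order_date"])

# 4. Descriptive analysis: summary statistics for a numeric column.
summary = df["amount"].describe()
print(summary)

# 5. Inferential analysis: Pearson chi-square statistic (no continuity
#    correction) for the association between two categorical columns.
observed = pd.crosstab(df["channel"], df["returned"]).to_numpy()
row_totals = observed.sum(axis=1)
col_totals = observed.sum(axis=0)
expected = np.outer(row_totals, col_totals) / observed.sum()
chi2_stat = ((observed - expected) ** 2 / expected).sum()
print(f"chi-square statistic: {chi2_stat:.2f}")
```

With only eight rows the chi-square statistic is not meaningful on its own; the point is only to show where each step sits in the flow.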

While walking through these major steps, one tries to answer different analysis questions, such as how many times a categorical value appears, what the distribution over dates looks like, how performance varies across certain cases, and so on.
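
As a rough illustration, each of those questions maps to a one-line pandas operation. The small table below and its column names (`order_date`, `city`, `amount`) are hypothetical, and using the mean amount as the "performance" measure is just one possible choice.

```python
import pandas as pd

# Hypothetical records, made up purely for illustration.
df = pd.DataFrame(
    {
        "order_date": pd.to_datetime(
            [
                "2023-01-01",
                "2023-01-01",
                "2023-01-02",
                "2023-01-02",
                "2023-01-02",
                "2023-01-03",
            ]
        ),
        "city": ["Kathmandu", "Pokhara", "Kathmandu", "Kathmandu", "Pokhara", "Lalitpur"],
        "amount": [100, 50, 200, 300, 150, 80],
    }
)

# How many times has each categorical value appeared?
city_counts = df["city"].value_counts()
print(city_counts)

# What is the distribution over a date? Here: number of records per day.
orders_per_day = df.groupby("order_date").size()
print(orders_per_day)

# What is the performance over certain cases? Here: mean amount per city.
mean_amount_by_city = df.groupby("city")["amount"].mean()
print(mean_amount_by_city)
```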

### Installing Libraries

!pip install autoviz
!pip install seaborn
!pip install plotly