Introduction

This blog is a part of our series Python for Stock Market Analysis.

Disclaimer: This blog is for educational purposes only, and we do not recommend applying the knowledge gained from it to real financial decisions.

This blog implements some preliminary metrics that are used in stock market analysis. The dataset we will be using is available via yahoofinance.

Preliminary Actions

Install Libraries

Please install:

You might also need to run pip install -U kaleido if you want to save plots as PNG images.

If you are new to Plotly, we have an awesome blog about it where we made plots based on a COVID-19 dataset.

Import Required Libraries

Download Stock Data of Apple

By default, we can request data starting from 1900-01-01.

It seems that data is only available from 1980-12-12 onward. The column names in the data above are:

Perform EDA

EDA, or Exploratory Data Analysis, is the first step in any data analysis, so let's do that with our stock data too. We have blogs about EDA and about statistical and inferential analysis; please check them out for more on the topic.

Checking for Null Value

It seems that we do not have any null values in the data.
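As a minimal sketch of such a null check, the snippet below uses a small made-up DataFrame (the column names and values are assumptions for illustration, not the downloaded Apple data) and deliberately includes one missing value so that the check has something to find.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the downloaded stock data (values are made up).
df = pd.DataFrame(
    {
        "Open": [10.0, 10.5, np.nan, 11.0],
        "Close": [10.2, 10.4, 10.9, 11.3],
        "Volume": [1000, 1200, 900, 1100],
    },
    index=pd.date_range("2021-01-04", periods=4, freq="D"),
)

# Count the missing values in each column.
null_counts = df.isnull().sum()
print(null_counts)

# True if at least one cell in the whole frame is missing.
has_nulls = bool(df.isnull().values.any())
print("Any nulls?", has_nulls)
```

On real data with no missing values, every count would be 0 and has_nulls would be False.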

View the Distribution

A distribution plot shows how frequently values fall within each range. It is simply a histogram.

It seems that the values in all columns are concentrated on the left, with a long tail to the right (that is, the distributions are right-skewed).

View the Box Plot

A box plot gives a clear picture of the descriptive statistics of the data, such as the median, the quartiles, and the outliers.

It seems that we have many outliers, but that does not matter right now.

Summary of our data

The box plot already gave us a summary of the data. We can see that the average volume is 3.326112e+08, but does it give a true picture of how the volume moved over time? It does not, because the values rise and fall over time. Let's try to visualize it as a line plot.

As we can see in the plot above, the OHLC values trend upward while Volume does not. The share values rise and fall, but overall they seem to be increasing.

Moving Average

A moving average is an average taken over the data within some recent time frame only. For time series data with high volatility (a large standard deviation, for example), the simple average over the whole history DOES NOT give a clear picture of the typical value. One reason is that, in real-world financial data, prices rise and fall with unexpected factors like the COVID outbreak, or expected factors like Tesla's new car. So, to get a figure that represents the current average well, we take the average over a limited time frame only. By doing so, we ignore history that is too old to have much effect on the present.

Simple Moving Average (SMA)

Simple Moving Average is the simplest form of moving average: we sum the values within some time frame and divide by the number of data points. The size of the time frame is often known as the window. It is an example of a technical indicator (a heuristic or pattern-based signal derived from price or volume).

A formula to calculate Simple Moving Average is:

$$ SMA = \frac{V_1 + V_2 + V_3 + ... + V_n}{n} $$

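Where V_1, V_2, ..., V_n are the values inside the window and n is the number of data points (the window size).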

Let's try to implement this concept on our data. We will take the window size, or n, as 5.
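A minimal sketch of a 5-day SMA with pandas could look like the following. The series open_price holds made-up values standing in for the Open column, so the numbers are illustrative only.

```python
import pandas as pd

# Hypothetical opening prices (made-up values, not real Apple data).
open_price = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0], name="Open")

window = 5  # n in the SMA formula

# rolling(window).mean() averages the current value and the 4 before it.
# The first window - 1 entries are NaN because the window is not yet full.
sma = open_price.rolling(window=window).mean()
print(sma)
```

The first non-missing value is (10 + 11 + 12 + 13 + 14) / 5 = 12, which matches the formula above with n = 5.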

Plotting SMA of All

Looking over all the plots above, we can see that even though there are some extreme rises and falls in the values, the SMA is not very sensitive to them. In other terms, it is less sensitive to peaks and troughs.

Weighted Moving Average (WMA)

In the case above, we used the Simple Moving Average, which treats the values of the past 5 days as equally important to the current day. But it might not be a good idea to consider the value from 5 days earlier and yesterday's value equally important. Why so? A simple reason is that the opening value on the current date is much less related to the value from 100 days ago than to yesterday's value. So why treat them as equally important?

WMA is simply a weighted mean: each value in the time frame is multiplied by a weight, and the sum of these products is divided by the sum of the weights.

$$ WMA = \frac{V_1 * W_1 + V_2 * W_2 + V_3 * W_3 + ... + V_n * W_n}{W_1+W_2+W_3+...+W_n} $$

Where V_1, V_2, ..., V_n are the values inside the window and W_1, W_2, ..., W_n are the weights assigned to them.

The purpose of WMA is to give weights to values according to their order in time (or rank), so that we can prioritize some of them in the mean. Let's give the weights [1, 2, 3, 4, 5] to the data in a window of size 5.
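The weighting can be sketched with a rolling apply, as below. This assumes that the oldest value in the window gets weight 1 and the most recent gets weight 5; the helper name weighted_mean and the prices are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical opening prices (made-up values, not real Apple data).
open_price = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0], name="Open")

# Assumption: the oldest value in the window gets 1, the most recent gets 5.
weights = np.array([1, 2, 3, 4, 5])


def weighted_mean(values):
    """Sum of value * weight divided by the sum of the weights."""
    return np.dot(values, weights) / weights.sum()


# raw=True passes each window to the function as a plain NumPy array.
wma = open_price.rolling(window=len(weights)).apply(weighted_mean, raw=True)
print(wma)
```

For the first full window, the result is (10*1 + 11*2 + 12*3 + 13*4 + 14*5) / 15 = 190 / 15, which is about 12.67. That is higher than the SMA of 12 for the same window, because the later (larger) values carry more weight.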

Plotting WMA of All

We cannot see much difference between WMA and SMA, and that is because of the (daily) level of our data. Let's try to plot the data of the last 100 days only.

Now it is much clearer. Looking over the plot of open,

Exponential Moving Average (EMA)

It is similar to the WMA in that it gives weights to values but, instead of linear weights, it uses exponentially decaying weights.

A general formula of EMA at time t is:

$$ EMA_t = \left[V_t * \left(\frac{s}{1+d}\right)\right] + EMA_y * \left[1-\left(\frac{s}{1+d}\right)\right] $$

Where V_t is the value at time t, EMA_y is the EMA of the previous period (yesterday), s is the smoothing factor, and d is the number of time periods (days).

The purpose of using EMA is to give higher weights to more recent values, which makes it more sensitive to recent data. This average is more responsive to the latest price changes than SMA.

We do not have to implement this scary formula from scratch because pandas gives us a way to do it with little code. Please refer to the pandas documentation for more info about EWM.

$$ \begin{aligned} y_0 &= x_0 \\ y_t &= (1 - \alpha) y_{t-1} + \alpha x_t \end{aligned} $$

Where alpha is either a value given by us or smoothing/(time periods + 1). Smoothing is generally taken as 2, and the number of time periods is chosen according to our requirement.
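As a rough sketch, the recursion can be computed with pandas' ewm and checked against a hand-written loop. Passing span=5 makes pandas use alpha = 2 / (5 + 1), and adjust=False selects the recursive form. The prices are made up, and 5 periods is an assumption chosen to match the window used earlier.

```python
import pandas as pd

# Hypothetical opening prices (made-up values, not real Apple data).
open_price = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0], name="Open")

periods = 5                # assumed number of time periods
alpha = 2 / (periods + 1)  # smoothing / (time periods + 1), with smoothing = 2

# adjust=False gives y_0 = x_0 and y_t = (1 - alpha) * y_{t-1} + alpha * x_t.
ema = open_price.ewm(span=periods, adjust=False).mean()

# The same recursion written out by hand, as a sanity check.
manual = [open_price.iloc[0]]
for x in open_price.iloc[1:]:
    manual.append((1 - alpha) * manual[-1] + alpha * x)
manual = pd.Series(manual, index=open_price.index)

print(pd.DataFrame({"ema": ema, "manual": manual}))
```

Unlike rolling(), ewm() produces a value from the very first row, because the recursion starts at the first observation. Note that pandas defaults to adjust=True, which uses a slightly different weighting for the early rows.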

Plotting EMA of All

Instead of viewing the EMA of the entire data, let's view it for the last 100 days only.

Looking over the EMA, it seems to be much smoother than the other values. But the smoothness depends on the value of the smoothing factor. Many other important metrics in stock market analysis are calculated based on EMA; to note down a few:

We will be exploring all 4 of the above metrics in the next blog, so please stay tuned for that.

Plotting Candlestick

Candlesticks are often used in stock data analysis for clear visualization, so let's try that as well. We will use graph_objects from Plotly.

Moving Median

What if we used the median instead of the mean? Let's reuse the code written in the steps above and calculate the median instead of the mean.

EMA seems to be much nearer to Open, and EMA is more sensitive to change than the Simple Moving Median.
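Swapping the mean for the median only changes the aggregation at the end of the rolling call. The sketch below uses made-up prices with one artificial spike (50.0) to show how the two react differently; the window of 5 is an assumption carried over from the SMA example.

```python
import pandas as pd

# Hypothetical opening prices with one artificial spike (made-up values).
open_price = pd.Series([10.0, 11.0, 50.0, 13.0, 14.0, 15.0, 16.0], name="Open")

window = 5

# Same rolling window as the SMA, but taking the median of each window.
moving_median = open_price.rolling(window=window).median()

# The moving mean over the same window, for comparison.
moving_mean = open_price.rolling(window=window).mean()

print(pd.DataFrame({"median": moving_median, "mean": moving_mean}))
```

In the first full window, the median is 13 while the mean is 19.6, so the spike pulls the mean up but barely moves the median.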

Moving Variance

Variance seems to increase when there is a sudden change in the trend and to decrease when the change is normal.
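A moving variance can be sketched the same way as the moving mean. The prices below are made up: a flat series with a single jump, so that the effect of a sudden change is easy to see. The window of 5 is again an assumption.

```python
import pandas as pd

# Hypothetical prices: flat at 10 with one sudden jump to 20 (made-up values).
open_price = pd.Series([10.0, 10.0, 10.0, 10.0, 10.0, 20.0, 10.0, 10.0], name="Open")

window = 5

# rolling(...).var() computes the sample variance (ddof=1) of each window.
moving_var = open_price.rolling(window=window).var()
print(moving_var)
```

While the prices are flat the variance is 0, and it rises to 20 as soon as the jump enters the window, which is the behaviour described above.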

Conclusion

In this blog, we have explored some of the popular moving average algorithms used in stock market analysis. In the next blog, we will explore some of the popular metrics that use the moving average as their base metric.