Hello and welcome back, everyone, to the second part of our new blog series, Python for Stock Market Analysis. In the last part, we explored different types of moving averages, such as the Simple Moving Average (SMA), Exponential Moving Average (EMA) and Weighted Moving Average (WMA), along with other moving metrics like the Moving Median and Moving Variance. Until now we have only looked at the trend over a period of time. These simple metrics are used under the hood to make assumptions in the stock markets. In this blog, we will explore some popular stock market metrics that are built on moving averages.
Disclaimer: This blog is for educational purposes only, and we do not recommend applying the knowledge gained from it in real financial exercises.
Technical indicators in stock markets are categorized in many ways and some of the most common are:
All four of the above are used to either predict or alert us about the future of the stock. Indicators are often viewed in terms of leading and lagging. Leading indicators attempt to predict a price rise or trend by using short-term moving averages (like the 12-period EMA in MACD (Moving Average Convergence Divergence)). Lagging indicators describe what has already happened and might continue, like EMAs of different periods.
Before diving into the coding part, let's read our data.
import pandas as pd
import numpy as np
import plotly.express as px
import cufflinks
import plotly.io as pio
import yfinance as yf
import warnings
warnings.filterwarnings("ignore")
cufflinks.go_offline()
cufflinks.set_config_file(world_readable=True, theme='pearl')
pio.renderers.default = "notebook" # should change by looking into pio.renderers
pd.options.display.max_columns = None
symbols = ["AAPL"]
df = yf.download(tickers=symbols)
df.head()
[*********************100%***********************] 1 of 1 completed
Date | Open | High | Low | Close | Adj Close | Volume |
---|---|---|---|---|---|---|
1980-12-12 | 0.128348 | 0.128906 | 0.128348 | 0.128348 | 0.100326 | 469033600 |
1980-12-15 | 0.122210 | 0.122210 | 0.121652 | 0.121652 | 0.095092 | 175884800 |
1980-12-16 | 0.113281 | 0.113281 | 0.112723 | 0.112723 | 0.088112 | 105728000 |
1980-12-17 | 0.115513 | 0.116071 | 0.115513 | 0.115513 | 0.090293 | 86441600 |
1980-12-18 | 0.118862 | 0.119420 | 0.118862 | 0.118862 | 0.092911 | 73449600 |
# convert column names into lowercase
df.columns = [c.lower() for c in df.columns]
df.rename(columns={"adj close":"adj_close"},inplace=True)
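Some yfinance versions return a two-level column header (price field, ticker) even for a single symbol. In that case each column name is a tuple and `c.lower()` above would fail. The helper below is a minimal sketch of a more defensive renaming step. The function name `normalize_columns` and the synthetic frame are assumptions made for illustration; they are not part of yfinance.

```python
import pandas as pd


def normalize_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with flat, lowercase, underscore-separated column names."""
    out = frame.copy()
    if isinstance(out.columns, pd.MultiIndex):
        # Keep only the first level (the price field); this assumes a single ticker.
        out.columns = out.columns.get_level_values(0)
    out.columns = [str(c).lower().replace(" ", "_") for c in out.columns]
    return out


# Synthetic frame imitating a two-level (field, ticker) header.
demo_cols = pd.MultiIndex.from_tuples(
    [("Adj Close", "AAPL"), ("Close", "AAPL"), ("Volume", "AAPL")]
)
demo = pd.DataFrame([[0.10, 0.13, 469033600]], columns=demo_cols)
print(list(normalize_columns(demo).columns))
```

With a flat header the helper only lowercases the names and replaces spaces, so it gives the same column names as the two lines above.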
Trend indicators are a basic way to visualize the flow of a stock's performance over time (daily, weekly, monthly, over the last 3 weeks, etc.). We can apply these indicators to measures of a stock's performance such as volume, price and transactions. There are different kinds of trend indicators, and we explored some of them in the previous blog, where we looked at the trend of the Open, High, Low and Volume of Apple's floorsheet data. The trend itself does not predict anything about a future price rise or fall, but we can draw some inferences from the recent performance of the stock.
Although the price moves up and down throughout the day, most traders find the closing price to be the most important for describing the performance of the stock on that day. So, we will calculate most univariate indicators based on the closing price of the day.
Some popular trend indicators are moving averages, GMMA, PPO and MACD.
We will calculate these on our data next.
Moving averages are common trend indicators and the building blocks of popular indicators like GMMA (Guppy Multiple Moving Average), MACD (Moving Average Convergence Divergence) and PPO (Percentage Price Oscillator). But first, let's write a simple function that gives us the moving average over a given window.
def moving_average(series, window=5, kind="sma"):
    if kind == "sma":
        return series.rolling(window=window, min_periods=window).mean()
    elif kind == "ema":
        # exponentially weighted mean; a rolling mean here would just repeat the SMA
        return series.ewm(span=window, min_periods=window, adjust=False).mean()
    elif kind == "wma":
        # linearly increasing weights, so the most recent value counts the most
        weights = np.arange(1, window + 1)
        return series.rolling(window=window, min_periods=window).apply(lambda x: np.average(x, weights=weights), raw=True)
    raise ValueError(f"unknown kind: {kind}")
tdf = df.copy()
window=30
tdf[f"close_sma_{window}"] = moving_average(tdf.close, window=window)
tdf[f"close_ema_{window}"] = moving_average(tdf.close, window=window, kind="ema")
tdf[f"close_wma_{window}"] = moving_average(tdf.close, window=window, kind="wma")
tdf
Date | open | high | low | close | adj_close | volume | close_sma_30 | close_ema_30 | close_wma_30 |
---|---|---|---|---|---|---|---|---|---|
1980-12-12 | 0.128348 | 0.128906 | 0.128348 | 0.128348 | 0.100326 | 469033600 | NaN | NaN | NaN |
1980-12-15 | 0.122210 | 0.122210 | 0.121652 | 0.121652 | 0.095092 | 175884800 | NaN | NaN | NaN |
1980-12-16 | 0.113281 | 0.113281 | 0.112723 | 0.112723 | 0.088112 | 105728000 | NaN | NaN | NaN |
1980-12-17 | 0.115513 | 0.116071 | 0.115513 | 0.115513 | 0.090293 | 86441600 | NaN | NaN | NaN |
1980-12-18 | 0.118862 | 0.119420 | 0.118862 | 0.118862 | 0.092911 | 73449600 | NaN | NaN | NaN |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
2022-02-28 | 163.059998 | 165.419998 | 162.429993 | 165.119995 | 165.119995 | 94869100 | 168.273667 | 168.512558 | 168.313550 |
2022-03-01 | 164.699997 | 166.600006 | 161.970001 | 163.199997 | 163.199997 | 83474400 | 167.944667 | 168.169812 | 167.986216 |
2022-03-02 | 164.389999 | 167.360001 | 162.949997 | 166.559998 | 166.559998 | 79724800 | 167.836667 | 168.065953 | 167.896883 |
2022-03-03 | 168.470001 | 168.910004 | 165.550003 | 166.229996 | 166.229996 | 76335600 | 167.836667 | 167.947504 | 167.793226 |
2022-03-04 | 164.490005 | 165.550003 | 162.110001 | 162.539993 | 162.539993 | 29556127 | 167.771000 | 167.598633 | 167.451506 |
cols = [c for c in tdf.columns if "close" in c and "adj" not in c]
tdf[cols].iplot(kind="line")
The above plot looks a little spiky, but if we zoom in, we can see the changes in the closing price and how the moving averages follow it. The EMA stays closer to the actual trend of the close than the SMA because its decay term gives more weight to the latest values.
GMMA is a technical indicator that uses two groups of EMAs (12 in total), a short-term group and a long-term group, and compares how they move over time to make assumptions about the trend. The "Guppy" in GMMA comes from the Australian trader Daryl Guppy.
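To make the weighting idea concrete, here is a small sketch on a made-up five-value series (not market data). It computes the three averages with a window of 3 and also writes the EMA out by hand, assuming the recursive form that pandas uses with `adjust=False`, where `alpha = 2 / (span + 1)`.

```python
import numpy as np
import pandas as pd

prices = pd.Series([10.0, 11.0, 12.0, 13.0, 20.0])  # toy values with a jump at the end
window = 3

# Simple moving average: equal weights over the last `window` values.
sma = prices.rolling(window).mean()

# Weighted moving average: linearly increasing weights 1, 2, ..., window.
weights = np.arange(1, window + 1)
wma = prices.rolling(window).apply(lambda x: np.average(x, weights=weights), raw=True)

# Exponential moving average via pandas (adjust=False gives the recursive form).
ema = prices.ewm(span=window, adjust=False).mean()

# The same EMA written out: ema_t = alpha * price_t + (1 - alpha) * ema_{t-1}
alpha = 2 / (window + 1)
manual = [prices.iloc[0]]
for p in prices.iloc[1:]:
    manual.append(alpha * p + (1 - alpha) * manual[-1])

print(pd.DataFrame({"close": prices, "sma": sma, "wma": wma, "ema": ema, "ema_manual": manual}))
```

On the last row, where the price jumps to 20, the SMA is 15.0 while the WMA and EMA are both above 16, which shows how the two weighted variants react faster to the latest value.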
def guppy_multiple_ma(tdf, col="close", sma=None, lma=None):
    """
    sma: spans of the short-term EMAs, default [3, 5, 8, 10, 12, 15]
    lma: spans of the long-term EMAs, default [30, 35, 40, 45, 50, 60]
    """
    if not sma:
        sma = [3, 5, 8, 10, 12, 15]
    if not lma:
        lma = [30, 35, 40, 45, 50, 60]
    for sm in sma:
        tdf[f"sema_{col}_{sm}"] = tdf[col].ewm(span=sm, min_periods=sm, adjust=False).mean()
    for lm in lma:
        tdf[f"lema_{col}_{lm}"] = tdf[col].ewm(span=lm, min_periods=lm, adjust=False).mean()
    return tdf
tdf = guppy_multiple_ma(tdf, col="close")
tdf
Date | open | high | low | close | adj_close | volume | close_sma_30 | close_ema_30 | close_wma_30 | sema_close_3 | sema_close_5 | sema_close_8 | sema_close_10 | sema_close_12 | sema_close_15 | lema_close_30 | lema_close_35 | lema_close_40 | lema_close_45 | lema_close_50 | lema_close_60 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1980-12-12 | 0.128348 | 0.128906 | 0.128348 | 0.128348 | 0.100326 | 469033600 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
1980-12-15 | 0.122210 | 0.122210 | 0.121652 | 0.121652 | 0.095092 | 175884800 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
1980-12-16 | 0.113281 | 0.113281 | 0.112723 | 0.112723 | 0.088112 | 105728000 | NaN | NaN | NaN | 0.118861 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
1980-12-17 | 0.115513 | 0.116071 | 0.115513 | 0.115513 | 0.090293 | 86441600 | NaN | NaN | NaN | 0.117187 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
1980-12-18 | 0.118862 | 0.119420 | 0.118862 | 0.118862 | 0.092911 | 73449600 | NaN | NaN | NaN | 0.118025 | 0.119358 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
2022-02-28 | 163.059998 | 165.419998 | 162.429993 | 165.119995 | 165.119995 | 94869100 | 168.273667 | 168.512558 | 168.313550 | 164.531310 | 164.755509 | 165.630092 | 166.180578 | 166.644138 | 167.197609 | 168.512558 | 168.584183 | 168.520952 | 168.349104 | 168.091891 | 167.395141 |
2022-03-01 | 164.699997 | 166.600006 | 161.970001 | 163.199997 | 163.199997 | 83474400 | 167.944667 | 168.169812 | 167.986216 | 163.865653 | 164.237005 | 165.090071 | 165.638654 | 166.114271 | 166.697907 | 168.169812 | 168.285062 | 168.261393 | 168.125230 | 167.900052 | 167.257596 |
2022-03-02 | 164.389999 | 167.360001 | 162.949997 | 166.559998 | 166.559998 | 79724800 | 167.836667 | 168.065953 | 167.896883 | 165.212825 | 165.011336 | 165.416721 | 165.806171 | 166.182844 | 166.680668 | 168.065953 | 168.189225 | 168.178399 | 168.057176 | 167.847501 | 167.234724 |
2022-03-03 | 168.470001 | 168.910004 | 165.550003 | 166.229996 | 166.229996 | 76335600 | 167.836667 | 167.947504 | 167.793226 | 165.721411 | 165.417556 | 165.597449 | 165.883230 | 166.190098 | 166.624334 | 167.947504 | 168.080379 | 168.083355 | 167.977733 | 167.784069 | 167.201782 |
2022-03-04 | 164.490005 | 165.550003 | 162.110001 | 162.539993 | 162.539993 | 29556127 | 167.771000 | 167.598633 | 167.451506 | 164.130702 | 164.458368 | 164.918014 | 165.275369 | 165.628543 | 166.113792 | 167.598633 | 167.772580 | 167.812947 | 167.741310 | 167.578419 | 167.048936 |
Let's use a candlestick chart to visualize OHLC and the trend at the same time. A candle is drawn green if the closing price is higher than the opening price and red if it is lower. The top of the upper wick is the high and the bottom of the lower wick is the low. The top edge of the rectangle is the open if the close is lower than the open; otherwise it is the close. An example of a candlestick is:
In the above image, green marks the candles where the closing price is greater than the opening price.
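The coloring rule described above can be sketched in a few lines. The three candles below are made-up values, and treating a flat day (close equal to open) as red is a choice made only for this sketch.

```python
import numpy as np
import pandas as pd

# Three hypothetical daily candles (not real market data).
candles = pd.DataFrame(
    {
        "open": [10.0, 12.0, 11.0],
        "high": [12.5, 12.2, 11.5],
        "low": [9.8, 10.9, 10.7],
        "close": [12.0, 11.0, 11.0],
    }
)

# Green when the close is above the open, red otherwise.
candles["color"] = np.where(candles["close"] > candles["open"], "green", "red")

# The rectangle (body) spans open..close; the wicks extend to high and low.
candles["body_top"] = candles[["open", "close"]].max(axis=1)
candles["body_bottom"] = candles[["open", "close"]].min(axis=1)
print(candles)
```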
import plotly.graph_objects as go
layout = go.Layout(
autosize=False,
width=1000,
height=1000,
xaxis= go.layout.XAxis(linecolor = 'black',
linewidth = 1,
mirror = True),
yaxis= go.layout.YAxis(linecolor = 'black',
linewidth = 1,
mirror = True),
)
fig=go.Figure(layout=layout)
lastn = 1000
ldf = tdf[-lastn:]
fig.add_trace(go.Candlestick(x=ldf.index,
open=ldf['open'],
high=ldf['high'],
low=ldf['low'],
close=ldf['close'],
name = 'OHLC Market Data'))
for s in tdf.columns:
    if "sema" in s:
        fig.add_trace(go.Scatter(x=ldf.index, y=ldf[s], mode="lines",
                                 line=dict(color='rgb(104, 204, 204)'),
                                 name=s.upper()))
    if "lema" in s:
        fig.add_trace(go.Scatter(x=ldf.index, y=ldf[s], mode="lines",
                                 line=dict(color='rgb(255, 24, 24)'),
                                 name=s.upper()))
fig.update_layout(
title= "AAPL Stock Data",
yaxis_title="Stock's Price in USD",
xaxis_title="Date")
fig.show()
Looking at the trends over the last 1000 days, a crossover can be seen around November 2, where the SEMA group crossed over the LEMA group; that is a sign of a price fall. Similarly, after February 15, the SEMA group crossed over the LEMA group again, which is a sign of a price rise.
The above plot is interactive in the interactive version of this blog; please refer to it there.
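Reading crossovers off the chart by eye can also be approximated in code. The sketch below labels each day by whether the whole short-term group sits above or below the whole long-term group. The function name `gmma_regime` and this "all short EMAs above all long EMAs" rule are assumptions made for illustration, not a standard GMMA definition, and the prices are synthetic.

```python
import numpy as np
import pandas as pd


def gmma_regime(close, short=(3, 5, 8, 10, 12, 15), long=(30, 35, 40, 45, 50, 60)):
    """Label each row +1 (short group above long group), -1 (below) or 0 (groups overlap)."""
    short_emas = pd.concat([close.ewm(span=n, adjust=False).mean() for n in short], axis=1)
    long_emas = pd.concat([close.ewm(span=n, adjust=False).mean() for n in long], axis=1)
    above = short_emas.min(axis=1) > long_emas.max(axis=1)
    below = short_emas.max(axis=1) < long_emas.min(axis=1)
    return pd.Series(np.where(above, 1, np.where(below, -1, 0)), index=close.index)


# Synthetic prices: a steady rise followed by a steady fall (not real market data).
close = pd.Series(np.concatenate([np.linspace(100, 200, 150), np.linspace(200, 100, 150)]))
regime = gmma_regime(close)

# Rows where the label changes mark the start or end of a crossover.
changes = regime[regime.diff().fillna(0) != 0]
print(changes)
```

On the frame built in this post, something like `gmma_regime(tdf["close"])` could be used the same way; the 0 label covers the days when the two groups are still tangled.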
def ppo(tdf, col="close", sm=12, lm=26):
    tdf[f"sema_{col}_{sm}"] = tdf[col].ewm(span=sm, min_periods=sm, adjust=False).mean()
    tdf[f"lema_{col}_{lm}"] = tdf[col].ewm(span=lm, min_periods=lm, adjust=False).mean()
    tdf["ppo"] = (tdf[f"sema_{col}_{sm}"]-tdf[f"lema_{col}_{lm}"]) / tdf[f"lema_{col}_{lm}"] * 100
    tdf["signal_line"] = tdf.ppo.ewm(span=9, min_periods=9, adjust=False).mean()
    tdf["ppo_hist"] = tdf["ppo"]-tdf["signal_line"]
    return tdf
tdf = df.copy()
tdf=ppo(tdf)
tdf
Date | open | high | low | close | adj_close | volume | sema_close_12 | lema_close_26 | ppo | signal_line | ppo_hist |
---|---|---|---|---|---|---|---|---|---|---|---|
1980-12-12 | 0.128348 | 0.128906 | 0.128348 | 0.128348 | 0.100326 | 469033600 | NaN | NaN | NaN | NaN | NaN |
1980-12-15 | 0.122210 | 0.122210 | 0.121652 | 0.121652 | 0.095092 | 175884800 | NaN | NaN | NaN | NaN | NaN |
1980-12-16 | 0.113281 | 0.113281 | 0.112723 | 0.112723 | 0.088112 | 105728000 | NaN | NaN | NaN | NaN | NaN |
1980-12-17 | 0.115513 | 0.116071 | 0.115513 | 0.115513 | 0.090293 | 86441600 | NaN | NaN | NaN | NaN | NaN |
1980-12-18 | 0.118862 | 0.119420 | 0.118862 | 0.118862 | 0.092911 | 73449600 | NaN | NaN | NaN | NaN | NaN |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
2022-02-28 | 163.059998 | 165.419998 | 162.429993 | 165.119995 | 165.119995 | 94869100 | 166.644138 | 168.339878 | -1.007331 | -0.530792 | -0.476539 |
2022-03-01 | 164.699997 | 166.600006 | 161.970001 | 163.199997 | 163.199997 | 83474400 | 166.114271 | 167.959146 | -1.098407 | -0.644315 | -0.454093 |
2022-03-02 | 164.389999 | 167.360001 | 162.949997 | 166.559998 | 166.559998 | 79724800 | 166.182844 | 167.855506 | -0.996489 | -0.714750 | -0.281739 |
2022-03-03 | 168.470001 | 168.910004 | 165.550003 | 166.229996 | 166.229996 | 76335600 | 166.190098 | 167.735098 | -0.921095 | -0.756019 | -0.165076 |
2022-03-04 | 164.490005 | 165.550003 | 162.110001 | 162.539993 | 162.539993 | 29556127 | 165.628543 | 167.350275 | -1.028819 | -0.810579 | -0.218240 |
from plotly.subplots import make_subplots
fig=make_subplots(specs=[[{"secondary_y": True}]])
lastn = 1000
ldf = tdf[-lastn:]
fig.add_trace(go.Candlestick(x=ldf.index,
open=ldf['open'],
high=ldf['high'],
low=ldf['low'],
close=ldf['close'],
name = 'OHLC Market Data'))
for s in tdf.columns:
    if "sema" in s:
        fig.add_trace(go.Scatter(x=ldf.index, y=ldf[s], mode="lines",
                                 line=dict(color='rgb(104, 204, 204)'),
                                 name=s.upper()))
    if "lema" in s:
        fig.add_trace(go.Scatter(x=ldf.index, y=ldf[s], mode="lines",
                                 line=dict(color='rgb(255, 24, 24)'),
                                 name=s.upper()))
# red bars where the histogram is negative, green otherwise
clrred = 'rgb(222,0,0)'
clrgrn = 'rgb(0,222,0)'
clrs = [clrred if p<0 else clrgrn for p in ldf.ppo_hist]
fig.add_trace(go.Scatter(x=ldf.index, y=ldf.ppo, mode="lines", name="PPO"), secondary_y=True)
fig.add_trace(go.Bar(x=ldf.index, y=ldf.ppo_hist, name="PPO_Hist", marker=dict(color=clrs)), secondary_y=True)
fig.add_trace(go.Scatter(x=ldf.index, y=ldf.signal_line, mode="lines", name="Signal_Line"), secondary_y=True)
fig.update_layout(
title= "AAPL Stock Data (PPO Plot)",
yaxis_title="Stock's Price in USD",
xaxis_title="Date")
fig.show()
In the above plot, we changed the color of the histogram bars whenever a crossover happens, which lets us make assumptions based on the color. Also, by plotting the candlesticks, we can see the daily performance and the performance over a period of time together.
MACD is often considered an oscillator indicator, but it also shows the trend and some sort of volatility over a period of time by subtracting the 26-period EMA from the 12-period EMA. A period here can be a day, week, month and so on, so the periods can be changed according to our needs. It is the same as the PPO except that we do not take the percentage.
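The color change in the histogram corresponds to the PPO line crossing its signal line, so the crossover days can also be listed directly. This is a minimal sketch: the helper name `histogram_crossovers` and the demo values are hypothetical, and it uses the same rule as the bar colors (red below zero, green otherwise).

```python
import pandas as pd


def histogram_crossovers(hist: pd.Series) -> pd.Series:
    """+1 where the histogram turns from negative to non-negative, -1 for the reverse."""
    state = (hist.dropna() >= 0).astype(int)  # 1 = green bar, 0 = red bar
    change = state.diff()                     # NaN on the first row, then -1, 0 or +1
    return change[change.isin([1, -1])].astype(int)


# Hypothetical histogram values, chosen only to illustrate the sign flips.
demo_hist = pd.Series([-0.5, -0.2, 0.1, 0.3, -0.1, -0.4, 0.2])
print(histogram_crossovers(demo_hist))
```

On the data from this post, `histogram_crossovers(tdf["ppo_hist"])` would return the dates on which the bars change color.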
def macd(tdf, col="close", sm=12, lm=26):
    tdf[f"sema_{col}_{sm}"] = tdf[col].ewm(span=sm, min_periods=sm, adjust=False).mean()
    tdf[f"lema_{col}_{lm}"] = tdf[col].ewm(span=lm, min_periods=lm, adjust=False).mean()
    tdf["macd"] = (tdf[f"sema_{col}_{sm}"]-tdf[f"lema_{col}_{lm}"])
    tdf["signal_line"] = tdf.macd.ewm(span=9, min_periods=9, adjust=False).mean()
    tdf["macd_hist"] = tdf["macd"]-tdf["signal_line"]
    return tdf
tdf = df.copy()
tdf=macd(tdf)
tdf
Date | open | high | low | close | adj_close | volume | sema_close_12 | lema_close_26 | macd | signal_line | macd_hist |
---|---|---|---|---|---|---|---|---|---|---|---|
1980-12-12 | 0.128348 | 0.128906 | 0.128348 | 0.128348 | 0.100326 | 469033600 | NaN | NaN | NaN | NaN | NaN |
1980-12-15 | 0.122210 | 0.122210 | 0.121652 | 0.121652 | 0.095092 | 175884800 | NaN | NaN | NaN | NaN | NaN |
1980-12-16 | 0.113281 | 0.113281 | 0.112723 | 0.112723 | 0.088112 | 105728000 | NaN | NaN | NaN | NaN | NaN |
1980-12-17 | 0.115513 | 0.116071 | 0.115513 | 0.115513 | 0.090293 | 86441600 | NaN | NaN | NaN | NaN | NaN |
1980-12-18 | 0.118862 | 0.119420 | 0.118862 | 0.118862 | 0.092911 | 73449600 | NaN | NaN | NaN | NaN | NaN |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
2022-02-28 | 163.059998 | 165.419998 | 162.429993 | 165.119995 | 165.119995 | 94869100 | 166.644138 | 168.339878 | -1.695740 | -0.894153 | -0.801587 |
2022-03-01 | 164.699997 | 166.600006 | 161.970001 | 163.199997 | 163.199997 | 83474400 | 166.114271 | 167.959146 | -1.844876 | -1.084298 | -0.760578 |
2022-03-02 | 164.389999 | 167.360001 | 162.949997 | 166.559998 | 166.559998 | 79724800 | 166.182844 | 167.855506 | -1.672662 | -1.201970 | -0.470691 |
2022-03-03 | 168.470001 | 168.910004 | 165.550003 | 166.229996 | 166.229996 | 76335600 | 166.190098 | 167.735098 | -1.544999 | -1.270576 | -0.274423 |
2022-03-04 | 164.490005 | 165.550003 | 162.110001 | 162.539993 | 162.539993 | 29556127 | 165.628543 | 167.350275 | -1.721732 | -1.360807 | -0.360924 |
from plotly.subplots import make_subplots
fig=make_subplots(specs=[[{"secondary_y": True}]])
lastn = 1000
ldf = tdf[-lastn:]
fig.add_trace(go.Candlestick(x=ldf.index,
open=ldf['open'],
high=ldf['high'],
low=ldf['low'],
close=ldf['close'],
name = 'OHLC Market Data'))
for s in tdf.columns:
    if "sema" in s:
        fig.add_trace(go.Scatter(x=ldf.index, y=ldf[s], mode="lines",
                                 line=dict(color='rgb(104, 204, 204)'),
                                 name=s.upper()))
    if "lema" in s:
        fig.add_trace(go.Scatter(x=ldf.index, y=ldf[s], mode="lines",
                                 line=dict(color='rgb(255, 24, 24)'),
                                 name=s.upper()))
# red bars where the histogram is negative, green otherwise
clrred = 'rgb(222,0,0)'
clrgrn = 'rgb(0,222,0)'
clrs = [clrred if p<0 else clrgrn for p in ldf.macd_hist]
fig.add_trace(go.Scatter(x=ldf.index, y=ldf.macd, mode="lines", name="MACD"), secondary_y=True)
fig.add_trace(go.Bar(x=ldf.index, y=ldf.macd_hist, name="MACD_Hist", marker=dict(color=clrs)), secondary_y=True)
fig.add_trace(go.Scatter(x=ldf.index, y=ldf.signal_line, mode="lines", name="Signal_Line"), secondary_y=True)
fig.update_layout(
title= "AAPL Stock Data (MACD Plot)",
yaxis_title="Stock's Price in USD",
xaxis_title="Date")
fig.show()
The above plot looks similar to the PPO plot because both use the same EMAs; the only difference is that PPO is expressed as a percentage. Zooming into the plot makes this easier to see.
In this blog, we explored some popular trend indicators like GMMA, PPO and MACD. In the next blog, we will explore other indicators and so on. This blogging series will not end soon :P.
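The relationship between the two lines can be checked numerically. The sketch below uses a synthetic, always-positive price series (a sine wave around 100, not market data) and the same 12 and 26 spans as above. The variable names are chosen for this example only.

```python
import numpy as np
import pandas as pd

# Synthetic closing prices oscillating between 90 and 110 (not real market data).
close = pd.Series(100 + 10 * np.sin(np.linspace(0, 12, 300)))

fast = close.ewm(span=12, adjust=False).mean()
slow = close.ewm(span=26, adjust=False).mean()

macd_line = fast - slow                # absolute gap, in price units
ppo_line = (fast - slow) / slow * 100  # the same gap as a percentage of the slow EMA

# PPO is MACD rescaled by the slow EMA, so both lines change sign on the same days.
same_scaling = np.allclose(ppo_line, macd_line / slow * 100)
same_sign = bool((np.sign(ppo_line) == np.sign(macd_line)).all())
print(same_scaling, same_sign)
```

Because the slow EMA changes over time, the scaling factor is not constant, which is why the two plots look alike without being identical. The percentage form also makes PPO easier to compare across stocks with very different price levels.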