Text Classification using Naive Bayes, Scratch to the Framework


This post is not an introduction to Naive Bayes; it shows one way to implement it for spam message classification. I have also made a YouTube video on this topic, available at the link below.

https://www.youtube.com/watch?v=jlQPojZlX2Q

The resources are available at the link below.

https://github.com/q-viper/ML-from-Basics

The steps to perform text classification are:

Lowercase the text
Remove punctuation
Perform bag of words
Frequency of words
Find bag of words
Probability of a word given the class, p(w/c)
Probability of the class given a word, p(c/w)
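The first few steps above can be sketched in plain Python. This is only a minimal sketch, not the code from the video or the repository: the function names `preprocess` and `bag_of_words` and the toy messages are made up for illustration, and punctuation is assumed to mean ASCII punctuation only.

```python
import string
from collections import Counter


def preprocess(text):
    """Lowercase the text, strip ASCII punctuation, and split on whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return text.split()


def bag_of_words(messages):
    """Count how often each word occurs across a list of messages."""
    counts = Counter()
    for message in messages:
        counts.update(preprocess(message))
    return counts


# Toy spam messages, invented for illustration only.
spam_messages = ["WIN a FREE prize now!!!", "Free entry, win cash"]
spam_counts = bag_of_words(spam_messages)
print(spam_counts)
```

Running the same function once per class gives the word frequencies that the later probability steps need.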

Before we can train the classifier, we have to preprocess the input data. Our data is in text format, so we convert it into vector form. In general, we build a data frame whose index is the examples and whose columns are all the unique words from the training set. Each cell then holds the probability of the word given the class, p(w/c). For prediction, we find p(c/w) using the simple Bayes formula:
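To make the table of p(w/c) concrete, here is a rough sketch that uses nested dictionaries in place of a data frame, with one row per class rather than per example. The function name `word_given_class`, the `alpha` parameter, and the toy training data are all assumptions for illustration. Add-one (Laplace) smoothing is also an assumption swapped in here so that a word never seen in a class does not get probability zero; the post itself does not say whether smoothing is used.

```python
from collections import Counter


def word_given_class(tokens_by_class, alpha=1.0):
    """Return {class: {word: p(w/c)}} with add-alpha smoothing (an assumption)."""
    # The vocabulary is every unique word seen in any class.
    vocab = sorted({w for docs in tokens_by_class.values() for doc in docs for w in doc})
    table = {}
    for label, docs in tokens_by_class.items():
        counts = Counter(w for doc in docs for w in doc)
        # Denominator: words in this class plus alpha for every vocabulary word.
        total = sum(counts.values()) + alpha * len(vocab)
        table[label] = {w: (counts[w] + alpha) / total for w in vocab}
    return table


# Already-tokenized toy messages, invented for illustration only.
training = {
    "spam": [["win", "free", "prize"], ["free", "cash"]],
    "ham": [["see", "you", "at", "lunch"], ["free", "for", "lunch"]],
}
table = word_given_class(training)
print(table["spam"]["free"], table["ham"]["free"])
```

With smoothing, each class row sums to 1 over the vocabulary, so the rows can be read as proper probability distributions.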

p(c/w) = p(w/c) * p(c) / p(w)
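For a whole message, the formula can be applied word by word under the "naive" assumption that words are independent given the class. The sketch below is one possible way to do it, not necessarily how the video does it: the function name `predict`, the probability values, and the equal priors are invented for illustration. It works in log space to avoid underflow, skips words that are missing from the table (an assumption), and never computes p(w) directly, because p(w) is the same for every class and drops out when the scores are normalized.

```python
import math


def predict(tokens, word_probs, priors):
    """Return {class: p(c/w)} for a tokenized message."""
    log_scores = {}
    for label, prior in priors.items():
        score = math.log(prior)  # log p(c)
        for word in tokens:
            # Assumption: words not in the table are simply ignored.
            if word in word_probs[label]:
                score += math.log(word_probs[label][word])  # + log p(w/c)
        log_scores[label] = score
    # Normalize so the posteriors sum to 1; this plays the role of dividing by p(w).
    highest = max(log_scores.values())
    unnormalized = {c: math.exp(s - highest) for c, s in log_scores.items()}
    total = sum(unnormalized.values())
    return {c: v / total for c, v in unnormalized.items()}


# Hypothetical p(w/c) values and class priors, for illustration only.
word_probs = {
    "spam": {"free": 0.3, "lunch": 0.05},
    "ham": {"free": 0.1, "lunch": 0.2},
}
priors = {"spam": 0.5, "ham": 0.5}

posterior = predict(["free"], word_probs, priors)
print(max(posterior, key=posterior.get), posterior)
```

The predicted class is simply the one with the highest p(c/w).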

Please follow along with the video for more information about the topic.

Thank you for reading the post and feel free to share it. :)


Quassarian Viper
