A Comparative Study of Text Classification Methods: TF-IDF, FastText and BERT Embedding

Author(s): Asmi P, Gaya Nair and Shireen M T

Abstract: Text classification plays an important role in processing social media information. Machine learning methods need a deeper understanding of text to classify it accurately in many applications, yet documents and articles contain many words that are irrelevant to classification. In this paper we present a comparative study of three classification methods. 1) TF-IDF: a term-weighting method that translates each document into a vector by weighting how often each word occurs in the document against how common it is across the corpus. 2) FastText embedding: a feature learning technique in which each word is represented as a bag of character n-grams. 3) BERT embedding: a strong contextual word representation.
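To make the first two representations concrete, the following is a minimal sketch in plain Python, not the implementation used in this study. It assumes whitespace tokenization, an unsmoothed TF-IDF weight (term frequency times log of inverse document frequency), and an n-gram range of 3 to 6 characters with `<` and `>` as word boundary markers, which matches FastText's documented defaults. The function names `tfidf` and `char_ngrams` are hypothetical, and the sketch omits the whole-word token that FastText also keeps alongside the n-grams.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document (tf * idf, no smoothing)."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def char_ngrams(word, n_min=3, n_max=6):
    """FastText-style character n-grams of a word wrapped in < and > markers."""
    wrapped = f"<{word}>"
    return [
        wrapped[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(wrapped) - n + 1)
    ]
```

With these definitions, a word that appears in every document receives a TF-IDF weight of zero, which is how the method discounts words that carry no class information. The character n-grams let a model build a vector for an unseen word from the subword units it shares with known words.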