A combined AraBERT and Voting Ensemble classifier model for Arabic sentiment analysis
Blog Article
For sentiment analysis of short texts (e.g., movie reviews or tweets), one approach is to build machine learning models that determine their tone (positive, negative, or neutral).
However, such natural language processing (NLP) studies are scarce for languages such as Arabic, which lack large-scale, high-quality training data. In this paper, we present three machine learning models designed to classify the sentiment of Arabic tweets, developed for a Kaggle competition. We present a Voting Ensemble classifier that takes advantage of both character-level and word-level features. We also propose an AraBERT (Arabic Bidirectional Encoder Representations from Transformers) model with preprocessing using Farasa Segmenter.
Finally, we combine these first two approaches into a third approach (a Voting Ensemble classifier using AraBERT embeddings). Performance measures show improvement over previous efforts for all models. The third model exhibits strong performance, with an F-score of 73.98%.
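To make the idea of a voting ensemble over character-level and word-level features more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the base classifiers, n-gram sizes, and voting rule used in the paper are not stated here, so the helper names (`char_ngrams`, `word_ngrams`, `majority_vote`), the n-gram sizes, the example tweet, and the hard-coded votes are all assumptions chosen for illustration.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text, n=1):
    """Return the overlapping word n-grams, using whitespace tokenization."""
    tokens = text.split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def majority_vote(labels):
    """Hard voting: return the most frequent label.

    Ties are broken in favor of the label that appears first in `labels`.
    """
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:
        if counts[label] == best:
            return label


# Hypothetical example tweet ("a very wonderful film").
tweet = "فيلم رائع جدا"
char_feats = char_ngrams(tweet, 3)   # character-level view of the tweet
word_feats = word_ngrams(tweet, 2)   # word-level view of the tweet
print(len(char_feats), len(word_feats))

# Hypothetical predictions from three base classifiers for the same tweet;
# in a real system each would be trained on one of the feature views above.
votes = ["positive", "negative", "positive"]
print(majority_vote(votes))
```

In a real pipeline, each feature view would feed one or more trained classifiers, and the ensemble would apply the vote to their predictions for every tweet.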
The work presented here could be useful for future studies and for new Arabic sentiment analysis online services or competitions.
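For readers unfamiliar with the F-score reported above, the following sketch computes a macro-averaged F1 over the three sentiment classes. Which averaging the competition actually used is not stated here, so macro averaging is an assumption, as are the function names and the toy labels.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 for a single class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, l) for l in labels) / len(labels)


# Toy gold labels and predictions, invented for illustration only.
y_true = ["positive", "negative", "neutral", "positive"]
y_pred = ["positive", "negative", "positive", "positive"]
print(round(macro_f1(y_true, y_pred), 2))
```

Macro averaging gives each class equal weight regardless of how many tweets it contains, which matters when the neutral, positive, and negative classes are imbalanced.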