Sentiment Analysis using TweetNLP on r/singapore opinion of SG presidential candidate Mr. Tan Kin Lian

rosamundlim/TweetNLP-Sentiment-Analysis-Reddit-Singapore

TweetNLP-Sentiment-Analysis-Reddit-Singapore

Sentiment Analysis using TweetNLP on r/singapore opinion of SG presidential candidate Mr. Tan Kin Lian in 2023.

Background

On 1 September 2023, Singaporeans will vote to choose the country's ninth president. Among the eligible candidates, Mr. Tan Kin Lian was noted by several pundits as controversial. This is partly due to his social media posts, one of which led AWARE, a women's advocacy group, to raise concerns about his candidacy.

This project runs sentiment analysis on top r/singapore threads about Mr. Tan Kin Lian's candidacy, using the TweetNLP library, which is built on RoBERTa-based models from cardiffnlp. Note that r/singapore is likely to skew towards a younger demographic.

This is NOT intended to be an opinion piece on Mr. Tan's candidacy, and it only seeks to understand the sentiments of r/singapore users towards Mr. Tan.

Notable Tools utilized

  1. asyncpraw
  2. regex
  3. tweetnlp
  4. matplotlib/seaborn
  5. pandas

Scraping Reddit

Using the asyncpraw library, I extracted comments from 10 highly commented Reddit threads about Mr. Tan Kin Lian's candidacy, as of 26 August 2023.

Preprocessing and using the pre-trained model via TweetNLP

Preprocessing consisted of converting comments to lowercase and removing hyperlinks, punctuation and similar noise. Note that the TweetNLP model takes a string as input. TweetNLP is a relevant choice because its models are trained on social media text, and the way people write on Reddit is not very different from Twitter. The main issue is the idiosyncrasy of Singlish on r/singapore, which may be a challenge for a pre-trained model.
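The cleaning steps described above could look roughly like the sketch below. It uses only Python's built-in `re` module; the function name `preprocess_comment` and the exact patterns are assumptions for illustration, not the code used in this repository.

```python
import re

# Hypothetical patterns; the repository's actual rules may differ.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
NON_WORD_PATTERN = re.compile(r"[^\w\s]")   # anything that is not a letter, digit, underscore or whitespace
WHITESPACE_PATTERN = re.compile(r"\s+")


def preprocess_comment(text: str) -> str:
    """Lowercase a comment, drop hyperlinks and punctuation, and tidy whitespace."""
    text = text.lower()
    text = URL_PATTERN.sub(" ", text)        # remove links before stripping punctuation
    text = NON_WORD_PATTERN.sub(" ", text)   # also removes emoji, which may carry sentiment
    return WHITESPACE_PATTERN.sub(" ", text).strip()
```

One design caveat: stripping every non-word character also removes emoji and emphasis such as "!!!", which a model trained on tweets might otherwise use as a signal, so a gentler rule may be worth trying.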

Results

After tagging each comment with a label from the model, I prepared the dataframe and the relevant tabulations.
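As a rough sketch of the tagging and tabulation step, the code below labels each comment and counts the labels. The README does not state which TweetNLP task was used, so the `"sentiment"` task in `load_tweetnlp_classifier` is an assumption, and the helper names `tag_comments` and `tabulate_labels` are hypothetical. The classifier is passed in as a plain callable so the logic can be tried without downloading a model.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List

# Any callable that maps a string to a dict containing a "label" key.
Classifier = Callable[[str], Dict[str, str]]


def load_tweetnlp_classifier() -> Classifier:
    """Return TweetNLP's sentiment predictor (assumed task; downloads the model on first use)."""
    import tweetnlp  # imported lazily so the rest of the sketch runs without it

    model = tweetnlp.load_model("sentiment")
    return model.sentiment


def tag_comments(comments: Iterable[str], classify: Classifier) -> List[Dict[str, str]]:
    """Attach the predicted label to every non-empty comment."""
    rows = []
    for comment in comments:
        if not comment.strip():
            continue  # skip comments that are empty after preprocessing
        rows.append({"comment": comment, "label": classify(comment)["label"]})
    return rows


def tabulate_labels(rows: List[Dict[str, str]]) -> Counter:
    """Count how many comments received each label."""
    return Counter(row["label"] for row in rows)
```

With the real model, usage would be along the lines of `rows = tag_comments(cleaned_comments, load_tweetnlp_classifier())`, and the resulting list of dicts can be handed to `pandas.DataFrame(rows)` for further tabulation.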

Overall sentiment:

[Figure: overall_sentiments]

r/singapore users mostly react negatively to posts about Mr. Tan Kin Lian's candidacy, as seen from the labels assigned by the TweetNLP model.

Sentiment grouped by Reddit thread:

[Figure: article_sentiments]

Next, the comments are grouped by their r/singapore thread, and each label count within a thread is divided by the total number of comments in that thread to give a percentage.
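The per-thread normalization can be illustrated with a small standard-library sketch. The field names `thread` and `label` and the function `label_shares_by_thread` are assumptions for illustration; the repository itself works with a pandas dataframe.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable


def label_shares_by_thread(rows: Iterable[Dict[str, str]]) -> Dict[str, Dict[str, float]]:
    """For each thread, return the percentage of its comments that received each label."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row["thread"]][row["label"]] += 1

    shares = {}
    for thread, label_counts in counts.items():
        total = sum(label_counts.values())  # number of tagged comments in this thread
        shares[thread] = {
            label: 100.0 * n / total for label, n in label_counts.items()
        }
    return shares
```

With a dataframe holding the same two columns, `pandas.crosstab(df["thread"], df["label"], normalize="index")` should give the equivalent table as fractions rather than percentages. Normalizing within each thread keeps heavily commented threads from dominating the comparison.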

Most negative:

  1. Tan Kin Lian's 'pretty girls' posts ignite debate about objectifying women, assessment of presidential candidates
  2. Presidential hopeful Tan Kin Lian accuses various parties of smear campaign
  3. Tan Kin Lian rejects AWARE's concerns he 'objectifies' women, his daughter defends him

Most polarized (similar proportions of negative and positive emotions):

  1. Tan Kin Lian says he will channel public feedback if elected President
