Sentiment Analysis of TroveSkin Application

nicodemusnaisau
11 min read · Jan 27, 2024


Hello, everyone! I’d like to share the insights from my thesis, detailing the entire process and highlighting my findings throughout the end-to-end journey.

image from https://www.troveskin.com/

Introduction

Technology has transformed the lifestyle of society, and Indonesia currently ranks sixth in global internet usage. As a result, access to information has become faster and more efficient, reaching all layers of society easily. The use of technology provides significant opportunities for businesses to expand their market by digitizing their products.

Accurate and relevant data are crucial for businesses and corporate organizations to understand consumer behavior and meet service needs. This data can be analyzed to identify new business opportunities, enhance operational efficiency, and optimize marketing strategies.

TroveSkin, an application for tracking skincare products, has gained popularity with over a million downloads and a 4.6 overall rating on Google Play Store. Despite positive scores, user reviews sometimes express dissatisfaction with the provided services.

Problem Statement

The abundance of user reviews on TroveSkin requires an efficient sentiment analysis system to evaluate and understand user sentiments accurately. This project aims to apply Data Mining and sentiment analysis using the Naive Bayes method to classify sentiments expressed in TroveSkin’s user reviews on Google Play Store.

Objectives of the Project

  1. Analyze user satisfaction with TroveSkin features and provide recommendations based on sentiment data.
  2. Classify user reviews using the Naive Bayes algorithm and implement Data Mining to assess sentiment.
  3. Analyze insights from TroveSkin sentiment analysis to recommend service improvements, enhancing user satisfaction.

Research Benefits

  1. Enhance TroveSkin’s understanding of user preferences and needs, aiding in the development of more user-centric products.
  2. Facilitate sentiment analysis through a visual dashboard, aiding TroveSkin in understanding and responding to user sentiments effectively.
  3. Predict future sentiments using historical data for strategic planning and campaigns.

Scope of the Project

The research is limited to:

  1. Data is obtained by scraping TroveSkin’s page on the Google Play Store.
  2. 3000 rows of data are preprocessed using Excel and Python.
  3. Reviews are limited to feedback from Indonesian users.
  4. The Naive Bayes algorithm is used for sentiment analysis.
  5. The dashboard is created with Tableau Public.
  6. Class balance is taken into account during modeling.

Expected Outputs

  1. Visual dashboard for sentiment analysis to aid in optimizing and maintaining TroveSkin’s services.
  2. Word cloud showing the dominant keywords in user reviews.
  3. Reference for future researchers in similar developments.

Research Methodology

Data Scraping

Review data is extracted from the TroveSkin application’s Google Play Store page. The first step is to install the required library with the command:

!pip install -qq google-play-scraper

The data scraping process requires specific parameters, facilitated by the Google-Play-Scraper library. The parameters used are

The objective is to obtain balanced data, achieved by retrieving data with a score filter ranging from 1 to 5. After obtaining the data population, the results of the entire scraping process are merged into a single CSV file.
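The per-score retrieval and merge described above could be sketched as follows. This is a minimal illustration, not the thesis code: it assumes the `reviews` function, the `Sort` enum, and the `filter_score_with` parameter of google-play-scraper, and the app ID, language, country, and count are placeholders. The fetcher is passed in as an argument so the merge logic can be tried without network access.

```python
import csv


def scrape_all_scores(fetch, app_id, per_score=100):
    """Fetch reviews for each star rating (1-5) and merge them into one list.

    `fetch(app_id, score, count)` must return a list of dicts that contain
    at least the keys 'content' and 'score'.
    """
    merged = []
    for score in range(1, 6):
        merged.extend(fetch(app_id, score, per_score))
    return merged


def play_store_fetch(app_id, score, count):
    """Fetcher backed by google-play-scraper (imported lazily, needs network)."""
    from google_play_scraper import Sort, reviews

    result, _continuation_token = reviews(
        app_id,
        lang="id",               # assumption: Indonesian-language reviews
        country="id",            # assumption: Indonesian store
        sort=Sort.NEWEST,
        count=count,
        filter_score_with=score,  # only reviews with this star rating
    )
    return result


def save_csv(rows, path, fields=("content", "score")):
    """Write the merged reviews to a single CSV file, keeping only `fields`."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fields, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
```

A call would look like `save_csv(scrape_all_scores(play_store_fetch, "com.example.app"), "reviews.csv")`, where `com.example.app` is a hypothetical ID to be replaced with the real package name.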

The visual representation of the data distribution based on ratings is

Finally, the scraping results for each rating are merged into a single CSV file. Once the scraping is complete, the next step is to label each review with a positive or negative sentiment.

Labeling Data

The data labeling process consists of two stages: automated labeling based on review scores and manual verification by three annotators. The automated labeling assigns negative labels to scores less than or equal to 3 and positive labels to scores greater than or equal to 4. The subsequent manual verification involves three annotators assessing the alignment between the review content and the assigned score.
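The score-based rule can be written in a few lines. The sketch below assumes each review is a dict with a `score` key; the field name `sentiment` is chosen here for illustration only.

```python
def label_from_score(score):
    """Map a 1-5 star rating to a sentiment label: <= 3 negative, >= 4 positive."""
    if not 1 <= score <= 5:
        raise ValueError(f"score must be between 1 and 5, got {score}")
    return "positive" if score >= 4 else "negative"


def add_sentiment(rows):
    """Return copies of the review dicts with a new 'sentiment' field."""
    return [{**row, "sentiment": label_from_score(row["score"])} for row in rows]
```

These labels are only a starting point, since the manual verification step can still overrule them.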

The automated labeling utilizes Python programming, creating a new field and setting conditions based on review scores. The resulting sentiment distribution is

Annotators verify and offer their perspectives based on the review content. Satisfaction with the application yields a positive sentiment, while dissatisfaction results in a negative sentiment.

Annotators’ sentiments are derived based on their verification of reviews

Discrepancies between positive and negative labels arise from variations in user reviews and assigned scores. The distribution details are explored in

Considering the significance of data balance for Naïve Bayes modeling, the author opts for balanced datasets of 1200 samples for each positive and negative class through random undersampling.
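Random undersampling to a fixed class size might look like the sketch below. The `sentiment` field name and the seed are assumptions, and the actual thesis code may have used a library routine instead.

```python
import random


def undersample(rows, label_key="sentiment", per_class=1200, seed=42):
    """Randomly keep `per_class` rows from each class and shuffle the result."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)

    balanced = []
    for label, members in sorted(by_class.items()):
        if len(members) < per_class:
            raise ValueError(f"class {label!r} has only {len(members)} rows")
        balanced.extend(rng.sample(members, per_class))

    rng.shuffle(balanced)
    return balanced
```

With `per_class=1200` and two classes, the balanced dataset contains 2400 rows.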

Preprocessing

Data preprocessing is a crucial step in ensuring the accuracy and suitability of data for subsequent analysis, particularly before word weighting. This report outlines the various stages of data preprocessing applied to user reviews.

Data Cleaning

In the data cleaning stage, characters such as numbers and punctuation are removed from the reviews. The table below illustrates the comparison before and after data cleaning.
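As a rough sketch, this cleaning step can be done with regular expressions. The exact character classes used in the thesis are not given, so the patterns below are assumptions.

```python
import re


def clean_text(text):
    """Remove digits and punctuation, then collapse repeated whitespace."""
    text = re.sub(r"\d+", " ", text)        # drop numbers
    text = re.sub(r"[^\w\s]|_", " ", text)  # drop punctuation and other symbols
    return re.sub(r"\s+", " ", text).strip()


print(clean_text("Bagus banget!!! 10/10, recommended :)"))
```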

Case Folding

Case folding converts all characters in a review to lowercase so that words such as “Bagus” and “bagus” are treated as the same term.

Tokenizing

Tokenizing divides the text into smaller units known as tokens.
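A minimal sketch of case folding followed by tokenizing is shown here. It assumes simple whitespace tokenization, which may differ from the tokenizer used in the thesis.

```python
def case_fold(text):
    """Convert the whole review to lowercase."""
    return text.lower()


def tokenize(text):
    """Split a review into tokens on whitespace."""
    return text.split()


tokens = tokenize(case_fold("Aplikasinya Bagus Banget"))
print(tokens)
```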

Normalization

Normalization involves removing unnecessary characters and standardizing words to lowercase.

Formalization

Formalization converts informal or non-standard words into their proper Indonesian equivalents.

Stopword Removal

Stopword removal eliminates words without meaningful content.
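Formalization and stopword removal both reduce to dictionary lookups over the token list. The word lists below are tiny illustrative samples, not the resources used in the thesis.

```python
# Illustrative samples only; a real project would load much larger lists.
SLANG_MAP = {"gak": "tidak", "yg": "yang", "bgt": "banget", "apk": "aplikasi"}
STOPWORDS = {"yang", "dan", "di", "ini", "itu"}


def formalize(tokens, mapping=SLANG_MAP):
    """Replace informal tokens with their standard form when one is known."""
    return [mapping.get(token, token) for token in tokens]


def remove_stopwords(tokens, stopwords=STOPWORDS):
    """Drop tokens that carry little meaning on their own."""
    return [token for token in tokens if token not in stopwords]


tokens = ["apk", "yg", "ini", "gak", "bisa", "login"]
print(remove_stopwords(formalize(tokens)))
```

Formalization runs first so that slang variants of stopwords, such as “yg”, are also removed. The negation “tidak” is deliberately left out of the sample stopword list because it can flip the sentiment of a review.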

Stemming

Stemming strips prefixes and suffixes from words to reduce them to their base form.

TF-IDF

Feature extraction is a crucial step in preparing text data for analysis. TF-IDF (Term Frequency-Inverse Document Frequency) is a widely used method to measure the importance of words in a document based on how often they appear in that document and how common they are across the entire document collection.

TF-IDF is calculated by multiplying the term frequency (TF) with the inverse document frequency (IDF). TF measures how frequently a term appears in a specific document, while IDF assesses the importance of a term across the entire document collection.
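The calculation can be illustrated with a small pure-Python sketch. It assumes TF is the term count divided by document length and IDF is log(N / df). Libraries such as scikit-learn apply smoothing and normalization by default, so their numbers will differ from this plain version.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Compute TF-IDF weights for a list of tokenized documents.

    TF  = count of the term in the document / number of tokens in the document
    IDF = log(N / df), where df is the number of documents containing the term
    """
    n_docs = len(documents)
    df = Counter(term for doc in documents for term in set(doc))

    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


docs = [["bagus", "bagus", "login"], ["login", "error"]]
for doc_weights in tf_idf(docs):
    print(doc_weights)
```

In this toy example “login” appears in every document, so its IDF is log(1) = 0 and its weight is zero. Words that appear in only one document keep a positive weight.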

Splitting Data

After the word weighting process, the next step is data splitting, which uses K-fold cross-validation to test the model’s performance. The simulation tests values of k, the number of folds, ranging from 3 to 10.

The table below illustrates a simulation of dividing data into 5 folds using K-fold cross-validation. The total data is split into five sets, and each set is used as a test set in one of the five iterations.

To determine the percentage of data for training and testing, the following formulas are used:

Percentage of Test Data = (Number of Test Data / Total Data) * 100

Percentage of Training Data = (Number of Training Data / Total Data) * 100

Using these formulas, for example, the percentage of test data is calculated as follows:

Percentage of Test Data = (480 / 2400) * 100 = 20%

Percentage of Training Data = (1920 / 2400) * 100 = 80%

Thus, the percentage of test data is 20%, and the percentage of training data is 80%.

In k-fold cross-validation, the data is separated into training and test sets automatically. With k = 5, each iteration’s training data constitutes 80% of the total available data.

In each iteration, the K-fold selects a specific fold of data as the testing data, while the other folds are used as training data. Consequently, in each iteration, 20% of the data is used as test data to evaluate the performance of the model trained using the training data.
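The fold arithmetic above can be checked with a short sketch that generates the index splits by hand. It does not shuffle the data, which a real run would normally do first, and it is not the splitting routine used in the thesis.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]

    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


for fold, (train, test) in enumerate(k_fold_indices(2400, k=5), start=1):
    share = 100 * len(test) / 2400
    print(f"fold {fold}: train={len(train)} test={len(test)} ({share:.0f}% test)")
```

For 2400 samples and k = 5, every fold has 1920 training rows and 480 test rows, matching the 80% and 20% split computed above.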

Model Testing and Cross-Validation Results

Following the modeling phase, the next step is to test the model’s performance on test data, using the splits produced by k-fold cross-validation. After several trials, k = 5 was found to give the best accuracy and model evaluation. The researcher therefore used 5 folds, which gives a data distribution of 80% for training and 20% for testing in each iteration.

The table above presents the accuracy results for each iteration or fold during cross-validation. In each iteration, the model is evaluated using separate test data, and the accuracy of its predictions is calculated.
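For readers who want to see what the classifier does on each fold, here is a compact multinomial Naive Bayes written from scratch. It is a simplified sketch: it works on raw token counts with Laplace smoothing rather than on TF-IDF weights, and the Naive Bayes variant used in the thesis is not stated, so treat the details as assumptions.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Multinomial Naive Bayes over token counts with Laplace smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {token for doc in docs for token in doc}
        self.counts = defaultdict(Counter)  # per-class token counts
        for doc, label in zip(docs, labels):
            self.counts[label].update(doc)
        label_counts = Counter(labels)
        self.prior = {c: math.log(label_counts[c] / len(labels)) for c in self.classes}
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}
        return self

    def predict_one(self, doc):
        vocab_size = len(self.vocab)
        scores = {}
        for c in self.classes:
            score = self.prior[c]
            for token in doc:
                if token in self.vocab:  # ignore words never seen in training
                    score += math.log(
                        (self.counts[c][token] + 1) / (self.totals[c] + vocab_size)
                    )
            scores[c] = score
        return max(scores, key=scores.get)

    def predict(self, docs):
        return [self.predict_one(doc) for doc in docs]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


train_docs = [["bagus", "suka"], ["bagus", "keren"], ["error", "login"], ["jelek", "error"]]
train_labels = ["positive", "positive", "negative", "negative"]
model = MultinomialNB().fit(train_docs, train_labels)
print(model.predict([["bagus"], ["error", "login"]]))
```

In a cross-validation run, `fit` would be called on the training folds and `accuracy` computed on the held-out fold, once per iteration.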

Explanation of Accuracy Results

  • Iteration 1: The model achieves an accuracy of 0.81, indicating an 81% accuracy when tested on the first fold of data.
  • Iteration 2: The model achieves an accuracy of 0.79, showing a 79% accuracy when tested on the second fold of data.
  • Iteration 3: The model achieves an accuracy of 0.84, demonstrating an 84% accuracy when tested on the third fold of data.
  • Iteration 4: The model achieves an accuracy of 0.75, indicating a 75% accuracy when tested on the fourth fold of data.
  • Iteration 5: The model achieves an accuracy of 0.76, showing a 76% accuracy when tested on the fifth fold of data.

The average accuracy across all iterations is reported as 0.80 (the rounded per-fold values above average to 0.79), representing the model’s overall performance across the data folds. The per-fold results show how well the model predicts on each test fold. Here the accuracy varies from iteration to iteration, ranging between 0.75 and 0.84. An average of roughly 0.80 indicates that, overall, the model performs reasonably well in making predictions.
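The summary statistics can be recomputed directly from the per-fold values listed above. Note that these are the rounded figures from the list, so the recomputed mean may differ slightly from one based on unrounded accuracies.

```python
from statistics import mean

# Per-fold accuracies as listed above (already rounded to two decimals).
fold_accuracies = [0.81, 0.79, 0.84, 0.75, 0.76]

average = mean(fold_accuracies)
print(f"mean = {average:.2f}")
print(f"min  = {min(fold_accuracies):.2f}")
print(f"max  = {max(fold_accuracies):.2f}")
```

From the rounded values the mean comes out at 0.79, with a minimum of 0.75 and a maximum of 0.84.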

Prediction

Manual Sentiment Prediction:

For a review entered manually, the model returns a sentiment prediction of either positive or negative.

Sentiment Prediction Using CSV Data:

The process of sentiment prediction involves the preparation of a CSV file containing review content. The file is then read into a Python program using Jupyter Notebook, and predictions are made for the sentiment of each review.
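A sketch of that batch flow is shown below. The column name `content` and the `keyword_predict` function are stand-ins: in practice the `predict` argument would be the trained Naive Bayes pipeline.

```python
import csv
import io


def predict_csv(csv_file, predict, text_column="content"):
    """Read reviews from an open CSV file and attach a predicted sentiment."""
    results = []
    for row in csv.DictReader(csv_file):
        results.append({**row, "predicted": predict(row[text_column])})
    return results


def keyword_predict(text):
    """Hypothetical stand-in for the trained classifier."""
    negative_words = {"error", "jelek", "lambat"}
    return "negative" if negative_words & set(text.lower().split()) else "positive"


# An in-memory file keeps the example self-contained; open("reviews.csv") works the same way.
sample = io.StringIO("content\naplikasi bagus\nlogin error terus\n")
for row in predict_csv(sample, keyword_predict):
    print(row["content"], "->", row["predicted"])
```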

Dashboard Visualization

Analysis & Insight

Sentiment Proportion:

  • Objective: Classify user-generated content as positive or negative sentiment.
  • Insight: The pie chart indicates that the majority of TroveSkin users express positive sentiments, accounting for approximately 59.90% of the total reviews. Negative sentiments make up around 40.10% of the reviews.

Sentiment Trend Analysis:

  • Objective: Analyze sentiment trends over time to identify changes in user opinions.
  • Insight: The sentiment trend line shows that in July 2019, TroveSkin received the highest number of positive sentiments, totaling 635 positive reviews. This suggests a peak in positive user perceptions during that period.

Major Positive Sentiment Aspects:

  • Objective: Identify the most frequently mentioned positive aspects of TroveSkin.
  • Insight: The word cloud for positive sentiments highlights words such as “bagus” (good), “bantu” (helpful), “suka” (like), “keren” (cool), and “membantu” (assisting), indicating aspects users appreciate.

Major Negative Sentiment Aspects:

  • Objective: Identify the most frequently mentioned issues or challenges in negative feedback.
  • Insight: The word cloud for negative sentiments highlights words such as “bagus” (good), “lumayan” (okay), “kamera” (camera), “login,” “jelek” (bad), and “error,” suggesting users often start with positive words but follow with criticisms.
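The word clouds are driven by per-sentiment word frequencies, which can be approximated with a counter. The sketch below uses made-up token lists and assumes the reviews have already been preprocessed and labeled.

```python
from collections import Counter


def top_words(token_lists, labels, sentiment, n=5):
    """Return the n most frequent tokens among reviews with the given label."""
    counts = Counter()
    for tokens, label in zip(token_lists, labels):
        if label == sentiment:
            counts.update(tokens)
    return counts.most_common(n)


# Toy data for illustration only.
reviews = [["bagus", "tapi", "error"], ["bagus", "suka"], ["kamera", "error"]]
labels = ["negative", "positive", "negative"]
print(top_words(reviews, labels, "negative", n=3))
```

Because the counts are taken per label and not per word polarity, a word like “bagus” still appears in the negative list whenever a negative review contains it.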

Root Cause Analysis:

  • Objective: Conduct an in-depth analysis of positive and negative sentiments to understand the root causes of satisfaction or dissatisfaction.
  • Insight: The content of reviews with high thumbs-up (>10) is analyzed to understand recurring themes affecting user experience.

Positive Findings:

  • Effectiveness: Users appreciate the app’s ability to enhance skin quality within a short time.
  • Beneficial and Useful: The app assists users in better facial skincare, identifying individual skin issues, and providing suitable product recommendations.
  • Product Recommendations: Users are pleased with the skincare product recommendations.
  • User-Friendly: Positive feedback on the app’s ease of use.
  • Engaging Content: Users enjoy interesting content and challenges.
  • Free and Reward Points: Users appreciate free skincare and reward points.

Negative Findings:

  • Inaccurate Face Analysis: Users report inaccuracies and inconsistency in face analysis results.
  • Upload and Recognition Issues: Problems with uploading face photos and face recognition failures.
  • Login and Installation Problems: Users face issues during login and installation.
  • App Errors and Technical Issues: Frequent app errors, server issues, slow loading, and network errors.
  • Disappointment with Paid Features: Users express disappointment with formerly free features now being paid.
  • Inconsistent Analysis Results: Some users feel the skin analysis results do not match their actual skin condition.

Recommendations:

  • Objective: Provide actionable recommendations based on positive and negative reviews to enhance user experience.

Positive Sentiment Recommendations:

  • Expand Skin Detection Features: Enhance comprehensive skin detection features.
  • Improve Product Recommendation Quality: Continuously enhance the accuracy of skincare product recommendations.
  • Diversify Offered Products: Increase the variety of recommended skincare products.
  • More Educational Content: Provide more educational content about skincare and common skin issues.
  • Skin Progress Tracking: Introduce a feature to track skin progress over time.

Negative Sentiment Recommendations:

  • Enhance App Performance: Address issues related to slow app performance and frequent crashes.
  • Responsive Feedback: Respond promptly and proactively to user feedback and support inquiries.
  • Improve Product Suitability: Ensure skincare product recommendations are more accurate and suitable.
  • Clear Instructions: Provide clear and easy-to-follow instructions, especially for users new to such apps.
  • Offer Alternative Features: Provide alternatives for users who prefer not to use paid features.
  • Consider User Privacy: Ensure robust data protection and offer privacy control options for users.

Conclusion:

  • Based on the sentiment analysis of 3000 reviews, the majority of TroveSkin users express positive sentiments. Recommendations focus on strengthening positive aspects and addressing issues highlighted in negative sentiments to improve overall user satisfaction. The analysis provides valuable insights for TroveSkin’s continuous improvement and user-focused enhancements.
