PythonAPI Tools
works in free trial | Built on 9.6.0.cl

Snowflake Snowpark Sentiment Analysis

Code by ThoughtSpot

Unlock the potential of sentiment analysis using Snowflake-Snowpark-Python and Amazon Beauty product reviews. By combining Snowpark's processing capabilities with ThoughtSpot’s AI-Powered Analytics, organizations can make complex data science insights readily available to their business users.

This code performs sentiment analysis on Amazon Beauty product review data, processes the data with Snowpark Python, and visualizes the results in ThoughtSpot.

Installation instructions

1. VSCode Installation

Download VSCode and install it by following the installation guide for your platform (Mac, Linux, or Windows).

2. Extensions Installation

Open VSCode, click the Extensions icon, then search for and install the ‘Jupyter’ and ‘Python’ extensions.

3. Python Libraries Installation

To perform sentiment analysis, install the following libraries from the terminal using 'pip install' or 'python -m pip install' followed by the library name:

* snowflake-snowpark-python
* snowflake-connector-python[pandas]
* pandas
* numpy
* matplotlib
* wordcloud
* nltk
* boto3
* text2emotion

4. Connecting to Snowflake

Create a new .json file containing your Snowflake account details for the connection:

{
  "account": "ADD YOUR SNOWFLAKE ACCOUNT NAME",
  "user": "***********",
  "password": "***********",
  "role": "USER_ROLE",
  "warehouse": "WAREHOUSE_NAME",
  "database": "DBNAME",
  "schema": "PUBLIC"
}
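As a minimal sketch of how such a file might be used, the snippet below loads it, checks that the seven keys shown above are present, and passes them to Snowpark's `Session.builder`. The file name `connection.json` and both function names are assumptions made for illustration; they are not taken from the provided code.

```python
import json
from pathlib import Path

# Keys taken from the example connection file above.
REQUIRED_KEYS = {"account", "user", "password", "role",
                 "warehouse", "database", "schema"}


def load_connection_parameters(path):
    """Read the connection .json file and verify that every key is present."""
    params = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = REQUIRED_KEYS - params.keys()
    if missing:
        raise KeyError(f"Missing connection settings: {sorted(missing)}")
    return params


def create_session(path="connection.json"):  # hypothetical file name
    """Open a Snowpark session (requires snowflake-snowpark-python)."""
    # Imported lazily so the loader above works without the package installed.
    from snowflake.snowpark import Session
    return Session.builder.configs(load_connection_parameters(path)).create()
```

Keeping the credentials in a separate file, rather than in the notebook, makes it easier to exclude them from version control.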

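5. Connecting to AWS S3 Bucket

Replace the placeholder AWS keys in the script with your actual keys so that it can read the JSON data from the S3 bucket. The reading function begins as follows (the rest of its body is not shown here):

def read_json(bucket: str, filename: str) -> T.Variant:
    import boto3
    import json
    import pandas as pd
    ...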
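For orientation only, here is a rough local sketch of reading a JSON file from S3 with boto3. It is not the script's `read_json` function (whose body and `T.Variant` return type belong to the provided code); the function names are hypothetical, and it assumes the files are newline-delimited JSON with one record per line.

```python
import json


def parse_json_lines(text):
    """Parse newline-delimited JSON (one object per line), skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def read_json_from_s3(bucket, filename, access_key_id, secret_access_key):
    """Download one S3 object and parse it as JSON lines (requires boto3)."""
    import boto3  # lazy import: parse_json_lines is usable without boto3

    s3 = boto3.client(
        "s3",
        aws_access_key_id=access_key_id,          # your real key goes here
        aws_secret_access_key=secret_access_key,  # your real secret goes here
    )
    body = s3.get_object(Bucket=bucket, Key=filename)["Body"].read()
    return parse_json_lines(body.decode("utf-8"))
```

Passing the keys in as arguments, rather than writing them into the function, keeps the secrets out of the code itself.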

6. Reading the Data

The Amazon review dataset consists of two files: All_Beauty.json and meta_All_Beauty.json. Scripts are provided to read and process this data from AWS.

7. Data Cleaning and Merging

Use the provided scripts to clean and merge the data, producing a dataset that is ready for analysis.
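The provided scripts are not reproduced here, but the general shape of a clean-and-merge step can be sketched in plain Python. The field names `asin` and `reviewText`, the cleaning rules, and both function names are assumptions for illustration; the real scripts may clean different columns or use pandas or Snowpark instead.

```python
def clean_reviews(reviews):
    """Drop reviews lacking text or a product id; trim whitespace from the text."""
    cleaned = []
    for review in reviews:
        text = (review.get("reviewText") or "").strip()
        if text and review.get("asin"):
            cleaned.append({**review, "reviewText": text})
    return cleaned


def merge_with_metadata(reviews, metadata):
    """Inner-join reviews to product metadata on the shared 'asin' key."""
    by_asin = {item["asin"]: item for item in metadata}
    # Review fields win if a key appears in both records.
    return [
        {**by_asin[review["asin"]], **review}
        for review in reviews
        if review["asin"] in by_asin
    ]
```

The same two steps map onto `DataFrame.dropna` and `DataFrame.merge` when the data is held in pandas.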

8. Calculating VADER Sentiment Score and Sentiment

Using helper functions, calculate VADER sentiment scores. The cleaned data, with the sentiment scores included, is then written to a Snowflake table named ‘BEAUTY_PRODUCT_REVIEWS’ for further analysis.
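The helper functions themselves are part of the provided code. As a hedged illustration of the idea, the sketch below scores text with NLTK's VADER analyzer and maps the compound score to a label. The function names are hypothetical, and the ±0.05 cut-offs are the conventional VADER thresholds, which may differ from those used in the provided code.

```python
def label_sentiment(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def score_reviews(texts):
    """Return (compound score, label) for each text (requires nltk).

    Run nltk.download("vader_lexicon") once before the first call.
    """
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()  # one analyzer reused for all texts
    results = []
    for text in texts:
        compound = analyzer.polarity_scores(text)["compound"]
        results.append((compound, label_sentiment(compound)))
    return results


def save_reviews(session, pandas_df, table_name="BEAUTY_PRODUCT_REVIEWS"):
    """Write a pandas DataFrame to Snowflake through an open Snowpark session."""
    return session.write_pandas(pandas_df, table_name, auto_create_table=True)
```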

9. Emotion Detection Using Review Data

Apply the text2emotion library to determine the emotions expressed in each review. The results are stored in a Snowflake table named ‘EMOTIONS_OVERALL’.
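A minimal sketch of this step, assuming that each review is scored independently: text2emotion's `get_emotion` returns a dictionary of scores for Happy, Angry, Surprise, Sad, and Fear. Reducing that dictionary to a single dominant emotion, and the function names used, are illustrative choices rather than something the provided code is known to do.

```python
def detect_emotions(text):
    """Return text2emotion's score dictionary for one review (requires text2emotion)."""
    import text2emotion as te  # lazy import: dominant_emotion works without it

    return te.get_emotion(text)


def dominant_emotion(scores):
    """Return the highest-scoring emotion, or None if no emotion was detected."""
    if not scores or max(scores.values()) == 0:
        return None
    return max(scores, key=scores.get)
```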

10. Category-based Analysis

Categorize products and perform an analysis based on these categories. The results can be stored and analyzed in another Snowflake table, ‘Category Analysis’. The provided code extracts only the necessary columns from the refined dataset and filters the data to the top 10 product categories, focusing on those with substantial review counts.
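The column selection and top-10 filter can be sketched as follows. This assumes that "top" means the categories with the most reviews; the `category` column name and the function names are hypothetical.

```python
from collections import Counter


def top_categories(rows, n=10):
    """Return the n categories with the most reviews, most frequent first."""
    counts = Counter(row["category"] for row in rows)
    return [category for category, _ in counts.most_common(n)]


def filter_top_categories(rows, columns, n=10):
    """Keep only the chosen columns, and only rows in the top-n categories."""
    keep = set(top_categories(rows, n))
    return [
        {column: row[column] for column in columns}
        for row in rows
        if row["category"] in keep
    ]
```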

11. Product Recommendations

Use the sentiment scores to classify each product as recommended or not recommended, based on the sentiment expressed in its reviews. The classification threshold is 0.75. Store the recommendations in a Snowflake table named ‘RECOMMENDATIONS’ for easy retrieval and analysis.
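One way the 0.75 threshold could be applied is sketched below. It assumes the threshold is compared against each product's mean sentiment score, which the text does not specify; the `asin` and `sentiment_score` column names and the function name are also hypothetical.

```python
from collections import defaultdict
from statistics import mean


def recommend(rows, threshold=0.75):
    """Map each product id to True if its mean sentiment score meets the threshold."""
    scores = defaultdict(list)
    for row in rows:
        scores[row["asin"]].append(row["sentiment_score"])
    return {asin: mean(values) >= threshold for asin, values in scores.items()}
```

A product with scores 0.9 and 0.8 would be recommended under this rule, while one with scores 0.9 and 0.2 would not.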

12. Examining Sentiment Trends Over Time

Track how the sentiment expressed in reviews changes across different periods. This analysis helps uncover patterns and fluctuations in sentiment, offering insight into how customer opinions and reactions shift. Store this data in the ‘SENTIMENT_TRENDS’ Snowflake table.
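As an illustration, the sketch below averages sentiment per calendar month. The monthly granularity, the `unixReviewTime` and `sentiment_score` column names, and the function name are assumptions; the provided code may aggregate by a different period.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean


def monthly_sentiment(rows):
    """Return the mean sentiment score per 'YYYY-MM' month, in chronological order."""
    buckets = defaultdict(list)
    for row in rows:
        # Interpret the review time as seconds since the Unix epoch, in UTC.
        when = datetime.fromtimestamp(row["unixReviewTime"], tz=timezone.utc)
        buckets[when.strftime("%Y-%m")].append(row["sentiment_score"])
    return {month: mean(values) for month, values in sorted(buckets.items())}
```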

Conclusion

Once ThoughtSpot is connected to Snowflake, its AI-Powered Analytics allows anyone to ask and answer questions about the Snowpark machine-learning results. That’s Snowpark for ThoughtSpot, bridging the gap between data science and business outcomes.