This project forecasts future retail sales for fast-moving consumer goods (FMCG) stores using Snowflake's cloud data platform, Snowpark for Python for data analysis, and an XGBoost model for the predictions. Using Snowflake's infrastructure and ThoughtSpot's AI-Powered Analytics, we generated sales forecasts for the next six months. The project demonstrates how data science techniques combined with Snowflake and ThoughtSpot can extract insights from data to guide key business decisions and improve sales forecasting.
First, we connected to the Snowflake database using Snowpark for Python, which gave us easy access to all the retail data we needed.
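A minimal sketch of what that connection step might look like. The `SNOWFLAKE_*` environment variable names and both helper functions are assumptions made for illustration, not taken from the project; `Session.builder.configs(...).create()` is the standard Snowpark entry point.

```python
import os

# Connection keys understood by Snowpark's Session.builder.configs().
# Reading them from SNOWFLAKE_* environment variables is an assumption
# made here so that credentials stay out of the notebook.
REQUIRED_KEYS = ("account", "user", "password", "warehouse", "database", "schema")


def connection_parameters_from_env(env=None, prefix="SNOWFLAKE_"):
    """Collect connection parameters, failing early if any are missing."""
    env = os.environ if env is None else env
    params, missing = {}, []
    for key in REQUIRED_KEYS:
        name = prefix + key.upper()
        value = env.get(name)
        if value:
            params[key] = value
        else:
            missing.append(name)
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return params


def create_session(params):
    """Open a Snowpark session (requires the snowflake-snowpark-python package)."""
    # Imported lazily so the helper above works without Snowpark installed.
    from snowflake.snowpark import Session

    return Session.builder.configs(params).create()
```

With the variables set, `session = create_session(connection_parameters_from_env())` would return a session, and a call such as `session.table("RETAIL_SALES")` (a hypothetical table name) would then reference the retail data.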
Then we pulled a historical sales dataset covering sales and store demographics for different locations. This dataset had everything we needed to start digging in.
We cleaned up the data, loaded it into a pandas DataFrame, and organized it around the 'Transaction_Date' column. This made the data much easier to work with.
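As a rough illustration of this preparation step, the sketch below assumes that "organized around" means indexing and sorting by `Transaction_Date`. The `Sales` column, the toy rows, and the specific cleaning rules (dropping unparseable dates and exact duplicates) are assumptions; the notebook may clean the data differently.

```python
import pandas as pd


def prepare_sales_frame(raw: pd.DataFrame, date_col: str = "Transaction_Date") -> pd.DataFrame:
    """Parse dates, drop unusable rows, and index the frame chronologically."""
    df = raw.copy()
    # Unparseable or missing dates become NaT and are dropped below.
    df[date_col] = pd.to_datetime(df[date_col], errors="coerce")
    df = df.dropna(subset=[date_col]).drop_duplicates()
    return df.set_index(date_col).sort_index()


# Toy stand-in for the result of session.table(...).to_pandas()
raw = pd.DataFrame(
    {
        "Transaction_Date": ["2023-01-03", "2023-01-01", None, "2023-01-01"],
        "Sales": [120.0, 95.5, 40.0, 95.5],
    }
)
clean = prepare_sales_frame(raw)
print(clean)
```

A date-indexed frame makes later steps such as time-based splits and resampling straightforward.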
We decided to use XGBoost since it's great for problems like sales forecasting. We spent some time tuning it to fit our data just right, tweaking hyperparameters like the learning rate and number of estimators.
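One common way to tune these two hyperparameters is a grid search with time-ordered cross-validation. The sketch below is an assumption about how that could be done, not the project's actual procedure: the grid values, the `tune_xgboost` helper, and the choice of `TimeSeriesSplit` are all illustrative.

```python
from math import prod

# Candidate values are illustrative; the source only says that the learning
# rate and the number of estimators were tuned.
PARAM_GRID = {
    "learning_rate": [0.01, 0.05, 0.1],
    "n_estimators": [100, 300, 500],
}


def grid_size(param_grid):
    """Number of hyperparameter combinations the search will try."""
    return prod(len(values) for values in param_grid.values())


def tune_xgboost(X, y, param_grid=PARAM_GRID, n_splits=3):
    """Grid-search an XGBRegressor using time-ordered CV folds."""
    # Imported lazily; requires scikit-learn and xgboost.
    from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
    from xgboost import XGBRegressor

    search = GridSearchCV(
        estimator=XGBRegressor(objective="reg:squarederror", random_state=42),
        param_grid=param_grid,
        # Each fold trains on earlier rows and validates on later ones,
        # so X and y must already be sorted by date.
        cv=TimeSeriesSplit(n_splits=n_splits),
        scoring="neg_root_mean_squared_error",
    )
    search.fit(X, y)
    return search.best_estimator_, search.best_params_
```

With this grid, the search fits `grid_size(PARAM_GRID) * n_splits` models during cross-validation, plus one final refit on all rows, so the grid is worth keeping small for a first pass.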
To streamline training, we set up a stored procedure in Snowflake that re-trains the model on demand.
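A sketch of how such a procedure could be registered through Snowpark. The procedure name `RETRAIN_SALES_MODEL`, the stage `@ML_MODELS`, the target column `SALES`, and the training logic are all hypothetical; only `session.sproc.register` and `session.table(...).to_pandas()` are standard Snowpark calls.

```python
# Packages Snowflake should make available to the procedure at run time.
SPROC_PACKAGES = ["snowflake-snowpark-python", "pandas", "xgboost"]


def retrain_model(session, table_name: str) -> str:
    """Handler that runs inside Snowflake when the procedure is called."""
    from xgboost import XGBRegressor

    df = session.table(table_name).to_pandas()
    target = "SALES"  # hypothetical target column
    # Keep numeric features only; a real pipeline would also encode categoricals.
    X = df.drop(columns=[target]).select_dtypes("number")
    y = df[target]
    model = XGBRegressor(objective="reg:squarederror")
    model.fit(X, y)
    # A real procedure would persist the fitted model (for example to a stage);
    # this sketch only reports what it trained on.
    return f"trained on {len(df)} rows with {X.shape[1]} features"


def register_retrain_procedure(session, stage_location="@ML_MODELS"):
    """Register retrain_model as a permanent stored procedure."""
    # Imported lazily; requires snowflake-snowpark-python.
    from snowflake.snowpark.types import StringType

    return session.sproc.register(
        func=retrain_model,
        return_type=StringType(),
        input_types=[StringType()],  # excludes the implicit session argument
        name="RETRAIN_SALES_MODEL",
        is_permanent=True,
        stage_location=stage_location,
        packages=SPROC_PACKAGES,
        replace=True,
    )
```

Once registered, re-training would be a single call such as `session.call("RETRAIN_SALES_MODEL", "RETAIL_SALES")`, where the table name is again a placeholder.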
Then came the fun part: testing the model on three months of data to see how well it forecast sales. The results looked good!
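The evaluation could be set up roughly as follows, assuming the three test months are the most recent ones and that error is measured with RMSE and MAPE. Both are assumptions, since the metrics used are not stated here, and the toy series exists only to make the sketch runnable.

```python
import numpy as np
import pandas as pd


def split_last_months(df: pd.DataFrame, months: int = 3):
    """Hold out the final `months` of a date-indexed frame as the test set."""
    cutoff = df.index.max() - pd.DateOffset(months=months)
    return df[df.index <= cutoff], df[df.index > cutoff]


def rmse(actual, predicted) -> float:
    """Root mean squared error, in the same units as sales."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


def mape(actual, predicted) -> float:
    """Mean absolute percentage error; assumes no zero actuals."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)


# Toy daily series standing in for the real sales data
idx = pd.date_range("2023-01-01", "2023-12-31", freq="D")
sales = pd.DataFrame({"Sales": np.arange(len(idx), dtype=float) + 100.0}, index=idx)
train, test = split_last_months(sales, months=3)
```

Splitting by date rather than at random keeps future observations out of the training set, which matters for an honest forecast evaluation.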
Now our XGBoost model is ready to start forecasting sales for the next six months. This will give us crucial insights to make better decisions. For detailed instructions, see the notebook at https://github.com/sree-soundarya/Sales-forecasting/blob/main/forecasting.ipynb.