Duration: 2:00
Today's businesses rely on the most up-to-data information to take advantage of market changes, understand trends, and grow. The modern data stack is critical to unlocking this data by using modern technologies to store, transform, and visualize this data. During this tutorial, you will learn how to use some of the major components of the modern data stack to derive business insights using modern software development practices.
The key technologies in the modern data stack you will use are:
Snowflake is the Data Cloud for storing and managing data. It simplifies data pipelines, so you can focus on data and analytics instead of infrastructure management.
dbt is a transformation workflow that lets teams quickly and collaboratively deploy analytics code following software engineering best practices like modularity, portability, CI/CD, and documentation. Now anyone who knows SQL can build production-grade data pipelines. It transforms data in the warehouse, leveraging cloud data platforms like Snowflake.
ThoughtSpot is the UX layer for your data. It provides powerful, natural language search and rich visualization tools for self-service analytics. In addition, developers can share and embed data anywhere using APIs and SDKs in leading programming languages.
Let's get started.
- A trial Snowflake account with ACCOUNTADMIN access
- A ThoughtSpot account
- How to build scalable data transformation pipelines using dbt & Snowflake
- How to use ThoughtSpot to search and visualize data, and embed into web apps
- How to connect Snowflake,dbt, and ThoughtSpot to deliver immediate value from the modern data stack
Duration: 10:00
4. Create the database and schema
create database thoughtspot;
create schema thoughtspot.dbtqs;
5. Now create three tables:
Customers
create table thoughtspot.dbtqs.customers
( id integer,
first_name varchar,
last_name varchar
);
copy into thoughtspot.dbtqs.customers (id, first_name, last_name)
from 's3://dbt-tutorial-public/jaffle_shop_customers.csv'
file_format = (
type = 'CSV'
field_delimiter = ','
skip_header = 1
);
Orders
create table thoughtspot.dbtqs.orders
( id integer,
user_id integer,
order_date date,
status varchar,
_etl_loaded_at timestamp default current_timestamp
);
copy into thoughtspot.dbtqs.orders (id, user_id, order_date, status)
from 's3://dbt-tutorial-public/jaffle_shop_orders.csv'
file_format = (
type = 'CSV'
field_delimiter = ','
skip_header = 1
);
Payments
create table thoughtspot.dbtqs.payment
( id integer,
orderid integer,
paymentmethod varchar,
status varchar,
amount integer,
created date,
_batched_at timestamp default current_timestamp
);
copy into thoughtspot.dbtqs.payment (id, orderid, paymentmethod,
status, amount, created)
from 's3://dbt-tutorial-public/stripe_payments.csv'
file_format = (
type = 'CSV'
field_delimiter = ','
skip_header = 1
);
Once complete, execute the following statement to confirm everything is set up correctly:
select * from thoughtspot.dbtqs.customers;
select * from thoughtspot.dbtqs.orders;
select * from thoughtspot.dbtqs.payment;
Great! Now it’s time to set up dbt Cloud.
Duration: 5:00
Check to make sure your role is set as the ACCOUNTADMIN role.
Note: If you do not see Partner Connect in the dropdown,you are using the new Snowsight UI. In this case, Partner Connect is under the Admin menu. Select Admin from the left hand navigation and choose Partner connect there.
Duration: 7:00
To add sources to your project:
version: 2
sources:
- name: dbtqs_raw
database: thoughtspot
schema: dbtqs
tables:
- name: customers
- name: orders
Duration: 12:00
Now that we have our project set up, let's make some new models via dbt. Click the models folder and tap ..., then select New file. Name it models/customers.sql. Paste the following query and save.
with customers as (
Select
id as customer_id,
First_name,
Last_name
from {{ source('dbtqs_raw', 'customers') }}
),
orders as (
Select
id as order_id,
user_id as customer_id,
Order_date,
Status
from {{ source('dbtqs_raw', 'orders') }}
),
customer_orders as (
Select
Customer_id,
min(order_date) as first_order_date,
max(order_date) as most_recent_order_date,
count(order_id) as number_of_orders from orders
group by 1
),
final as (
Select
Customers.customer_id,
Customers.first_name,
Customers.last_name,
Customer_orders.first_order_date,
Customer_orders.most_recent_order_date,
coalesce(customer_orders.number_of_orders, 0) as number_of_orders
from customers
left join customer_orders using (customer_id)
)
select * from final
Save your changes, and enter dbt run in the command line prompt at the bottom of the dbt browser window and tap enter.
If everything is successful, you should see a pass message.
Duration: 12:00
Select
id as customer_id,
First_name,
Last_name
from {{ source('dbtqs_raw', 'customers') }}
Create a second new SQL file, models/stg_orders.sql, with the SQL from orders in our original query.
Select
id as order_id,
user_id as customer_id,
Order_date,
Status
from {{ source('dbtqs_raw', 'orders') }}
Then, edit models/customers.sql to include these models.
with customers as (
select * from {{ ref('stg_customers') }}
),orders as (
select * from {{ ref('stg_orders') }}
),
customer_orders as (
Select
Customer_id,
min(order_date) as first_order_date,
max(order_date) as most_recent_order_date,
count(order_id) as number_of_orders
from orders
group by 1
),
final as (
Select
Customers.customer_id,
Customers.first_name,
Customers.last_name,
Customer_orders.first_order_date,
Customer_orders.most_recent_order_date,
coalesce(customer_orders.number_of_orders, 0) as number_of_orders
from customers
left join customer_orders using (customer_id)
)
select * from final
If everything looks right, your Lineage view after saving models/customers.sql should look like this.All that is left to do is build your models using dbt run. This time, when you performed a dbt run, separate views/tables were created for stg_customers, stg_orders and customers. dbt inferred the order to run these models. Because customers depend on stg_customers and stg_orders, dbt builds customers last. You do not need to explicitly define these dependencies.
Duration: 10:00
dbt provides integrated testing and documentation tools to make it very easy to ensure your models perform as expected. The ThoughtSpot dbt wizard also uses the generated documents to introspect the schema.
To add tests to your project:
version: 2models: - name: customers columns: - name: customer_id tests: - unique - not_null - name: stg_customers columns: - name: customer_id tests: - unique - not_null - name: stg_orders columns: - name: order_id tests: - unique - not_null - name: status tests: - accepted_values: values: ['placed', 'shipped', 'completed', 'return_pending', 'returned'] - name: customer_id tests: - not_null - relationships: to: ref('stg_customers') field: customer_id
Adding documentation to your project allows you to describe your models in rich detail, and share that information with your team. Here, we're going to add some basic documentation to our project.
Update your models/schema.yml file to include some descriptions, such as those below.
version: 2models: - name: customers description: One record per customer columns: - name: customer_id description: Primary key tests: - unique - not_null - name: first_order_date description: NULL when a customer has not yet placed an order. - name: stg_customers description: This model cleans up customer data columns: - name: customer_id description: Primary key tests: - unique - not_null - name: stg_orders description: This model cleans up order data columns: - name: order_id description: Primary key tests: - unique - not_null - name: status tests: - accepted_values: values: ['placed', 'shipped', 'completed', 'return_pending', 'returned'] - name: customer_id tests: - not_null - relationships: to: ref('stg_customers') field: customer_id
Run dbt docs generate to generate the documentation for your project. dbt introspects your project and your warehouse a series of json files with metadata about your project, including lineage. Once generated, tap the Documentation tab to see your generated docs alongside those of the dbt platform.
Duration: 5:00
All that is left is to deploy your job/project.
Duration: 10:00
Now that you've created your models, the next step is helping business users derive insights from the data. This is where ThoughtSpot works as the user experience layer for the modern data stack. ThoughtSpot provides self-service analytics to allow users to find the answers they need and uncover additional insights they need to make better, more informed decisions.
The first step is to sign up for a free trial account. In a few minutes you will receive a verification email. Complete the steps to access your ThoughtSpot environment. Initially, you will be presented with a product tour, feel free to follow through the tour or tap the ThoughtSpot logo to skip.
The first thing that we need to do is create a connection to Snowflake. Tap the data tab from the top menu , then connections from the left hand navigation, and finally the Snowflake tile. Make sure you have your snowflake account open in another browser window. You will need some configuration information to complete the wizard
Follow the connection wizard using the information below, giving your connection the name ts+dbt_tutorial.
Field name | Value |
Account name or snowflake URL | This is the 8 digit snowflake instance number. You can access via the snowflake URL line. |
User | The username you signed up with in Snowflake |
Password | Your snowflake password |
Role | PC_DBT_ROLE |
Data Warehouse | PC_DBT_WH |
Database | PC_DBT_DB |
Once you have entered your configuration details and tapped continue, ThoughtSpot will retrieve a list of schemas your account has access to. You should see a schema with a unique name based on your id. For example, in the image below, my schema is DBT_QWALL If you do not, double check your configuration details above.
Go ahead and tap Create Connection. For now, all we need is the snowflake connection to be established correctly.
Duration: 10:00
Now that we have our connection to Snowflake established, we can take advantage of the models created in dbt to dynamically create worksheets and liveboards within ThoughtSpot.You can think of a worksheet as a normalized view of the data designed for the business, and liveboards as dynamic dashboards with the ability to continuously drill into the information to uncover insights.
ThoughtSpot understands dbt yaml definitions to help us dynamically create everything. Tap on data in the top menu, then this time, choose utilities. The utilities section contains a number of tools to help you with data on the platform. We are going to take advantage of the dbt integration wizard. Go ahead and tap on the open dbt integration wizard button.
Within the wizard, select ts+dbt_tutorial as the data warehouse. This will automatically set the database to PC_DBT_DB. Next, connect to your dbt project via dbt cloud with the information below.
Field name | Value |
API Key | Within dbt Cloud, tap the cog in the top right, select Account Settings > API Access. Scroll down and copy the API key. |
Account Id | Within dbt Cloud, make sure you are in the project you created earlier in this tutorial, then get account ID from the URL as shown below |
Project Id | You can grab the project id from same URL as account id |
Once you have entered all the fields, tap next. You will be presented with a list of folders to import. Your project has just one folder, creatively named models which contains your 3 models. Select that folder and tap Next.
Review the dbt tables you want. In our tutorial, we want them all, so you can tap Finish.
The wizard will now create tables and worksheets (logical views of data) within ThoughtSpot based on your models. If everything is successful, you will see two worksheets that are ready to search on. The great thing is that all your modeling work to create the Customers table is applied to your worksheet. Tap Exit to finish the wizard.
We can take advantage of some additional features in ThoughtSpot to make it easier for data to be searched. Many businesses use different terms across business units or in different countries. For example, some parts of the company might use "Last Name" whereas others use "Surname". ThoughtSpot allows you to assign synonyms to columns for easier searching. We will add two to the customers table.
Tap Data, and select Customers_Worksheet from the list, being careful not to select the Customers table by accident. There is no problem doing so, but worksheets are the logical view that we want to work with here.
Then, scroll to the right until you see the Synonyms column. In the Last_Name row, add “Surname”. Then in the Most_Recent_Order row add “Last Order” as a synonym. Finally, tap Save Changes.
Duration: 10:00
Let's go ahead and start searching our data. Tap Search Data from the top menu then, tap choose sources. Select CUSTOMERS_WORKSHEET and tap Done.
You will now be in the ThoughtSpot search interface. Go ahead and enter the following search term in the search box: Surname, NUMBER_OF_ORDERS monthly and tap Go. This will create a line chart which isn't exactly what we want. Tap the chart icon and select Stacked Column.
You will now see a chart showing March was a big month in sales. Hover over the colored blocks to see how many orders a customer made in that month. (Customer surnames are a single letter in the sample data we loaded into Snowflake). You might have noticed that we used the Surname synonym too. Nice job!
Rename the search to "Monthly orders", then tap the ... and select Save, and finally Save answer.
Next, we want to pin it to a Liveboard. You can think of a Liveboard as a collection of related Answers to help you run your business. We want a liveboard to know more about our orders and customers. Tap the Pin button, Select Create Liveboard and call it “Customer Insights”. Then tap the blue tick and Pin to pin it to your new board.
Let's create one more answer. Tap Search data to start a new search. Your customers_worksheet should still be selected. This time, we want to keep track of repeat customers. Add the following search: count CUSTOMER_ID NUMBER_OF_ORDERS >= 2 and Tap Go. ThoughtSpot knows this is a single result and will display the data accordingly. Rename the answer to "Gold Customers" and tap Pin to add it to your Liveboard.
Finally, tap on Liveboards from the Top Navigation and select Customer Insights.
Congratulations. You've created your first visualizations using ThoughtSpot and dbt. We only scratched the surface of ThoughtSpot analytics, but it demonstrates how easily you can leverage dbt models to quickly uncover insights in your data.
Duration: 10:00
Now that we have our answers and liveboards created, the next step is to embed these same components into a web app. This is a very common scenario for internal portals, adding analytics to existing apps, and sharing with customers for B2B scenarios. Think about how often you use embedded analytics in your daily lives: on your smartwatch to track fitness activity, within your banking app to keep track of credit scores and budgets, and of course, rideshare and shopping apps. Almost every app we use is driven by data.
ThoughtSpot makes it incredibly easy to embed these insights. To get started, whenever you embed components from ThoughtSpot into your webapp,you need to know their unique identifier and what interactive attributes you can set. The easiest way to do this, and have ThoughtSpot generate much of the code for you, is to use the Developer Playground.
Tap on Develop from the top navigation bar and select Playground from the right hand menu. Then, make sure the Select feature to embed is Search. Tap Select saved search, and choose “Monthly Orders”. As soon as you select this, the code window will update highlighting the unique identifier, answerId. Copy and paste this id to your clipboard. We will use this in our web app shortly.
You can also experiment with checking any of the user experience customizations and instantly see the code change. With the Developer Playground, you can save time searching through docs for the parameters you need. Just point and click to get the code you need. The Developer Playground is a great resource for experimenting with what you can do with the platform. When you are ready, tap Run to see the results rendered in the Playground.
Next, choose Liveboard from the Select feature to embed dropdown, then select Customer Insights and tap run. Copy and paste the liveboardId to your clipboard.
Duration: 15:00
Now that you have your answerId and liveboardId, it’s time to start embedding them into your web app. We are going to use the Visual Embed SDK to add our Answers and Liveboards to a React-based web app. To make things easy, we will use CodeSandbox, a web based IDE and runtime environment for building web apps.
Using CodeSandbox means we don’t need to spend time configuring our local environment for React development. The good news is that ThoughtSpot Everywhere uses the languages and developer processes you already know and love. If you already have your local environment setup for React development, feel free to use that.
In your browser, go to codesandbox.io and tap the Create a Sandbox button on the top right of the page, then select the React template. This will create a very simple web app, with code on the left, and rendered page on the right.
Next, add the ThoughtSpot Visual Embed SDK as a dependency to your project. Type @thoughtspot into the dependency pane, then select the Visual Embed SDK from the autocomplete dropdown.
Replace the contents of App.js with the following code, and save your changes. Your ThoughtSpot trial URL may begin with my1.thoughtspot.cloud or my2.thoughtspot.cloud, so match it accordingly. Save your changes.
import "./styles.css";import { init, AuthType } from "@thoughtspot/visual-embed-sdk";init({ thoughtSpotHost: "https://my1.thoughtspot.cloud", authType: AuthType.None});export default function App() { return ( <div className="App"> <h1>Hello ThoughtSpot Everywhere </h1> <p>This is a simple demo embedding ThoughSpot in a React App.</p> </div> );}
You will notice that the code includes some logic to authenticate with ThoughtSpot. In this tutorial, we are using AuthType.None. This will prompt the user to log in when the page loads. This is fine for the tutorial, but not recommended for a production app. For a detailed overview of security options supported by the Visual Embed SDK, please check out the online documentation.
Let’s embed the “Monthly orders” Search Answer. Right click on the src folder in your project, and select New File. Call it SearchComponent.js. Then, add the following code, replace YOUR-ANSWER-HERE with the searchId you pasted into your clipboard, and save your changes. The Visual Embed SDK comes with React components to make embedding analytics incredibly easy. In addition to React, you can use Typescript, Vanilla JavaScript, or other frameworks just as easily.
import "./styles.css";import { init, AuthType } from "@thoughtspot/visual-embed-sdk";import { SearchEmbed } from "@thoughtspot/visual-embed-sdk/react";export const SearchComp = () => { return ( <SearchEmbed frameParams={{ height: "80vw" }} answerId={"YOUR-ANSWERID-HERE"} /> );};
Next, update App.js to import the SearchComponent and render the results by adding the component within your App() function:
import { SearchComp } from "./SearchComponent";init({ thoughtSpotHost: "https://try.thoughtspot.cloud", authType: AuthType.None});export default function App() { return ( <div className="App"> <h1>Hello ThoughtSpot Everywhere </h1> <p>This is a simple demo embedding ThoughSpot in a React App.</p> <SearchComp /> </div> );}
As soon as you save your changes, the right hand frame will reload prompting you to log into ThoughtSpot. Once authenticated your Search Answer will render. You can click and interact with the results drilling down to further uncover insights.
Embedding a Liveboard component is very similar to embedding a Search component. Go ahead and create a new page in the src directory of your CodeSandbox project. Call it LiveboardComponent.js, and add the following code, replacing YOUR-LIVEBOARD-HERE with the liveboardId from your clipboard. If you no longer have the liveboardId, you can always get it from the Developer Playground.
import { LiveboardEmbed } from "@thoughtspot/visual-embed-sdk/react";export const LiveboardComp = () => { return ( <LiveboardEmbed frameParams={{ height: "80vw" }} liveboardId={"YOUR-LIVEBOARDID-HERE"} /> );};
Then, modify App.js to import the LiveboardComponent and render that component instead of the SearchComponent. Your completed App.js should look similar to this:
import "./styles.css";import { init, AuthType } from "@thoughtspot/visual-embed-sdk";import { SearchComp } from "./SearchComponent";import { LiveboardComp } from "./LiveboardComponent";init({ thoughtSpotHost: "https://try.thoughtspot.cloud", authType: AuthType.None});export default function App() { return ( <div className="App"> <h1>Hello ThoughtSpot Everywhere </h1> <p>This is a simple demo embedding ThoughSpot in a React App.</p> <LiveboardComp /> </div> );}
Save your changes, and the Liveboard will now render within your app. Great job!
Duration: 5:00
Throughout this tutorial you've created, connected, and implemented the major aspects of the modern data stack: you started with a Snowflake database and loading data into a schema, then leveraged dbt to model your data using modern development practices, and finally connected that dbt model to ThoughtSpot to derive analytics from the data to deliver business insights. Finally, you took the ThoughtSpot analytics and embedded them into a React-based web app to make it easier for all users to benefit. Sit back and congratulate yourself. It's time to have a well earned coffee.