Multi-tenancy with groups

Multi-tenancy with groups

ThoughtSpot recommends using Orgs for multi-tenancy. If you don’t intend to enable Orgs on your instance, use the following options to segregate and isolate your tenants' data:

  • Multi-tenancy at the ThoughtSpot user level

    Each instance of ThoughtSpot has users, who belong to various groups. ThoughtSpot groups are the best mechanism for all access control and security within ThoughtSpot. Groups serve the purpose of folders, roles, and row-level security assignment in ThoughtSpot. When configured correctly, users from one organization never see content, groups, or other users from different organizations.

    Because search is the primary organization method within ThoughtSpot and group membership is the mechanism for access control, when viewing the server as an administrator, all of the users, groups and content will be available and the multi-tenanted nature presented to the individual users may not be obvious at a glance.

    Creating and auditing the groups, group membership and the sharing settings are best accomplished via the REST API. All settings and configurations are available through the ThoughtSpot UI, but at production scale all synchronization between the web application and ThoughtSpot is typically accomplished via the REST API.

  • Multi-tenancy at the data level

    ThoughtSpot connects to cloud data warehouses (CDW) to retrieve data. CDWs can be configured as multi-tenant or single-tenant.

    There are two aspects of groups which interact to create the “wall” between customer organizations: “shared content” and the “sharing visibility” property of groups and users.

Multi-tenancy at the user level🔗

Each group and user has a "sharing visibility" property, which can be set to SHAREABLE or NOT SHAREABLE.

When a group is NOT SHAREABLE, the users within that group cannot share to that group or to other users in that group.

NOT SHAREABLE groups provide access to content (or privileges) without breaking user separation. Each user has no concept they belong to the group; they simply have access to the content.

When a user is set to SHAREABLE, other users within the same SHAREABLE group as that user can share content to them individually.

Note

Any user with Share with all privilege, including administrators, can share objects with any group regardless of the shareable property. Every user is automatically added to the NOT SHAREABLE "All" group within ThoughtSpot.

Only grant the separate "Share with all privilege" to app developer / system level users, not to end customer users

The non-admin users can only share to:

  1. SHAREABLE groups they belong to

  2. SHAREABLE users within those groups.

SHAREABLE users' names are never seen by other users unless both users belong to the same SHAREABLE group.

To allow for multi-tenanted self-service, SHAREABLE groups can be created for each organization, so that only the users of each organization can share with one another. The choice of whether to make users shareable is up to you; if you prefer all sharing to be done at the group level, you can set all users to NOT SHAREABLE.

Note

Avoid using the same group for access control and other privileges. Because a user can share with anyone in a group they belong to, they can potentially share restricted data.

Multi-Tenancy at the data level🔗

We will call a database multi-tenanted when a single set of credentials can see all of the data for all customer organizations. In a multi-tenanted database, there is typically a column named “customer_id” or “tenant_id” on every row of data within the database; we’ll call it the tenant key. Filtering against this tenant key splits the data for each customer organization.

If the rules of multi-tenancy are more complex than a simple tenant key on each row of data, there should at least be one table which defines the tenants and can be joined in a way to filter the other tables appropriately.

If the cloud data warehouse you are connecting to is multi-tenanted as described above, then you will only need one set of data objects in ThoughtSpot. The tables will be configured with row-level security (RLS), so that every query will filter against the tenant key in the database.

If there is a separate database connection for each customer organization, then the database pattern is single-tenant, and you will need to create objects for each tenant. ThoughtSpot represents all data and content objects using the ThoughtSpot Modeling Language, with REST APIs for automating the creation or deployment of each tenant’s data and content objects via TML.

Access controls on content🔗

ThoughtSpot controls content access through the concept of sharing. Content in ThoughtSpot belongs to its creator (owner), and by default they are the only user who knows the content exists. This allows for self-service creation of new searches and Liveboards.

A user only sees content when it is:

  • Shared directly with the user

  • Shared with a group the user belongs to

  • Created or owned by the user

When content is shared to a group, all of the members of that group will have access to the content.

Content can be shared as Can View / READ_ONLY or Can Edit / MODIFY (UI / REST API). In general, all content provided to end users from the app developer would be shared as Can View / READ_ONLY.

Sharing can occur through the UI (including when embedded) or via the security REST APIs.

What content should be shared?🔗

While you can share individual tables from connections to users, the best practice is to create worksheets and only share the relevant worksheets to end users. Any Liveboards and saved answers shared to users should only connect to worksheets.

Remember to share the Worksheet as READ_ONLY along with the Liveboards and answers so the users can access self-service features such as changing filter values.

Row level security (RLS) groups🔗

Row level security (RLS) is used to filter the results of database queries to only show a user the data they should have access to.

RLS rules in ThoughtSpot use the username or the group names of the groups the user belongs to as part of all queries.

RLS groups must have names that exactly match values in the database. When RLS rules are in place, a user’s set of groups is placed into the WHERE clauses of queries in the form of WHERE [field] IN ('group_1', 'group_2', …​).

RLS groups should be set to NOT SHAREABLE. This way you know that content sharing only occurs via the content access groups. It is much simpler to audit content access by using separate groups for each functionality.

RLS can be considerably more complex than just splitting at the tenant level and ThoughtSpot does facilitate these more complex models (see the comprehensive examples and best practices guide). However, the basics of RLS to split at the tenant key level are always present and require the creation of the RLS groups.

Column level security (CLS) groups🔗

Column level security (CLS) can be configured at the individual table level through sharing. As with row level security groups, the best practice is to create separate groups specifically for the CLS groups.

Best practices for multi-tenant database and single-tenant databases models🔗

There are two basic architectures for storing different tenants' data within cloud data warehouses. The following sections describe a best practice starting point for deploying in ThoughtSpot depending on which of the architectures you have chosen for your CDW. REST APIs are available to deploy these patterns at scale. You can create groups, create users, add users to groups, publish content from TML objects, and share that content with the appropriate groups.

Multi-tenant database model🔗

The "multi-tenant database model" is designed on the following principles:

  • A single database to connect to, with a tenant key value that can be filtered on to retrieve data just for a single customer organization

  • Multiple customer organizations in ThoughtSpot

  • Content (answers and Liveboards) provided by the app developer

  • Users within the customer organizations can create their own content, and can share it with other users within their own organizations only

The multi-tenant database model is simpler to implement within ThoughtSpot than the single-tenant databases model. Because data security is enforced via RLS in the multi-tenant database model, ThoughtSpot only requires a single version of any object to serve all tenants. Even if your production databases are split as single tenants, you may choose to bring everything into a single database within your cloud data warehouse to enable this model.

Content provided by app developer🔗

The app developer (the ThoughtSpot customer) will create at minimum the data model objects within ThoughtSpot and typically some “pre-built” searches and Liveboards. Because there is a single database connection, there is only a need for one of each object. Row level security at the table level will ensure that each user only sees data from their organization, even though they are connecting to the same Liveboards and worksheets.

Objects created by the application developer to be shared with all users can be published by a single group that all users belong to; we’ll call this the “app content group” (the actual group name can be whatever you like, something like “prod standard reports”). The application group should be configured as NOT SHAREABLE, because every user will belong to this group.

In most cases, only worksheets should be shared to the end users, while the tables within the Worksheet do not (this is allowed by the default ThoughtSpot configuration). Thus there should be a separate group for just the tables; we’ll call this the “app data model group”.

If you want, you can publish all content in the application group from a single user representing the app developer or the application itself.

Content belonging to individual tenants🔗

To allow users to create their own content and share only within their organization, you will create at least one group for each tenant. This group should be set to SHAREABLE, since only those users within the group will see that content. If the app developer will be building custom content per tenant, you could create a separate group for that content, set to NOT SHAREABLE.

Summary of access groups for multi-tenant database model🔗

The following table lists the access groups needed for this model. There will also be privilege groups, data access groups, and development and test content groups. You can name the groups anything you’d like, with a naming scheme that makes sense to you. The "group type" names here are just indications of the purpose of those groups.

Reminder: when a group is set to NOT SHAREABLE, administrators can still share content to that group. NOT SHAREABLE groups are used for content provided by the app developer to end users.

Group typeContent shared to groupUsers in groupSharability

prod data model group

tables

app developer

NOT SHAREABLE

standard content group

worksheets, answers, Liveboards

all users

NOT SHAREABLE

tenant content groups (1 per tenant)

answers, Liveboards

tenant users per group

SHAREABLE

Multi-tenant database model

Single-tenant databases model🔗

The "single-tenant databases model" is designed on the following principles:

  • Each customer organization has its own database to connect to, with only that customer organization’s data present when making the database connection. Every database is similar in structure (table names and column names / data types).

  • Multiple customer organizations in ThoughtSpot

  • Content (answers and Liveboards) are provided by the app developer in the form of templates

  • Users within the customer organizations can create their own content, and can share it with other users within their own organizations only

If you have the choice between designing your cloud data warehouse along single-tenant or multi-tenant model, it will be simpler to implement in ThoughtSpot using the multi-tenant model.

Content provided by app developer🔗

Single-tenant databases require separate connections in ThoughtSpot for each database in most cases. There will then be separate objects on the ThoughtSpot Server for each connection. Because all of the objects other than the connection will be very similar, the deployment pattern can be handled through templating: there will be a set of template objects which are deployed for each tenant.

We can describe the template as the parent content, with child objects that descend from the template.

The template content itself will be built by the app developer, but will not be accessible to the customer organizations. Instead, there will be a deployment process that copies the template content and makes the necessary changes, and then publishes to the appropriate group for each customer.

Content provided by app developer to each tenant group🔗

Each tenant should have a group used to give access to the content provided by the app developer—a tenant application group. Only the application developer would publish content to this group, and it should be set to NOT SHAREABLE.

Content belonging to individual tenants🔗

To allow users to create their own content and share only within their organization, you will create at least one group for each tenant, separate from the application tenant group. This group can be set to SHAREABLE, or you may want additional groups below the main tenant group, representing different sets of users who belong to that tenant, and then make those child groups the ones that are SHAREABLE.

Summary of access groups for single-tenant databases model🔗

The following table lists the access groups needed for this model. There will also be privilege groups, data access groups, and development and test content groups. You can name the groups anything you’d like, with a naming scheme that makes sense to you. The "group type" names here are just indications of the purpose of those groups.

Reminder: when a group is set to NOT SHAREABLE, administrators can still share content to that group. NOT SHAREABLE groups are used for the content provided by the app developer to end users.

Group typeContent shared to groupUsers in groupSharability

prod template group

Template tables, worksheets, answers, Liveboards

app developer

SHAREABLE

standard data groups (1 per tenant)

tables (connected to tenant connection)

app developer

NOT SHAREABLE

standard content groups (1 per tenant)

worksheets, answers, Liveboards

tenant users per group

NOT SHAREABLE

tenant content groups (1 per tenant)

answers, Liveboards

tenant users per group

SHAREABLE

Single-tenant database model

Development and test content groups🔗

Most software development processes involve creating content in a restricted “development” environment, and then once the changes are finished, placing it in a “test” environment. Within a single ThoughtSpot instance, development and test content can be considered as another tenant’s, with access restricted to only app developer users.

For both of the multi-tenancy patterns above, add additional groups for dev and test with only members of your app development team.

Privilege groups🔗

Privileges in ThoughtSpot control the set of product features a user has access to. Privileges are assigned to users through groups.

A user’s privilege set is additive based on the groups they belong to; the user at all times has the full set of any privilege from any group they belong to. This is also to say that privileges do not apply only to content shared to the group.

The simplest best practice for assigning privileges to users is to create privilege groups, set to not shareable, with no content shared to them. When configured this way, a privilege group acts like a role definition, and users from any tenant can all belong to one of the server-wide privilege groups.

The REST API returns a user’s privilege set as part of the response from the GET /user/ endpoint.

Group hierarchy🔗

ThoughtSpot groups can be hierarchical; one group can be the parent of another group and so forth. We recommend not to use hierarchical groups in a multi-tenanted situation.

When groups are hierarchical, the rules for how privileges and row-level security are derived become complex. In particular, row-level security is achieved by returning the string value of the names of all groups a user belongs to. Hierarchical groups can vastly inflate the number of group names returned in an RLS query, reducing performance and introducing complexity in auditing.

Test user accounts🔗

As mentioned above, you will want to use REST API automation to synchronize the group structures and audit that you have configured them correctly. Another tool for auditing is to create test user accounts — user accounts that belong to the app developer, but are configured as if they are part of a customer organization.

Depending on your internal security policies, you may only want your test user accounts to log in to content attached to test data, rather than production customer data. In this case, you will create a full suite of test content groups simulating at least two “customers”, and test user accounts for each “access level” that exists for the end customer users.

Tags🔗

Tags are available in ThoughtSpot to label content and assist in searching. Content can be tagged with multiple tags.

Tags can be used as part of searches using the Metadata REST APIs, with the caveat that it is an inclusive list; the response will include all content with any of the tags sent, as opposed to only including content with the full set of tags.

Tags do not provide tenant separation🔗

Tags have no ownership and exist at the Server level, and all tags that exist are visible to all users at any time. Tags are visible in many places within the UI, particularly in the following places:

  • Data Source selector within search

  • Pages that list the existing answers, Liveboards, worksheets, and tables.

Why does this matter, even if you are only embedding Liveboards? SSO into ThoughtSpot creates a session that allows the user to go directly into the ThoughtSpot web UI if they find the underlying URL. While the URL is not obvious when embedding ThoughtSpot content, it is also not difficult to determine with basic knowledge of the web development tools built into web browsers.

Tags can be used for other distinctions and filtering🔗

A good use case for tags would be a “standard reports” tag, to identify content provided by the app developer. When using the REST API to determine the content a given user has access to, the “standard reports” tag would allow you to divide between content created by the app developer and content created by the tenants themselves.