ABAC via tokens

ABAC via tokens

In Attribute-Based Access Control (ABAC) implementation, security entitlements are sent in as lists of attributes at session creation time via the authentication service.

The ABAC via tokens feature in ThoughtSpot 10.4.0.cl and later versions involves generating a token with filter rules and parameter values using the auth/token/custom API endpoint.

Important

ABAC attributes🔗

The most common best practice is to define filter rules within the token and place the is_mandatory_token_filter: true property on every column in a Worksheet or Model where a filter rule is expected. This will deny any access to data if a user has not been assigned values for the expected set of fields.

Administrators can set the following attributes for a user via the authentication token, along with the capability to assign the user to ThoughtSpot groups:

  • Filter rules
    Can filter multiple values of any data type. Binds to any Column in any Model or Worksheet with a matching column name in ThoughtSpot (and not the underlying database table column name).

  • Parameter values
    Binds a single value to any Parameter in any Worksheet or Model by Parameter Name and Type match. Parameters can be used in Worksheet formulas and then as Worksheet filters.

Important

The filter rules require passing the exact ThoughtSpot Worksheet or Model column name the values will not bind to any column. You must coordinate between the team that maintains the Worksheets and Models and the team that builds the token request service if any changes will be made to a Model or Worksheet. For the same reasons, the end users of an embedded app cannot have edit access to any Worksheet or Model using ABAC RLS via tokens. When column names change, ensure that the is_mandatory_token_filter: true property is set on every column where a filter rule is expected.

The request for a token with ABAC details can persist the set of filters and Parameter values to user sessions within ThoughtSpot, after which all sessions and scheduled reports will use the persisted values until they are changed by another token generation request.

Token request🔗

The ABAC message to ThoughtSpot is encoded in JSON Web Token (JWT) format. This token can be used as a bearer token for Cookieless trusted authentication or REST API access. You can also use it as a sign-in token to create a session, in which case, we recommend that the ABAC user properties be persisted, to ensure scheduled exports remain secure after the session ends.

Note

ThoughtSpot compresses the size of the JWT token by default to ensure that larger payloads, for instance, more complex filtering conditions can be passed via JWT. ThoughtSpot recommends leaving that compression on to ensure all JWT tokens can get properly interpreted by the application regardless of their size, and to obfuscate the values passed in the JWT payload. However, if you want to decode the JWT token and decode the values of the token (at the expense of compression), contact ThoughtSpot Support.

Request format🔗

The Custom access token API provides the fields to define the various ABAC attributes within an access token request.

The following properties of the Custom Access Token request are used for ABAC:

  • persist_option

  • filter_rules

  • parameter_values

  • objects

The format for the objects in each section follows the equivalent formats in the Visual Embed SDK for runtime filters and runtime Parameters.

An example of setting both filter_rules and parameter_values without any persistence:

"persist_option": "NONE",
"filter_rules": [
   {
     "column_name": "Region",
     "operator": "IN",
     "values": ["West", "Southwest"]
   },
   {
     "column_name": "Product Type",
     "operator": "IN",
     "values": ["Shirts", "Swimwear"]
   }
 ],
 "parameter_values": [
   {
     "name": "Secured",
     "values": ["rxzricmwfe87q7bh7jyg"]
   }
 ]

Apply to specific objects🔗

By default, any specified filter or parameter will bind to any content with an exact match for the column or Parameter name.

You can choose the filter or Parameter to apply only to specific ThoughtSpot objects by including an objects array in the following format:

"objects": [
   {
     // example of the format
     "type": "{OBJECT_TYPE}",
     "identifier": "{id or name of the object}"
   },
   {
      "type":"LIVEBOARD",
      "identifier": "9bd202f5-d431-44bf-9a07-b4f7be372125"

   }
]

Token-based override versus persisting values🔗

When using a persist_option other than NONE, the filter_rules and parameter_values defined in the token request using the auth/token/custom API endpoint are stored in the user > access_control_properties object.

Persisted values for a user are used by ThoughtSpot for any scheduled reports, as well as when no other token is provided.

Important
  • When persist_option is set to NONE, it only acts as an override, with the values tied only to the specific token. The REST API token request with any values where persist_option is not NONE acts as an "update the user object" API endpoint even if you don’t use the token generated from the API request.

  • Note that the stored properties of the user are updated when the token request is successful, rather than at the first use of the token.

The persist_option has the following possible values :

valueavailable versionbehavior

NONE

10.4.0.cl and later

User properties are not updated by the token request.

APPEND

10.4.0.cl and later

New attributes in the token request are added to existing properties of the user object.

RESET

10.4.0.cl and later

All persisted attributes on the user object are cleared. New attributes defined in the API request are still encoded in the token.

REPLACE

10.5.0.cl and later

All persisted rules and attributes of the user object are replaced with the set specified in the token request.

Note

In 10.4.0.cl, the REPLACE behavior can be achieved by making a RESET request followed by an APPEND request, then passing only the APPEND request token to the browser.

Filters and parameters must be persisted for them to apply to user sessions when using cookie-based trusted authentication or scheduled reports.

persistauthentication typebehavior

NONE

Cookieless Trusted Authentication

Attributes assigned to the token override the user’s access control properties, without updating the user object.

NONE

Cookie-based Trusted Authentication

Attributes assigned to the token will not be considered. The user logs in using a session cookie and the properties from the previous session persist.

APPEND or REPLACE

Cookieless Trusted Authentication

Attributes assigned to the token override the user’s access control properties, but the user object is also updated

APPEND or REPLACE

Cookie-based Trusted Authentication

Token request updates the user’s access control properties and the token is used by the Visual Embed SDK to start a session.

APPEND or REPLACE

Discard token after request

Token request updates the user object.

Persisting values opens up use cases for ABAC outside of the cookieless Trusted Authentication pattern: even if users authenticate via SAML, OIDC, or go directly into ThoughtSpot via username and password, an administrator can make a token request with a persist_option to write filter_rules and parameter_values to the user object.

Reset persisted values🔗

To fully remove all persisted filter_rules or parameter_values from a user object, set "persist_option" : "RESET" in the token request.

The requested token can still be used for ABAC if you included filter_rules or parameter_values in the request.

Token request test page🔗

A downloadable, customizable web page for testing all of the ABAC and trusted authentication capabilities is available on GitHub.

The username specified in the test page must have Administrator privilege or you can supply the secret_key for the ThoughtSpot instance to request a token for any user along with setting their ABAC properties.

See the trusted authentication documentation for full details on proper setup to allow trusted authentication.

Filter rules for multi-value RLS🔗

The ABAC via tokens pattern allows for setting arbitrary filters and overriding the values of existing Worksheet parameters. These two capabilities can be combined in various ways to create secure and unbreakable RLS.

Deny all by default🔗

Starting in ThoughtSpot 10.4.0.cl, you can add is_mandatory_token_filter: true to the TML definition of any column in a Worksheet or Model.

ThoughtSpot checks to see if the logged-in user has any filter_rules defined for a column marked with is_mandatory_filter: true, and denies access to any data if a filter rule for the matching column is not found.

Show All🔗

The way to show all values for a column protected by is_mandatory_token_filter: true is to pass the special keyword ["TS_WILDCARD_ALL"] as the value for the column in the filter_rules.

Columns without is_mandatory_token_filter: true will show all values if there is no filter_rule for that column.

Build the ABAC token request🔗

The token request service must have the following to build a token request for ABAC:

  1. Filter rules for defining multi-value conditions on columns

  2. Parameter values for use in Worksheet or Model formulas

The filter rules must be built by:

  1. Retrieving user data entitlements

  2. Translating entitlements into ThoughSpot filter_rules

Retrieve entitlements🔗

The value of the ABAC pattern is that you can send different combinations of filters for different types of users.

You can retrieve the attribute names and values from any source: the embedding application’s session details, an entitlement REST API, a query to a different database, etc.

Translate entitlements into filter rules🔗

Filter rules match on the name property of a column as defined in ThoughtSpot, not the column’s name in the underlying database table.

The token request service MUST know the ThoughtSpot column names that will be used for each of the attributes, so you’ll need to coordinate between ThoughtSpot Worksheet designers and the token request service to make sure the matching column names and values are being sent.

Important

End users of an embedded app cannot have edit access to any Worksheet using ABAC RLS via tokens.

You must follow proper development and deployment practices for all your customer-facing production environments as well as the production token request service.

As mentioned in the preceding section, the format for filter rules within the token matches with runtime filters in the Visual Embed SDK. In general, RLS entitlements are lists of values using the IN operator, but you can pass in filters on numeric and time columns using the full set of operators.

All values are passed into the token as arrays of strings, even if the column is a numeric, boolean, or date type in ThoughtSpot and the database. The column data type will be respected in the query issued to the database.

For example, let’s assume three attributes that are needed to filter down a user on a multi-tenanted database: Customer ID, Region, and Product Type.

The following is what the token request would look like if restricting on all three attributes:

"filter_rules": [
   {
     "column_name" : "Customer ID",
     "operator": "EQ",
     "values": ["492810"]
  },
   {
     "column_name": "Region",
     "operator": "IN",
     "values": ["West", "Southwest"]
   },
   {
     "column_name": "Product Type",
     "operator": "IN",
     "values": ["Shirts", "Swimwear"]
   }
 ]

A user might be entitled to all access on any given column (you might drop some levels of a hierarchy if you include more granular columns to filter on for that user).

The following is a request where a different user can see all Region, but still has restrictions on Customer ID and Product Type, using the TS_WILDCARD_ALL value to allow data even when the column expects a filter from the token:

"filter_rules": [
   {
     "column_name" : "Customer ID",
     "operator": "EQ",
     "values": ["TS_WILDCARD_ALL"]
   },
   {
     "column_name" : "Customer ID",
     "operator": "EQ",
     "values": ["492810"],
  },
   {
     "column_name": "Product Type",
     "operator": "IN",
     "values": ["Shirts", "Swimwear"],
   }
 ]

Because the filter_rules section is entirely within the control of the token request service, you have full flexibility to generate any set of filters for any type of user within the token.

Parameters to filter via formulas🔗

The basic pattern for using a Parameter to filter a Worksheet or Model includes these steps:

  1. Create Parameters in Worksheet

  2. Make formula that evaluates the Parameter’s default value and the expected values from the token

  3. Make Worksheet filter based on the formula, set to true.

Parameters are defined at the Worksheet level within ThoughtSpot. Parameters have a data type and a default value set by the Worksheet author.

You can also add is_hidden: true to a Parameter definition using TML, which allows the flexibility to use as many parameters as desired for any type of formula to be used as a Worksheet filter, without cluttering the visible UI.

To use a Parameter, you’ll create a formula on the Worksheet or Model. Worksheet filters can reference Worksheet formulas once they have been created, which creates the security layer out of the result derived from the formula.

All of these Worksheet-level features are set by clicking Edit on the Worksheet, then expanding the menu on the left sidebar:

Worksheet Edit Sidebar

Create the Worksheet security formula🔗

A Parameter doesn’t do anything on its own. You need a formula to evaluate the Parameter’s value.

Any security formula to be used on a Worksheet should result in a boolean true or false, and then the Worksheet filter will be set to only return data when the formula returns true.

Logic for groups to see all data

In any security formula you build, you may want a clause that gives access to all data to certain groups.

You can chain together any number of if…​then…​else clauses within a formula, including using the ts_groups or ts_username values, to build out your preferred logic:

if ( 'data developers' in ts_groups ) then true else if ( parameter_name = field_name ) then true else false

Formulas comparing a parameter to a column

Parameters can be used in a formula to directly match a value in a column, or any other type of function you’d like to use:

if ( parameter_name = field_name ) then true else false

Set a Worksheet filter on the Worksheet security formula🔗

The last step is to set a Worksheet filter on the formula you just created to evaluate the 'check parameter'.

  1. Click the Add new icon (+) next to Filters.

  2. Click the formula you created (at the end of the list) in the Filter dialog.

    Create New Filter on Worksheet

  3. Click Add values in bulk.

    Choose add values in bulk

  4. Type in the value true in the bulk dialog box, and then click Done.

    Type in true in bulk values box

  5. Click Done on the Filter dialog.
    You should see it listed on the Edit Worksheet page:

    Completed Worksheet filter

Use Parameters with pass-through functions🔗

ThoughtSpot SQL Pass-through functions allow you to send arbitrary SQL to the data source, while passing in values to substitute from ThoughtSpot.

The basic form of the SQL Pass-through function is:

sql_passthrough_function("<sql_statement>", <ThoughtSpot variable 1>, <ThoughtSpot variable 2>,…​)

The proper pass-through function to use in most cases is sql_bool_op, which can be used in a filter set to true as shown above.

The list of variables after are substituted into the SQL statement using curly braces in the order listed, starting at 0:

sql_bool_op ( "tableName.columnName IN ({0}, {1})" , parameterName0, parameterName1)

If you do not get all your data types correct, the ThoughtSpot-generated query will cause errors at the data warehouse level, which you will see in ThoughtSpot.

Fully-qualify all column references

If you are using a column that is part of the current ThoughtSpot Worksheet, you can simply reference that column using the substitution syntax of the pass-through functions:

sql_bool_op ( "{0} IN ({1}, {2})" , columnInWorksheet, parameterName0, parameterName1)

If you are referencing a field NOT in ThoughtSpot, perhaps a column that is not part of the JOINed data model or of a complex data type ThoughtSpot does not natively recognize, you need to fully-qualify the column name with at minimum tableName.columnName syntax in your query, so that the SQL is not ambiguous if a similar column name exists on multiple tables.

You may need a full database.schema.tableName.columnName syntax, in the standard syntax of the particular data warehouse being used, for the SQL to work within the rest of the ThoughtSpot-generated query.

Sub-queries to solve multi-step filtering scenarios

Many reporting solutions require very complex logic to establish a user’s data entitlements.

Traditionally, this can be solved by either writing dynamically generated SQL queries into an application or by placing the logic within a database in a way that the results of the logic can be referenced easily by other SQL queries. This may involve stored procedures, user-defined functions, temporary tables or any of the many other tools provided by a given RDBMS.

ThoughtSpot does not allow for the writing of arbitrary dynamic SQL. Pass-through functions must be written and remain the same for all users, with ThoughtSpot Parameters being the only aspect that can be changed dynamically.

Writing a sub-query as a pass-through function, with a parameter whose value is provided securely via ABAC, provides a method for accessing any number of tables or features within your data warehouse, while guaranteeing the filter will be applied to all searches made using the ThoughtSpot Worksheet.

The basic form of the sub-query formula is:

sql_bool_op('{0} IN (SELECT DatabaseField FROM FullyQualifiedTable WHERE OtherField = {1})', WorksheetField , Param)

The SQL when expanded out looks like:

ws.FieldNameAlias IN (
   SELECT DatabaseField
   FROM FullyQualifiedTable
   WHERE OtherField = '{ParameterValue}'
)

You could similarly call a UDF or stored procedure that returns a column of the same type as the WorksheetField:

sql_bool_op('{0} IN (udf_function_name({1}))', WorksheetField , Param)

The overall pattern is to use the Parameter value, sent in via ABAC, to retrieve a specific set of values set within the database, using whatever techniques are available.

An example workflow might be:

  1. User logs into embedding application.

  2. A stored procedure is called in the database to grab their latest entitlements and store those in a table, with a single "entitlement_session_id" returned to the application.

  3. The "entitlement_session_id" is sent as an ABAC parameter as part of the ThoughtSpot token request for the user.

  4. Worksheets that need these entitlements use the combination of pass-through function with parameter + formula + filter so that all queries in ThoughtSpot include a WHERE clause with the sub-select.

Known limitations🔗

  • The ABAC via tokens method requires using trusted authentication and using Worksheets or Models as data sources for Liveboards and Answers, rather than individual Table objects.

  • Several features within ThoughtSpot, such as autocompletion in Search on values within columns or the suggestions in Explore mode, use ThoughtSpot indexing. Due to the runtime nature of ABAC via tokens, ThoughtSpot indexing will not be restricted by the values supplied in a token.

    You must turn off indexing for any field that needs to be restricted by RLS when using ABAC via tokens for RLS, or also include an RLS Rule on fields that must also be filtered for the Indexing system.