TML modification

TML modification

TML is slightly different for every object type, but all follow a general pattern that allows for programmatic editing.

You can use the thoughtspot_tml Python library to access and transform any of the properties discussed within this document, most of which are covered in the README and the common_attribute_changes.py example script.

Best practicesπŸ”—

Because of the flexibility of the various data object types, there is no particular hierarchy to TML files, but rather just the following rules:

  • Data objects can only be joined to other objects from a single connection

  • Any single search (an answer or a visualization on a Liveboard) can only connect to data objects from a single connection The connection is thus the natural packaging level for importing TML files, except for Liveboards which may contain visualizations connected to different connections.

This is to say, for a given connection you can always upload all TML files connected to that connection safely except Liveboards.

One best practice recommendation is to only connect answers and Liveboards to worksheets. Although they can connect directly to tables and views, it is far simpler to adjust a single reference to one worksheet within the answer and Liveboard TML than many individual tables. The only exception is if you use a single table or view for the underlying search answer.

ThoughtSpot object model hierarchy and relationships

Export with fqn: propertyπŸ”—

Always export TML files with export_fqns=true (for versions lower than 9.0.0.cl, see Notes for older releases).

The fqn: property provides the GUID of associated objects, which can be swapped out when moving packages of TML files between environments. The following is an example of the connection: property within a Table TML:

connection:
 name: Demo Connection
 fqn: 2aa36dbd-dda6-4497-a6db-bc47e128862e

You can always remove an fqn: property when uploading a set of TML with the intention of creating all new objects.

Name properties are identifiers in connected objectsπŸ”—

When editing entire packages of exported TML files to be deployed in a different environment, you must consider changes to identifiers in other "downstream" TML files. ThoughtSpot will do the necessary updates on an existing data model.

One of the most important examples is that the name: property of a column in a Table is used as part of the column identifier in connected Worksheet objects.

A Table object might have a column with a display name property (name:) that has been modified to be different than the db_table: property (the actual name of the column in the database itself):

table:
  name: FACT_RETAPP_SALES
  db: RETAILAPPAREL
  schema: PUBLIC
  db_table: FACT_RETAPP_SALES
  connection:
    name: Demo Connection
    fqn: 2aa36dbd-dda6-4497-a6db-bc47e128862e
  columns:
  - name: Sales Id
    db_column_name: SALESID
    properties:
      column_type: MEASURE
      aggregation: SUM
    db_column_properties:
      data_type: INT64

The connected worksheet TML referencing the same column would be as shown in the following example. Note that it is the formatted name property from the Table TML, not db_table_name.

...
  worksheet_columns:
  - name: Product Name
    column_id: FACT_RETAPP_SALES_1::Sales Id
    properties:
      column_type: ATTRIBUTE
...

The pattern for column_id is a combination of the id from the table_paths section along with the name property of the column in the table object: {Worksheet.table_paths[n].id}::{Table.columns[n].name}

This is the section in Worksheet where the first part of the identifier is defined as the id property (table property references the tables: section above in the TML file):

table_paths:
  - id: DIM_RETAPP_PRODUCTS_1
    table: DIM_RETAPP_PRODUCTS
    join_path:
    - join:
      - C_DIM_RETAPP_PRODUCTS
  - id: FACT_RETAPP_SALES_1
    table: FACT_RETAPP_SALES
    join_path:
    - join:
      - C_FACT_RETAPP_SALES

ConnectionsπŸ”—

Connections contain the details and credentials necessary to connect to a database.

Connections do not have a TML representation, but can be created and modified via the data connection REST APIs. There is a connection YAML file (not TML but similar) that can be exported to save all details except for the password, but you must translate that back into a REST API call to create or update.

ThoughtSpot does not enforce uniqueness on connection names, but best practice is to give connections unique names.

TablesπŸ”—

Table objects represent an actual table (or view) within a database.

When importing a table through the ThoughtSpot UI, a table’s display name will default to the table’s name in the data warehouse, but you can change the display name to whatever you’d like. You can specify the display name within the TML document when creating a table via TML import by changing the name property, while leaving the db_table property as it was originally imported.

guid: <table_guid>
table:
  name: <table_name>
  db: <database_name>
  schema: <schema_name>
  db_table: <database_table_name>
  connection:
    name: <connection_name>
    fqn: <connection_guid>
 ...

Change table propertiesπŸ”—

You can replace the connection: name: or fqn: property to 'repoint' the table to a different data environment.

The db, schema, and db_table properties may also be changed easily along with connecton, assuming the schema/structure (field names and data types) are identical to the original object.

Change or add columnsπŸ”—

You can adjust the properties of existing columns or even add new column objects with TML.

When updating an existing object, do not change both name and db_column_name at the same time. The TML parser will consider this a delete of the original column and an add of a new column.

If you are modifying a package of TML for deployment to a new environment, changes to name property have downstream effects in connected Worksheet files.

  columns:
  - name: Sales Id
    db_column_name: SALESID
    properties:
      column_type: MEASURE
      aggregation: SUM
    db_column_properties:
      data_type: INT64

Join relationships and row-level security rulesπŸ”—

To import joins, you must import all the related Table TML files together at one time

Joins do not have separate TML representations, but are defined in the table TML in the joins_with section.

Row-level security (RLS) rules are defined in the rls_rules section of the table TML. Similar to the joins_with section, RLS rules must reference existing table objects for the TML to validate.

If you cannot upload your Table TMLs all at once and you are encountering errors with Import, you can try the following process to minimize any unresolved references:

  1. Remove the rls_rules and joins_with sections from the TML document on the initial import of each table TML. Save the GUID of each newly created object where it can be referenced to the original GUID.

  2. Check that all table objects have been created successfully on the ThoughtSpot server.

  3. Do a second import of the TML documents with the rls_rules and joins_with sections. To update the new objects rather than the original objects, specify the GUIDs of the newly created table objects on the server in the TML documents' guid property.

WorksheetsπŸ”—

Worksheets combine several tables, including Views, into a coherent data model optimized for searches. The TML syntax for worksheets defines all aspects of the worksheet, including the tables it joins together, the columns and their properties, filters, and so on.

If you want to change the values for an existing worksheet object, the tables, joins and table_paths sections are the most important.

The tables section is a list of table objects that exist on the ThoughtSpot Server. The name property is all that is included in an exported TML file, and this matches the name property of the table object. If there is more than one table object on the server with identical name properties, you must use the fqn property to specify the GUID of the particular table you want. However, the string value of name is used in the joins section, so the correct process for adding an fqn property is as follows:

So you go from

  tables:
  - name: <table_name_1>

to

  tables:
  - id : <table_name_1>
    fqn : <GUID_of_table_name>

The name property, which is now transformed into the id property, is used in the joins and table_paths sections that follow. Under joins, the source and destination properties take the string id property of a table in the tables list. In a TML file exported from ThoughtSpot, you won’t have to make any changes, because this value will already be set to what was defined in the name property, and we’ve maintained that value by switching it to the id property.

Under table_paths, the table property also uses the values we moved to id. The list of join names under join_path will need to match the text value of the name element of an item in the joins. This should be valid as exported and not require any changes, but if you do change the name value of a join, you will need to update the value in the join_path list in table_paths.

  joins:
  - name: <join_name_1>
    source: <source_table_name>
    destination: <destination_table_name>
    type: [RIGHT_OUTER | LEFT_OUTER | INNER | OUTER]
    on: <join_expression_string>
    is_one_to_one: [ false | true ]
  - ...
  table_paths:
  - id: <table_path_name_1>
    table: <table_name_1>
    join_path:
    - join:
      - <join_name_1>
      - <join_name_n>

AnswersπŸ”—

The answer TML syntax defines all aspects of a saved search and how it is visualized. The tables property is used to point to ThoughtSpot table, view, or worksheet objects, whichever the answer is connected to.

As mentioned above, it is simpler to connect an answer to a single worksheet, so that you only have to update one reference in the tables section.

guid: <answer_guid>
answer:
  name: <answer_name>
  description:
    This is a multi-line description of the answer
    Description line 2
  tables:
  - id: <table_id>
    name: <table_name_1>
    fqn : <GUID_of_table_name>
 ...

LiveboardsπŸ”—

Liveboards include many different visualizations and define a layout of the visualizations elements.

It is best practice to only use one worksheet for all visualizations on a Liveboard, however each visualization on a Liveboard can connect to different data objects.

Visualizations on a LiveboardπŸ”—

The individual elements on a Liveboard are referred to as visualizations and are defined in the visualizations section.

A visualization on a Liveboard is not linked in any way to any answer object outside that Liveboard.

When you pin an answer to a Liveboard, the full definition of the answer is copied into the Liveboard definition at that time. However, the capabilities and definitions for answers stored separately or on a Liveboard are identical.

Within the TML, the visualizations section uses the exact same syntax as a separate answer stored on its own, with the guid property of an individual answer replaced by an id property. The value for the id property is typically the word Viz and a number, Viz 1, Viz 10.

Note

If your instance is running 8.9.0.cl, do not create visualizations on a single Liveboard that connect to different worksheets with the same name (this is possible, as worksheet names are not unique) if you wish to do programmatic transformations, because it is impossible to add in the FQN properties based on just name if names are not unique.

Modifying Liveboard TMLπŸ”—

liveboard:
  name: <Liveboard_name>
  description:
    This is a multi-line description of the Liveboard
    Description line 2
  visualizations:
  - id: <viz_id_1>
    answer:
 ...

The layout section is an ordered list with a size property for each visualization tile. If a visualization is added programmatically, it needs a matching entry in the layout section to determine how it appears within the Liveboard. Adjustments to the order within the Liveboard can be made by reordering the layout list. The visualization_id property must match to an existing id value in the visualizations list.

  layout:
    tiles:
    - visualization_id: <visualization_id_1>
      size: <viz_id_1_size>
    - visualization_id: <visualization_id_2>

ViewsπŸ”—

Views transform a saved search into a data source, allowing for analysis that would require complex sub-queries in SQL. Views can be joined with other data objects in a worksheet. The best practice is to make views available to answers and Liveboards through a worksheet.

The TML syntax for views is similar to that of worksheets, in that it defines links to table objects on the ThoughtSpot server and join overrides using the joins and table_paths sections. What truly distinguishes a view is the search_query element, which contains a string using the ThoughtSpot search syntax.

guid: <view_guid>
view:
  name: <view_name>
  description:
    This is a multi-line description of the View.
    Description line 2
  tables:
  - name: <table_name_1>
    id : <optional_table_id>
    fqn : <GUID_of_table_name>
  - name: <table_name_2>
  - name: <table_name_n>
  joins:
  - name: <join_name_1>
    source: <source_table_name>
    destination: <destination_table_name>
    type: [RIGHT_OUTER | LEFT_OUTER | INNER | OUTER]
    on: <join_expression_string>
    is_one_to_one: [ false | true ]
  table_paths:
  - id: <table_path_name_1>
    table: <table_name_1>
    join_path:
    - join:
      - <join_name_1>
      - <join_name_n>
 ...
  search_query: <query_string>
 ...

SQL ViewsπŸ”—

A SQL View defines an underlying query directly to a connection.

The connection property can be changed similar to a table object’s TML either using the fqn property, while the sql_query property can be changed to a different valid text string.

guid: <sql_view_guid>
sql_view:
  name: <view_name>
  description:
    This is a multi-line description of the View.
    Description line 2
  connection:
    name: <connection_name>
    fqn: <connection_GUID>
  sql_query: <sql_query_string>