Git integration and version control

When embedding or deploying a third-party application in their environments, most organizations follow defined practices at each stage of their software development life cycle (SDLC). Developers typically use a version control system and a CI/CD pipeline to push their code from development to testing and production environments. Similarly, when deploying ThoughtSpot, you may want to publish your ThoughtSpot content from a development environment to a staging or production cluster.

ThoughtSpot objects such as Worksheets, Liveboards, and Answers are stored as ThoughtSpot Modeling Language (TML) content. Users can download these TML files, edit these files locally, and import the updated content into ThoughtSpot. TML files are also useful when migrating content from one ThoughtSpot instance to another.

With the Git integration feature, you can connect your deployment instance to a Git repository, push TML files to CI/CD pipelines, and deploy commits from your Git repository to your production environment.

Note

ThoughtSpot currently supports GitHub / GitHub Enterprise for CI/CD.

Git integration overview

The Git integration feature supports the following capabilities:

  • ThoughtSpot integration with Git and CI/CD workflows
    Ability to connect your ThoughtSpot instance to a Git repository and deploy commits via REST API.

  • Ability to version control ThoughtSpot content
    Ability to build or modify your content locally on a development instance and push commits to a remote Git branch via APIs and version your updates.

Supported deployment scenarios

The Git integration supports the following deployment scenarios:

  • Move content from a ThoughtSpot development instance to a production instance.

  • Deploy multiple environments on the same ThoughtSpot instance using the Orgs feature. For example, you can create separate Orgs for Dev, Staging, and Prod environments. The content built from the Dev Org can be deployed on Staging and Prod Orgs using REST API v2.0 version control endpoints.

Note

ThoughtSpot's Git integration does not support moving objects within the same Org or application instance. For example, it does not support moving objects in an environment where multi-tenancy is implemented using groups.

How it works

The ThoughtSpot content deployment process with version control APIs and Git integration includes the following steps:

  1. Enable Git integration on ThoughtSpot.

  2. Connect your ThoughtSpot environment to the Git repository.
    You can connect your ThoughtSpot development and production environments to the dev and production branches on your Git repository. The general practice is to use the main branch in the Git repository as a production branch to publish content.

  3. Push changes to the Git branch mapped to your ThoughtSpot environment.

  4. Validate merge before deploying changes to the destination environment.

  5. Deploy commits to your production environment and publish your changes.

The following figure illustrates a simple Git integration workflow with ThoughtSpot Dev and Prod environments.

Git integration workflow

Lifecycle management via Git APIs

ThoughtSpot recommends the following lifecycle management flow:

  • Implement changes in a ThoughtSpot development environment, and then commit these changes to a Git development branch.

  • Merge the Git development branch into one deployment branch.

  • Deploy changes from the Git deployment branch into the ThoughtSpot production/staging environment to update your target environment.

  • Use one repository per ThoughtSpot version control project. Your ThoughtSpot development, staging, and production environments should all be using the same Git repository. This will make it easier to move objects from dev to prod (via merging branches).

  • Use one commit branch per environment. This is where the ThoughtSpot code will get committed. Do not commit content from different ThoughtSpot environments into the same branch. Each environment uses different unique identifiers (GUIDs) to identify files. Using the same branch to store files from multiple ThoughtSpot environments will result in corrupt branches, errors, and merge conflicts when deploying content to a ThoughtSpot production environment.

    As a best practice, use the commit API to submit TML changes to Git. This ensures that deleted and renamed files are properly synchronized.

  • Use a dedicated branch for version history. As described earlier, a given object's unique identifier will be different between its development and production versions. If you wish to implement version history in a production environment, use a dedicated branch for version history. Do not use a branch that is already used to manage or deploy development objects.

  • Use a dedicated branch for all Git configuration files. Dedicate some branches such as dev and main for ThoughtSpot content and store all Git configuration files created by ThoughtSpot in a separate branch. This will make it much easier to compare ThoughtSpot content across branches.

  • Validate the changes before merging or deploying, to ensure the TML content in target environments can import changes without conflicts.

    The following figure illustrates the lifecycle management with Git and best practices for commit and deploy workflows:

Git integration workflow

Note

ThoughtSpot does not recommend committing changes to Git directly and deploying these changes back in a ThoughtSpot development environment.
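
The "one commit branch per environment" practice listed earlier in this section can be checked mechanically before any environment is connected. The following Python sketch is illustrative only: the environment labels, the branch names, and the function name find_shared_commit_branches are assumptions made for this example and are not part of ThoughtSpot or its API.

```python
from collections import defaultdict


def find_shared_commit_branches(commit_branches):
    """Return {branch: [environments]} for branches used by more than one environment.

    commit_branches maps a (hypothetical) environment label to the commit
    branch planned for that environment.
    """
    by_branch = defaultdict(list)
    for environment, branch in commit_branches.items():
        by_branch[branch].append(environment)
    # Keep only the branches that violate the one-branch-per-environment rule.
    return {
        branch: sorted(environments)
        for branch, environments in by_branch.items()
        if len(environments) > 1
    }


# Hypothetical layout: staging was mistakenly pointed at the dev branch.
layout = {"dev": "dev", "staging": "dev", "prod": "main"}
print(find_shared_commit_branches(layout))  # {'dev': ['dev', 'staging']}
```

An empty result means that every environment in the planned layout has its own commit branch.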

Get started

Before you begin, check if your Git integration setup meets the following prerequisites:

  • To commit objects from ThoughtSpot to a Git repository, you require at least view permission for all objects that will be committed as part of the operation.

  • To deploy or revert objects from a Git repository to ThoughtSpot, you require edit access to all objects that will be updated as part of the deployment. If the deployment contains Worksheets, Views, or Tables, you also require the Can manage data (DATAMANAGEMENT) privilege for deploy, commit, and revert operations.

  • You have a GitHub or GitHub Enterprise account and access to a repository. Ensure that your account has one of the following types of access tokens:

    • Personal access token (Classic)
      Make sure the access token has the repo scope that grants full access to public and private repositories, commit and deployment status, repository invitations, and security events.

    • Fine-grained personal access token
      Make sure the token allows Read access to metadata and Read and Write access to code and commit statuses.

  • The branch used as a configuration branch does not have any branch protection rule.

  • The branches in the Git repository are set up as described in Recommended configuration.

Enable Git integration

To configure Git branches and workflows, the Git integration feature must be enabled on your ThoughtSpot Dev and Prod environments. To enable this feature on your instance, contact ThoughtSpot Support.

Connect your ThoughtSpot environment to the Git repository

To connect your ThoughtSpot instance to a Git repository using REST API, send a POST request with the following parameters to the /api/rest/2.0/vcs/git/config/create REST API v2.0 endpoint.

Request parameters

  • repository_url
    String. The HTTPS URL of the Git repository; for example, https://github.com/user/repo.git.

  • username
    String. Username to authenticate to the Git repository.

  • access_token
    String. Access token to authenticate to the Git repository.

  • org_identifier
    String. ID of the Org. Define this parameter only if the Orgs feature is enabled on your ThoughtSpot cluster and separate Orgs are configured for development and production environments.

  • branch_names
    Array of strings. List of Git branches to configure.

  • commit_branch_name
    String. Name of the remote branch where objects committed from this ThoughtSpot instance will be versioned. Replaces default_branch_name, which is deprecated in 9.10.5.cl.

  • default_branch_name (Optional)
    String. Deprecated in 9.10.5.cl. In earlier versions, this parameter was used to configure the name of the default Git branch to use for all operations on the cluster.

  • enable_guid_mapping
    Boolean. Enables GUID mapping and generates a GUID mapping file. Starting from 9.7.0.cl, this attribute is set to true by default. To know more about GUID mapping, see GUID mapping.

  • configuration_branch_name
    String. Name of the branch where the configuration files related to operations between ThoughtSpot and the version control repository are maintained. Replaces guid_mapping_branch_name, which is deprecated in 9.10.5.cl.
    Note: If no branch name is specified, the ts_config_files branch is used by default. Ensure that this branch exists before configuration.

  • guid_mapping_branch_name (Optional)
    String. Deprecated in 9.10.5.cl. In earlier versions, this parameter was used to configure the name of the branch for the GUID mapping file.

Request example

The following example shows the API request format for connecting ThoughtSpot to a GitHub repository.

curl -X POST \
  --url 'https://{ThoughtSpot-Host-Dev}/api/rest/2.0/vcs/git/config/create' \
  -H 'Authorization: Bearer {Bearer_token}' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  --data-raw '{
  "repository_url": "https://github.com/user/repo.git",
  "username": "ts-git-user",
  "access_token": "{ACCESS_TOKEN}",
  "org_identifier": "dev",
  "branch_names": [
    "dev",
    "main"
  ],
  "commit_branch_name": "dev",
  "configuration_branch_name": "_ts_config"
}'

If the API request is successful, the ThoughtSpot instance will be connected to the Git repository. Make sure you connect all your environments (Dev, Staging, and Prod) to the GitHub repository.
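
For teams that script this step, the curl request above can be approximated in Python. The sketch below is a minimal illustration that uses only the standard library and builds the request without sending it. The function name build_git_config_request, the host name, and the token values are placeholders assumed for this example; the endpoint path and parameter names are taken from the request parameters described above.

```python
import json
import urllib.request


def build_git_config_request(host, bearer_token, repository_url, username,
                             access_token, branch_names, commit_branch_name,
                             configuration_branch_name, org_identifier=None,
                             enable_guid_mapping=True):
    """Build (but do not send) the POST request for vcs/git/config/create."""
    payload = {
        "repository_url": repository_url,
        "username": username,
        "access_token": access_token,
        "branch_names": list(branch_names),
        "commit_branch_name": commit_branch_name,
        "configuration_branch_name": configuration_branch_name,
        "enable_guid_mapping": enable_guid_mapping,
    }
    # org_identifier is only meaningful when the Orgs feature is enabled.
    if org_identifier is not None:
        payload["org_identifier"] = org_identifier
    return urllib.request.Request(
        url=f"https://{host}/api/rest/2.0/vcs/git/config/create",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Accept": "application/json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values only; replace them with your own host and credentials.
request = build_git_config_request(
    host="thoughtspot-dev.example.com",
    bearer_token="BEARER_TOKEN",
    repository_url="https://github.com/user/repo.git",
    username="ts-git-user",
    access_token="ACCESS_TOKEN",
    branch_names=["dev", "main"],
    commit_branch_name="dev",
    configuration_branch_name="_ts_config",
    org_identifier="dev",
)
print(request.get_method(), request.full_url)
print(json.loads(request.data))
```

Passing the returned object to urllib.request.urlopen would send the request. Building the request separately from sending it makes it easy to inspect the JSON body first.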

The following example shows the API request parameters to connect a ThoughtSpot Prod instance to the Git repo. Note that GUID mapping is enabled in the API request.

curl -X POST \
  --url 'https://{ThoughtSpot-Host-Prod}/api/rest/2.0/vcs/git/config/create' \
  -H 'Authorization: Bearer {Bearer_token}' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  --data-raw '{
  "repository_url": "https://github.com/user/repo.git",
  "username": "ts-git-user",
  "access_token": "{ACCESS_TOKEN}",
  "org_identifier": "prod",
  "branch_names": [
    "prod"
  ],
  "enable_guid_mapping": true,
  "commit_branch_name": "prod",
  "configuration_branch_name": "_ts_config"
}'

GUID mapping and configuration files

ThoughtSpot maintains a set of configuration files to facilitate the CI/CD process for developers. Typically, it includes:

  • One mapping file per production environment
    This file documents the GUID mapping for ThoughtSpot development objects from the source cluster, and their equivalent objects in the production environment to which commits are deployed.

  • One deploy file per production environment
    This file tracks the commit_id of the last successful deploy operation.

GUID mapping

The version control API automatically generates a GUID mapping file when deploying commits and saves this file in a Git branch. The mapping file records the GUIDs for each TML object as shown in this example:

[
   {
      "originalGuid":"7485d3b6-4b4e-41a2-86be-e031d1322cc9",
      "mappedGuid":"3eeec11e-fbf7-40dc-a549-2f465f640778",
      "counter":0
   }
]

  • originalGuid refers to the GUID of the object on the source environment, for example, a Dev cluster.

  • mappedGuid refers to the GUID of the object on the destination environment, for example, staging or prod cluster.

  • counter shows the number of times the mapped object was used in deploy operations.

If GUID mapping is enabled, ThoughtSpot uses the GUID mapping file to map the object GUIDs and automatically updates the object references in your TML content.
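
To make the idea of GUID substitution concrete, the following Python sketch rewrites references in a piece of TML text using the mapping entries shown earlier. It is a simplified illustration, not ThoughtSpot's implementation: the function name apply_guid_mapping and the sample TML fragment are hypothetical, and plain string replacement is an assumption made for brevity.

```python
import json


def apply_guid_mapping(tml_text, mapping_entries):
    """Replace every originalGuid found in tml_text with its mappedGuid."""
    for entry in mapping_entries:
        tml_text = tml_text.replace(entry["originalGuid"], entry["mappedGuid"])
    return tml_text


# Mapping entries in the format of the mapping file example.
mapping_file = json.loads("""
[
  {
    "originalGuid": "7485d3b6-4b4e-41a2-86be-e031d1322cc9",
    "mappedGuid": "3eeec11e-fbf7-40dc-a549-2f465f640778",
    "counter": 0
  }
]
""")

# Hypothetical TML fragment that references a development object by GUID.
dev_tml = "guid: 7485d3b6-4b4e-41a2-86be-e031d1322cc9\nname: Sales overview\n"
print(apply_guid_mapping(dev_tml, mapping_file))
```

GUIDs that have no entry in the mapping are left unchanged in this sketch.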

The following figure illustrates how GUIDs are mapped during deployments:

GUID mapping

  • To update the repository details or access token, send a POST request with Git configuration parameters to the /api/rest/2.0/vcs/git/config/update API endpoint.

  • To get repository configuration information, send a POST request to the /api/rest/2.0/vcs/git/config/search API endpoint.

  • To delete the repository configuration, send a POST request to the /api/rest/2.0/vcs/git/config/delete endpoint.

For more information about these endpoints, see the API documentation in the REST API v2.0 Playground.

Commit files

ThoughtSpot users with the data management (Can manage data) privilege can download TML files to their local environment, edit them, and import them into ThoughtSpot via the UI or REST API. With Git integration, users can also push commits from a ThoughtSpot instance to a Git branch by sending a POST request to the /api/rest/2.0/vcs/git/branches/commit API endpoint.

Request parameters

  • metadata
    Array of Strings. Specify the type and GUID of the metadata object.

  • delete_aware
    Boolean. When delete_aware is true, upon committing files, a check is run between the files in the Git branch and the objects in the ThoughtSpot environment. If an object exists in the Git branch, but not in the ThoughtSpot instance or Org, the object will be deleted from the Git branch. The delete_aware parameter is enabled by default.
    Note: The delete_aware property requires you to associate one ThoughtSpot environment or Org to one commit branch in Git. Associating multiple ThoughtSpot environments to the same Git commit branch will result in files getting unintentionally deleted across your environments during a commit operation.

  • branch_name (Optional)
    String. Name of the branch in the Git repository to which you want to push the commit. If you do not specify the branch name, the commit will be pushed to the commit_branch_name defined for the Git connection configuration.

  • comment
    String. Add a comment to the commit.

Request example

The following example shows the API request with Liveboard and Worksheet objects to commit to Git.

curl -X POST \
  --url 'https://{ThoughtSpot-Host}/api/rest/2.0/vcs/git/branches/commit' \
  -H 'Authorization: Bearer {Bearer_token}' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  --data-raw '{
  "metadata": [
    {
      "identifier": "e9d54c69-d2c1-446d-9529-544759427075",
      "type": "LIVEBOARD"
    },
    {
      "identifier": "cd252e5c-b552-49a8-821d-3eadaa049cca",
      "type": "LOGICAL_TABLE"
    }
  ],
  "delete_aware": true,
  "comment": "Add objects",
  "branch_name": "prod"
}'
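
When the list of objects to commit comes from a script, the JSON body shown above can be assembled programmatically. The following Python sketch is one possible way to do this. The function name build_commit_body and the (type, GUID) pair format are assumptions made for this example; the field names match the request example above.

```python
import json


def build_commit_body(objects, comment, branch_name=None, delete_aware=True):
    """Return the JSON body for vcs/git/branches/commit as a string.

    objects is an iterable of (type, guid) pairs, for example
    ("LIVEBOARD", "e9d54c69-d2c1-446d-9529-544759427075").
    """
    body = {
        "metadata": [
            {"identifier": guid, "type": object_type}
            for object_type, guid in objects
        ],
        "delete_aware": delete_aware,
        "comment": comment,
    }
    # When branch_name is omitted, the commit goes to the commit_branch_name
    # configured for the Git connection.
    if branch_name is not None:
        body["branch_name"] = branch_name
    return json.dumps(body, indent=2)


print(build_commit_body(
    objects=[
        ("LIVEBOARD", "e9d54c69-d2c1-446d-9529-544759427075"),
        ("LOGICAL_TABLE", "cd252e5c-b552-49a8-821d-3eadaa049cca"),
    ],
    comment="Add objects",
))
```

The returned string can be passed to curl with --data-raw or sent with any HTTP client.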

Results

During this operation, a check is performed to compare the objects in the Git branch with the objects in the ThoughtSpot environment.

  • If an object exists in the Git branch, but not in the ThoughtSpot instance or Org, the object will be deleted from the Git branch.

  • If the object does not exist in the Git branch, it will be added to the Git branch specified in the API request or commit_branch_name configured for the Git connection.

  • If the object exists on both the Git branch and ThoughtSpot cluster or Org and there are no changes detected in the commit, the API returns a warning message with a list of objects that were not updated as part of the commit.

The following figure illustrates the commit operation with the delete_aware property enabled:

Commit changes
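
The comparison rules listed above can be pictured as a small set computation. The Python sketch below is a simplified model, not the actual API logic: it assumes that both the Git branch and the ThoughtSpot environment can be represented as dictionaries that map an object GUID to its TML text, and the function name plan_commit is hypothetical.

```python
def plan_commit(git_files, thoughtspot_objects, delete_aware=True):
    """Classify objects for a commit.

    Both arguments map an object GUID to its TML text. Returns a dict of
    sorted GUID lists under 'added', 'updated', 'unchanged', and 'deleted'.
    """
    git_ids = set(git_files)
    ts_ids = set(thoughtspot_objects)
    in_both = git_ids & ts_ids
    unchanged = {guid for guid in in_both
                 if git_files[guid] == thoughtspot_objects[guid]}
    return {
        "added": sorted(ts_ids - git_ids),
        "updated": sorted(in_both - unchanged),
        # The API reports these objects in a warning message.
        "unchanged": sorted(unchanged),
        # Files with no matching object are removed only if delete_aware is on.
        "deleted": sorted(git_ids - ts_ids) if delete_aware else [],
    }


# Short hypothetical identifiers stand in for real GUIDs.
git_branch = {"a": "v1", "b": "v1", "c": "v1"}
environment = {"a": "v1", "b": "v2", "d": "v1"}
print(plan_commit(git_branch, environment))
```

In this model, object c exists only in the Git branch, so it is listed under deleted. This is the reason each commit branch should be associated with exactly one ThoughtSpot environment or Org.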

Search commits

ThoughtSpot provides a REST API endpoint to search commits for a given TML object. A POST call to the /api/rest/2.0/vcs/git/commits/search endpoint with metadata identifier and type in the request body fetches a list of commits.

Revert a commit

To undo the changes committed to a repository, revert to a previous commit and restore an earlier version of an object by sending a POST request to the /api/rest/2.0/vcs/git/commits/{commit_id}/revert API endpoint.

Request parameters

  • commit_id
    String. ID of the commit to which you want to revert.

  • metadata (Optional)
    Array of Strings. Specify the type and GUID of the metadata object. If a metadata object is not specified, the API request reverts all objects that were modified as part of the specified commit_id.

  • branch_name (Optional)
    String. Name of the branch to which the revert operation must be applied. If you do not specify the branch name, the API will revert the commit to the default branch configured on that ThoughtSpot instance.

  • revert_policy
    String. Action to apply when reverting a commit. The allowed values are:

    • ALL_OR_NONE (Default)
      Reverts all objects. If the revert operation fails for one of the objects provided in the commit, the API returns an error and does not revert any object.

    • PARTIAL
      Reverts partial objects. This option reverts the subset of ThoughtSpot objects that validate successfully even if the other objects in the commit fail to import.

Request example

The following example shows the API request for reverting a commit.

curl -X POST \
  --url 'https://{ThoughtSpot-Host}/api/rest/2.0/vcs/git/commits/afc0fea831558e30d7064ab019f49243b1f09552/revert' \
  -H 'Authorization: Bearer {Bearer_token}' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  --data-raw '{
  "metadata": [
    {
      "identifier": "e9d54c69-d2c1-446d-9529-544759427075",
      "type": "LIVEBOARD"
    }
  ],
  "commit_id": "afc0fea831558e30d7064ab019f49243b1f09552",
  "branch_name": "dev"
}'

Results

If the API request is successful, the Git branch is reverted to the specified commit ID.
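
The difference between the two revert_policy values described in the request parameters can be modeled in a few lines of Python. This sketch only illustrates the documented behavior. The function name objects_to_revert, the object names, and the idea of a per-object pass or fail flag are assumptions made for this example. The real API returns an error under ALL_OR_NONE, whereas this model returns an empty list.

```python
def objects_to_revert(validation_results, revert_policy="ALL_OR_NONE"):
    """Return the object names that would be reverted under the given policy.

    validation_results maps an object name to True (validates) or False (fails).
    """
    passed = sorted(name for name, ok in validation_results.items() if ok)
    if revert_policy == "ALL_OR_NONE":
        # A single failure cancels the whole operation.
        return passed if len(passed) == len(validation_results) else []
    if revert_policy == "PARTIAL":
        # Objects that validate are reverted even if others fail.
        return passed
    raise ValueError(f"unsupported revert_policy: {revert_policy}")


# Hypothetical outcome: one object validates, the other does not.
results = {"Sales Liveboard": True, "Orders Worksheet": False}
print(objects_to_revert(results, "ALL_OR_NONE"))  # []
print(objects_to_revert(results, "PARTIAL"))      # ['Sales Liveboard']
```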

Validate merge

To merge updates, create a pull request to push changes from your dev branch to main. ThoughtSpot doesn't provide REST APIs to merge content from one branch to another. Before accepting the pull request in the Git repository, you can validate the merge on your ThoughtSpot instance using the REST API.

To validate the content of your dev branch against your prod environment, send a POST request from your prod instance to the /api/rest/2.0/vcs/git/branches/validate API endpoint.

Request parameters

  • source_branch_name
    String. Name of the source branch from which changes need to be picked for validation.

  • target_branch_name
    String. Name of the target branch into which the TML changes will be merged.

Request example

The following example shows the API request for validating a merge from the dev branch into the main branch.

curl -X POST \
  --url 'https://{ThoughtSpot-Host}/api/rest/2.0/vcs/git/branches/validate' \
  -H 'Authorization: Bearer {Bearer_token}' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  --data-raw '{
  "source_branch_name": "dev",
  "target_branch_name": "main"
}'

Results

After validating the merge, check for conflicts. Resolve any issues with a new commit, and then merge your changes into the main branch.

Deploy commits

To deploy commits to the Staging or Prod instance, send a POST request to the /api/rest/2.0/vcs/git/commits/deploy API endpoint. The API will deploy the head of the branch unless a commit_id is specified in the API request.

Building a release version for a Prod environment on the same instance requires swapping in the correct GUIDs. If you have enabled GUID mapping in the Git configuration on your deployment instance, the version control APIs will automatically generate a GUID mapping file and update object references when deploying your commits to the destination environment.

Note

Parallel deployment to multiple organizations within a single cluster is not supported. Developers must run deployments to each organization sequentially.
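
One way to respect this restriction in automation is to loop over the Orgs and finish each deployment before starting the next. The Python sketch below is a hedged illustration: the function name deploy_to_orgs and the deploy callable are hypothetical, and stopping at the first failure is a design choice made for this example rather than documented behavior.

```python
def deploy_to_orgs(org_identifiers, deploy):
    """Call deploy(org) for each Org in order, one at a time.

    deploy is any callable that returns True on success. The loop stops at
    the first failure and returns the Orgs that were deployed successfully.
    """
    completed = []
    for org in org_identifiers:
        if not deploy(org):
            break
        completed.append(org)
    return completed


# Stand-in for a function that would call the deploy endpoint for one Org.
def fake_deploy(org):
    print(f"deploying to {org}")
    return org != "prod"  # simulate a failure on the last Org


print(deploy_to_orgs(["staging", "uat", "prod"], fake_deploy))
```

In a real pipeline, the deploy callable would wrap the request to the deploy endpoint for a single Org and return only after the response is received.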

Request parameters

  • commit_id (Optional)
    String. ID of the commit to deploy on the cluster. By default, the command will deploy the head of the branch. To deploy a specific version, specify the commit_id.

  • branch_name
    String. Name of the branch from which the commit must be picked for deployment. If you do not specify the branch name, the commit from the default branch is deployed.

  • deploy_type
    String. Specify one of the following options:

    • DELTA (Default)
      Deploys only the changes that were applied at the specified commit_id. For example, if three TML files were updated in the commit_id specified in the API request, only those changes will be deployed.

    • FULL
      Deploys all the files in the Git branch, including the files from the commit_id specified in the request and all other files that were already committed.

  • deploy_policy
    String. Action to apply when deploying a commit. The allowed values are:

    • ALL_OR_NONE (Default)
      Deploys all changes or none. This option cancels the deployment of all ThoughtSpot objects if at least one of them fails to import.

    • PARTIAL
      Deploys partial objects. This option imports the subset of ThoughtSpot objects that validate successfully even if other objects in the same deploy operation fail to import.

    • VALIDATE_ONLY
      Runs validation to detect if your destination environment can import the changes without conflicts. Use this when the TML content is modified between source and destination environments and if you do not want the TML content in your destination branch to be modified after a pull request from your dev branch.

Request example

curl -X POST \
  --url 'https://{ThoughtSpot-Host}/api/rest/2.0/vcs/git/commits/deploy' \
  -H 'Authorization: Bearer {Bearer_token}' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  --data-raw '{
  "deploy_type": "DELTA",
  "deploy_policy": "ALL_OR_NONE",
  "commit_id": "afc0fea831558e30d7064ab019f49243b1f09552",
  "branch_name": "main"
}'

Results

If the API request is successful, the changes are applied to the objects in the prod environment. A tracking file is generated in the Git branch used for storing configuration files. This file includes the commit_id specified in the API request.

The subsequent API calls to deploy commits will consider the saved commit_id and deploy_type specified in the API request:

  • If deploy_type is set to DELTA, all the changes between the last tracked commit_id and the new commit_id specified in the API request will be deployed to the destination environment or Org.

  • If deploy_type is set to FULL, all the files from the commit_id specified in the API request will be deployed. If any object or file is deleted in the commit specified in the API request, it will be deleted from the destination environment during deployment.
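
The difference between DELTA and FULL can be pictured with a toy commit history. The Python sketch below is a simplified model under stated assumptions: the history is a list of (commit_id, changed_paths) pairs ordered from oldest to newest, deletions are not modeled, and the function name paths_to_deploy, the commit IDs, and the file names are hypothetical. When no commit has been tracked yet, the sketch treats DELTA as covering only the specified commit, which is one interpretation of the parameter description.

```python
def paths_to_deploy(history, commit_id, deploy_type="DELTA", last_deployed=None):
    """Return the sorted file paths that a deploy would include in this model.

    history: list of (commit_id, changed_paths) pairs, oldest first.
    last_deployed: commit_id stored in the tracking file, or None.
    """
    ids = [cid for cid, _ in history]
    end = ids.index(commit_id)
    if deploy_type == "FULL":
        start = 0  # every file committed up to and including commit_id
    elif deploy_type == "DELTA":
        # Changes after the last tracked commit, up to and including commit_id.
        start = end if last_deployed is None else ids.index(last_deployed) + 1
    else:
        raise ValueError(f"unsupported deploy_type: {deploy_type}")
    paths = set()
    for _, changed in history[start:end + 1]:
        paths.update(changed)
    return sorted(paths)


history = [
    ("c1", ["liveboard_a.tml", "worksheet_b.tml"]),
    ("c2", ["worksheet_b.tml"]),
    ("c3", ["answer_c.tml"]),
]
print(paths_to_deploy(history, "c3", "DELTA", last_deployed="c1"))
print(paths_to_deploy(history, "c3", "FULL"))
```

With c1 as the last tracked commit, the DELTA call covers the files changed in c2 and c3, while the FULL call covers every file committed to the branch up to c3.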