Natural language search best practices

ThoughtSpot Sage lets customers ask business questions in natural language and get AI-generated answers on the search results page. This guide provides tips for getting the most accurate results from natural language search. For more information about the feature, see Natural language search.

Early Access

Natural language search is an Early Access feature and is disabled by default. To enable it, contact your administrator. Once enabled, your ThoughtSpot administrator can grant you access to this feature by assigning the Can preview ThoughtSpot Sage user privilege to your user group.

Worksheet best practices

ThoughtSpot’s natural language capability generally works best with a well-structured Worksheet whose columns are clearly and unambiguously named. There are no restrictions on the number of columns or on joins in the underlying data. However, we recommend the following best practices when creating Worksheets to get the most accurate natural language search results:

  • Column names

    • Avoid similar column names where possible. Similar names don’t necessarily reduce accuracy if the columns are already well used in Search Data, saved Answers, or Liveboards. This usage data helps Sage disambiguate between similar columns.

    • Sage works best with column names that use underscores or spaces between words.

    • Use names that are easy to understand. Avoid abbreviations and terms that are specific to a business unit or organization.

  • Synonyms

  • Column Values

    • Store column values as flattened values or single items rather than as JSON.

  • Date Columns

    • Keep the number of date columns small, because many keywords, such as growth or percentage change, depend on a date column. For new use cases, it can be difficult for the system to pick the right date column.

  • Indexing

    • For a new use case that has little usage on ThoughtSpot so far, we advise using index priority. Adjusting the index priority for your most popular columns helps ThoughtSpot prioritize those columns when generating Answers.

    • Enabling value indexing improves value accuracy.
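
The column name and column value guidance above can be applied while preparing data, before it reaches a Worksheet. The following is a minimal Python sketch, not a ThoughtSpot API: the helper names `readable_column_name` and `flatten_json_value` and the sample column names are hypothetical, and the sketch assumes the JSON values are objects.

```python
import json
import re


def readable_column_name(name: str) -> str:
    """Separate the words in a column name with single spaces."""
    # Insert a space at each lower-to-upper boundary: "orderDate" -> "order Date".
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    # Treat hyphens, dots, and underscores as word separators.
    spaced = re.sub(r"[-._]+", " ", spaced)
    # Collapse repeated whitespace.
    return " ".join(spaced.split())


def flatten_json_value(prefix: str, raw: str) -> dict:
    """Turn one JSON object string into flat {column name: single value} pairs."""
    flat = {}

    def walk(key, value):
        if isinstance(value, dict):
            # Descend into nested objects, extending the column name.
            for child_key, child_value in value.items():
                walk(f"{key}_{child_key}", child_value)
        else:
            # Leaf value (lists are kept as-is in this sketch).
            flat[key] = value

    walk(prefix, json.loads(raw))
    return flat


print(readable_column_name("orderDate"))
print(flatten_json_value("address", '{"city": "Austin", "geo": {"lat": 30.27}}'))
```

Each key of the flattened result would become its own column holding a single value, which is the shape the Column Values guidance recommends.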

Search best practices

Follow these best practices when searching using natural language for the most accurate Answers:

  • ThoughtSpot doesn’t support personal pronouns yet, so avoid using them, for example "I," "my," or "mine." GPT can’t resolve them from the data we send it. Instead, use identifiers that exist in your data sources, such as names, email addresses, and so on.

  • "Why" questions are not supported yet, so avoid asking questions that require reasoning or text-based answers. For example, avoid asking "Why did my sales go down in Q2?"

  • Our system is not equipped to answer descriptive questions about the data sources. For example, don’t ask "How many date columns does the Worksheet have?"

  • If you don’t get an AI-generated Answer, try making the question more specific. If you know them, use the actual column names or synonyms as entered in the Worksheet.

  • Feedback helps the system learn, so we strongly encourage you to provide it.

  • As a general guideline, ask questions that could be answered using Search Data. If an Answer can be expressed in search tokens, it can be answered in natural language.
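
If questions are composed or reviewed outside ThoughtSpot, two of the practices above (no personal pronouns, no "why" questions) can be checked before a question is submitted. The following is a rough Python sketch under that assumption. The `check_question` helper and its warning strings are hypothetical and not part of ThoughtSpot, and the pronoun list contains only the examples named above.

```python
import re

# Only the pronouns named in the guidance above; extend as needed.
PERSONAL_PRONOUNS = {"i", "my", "mine"}


def check_question(question: str) -> list:
    """Return warnings for phrasing that natural language search doesn't support yet."""
    warnings = []
    # Lowercase and split into alphabetic words so "My" and "my" match alike.
    words = re.findall(r"[a-z']+", question.lower())
    found = sorted(set(words) & PERSONAL_PRONOUNS)
    if found:
        warnings.append("personal pronoun: " + ", ".join(found))
    # Flag questions that open with "why", which ask for reasoning.
    if words and words[0] == "why":
        warnings.append("'why' question")
    return warnings


print(check_question("Why did my sales go down in Q2?"))
print(check_question("Total sales by region for Q2"))
```

The first question triggers both warnings, and the second triggers none. A check like this only catches the surface patterns. It does not tell you whether a question can be expressed in search tokens.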