filters and depends on data sources.
The Tableau performance recorder
The hardest part about optimising in Tableau is knowing where to start. This is where the performance recorder can be used to break down what goes on under the hood when Tableau is updating a visualisation.
It can be found under: Help -> Settings and Performance -> Start Performance Recording
Clicking this starts the recording. While it is turned on, the performance recorder captures important information about what Tableau does as you interact with workbooks and dashboards. The recording can be stopped in the same menu where it was started, after which Tableau compiles a report giving both an overview and a detailed breakdown of the key events that took place while recording.
1. Use an extract
Extracting data and storing it in memory is likely to be optimal unless you need real-time / zero latency data and have the associated infrastructure to support that. For the purposes of this guide, we have assumed that your use case allows for an extract to be used.
If you need to have a live database connection, then avoid RAWSQL and custom SQL. Instead, use a sample extract to build the dashboards before changing to the live connection. If you don’t have an analytics database but need the live connection, then most cloud providers have turnkey solutions.
2. Limit and optimise data
The background for this tip is fairly straightforward: The more data Tableau has to go through, the longer it takes. There are however a few ways of minimising the amount of data used in Tableau.
- Use extract and data source filters
Most datasets contain a certain amount of data that is completely irrelevant for the analysis. This can be data older than what is in scope or sales data for products not in your department. Having this data in the extract will only slow down queries as Tableau has to look through irrelevant data every time something needs to be computed. This is where the extract and data source filters are useful. These filters make it possible to remove data from the final dataset, and in that way, limit the data Tableau has to query.
Make sure the data is only at the required level of granularity. If your data is unnecessarily granular and you are ultimately going to be using a summary level, rather than the detail, you will be slowing down performance with the extra data being processed.
Performance will improve, far more noticeably for larger datasets, if you aggregate the data to the level required for the analysis or dashboard. An example of bad practice here would be importing transaction-level data to build monthly sales reports.
Clearly, in instances like this, it is important to make sure you have carried out an effective requirements analysis with your stakeholders so that you are clear on the level of granularity that is going to be needed for your project (amongst other things).
Creating an aggregate can be done either in a database view or by using the built-in aggregation function in Tableau – to do this, visit the extract settings menu.
- Hide unused fields
The number of columns in a dataset has a larger impact on performance than the number of rows. Therefore, removing unused columns can significantly speed up performance – especially if the dataset is very wide but only a few columns are needed. You might have found that most data sources contain more fields than what is actually required when building a Tableau dashboard. If retained, these unnecessary fields will take up space and be a part of the extract, adding to the amount of data Tableau will have to go through each time a query is performed.
This is an easy thing to fix without having to resort to building new SQL views or creating custom SQL queries. Simply click “Hide All Unused Fields” in the dropdown menu in the data pane. This keeps the fields but excludes them from being a part of the extract that is used by Tableau – doing this has an impact on both query speed and workbook size.
Any hidden fields can later be shown by clicking “Show Hidden Fields” in the same dropdown.
- Use a context filter
A new, temporary dataset is created when context filters are applied. This happens before other filters are applied or calculations are performed. Applying a context filter can therefore have a great impact on performance, as it limits the amount of data that subsequent queries need to touch.
Knowing this, it can be tempting to apply context filters to everything. However, this is not good practice as it takes time to compute and create a new dataset. While context filters do have other uses, they should only be used for performance improvements if they can reduce the amount of data to at least one tenth of the original size, as the added computation time will outweigh the performance increase in other situations.
- Use the correct data types
Incorrect data types can take multiple forms: numbers stored as strings, dates stored as datetimes, or a field that is essentially a Boolean value stored as a string.
Using incorrect data types can have an impact on performance as some comparisons are quicker than others. Typically, this is as follows:
Boolean > Int > Float > String > Date > Datetime
A quick solution here is to make sure that numbers are either integers or floats, that dates are dates, and that datetime is only used in cases when the time is also needed. Lastly, some fields can also be replaced with other data types, e.g. a field with yes/no can be made into a Boolean value.
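As a sketch of the last point, assuming a hypothetical string field [Returned] that only ever holds "Yes"/"No" values, it can be replaced with a Boolean via a calculated field like:

```
// [Returned] is a hypothetical Yes/No string field.
// This expression returns a Boolean, which is faster to
// compare than a string.
[Returned] = "Yes"
```

Filters and calculations can then test the Boolean field directly instead of repeatedly comparing strings.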
3. Optimise Filters
- Reduce the number of filters
Only relevant and necessary filters should be used in a report. Each filter added is more work for Tableau, so irrelevant or redundant filters will only increase the time spent on querying the underlying dataset.
- Use the correct filters
Include filters are quicker than exclude filters. When Tableau encounters an exclude filter, it will have to load in the entire domain of the filtered field before starting the actual filtering. This will of course increase the time required to filter the data.
Instead, create a new calculated field returning True/False and filter on that. If the filter needs to be user-changeable, use a parameter to drive the True/False logic in the calculated field.
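As a minimal sketch, assuming a hypothetical [Region] field where only two regions are in scope, the calculated field could look like:

```
// Hypothetical example: returns True only for the regions
// in scope. Filter on this field set to True instead of
// excluding the other values.
[Region] = "EMEA" OR [Region] = "APAC"

// If the selection should be user-changeable, drive it with
// a hypothetical parameter instead:
// [Region] = [Region Parameter]
```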
- Include all filter values
Avoid using the “include only relevant values” option on filters unless necessary. While it can make the filter list nicer and more dynamic, it requires extra computational power as the list has to be recomputed every time a filter is changed.
4. Optimise calculations
- Use the correct statements
When testing for multiple conditions in a calculated field, using ELSEIF instead of ELSE IF will keep the evaluation in the same statement instead of moving it to a new nested statement. Nested statements will come with a performance cost and are easily avoidable.
It is also good to make sure that the tests come in the correct order, starting with the most likely outcome. If 90% of the dataset is case A and only 10% case B, then test for case A before case B.
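The two points above can be sketched with a hypothetical [Segment] field (field name assumed, not from the original):

```
// Nested version: ELSE IF opens a new statement,
// so a second END is needed.
IF [Segment] = "Consumer" THEN "A"
ELSE
    IF [Segment] = "Corporate" THEN "B"
    ELSE "C"
    END
END

// Flat version: ELSEIF stays in one statement, with the
// most common case tested first.
IF [Segment] = "Consumer" THEN "A"
ELSEIF [Segment] = "Corporate" THEN "B"
ELSE "C"
END
```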
- Limit the use of nested calculations
Don’t reference the same calculated field in another calculation more than necessary. Nesting calculations like this adds unnecessary complexity, because a new query is run every time a calculated field is referenced – i.e., if a calculated field is referenced three times in the same calculation, then running that calculation requires the calculated field to be computed three times. Instead, find another way of achieving the same outcome with fewer references to the field. This can include changing multiple IF statements referencing the same field to a CASE statement, or changing the calculation method used.
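As a hedged sketch, assuming a hypothetical calculated field [Ship Mode Group], an IF chain that references it three times can often be rewritten as a CASE that references it once:

```
// Three references to the same calculated field:
IF [Ship Mode Group] = "Air" THEN 1
ELSEIF [Ship Mode Group] = "Road" THEN 2
ELSEIF [Ship Mode Group] = "Sea" THEN 3
END

// One reference with CASE:
CASE [Ship Mode Group]
WHEN "Air" THEN 1
WHEN "Road" THEN 2
WHEN "Sea" THEN 3
END
```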
- Avoid COUNTD
Distinct counting (COUNTD) is one of the slowest aggregations in Tableau. It should be avoided where possible, as it has to be recomputed every time something changes – often at great computational cost. Distinct counts are, however, often important to the analysis, so avoiding them entirely isn't always feasible. There are nevertheless several alternatives to COUNTD that may be worth using instead.
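One such alternative, assuming the extract can be pre-aggregated to one row per entity (here a hypothetical [Customer ID] field), is a plain COUNT:

```
// If the extract holds one row per customer, an ordinary
// COUNT gives the same result as the much slower
// distinct count:
COUNT([Customer ID])   // instead of COUNTD([Customer ID])
```

Whether this is viable depends on the granularity requirements identified earlier; if other measures need row-level detail, pre-aggregating the extract may not be an option.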
5. Optimise visualisations
Try reducing the number of marks in your visualisations. While it can be tempting to have highly detailed visualisations containing multiple different graphs of a highly granular level, it is not advisable. Adding more marks means that Tableau will need more processing power to render the visualisation. By limiting how much is shown by filtering, aggregating, or splitting the visualisation into multiple worksheets, the rendering time can be improved.
One final thing that can be done to improve rendering times is to limit the number of worksheets in a dashboard. Each worksheet has to run its own queries, so the more worksheets in a single dashboard, the more queries Tableau has to run. Instead, spread the worksheets out across multiple dashboards if possible. This will both speed up the dashboard and mitigate the information overload that comes with dashboards being crammed full of highly detailed graphs.