Tuesday, 16 June 2015

Tableau Performance Checklist


The List
The Tableau Performance Checklist is divided into seven main categories. You’ll find those categories, each with its associated best practices, in the master list below:

Data:

✓ Keep analysis simple: Work with a subset of your data. Extract a sample if needed.
✓ Bring in only the data needed for analysis: Consider adding a data source filter or using an extract. If using a join, minimize the number of joined tables.
✓ Use “Describe” to explore dimensions in new data sets without having to load them into a viz (keyboard shortcut Ctrl+E).
✓ Remove unused columns (measures/dimensions) in order to minimize extract refresh time or custom SQL query time.
✓ Create a published TDS file for your business team to use rather than having each analyst create their own data source. This includes all metadata associated with dimensions, measures, calculated fields, hierarchies, sets, parameters and naming conventions.
✓ Use extracts wherever possible to accelerate performance. Hide unused and confidential fields. Roll up data granularity by pre-aggregating or filtering. Break hierarchies to only visible dimensions.
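
To make "roll up data granularity by pre-aggregating" concrete, here is a minimal Python sketch of the idea, done outside Tableau before the data is extracted. The field names (region, category, order_id, sales) and the roll_up helper are hypothetical and only for illustration; inside Tableau you would typically get the same effect from the extract's own aggregation options.

```python
from collections import defaultdict

def roll_up(rows, keep_dims, measure):
    """Sum `measure` over every combination of the dimensions in `keep_dims`.

    Any column not listed (for example an order ID) is dropped, which is
    what reduces the row count and the width of the data.
    """
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in keep_dims)
        totals[key] += row[measure]
    return [dict(zip(keep_dims, key), **{measure: total})
            for key, total in totals.items()]

# Hypothetical order-level rows; the field names are illustrative only.
orders = [
    {"region": "East", "category": "Chairs", "order_id": 1, "sales": 120.0},
    {"region": "East", "category": "Chairs", "order_id": 2, "sales": 80.0},
    {"region": "West", "category": "Desks", "order_id": 3, "sales": 300.0},
]

# Three order-level rows collapse to two region/category rows.
summary = roll_up(orders, ["region", "category"], "sales")
```

The fewer dimensions you keep, the fewer rows the extract has to store and query, at the cost of no longer being able to drill down to the dropped level.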

Filtering:

✓ Minimize the number of quick filters: Use dashboard filter actions where possible.
✓ Avoid selecting “Only Relevant Values” for your quick filters. This option requires sequential queries, so use it only when it is really needed.
✓ Avoid high-cardinality quick filters (multi-select or drop-down lists). High-cardinality quick filters are slow to load and render.
✓ Avoid quick filters or actions that drive context filters. These require reloading the context table and should be avoided wherever possible.
✓ Keep range quick filters simple. The more complex the range, the slower the query.
✓ Replace quick filters showing “Only Relevant Values”, or a high count of quick filters, with dashboard filter actions. They will cascade as your user interacts, and they perform faster.
✓ Don’t be lazy with user filters. Security by user filters can impact performance on Tableau Server, as the server cannot share connections and query caches if user filters are active. Consider building a summary view that is a user-agnostic overview using a pre-aggregated extract with underlying data hidden. For a detailed view, restrict it to specific users or Active Directory groups instead of using user filters.
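
One way to spot high-cardinality quick filter candidates before building the dashboard is to count distinct values per field in a sample of the data. The sketch below is plain Python and not a Tableau feature; the field names and the threshold of 50 distinct values are assumptions chosen for illustration, so pick a limit that matches your own dashboards.

```python
def cardinality(rows, field):
    """Number of distinct values `field` takes across `rows`."""
    return len({row[field] for row in rows})

def high_cardinality_fields(rows, fields, threshold=50):
    """Fields whose distinct-value count exceeds `threshold` (an assumed limit)."""
    return [f for f in fields if cardinality(rows, f) > threshold]

# Hypothetical sample: 1,000 customers spread over three segments.
segments = ("Consumer", "Corporate", "Home Office")
rows = [{"customer_id": i, "segment": segments[i % 3]} for i in range(1000)]

# customer_id has 1,000 distinct values, segment only 3.
flagged = high_cardinality_fields(rows, ["customer_id", "segment"])
```

Fields that come back flagged are better served by a dashboard filter action or a wildcard-style filter than by a multi-select or drop-down list.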

Custom SQL:

✓ Limit custom SQL connections as they can be inefficient. Where possible, create a view on the database server to implement your custom SQL and connect Tableau to your view.
✓ Avoid parameters in custom SQL in Tableau. Tableau wraps the custom SQL in a subquery that many databases don’t handle well. Consider building a view in the database or using a multi-table join with filters.
✓ Watch for useless clauses, e.g. ORDER BY. Tableau is going to re-sort the data once loaded anyway.
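
As a rough illustration of moving custom SQL into a database view, the sketch below uses Python's built-in sqlite3 module with an in-memory database. SQLite is only a stand-in for your real database server, and the orders table and sales_by_region view are hypothetical names. The point is that the query logic lives in the view, so Tableau can connect to it like an ordinary table.

```python
import sqlite3

# In-memory SQLite database standing in for the real database server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, region TEXT, sales REAL);
    INSERT INTO orders VALUES
        (1, 'East', 120.0),
        (2, 'East', 80.0),
        (3, 'West', 300.0);

    -- The logic that would otherwise be pasted into Tableau as custom SQL.
    -- Note there is no ORDER BY: Tableau re-sorts the data after loading.
    CREATE VIEW sales_by_region AS
        SELECT region, SUM(sales) AS total_sales
        FROM orders
        GROUP BY region;
""")

# A client (Tableau, in practice) now reads the view like a plain table.
totals = dict(conn.execute(
    "SELECT region, total_sales FROM sales_by_region"
).fetchall())
conn.close()
```

Because the view is defined once on the server, every workbook that connects to it shares the same logic, and the database is free to optimize the query without an extra wrapping subquery around hand-written SQL.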

Calculations:

✓ Use calculated fields carefully. Think about the data type as you code the calculation.
✓ Mind the data type when it comes to performance: number and Boolean calculations are fastest, date calculations are slower, and string calculations are slowest.
✓ Limit blended calculations. They require sequentially querying multiple data sources and can be time consuming. Where possible, create a view on the database server.
✓ Avoid row-level calculations involving parameters.

  
Rendering:

✓ Avoid high mark counts. More marks = longer rendering time.
✓ Limit the use of detailed text tables with lots of marks.
✓ Minimize the file size of any images or custom shapes where possible. As a general rule of thumb, keep images under 50 KB.
✓ If using custom shapes, use transparent background PNGs instead of JPGs. Views will render cleaner, and shape files will take up less space.
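
The two image rules above are easy to check with a small script. This is a minimal sketch using only the Python standard library; the audit_images helper, the list of image extensions, and the reading of 50 KB as 50 * 1024 bytes are assumptions for illustration, not part of Tableau.

```python
from pathlib import Path

MAX_BYTES = 50 * 1024  # the 50 KB rule of thumb, read here as 51,200 bytes
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".bmp"}

def audit_images(folder, max_bytes=MAX_BYTES):
    """Return (oversized, jpegs): image file names under `folder` that
    exceed `max_bytes`, and image file names that are JPGs rather than PNGs."""
    oversized, jpegs = [], []
    for path in sorted(Path(folder).rglob("*")):
        suffix = path.suffix.lower()
        if not path.is_file() or suffix not in IMAGE_SUFFIXES:
            continue  # skip folders and non-image files
        if path.stat().st_size > max_bytes:
            oversized.append(path.name)
        if suffix in {".jpg", ".jpeg"}:
            jpegs.append(path.name)
    return oversized, jpegs
```

Pointing audit_images at the folder that holds your custom shapes gives you a list of files to compress and a list of JPGs to consider converting to transparent PNGs.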


Local Computations:

✓ Even if a workbook is published to Tableau Server, local computations still impact performance. Leverage the power of Tableau Server whenever possible by limiting local computations such as groups, hierarchies, reference lines, table calculations and blending.
✓ Table calculations are powerful, but they can be slow. They are dependent on the local computation engine and can require substantial memory.
✓ Data blending builds a secondary temp table in cache. Although pre-aggregated, it is still computed locally. In v9.0, Tableau will begin processing queries in parallel, but it will be dependent on the data source. Until v9.0 releases, all queries run in series (sequentially).

Dashboard Layout:

✓ Limit the number of worksheets on a dashboard. If you have more than four visualizations on a dashboard, strongly reconsider.
✓ Fix dashboard size relative to end-user consumption. Automatic sizing is less efficient than specifying a dashboard size.
