Thursday, 18 June 2015

Tableau file types and extensions

1. Tableau Workbook (.twb): 

This file type is probably the most common one you will see and create when working with Tableau. It is in XML format (try opening it in a text editor) and contains all the information on each sheet and dashboard in your workbook: which fields are used in each view, how measures are aggregated, the formatting and styles applied, and any other setup you’ve made to a sheet or dashboard (e.g. whether a quick filter is shown). It also includes data source connection information and any metadata you have created for that connection (see more on this below under .tds).
  • To create a .twb file, from Tableau Desktop, select File > Save
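
Because a .twb is plain XML, you can inspect it with any XML parser. Here is a minimal Python sketch, assuming the workbook lists its sheets as `worksheet` elements carrying a `name` attribute. The sample document is a hand-written stand-in, not a real workbook, so check the element names against one of your own files.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a .twb file; a real workbook contains far more detail.
SAMPLE_TWB = """<?xml version='1.0'?>
<workbook>
  <worksheets>
    <worksheet name='Sales by Region' />
    <worksheet name='Profit Trend' />
  </worksheets>
</workbook>"""


def worksheet_names(twb_xml):
    """Return the name attribute of every worksheet element in the XML text."""
    root = ET.fromstring(twb_xml)
    return [ws.get("name") for ws in root.iter("worksheet")]


print(worksheet_names(SAMPLE_TWB))
```

To try this on a real file, read the file's text and pass it to `worksheet_names` instead of the sample string.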

2. Tableau Packaged Workbook (.twbx):

A Tableau Workbook (.twb), as described above, holds all the information Tableau requires to draw your viz, but it does not include the data itself. A Packaged Workbook, however, combines the information in a workbook and bundles it with any local data, i.e. data that is not on a server. You can think of it as a zip file, and indeed if you rename the .twbx file as a .zip you can open it with Windows to see the .twb and the corresponding data files. A .twbx will also include any custom images, as well as any custom geocoding you may have used in your work.
The primary reason you would save your work as a Packaged Workbook is so that you can share it with other Tableau Desktop users, or for others to open using Tableau Reader.
  • To create a .twbx file, from Tableau Desktop, select File > Save As and then select the .twbx option from the dropdown menu at the bottom of the Save As dialogue box.
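
Since a .twbx behaves like a zip archive, you can also list its contents without renaming it. The sketch below uses Python's standard `zipfile` module. The package is built in memory and its file names are made up for illustration, so the example runs without Tableau installed.

```python
import io
import zipfile


def list_package_contents(package):
    """Return the file names stored inside a .twbx (path or file-like object)."""
    with zipfile.ZipFile(package) as archive:
        return archive.namelist()


# Build a tiny stand-in package in memory so the sketch runs without Tableau.
# The file names below are hypothetical.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("Example.twb", "<workbook />")
    archive.writestr("Data/sales.csv", "region,sales\nNorth,100\n")

print(list_package_contents(buffer))
```

Passing the path of a real .twbx file to `list_package_contents` should work the same way.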

3. Tableau Datasource (.tds):


When you connect to your data for the first time, you may have a little bit of data ‘modelling’ to do: setting the right data types, changing default aggregations, setting default colours, creating some custom calculated fields and so on. You are giving Tableau information about the data you will be using, i.e. setting up its ‘metadata’. When you want to connect to this data again, you don’t want to go through all this data modelling a second time, so instead you can save your metadata as a .tds file (again, it is saved in XML format) and connect to your data through this file. You could also distribute this file so that your colleagues have access to the nice formatting and custom fields you have worked to set up.

Tableau is clever enough to pick up new columns/fields in the data source if they appear, and column ordering does not matter, but if column names change or disappear completely, you will need to reconfigure.

  • To create a .tds file, from Tableau Desktop, right click on your data source connection and select Add to Saved Data Sources. Alternatively you can publish the .tds to Tableau Server by right clicking and selecting Publish to Server instead.
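
To get a feel for the metadata a .tds holds, here is a rough Python sketch that reads column definitions from XML. It assumes each field appears as a `column` element with `name`, `datatype` and `role` attributes, and that calculated fields carry a nested `calculation` element with a `formula` attribute. The sample below is hand-written for illustration, so verify the layout against a .tds you have saved yourself.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a .tds file; field names and formula are hypothetical.
SAMPLE_TDS = """<?xml version='1.0'?>
<datasource formatted-name='Sales'>
  <column name='[Region]' datatype='string' role='dimension' />
  <column name='[Sales]' datatype='real' role='measure' />
  <column name='[Profit Ratio]' datatype='real' role='measure'>
    <calculation class='tableau' formula='SUM([Profit]) / SUM([Sales])' />
  </column>
</datasource>"""


def describe_columns(tds_xml):
    """Return (name, datatype, role, formula-or-None) for each column element."""
    root = ET.fromstring(tds_xml)
    rows = []
    for column in root.iter("column"):
        calc = column.find("calculation")
        formula = calc.get("formula") if calc is not None else None
        rows.append((column.get("name"), column.get("datatype"), column.get("role"), formula))
    return rows


for row in describe_columns(SAMPLE_TDS):
    print(row)
```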

4. Tableau Data Extract (.tde):

Tableau Data Extracts are highly optimised, highly compressed subsets of your data stored in a columnar database file. When you connect to data using Tableau you can either connect ‘Live’ or you can extract the data into a .tde. Data extracts are used to radically improve performance, particularly when connecting to slow databases or slow files (e.g. CSVs), as well as enabling additional functionality (try doing a count distinct whilst connected live to Excel) and offline analysis.
You can also use extracts to perform some pre-aggregation of your data, and they can prevent the load and contention problems that may arise if you are connecting live to a database. You will also have to use an extract if you want to publish to Tableau Public.
The primary disadvantage of using an extract is that your Tableau viz is no longer pointing to the ‘live’ data source: if that data source updates, your viz will not reflect the change until you refresh the extract. Fortunately refreshing an extract is only a few clicks away, or you can set up your Tableau Server to refresh the extract on a schedule.
  • To create a .tde file, when you first connect to data, choose the Import all data or the Import some data option. If you are already connected live, right click on your data source connection and select Extract Data.

5. Tableau Packaged Datasource (.tdsx):

Just as a .twb does not contain any of the data but a .twbx does, a .tds file only contains the information about the data, not the data itself. A Tableau Packaged Datasource (.tdsx), however, contains the data too.
You would create this type of file instead of a .tds if you wanted to share the connection information with someone else who did not have access to the underlying data (for example, if it was stored on your local machine).
  • To create a .tdsx file, from the Add to Saved Data Sources dialogue box, change the file type from the dropdown at the bottom.

6. Tableau Bookmark (.tbm):

A slightly lesser known Tableau file type is the Tableau Bookmark. This file is a bit like an export of one single worksheet, which you can then import into another workbook to save you recreating the view from scratch. Tableau 8.1 introduces functionality to help copy and paste worksheets from one workbook to another, so this file type may be used less in future, but it can still be handy if you regularly use a particular view in many of the workbooks you create (a header page or appendix, for example).
  • To create a .tbm file, click Window > Bookmark > Create Bookmark. To reuse a bookmark, click Window > Bookmark > [bookmark name]. Note that you cannot create a bookmark from a dashboard page.

7. Tableau Map Source (.tms):

When plotting maps with Tableau, the software will connect to its mapping provider (Urban Mapping) to load the relevant map tiles in the background to plot your data points against. From the Map menu in Tableau Desktop, you have the option to add your own WMS server so that images from this source are loaded, rather than images from Urban Mapping. After you have added a new mapping source, you can share this setup with others by creating and distributing a Tableau Map Source file.
  • To create a .tms file, click on Map > Background Maps > WMS Servers and from the WMS Server Connections dialogue window, select Export. If you want this mapping source to always be available to your workbooks, add the .tms file to your Mapsources directory within My Tableau Repository.

8. Tableau Preferences (.tps):

The Tableau Preferences file can be used to create custom colour palettes, making it easier to use consistent colours (e.g. your corporate colour scheme) across all your workbooks.
This file is kept in your My Tableau Repository directory and is held in XML format.
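
As an illustration of what such a palette definition might look like, the Python sketch below generates the XML text for one categorical palette. The element names (`workbook`, `preferences`, `color-palette`, `color`) and the `type="regular"` attribute follow the layout commonly shown for the preferences file, but treat them as assumptions and compare with your own file before relying on them. The palette name and colours are made up.

```python
import xml.etree.ElementTree as ET


def build_preferences(palette_name, hex_colours):
    """Build preferences-style XML text containing one categorical palette."""
    workbook = ET.Element("workbook")
    preferences = ET.SubElement(workbook, "preferences")
    palette = ET.SubElement(
        preferences, "color-palette", {"name": palette_name, "type": "regular"}
    )
    for hex_colour in hex_colours:
        ET.SubElement(palette, "color").text = hex_colour
    return "<?xml version='1.0'?>\n" + ET.tostring(workbook, encoding="unicode")


# Hypothetical corporate palette.
tps_text = build_preferences("Corporate", ["#1f77b4", "#ff7f0e", "#2ca02c"])
print(tps_text)
```

You could paste the printed text into the preferences file in your repository and restart Tableau Desktop to see whether the palette appears.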

Wednesday, 17 June 2015

How to learn Tableau Desktop

Tableau is super easy to learn, but in the end, it doesn’t come for free. You’ll have to put a little effort in, but Tableau have also done a great job of making the trip up the learning curve an enjoyable and easy one. Here’s how to get from beginner to expert in a few days.

STEP 1: Start by downloading the free trial, unless your boss has been kind enough to purchase the software for you. Download from HERE.

STEP 2: Watch the product tour. This gives a great overview of the product in under 10 minutes and should leave you wanting to know more! You can find the product tour HERE.

STEP 3: Start using one of the data sources provided within the Tableau installation files (Coffee Store or Superstore sales) to produce visualisations. These data sources are relatively rich and contain interesting data that lets you evaluate the mapping features, for example.

STEP 4: Use the on-demand videos to continue your training. The top level index for these can be found HERE.

These appear to be long (3 hours +), but in reality they are broken up into a series of short clips which are very goal orientated.
Start with the introductory videos, and move on to the advanced topics. Treat the advanced videos as reference manuals: you do not need to remember everything in these sections, just knowing that they exist is enough. Come back to them when the need arises.

Tuesday, 16 June 2015

Tableau Performance Checklist


The List
The Tableau Performance Checklist is divided into seven main categories. You’ll find those categories, each with its best practices, in the master list below:

Data:

✓  Keep analysis simple: Work with a subset of your data. Extract a sample if needed.
✓  Bring in only the data needed for analysis: Consider adding a data source filter or using an extract. If using a join, minimize the number of joined tables.
✓  Use “Describe” to explore dimensions in new data sets without having to load them into a viz (keyboard shortcut CTRL+E).
✓  Remove unused columns (measures/dimensions) in order to minimize extract refresh time or custom SQL query time.
✓  Create a published TDS file for your business team to use rather than each analyst creating their own data source. This includes all metadata associated with dimensions, measures, calculated fields, hierarchies, sets, parameters and naming conventions.
✓  Use extracts wherever possible to accelerate performance. Hide unused and confidential fields. Roll up data granularity by pre-aggregating or filtering. Break hierarchies to only visible dimensions.
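
To make “rolling up data granularity” concrete, here is a small Python sketch that sums day-level rows into monthly totals per region. It only illustrates the idea of pre-aggregation and is not how Tableau builds an extract internally. The rows and field names are hypothetical.

```python
from collections import defaultdict

# Hypothetical row-level data: (order date, region, sales).
rows = [
    ("2015-06-01", "North", 100.0),
    ("2015-06-15", "North", 50.0),
    ("2015-06-20", "South", 75.0),
    ("2015-07-02", "North", 20.0),
]


def roll_up_to_month(records):
    """Sum sales per (year-month, region), discarding day-level detail."""
    totals = defaultdict(float)
    for date, region, sales in records:
        totals[(date[:7], region)] += sales
    return dict(totals)


monthly = roll_up_to_month(rows)
print(len(rows), "rows rolled up to", len(monthly))
print(monthly)
```

The fewer rows the extract has to hold, the less work each query has to do, which is the point of the checklist item above.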

Filtering:

✓  Minimize the number of quick filters: Use dashboard filter actions where possible.
✓  Avoid selecting “Only Relevant Values” for your quick filters. This requires sequential queries. Do not use this when not needed.
✓  Avoid high-cardinality quick filters (multi-select or drop-down lists). High-cardinality quick filters are slow to load and render.
✓  Avoid quick filters or actions that drive context filters. These require reloading the context table and should be avoided wherever possible.
✓  Keep range quick filters simple. The more complex the range, the slower the query.
✓  Replace quick filters showing “Only Relevant Values”, or a high count of quick filters, with dashboard filter actions. They will cascade as your user interacts, and they perform faster.
✓  Don’t be lazy with user filters. Security by user filters can impact performance on Tableau Server as the server cannot share connections and query caches if user filters are active. Consider building a summary view that is a user-agnostic overview using a pre-aggregated extract with underlying data hidden. For a detailed view, restrict it to specific users or Active Directory groups instead of user filters.

Custom SQL:

✓  Limit custom SQL connections as they can be inefficient. Where possible, create a view on the database server to implement your custom SQL and connect Tableau to your view.
✓  Avoid parameters in custom SQL in Tableau. Tableau wraps the custom SQL in a subquery that many databases don’t handle well. Consider building a view in the database or use a multi-table join with filters.
✓  Watch for useless clauses, e.g. ORDER BY. Tableau is going to re-sort the data once loaded anyway.
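
The view-versus-subquery advice can be sketched with Python's built-in `sqlite3` module. The wrapped query below only illustrates the general idea of custom SQL being nested inside another query. It is not the exact SQL Tableau generates, and the table, view and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (region TEXT, sales REAL);
    INSERT INTO orders VALUES ('North', 100), ('South', 75), ('North', 50);

    -- The logic that would otherwise live in custom SQL, without an ORDER BY.
    CREATE VIEW sales_by_region AS
        SELECT region, SUM(sales) AS total_sales
        FROM orders
        GROUP BY region;
""")

custom_sql = "SELECT region, SUM(sales) AS total_sales FROM orders GROUP BY region"

# Roughly how a tool might wrap custom SQL: as a subquery inside its own query.
wrapped = conn.execute(
    f"SELECT region, total_sales FROM ({custom_sql}) AS t WHERE total_sales > 80"
).fetchall()

# The same question asked of the view, which the database treats like a table.
from_view = conn.execute(
    "SELECT region, total_sales FROM sales_by_region WHERE total_sales > 80"
).fetchall()

print(wrapped, from_view)
```

Both queries return the same rows. The difference is that the view lives in the database, where it can be tuned and reused, instead of being resent as a nested subquery on every request.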

Calculations:

✓  Use calculated fields carefully. Think about the data type as you code the calculation.
✓  Number and Boolean calculations perform better than date calculations, which in turn perform better than string calculations.
✓  Limit blended calculations. They require sequentially querying multiple data sources and can be time consuming. Where possible, create a view on the database server.
✓  Avoid row-level calculations involving parameters.

  
Rendering:

✓  Avoid high mark counts. More marks = longer rendering time.
✓  Limit the use of detailed text tables with lots of marks.
✓  Minimize the file size of any images or custom shapes where possible. As a general rule of thumb, keep images under 50 KB.
✓  If using custom shapes, use transparent background PNGs instead of JPGs. Views will render cleaner, and shape files will take up less space.
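
If you keep many images or custom shapes in one folder, a short script can flag the ones that break the 50 KB rule of thumb. This is a minimal Python sketch. The list of file extensions is an assumption, and the demo uses a temporary folder with dummy files, not a real shapes folder.

```python
import tempfile
from pathlib import Path

SIZE_LIMIT_BYTES = 50 * 1024  # the 50 KB rule of thumb from the checklist


def oversized_images(folder, limit=SIZE_LIMIT_BYTES):
    """Return (file name, size in bytes) for image files larger than the limit."""
    extensions = {".png", ".jpg", ".jpeg", ".gif", ".bmp"}  # assumed image types
    return sorted(
        (path.name, path.stat().st_size)
        for path in Path(folder).rglob("*")
        if path.is_file()
        and path.suffix.lower() in extensions
        and path.stat().st_size > limit
    )


# Demo with dummy files so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "small.png").write_bytes(b"\0" * 1024)
    Path(tmp, "large.png").write_bytes(b"\0" * (60 * 1024))
    result = oversized_images(tmp)
    print(result)
```

Point `oversized_images` at the folder where you keep your custom shapes to check your own files.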


Local Computations:

✓  Even if a workbook is published to Tableau Server, local computations still impact performance. Leverage the power of Tableau Server whenever possible by limiting local computations such as groups, hierarchies, reference lines, table calculations and blending.
✓  Table calculations are powerful, but they can be slow. They are dependent on the local computation engine and can require substantial memory.
✓  Data blending builds a secondary temp table in cache. Although pre-aggregated, it is still computed locally. In v9.0, Tableau will begin processing queries in parallel, but it will be dependent on the data source. Until v9.0 releases, all queries run in series (sequentially).

Dashboard Layout:

✓  Limit the number of worksheets on a dashboard. If you have more than four visualizations on a dashboard, strongly reconsider.
✓  Fix dashboard size relative to end-user consumption. Automatic sizing is less efficient than specifying dashboard size.