To DV or to D3 – that is the question

The most popular approach to visualization among business users is to use a Data Visualization (DV) tool like Tableau (or Qlikview or Spotfire), where a lot of features are already implemented for you. Recent proof of this popularity is that at least 100 million people (as of February 2013) used Tableau Public as their Data Visualization tool of choice; see

http://www.tableausoftware.com/about/blog/2013/2/crossing-100-million-milestone-21304

However, to make your documents and stories (and not just your data visualization applications) driven by your data, you may need the other approach: to code the visualization of your data into your story. Visualization libraries like the popular D3 toolkit can help you. D3 stands for “Data-Driven Documents”. The author of D3, Mike Bostock, designs interactive graphics for the New York Times; one of his latest samples is here:

http://www.nytimes.com/interactive/2013/02/20/movies/among-the-oscar-contenders-a-host-of-connections.html

and the NYT allows him to do a lot of open source work, which he demonstrates on his website here:

https://github.com/mbostock/d3/wiki/Gallery


Mike was a “visualization scientist” and a computer science PhD student at Stanford University and a member of a famous group of people, now called the “Stanford Visualization Group”:

http://vis.stanford.edu/people/

This Visualization Group was the birthplace of Tableau’s prototype, which they sometimes called “a Visual Interface” for exploring data; its other name is Polaris:

http://www.graphics.stanford.edu/projects/polaris/

The creators of Polaris went on to start Tableau Software. Another of the Group’s popular “products” was a graphical toolkit for visualization called Protovis (mostly in JavaScript, as opposed to Polaris, which was written in C++):

http://mbostock.github.com/protovis/

Mike Bostock was one of Protovis’s main co-authors. Less than 2 years ago, the Visualization Group suddenly stopped developing Protovis and recommended that everybody switch to the D3 library, authored by Mike:

https://github.com/mbostock

This library is open source (only 100KB in ZIP format) and can be downloaded from here:

http://d3js.org/d3.v3.zip


In order to use D3, you need to be comfortable with HTML, CSS, SVG, JavaScript programming, the DOM (and other Web standards); an understanding of the jQuery paradigm will be useful too. Basically, if you want to be at least partially as good as Mike Bostock, you need to have the mindset of a programmer (I guess in addition to a business user mindset), like this D3 expert:

http://www.jasondavies.com/

Most successful early D3 adopters combine 3 or more mindsets: programmer, business analyst, data artist and sometimes even data storyteller. For your programmer’s mindset, you may be interested to know that D3 has a large set of plugins, see:

https://github.com/d3/d3-plugins

and a rich API, see https://github.com/mbostock/d3/wiki/API-Reference

You can find hundreds of D3 demos, samples, examples, tools, products and even a few companies using D3 here: https://github.com/mbostock/d3/wiki/Gallery
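To make the “data-driven document” idea a bit more concrete, here is a minimal sketch in Python. It is not D3 (which is a JavaScript library) and does not use D3's API; it only imitates the core idea that each data item drives one SVG element. The function name bar_chart_svg, its parameters and the sample values are all made up for illustration.

```python
from xml.sax.saxutils import escape


def bar_chart_svg(data, bar_height=20, gap=4, max_width=300):
    """Build an SVG string with one <rect> per (label, value) pair.

    Hypothetical helper: it imitates the idea of data driving the
    document, it is not how D3 itself is implemented.
    """
    if not data:
        raise ValueError("data must contain at least one (label, value) pair")
    # Scale bars so the largest value fills max_width (avoid dividing by zero).
    largest = max(value for _, value in data) or 1
    rects = []
    for i, (label, value) in enumerate(data):
        y = i * (bar_height + gap)
        width = round(max_width * value / largest)
        rects.append(
            f'<rect x="0" y="{y}" width="{width}" height="{bar_height}">'
            f"<title>{escape(label)}: {value}</title></rect>"
        )
    height = len(data) * (bar_height + gap)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{max_width}" height="{height}">' + "".join(rects) + "</svg>"
    )


# Illustrative values only, not real measurements.
sample = [("A", 30), ("B", 20), ("C", 10)]
svg = bar_chart_svg(sample)
```

Changing the sample list changes the document: add a fourth pair and a fourth bar appears. In D3 the same binding happens in the browser against the live DOM, which is why the Web standards listed above matter.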


5000 Points: Local Rendering is here

The human eye cannot effectively process more than a few thousand Datapoints per View.


Additionally, in Data Visualization you have other restrictions:

  • number of pixels on your screen (maybe 2-3 million at most) available for your View (Chart or Dashboard).
  • time to render millions of Datapoints can be too long and may create a bad User Experience (too much waiting).
  • time to load your Datapoints into your View; if you wish to have a good User Experience, then 2-3 seconds is the maximum a user will wait. If you have a live connection to a datasource, then 2-3 seconds means a few thousand Datapoints at most.
  • again, the more Datapoints you put in your View, the more crowded it will be, and the less useful and understandable it will be for your users.
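The restrictions above can be turned into a rough back-of-the-envelope budget. The sketch below is only an illustration: the 2 million pixels and the 3 second wait come from the list above, while the 100 pixels per mark (a 10x10 square) and the 1500 Datapoints per second load rate are assumptions picked to match the “few thousand” figure, not measurements.

```python
def max_datapoints(view_pixels, pixels_per_mark, wait_seconds, points_per_second):
    """Return the smaller of the pixel-based and the time-based limit.

    Hypothetical helper; all four inputs are estimates you supply yourself.
    """
    pixel_limit = view_pixels // pixels_per_mark          # marks that fit on screen
    time_limit = int(wait_seconds * points_per_second)    # marks loadable in time
    return min(pixel_limit, time_limit)


budget = max_datapoints(
    view_pixels=2_000_000,     # from the text: 2-3 million pixels at most
    pixels_per_mark=100,       # assumption: each mark needs about 10x10 pixels
    wait_seconds=3,            # from the text: 2-3 seconds is the maximum wait
    points_per_second=1500,    # assumption: load rate over a live connection
)
print(budget)
```

With these inputs the time limit (4500) is tighter than the pixel limit (20000), so loading time, not screen size, caps the View.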

Recently, some vendors started to give you a new reason to restrict how many Datapoints you put into your DataView: the use of client-side hardware (especially its graphics hardware) for so-called “Local Rendering”.

Local Rendering means that the Data Visualization Server sends DataPoints instead of Images to the Client, and the Image is rendered on the Client side, using the capability of modern Web Browsers (to use the Client’s Hardware) and HTML5 Canvas technology.


For example, a new feature in Tableau Server 8 automatically switches to Local Rendering if the number of DataPoints in your DataView (a Worksheet with your Chart or Dashboard) is less than 5000 (Marks in Tableau speak). In addition to faster rendering, this means fewer round-trips to the Server (for example, hovering your mouse over a Datapoint used to require a round-trip to the Server) and faster Drill-down, Selection and Filtering operations.
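The switching rule described above can be sketched in a few lines. This is not Tableau's code or API; the function and constant names are hypothetical, and only the 5000-mark threshold and the “less than” comparison come from the description above.

```python
LOCAL_RENDERING_THRESHOLD = 5000  # marks; the figure quoted for Tableau Server 8


def rendering_mode(mark_count, threshold=LOCAL_RENDERING_THRESHOLD):
    """Return "local" when the view is small enough to draw in the browser.

    Hypothetical helper: below the threshold the server would send
    Datapoints, otherwise it would send a rendered image.
    """
    return "local" if mark_count < threshold else "server"


print(rendering_mode(1200))   # small view
print(rendering_mode(25000))  # large view
```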

Update 3/19/13: James Baker from Tableau Software explains why Tableau 8 Dashboards in Web Browser feel more responsive:

http://www.tableausoftware.com/about/blog/2013/3/quiet-revolution-rendering-21874

James explained that “HTML5’s canvas element” is used as the drawing surface. He underscored that, for views with many marks, it is much faster to send images rather than data, because image size does not scale up linearly with the number of marks. James included a short video that shows incremental filtering in a browser, one of the features of Local Rendering.
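The scaling argument can be illustrated with rough arithmetic. In the sketch below every number is an assumption chosen for illustration (40 bytes per mark, an 800x600 image compressed to about half a byte per pixel); none of them are Tableau's figures. The point is only the shape of the comparison: the data payload grows with the number of marks, while the image payload depends on the size of the View.

```python
def data_payload_bytes(mark_count, bytes_per_mark=40):
    """Size of sending raw Datapoints: grows linearly with the mark count."""
    return mark_count * bytes_per_mark


def image_payload_bytes(width=800, height=600, bytes_per_pixel=0.5):
    """Size of sending a rendered image: depends on the View, not the marks."""
    return int(width * height * bytes_per_pixel)


def cheaper_payload(mark_count):
    """Return which payload is smaller under the assumed sizes above."""
    if data_payload_bytes(mark_count) < image_payload_bytes():
        return "data"
    return "image"


for marks in (1000, 5000, 10000, 100000):
    print(marks, data_payload_bytes(marks), image_payload_bytes(), cheaper_payload(marks))
```

Under these made-up sizes, data is the smaller payload for small views and the image wins once the view holds more than a few thousand marks; real crossover points depend on the actual encoding and compression.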
