Motion Map Chart with Tableau

8 years ago Hans Rosling demoed the Motion Chart at TED, using Gapminder’s Trendalyzer. 7 years ago Google bought Trendalyzer and incorporated it into Google Charts.

A while ago, for my own education and for demo purposes, I implemented various Motion Charts using:

Google+

To implement a Motion Chart in Tableau, you can use the Pages shelf and place there either a time dimension (I used the dimension “Year” in the Tableau example above) or even Measure Names (Average Monthly Home Value per ZIP Code), as in my implementation of the Motion Map Chart below.

AverageHomeValuePerZipCode

Tableau’s ability to move through pages (automatically when Tableau Desktop or Tableau Reader is in use, and manually when the Data Visualization is hosted by Tableau Server and accessed through a web browser) enables us to create all kinds of Motion Charts, as long as the Visualization Author puts onto Pages a Time, Date or Timestamp variable describing a Timeline. For me the most interesting was to make a Filled Map (a Chart Type supported by Tableau, similar to Choropleth Map Charts) as a Motion Map Chart; see the result below.

As we all know, 80% of any Data Visualization is Data, and I found the appropriate Dataset at Zillow Real Estate Research here: http://www.zillow.com/blog/research/data/ . The Dataset contains Monthly Sales Data for All Homes (SFR, Condo/Co-op) for the entire US from 1997 until the Current Month (so far for 12604 ZIP Codes, which is only 25% of all USA ZIP codes): the average for each ZIP Code area.

This Dataset covers 197 Months and contains about 2.5 million DataPoints. All 5 Dimensions in the Dataset are very “Geographical”: State, County, Metro Area, City and ZIP code (to define the “Region” and enable Tableau to generate a Longitude and Latitude), and each record has 197 Measures: the Average Monthly Home Prices per Given Region (which is the ZIP Code Area) for each available Month since 1997.
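The wide layout described above (one record per ZIP code, one measure column per month) is what makes Measure Names usable on Pages. As a side note, the same data could be unpivoted into one row per ZIP code and month, which gives a real date dimension. Below is a minimal Python sketch of that reshaping. The column names (RegionName, State, County, Metro, City, and months labeled like 1997-01) and the sample values are assumptions for illustration, not the actual Zillow header or data, and this is not what I did for the published workbook.

```python
import csv
import io

# Hypothetical geographic columns; the real Zillow file may name them differently.
ID_COLUMNS = ["RegionName", "State", "County", "Metro", "City"]

def wide_to_long(rows, id_columns=ID_COLUMNS):
    """Turn one record per ZIP code (one column per month) into
    one record per ZIP code and month."""
    long_rows = []
    for row in rows:
        ids = {name: row[name] for name in id_columns}
        for column, value in row.items():
            if column in id_columns or value == "":
                continue  # skip geography columns and missing months
            long_rows.append({**ids, "Month": column, "HomeValue": float(value)})
    return long_rows

# Tiny made-up sample: two ZIP codes, three monthly measures, one value missing.
SAMPLE = """RegionName,State,County,Metro,City,1997-01,1997-02,1997-03
02139,MA,Middlesex,Boston,Cambridge,150000,151000,
02493,MA,Middlesex,Boston,Weston,400000,402000,405000
"""

long_rows = wide_to_long(csv.DictReader(io.StringIO(SAMPLE)))
print(len(long_rows))  # 5 rows: the missing month is skipped
```

For the full Dataset this would give at most 12604 × 197 = 2,482,988 rows, which agrees with the "about 2.5 million DataPoints" above.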

In order to create a Motion Filled Map Chart, I put Longitude as Column and Latitude as Row, Measure Values as Color, Measure Names (except Number of Records) as Pages, State and Measure Names as Filters, State and ZIP code as Details, and finally the Attribute Values of County, Metro Area and City as Tooltips. I published the result on Tableau Public here:

http://public.tableausoftware.com/views/zhv/ZillowHomeValueByZIP_1997-2013#1 ,

so you can review it online AND you can download it and use it within Tableau Reader or Tableau Desktop as the automated Motion Map Chart.

For Presentation and Demo purposes I created the Slides and a Movie (while playing it, don’t forget to set the Video Quality to HD resolution) with a Filled Map Chart colored by Home Values for the entire USA in 2013 as the Starting point and with 22 follow-up steps/slides: Zoom to the Northeast Map, colored by 2013 Values; Zoom to Southeastern New England 2013; start the Motion from Southeastern New England, colored by 1997 Home Values per ZIP Code, and then automatic Motion through all years from 1997 to 2014; then Zoom to Eastern Massachusetts and finally Zoom to Middlesex County in Massachusetts. See the movie below:

Here is the content of this video as a presentation with 24 Slides:

Now I think it is appropriate to express my New Year Wish (I have been repeating it for a few years in a row) that Tableau Software Inc. will port the ability to create AUTOMATED Motion Charts from Tableau Desktop and Tableau Reader to Tableau Server. Please!

Happy New 2014!

My Best Wishes for 2014 to all visitors of this Blog!

New2014

2013 was a very successful year for the Data Visualization (DV) community, Data Visualization vendors and this Data Visualization Blog (the number of visitors grew from an average of 16000 to 25000+ per month).

From a certain point of view 2013 was the year of Tableau: it went public, it now has the largest Market Capitalization among DV Vendors (more than $4B as of today), its strategy (Data to the People!) became the most popular among DV users, and it had (again) the largest YoY revenue growth (almost 75%!) among DV Vendors. Tableau already employs more than 1100 people and still has 169+ job openings as of today. I wish Tableau to stay the Leader of our community and to keep its YoY growth above 50%; this will not be easy.

Qliktech is the largest DV Vendor; in 2014 it will exceed the half-billion-dollar mark in revenue (probably closer to $600M by the end of 2014) and will employ almost 2000 people. Qlikview is one of the best DV products on the market. I wish that in 2014 Qlikview will create Cloud Services similar to Tableau Online and Tableau Public, and I wish Qlikview.Next will keep Qlikview Desktop Professional (in addition to the HTML5 client).

I wish TIBCO will stop trying to improve BI or make it better (you cannot reanimate a dead horse); instead I wish Spotfire will embrace the approach “Data to the People” and act accordingly. For Spotfire my biggest wish is that TIBCO will spin it off the same way EMC did with VMware. And yes, I wish Spotfire Cloud Personal will be free and able to read at least local flat files and local DBs like Access.

2014 (or maybe 2015?) can witness a new, 4th DV player joining the competition: Datawatch recently bought Panopticon, and if it completes the integration of all products correctly and adds features which the other DV vendors above already have (like Cloud Services), it can be a very competitive player. I wish them luck!

TibxDataQlikQwchFrom051713To122413

In 2013 Microsoft released a lot of advanced and useful DV-related functionality, and I wish (I have been recycling this wish for many years now) that Microsoft will finally package most of its Data Visualization functionality in one DV product and add it to Office 20XX (like they did with Visio) and Office 365, instead of a bunch of plug-ins for Excel and SharePoint.

It is a mystery to me why Panorama, Visokio and Advizor Solutions are still relatively small players, despite all 3 of them having excellent DV features and products. Based on Tableau’s 2013 IPO experience, maybe the best way for them is to go public and get new blood? I wish them to learn from the success of Tableau and Qlikview and try this path in 2014-15…

For Microstrategy my wish is very simple: they are the only traditional BI player who realised that BI is dead, and they started in 2013 (actually before 2013) a transition into the DV market. I wish them all the success they can handle!

I also think that a few thousand Tableau, Qlikview and Spotfire customers (say 5% of the customer base) will need (in 2014 and beyond) deeper Analytics, and they will try to complement their Data Visualizations with Advanced Visualization technologies they can get from vendors like http://www.avs.com/

My best wishes to everyone! Happy New Year!


Notes about Spotfire 6 Cloud pricing

2 months ago TIBCO (symbol TIBX on NASDAQ) announced Spotfire 6 at the TUCON 2013 user conference. This, as well as the follow-up release (around 12/7/13) of Spotfire Cloud, was supposed to be good for the TIBX price. Instead, since then TIBX lost more than 8%, while NASDAQ as a whole grew more than 5%:

TIBXvsNasdaqFrom1014To121313

For example, at TUCON 2013 TIBCO’s CEO re-declared “5 primary forces for 21st century” (IMHO all 5 “drivers” sound like obsolete IBM-ish sales pitches), I guess to underscore the relevance of TIBCO’s strategy and products to the 21st century:

  1. Explosion of data (sounds like Sun rises in the East);

  2. Rise of mobility (any kid with smartphone will say the same);

  3. Emergence of Platforms (not sure if this is a good pitch; at least it was not clear from TIBCO’s presentation);

  4. Emergence of Asian Economies (what else do you expect? This is a side effect of the greedy offshoring of more than a decade);

  5. Math trumping Science (Mr. Ranadive and various other TUCON speakers kept repeating this mantra, showing that they think statistics and “math” are the same thing and that they do not know how valuable science can be. I personally think that recycling this pitch is dangerous for TIBCO sales and I suggest replacing this statement with something more appealing and more mature).

Somehow the TUCON 2013 propaganda and the introduction of the new and more capable version 6 of Spotfire and of Spotfire Cloud did not help TIBCO’s stock. For example, in trading on Thursday, 12/12/13, the shares of TIBCO Software, Inc. (NASD: TIBX) crossed below their 200-day moving average of $22.86, changing hands as low as $22.39 per share, while Market Capitalization was oscillating around $3.9B, basically the same as the capitalization of Tableau Software, a competitor 3 times smaller in terms of employees.

As I said above, just a few days before this low TIBX price, on 12/7/13, as promised at TUCON 2013, TIBCO launched Spotfire Cloud and published licensing and pricing for it.

The most disappointing news is that in reality TIBCO withdrew itself from the competition for mindshare with Tableau Public (more than 100 million users, more than 40000 active publishers and Visualization Authors with a Tableau Public Profile), because TIBCO no longer offers free annual evaluations. In addition, the new Spotfire Cloud Personal service ($300/year, 100GB storage, 1 business author seat) became less useful under the new license, since its Desktop Client has limited connectivity to local data and can upload only local DXP files.

The 2nd Cloud option is called Spotfire Cloud Work Group ($2000/year, 250GB storage, 1 business author/1 analyst/5 consumer seats) and gives one author an almost complete TIBCO Spotfire Analyst with the ability to read 17 different types of local files (dxp, stdf, sbdf, sfs, xls, xlsx, xlsm, xlsb, csv, txt, mdb, mde, accdb, accde, sas7bdat, udl, log, shp) and connectivity to standard Data Sources (ODBC, OleDb, Oracle, Microsoft SQL Server Compact Data Provider 4.0, .NET Data Provider for Teradata, ADS Composite Information Server Connection, Microsoft SQL Server (including Analysis Services), Teradata and TIBCO Spotfire Maps). It also enables the author to do predictive analytics, forecasting, and local R language scripting.

This 2nd Spotfire Cloud option does not reduce Spotfire’s chances to compete with Tableau Online, which costs 4 times less ($500/year). However (thanks to 2 Blog Visitors, both named Steve, for help), you cannot use Tableau Online without a licensed version of Tableau Desktop ($1999 for a perpetual, non-expiring desktop license with 1st-year maintenance included, and 20% ($400) per year for maintenance in each following year) and an Online License for each consumer (an additional $500/year for access to the same site, but extra storage will not be added to that site!). Let’s compare the cumulative cost of Spotfire Cloud Work Group and Tableau Online for 1, 2, 3 and 4 years for 1 developer/analyst and 5 consumer seats:

 

Cumulative cost for 1, 2, 3 and 4 years of usage/subscription, 1 developer/analyst and 5 consumer seats:

| Year | Spotfire Cloud Work Group, 250GB storage | Tableau Online (with Desktop), 100GB storage | Cost Difference (negative if Spotfire cheaper) |
|---|---|---|---|
| 1 | $2000 | $4999 | -$2999 |
| 2 | $4000 | $8399 | -$4399 |
| 3 | $6000 | $11799 | -$5799 |
| 4 | $8000 | $15199 | -$7199 |
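The cumulative figures in this comparison can be reproduced from the prices quoted above. Here is a minimal Python sketch. It assumes that the Tableau column counts six Online licenses (1 author plus 5 consumers) at $500/year each, one Desktop license at $1999, and $400/year Desktop maintenance from the second year on. The constant and function names are mine, for illustration only.

```python
SPOTFIRE_WORK_GROUP_PER_YEAR = 2000   # 1 author, 1 analyst, 5 consumer seats
TABLEAU_DESKTOP_LICENSE = 1999        # perpetual, first-year maintenance included
TABLEAU_DESKTOP_MAINTENANCE = 400     # per year, from the second year on
TABLEAU_ONLINE_PER_USER = 500         # per user per year
TABLEAU_ONLINE_USERS = 6              # assumption: 1 author + 5 consumers

def spotfire_cumulative(years):
    """Cumulative Spotfire Cloud Work Group cost after the given number of years."""
    return SPOTFIRE_WORK_GROUP_PER_YEAR * years

def tableau_cumulative(years):
    """Cumulative Tableau Desktop + Tableau Online cost after the given number of years."""
    online = TABLEAU_ONLINE_PER_USER * TABLEAU_ONLINE_USERS * years
    maintenance = TABLEAU_DESKTOP_MAINTENANCE * (years - 1)
    return TABLEAU_DESKTOP_LICENSE + maintenance + online

for year in range(1, 5):
    spotfire, tableau = spotfire_cumulative(year), tableau_cumulative(year)
    # The difference is Spotfire minus Tableau, so it is negative when Spotfire is cheaper.
    print(year, spotfire, tableau, spotfire - tableau)
```

Under these assumptions the difference is negative in every one of the 4 years, that is, Spotfire Cloud Work Group stays cheaper throughout.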

UPDATE: You may need to consider some other properties, like available storage and the number of users who can consume/review visualizations published in the cloud. In the sample above:

  • Spotfire gives the Work Group a total of 250GB storage, while Tableau gives a total of 100GB to the site.
  • Spotfire costs less than Tableau Online for a similar configuration (almost 2 times less!)

Overall, Spotfire gives more for your $$$ and as such can be a front-runner in the Cloud Data Visualization race, considering that Qlikview does not have any comparable cloud options (yet) and Qliktech relies on its partners (I doubt this can be competitive) to offer Qlikview-based services in the cloud. Here is the same table as above, but as an image (to make sure all web browsers can see it):

SFvsTBCloudPrice

The 3rd Spotfire Cloud option is called Spotfire Cloud Enterprise; it has customizable seating options and storage, more advanced visualization, security and scalability, and it connects to 40+ additional data sources. It requires annoying negotiations with TIBCO sales, which may result in even higher pricing. The existence of the 3rd Spotfire Cloud option decreases the value of the 2nd Cloud option, because it tells the customer that Spotfire Cloud Work Group is not the best and does not include many features. Tableau’s Cloud approach is the opposite: you get everything with Tableau Online (with one exception: Multidimensional (cube) data sources are not supported by Tableau Online), which is the only option.

Update 12/20/13: TIBCO announced results for the last quarter, ending 11/30/13, with Quarterly revenue of $315.5M (only 6.4% growth compared with the same Quarter of 2012) and $1070M Revenue for the 12 months ended 11/30/13 (only 4.4% growth compared with the same period of 2012). Wall Street people do not like it, and TIBX lost 10% of its value today, with the Share Price ending at $22 and Market Capitalization going down to less than $3.6B. At the same time Tableau’s Share Price went up $1 to $66 and the Market Capitalization of Tableau Software (symbol DATA) went above $3.9B. As always, I think it is relevant to compare the number of job openings today: Spotfire – 28, Tableau – 176, Qliktech – 71

DV footprints on Disk and in Memory, Part 2

My previous blogpost, comparing footprints of the DV Leaders (Tableau 8.1, Qlikview 11.2, Spotfire 6) on disk (in terms of the size of the application file with an embedded dataset of 1 million rows) and in Memory (calculated as the RAM difference between a freshly loaded application without data and the same application after it loads the appropriate application file: XLSX, DXP, QVW or TWBX), got a lot of feedback from DV Blog visitors. It was even mentioned in Tableau Weekly #9 here:

http://us7.campaign-archive1.com/?u=f3dd94f15b41de877be6b0d4b&id=26fd537d2d&e=5943cb836b and the full list of Tableau Weekly issues is here: http://us7.campaign-archive1.com/home/?u=f3dd94f15b41de877be6b0d4b&id=d23712a896

The majority of the feedback asked for a similar Benchmark: the footprint comparison for a larger dataset, say with 10 million rows. I did that, but it required more time and work, because the footprint in memory for all 3 DV Leaders depends on the number of visualized Datapoints (Spotfire for years used the term Marks for Visible Datapoints and Tableau adopted this terminology too, so I use it from time to time as well, but I think that the correct term here is “Visible Datapoints”).

Basically I used the same dataset as in the previous blogpost, with the main difference that I took a subset with 10 million rows as opposed to 1 million rows in the previous Benchmarks. The Diversity of the Dataset with 10 million rows is here (each row has 15 fields, as in the previous benchmark):

I removed from the benchmarks for 10 million rows the usage of Excel 2013 (Excel cannot handle more than 1,048,576 rows per worksheet) and PowerPivot 2013 (it is less relevant for the given Benchmark). Here are the DV Footprints on disk and in Memory for the Dataset with 10 million rows and different numbers of Datapoints (or Marks: <16, 1000, around 10000, around 100000, around 800000):

The main observations and notes from benchmarking the footprints with 10 million rows are as follows:

  • Tableau 8.1 requires less disk space (almost 2 times less) for its application file (.TWBX) than Qlikview 11.2 for its application file (.QVW) or Spotfire 6 for its application file (.DXP).

  • Tableau 8.1 is much smarter in its use of RAM than Qlikview 11.2 and Spotfire 6, because it takes advantage of the number of Marks. For example, for 10000 Visible Datapoints Tableau uses 13 times less RAM than Qlikview and Spotfire, and for 100000 Visible Datapoints Tableau uses 8 times less RAM than Qlikview and Spotfire!

  • The usage of more than, say, 5000 Visible Datapoints (or even more than a few hundred Marks) in a particular Chart or Dashboard is often a sign of bad design or poor understanding of the task at hand; the human eye (of the end user) cannot comprehend too many Marks anyway, so what Tableau does (reducing the footprint in Memory when fewer Marks are used) is good design.

  • For Tableau, in the results above I reported the total RAM used by 2 Tableau processes in memory: TABLEAU.EXE itself and the supplemental process TDSERVER64.EXE (this 2nd, 64-bit process almost always uses about 21MB of RAM). Note: Russell Christopher also suggested monitoring TABPROTOSRV.EXE, but I could not find any trace of it or of its RAM usage during the benchmarks.

  • Qlikview 11.2 and Spotfire 6 have similar footprints in Memory and on Disk.
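The way the footprint in Memory is computed (RAM after loading the application file minus RAM of the freshly started application, summed over all processes that belong to the tool) can be written down as a small Python sketch. The function name and the readings below are made up for illustration; they are not benchmark results, and the readings themselves would come from whatever process monitor you use.

```python
def memory_footprint_mb(baseline_mb, loaded_mb):
    """RAM difference between the loaded and the freshly started application,
    summed over all processes that belong to it (readings in MB per process)."""
    names = set(baseline_mb) | set(loaded_mb)
    return sum(loaded_mb.get(name, 0.0) - baseline_mb.get(name, 0.0) for name in names)

# Made-up readings in MB, for illustration only.
baseline = {"TABLEAU.EXE": 120.0, "TDSERVER64.EXE": 21.0}
loaded = {"TABLEAU.EXE": 310.0, "TDSERVER64.EXE": 21.0}

print(memory_footprint_mb(baseline, loaded))  # 190.0
```

A process that appears only after loading (or only before) is counted with a reading of 0 on the other side, so a helper process started on demand is still included in the total.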

DV footprints on Disk and in Memory, Part 1

More than 2 years ago I estimated the footprints for a sample dataset (428999 rows and 135 columns) when it is encapsulated in a text file, in compressed ZIP format, in Excel 2010, in PowerPivot 2010, Qlikview 10, Spotfire 3.3 and Tableau 6. Since then everything has been upgraded to the “latest versions” and everything is 64-bit now, including Tableau 8.1, Spotfire 5.5 (and 6), Qlikview 11.2, Excel 2013 and PowerPivot 2013.

I decided to use a new dataset with exactly 1000000 rows (1 million rows) and 15 columns with the following diversity of values (Distinct Counts for every Column below):

Then I put this dataset into every application and format mentioned above, both on disk and in memory. All results are presented below for review by DV blog visitors:

Some comments about application specifics:

  • Excel and PowerPivot XLSX files are ZIP-compressed archives of a bunch of XML files

  • Spotfire DXP is a ZIP archive of proprietary Spotfire text format

  • QVW is Qlikview’s proprietary Datastore-RAM-optimized format

  • TWBX is Tableau-specific ZIP archive containing its TDE (Tableau Data Extract) and TWB (XML format) data-less workbook

  • Footprint in memory I calculated as the RAM difference between a freshly loaded application (without data) and the same application after it loads the appropriate application file (XLSX or DXP or QVW or TWBX)
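Since several of these application files are described above as ZIP archives, their on-disk footprint can be inspected member by member with Python's standard zipfile module. The sketch below builds a small stand-in archive in memory, because I cannot ship a real workbook here; the member names and contents are illustrative assumptions, not the actual layout of a TWBX file. For a real file you would pass its path to describe_archive instead of the in-memory buffer.

```python
import io
import zipfile

def describe_archive(file_obj):
    """Return (name, compressed size, uncompressed size) for each archive member."""
    with zipfile.ZipFile(file_obj) as archive:
        return [(info.filename, info.compress_size, info.file_size)
                for info in archive.infolist()]

# Build a small stand-in archive in memory; member names are illustrative only.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("workbook.twb", "<workbook/>" * 1000)
    archive.writestr("Data/extract.tde", b"\x00" * 50000)

buffer.seek(0)
for name, packed, unpacked in describe_archive(buffer):
    print(name, packed, unpacked)
```

Comparing the compressed and uncompressed sizes per member shows how much of the file size comes from the embedded data and how much from the workbook definition.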

Happy Shopping for your Data Visualization Lab!

Since we are approaching (in the USA, that is) Thanksgiving Day 2013 and shopping is not a sin for a few days, multiple blog visitors asked me what hardware advice I can share for their Data Science and Visualization Lab(s). First of all, I wish you a good Turkey for Thanksgiving (below is what I got last year):

Turkey2012

I cannot answer DV Lab questions individually (everybody has their own needs, specifics and budget), but I can share my shopping thoughts about the needs of a Data Visualization Lab (DV Lab). I think a DV Lab needs many different types of devices: smartphones, tablets, a projector (at least 1), maybe a couple of Large Touchscreen Monitors (or LED TVs connectable to PCs), multiple mobile workstations (depending on the size of the DV Lab team), at least one or two super-workstations/servers residing within the DV Lab, etc.

Smartphones and Tablets

I use a Samsung Galaxy S4 as of now, but for DV Lab needs I would consider either the Sony Xperia Z Ultra or the Nokia 1520, with the hope that the Samsung Galaxy S5 will be released soon (and maybe it will be the most appropriate for a DV Lab):

sonyVSnokia

My preference for a Tablet will be the upcoming Google Nexus 10 (2013 or 2014 edition; it is not clear, because Google is very secretive about it) and in certain cases the Google Nexus 7 (2013 edition). Until the next-generation Nexus 10 is released, I guess that the two leading choices will be the ASUS Transformer Pad TF701T

t701

and Samsung Galaxy Note 10.1 2014 edition (below is a relative comparison of the size of these 2 excellent tablets):

AsusVsNote10

Projectors, Monitors and maybe Cameras.

The next piece of hardware on my mind is a projector with support for full HD resolution and large screens. I think there are many good choices here, but my preference will be the BENQ W1080ST for $920 (please advise if you have a better projector in mind in the same price range):

benq_W1080ST

So far you cannot find too many Touchscreen Monitors for a reasonable price, so maybe these two 27″ touchscreen monitors (DELL P2714T for $620 or Acer T272HL bmidz for $560) are good choices for now:

dell-p2714t-overview1

I also think that a good digital camera can help a Data Visualization Lab, and I am considering for myself something like this (it can be bought for $300): the Panasonic Lumix DMC FZ72 with 60X optical zoom and the ability to record Motion Pictures as HD Video at 1,920 x 1,080 pixels:

panasonic_lumix_dmc_fz72_08

Mobile and Stationary Workstations and Servers.

If you need to choose a CPU, I suggest starting with Intel’s Processor Feature Filter here: http://ark.intel.com/search/advanced . In terms of mobile workstations, you can get a quad-core notebook (like the Dell 4700 for $2400, or the Dell Precision 4800 or HP ZBook 15 for $3500) with 32 GB RAM and a decent configuration with multiple ports; see a sample here:

m4700

If you are OK with 16GB of RAM for your workstation, you may prefer the Dell M3800 with an excellent touchscreen monitor (3200×1800 resolution) and only 2 kg of weight. For a stationary workstation (or rather server), good choices are the Dell Precision T7600 or T7610 or the HP Z820 workstation. Any of these workstations (it will cost you!) can support up to 256GB RAM, up to 16 cores (or even 24 in the case of the HP Z820), multiple high-capacity hard disks and SSDs, excellent Video Controllers and multiple monitors (4 or even 6!). Here is an example of the backplane of the HP Z820 workstation:

HP-z820

I wish the visitors of this blog Happy Holidays and good luck with their DV Lab Shopping!

Data Vikings from Sweden, DV Motherland

In the past Vikings discovered America and conquered or colonized parts of England, Russia, Ireland, Scotland, even Southern Italy and Iceland… But in the 21st century (as far as this blog is concerned) Sweden became the Motherland of Data Visualization:

4SwedishDVVendorsLogos

Let’s start with the most famous Data Viking and best-known Storyteller in the Data Visualization field: Prof. Hans Rosling of Karolinska Institutet, chairman of the Gapminder Foundation (in Stockholm). Gapminder’s team invented the popular and useful 6-dimensional Motion Chart and developed Trendalyzer, which was bought by Google in 2007; see it here: https://developers.google.com/chart/interactive/docs/gallery/motionchart . A recent example of Prof. Rosling’s Storytelling you can see here:

In Stockholm you can find another Data Visualization Innovator: Panopticon, a leader in Complex Event Processing and real-time Visual Analytics. Among other innovations, here is an example of Panopticon’s invention (by its senior developer Hannes Reijner), the Horizon Chart; see a sample here:

HorizonGraph

and short video about it here:

In 2012 Panopticon posted 112% Year-Over-Year revenue growth (comparable with Tableau). In 2013 Datawatch bought Panopticon for $31.4M (the all-stock deal closed by the end of September 2013), and I assume it will try to move some R&D from Sweden to Chelmsford, MA.

In Göteborg/Gothenburg you can find the R&D office of another DV Leader, Spotfire, with 60+ Data Vikings. In 2007 TIBCO bought Spotfire for $195M, but even now in 2013 it is unable to move R&D to the USA. So now Spotfire actually has 3+ main offices: TIBCO Corporate Headquarters in California, Spotfire Headquarters in Somerville, MA (an estimated 15% of the Spotfire workforce) and the main R&D office in Sweden. In addition, lately TIBCO chose the strategy to buy rather than build new features; for example, just in 2013 they added the following new companies to the Spotfire portfolio, and as a result they have an even more distributed R&D team now:

  • Extended Results (PushBI) in Redmond, WA
  • MAPORAMA in Paris, France
  • StreamBase Systems, Inc. in Waltham, MA

As a result, despite the fact that Spotfire 6 is the most mature Data Visualization platform on the market, people in TIBCO Corporate Headquarters run the risk of not having enough knowledge of their own major Intellectual Property.

In southern Sweden, in Lund, we can find the Swedish Headquarters of the major DV Leader, Qliktech, which occupies almost half of the Data Visualization market in terms of sales. At least 140 Data Vikings are located in Lund and maybe another 200 elsewhere in Sweden. Qliktech’s Data Vikings are major innovators, with features like the fastest in-memory Data Engine, the most natural Visual Drill-down and the Associative Query Language, to name a few. This also presents a major problem for Qliktech, because they have their Headquarters in Radnor, PA (where an estimated 150+ employees work, which is less than 10% of Qliktech’s workforce!), the main marketing, sales and support office in Newton, MA (estimate: less than 5% of the workforce) and most R&D in Lund (estimate: at least 10% of the workforce).

This means that almost 500 technically advanced Data Visualization experts (engineers, developers, architects etc., which is at least 23% of the total Qliktech+Spotfire workforce) are still in Sweden. A simple observation of Tableau’s TCC13 conference in September 2013 shows that Tableau’s top managers and officers know their product deeper and more intimately than their counterparts at Qliktech and Spotfire. That is very easy to explain: 650+ of Tableau’s employees (almost 65% of its workforce, including most developers, managers and officers) work in the same Main HQ office in Seattle, WA, and they obviously talk to each other in person and often!

My humble advice to Qliktech, Spotfire and Datawatch is simple: gradually relocate as many Data Vikings as possible from Sweden to the appropriate headquarters in the USA, or find and hire local American equivalents of those Swedish geniuses…

As background for this advice, please consider this information (updated on 11/17/13): the statistics of job openings clearly show that all 3 DV Leaders keep doing (by inertia) what they did in the past, with the only difference that it worked recently for Tableau and does not work for Qliktech and Spotfire. Here are specific examples:

  • Tableau has 176 job openings (much more than Qlikview (only 80) and Spotfire (only 18) combined)!

  • 97 (55%) of Tableau’s openings are in Seattle, and more than half of Tableau’s openings are engineering and technical positions!

  • Qliktech has 17 (21%) positions open in Lund, only 10 (12%) in Radnor and 4 (5%) in Newton, MA. Only 11 (14%, 9 times less than at Tableau in absolute numbers) of Qliktech’s openings are engineering and technical.

  • Spotfire has only 18 openings (1 in Göteborg, 5 in CA, 4 in MA) and only 4 of Spotfire’s positions (out of 18, that is 22%) are engineering or technical.
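The percentages in the list above are simple shares of each vendor's total openings. A minimal Python sketch that recomputes them from the counts quoted above (the helper name share is mine, for illustration):

```python
def share(part, total):
    """Percentage of `part` in `total`, rounded to a whole percent."""
    return round(100 * part / total)

tableau_total, qliktech_total, spotfire_total = 176, 80, 18

shares = {
    "Tableau openings in Seattle": share(97, tableau_total),
    "Qliktech openings in Lund": share(17, qliktech_total),
    "Qliktech openings in Radnor": share(10, qliktech_total),
    "Qliktech openings in Newton": share(4, qliktech_total),
    "Qliktech engineering/technical": share(11, qliktech_total),
    "Spotfire engineering/technical": share(4, spotfire_total),
}

for label, percent in shares.items():
    print(f"{label}: {percent}%")

# Tableau alone has more openings than Qliktech and Spotfire combined.
print(tableau_total > qliktech_total + spotfire_total)
```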

These statistics clearly show that neither Qliktech nor TIBCO sees the wrong pattern and the huge problem here; that can be a reason for disruption in the future, and the gradual relocation of Data Vikings is the only way to prevent the danger… And of course, if you can afford to find and hire equal talents at the USA Headquarters, then by all means keep the geniuses in Sweden without relocation, which is half-similar to what Tableau does (HALF because Tableau historically did not need to maintain a significant R&D office outside of the USA)!