Friday, September 27, 2013

The value of 'a topic on a page'

Some time ago, I became interested in summarising a topic on a page.  The interest sprang from some really good posters I found that summarised all the important stuff regarding a topic, e.g. Rational had a really good one on UML and another on RUP.

Around the same time, I found a template that Apple were using for creating 'posters' to summarise a topic on a page. I also came across the notion of 'poster sessions', where people put their posters on the walls of a room and others come in and read them to get a quick update on topics of interest.

Most recently, I have rekindled my interest in 'topic on a page' and find them a great way of communicating a lot of concepts quickly.  The digital equivalent seems to be a really good infographic or a really good Prezi presentation.

The only trouble is, a good poster is hard to produce.

I'm currently working on a 'Big Data on a page' poster and will post it back here when I have something to share.

In the meantime, any tips on how to create good posters, infographics and the like are most welcome :)

Wednesday, September 18, 2013

Microsoft knocking down barriers to BYOD

Apparently there have always been some legal issues with companies installing or providing access to Microsoft software on their employees' BYOD devices - see this Forrester article from 2012 ...

However, it looks like Microsoft have taken some good steps in knocking down the barriers with the introduction of Windows Intune - see 

I'm keen to see the barriers knocked down and this seems like it just might do it - now all we need is for the legal agreements of companies with Microsoft to 'catch up' ;)

Does anyone have any experience with Windows Intune yet? 

Tuesday, September 3, 2013

The Rosie Project

Last night, I went to another book launch. Actually, it was the first one I have ever attended, but I'm sure that for some people, and for Shearer's Bookstore on Norton St, Leichhardt, it was just another one of many.

The book launched was Graeme Simsion's "The Rosie Project". Having read a 'preview' of the book before it was officially published, I'm looking forward to starting the official book this weekend. Graeme tells me it's even better now, and I thought it was great before.

I won't spoil things by telling you too much about the book itself other than to say I think you will find it a "very good read", as did the Woman's Weekly.

I'd rather give you some of the highlights from the book launch itself which, unless you were one of the 50 or so people there, you probably missed :)

Graeme addressed the audience from a 'pulpit' above the floor of the book store, with the audience spread out in a number of directions, following the aisles of the book store that radiated out from the corner of the room above which Graeme was standing.

The first part of his address concerned how he'd become rich and given up his day job which apparently had happened about a month ago.

I've known Graeme for many years and I know he sets his goals very, very high. Even so, when he told the audience how much money it had taken to give up the day job, I think there was a hint that Graeme was slightly incredulous that it had all happened so quickly.

Many months before, I had been having drinks with Graeme and had contemplated putting $10,000 into the project, believing that it would succeed. I wish now I had wired the money the next day or taken out my cheque book there and then *sigh* ;)

The rest of the address concerned how the story had been inspired, how it was originally produced as a film script, and how it had been Graeme's project in his screenwriting course in Melbourne. He had repeated the subject because a prior version had not been ready, so he went back again to finish it.

I was mesmerized by the story, never having heard it before, all the time thinking that this was typical Graeme :) ... listening to the story I felt sorry for the juggernauts, they have nothing on Graeme :D

Monday, September 2, 2013

I want to be a Data Scientist

I saw a really interesting YouTube video today called "The Data Scientists Toolset".
It is a video of a panel discussion from a conference called Data Scientist Summit (note to self: I have really missed the boat if the others are already having summits… back in 2012! ;) )
I admit, I did not know anyone on the panel but after listening to them talk I believe they must all be experts of some note.
Some of the key points for me were:

  • The 3 big things you need as a Data Scientist
  • Value from Big Data = having Big Analytics
  • Run experiments ‘at scale’
  • Room for Everyone - Hadoop, NoSQL and “new SQL”, and
  • The ‘Desert Island challenge’

I’ll cover each in a bit more detail below.

The 3 big things you need as a Data Scientist

According to the experts on the panel, there are 3 big things that Data Scientists need to have:

  1. Domain skills and expertise,
  2. Great modelling (read statistics) skills, and
  3. Tech literacy with the Big data (and other) tools and technologies required.

To me, this is a great list of reasons for good collaboration between the Business and IT. Business professionals ideally have good Domain skills and experience. Vice versa, IT professionals typically have technology literacy.

The 'middle ground', Great modelling (statistics) skills, is the interesting one. Some people have this based on whatever they did in Uni and continued into their professional career.

It's more likely that Business professionals are going to have the right skills and experience, especially in business domains such as Economics, Finance, Science, Research, etc.

However, of the 3 required areas of skills/experience, this is the most likely to 'fall between the cracks' i.e. no-one has them.

I think this is (perhaps) the reason that higher education qualifications being offered by Universities around the world are so 'heavy' in statistics and maths.

Value from Big Data = having Big Analytics

I think this was a great point!

I see a lot of excitement (almost hysteria) about Big Data and how **cool** it is to be able to parse the Petabytes of log files and other Big Data out there but … where is the value?

Big Data is often associated with the 3 'qualifying' V's - Volume, Variety and Velocity.

I think it is a good idea to add 2 more 'quantifying' V's to the list - Veracity and Value.


Veracity to me examines the question of the 'validity' of the data source in terms of what people want to do with it.

One of the lessons I have learnt from just 'normal' Data warehouse and Business Intelligence/Analytics solutions is that just because there is data out there, it does not mean you should try to capture it and make use of it. You really need to ask yourself the question: is this data appropriate to my needs? Or, in a qualitative sense, how appropriate is the data? (Does it do part of the job?)


The 'flip side' is, Is there value in using this data? Does it help me tell the right story?
Big Data to me has a huge risk to be addressed - the GIGO (Garbage In, Garbage Out) principle means that people risk *Big Garbage*.

Run experiments ‘at scale’

Gone are the days of having to have small amounts of data to test your 'models' and to validate that they produce the 'right' results before trying them out on the 'real data' (usually Production or a copy of Production).

The panel stressed that Big Data tools and technologies allow you to operate 'at scale'.

Personally, I'm not sure about this one. I may not be a gun at statistics but I seem to remember that it does not take much data to provide a statistically valid model e.g. to predict the outcome of an election all you really need is a relatively small, but representative, sample from the population to have confidence in the results predicted… assuming that the rest of the population follow certain rules.

I think there's a difference between validating your models and running on full scale data.
Just because Big Data has 'resources to burn' I don't think people should lose sight of good modelling and testing.
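
To put a rough number on the election example, here is a minimal sketch of the textbook sample-size formula for estimating a proportion, n = z² · p(1 − p) / e². It is only an illustration: the function name is made up for this post, and it assumes a simple random sample from a large population and the usual normal approximation.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin_of_error, confidence=0.95, p=0.5):
    """Simple-random-sample size needed to estimate a proportion.

    Uses the normal-approximation formula n = z^2 * p * (1 - p) / e^2.
    p = 0.5 is the worst case (largest variance), so it is a safe default.
    """
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)


# Voters to poll for a +/- 3 point margin at 95% confidence.
n = sample_size_for_proportion(margin_of_error=0.03)
print(n)
```

With these inputs the answer is on the order of a thousand people, and the size of the population does not appear in the formula at all. That is the point above: a small, representative sample can validate a model, whatever you then choose to run 'at scale'.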

Room for everyone

I think it's 'reassuring' that Big Data is seen as a complementary technology and is best applied to suitable 'problems' (or classes of problem).

The panel made it clear they see a role for all of the data technologies: Big Data (e.g. Hadoop), NoSQL, and 'new SQL'.

One criterion they suggested for deciding which data technology was the best fit was whether the model of the data was 'to be discovered', partially agreed, or agreed (respectively).

Big Data technologies are typically associated with 'a model at use time' versus 'new SQL' where the modelling takes place first and then the data is poured in.
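
The difference can be sketched in a few lines of plain Python. This is my own toy illustration, not something shown by the panel: the field names, function names and sample records are all hypothetical, and lists stand in for a real database table and a real log store.

```python
import json

# Model first: the structure is agreed up front and every record
# must fit it before it is stored.
AGREED_FIELDS = ("user", "amount")  # hypothetical agreed model


def store_model_first(table, record):
    missing = [f for f in AGREED_FIELDS if f not in record]
    if missing:
        raise ValueError(f"record rejected, missing fields: {missing}")
    table.append({f: record[f] for f in AGREED_FIELDS})


# Model at use time: raw lines are kept exactly as they arrive and a
# model is applied only when a question is asked of the data.
def store_raw(log, line):
    log.append(line)


def total_amount(log):
    total = 0.0
    for line in log:
        event = json.loads(line)
        # The 'model' lives in the query: events with no amount count as 0.
        total += float(event.get("amount", 0))
    return total


table, log = [], []
store_model_first(table, {"user": "ann", "amount": 12.5})
store_raw(log, '{"user": "ann", "amount": 12.5}')
store_raw(log, '{"user": "bob", "clicked": "banner"}')  # no agreed shape needed
print(table, total_amount(log))
```

The model-first store refuses anything that does not match the agreed fields, while the raw log accepts everything and leaves the interpretation to whoever reads it later.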

The ‘Desert Island challenge’

When it was time to wrap up, the moderator for the expert panel session posed a question: If you were (to be) stranded on a desert island, what tool or technology would you take with you … and only one!
Interestingly, **all** of the panel members named a programming language technology: Java, C++, Python, etc.

I guess this speaks to the 'roots' of the panelists and the fact that the Big Data tools and technology, while all useful in their own right, are not quite there yet to be able to dislodge the versatility and power provided by a programming language.

I hope this 'commentary' was of interest. I would encourage you to view the YouTube video for yourself. I am sure you will get different stuff out of it than I did.

More on Big Data to come in future Blogs.