Big Data, Bad Math: Gartner Consults Its Crystal Ball

Big Data is a marketing campaign. Uttered from the lips of technology companies, including analyst organizations such as Gartner, the term Big Data remains ill-defined. This is intentional. It allows them to claim just about anything they want because their claims can’t be fact-checked if you don’t actually know what Big Data is.

Generally speaking, technology companies use the term Big Data to refer to greater volumes and new sources of data. This, however, is not a new thing. Since the advent of computers, each year we’ve accumulated more data and new data sources. Data didn’t suddenly become big. Big Data is just more of the same, but it is celebrated by technology vendors, analyst groups, and thought leaders as a qualitative break from the past—the newest techno-panacea that everyone must invest in or be left behind. Claims of Big Data’s effects on the world are growing ever larger and more absurd.

In his keynote presentation at Gartner’s IT Expo last week in Orlando, SVP and Global Head of Research Peter Sondergaard proclaimed that by the year 2015 a total of 4.4 million jobs will be created worldwide to support Big Data. Not only that, but every new Big Data role in the U.S. (1.9 million by 2015) will create jobs for three more people outside of IT. What does this actually mean? What constitutes a Big Data role? Because the term is so vaguely defined, Gartner can claim that any new job in an IT department or for people who work elsewhere with data in any way is in fact a new Big Data position. This is pure fantasy.

And how did Gartner come up with the 4.4 million job figure? My guess is that, after a long night of drinking, they gathered around the Ouija board and let the spirits (that is, their own drunken imaginations) lead them to the answer.

Notice the irony. Here is an organization of industry analysts talking about analytical technology that is engaging in analytical nonsense. No qualified data analyst would make such absurd and groundless predictions. Either the so-called analysts at Gartner have never been trained in data analysis or they are fabricating predictions that serve their own financial interests. Most likely, it’s both. CIOs who buy into these prognostications are either naïve or are, like Gartner, motivated by self-interest. After all, chasing the latest technology is what keeps CIOs employed.

Organizations all over the world rely on groups such as Gartner to guide their IT investments. Are they getting objective and reliable advice? Far from it. Gartner has no incentive to discourage organizations from investing in IT. They make their money by keeping us convinced that we can’t live without the latest technologies, regardless of whether they’re actually needed or actually work. The truth is, analyst organizations such as Gartner are in bed with the very technology vendors whose work they supposedly monitor and critique. They’re having a wild orgy in that bed, rolling in cash, but it is only the end users who are getting screwed. Essentially, Gartner and the like operate as extensions of technology company marketing departments. Gartner is creating demand for its clients’ products and services (yes, the very technology companies that these analyst organizations monitor—supposedly in an objective manner—are their clients, who pay dearly for their support). These products and services aren’t usually needed, they are often ineffective, and in the case of Big Data, they’re ephemeral. Have you noticed that every business intelligence vendor has suddenly become the leading Big Data company without changing anything that they do? Just slap a new name on business as usual and you can get the world to line up at your door.

Look past the marketing hype for analytical (data sensemaking) products that actually work. It doesn’t matter whether they’re called Big Data, analytics, or just plain data analysis tools. What matters is that they help you find the signals that exist in the midst of all that noise in your data and make it possible for you to understand those signals and use that understanding to work smarter than before. Demand that vendors show you how their tools can be used to glean real value from your own data. Ignore their claims and demand evidence. Make them show you how you can make better decisions using their products and services. Unless they can provide that, you don’t need what they’re selling.

Take care,

34 Comments on “Big Data, Bad Math: Gartner Consults Its Crystal Ball”


By Chuck Hooper. October 31st, 2012 at 5:48 pm

Well said, Stephen!

By Jason. October 31st, 2012 at 7:28 pm

> They’re having a wild orgy in that bed, rolling in cash, but it is only the end users who are getting screwed

Love it! I too wonder at the supposed job numbers from the flurry of these types of articles. Hyperbolic to say the least. I remember at high school being told there wouldn’t be enough IT workers until 2025 at least. What rubbish. All these predictions really don’t take into account how wrong the last gasping predictions were and why: technology changes and shifts in efficiency reduce the numbers of people actually needed, and of course, as you point out, there is the question of how well it all actually delivers the benefits it claims to. As always, no silver bullet in any of this stuff without the processes and business shift in understanding how to use it.

By Florian Eiden. November 1st, 2012 at 6:29 am

Hi Stephen,

Inspired by your ideas about big data, and the fantastic speech I attended that you delivered in France this month, I made these quick schemas on the theme of Big Data (I know they are in French, but I know you are able to understand the language when there are pictures ;)): http://fleid.net/2012/10/30/big-data-revolution-it-ou-effet-de-mode-marketing/

Thanks for speaking the truth like you do!

Regards,
Florian

By Andrew. November 1st, 2012 at 10:02 am

“What constitutes a Big Data role?”

Easy: having the words “big” and “data” together somewhere in the job title or description. Of course, the actual responsibilities and skill-sets will be no different than any other data-related role.

“Have you noticed that every business intelligence vendor has suddenly become the leading Big Data company without changing anything that they do?”

What annoys me is that these vendors aren’t just selling old solutions under new names, they’re also reinventing the problems with new names. And of course the names need to sound ominous. BIG Data (*crack of thunder*) - who wouldn’t be afraid to tackle that on their own?

As long as businesses trust vendors to identify their problems for them, the vendors will keep identifying problems for which they are always the leading experts in solving.

By Seth Grimes. November 2nd, 2012 at 11:36 am

Steve, business intelligence vendors have changed what they do: They’ve put in place Hadoop co-existence strategies. These strategies are their justification for characterizing themselves as Big Data solution providers. There is some validity to this relabeling, albeit very limited since Hadoop is a one-trick pony in a complex world.

Seth

By Stephen Few. November 2nd, 2012 at 12:14 pm

Seth,

How is Hadoop a qualitative departure from the past? Doesn’t it just do a little better what we’ve been doing for years (distributed systems, integration of structured and unstructured data, etc.)?

By Martin White. November 3rd, 2012 at 7:28 am

Steve

I’m right with you http://www.intranetfocus.com/archives/915 though perhaps not as trenchant. I worked on a project for the European Commission last year to quantify the market for enterprise search and that gave me a decent handle on the number of companies that are in the market for a ‘big data’ solution. Very few who are not already in it, such as telcos, financial services and global IT companies.

The key parameters (Gartner style) are volume, velocity, variety and value, and IBM add in veracity. In the companies I work for (mainly large multinationals) the CIO has no idea of volume of data, just storage. They will have no good sense of velocity, no idea of variety and no one has yet come up with a value calculation other than Oracle who recently stated that poor management of data took 14% off revenues. Well, they would!

In addition structured data on its own makes no sense. It needs information (unstructured) to provide context. A recent AIIM survey showed that 60% of IT managers realised that but only 2% were able to actually manage the integration.

Thanks for your bold and accurate appreciation of the situation. You’ve encouraged me to be bolder in future.

Martin

By Ajay Ohri. November 4th, 2012 at 10:58 am

The analyst-vendor buzz-creating system in business technology seems corrupted and ripely overdue for disruption. Bravo Mr Few, few people have the courage to flag bullshit so openly, too busy selling their next webinar. Respect!

By Charlie Hull. November 5th, 2012 at 4:02 am

Agreed, in spades. The Big Data marketing bandwagon has rolled over the enterprise search market as well, with the term being used by many to rebrand what has never been an easy thing to sell: especially now that you can solve 99% of search problems with open source software such as Apache Lucene/Solr. Lots of the big search companies have now been sold, often at massive overvaluations - and the acquirers are now saying it’s all part of a Big Data strategy (it’s not that they stupidly believed the marketing hype, oh no!). I’m also with you on the analyst firms: few know anything about what they are writing, they’re great at producing pointless and misleading infographics, and it’s particularly annoying being approached by them every few months for pay-to-play speaking opportunities (usually wildly expensive). They don’t get open source at all (because no-one is paying them to market it) so are looking increasingly irrelevant in our sector, luckily.

By Kevin Neal. November 5th, 2012 at 4:24 pm

Quite frankly I’m fond of the term ‘Big Data’. And while I agree that it is somewhat vague and generic, the fact of the matter is that it helps to define a problem in some way that we can start to have an intelligent discussion about solutions. I don’t think anyone can dispute that Volume is absolutely growing like never before, so this is a new dynamic. Also, I don’t think any reasonable person would argue that the Variety of information is like never seen before. And, last, but not least, the Velocity is also very obviously a challenge to harness. So put this all together and it paints a picture of the problem at a high-level; without specifics, admittedly.

What’s really exciting, and somewhat new, is that there are reasonable tools to help real people, in all sorts of industries, to realistically solve these problems instead of just huge enterprise organizations like IBM or Stanford University, for example, that have gigantic budgets and massive resources. Even the smallest of companies can rent Amazon AWS to do an incredible amount of data crunching that they didn’t have access to just a few short years ago. Or open source software like Splunk gives everyone a chance to build nice solutions without a significant up-front investment. Additionally, the incredible emergence of Web Services to gather and integrate data from all different sources such as social media feeds makes the world of Big Data extremely interesting to people who wish to exploit this information.

While I understand your, and others, distaste for the term I think it does a sufficient job to describe the opportunity at the 10,000 foot level. Just my two cents.

By Stephen Few. November 5th, 2012 at 4:59 pm

Kevin,

Given the fact that you personally benefit from the hype regarding Big Data, I’m not surprised that you are fond of the term. Data volumes are growing precisely as they have before: exponentially. Nothing has changed. At the risk of being thought unreasonable, new sources of data have increased in exactly the same way. The same is true of velocity. What we’re seeing is nothing but an incremental extension of the past. With few exceptions, the vendors that promote Big Data are doing nothing qualitatively different from the past and most of what they’re doing is of little worth.

As someone who is quite familiar with the tools that can be used to make sense of and derive value from data, I can say with some authority that we are seeing very slow progress in the tools and that most that have been developed recently are juvenile and in many respects retrogressive. Judged on the basis of its analytical and data visualization capabilities, Splunk is an example of a product that was developed without the expertise that was needed to design an effective tool. Its data visualization capabilities emulate some of the worst products available and encourage some of the worst practices. What can a small company accomplish using Amazon AWS that it could not have accomplished before the advent of the term Big Data with the right skills and a decent tool?

Essentially, as I’ve said many times before, chasing Big Data misses the point. Data, big or small, can only be handled by using our brains, prepared with the proper sensemaking skills. In the realm of data, bigger, faster, and greater variety is a curse, not a blessing, if you haven’t already learned how to use the data that you have. Most people who need data are getting buried, not being helped. Making the pile bigger is only increasing the burden.

This particular blog post focused on Gartner’s erroneous prediction. Did you have an opinion of that?

By Kevin Neal. November 5th, 2012 at 6:08 pm

Stephen,

My point about software such as Splunk is that it enables people to learn, maybe inaccurately as you point out, some of the simple concepts of data aggregation and presentation. Having software such as this is a good thing, in my opinion, for the point of learning general concepts. Also, being open source means that the community can make it better if they wish. Just like Alfresco and many other solutions.

For Amazon AWS, my point is that this is an example of a resource that small companies can utilize to gather volume, variety and velocity, so I would say that in some way AWS can help a small company such as Fliptop, for example (and I just picked one case study at random to illustrate this point), to achieve the following:

“AWS allows the company to scale to hundreds of instances to process massive lookup jobs without the corresponding capital expenditure of a traditional infrastructure. “In addition,” says Chiao, “the built-in redundancy, recovery, and monitoring features of services like Amazon RDS allow our development team to run our production infrastructure without any dedicated operations resources.”

As for the Gartner prediction, it’s not quite 2015 yet so we might be a bit premature. I agree that with the term being so vague they can find a creative way to justify their prediction but I have personally witnessed more of my industry contacts changing job titles to something with “analytics” or “data specialist” or something similar. Also, it seems like demand for these types of positions is increasing in my unscientific research. Also, the pay for these types of people seems to be increasing so I am personally interested and not necessarily benefiting from the hype, yet.

By Stephen Few. November 5th, 2012 at 10:17 pm

Kevin,

Given the fact that Splunk promotes bad data analysis and visualization practices, how is it a good thing? The learning that it promotes will do harm. Splunk might do some things well, but it doesn’t do a good job of supporting data sensemaking. Even though open source software may be improved, as you point out, it often isn’t. To date, no open source BI products that I’ve seen support data visualization effectively, which is sad. (Note: R supports data visualization well for some purposes, but I don’t count it as a BI product.)

Regarding services such as Amazon AWS, few small companies should be concerned with massive volumes of data coming at them at ultra-fast speeds or with new sources of data. Concerning themselves with this will distract from what they mostly need, which is to use the data that they already have more effectively. The fact that small companies can run their systems on the Cloud if they choose to is indeed useful, but this has nothing to do with so-called Big Data.

It is not too soon to call Gartner’s prediction what it is: pure nonsense, without substance. If anyone from Gartner would like to clarify what constitutes a Big Data job, and then describe the statistical model that they used to come up with their figures, I’ll gladly provide a fresh critique based on that information.

The fact that people are changing their titles to include terms such as “analytics”, “data specialist”, “data scientist”, and the like indicates a growing interest in the potential value of data. Unfortunately, changing their titles does not magically endow them with the skills that are needed to do the work. One of my objections to Big Data is that it is just another buzzword that is encouraging organizations to invest in technologies that they rarely need, most of which don’t work very well. Organizations desperately need to develop the skills that are needed to make sense of data and then find tools that effectively support those skills. Those skills existed long before the term Big Data hit the scene.

By Charlie Hull. November 6th, 2012 at 3:57 am

I was wondering who would be first to mention Cloud :)

Kevin, according to http://www.aiim.org/community/blogs/expert/Big-Data-3d-Big-Problems-3d-Big-Opportunity you ‘detest’ these ‘over-used’ and ‘faddish’ terms such as ‘Big Data’ and ‘Cloud’, yet above you say you are ‘fond’ of them. Which is it? I was also amused by your suggestion that a common way to capture ‘indexes’ is to autogenerate a searchable PDF file. Really? I can’t honestly think of a less efficient way to repurpose data.

By Kevin Neal. November 6th, 2012 at 10:48 am

Charlie,

I’m innocent of being first to use ‘that word’ :-).

In the AIIM blog post I said I ‘GENERALLY AVOID’ but I ‘conceded’ in this instance so I can still detest ‘faddish’ terminology yet have the right to use such terms, on occasion, when I believe they serve some purpose. This was my exact quote below:

From a personal standpoint, I generally avoid using over-used, or faddish, terms such as “Social Media”, “Cloud Computing” or “Big Data” but these terms do serve a purpose. Therefore, for this blog post I will concede to my detest for these terms and I would like to share some thoughts on “Big Data”…

Also, I agree with you that Searchable PDFs are a rather ineffective way to ‘index’ information, but they are low-hanging fruit and better than image-only files. The purpose of the AIIM community is to educate on the various methods of information capture and organization, therefore Searchable PDF was one of many options. My point was that relevant indexes with Data Capture are, of course, much better, but for many reasons administrators don’t use the tools or enforce metadata on content. For example, asking users to contribute keywords on a small mobile touch screen is asking a lot of them, so storage repositories end up with JPG or PDF-only files with no context whatsoever. Creating Searchable PDF files is one of the most common ways to start gaining control over content chaos. Plus, when you can give a lot of content/indexes to semantic engines, they can start to make logical connections, so Searchable PDF has uses other than just finding a particular file via keyword.

By Doug Laney. November 6th, 2012 at 1:53 pm

Having not been involved in Gartner’s forecasting process, nor having purview into 300,000 client interactions throughout the year, it’s no wonder Stephen is odiously dismissive.

“Big Data” is the #1 term searched on Gartner.com and one of the top inquiry topics among our clients (the vast majority of which are not technology vendors).

True, Big Data is not new. I first defined the “3Vs” (volume, velocity, variety), now commonly used by the industry to characterize Big Data, at Gartner over 12 years ago (http://goo.gl/wH3qG). And SGI Chief Scientist John Mashey actually coined the term several years before that.

As for the jobs forecast, Gartner has a disciplined forecasting methodology that brought together our industry experts, human capital experts, and forecasting specialists. It leverages existing data on skill breakdowns, related technology sales, end-user survey data, service trends for similar technology bubbles and country/region/industry specific data.

The range of job types considered in the forecast include those in computing, HW, storage, DBMS, enterprise content mgt, data prep, data quality, data integration, app dev, governance, security, analytics, etc. Note that our forecast clearly indicates that these will not all be new jobs, but also many existing jobs that will require new Big Data-related skills.

–Doug Laney, VP Research, Gartner, @doug_laney

By Stephen Few. November 6th, 2012 at 6:31 pm

Doug,

Thanks for joining the discussion. Perhaps your involvement will help us get to the heart of the matter.

Would familiarity with Gartner’s “300,000 client interactions throughout the year” resolve my questions and concerns about Big Data? This big number, however impressive, doesn’t actually address the issue at hand, does it? Being involved in Gartner’s forecasting process, now that’s a different matter. I would love to see this forecasting process firsthand to see if, contrary to my belief, it is actually more scientific and statistically-robust than reading the entrails of a chicken.

Evidenced by the amount of Big Data related traffic that you get on your website, Gartner is no doubt making a great deal of money by promoting it. I’m well aware of the considerable interest that people have in Big Data today, and it concerns me. Will this interest produce anything useful for anyone besides organizations such as yours that derive revenues from it? Is Big Data really different from data in the past? Does it represent something new, a qualitative break from the past, or is it merely the latest marketing campaign to promote information technologies, which make big promises but seldom deliver?

Even back when you defined the 3Vs (volume, velocity, and variety) 12 years ago, these characteristics that supposedly define Big Data were not new. What you call Big Data is a continuation of the past. It is the same exponential growth in data volume, velocity, and variety that’s been occurring since the advent of computers.

Your explanation of the jobs forecast boils down to the following statement: “Trust us, we’re experts. We know what we’re doing.” With all due respect, having observed how these forecasts are often fabricated by analyst organizations, I don’t trust your forecast or the process that created it. When I questioned Forrester Research’s recent analysis of advanced data visualization products, they gave me the same basic explanation of their methods as you have, but it was clear that they cooked the results to favor their clients. When I asked who the so-called data visualization experts were on whom they claimed to rely for guidance, they couldn’t produce a single name. What they presented and sold as the product of expertise was fabricated from a position of ignorance and self-interest. Why should I believe that your process was more trustworthy than theirs? If your forecast is based on clear definitions, expert opinions, reliable data, and a valid statistical model, it is in your interest to open the hood and show us. If it turns out that I’m wrong and your forecast actually has merit, I’ll issue an apology and set the record straight.

Just as the public is growing weary and wary of political media pundits who exude confidence in careless speculation and demonstrate the authority of their opinions by shouting, consumers of information technology are beginning to question the hollow promises of technological nirvana that Gartner has been dangling in front of them for years. In the field of business intelligence (BI), the fundamental promise of better decisions derived from more intelligent uses of data is renewed every few years under a different name (Big Data being the latest), but little changes and the promise remains elusive.

Based on your explanation, the job types that Gartner’s forecast counted as Big Data positions — “computing, HW, storage, DBMS, enterprise content mgt, data prep, data quality, data integration, app dev, governance, security, analytics, etc.” — appear to cover the entire IT department. That’s convenient for you, but not particularly useful as a forecast. How are these IT jobs different from past positions in computing, HW, etc.? What causes them to suddenly qualify as Big Data positions? By talking about all of these Big Data jobs of the future, you’re suggesting that something new is going on that organizations must invest in and individuals must change their job titles to reflect, when in fact these technologies and jobs aren’t new at all. Isn’t this a bit misleading? Isn’t this a bit self-serving?

What are these “new Big Data-related skills” that people will need? In truth, the data sensemaking skills that people need to increase knowledge and support better decisions (that is, to use data more effectively, whether it be big, little, or middling) have been around for many years. While it is sadly true that these skills are still in short supply, promoters of Big Data are focused on technologies, not on the human skills on which real solutions will rely. The tools that most so-called Big Data vendors are producing do little to support and augment these human skills, and in many cases they undermine them. You folks at Gartner would be doing the world a great favor if you would focus much more attention on the human skills that technologies can at best support and augment, but never replace. Our greatest lack today in respect to data is not insufficient volume, velocity, or variety, but insufficient skill to make use of the data that we already have. Until this is remedied, Big Data will be the source of oppression, much like Big Brother, rather than the source of understanding and real progress that we need.

By Charlie Hull. November 9th, 2012 at 4:04 am

@Doug “Having not been involved in Gartner’s forecasting process, nor having purview into 300,000 client interactions throughout the year” - so you’re basically saying that because Stephen doesn’t work for Gartner or a similar large analyst he is in no position to criticise?

By Fábio Yuasa Niizu. November 11th, 2012 at 1:59 pm

Stephen, maybe this new name does not represent a new concept never thought of or designed before, but it is undeniable that there is a new technology bringing other parameters for data storage and processing, with more velocity, volume capacity and data variety processing, and it is part of the analysts’ work to use this technology to support business teams by taking the information value and ensuring its veracity.
Big Data is not a new concept, but a combination of existing concepts based on a new technology which breaks some paradigms and creates a new baseline for analytics applications.

By Stephen Few. November 11th, 2012 at 2:10 pm

Fabio,

What is this new technology that “breaks some paradigms and creates a new baseline for analytics applications”? What are the paradigms that it breaks? What is the new baseline that it sets?

By Fábio Yuasa Niizu. November 11th, 2012 at 2:47 pm

Stephen, as concepts, we have discussed non-structured data, infinite data storage and real-time analytics since the beginning of business intelligence, but I have never seen a real application of them using the “old” technologies.
With these new technologies (NoSQL, MapReduce, in-memory databases and others) I began to see applications that are able to apply analytics to real-time video and detect an anomaly in front of a house (like a man climbing the wall to jump in) because that image in the film does not fit the day-to-day pattern of that house.
And these applications are based on a new technology, which I think is more than simply an evolution of ETL and database tools. As I said, I agree this is not a whole new concept, but in technology nowadays, I’m not sure anything can be considered so. And going forward, when we put together Big Data, mobile, social networks and cloud, there will appear a set of new “marketing names” which will bring new real applications for business areas.

By Stephen Few. November 11th, 2012 at 3:48 pm

Fabio,

In fact, real applications have included unstructured data and real-time analytics (if by this, you’re referring to in-memory databases) for many years. Two BI tools that I’m familiar with–Spotfire and Qlikview–have used in-memory database engines for many years. Tableau introduced theirs several versions ago. I was including unstructured data in applications over 20 years ago. I’m not familiar with “infinite data storage,” but I can assure you that no data storage is infinite.

It is true that the capabilities of these technologies continue to progress. Your example involving real-time video, for instance, is something that technologies have only enabled in the last few years, when processing power finally made it possible.

So here’s the question: if we’ve been imagining these capabilities for many years and technologies have eventually enabled them as a result of the same exponential growth that began with the advent of the computer, is it accurate to speak of these capabilities as a qualitative departure from the past or a quantitative continuation of the past?

What we’re calling Big Data, mobile computing, social networking, and the cloud have all evolved as continuations of the past, not discontinuous departures from it. What we’re calling Big Data today we called decision support over 30 years ago, data warehousing over 20 years ago, business intelligence over 10 years ago, and analytics over 5 years ago. The first computer that I owned was the original portable computer, called the Osborne. It was an early example of mobile computing. I began using email about 29 years ago. It was an early example of social networking. When I began working for a large bank, about 28 years ago, our major systems were housed and run at a facility in Fresno, California, to which we connected using T1 telecom lines. It was an early pre-Internet example of what we now call the cloud.

I applaud the fact that technologies continue to improve and extend our capabilities. We need good technologies to solve the problems that we face today. However, I don’t applaud poorly designed technologies, the notion that technologies rather than people (through the assistance of technologies) will solve our problems, or the hype of marketing programs that keep organizations chasing the latest technologies rather than developing the skills that are required to use data effectively, independent of the tools that they use.

Big Data is just a new name for old ideas and for technologies that provide an incremental improvement over the past. For fun, take the time to look at articles, papers, and presentations by business intelligence thought leaders today and compare them to the work of these same people five or ten years ago. You will see, with few exceptions, that they say the same thing with a few new terms mixed in. They are stuck in the past. To get unstuck, they must become more intimately familiar with the real needs of people who struggle to make sense of data and also with the strengths and weaknesses of human cognitive and perceptual systems. Only then will they help to usher in the true “information age” that’s needed.

By Bill. November 12th, 2012 at 5:36 am

I can’t thank you enough, Stephen. Please stick to your guns! It’s the truth, and we the younger Zingers need to hear these truths, with constant reminding at every opportunity. Here is one grateful cruncher erring on the side of caution…

By digdeep. November 12th, 2012 at 2:49 pm

Elephants like to mate with other elephants. Whether Gartner knows it or not, there is an inherent self-selection bias in its audience, and the theatre (analysis) of it all is part of the mating ritual. The job of vendors is to eat up ever-expanding Capex/Opex budgets; all they are doing is creating an avenue for budgets to be spent in full rather than taken away from departments. For this, I say don’t hate the player, hate the game.

By Stephen Few. November 12th, 2012 at 4:20 pm

DigDeep,

Those who willingly exploit this game deserve contempt as well. Organizations such as Gartner have created this dysfunctional game. One cannot hate the game without also hating those who created it.

By digdeep. November 12th, 2012 at 5:22 pm

As dysfunctional as that ecosystem is, I no longer feel sorry for customers who should know better than to jump on board the gravy train. Those who exploit the game are quick to share their numerator of success without highlighting just how big the denominator of failure really is. As much as I share your distaste for those who exploit the game, companies that put technology before people and buy into the BS do not have my sympathy.

By Stephen Few. November 12th, 2012 at 5:32 pm

DigDeep,

I understand your lack of sympathy for organizations that buy into the hype, but their error is rooted in naivete, whereas the deceit and exploitation of vendors and organizations such as Gartner that support them is intentional. I don’t excuse the naivete entirely, but I do have some sympathy for organizations that are lured into bad purchases. I especially have sympathy for the people in those organizations who had no involvement with the purchase decision, but are forced to live day after day with the bad consequences.

By Fábio Yuasa Niizu. November 12th, 2012 at 5:54 pm

Stephen,

I think we don’t really disagree in our opinions; it is just a matter of how high the step must be to call a technology an evolution or a new technology. I agree that sometimes new names are created to allow reinvestment in old IT concepts; you just need to be careful to understand the height of each step, whether or not it is given a new name.

The main point here is that technology by itself does not solve any business problem; what matters most is having human brains thinking about how to combine all these names to generate useful applications.

By Stephen Few. November 12th, 2012 at 6:05 pm

Fábio,

I share your opinion that we mostly agree.

Regarding technology, let’s keep in mind that the term Big Data refers primarily to data, not technology. It suggests that something about data has significantly changed when it hasn’t. Technologies will continue to evolve, sometimes in great leaps, to keep up with the growth of data, but these are just new ways of achieving what we’ve been trying to do with data all along.

By digdeep. November 12th, 2012 at 6:41 pm

Regarding the last point in your reply to me, I agree with that sentiment 100%; it is the reason I left consulting. Having to jam a square peg into a round hole over and over until a locked-in customer caves in is truly an awful experience, when inefficient operations are the byproduct of ruthless tech sales and naive customers. Then again, I have been called a purist :)

By Randy. December 20th, 2012 at 7:29 pm

“Big Data” is of course not the only term conjured up for marketing purposes. Years ago, I was at a Gartner conference and Tim O’Reilly gave a presentation on something new: “Web 2.0”. Everyone seemed to be nodding their heads and saying how insightful it was. I thought I was the dumbest guy in the room for not getting it. Maybe that’s why I’m not rich like Tim.

By Selwyn. December 21st, 2012 at 9:22 am

“Big Data is just a new name for old ideas and for technologies that provide an incremental improvement over the past”

“Have you noticed that every business intelligence vendor has suddenly become the leading Big Data company without changing anything that they do?”

Loud and clear. Make noise and you will get the attention of CIOs and senior executives. Every single white paper on Big Data is nothing more than the BI we had earlier with the latest jargon. I attended a presentation by a senior guy from the so-called Big Data solution provider “Mu-Sigma”. They are after every single financial institution with deep pockets to sponsor projects that carry a disclaimer at the beginning, “Don’t expect short-term benefits”, and in the long term we are dead!!

They talk about a cross-industry convergence concept. This means using a Yield Optimization technique for banks. I was thinking, hang on a minute. What?

It is really surprising how they convince CIOs and senior management with their research work. And we were listening to their sales pitch. I hope to see some well-written case studies on the failures of these Big Data projects in the future. That will send a strong message.

By Doug Laney. December 28th, 2012 at 1:24 pm

My point was that Gartner has a formal forecasting process that (however accurate) warrants a bit more respect than the “night of drinking around a Ouija board” that Stephen and his sour grapes afforded it. And no, Gartner doesn’t “generate revenue” from promoting IT concepts; rather, our clients subscribe to our service annually. If we happen to be addressing or introducing new and useful ideas/research, then yes, perhaps that’s how Gartner has grown to a $1B+ indispensable resource to 12,000 businesses and IT organizations around the world.

As to our forecasting process, I’m sorry Stephen doesn’t understand the concept of intellectual property; it is proprietary. But hey, criticizing something you don’t understand is low-hanging fruit for bloggers. (I’m sure I’m guilty of it as well.)

And regarding Big Data, yes some use it as a marketing term. The Gartner definition is specific though as to the types of challenges, use cases, strategies and threshold that’s crossed from “just data” — for what it’s worth.

Cheers and a happy, healthy, successful 2013 to all,
Doug Laney, VP Research, Gartner

By Stephen Few. December 29th, 2012 at 1:45 pm

Hello Doug,

I had given up hope that you would respond. Thanks for picking up the thread of this discussion. To remind readers what this discussion is about, here’s the claim by Gartner that I deemed nonsense: By the year 2015 a total of 4.4 million jobs will be created worldwide to support Big Data.

So, I have a prediction of my own. By the year 2015 a total of 6.3 million jobs will be created worldwide to support Small Data. “On what is this based?” you ask. Oh, I can’t reveal anything about my forecasting process — intellectual property and all that — but you can rest assured that it is viable and trustworthy. Could I have nearly 100,000 users of my services (books, articles, courses, etc.) if my methods weren’t trustworthy?

This prediction is every bit as valid as Gartner’s, even though it was fabricated from thin air. Can you prove otherwise? You can’t, because my prediction isn’t verifiable. Why? Primarily because I haven’t defined what I mean by Small Data jobs. I could have crunched the numbers using the most sophisticated statistical model known to humankind and this prediction would still be nonsense, because the measure — Small Data jobs — hasn’t been defined.

In my work, I explain my claims, back them with evidence, and lay them open for scrutiny. I believe that you are hiding pseudo-research behind the veil of intellectual property concerns. Is it acceptable to charge your clients for expert forecasts that might be nothing more than wild guesses? Is there a wizard or a demented little man from Kansas behind the curtain? People of intelligence, especially data analysts, will not accept your predictions as reliable without inspection. The fact that your clients accept them reveals the sad state of analytical reasoning among the CIOs of the world. In time, this will change (a prediction based on hope, not on certainty).

You say that I criticize things that I don’t understand and suggest that I am nothing more than a self-aggrandizing “blogger.” To the contrary, as you well know, this is my field of expertise. I suspect that I understand what goes into effective analytics, including predictive models, as well as or perhaps better than you do. I can assure you that my criticism of Gartner’s prediction is not due to sour grapes. I do not envy your ability to make vaporous predictions. Instead, I have genuine, unadulterated disdain for predictions that are built on pseudo-analytics. I work in the field of quantitative data sensemaking and communication (i.e., business intelligence, analytics, data visualization, data science, and yes, what you call Big Data). I help organizations use data more effectively for decision-making. It is my job to expose pseudo-analytics like this prediction of yours for what they are: pure nonsense. When Gartner and other so-called analyst organizations spread blarney like this, it makes me angry, because it does harm.

As the VP of Research at Gartner, have you instituted a program to confirm the accuracy of your predictions? Have you reviewed past predictions to see if they were on target? How could you? A prediction that “by the year 2015 a total of 4.4 million jobs will be created to support Big Data” isn’t verifiable. You haven’t defined what you mean by a Big Data job as different from other IT jobs. I suspect that this omission was intentional. It is a sly trick that is also used by psychics when they do cold readings. If you invested more effort in this prediction than the nighttime carousing of drunken analysts playing with a Ouija board, as I described the process, you wasted your money, or more to the point, you wasted the money that your clients paid you for credible research.

Let’s consider your business model. You said: “Gartner doesn’t ‘generate revenue’ from promoting IT concepts — rather our clients subscribe to our service annually.” Your biggest clients — those that pay the most for your services — are the very companies that you are supposedly analyzing, monitoring, and reporting on. You support their marketing efforts by praising and making glowing predictions about the technologies that they sell. So yes, you do generate revenue from promoting IT concepts. You directly contribute to the marketing hype in ways that are designed to keep your clients happy. They pay you and you reward them by presenting their products in a good light.

Back to the general topic: Big Data. You said: “The Gartner definition is specific though as to the types of challenges, use cases, strategies and threshold that’s crossed from ‘just data’.” Excellent. Since I’ve scoured the literature and haven’t found anything of substance that separates so-called Big Data from just plain data, I’m eager to hear about these challenges, use cases, and strategies that are substantially new. Please elucidate. As you yourself pointed out, the characteristics that are generally used to describe Big Data — the 3 V’s (volume, velocity, and variety) that you wrote about long ago — are not new. When did we cross this threshold? When, since the advent of the computer, was data not growing at an exponential rate, moving at increasing speeds, and expanding in variety? If a threshold had been crossed, I would acknowledge it, do my best to describe it, and teach people how to work with data in the new ways that were required. As it turns out, however, what people need most to use data effectively isn’t new. Of course, data handling technologies continue to improve, but in fits and starts, much too slowly, and only rarely in game-changing ways. It is true that data continues to grow in volume, velocity, and variety, but this is growth, not an evolution into some new species. No threshold has been crossed.

What’s behind Big Data? The answer can be found by following the money. Now that would be a worthwhile research project for Gartner.

If anything that I’ve written here about Gartner or Big Data is not accurate, please say so, but back your responses with evidence. As the VP of Research at Gartner, especially one who specializes in information technology, if you can’t support your position with data, something is amiss.