Exploratory Data Analysis Tool Features: What’s Needed and What Should Be Left Out
I’ve spent a great deal of time over the years, and especially during the last few months, thinking deeply about the role of technologies in exploratory data analysis. When we create technologies of all types, we should always think carefully about their effects. Typically, new technologies are created to solve particular problems or to satisfy particular needs, so we attempt to consider how well they will succeed in doing this. But this isn’t enough. We must also consider potential downsides—ways in which those technologies might cause harm. This is especially true of information technologies, and data sensemaking technologies in particular, but this is seldom done by the companies that make them. The prevailing attitude in our current technopoly is that information technologies are always inherently good—what possible harm could there be? Some attention is finally being given to the ways in which information can be misused, but this isn’t the only problem.
Whenever we hand over to computers tasks that we have always done ourselves, we run the risk of losing critical skills and settling for inferior results. Tasks that involve thinking strike at the core of humanity’s strength. We sit at the top of the evolutionary heap because of the unique abilities of our brains. Surrendering thinking tasks to technologies ought to be approached with great caution.
I’d like to share a few guidelines that I believe software companies should follow when adding features to exploratory data analysis tools. Please review the following list and then share with me your thoughts about these guidelines.
- Leave out any task that humans can do better than computers.
- Leave out any task that’s associated with an important skill that would be lost if we allowed computers to do it for us.
- Leave out any feature that is ineffective.
- Add features to perform tasks that computers can do better than humans (a brief sketch of this division of labor follows the list).
- Add features to perform tasks that humans do not benefit from performing in some important way.
- Add features that are recognized as useful by skilled data analysts, but only after considering the full range of implications.
- Never add a feature simply because it can be added or because it would be convenient to add.
- Never add a feature merely because existing or potential customers ask for it.
- Never add a feature simply because an executive wants it.
- Never design a feature in a particular way because it is easier than designing it in a way that works better.
- Never design a feature that requires human-computer interaction without a clear understanding of the human brain—its strengths and limitations.
- Never design a feature requiring human-computer interaction that forces people to think and act like computers.
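To make the intended division of labor concrete, here is a minimal Python sketch, not drawn from any existing product: the computer contributes the exact, tireless arithmetic it performs better than we do, while the judgment about what the results mean is left to the analyst. The function, column name, and threshold are all hypothetical.

```python
import pandas as pd

def flag_candidate_outliers(df: pd.DataFrame, column: str, k: float = 1.5) -> pd.DataFrame:
    """Flag values that fall outside k * IQR of the named column.

    The computer does what it does better than we do: exact arithmetic
    over every row. Judging whether a flagged value is an error, an
    anomaly worth pursuing, or nothing of interest remains the
    analyst's task.
    """
    q1, q3 = df[column].quantile([0.25, 0.75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    result = df.copy()
    result["candidate_outlier"] = (df[column] < lower) | (df[column] > upper)
    return result

# Hypothetical usage ("sales" is an invented column name):
# flagged = flag_candidate_outliers(df, "sales")
# print(flagged.loc[flagged["candidate_outlier"]])
```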
Take care,
Comments
Very fine aspirational goals.
“Leave out any task that humans can do better than computers” – Who defines this, and which human is used as the benchmark? There’s a big grey area covering tasks that only some humans can do better, as well as tasks that humans can do better but so slowly that doing so provides poor value.
Barney,
This is precisely the kind of discussion that I was hoping to encourage. It’s true that this particular criterion, “Leave out any task that humans can do better than computers,” can only be applied after a great deal of thought. First, let me make clear that I would not apply this criterion to all technologies. For example, I can wash dishes as well as or better than a mechanical dishwasher, but I’ve decided that this particular activity doesn’t benefit me as much as other potential activities, so I allow a dishwasher to do the work. I’m proposing this criterion as a guideline for determining whether or not features should be added to exploratory data analysis (EDA) tools. Thinking is an incredibly important human activity. It is our unique thinking skills that have allowed us to excel beyond other species. For this reason, we should be careful about surrendering thinking tasks to machines.
Now, to your specific questions. Who decides which tasks can be done better by humans than by computers? Ultimately, software companies control these decisions, but they should make them in light of the best knowledge about human vs. computer abilities. In other words, they should be well versed in the scientific literature that addresses this topic. In some cases, the answers won’t be clear, and software companies will have to choose either to err on the side of caution and leave the task for humans to perform or to reap the expected revenues that will come from automating the task. Your question, “…which human is used as a benchmark?”, is thought-provoking. If only Einstein could perform the thinking task better than a computer, should we automate the task? You might be tempted to respond, “Of course,” but not so fast. If Einstein can do it better than computers, then the human brain is capable of doing it better than computers, but only Einstein has developed the skill. And if Einstein can develop the skill, others can as well. We’re then faced with the question, “Would other humans benefit from developing this cognitive skill?” When we develop particular cognitive abilities, they often enable the development of other abilities as well. If today’s computers had been given to us as a gift by aliens in the 17th century, would we have been wise to relinquish the cognitive development that occurred between then and now through our own efforts? Figuring things out on our own through science, assisted by technologies, has produced enormous benefits.
I could go on, but I don’t want to dominate this discussion. I’m hoping to encourage this kind of thinking prior to the addition of features in EDA tools.
Hi Stephen,
Thanks for this insightful post. I think that the majority of your points apply not only to EDA tools but also to software in general. It’s always good to be reminded of those things.
Also, what would be your approach to this guideline:
“Never add a feature simply because an executive wants it.”
Often, executives have much more weight in the hierarchy than programmers, designers and analysts.
Benoit
Benoit,
I included the guideline regarding executives because I frequently hear complaints from people that they were forced to do things poorly (e.g., to display data in an ineffective way) because an executive demanded it. Software companies can easily follow this guideline. Executives from companies other than their own wield no real power over their decisions. It is only their own executives who can exercise authority regarding the addition of product features. Smart software companies realize that their executives cannot understand the potential benefits and harms associated with product features as well as the people who possess relevant expertise. To exercise an executive decree in opposition to the judgments of the real experts is almost always bad management. It’s an expression of “executive ego” that is fueled by the misplaced reverence in which executives are held in our culture today. Donald Trump is a blatant example of the monsters that we’ve created. How do you fight this if you work for a company that suffers from the ill effects of executive ego? You speak up. This is easier said than done, I realize. In fact, I realize this quite well, for I was fired more than once during the course of my career for speaking unpopular truths. The cost is high, but in the end it’s worth it.
These are great points, and I’d gladly sign my name to the whole list.
But, given the current state of the analytics software industry, I can see how this undertaking could run into trouble caused by the Dunning-Kruger effect (https://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect).
For example, a vendor implements a feature because *they think* it is effective, or that it performs the task better than humans can, or that it brings many benefits, or all of the above. And they are prevented from knowing otherwise because they lack the relevant expertise.
Dimitri,
Thanks for introducing a new term. I’d never heard of the Dunning-Kruger effect, but I’ve encountered this cognitive bias many times over the years. This problem is already exhibited by many of the folks who work for software vendors but lack the expertise that’s required to design their products effectively. The good news is that any software vendor that decided to follow guidelines such as those that I’ve proposed would likely be inclined to watch out for the Dunning-Kruger effect. They would also be more inclined to listen to us when we raise concerns about the direction of their products.
Can we add one more?
Never add a feature simply because a competitor does it.
Nate,
Yes, indeed. I should have thought of that because it’s the primary motive behind most of the features that are added to data sensemaking and communication products.
“Never design a feature in a particular way because it is easier than designing it in a way that works better.”
I’ve worked for software companies, and with software from other companies, that repurpose existing features for new ends, leading to suboptimal outcomes. A good example is Tableau’s implementation of bullet graphs. There are many reasons for this, frequently rooted in expediency, rush to market, architectural constraints, or simple paradigm blindness. The results include ossified products that lose their usefulness as the accretion of poor features overwhelms their good points, and a loss of competitiveness as new products emerge without the legacy problems.
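For readers who haven’t encountered the technique: a bullet graph displays a measure as a slim bar laid over shaded qualitative ranges, with a tick marking the target. Below is a minimal matplotlib sketch of the basic form, not Tableau’s implementation or the full published specification; all values, labels, and shades are invented for illustration.

```python
import matplotlib.pyplot as plt

# Invented example values: a measure, its target, and the upper bounds
# of three qualitative ranges (poor / satisfactory / good).
measure, target = 270, 250
bounds = [150, 225, 300]
shades = ["0.45", "0.65", "0.85"]  # dark-to-light greys; darker marks the poorer range here

fig, ax = plt.subplots(figsize=(6, 1.2))
left = 0
for bound, shade in zip(bounds, shades):
    # Qualitative ranges drawn as adjacent background bands.
    ax.barh(0, bound - left, left=left, height=0.8, color=shade)
    left = bound
# The measure as a slim dark bar, the target as a vertical tick.
ax.barh(0, measure, height=0.3, color="black")
ax.axvline(target, ymin=0.2, ymax=0.8, color="black", linewidth=2)
ax.set_yticks([])
ax.set_xlim(0, bounds[-1])
ax.set_title("Revenue (illustrative)", loc="left", fontsize=9)
plt.tight_layout()
plt.show()
```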
Nice article, Stephen. In your opinion, what should be done at the university level? For example, do you suggest incorporating such practices into software-related courses?
Amin,
I think that universities should place greater emphasis on the ethical dimensions of software development. At times, when I’ve lectured to computer science classes, students have resisted my admonition that they are responsible for developing software that actually serves the needs of users and does so effectively. They have argued that they are responsible only for writing good code based on the specifications that they’re given, without concern for anything other than technical excellence. This argument that we’re only responsible for doing what we’re told has led to many atrocities. Technologies are not benign. Those who develop technologies share responsibility for the effects of those technologies.
“Never design a feature that makes the client ask you for corrections over and over again.” Too many vendors don’t want to do the job right, in a way that would save customers time and money, because then the customer would use them less frequently.
“Add features that are recognized as useful by skilled data analysts, but only after considering the full range of implications.”
It’s often the case that the full range of implications that come from building a data set or a tool is extremely difficult to conceive of, let alone consider in full. If we don’t even know what we don’t know until a new data set or a new tool opens up new avenues of inquiry or analysis, then we cannot possibly consider the implications of creating something that allows us to find these new things out. So, although I think the motivation for this guideline is sound, it will be extremely difficult to follow in practice.
Jeremy,
By the “full range of implications” I was referring to (but perhaps not clearly enough) the other considerations on my list. In other words, don’t pursue a feature merely because experts recommend it; pursue it only after considering whether it 1) involves a task that humans can do better than computers, 2) involves a task that humans don’t benefit from performing in some important way, etc. It is definitely true that we cannot anticipate all potential drawbacks of a feature, but we can anticipate many of them if we ask the right questions in advance.
Also, please keep in mind that my list of considerations applies to features that are added to exploratory data analysis tools, not to data sets.