There has been a realization in the online marketing and data/statistics world in the last few years. More websites, products and tools have been created online than we can possibly keep track of. The terms we hear most often to describe this deluge of activity, from both companies and consumers, are “data overload” and “information overload.” This Google magazine uses the term “Data Obesity” to describe the phenomenon.
They ask the question: why is more data always better?
I think the idea that “more data is better” is common among people who lived before the Internet was prevalent. We had to work hard to find data. Researching something meant going to a library, looking in a card catalog (or maybe something called Gopher) and then finding your way around the Dewey Decimal System to locate the book. And sometimes they didn’t even have the book, because it was checked out or possibly just filed wrong because nobody understood the Dewey Decimal System.
On a related note, we recently got invited to my cousin’s wedding in Santa Fe, New Mexico. My dad promptly went to the library and checked out three books on Santa Fe and New Mexico. I cringed. He asked how to look up flights and book something without a travel agent. I realized I have been booking travel this way since 2000, and he stopped traveling about that time, so he never has. I introduced him to Travelocity. It was mind-blowing, and a bit of data overload compared with the OAG book he used to use in the ’80s.
The point here is that finding data used to be really difficult. People had control over its distribution because it was in print. When it became more freely accessible thanks to the efforts of Google and other companies, we assumed this would be good, because we could remember where to find it and use it whenever we wanted. We never thought it would get this big so fast. Now travel sites are overwhelming. They offer too many choices, and there are too many of them trying to get you to opt into something you don’t want while overcharging you for bringing a suitcase on a flight. This is just one example of how data has grown exponentially so quickly.
Others of us come to the data overload conclusion when we have 200 emails in several inboxes, 1000+ unread posts waiting in an RSS reader, several work projects, 500+ Facebook wall posts in our feed and hundreds of tweets that have gone unread. This is in a climate where you have to follow up on projects 5-10 times to get things done, post blogs/tweets/FB status updates daily to stay on people’s radar, empty the DVR so it doesn’t fill up and auto-delete something you really wanted, listen to the radio on the way to work just in case something big happens, and still find time to scoop the litter box before it gets full and the cats poop on the floor.
And the real reason for keeping up with all those tweets, FB posts and feeds is that your business changes yearly (in digital marketing, at least), and if you don’t know about the latest trend and have some real insights about it before your boss asks, you won’t have a job for all that long.
Data overload is a “good” problem to have from some people’s perspective (in that it is growth oriented). The democratization of publishing, combined with tracking methodology and databases, has contributed to this problem by giving everyone a voice, a potential following of readers, a data trail to analyze and a way to say something important online 24/7/365. Then we have an even bigger problem: processing what is being said, figuring out whether it is important, and sharing, processing or saving it in some way if it is. Acting on that data is way down the line, and many of us don’t even get there.
And this isn’t even the big problem with data overload. Where will we store it all? Why do tweets disappear from search so quickly? Because there are millions of them and the failwhale is full. According to Think Quarterly UK, 800 exabytes of data/information are created every two days. It took humans from the beginning of civilization until 2003 to create the first 800 exabytes, and we’re on a roll now.
Where does all this seemingly random data go? How will we know what it says without having to go into a database table and read specific field information? Where are the software tools to manage all this and still give humans the ability to customize the output in ways that match the behavior or business purposes we really need? Does any of this stuff ever get deleted?
These are all huge questions we have to answer as more people publish, share, create, track and do business online. We also have to weigh sharing data openly against locking it behind walls, and figure out how people will comprehensively find what they need when they want it and gauge the validity and accuracy of the information presented.
I’m betting on paid services for personal and business data management, archiving and analysis tools. We will pay for good analysis, good data access and processing, and good reliability and backups when we feel the pain of missing good insight, losing good data and just having too much happening, both personally and professionally. But unless you know how to work with SAP, SPSS, SQL, Oracle or a bunch of other systems, data management is largely out of your control at this point. Those who do are the librarians of our digital data, and they need to find a workable way to Dewey Decimal System it back into order and allow us to use it as humans need to.