Atanu Dey On India's Development

The Age of Superfluous Information — Part 2

Sorting and searching through information are uniquely human activities because only humans have an external store of information which needs to be accessed and acted upon. The notion of acting on information stored externally is not associated with non-human animals.

The larger the stock of information, the more expensive it is to search through it to locate the precise bit that is relevant at any particular instant. To make searching tractable, ordering the information in some fashion, called sorting, becomes paramount. Computer scientists have worked on the problems of sorting and searching for decades, and our understanding of both has advanced phenomenally.
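
To make the cost difference concrete, here is a minimal sketch in Python. It is only an illustration under assumptions of mine: the "catalog" is a made-up list of a million numbers, and the two functions are textbook linear and binary search, not anything taken from a particular system.

```python
def linear_search(items, target):
    """Scan the items one by one; return (index or -1, comparisons made)."""
    comparisons = 0
    for i, item in enumerate(items):
        comparisons += 1
        if item == target:
            return i, comparisons
    return -1, comparisons


def binary_search(sorted_items, target):
    """Halve a sorted list repeatedly; return (index or -1, probes made)."""
    lo, hi = 0, len(sorted_items)
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_items[mid] < target:
            lo = mid + 1      # target can only be in the right half
        elif sorted_items[mid] > target:
            hi = mid          # target can only be in the left half
        else:
            return mid, probes
    return -1, probes


# Hypothetical catalog: one million items, already in sorted order.
catalog = list(range(1_000_000))
target = 999_999  # worst case for the linear scan: the very last item

linear_index, linear_cost = linear_search(catalog, target)
binary_index, binary_cost = binary_search(catalog, target)
print(f"linear scan: {linear_cost} comparisons")
print(f"binary search: {binary_cost} probes")
```

The linear scan has to look at every one of the million items, while the search over the sorted list needs at most about twenty probes. That gap is the payoff from sorting first, and it widens as the stock grows.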

(Volume three of Donald Knuth’s magnum opus The Art of Computer Programming is devoted to Sorting and Searching. In my graduate computer science courses I did not get past the first volume, Fundamental Algorithms, let alone tackle the third.)

This is a continuation of my earlier piece on the age of superfluous information. I argue here that in the post-industrial world, and increasingly so in the future, sorting and searching through information will occupy the role that manufacturing did in the growth of the old economy.

The libraries of the world contain an ever-expanding stock of information, much of which is rapidly being added to the humongous stock that already exists on the world wide web. That stock keeps growing as the flow of information turns into a flood and the internet spreads its tentacles into every nook and cranny of human activity. Billions of people today have access to information equivalent to hundreds of millions of books on the world wide web. Compare that to just a hundred years ago, when the average human had access to half a dozen books’ worth of information at most. When taken to such extremes, quantitative change amounts to qualitative change. The world of information is not what it used to be. The challenges are therefore qualitatively different.

When the quantity supplied of any good exceeds anything reasonably required or demanded, the variable of importance is quality. A person’s information needs are really very simple. One can only read so much, listen to so much talk and music, watch so much video, and wish to know only so much about what is going on in one’s neighborhood and in the world at large.

Here is the result of some simple arithmetic I did just now. I estimated the total stock of information available today. Then I divided it by the maximum rate at which a human can scan information. The result: a person would take about 18 billion years merely to scan the current store of information. At the current rate at which information is added (the flow, which is accelerating, I might add), a person would require 44 additional years for every passing hour. Compare the 18 billion years to the current estimated remaining lifetime of the sun: a mere 5 billion years.

The bottom line is this: there is already so much information out there that even if no additional information were generated, each one of us could be occupied a little longer than forever getting through it. Information, as we well know, is a non-rival good. That is, my “consuming” a particular piece of information does not diminish the amount available to you. Compare this to a rival good such as food. The world’s stock of food is enough to last six billion humans about three months. In other words, if we produced no additional food, humans would together finish the stock in three months; a single human eating alone would take 1.5 billion years to finish it. But it is not so in the case of information. Each of us would take the estimated 18 billion years to finish the information we already have before we ask for more.
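
For readers who want to check the arithmetic, here is a rough sketch in Python. It uses only the figures quoted in this post (six billion people, three months of food, 18 billion years of scanning, 44 years added per hour, 5 billion years of sun). The variable and function names are my own, and the underlying estimates are taken as given rather than derived.

```python
# Figures quoted in the post; nothing here is independently sourced.
WORLD_POPULATION = 6e9        # "six billion humans"
FOOD_STOCK_MONTHS = 3         # the shared food stock lasts everyone about 3 months
INFO_SCAN_YEARS = 18e9        # one person's time to scan the information stock
SUN_REMAINING_YEARS = 5e9     # estimated remaining lifetime of the sun
BACKLOG_YEARS_PER_HOUR = 44   # scanning time added by each hour of new information


def years_for_one_eater(population, months_for_everyone):
    """Rival good: the stock is used up, so one eater alone gets
    population times as many person-months out of it."""
    person_months = population * months_for_everyone
    return person_months / 12


food_years = years_for_one_eater(WORLD_POPULATION, FOOD_STOCK_MONTHS)

# Non-rival good: reading leaves the stock intact, so every person faces
# the full 18 billion years no matter how many others are reading too.
info_years_per_person = INFO_SCAN_YEARS

suns_needed = INFO_SCAN_YEARS / SUN_REMAINING_YEARS
backlog_per_calendar_year = BACKLOG_YEARS_PER_HOUR * 24 * 365

print(f"food for one eater: {food_years:.2e} years")
print(f"information per reader: {info_years_per_person:.2e} years")
print(f"remaining sun lifetimes needed: {suns_needed:.1f}")
print(f"scanning years added per calendar year: {backlog_per_calendar_year}")
```

The food figure comes out at 1.5 billion years because the stock is divided among its consumers. The information figure is not divided by anybody, which is all that non-rival means here.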

Clearly, for an average human, about 0.00000000001 percent of the total information stock is more than enough. About 99.99999999999 percent of the available information is worthless. So how does one go about searching out the teaspoonful of useful information from the oceans of available information? That is the challenge and therein lie the opportunities. That is why firms like Google will make the big bucks. The opportunity is not so much in making information available as in making the right information available.

Which brings me to the point I started off with. Searching is only part of the story when it comes to information. The other part is sorting. If one can sort the information along some relevant dimension, then one has meaningful information. What is meaningful can only be defined in the context of the entity processing the information. From the same stock and flow of information, different entities define different subsets that are relevant and meaningful. This subset can be labeled private information as opposed to the vast store of public information. Private information is the top of the sorted list of public information. Internalizing the private information leads to what we can call a stock of knowledge associated with the individual.
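
The idea that private information is the top of a sorted list can be sketched in a few lines of Python. Everything in the example is hypothetical: the five "public" items, the two readers, and the crude keyword weights standing in for relevance are all invented for illustration. The only library call is the standard `heapq.nlargest`.

```python
import heapq


def make_relevance(interests):
    """Build a scoring function from one reader's keyword weights."""
    def relevance(item):
        return sum(weight for word, weight in interests.items() if word in item)
    return relevance


def private_information(public_items, relevance, k):
    """Return the k items that rank highest for one particular reader."""
    return heapq.nlargest(k, public_items, key=relevance)


# The same public stock for everybody (invented examples).
public = [
    "monsoon forecast",
    "cricket scores",
    "sorting algorithms",
    "rice prices",
    "film reviews",
]

# Two hypothetical readers with different notions of what is relevant.
farmer = make_relevance({"monsoon": 3, "rice": 2})
student = make_relevance({"sorting": 3, "algorithms": 2})

farmer_view = private_information(public, farmer, 2)
student_view = private_information(public, student, 1)
print(farmer_view)
print(student_view)
```

Both readers start from the same public list. What differs is the relevance function, and so does the small subset each one ends up with.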

It is useful at this point to remind ourselves of the distinction between information and knowledge. Information is a public good, the stock of which is growing exponentially. Knowledge is a private good and its primary raw material is private information, which is a vanishingly small subset of the available public information. Even though public information has no known bounds, there are limits to how much private information can be processed by a human brain, and thus there are limits to the acquisition of the private good we call knowledge.

Conflating knowledge and information is distressingly common these days, so I would like to dwell on this distinction for a bit. Some say that today we have a knowledge economy. That is trivially true, because it has always been a knowledge economy ever since humans evolved brains capable of processing information into knowledge and began using knowledge to organize and coordinate economic activities. What is novel is the unfathomably huge stock of information we have available today. What distinguishes one individual from another today is the capacity to figure out what is relevant information and to internalize it efficiently into knowledge. That capacity is one of the basic skills imparted by what we call education.

To summarize the story so far: from the vantage point of an individual, this is an age of superfluous information; only a tiny fraction is relevant and meaningful; searching through the information can be automated but efficiently sorting for relevance is a private skill; imparting that skill is a primary function of education.

Next time I will explore the role of education in an age of superfluous information.

  • http://constructal.blogspot.com Sameer

    Gee…that was…long! But after scrolling down I was able to locate the word “summarize” in the end and immediately realized that if I could just remember what was there in that para, it would be OK!

    So I guess I was able to sort through and identify the most relevant bit of information.

    Thanks Atanu

  • Vikram Asrani

    Nice article, but can you please provide some additional details (references as well as assumptions) as to how you arrived at the above numbers (3 months, 18 billion years, percentages)?

    Atanu’s response: Vikram, these are carefully worked out figures using classical well-worn tools. The methods used are trade secrets that I am not allowed to divulge. :)

  • http://projectoutsourced.com Krishnan

    Atanu,
    One of the biggest distinctions is between “verifiable/credible/monetizable” info, such as
    Lexis-Nexis for legal, financial & corporate,
    pro-imdb, variety, hollywoodreporter, filmfinders etc for celluloid marketplace,
    the US MLS for real estate,
    nejm archives & uptodate for medicine, Nature archives for science,
    cinemanow for movies,
    corbis for photography, etc.

    as opposed to “useless/close-to-garbage” info from search engines like Google/AskJeeves/wikipedia etc.

    People for some irrational reason place a huge premium on Google, forgetting that a search engine is only as good as its content, and public content is fairly useless. I defy anybody to have anything significant to do with US real estate without having to go through MLS, or, say, to get the top 100 research papers on H5N1 without using uptodate.

    Point being, useful monetizable info has already been signed, sealed & locked up for subscriber-only, carefully filtered access.

    Superfluous useless nonsense info is the one that overwhelms us through search engines like Google.

    Of course, occasionally a googler will stumble on info that is actually useful, like how to build an atom bomb or which stock is worth investing in, etc. But that is like your hero Siddhartha stumbling upon nirvana. The random guy has as much chance at getting useful info as he has at getting nirvana.

    I am highly skeptical of premises like “information wants to be free”, “info is public good” etc. Useful info is usually monetizable and monetizable info is private, sealed up. What is available out there for free is usually not worth a penny, and/or put up by some misguided philanthropist.

    Bottom line: if I were to hire Atanu Dey to write quality articles on the Indian economy for the Economist at $1000 a pop, then deeshaa.org would be much more sophisticated, useful, searchable & subscriber-only, compared with the present state of affairs :)
    No offence, but communism simply doesn’t work, whether in the domain of information or politics.

    Atanu’s response:

    “Information is a public good” needs to be understood in the technical sense of what a public good is. It is not a premise. It is a definition. A public good does not have much to do with communism or any other isms. It is just a way of distinguishing a thing as either a public or a private good. These are precisely defined terms.

    And about communism: communism does not work, and it is offensive. The reason it does not work is that people do not understand that markets work and that incentives matter. And the reason people don’t understand economics is that economics uses everyday words that confuse the heck out of people.

  • http://www.drmalpani.com Dr Malpani, MD

    I guess we need to differentiate between data, information, knowledge and wisdom.

    Let me use an example to clarify this. Your doctor may have more information than you do, but this does not necessarily mean that he is more knowledgeable, let alone wiser than you. Moreover, he may have more information about the disease, but you know much more about yourself, which means you are the expert on your personal illness. How wise you are about dealing with your problem depends upon you! While it’s easy to acquire information, and even knowledge, wisdom is a different cup of tea!

    Data
    Data itself is not very useful. Think of it as the “Know-nothing” stage.
    We must understand what the data is (for example, your blood sugar levels) and how to acquire it, which is where the medical expertise is valuable.

    Data to Information
    Once we can apply this data to our disease, the data becomes information. This is the “Know-what” stage. This is when the doctor makes a diagnosis, for example, by pattern recognition – by matching your symptoms with those described in a text book.

    Information to Knowledge
    Next, the information must be converted into knowledge by finding patterns within the information. Thus, charting your blood sugar levels in relation to time, meals and exercise makes it knowledge. This is the “Know-how” stage and helps you to gain insights into your illness and how it affects you. The knowledge can be generic and can be applied to most patients with a particular disease.

    Knowledge to Wisdom
    Wisdom arises when the knowledge is transformed into insight or principles. Once you understand the source of the patterns of your personal illness, you can learn to manage your own illness, with your doctor’s help. This is the “Know-why” stage, and when you reach this stage, you become the true expert on your illness! You can now share your wisdom with other patients, and with your doctor, if he is wise enough to be willing to listen to you!

    Search engines can help you sort and sift through the information, but you will need to make sense of it, and convert it to knowledge, and hopefully, even wisdom, for yourself (with or without the help of a professional).

    Atanu’s response: Dr, thanks for mentioning these points. I stress the distinction between information and knowledge fairly vehemently because conflating the two leads to a lot of silly talk. The hierarchy I use is data, information, knowledge, and then understanding leading to wisdom and finally, at the very end, enlightenment. Here is a post from Jan 2004 on the distinction between information and knowledge.

  • surojit

    It is interesting to note that information has no opportunity cost (Romer et al., 1994). Anyone who can control access to information can charge a price higher than zero, i.e., can earn monopoly profits.

    Coming to the point of superfluous information: it might be that, because there are no opportunity costs, people often ‘waste’ information, i.e., do not make the most efficient use of it. There is a problem of overproduction of information and complete ‘market failure’ in the information exchange market.
