You wouldn’t download a car. But you COULD deduplicate a person!


Physics Professor Albert Bartlett is famous for arguing that one of the main reasons that societies are unable to deal with sustainability crises is that the human brain hasn’t evolved an effective mechanism for intuitively visualizing and understanding exponential growth.

The classic example of this would be Moore’s Law. Intel co-founder Gordon Moore predicted that the number of transistors on a chip would double roughly every two years, a pace often quoted as processors doubling in power every 18 months. But even Moore initially projected the trend only about 10 years ahead!

The longevity and consistency of Moore’s Law have been among the most interesting mathematical phenomena of the computer age. And it isn’t just processors that are subject to it. Data storage costs, data production rates, network sizes, bandwidth and countless other areas of IT are affected by exponential growth.

But probably the most impressive example of technology-driven exponential growth can be seen in the field of genomics.

The Human Genome Project was completed in 2003, at a cost of roughly 3 billion dollars. Today, we’re able to produce a sequence for less than $5,000. And many have said that we’re on the verge of reaching the $1,000 barrier any day now.
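To put that price drop in exponential terms, here is a rough back-of-the-envelope sketch. It uses only the two cost figures quoted above; the function name `halvings` is just an illustrative choice.

```python
import math

def halvings(start_cost: float, end_cost: float) -> float:
    """Number of times a cost must halve to fall from start_cost to end_cost."""
    return math.log2(start_cost / end_cost)

hgp_cost = 3_000_000_000   # roughly $3 billion (figure from the post)
current_cost = 5_000       # "less than $5,000" (figure from the post)

n = halvings(hgp_cost, current_cost)
print(f"Cost ratio: {hgp_cost / current_cost:,.0f}x")
print(f"Equivalent halvings: {n:.1f}")
```

A 600,000-fold drop works out to a little over 19 successive halvings, which is the kind of number that is hard to picture intuitively.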

Genome sequencing is a very heavy, data-intensive undertaking. The rate at which genomes can be sequenced is doubling every 4 months, and this means that the data storage required for these operations is also outpacing the growth rate for storage technology.
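A small sketch shows how quickly that gap opens up. The 4-month doubling time comes from the paragraph above; the 18-month doubling time for storage is purely an assumption for illustration, not a figure from the research.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """How many times larger a quantity gets after `years` of steady doubling."""
    return 2 ** (years * 12 / doubling_months)

SEQUENCING_DOUBLING_MONTHS = 4   # from the post
STORAGE_DOUBLING_MONTHS = 18     # ASSUMPTION, for illustration only

for years in (1, 3, 5):
    data = growth_factor(years, SEQUENCING_DOUBLING_MONTHS)
    storage = growth_factor(years, STORAGE_DOUBLING_MONTHS)
    print(f"After {years} year(s): data x{data:,.0f}, "
          f"storage x{storage:,.1f}, gap x{data / storage:,.0f}")
```

Under those assumptions, three years is enough for sequencing output to grow 512-fold while storage grows only 4-fold.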

But storage isn’t the only problem. You now need an efficient way to work with these gigantic and exponentially-growing data pools. As you might imagine, searching through petabytes of information is no easy task.

Thankfully, researchers at Harvard and MIT have come up with an innovative algorithm that not only cuts down on storage requirements, but actually increases in performance and efficiency as data volume grows.

This algorithm exploits the fact that genes share many similarities between species, and even more so between members of individual species. For example, there is only a 2% genetic difference between humans and chimpanzees. And the difference between you and me is much, much smaller than that.

The researchers capitalize on these similarities to build an extremely efficient compression algorithm. And unlike Zip file compression on your computer, this data can still be searched and used while in compressed format.
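The general idea of compressing against a shared reference can be shown with a toy example. To be clear, this is not the Harvard and MIT algorithm, just a minimal delta-encoding sketch; the sequences, the function names, and the restriction to equal-length sequences are all assumptions made for illustration.

```python
from typing import Dict

def compress(reference: str, genome: str) -> Dict[int, str]:
    """Store only the positions where `genome` differs from `reference`."""
    if len(reference) != len(genome):
        raise ValueError("this toy sketch only handles equal-length sequences")
    return {i: g for i, (r, g) in enumerate(zip(reference, genome)) if r != g}

def base_at(reference: str, diffs: Dict[int, str], position: int) -> str:
    """Look up a single base without rebuilding the whole genome."""
    return diffs.get(position, reference[position])

def decompress(reference: str, diffs: Dict[int, str]) -> str:
    """Rebuild the full sequence from the reference plus the stored differences."""
    return "".join(diffs.get(i, base) for i, base in enumerate(reference))

# Made-up sequences; real genomes are billions of bases long.
reference = "ACGTACGTACGTACGT"
person_a = "ACGTACGTACGAACGT"

diffs_a = compress(reference, person_a)
print(diffs_a)                          # {11: 'A'}
print(base_at(reference, diffs_a, 11))  # A
print(decompress(reference, diffs_a) == person_a)  # True
```

Because each individual is stored as a short list of differences, a lookup like `base_at` works directly on the compressed form, and the more similar the genomes are, the less there is to store.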

The algorithm also improves efficiency by eliminating duplication of work. If you’ve already run computation on one person’s genome, you’ve already done much of the work to run the same computation on another person. This greatly reduces processing time.
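The reuse of earlier work can be illustrated with simple caching. Again, this is only a sketch of the principle and not the researchers' method: the block size, the made-up sequences, and the stand-in "computation" (counting G and C bases) are all assumptions.

```python
from functools import lru_cache

BLOCK_SIZE = 4   # toy value chosen for illustration
calls = 0        # counts how often the "expensive" work actually runs

@lru_cache(maxsize=None)
def gc_count(block: str) -> int:
    """Stand-in for an expensive computation on one block of sequence."""
    global calls
    calls += 1
    return sum(1 for base in block if base in "GC")

def genome_gc(genome: str) -> int:
    """Process a genome block by block, reusing results for blocks seen before."""
    blocks = (genome[i:i + BLOCK_SIZE] for i in range(0, len(genome), BLOCK_SIZE))
    return sum(gc_count(block) for block in blocks)

person_a = "ACGTACGTACGAACGT"
person_b = "ACGTTCGTACGTACGT"

print(genome_gc(person_a), calls)  # 8 2  (two distinct blocks computed)
print(genome_gc(person_b), calls)  # 8 3  (only one new block computed)
```

The second genome contains four blocks, but only one of them is new, so only one additional computation is needed.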

Work on this algorithm is still underway, and is currently being expanded to RNA sequences and proteins.

Here, we see a perfect example of how issues relating to sustainability and technology-driven growth require solutions which can adapt well to exponential growth.

Patrick Jobin (4 Posts)

Patrick is a technical writer with Storagepipe Solutions. Storagepipe is a leading Canadian provider of data protection, high availability, and managed backup services.


  • Bruce Stewart

    That exponential growth visualization problem is tripping us up in a number of domains these days. I wonder if we’d help change that if we made projections and warnings based on them a part of information display in the systems we deploy.

    For instance, when a firm is smaller, the vision of an endless series of quarters with double-digit growth is eminently reasonable: you’re starting from a small base within a very large “pond” or market.

    But when you become large, warnings might help people see the real situation (e.g. if 90% of all desktop devices run Windows, double-digit numbers for the next version aren’t indicating “growth” but replacement, and if the overall market is stable or shrinking, perhaps the limits will be hit sooner).

    Goodness knows school curricula do a lousy job of teaching people to think geometrically rather than arithmetically, or to recognize limits and assess “how long until we reach them”. Helping to make that clear may seem like unneeded information — but it may help get that ability to see the real situation internalized over time.