Surfing the Infolanche

There’s really quite an extraordinary collection of brief but provocative articles in a recent issue of Wired, clustered around the thesis that the massive quantities of information generated by new technologies are qualitatively altering whole dimensions of human experience and understanding.

The overarching article, The Petabyte Age: Because More Isn’t Just More – It’s Different, observes “sensors everywhere . . . infinite storage . . . clouds of processors,” and asserts that “our ability to capture, warehouse, and understand massive amounts of data is changing science, medicine, business, and technology. As our collection of facts and figures grows, so will the opportunity to find answers to fundamental questions. Because in the era of big data, more isn’t just more. More is different.”

This insight is examined for its implications across a diverse array of enterprises, from agriculture and astronomy through law, medicine, and particle physics, to politics, terrorism, and world conflict.

Consider the realm of astronomy. Operating since 1998, the Sloan Digital Sky Survey has generated about 25 terabytes of data and “has measured the precise distance to a million galaxies and has discovered about 500,000 quasars.” But the Large Synoptic Survey Telescope, due to be completed in 2014, will have seven times the field of view and ten times the power of SDSS, will be able to image the entire sky every three days, and will produce not terabytes but petabytes of data on a radically accelerated schedule.

Or ponder the Large Hadron Collider, examined in Chasing the Quark: Sometimes You Need to Throw Information Away:

“While proton beams race in opposite directions around a 17-mile underground ring, crossing and recrossing the Swiss-French border on each circuit, six particle detectors will snap a billion ‘photos’ per second of the resulting impacts. The ephemeral debris from those collisions may hold answers to some of the most exciting questions in physics.

“The LHC, expected to run 24/7 for most of the year, will generate about 10 petabytes of data per second. That staggering flood of information would instantly overwhelm any conceivable storage technology, so hardware and software filters will reduce the take to roughly 100 events per second that seem most promising for analysis. Even so, the collider will record about 15 petabytes of data each year, the equivalent of 15,000 terabyte-size hard drives filled to the brim. Hidden in all those 1s and 0s might be extra dimensions of space, the mysterious missing dark matter, or a whole new world of exotic superparticles.”
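The figures quoted above invite a quick back-of-the-envelope check. This sketch (using decimal petabytes and terabytes, and an assumed running schedule of roughly 300 days a year, since the article gives no exact figure) confirms the “15,000 terabyte-size hard drives” claim and shows just how drastic the filtering must be:

```python
# Back-of-the-envelope check of the LHC data figures quoted above.
PB = 10**15  # petabyte in bytes (decimal convention)
TB = 10**12  # terabyte in bytes

raw_rate = 10 * PB            # ~10 PB/s produced by the detectors
recorded_per_year = 15 * PB   # ~15 PB actually written to storage each year

# The yearly archive, expressed in terabyte-size hard drives:
drives = recorded_per_year / TB
print(drives)  # 15000.0 -- matching the "15,000 terabyte-size hard drives"

# How aggressively the filters must discard data: raw detector output
# over an assumed ~300 days of running versus what is actually kept.
seconds = 300 * 24 * 3600
raw_per_year = raw_rate * seconds
print(raw_per_year / recorded_per_year)  # a reduction of roughly 17 million to one
```

Even with the running-time assumption varied by a factor of two in either direction, the conclusion holds: only about one byte in every ten-odd million the detectors produce survives to disk.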

One arena that holds a particularly personal fascination for me is discussed in the four paragraphs of Predicting the Vote: Pollsters Identify Tiny Voting Blocs. A little over a decade ago, the state political party where I worked as information services manager was avidly pursuing the vision that is now becoming reality. But the headline misses the real import of the development. It isn’t pollsters, and it isn’t prediction. It’s the use of intricately detailed demographic information by political activists, influence groups, political parties, campaigns and candidates to target persuasive appeals to carefully defined groups and individuals. Current resources are already “orders of magnitude larger than the databases available just four years ago.” The impact upon our democracy will be pervasive.

Even more intriguing for an old worker-bee in the intelligence business – and not to be missed – are the remarkable developments reviewed in Feeding the Masses: Data In, Crop Predictions Out and Spotting the Hot Zones: Now We Can Monitor Epidemics Hour by Hour.

I also once worked, ever so briefly, for a Dallas law firm, sorting through thousands of three-and-a-half-inch diskettes in the discovery process for a commercial lawsuit, an activity fairly described as “a labor-intensive ordeal that involved disgorging thousands of pages of company records.” But as John Bringardner observes, “these days, the number of pages commonly involved in commercial litigation discovery has ballooned into the billions” so that “a pretrial discovery request today can generate nearly 10,000 times more paper than 10 years ago.” The result, with changes in federal rules, is that “five years ago, newly minted corporate litigators spent much of their time digging through warehouses full of paper documents. Today they’re back at their desks, sorting through PDFs, emails, and memos on their double monitors — aided by semantic search technologies that scan for keywords and phrases.”
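The keyword-and-phrase scanning Bringardner describes can be sketched in a few lines. The filenames, document contents, and search phrases below are invented for illustration; real e-discovery tools layer stemming, concept clustering, and relevance ranking on top, but the basic pipeline shape is the same: filter an enormous corpus down to a reviewable set of hits.

```python
# Minimal sketch of keyword-and-phrase filtering over discovery documents.
# All document contents and search terms here are invented for illustration.
documents = {
    "memo_2007_03.txt": "Per our discussion, the merger terms remain confidential.",
    "email_draft.txt": "Lunch on Friday? Also attaching the quarterly slides.",
    "board_notes.txt": "The board reviewed the merger terms and the audit findings.",
}
phrases = ["merger terms", "audit"]

def matches(text, phrases):
    """Return the phrases that appear in the text, case-insensitively."""
    lowered = text.lower()
    return [p for p in phrases if p.lower() in lowered]

# Keep only documents that contain at least one phrase of interest.
hits = {name: found for name, text in documents.items()
        if (found := matches(text, phrases))}
print(hits)
# {'memo_2007_03.txt': ['merger terms'],
#  'board_notes.txt': ['merger terms', 'audit']}
```

Scaled up to billions of pages, this literal substring matching is exactly the kind of first-pass culling that moved junior litigators from the warehouse back to their double monitors.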

It’s easy to feel overwhelmed, inundated, or awestruck by the sheer mass of information available from alternative, and often contending, sources. “Worldwide, an estimated 18,000 Web sites publish breaking stories in at least 40 languages,” writes Adam Rogers. In Tracking the News: A Smarter Way to Predict Riots and Wars he describes how the European Commission has sought to make this daunting mass of information amenable to extractive methods that yield “early warnings about everything from natural disaster to political unrest.”

Not all of the articles are equally perspicuous. Chris Anderson’s The End of Theory: The Data Deluge Makes the Scientific Method Obsolete proves nothing of the sort, though it remains both entertaining and intriguing.

All in all, the Petabyte Age articles are an interesting group, well worth the few minutes you’ll spend investigating them.

