It’s very disturbing to open the paper one day to learn that the government is collecting every bit of data it can from the phone records of private citizens. And it’s even more disturbing to open the paper the following day to learn that the feds are also snooping through all of our Internet records.
Disturbing, if not terribly surprising. (And the fact that all of this is probably — at least technically — legal makes this more disturbing, not less.)
Maybe it’s time we stopped arguing that our personal data is “private” and instead started referring to it as “classified.” Privacy doesn’t seem to garner much respect anymore, but the government still regards “classified” as a sacrosanct category.
The massive scale and scope of this data collection makes it even more disturbing but also, perversely, somewhat reassuring. What I mean is that it’s very disturbing to learn that the government is collecting every bit of phone and Internet data it can collect on everyone. But it would be even more disturbing to learn that the government is collecting every bit of phone and Internet data it can collect about you: you personally and you exclusively.
Which brings us to one of the many questions I have about all of this: Are the government agencies collecting this massive undifferentiated ocean of data really capable of putting it to any good use?
“Data mining” has been discussed for decades now, but the reality has never lived up to the hype. A lot of the folks claiming to have mastered the dark arts of data mining are little more than the computing equivalents of diviners with dowsing rods. Sorting and filtering and aggregating the massive flood of data being collected from phone and Internet records is no small task. I’m sure that the Pentagon, the CIA or the NSA has access to the raw computing power needed for such a project, but it requires more than just supercomputers. It also takes top-notch programmers to create the algorithms needed to sort through so much data and to find the signal amongst all that noise. And beyond that, it takes people with the wisdom, experience and know-how to define and recognize the difference between signal and noise, or just to know what they’re looking for. This isn’t like looking for a needle in a haystack. It’s like looking for a particular piece of hay in a haystack.
I’m not saying that the Pentagon, NSA and CIA are especially incompetent, and therefore that they aren’t capable of sorting through all this data in a meaningful way. I’m suggesting, rather, that no one may be capable of sorting through all this data in a meaningful way.
“HQ is watching everything we do,” one of my co-workers says all the time, pointing nervously up at the CCTV cameras on the ceiling of the store/warehouse.
“HQ is recording everything we do,” I tell him. “No one is watching.”
The company has more than 2,200 stores all over the world, with dozens of cameras in each store, all operating 24/7/365. If something happens in any of those stores, they can go back to look at the CCTV footage to get a better idea of what happened, but it would be impossible for the company to “watch” everything that all of its cameras are recording. It would be pointless even to attempt to do so.
I realize that the NSA is better at this than the retail chain I work for — and that it’s far more intent on collecting and “mining” data. A reader at Andrew Sullivan’s Dish site paints a grim picture of the kind of details that might be traced from mining “metadata” from cell phone records — all of which I’m sure is technologically possible (and more efficient than the old methods of doing the same thing with binoculars, unmarked cars and shoe leather). And Bruce Schneier — who knows more about this stuff than just about anyone — paints a truly frightening picture of the NSA’s ambitious plans for near universal data-mining, including its construction of an “enormous computer facility in Utah to store all this data, as well as faster computer networks to process it all.” These agencies clearly possess the capability to collect and to process huge amounts of data, but that’s still not the same thing as them knowing how or why or what to do with it all.
“That is tens of billions of phone calls and for the love of god,” Simon writes. “How many agents do you think the FBI has? How many computer-runs do you think the NSA can do?”
The ugly, vacuous graphic above comes from a snarky post by Matt Yglesias in which he notes that “well-run organizations wouldn’t rely on this kind of garbage in their internal presentations.” But it’s even worse than that, actually: the graphic comes from a PowerPoint presentation. That’s what was leaked, alerting the public to the scope of this massive metadata-collecting effort: “a top secret 41-slide PowerPoint presentation.” I’m willing to accept, in theory, that some massive, secretive, nefarious agency might be capable of meaningfully sifting through all of the vast volume of data the government is now collecting. But it’s hard to reconcile such a theoretically omnicompetent agency with the kind of outfit that would use PowerPoint, or that would regard an MS Office application as something appropriate for “top secret” use.
My point here is not to say, “Don’t worry, they don’t know what they’re doing,” but rather to suggest that because they may not know what they’re doing, we might want to worry differently. A secretive, unaccountable, omnicompetent agency poses one kind of threat to civil liberties. A secretive, unaccountable, semi-competent agency poses another.
If the NSA, CIA, et al., are actually more capable than I suspect of putting all this data to use for “national security” and preventing terrorism and all that, then the much-discussed “debate” about the “balance” between security and privacy really would be an appropriate conversation.
But what if they’re not actually capable of sifting and “mining” all this data? What if they’re simply collecting more data than they’ll ever be able to make sense of? What if those Utah supercomputers wind up being little more than glorified floppy disks filled with the unsearched and unsearchable records of everybody’s phone calls to everybody else?
In that case, we’re talking about agencies which are collecting all this data for little reason other than because they can, without any legitimate “national security” pretext — without any excuse.