The revelation that the government gets daily reports from Verizon that include the metadata on every single communication sent through its network shows the dangers of data mining, something clearly forbidden by both the 4th Amendment and statutory law. Benjamin Wittes gives one very good reason why:
We have only the order itself, not the application that underlies it, but I have a hard time imagining the application that could have produced it. Section 215, codified in law as 50 U.S.C. § 1861, allows the government to apply to the FISA court for an order for production “of any tangible things . . . for an investigation to obtain foreign intelligence information not concerning a United States person or to protect against international terrorism or clandestine intelligence activities. . . .” To acquire such an order, the government does not have to do much—just as it doesn’t have to do much in a criminal investigation: It merely has to offer, in pertinent part, “a statement of facts showing that there are reasonable grounds to believe that the tangible things sought are relevant to an authorized investigation . . . to obtain foreign intelligence information not concerning a United States person or to protect against international terrorism or clandestine intelligence activities.”
So I’m trying to imagine what conceivable set of facts would render all telephony metadata generated in the United States “relevant” to an investigation, presumably of the bombing. This would include, of course, all telephony metadata that, as matters turned out, postdates the killing of one bomber and the capture of the other—though there’s no way the government could have known that when the application was submitted. And it would also include all telephony metadata that postdates the government’s conclusion that the Tsarnaev brothers were apparently not agents of any foreign terrorist group. But even if this were not the case, how is it possible that all calls to, say, Dominos Pizza in Peoria, Illinois or all calls over a three month period between two small businesses in Juneau, Alaska would be “relevant” to an investigation of events in Boston—even if we assume that the FBI did not know whom it was investigating in the Boston area and did not know whom that unknown person was communicating with?
I think the only possible answer to this question is that a dataset of this size could be “relevant” because there are ways of analyzing big datasets algorithmically to yield all kinds of interesting things—but only if the dataset is known to include all of the possibly-relevant material. The data may not be relevant, but the dataset is relevant because it is complete—and therefore is sure to include any communications by whomever the bombers turn out to be.
Orin Kerr spells out another problem with the program:
If the order is what it appears to be, then the order points to a problem in Section 1861 that has not been appreciated. Section 1861 says that the “things” that are collected must be relevant to a national security investigation or threat assessment, but it says nothing about the scope of the things obtained. When dealing with a physical object, we naturally treat relevance on an object-by-object basis. Sets of records are different. If Verizon has a database containing records of billions of phone calls made by millions of customers, is that database a single thing, millions of things, or billions of things? Is relevance measured by each record, each customer, or the relevance of the entire database as a whole? If the entire massive database has a single record that is relevant, does that make the entire database relevant, too? The statute doesn’t directly answer that, it seems to me. But certainly it’s surprising — and troubling — if the Section 1861 relevance standard is being interpreted at the database-by-database level.
That isn’t just troubling; it’s terrifying.