Following on the story about the NSA's recently disclosed data-mining project, Tom Owad launched his own data mining program targeting folks with a common first name who had Amazon Wishlists.
He pulled down all 260,000 lists using a couple of old computers, a few lines of code and two DSL lines. Then he searched for folks who liked books by Michael "I hate lighters" Moore and Rush "Jail is for poor drug users" Limbaugh.
(I wonder if Owad started this before or after the very predictable debunking of the "Homeland Security monitors inter-library loans of Mao's Red Book" story?)
He then mashed up the hits with, oh yes, you Web 2.0 kids saw this coming, Google Maps.
All the tools used in this project are standard and free. The services, likewise, are all free. The technical skills required to implement this project are well within the abilities of anybody who has done any programming. The network connection used to download these files was a standard home DSL connection. The computer that processed the data was a 1.5 GHz PowerBook G4. The operating system is Mac OS X 10.4, though everything could have been done just as easily with Linux (and probably with Windows). Not a penny was spent in the writing of this article, just 30 hours of time.
This is what's possible with publicly available information, but imagine if one had access to Amazon's entire database - which still contains every sale dating back to 1999 by the way. Under Section 215 of the Patriot Act, the FBI can require Amazon to turn over its records, without probable cause, for an "authorized investigation . . . to protect against international terrorism or clandestine intelligence activities." Amazon is forbidden to disclose that they have turned over any records, so that you would never know that the government is keeping records of your book purchases. And obviously it is quite simple to cross-reference this info with data available in other databases.
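To give a sense of how little code the "search the lists, then group the hits by place" step takes, here is a minimal Python sketch. It is not Owad's code: the records are made up, and the field names (`name`, `city`, `state`, `authors`) and both helper functions are hypothetical stand-ins for whatever a real download would contain.

```python
from collections import defaultdict

# Made-up records standing in for downloaded wishlists.
# The field names are assumptions for illustration, not Amazon's format.
WISHLISTS = [
    {"name": "Pat A.", "city": "Springfield", "state": "IL",
     "authors": ["Michael Moore", "Jane Austen"]},
    {"name": "Pat B.", "city": "Portland", "state": "OR",
     "authors": ["Rush Limbaugh"]},
    {"name": "Pat C.", "city": "Springfield", "state": "IL",
     "authors": ["Rush Limbaugh", "Michael Moore"]},
    {"name": "Pat D.", "city": "Austin", "state": "TX",
     "authors": ["Tom Clancy"]},
]


def find_readers(wishlists, author):
    """Return the records whose wishlist includes a book by `author`."""
    wanted = author.casefold()
    return [w for w in wishlists
            if any(a.casefold() == wanted for a in w["authors"])]


def group_by_location(records):
    """Group matching names by (city, state), the key a map mashup would geocode."""
    places = defaultdict(list)
    for record in records:
        places[(record["city"], record["state"])].append(record["name"])
    return dict(places)


if __name__ == "__main__":
    hits = find_readers(WISHLISTS, "Michael Moore")
    for place, names in sorted(group_by_location(hits).items()):
        print(place, names)
```

The grouping key is the city and state, since that is the sort of value one would hand to a mapping service. Joining the same records against another database would be one more dictionary lookup on a shared field.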
Good thing places like Google and ISPs don't keep track of your searches and internet travels for years, or somebody with a self-issued subpoena might decide to ask for that information from them in bulk, and do some mashing up on their own.
It's also a very visual illustration of the implications of 30,000 National Security Letters a year and the Bush Administration policy of allowing all of that information to go into a central database that can even be shared with private companies.
For that latter story see Barton Gellman's November story in the Washington Post:
The FBI now issues more than 30,000 national security letters a year, according to government sources, a hundredfold increase over historic norms. The letters -- one of which can be used to sweep up the records of many people -- are extending the bureau's reach as never before into the telephone calls, correspondence and financial lives of ordinary Americans.
Issued by FBI field supervisors, national security letters do not need the imprimatur of a prosecutor, grand jury or judge. They receive no review after the fact by the Justice Department or Congress. The executive branch maintains only statistics, which are incomplete and confined to classified reports. The Bush administration defeated legislation and a lawsuit to require a public accounting, and has offered no example in which the use of a national security letter helped disrupt a terrorist plot.
The burgeoning use of national security letters coincides with an unannounced decision to deposit all the information they yield into government data banks -- and to share those private records widely, in the federal government and beyond. In late 2003, the Bush administration reversed a long-standing policy requiring agents to destroy their files on innocent American citizens, companies and residents when investigations closed. Late last month, President Bush signed Executive Order 13388, expanding access to those files for "state, local and tribal" governments and for "appropriate private sector entities," which are not defined.
Posted by Ryan Singel at January 4, 2006 05:19 PM
"Good thing places like Google and ISPs don't keep track of your searches and internet travels for years, or somebody with a self-issued subpoena might decide to ask for that information from them in bulk, and do some mashing up on their own."
Yeah, I've thought of that too. Of course you are being facetious; G definitely, and perhaps ISPs, do have a long record.
Recently I used Wish Lists to research somebody. I had a client I needed to buy a Christmas gift for (he spends lots), but I didn't know much about him or his interests.
So first thing I did was google him, figuring if I can get some blog entries... resume etc. then I would have some interests. Nothing. So I went to LinkedIn. I did get a very brief entry there.
So then I went to Amazon Wish Lists. I figured...everybody in my family has wish lists, so maybe he does. Sure enough he did -- FROM 3 YEARS AGO.
A while ago, if you recall, Orkut was data-mined too.
Posted by: Chris at January 6, 2006 01:23 AM
