Google has announced that it will soon sell a 6-DVD set recording the relative popularity of a billion five-word sequences it found in over a trillion words of text from the web. This is a "let the flowers bloom" initiative: Google freely admits it has no idea what wonderful uses this data may serve, and it can't wait to spread it around and find out. This sort of textual data is quite valuable, all the more so because of the enormous amount of text it's based on.
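For anyone wondering what "relative popularity of word sequences" looks like in practice, here is a toy sketch in Python. It is only my illustration, not Google's method or file format: the function names `ngram_counts` and `relative_popularity` are made up for this post, the sample sentence is invented, and it uses three-word sequences so that something actually repeats in such a tiny text.

```python
from collections import Counter


def ngram_counts(text, n=5):
    """Count every run of n consecutive words in text (hypothetical helper)."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def relative_popularity(counts):
    """Turn raw counts into each sequence's share of all sequences seen."""
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()}


# Invented sample text; the real data set is built from vastly more words.
sample = "the cat sat on the mat and the cat sat on the rug"
counts = ngram_counts(sample, n=3)
shares = relative_popularity(counts)

# Print each three-word sequence with its raw count and its share.
for gram, count in counts.most_common():
    print(" ".join(gram), count, round(shares[gram], 3))
```

Scale the same idea up to five-word sequences and a trillion words and you get a sense of why the result needs six DVDs.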
Fortunately, I can already tell you one of the most profitable uses that will be made of this data: programs that generate spam will use it to produce much more realistic emails that slip past spam filters. I can hardly wait to read some of this new, semi-literate spam. But I do think Google is right to release the data. World, surprise me!
Friday, August 04, 2006