Wednesday, January 09, 2013

Find: How to index Twitter? Library of Congress won't launch its Twitter archive soon

All sorts of interesting problems here: visualization, search, sentiment analysis, summarization, big data ...

Library of Congress won't launch its Twitter archive anytime soon

In 2010, the Library of Congress announced plans to collect every public Twitter post in a single searchable archive, as part of a bold attempt to create a new repository of digital information. Two years later, however, the project has yet to get off the ground, primarily because the Library hasn't come up with an efficient way to harness such a massive amount of data.

On Friday, the LOC published a white paper explaining the delay, which it attributes to a lack of available software and constrained budgets. The organization has already created a private archive, but it remains virtually unsearchable. According to the library, a single query on its current system "could take 24 hours" to yield results. Fixing this problem, it says,...
