The Library of Congress Is Almost Finished Archiving 170 Billion of Your Best Tweets


If you'd forgotten your drunken or embarrassing tweets from 2010, bad news: the Library of Congress is reportedly weeks away from finishing its project to archive the roughly 170 billion tweets sent between Twitter's founding in 2006 and April 2010, when the initiative was announced. Why is it archiving your tweets? All in the name of science and research.

Twitter is a new kind of collection for the Library of Congress, but an important one to its mission of serving both Congress and the public. As society turns to social media as a primary method of communication and creative expression, social media is supplementing and in some cases supplanting letters, journals, serial publications and other sources routinely collected by research libraries.

Archiving and preserving outlets such as Twitter will enable future researchers access to a fuller picture of today's cultural norms, dialogue, trends and events to inform scholarship, the legislative process, new works of authorship, education and other purposes.

Only public tweets that were published six months ago or longer will be included in the library, which has so far received over 400 requests for information from researchers around the world.

Sounds like an interesting and useful enough project, right? Well, don't get too excited yet; it reportedly takes a full 24 hours to perform just one search, which seems mind-boggling and impossible (or, as the Library puts it, "an inadequate situation") until you consider the nightmarishly labyrinthine manner in which the Library is organizing the information.

Gnip, the designated delivery agent for Twitter, receives tweets in a single real-time stream from Twitter. Gnip organizes the stream of tweets into hour-long segments and uploads these files to a secure server throughout the day for retrieval by the Library. When a new file is available, the Library downloads the file to a temporary server space, checks the materials for completeness and transfer corruption, captures statistics about the number of tweets in each file, copies the file to tape, and deletes the file from the temporary server space.
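For the curious, the per-file routine described above can be pictured with a minimal Python sketch. None of this is the Library's actual code: the function name `ingest_hourly_file`, the SHA-256 checksum, the one-tweet-per-line file layout, and the plain directory standing in for tape are all assumptions made for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def ingest_hourly_file(temp_path, expected_sha256, archive_dir):
    """Verify, count, archive, and clean up one hour-long tweet file."""
    temp_path = Path(temp_path)
    data = temp_path.read_bytes()

    # Check for transfer corruption (the checksum scheme is an assumption).
    if hashlib.sha256(data).hexdigest() != expected_sha256:
        raise ValueError(f"checksum mismatch for {temp_path.name}")

    # Capture statistics: one tweet per non-empty line (assumed file format).
    tweet_count = sum(1 for line in data.splitlines() if line.strip())

    # Copy to long-term storage; a directory stands in for tape here.
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(temp_path, archive_dir / temp_path.name)

    # Free the temporary server space.
    temp_path.unlink()
    return tweet_count
```

Note that nothing in this routine builds an index of what the tweets say. The files are simply verified, counted, and shelved, which goes some way toward explaining why finding anything later is so slow.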
…

The Library has assessed existing software and hardware solutions that divide and simultaneously search large data sets to reduce search time, so-called "distributed and parallel computing". To achieve a significant reduction of search time, however, would require an extensive infrastructure of hundreds if not thousands of servers. This is cost-prohibitive and impractical for a public institution.
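To make "divide and simultaneously search" a little more concrete, here is a toy Python sketch. It is only an illustration under loud assumptions: threads on a single machine stand in for the hundreds of servers the Library mentions, each in-memory list stands in for one hour-long file, and the names `search_segment` and `parallel_search` are invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def search_segment(segment, keyword):
    """Scan one segment of tweets and return those containing the keyword."""
    needle = keyword.lower()
    return [tweet for tweet in segment if needle in tweet.lower()]


def parallel_search(segments, keyword, workers=4):
    """Search all segments at once and merge the matches in segment order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_segment = pool.map(lambda seg: search_segment(seg, keyword), segments)
        return [tweet for matches in per_segment for tweet in matches]


# Toy data: each inner list plays the role of one hour-long file.
hourly_segments = [
    ["Voting today!", "lunch was great"],
    ["Flu vaccine clinic opens at 9", "stocks are up"],
    ["Got my vaccine this morning"],
]
print(parallel_search(hourly_segments, "vaccine"))
```

The catch is the one the Library points out: each worker still has to read its whole share of the data, so the best you can hope for is that search time shrinks roughly in proportion to the number of machines you throw at it. With an archive this size, that means a lot of machines.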

Yikes. If you're one of those 400 curious researchers, who are interested in such noble pursuits as "patterns in the rise of citizen journalism and interest in elected officials' communications to tracking vaccination rates and predicting stock market activity," you'll probably need to wait a while longer to get actual, usable information. Or, if you don't have the patience, just hire a small army of interns.

[via Mashable//Image via Shutterstock]
