Mirror the Wikipedia For a Torrent Of Data On Tap


A nice write-up with links to the latest MediaWiki dump of Wikipedia and the process for loading the vast encyclopedia into a MySQL database on a local machine.

Warning: these are HUGE datasets, so be ready for a lengthy download.

Mirror the Wikipedia



Without images and fulltext searching of article text, it weighs in at 7.5 GiB (20061130 dump). If you add the fulltext article search, it’s 23 GiB on your hard drive. That’s a bit much for a laptop (at least mine), but a desktop could handle it easily.
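
Before kicking off the download, it may be worth checking whether the target drive can actually hold the database. Here is a minimal Python sketch that compares free disk space against the two sizes quoted above. The `fits` and `report` helpers and the 20% headroom factor are my own assumptions for illustration, not part of the linked write-up, and the compressed dump plus MySQL's working space will need room on top of these figures.

```python
import shutil

GIB = 1024 ** 3

# On-disk sizes quoted above for the 20061130 dump.
SIZE_WITHOUT_FULLTEXT_GIB = 7.5
SIZE_WITH_FULLTEXT_GIB = 23.0


def fits(free_bytes, required_gib, headroom=1.2):
    """Return True if free_bytes covers required_gib plus a safety margin.

    The default headroom of 1.2 (20% extra) is an arbitrary assumption.
    """
    return free_bytes >= required_gib * GIB * headroom


def report(path="."):
    """Summarize whether each variant of the mirror would fit at `path`."""
    free = shutil.disk_usage(path).free
    return {
        "free_gib": round(free / GIB, 1),
        "without_fulltext": fits(free, SIZE_WITHOUT_FULLTEXT_GIB),
        "with_fulltext": fits(free, SIZE_WITH_FULLTEXT_GIB),
    }


if __name__ == "__main__":
    print(report())
```

Pass the directory where MySQL keeps its data files as `path` to check the right partition.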


About this Entry

This page contains a single entry by klsh published on October 12, 2007 2:34 PM.
