Internet Archive project helps restore millions of broken Wikipedia links

By Ron Miller

The web, it turns out, is a fragile place. Companies, governments, educational institutions, individuals and organizations put up and take down sites all the time. The problem is that the web has become a system of record, and when links don’t work because pages no longer exist, the record is incomplete. With the help of the Internet Archive, Wikipedia has been able to recover 9 million broken links and help solve that problem for at least one knowledge base.

The Internet Archive captures a copy of as many websites as it can to build an archive of the web. If you know what you’re looking for, you can search its Wayback Machine archive of billions of web pages, dating back to the earliest days of the World Wide Web. The problem is you have to know what you’re searching for, and that can be problematic.

A Wikipedia contributor named Maximilian Doerr brought the power of software to bear on the problem. He built a program called IAbot, short for Internet Archive bot. The Internet Archive also credits Stephen Balbach, who worked with Doerr tracking down archives and bad links.

First, IAbot identified broken links: pages that returned a 404, or "page not found," error. Once the bot identified a broken link, it searched the Internet Archive for the matching page, and when it found a copy, it linked to that instead, preserving the link to the content even though the original page or website was no longer available.
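The core loop described above can be sketched in a few lines. This is not IAbot's actual implementation; it is a minimal illustration using the Wayback Machine's public availability API (`https://archive.org/wayback/available`), with the HTTP fetch made injectable so the lookup logic can be exercised without network access:

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

WAYBACK_API = "https://archive.org/wayback/available?url="


def find_archived_copy(url, fetch=None):
    """Return the closest Wayback Machine snapshot URL for `url`, or None.

    `fetch` takes an API URL and returns the decoded JSON response; by
    default it performs a real HTTP request against the availability API.
    """
    if fetch is None:
        def fetch(api_url):
            with urlopen(api_url) as resp:
                return json.load(resp)

    data = fetch(WAYBACK_API + quote(url, safe=""))
    # The API returns {"archived_snapshots": {"closest": {...}}} when a
    # snapshot exists, and an empty "archived_snapshots" object otherwise.
    snapshot = data.get("archived_snapshots", {}).get("closest") or {}
    return snapshot.get("url") if snapshot.get("available") else None
```

A repair pass like IAbot's would then check each outbound link, and for any that return a 404, replace the dead URL with the snapshot URL this lookup returns.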

That software helped fix 6 million links across 22 Wikipedia sites. Wikipedia volunteers fixed an additional 3 million links by manually linking to the correct Internet Archive page — an astonishing amount of preservation work, and one that helps maintain the integrity of the web as a system of record.

Source: TechCrunch
