IBM DOT DOM
Enrich your Web site with the lesser-known fruits of the Wikimedia project
Uche Ogbuji (uche@ogbuji.net), Partner, Zepheira, LLC
04 Nov 2008
You know Wikipedia, but do you know of the dozens of related sites that provide user-generated content that is just as valuable? Many of the related sites under the Wikipedia umbrella are very useful to Web developers. Learn how to enrich your information space with resources beyond Wikipedia, including examples of widgets applying data from these sites.
Wikipedia ranks as one of the most popular and well-known Web sites ever. Everyone from kids looking to get a leg up on homework to Web developers tapping the power of user-generated content makes Wikipedia the first stop. But in terms of useful information, Wikipedia is merely the centerpiece of a much larger setting. The Wikimedia Foundation is the organization that runs Wikipedia, and much more. Its home page says: “Imagine a world in which every single human being can freely share in the sum of all knowledge. That’s our commitment.” That’s a big claim, and it takes more than one encyclopedia, however gigantic, to fulfill it.
You’re probably aware that there are numerous language versions of Wikipedia. (I was surprised and gratified to find the respectable number of Wikipedia articles in Igbo, my father tongue.) But do you know how often useful information exists in other languages and has never been translated into English? Have you heard of Wiktionary, Wikinews, Wikibooks, Wikisource, Wikiversity, and the like? Have you considered some of the benefits you could gain for your Web project by tapping into this vast pool of information? In this article I’ll show you around the greater Wikimedia and present code that helps your own site’s users “freely share in the sum of all knowledge”.
Here is a quick summary of the sites in the Wikimedia family, besides the well-known Wikipedia.
Wiktionary is the dictionary counterpart of Wikipedia. Many have expressed skepticism about the practicality of an open content encyclopedia, and the task would seem even more daunting for a much less glamorous endeavor such as a dictionary. The French version is the largest in terms of the number of “good” entries, closely followed by the English one, which has by far the most overall entries and edits. After that it’s a significant drop to the Turkish version, but there are nine language versions with at least 100,000 “good” entries, and many versions with close to that number, adding up to an astonishing body of work. Some of the versions grew by using robots to import entries from free sources. The French Wiktionary, for example, includes many entries copied from old, freely licensed dictionaries, such as the Dictionnaire de l’Académie française. Many Wiktionary entries include translations to other languages, so another trick is to bulk-import translations listed in other language versions. Entries range from stubs with no real content (obviously these are not classified as “good” entries) to rich entries that include etymology, examples of use, pronunciation (in phonetic alphabet and sound files), cross references, synonyms, antonyms, variant grammar forms, translations, and even analyses of how often the word appears in important textual bodies such as Project Gutenberg.
Wikinews is an outlet for articles on news and current events, with the idea that people who are knowledgeable about events or involved in them can collaboratively fill in the relevant pages. The guidelines are that stories should be written from a neutral point of view. Wikinews can include stories, multimedia reports, interviews, and more. Coming soon is Wikimedia Radio, eventually to be a constant streaming audio broadcast of various programs and news, drawn largely from Wikinews and other Wikimedia projects. Naturally, Wikinews coverage tends to be slanted towards regions and topics with many interested contributors, so it cannot claim to be comprehensive. In addition, Wikipedia’s popularity means that its articles are usually updated rapidly, even at a pace suitable for news, which has often stolen thunder from the Wikinews project.
The obvious expansion of an encyclopedia article is to a full book on the topic, and this is the domain of Wikibooks. It includes Wikijunior, a collection of text for children and child education, which might become its own full project soon. Wikiversity, once a subsection of Wikibooks, has already become a full Wikimedia site. Wikiversity encourages learning in a group or community setting, with participants editing learning project pages to accompany any hands-on activities that support understanding. Organized into faculties, it focuses on the many support resources that combine with textbooks in an educational setting. Wikibooks hosts the textbooks and also supports collaborative community development, with outlines of Wiki pages getting expanded piecemeal into full books. Books and faculties range from learning languages to computer science, from organic chemistry to law. Educators in the biological sciences should also take note of Wikispecies, a taxonomic directory of life forms, like a modestly structured Wikipedia of organisms.
Working back from all these secondary information sites to original documents, Wikisource, also known as The Free Library, gathers source texts, annotations, translations, and supporting materials. The texts can be works of fiction or non-fiction, historical records, civic documents, or anything else noteworthy and free from copyright restrictions.
Wikiquote is an open reference site for quotations from history and culture, in multiple languages. There has been some recent controversy about Wikiquote, with some arguing it should be disbanded due to objectionable content and copyright violations. Some think hosting quotations should instead become part of the role of Wikisource. Many others, however, think that if there are any content issues at Wikiquote, the community should first at least try to resolve these before taking the drastic step of disbanding a wiki. Certainly there seems no likelihood that Wikiquote will be disbanded any time soon.
Wikimedia Commons is a companion site for the Wikimedia family that hosts images, video, audio, and any other free media files. It’s a large repository, containing millions of files. It’s also intended to be a cultural repository of such media and seeks to further this through categorization and recognition of notable images.
The breadth and height of activity in the Wikimedia space open up many opportunities for cross-pollination and useful applications beyond what the foundation itself provides. This is the spirit of Web 2.0. Users can take presently unintegrated streams of open data and turn them into fresh applications beyond the imagination or ambition of the original publishers.
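As a starting point for tapping these streams, here is a minimal sketch of how a site might address the same query to any of the sister projects. It assumes each project exposes the standard MediaWiki api.php endpoint at a language-specific hostname. The names PROJECT_DOMAINS and api_url are hypothetical helpers invented for this illustration, not part of any Wikimedia library. The sketch only builds the request URL and performs no network access.

```python
from urllib.parse import urlencode

# Hypothetical lookup table: hostname suffix for each sister project.
# Wikimedia Commons and Wikispecies are left out because they are not
# split into per-language hostnames the way the other projects are.
PROJECT_DOMAINS = {
    "wikipedia": "wikipedia.org",
    "wiktionary": "wiktionary.org",
    "wikinews": "wikinews.org",
    "wikibooks": "wikibooks.org",
    "wikisource": "wikisource.org",
    "wikiquote": "wikiquote.org",
    "wikiversity": "wikiversity.org",
}


def api_url(project, lang, title):
    """Build a MediaWiki API URL asking for basic info about one page.

    project -- a key of PROJECT_DOMAINS, such as "wiktionary"
    lang    -- language code used as the hostname prefix, such as "fr"
    title   -- page title to look up
    """
    domain = PROJECT_DOMAINS[project]
    params = {
        "action": "query",  # read-only query module
        "titles": title,
        "prop": "info",     # basic page metadata
        "format": "json",
    }
    return f"https://{lang}.{domain}/w/api.php?{urlencode(params)}"


if __name__ == "__main__":
    # The same page title, addressed to two different sister projects.
    print(api_url("wiktionary", "fr", "chat"))
    print(api_url("wikiquote", "en", "Chinua Achebe"))
```

Keeping the project and language as parameters means a widget can fall back from one language version to another, or from one sister project to another, without changing the code that issues the request.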
