User:JohnWinn

From Wikipedia, the free encyclopedia

Being new to Wikipedia, I am intrigued by how the quality of articles remains relatively high despite editing freedoms that leave them open to vandalism. This has set me thinking about ways to automatically protect high-quality content without putting barriers in the way of contributors.

Schemes for automatic protection of content could be based on

  • a reputation system where each user has a score indicating the expected quality of their edits. This could either be explicit (where other users vote on the quality of your edits) or implicit (based on how many of your edits are reverted or deleted, and how many page views they get).
  • the age of the content, either in terms of time or number of views. The idea here is that the more people who have eyeballed a piece of content without editing it, the more likely it is to be high quality.
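
As a rough illustration of the implicit variant of the first idea, the sketch below turns a user's edit counts into a score between 0 and 1. The class name, the method name and the smoothing formula are all assumptions made for the example, not part of any existing Wikipedia mechanism; page views are left out entirely.

```java
public class ImplicitReputation {

    /**
     * Hypothetical implicit reputation: the smoothed fraction of a user's
     * edits that were NOT reverted. The +1 / +2 terms (Laplace smoothing,
     * an assumption of this sketch) keep a brand new user at a neutral 0.5
     * instead of an extreme score.
     */
    static double score(int totalEdits, int revertedEdits) {
        if (totalEdits < 0 || revertedEdits < 0 || revertedEdits > totalEdits) {
            throw new IllegalArgumentException("edit counts are inconsistent");
        }
        int surviving = totalEdits - revertedEdits;
        return (surviving + 1.0) / (totalEdits + 2.0);
    }
}
```

With this formula a user with no edits scores 0.5, and the score moves towards 1 or 0 only as evidence accumulates, so a single reverted edit does not condemn a newcomer.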

To investigate the second point, I have written a prototype tool which takes the edit history of a page and uses it to render the page showing the age of each word. The tool runs offline at the moment, but some examples are below. The tool would hopefully make it easier to spot recent changes for those looking to undo vandalism and also help the reader judge the quality of the content.

Examples of pages showing the age of the content

About the content age tool

The tool is written in Java and uses Special:Export to retrieve an XML version of an article's edit history. It then performs a diff between the latest version of the article and each of the previous versions. Each word is marked with the revision in which it first appeared, and the result is used to create the marked-up article (as shown in the above examples).
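
The diff-and-mark step can be sketched roughly as follows. This is not the tool's actual code: the class and method names are made up for illustration, the revision texts are assumed to have already been extracted from the exported XML (oldest first), words are assumed to be whitespace-separated, and the diff is assumed to be a plain longest-common-subsequence match over words.

```java
import java.util.Arrays;
import java.util.List;

public class WordAge {

    /** Splits text into whitespace-separated words (an assumption of this sketch). */
    static String[] tokenize(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    /** Flags the words of `latest` that lie on a longest common subsequence with `older`. */
    static boolean[] matchedInLatest(String[] older, String[] latest) {
        int m = older.length, n = latest.length;
        // lcs[i][j] = LCS length of older[i..] and latest[j..]
        int[][] lcs = new int[m + 1][n + 1];
        for (int i = m - 1; i >= 0; i--) {
            for (int j = n - 1; j >= 0; j--) {
                lcs[i][j] = older[i].equals(latest[j])
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }
        // Walk the table forwards to recover one optimal alignment.
        boolean[] matched = new boolean[n];
        int i = 0, j = 0;
        while (i < m && j < n) {
            if (older[i].equals(latest[j])) {
                matched[j] = true;
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                i++;
            } else {
                j++;
            }
        }
        return matched;
    }

    /**
     * For each word of the latest revision, returns the index of the earliest
     * revision whose diff against the latest revision still contains that word.
     * `revisions` must be non-empty and ordered oldest first.
     */
    static int[] firstRevision(List<String> revisions) {
        int last = revisions.size() - 1;
        String[] latest = tokenize(revisions.get(last));
        int[] first = new int[latest.length];
        Arrays.fill(first, last); // by default a word is as new as the latest revision
        for (int r = last - 1; r >= 0; r--) {
            boolean[] matched = matchedInLatest(tokenize(revisions.get(r)), latest);
            for (int j = 0; j < latest.length; j++) {
                if (matched[j]) {
                    first[j] = r; // an older revision already had this word
                }
            }
        }
        return first;
    }
}
```

For the three revisions "the cat sat", "the black cat sat" and "the black cat sat down", `WordAge.firstRevision` returns `[0, 1, 0, 0, 2]`: "black" is dated to the second revision and "down" to the third. One consequence of comparing against every earlier version is that a word which was deleted and later restored is dated to its earliest appearance rather than to the restoring edit.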