How does Wikipedia implement the "edit this section" feature for its articles, wherein a user can edit just a section of an article, rather than the whole article? I've tried crawling through mediawiki's code by myself, but it's a bit dense for me to understand. Here's my guess (but only a guess):

The user clicks [edit] in an article. This links to the regular edit page, but with an additional parameter passed via GET in the URL specifying which section to edit. Without this extra parameter, MediaWiki would simply present the user with a form for editing the entire page. But specifying a section to edit causes MediaWiki to extract ONLY that section and present it for editing.

What stumps me is how MediaWiki parses out individual sections. From what I understand, MW doesn't store the sections individually - it stores each version of the page in its ENTIRETY as one big block of text (in addition to boatloads of metadata). Does MW simply look for H1, H2, H3, ... headings, and use those to split up the text into regions when it renders the page? And when a user saves a revised section, does it look at the current version, re-parse the text into sections, and just "inject" the new section text into a copy of the current version, which it then saves as a NEW version?

I assume my understanding of MediaWiki is grossly simplified, but I'm just trying to get a rough idea.

Thanks!

+3  A: 

This might be a clue, from http://en.wikipedia.org/wiki/Help:Section. Sections are marked up like this:

==Section==

===Subsection===

====Sub-subsection====
  • Using the same heading more than once on a page causes problems.
  • When a section with a duplicate name is edited, the edit history and summary will be ambiguous as to which section was edited.
  • When saving the page after a section edit, the editor's browser may navigate to the wrong section.

Sections can be separately edited by clicking special edit links labeled "[edit]" by the heading, or by right-clicking on the section heading, depending on the preferences set. This is called the "section editing feature" (Preferences -> Editing -> "Enable section editing via [edit] links"). The section editing feature takes you to an edit page via a URL such as

http://en.wikipedia.org/w/index.php?title=Help:Section&action=edit&section=2

Note that here section numbers are used, not section titles; subsections have a single number, e.g. section 2.1 may be numbered 3, section 3 is then numbered 4, etc. You can also directly type in such URLs in the address bar of your browser.
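To make the numbering concrete, here is a small Python sketch (not MediaWiki's own code, which is PHP). It builds an edit URL of the form shown above and numbers headings in document order, so that a subsection simply gets the next number. The function names and the heading regex are my own assumptions for illustration; the real parser handles many cases (comments, nowiki blocks, templates) that this ignores.

```python
import re
from urllib.parse import urlencode

# Assumed simplification: a heading is a line wrapped in 1-6 matching '=' signs.
HEADING = re.compile(r"^(={1,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def section_edit_url(base, title, section):
    """Build an edit URL that carries the section number as a GET parameter."""
    query = urlencode({"title": title, "action": "edit", "section": section}, safe=":")
    return f"{base}?{query}"


def number_headings(wikitext):
    """Return (section number, heading level, title) in document order."""
    return [
        (number, len(match.group(1)), match.group(2))
        for number, match in enumerate(HEADING.finditer(wikitext), start=1)
    ]


page = (
    "Intro text.\n"
    "==First==\na\n"
    "==Second==\nb\n"
    "===Second, part one===\nc\n"
    "==Third==\nd\n"
)

print(section_edit_url("http://en.wikipedia.org/w/index.php", "Help:Section", 2))
for number, level, title in number_headings(page):
    print(number, "=" * level, title)
```

In this sample, "Second, part one" (section 2.1 in the TOC) gets number 3 and "Third" (section 3 in the TOC) gets number 4, which matches the numbering described above.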

So it looks like the parser counts the headings in the order they appear on the page (the same order as the TOC) and uses the = heading markup to find the text of the requested section and load just that text into the editor.
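A rough Python sketch of that guess, covering both halves of the question: pulling section N out of the full wikitext, and injecting the edited section back into a copy of the current text. This is an illustration under my own assumptions, not MediaWiki's implementation. In particular, the helper names are made up, and I am assuming that a section runs from its heading up to the next heading of the same or a higher level (so it includes its subsections), with section 0 being the text before the first heading.

```python
import re

# Assumed simplification: a heading is a line wrapped in 1-6 matching '=' signs.
HEADING = re.compile(r"^(={1,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def section_span(wikitext, number):
    """Return (start, end) character offsets of section `number`."""
    heads = list(HEADING.finditer(wikitext))
    if number == 0:
        # Section 0: everything before the first heading.
        return 0, heads[0].start() if heads else len(wikitext)
    if not 1 <= number <= len(heads):
        raise ValueError(f"no section {number}")
    target = heads[number - 1]
    level = len(target.group(1))
    end = len(wikitext)
    for later in heads[number:]:
        # Stop at the next heading that is not nested under the target.
        if len(later.group(1)) <= level:
            end = later.start()
            break
    return target.start(), end


def get_section(wikitext, number):
    """Extract only the requested section, as shown in the edit form."""
    start, end = section_span(wikitext, number)
    return wikitext[start:end]


def replace_section(wikitext, number, new_text):
    """Splice edited section text into a copy of the current page text."""
    start, end = section_span(wikitext, number)
    if end < len(wikitext) and not new_text.endswith("\n"):
        new_text += "\n"  # keep the following heading on its own line
    return wikitext[:start] + new_text + wikitext[end:]


page = (
    "Intro.\n"
    "==First==\na\n"
    "==Second==\nb\n"
    "===Second, part one===\nc\n"
    "==Third==\nd\n"
)

print(get_section(page, 3))
print(replace_section(page, 3, "===Second, part one===\nedited"))
```

The result of replace_section would then be saved as a new revision of the whole page; nothing here stores sections separately.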

Here are some of the tables used:

Page Table (http://www.mediawiki.org/wiki/Manual:Page_table) - each page in a MediaWiki installation has an entry here, which identifies it by title.

Revision Table - holds metadata for every edit done to a page within the wiki. Every edit of a page creates a revision row, which holds information such as the user who made the edit, the time at which the edit was made, and a reference to the new wikitext in the text table.

Text Table - holds the wikitext of individual page revisions.

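The contents of pages are stored as BLOBs, but that is only the column type: what is stored is still the complete wikitext of each revision. So as far as I can tell the sections are not stored separately, and the text has to be loaded and scanned for the = heading markup each time a section is requested.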
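To show how the three tables fit together, here is a heavily simplified sketch using Python's built-in sqlite3. The schema is an assumption for illustration only: it is loosely modeled on the tables described above, the column names and helper functions are mine, and the real tables have many more columns.

```python
import sqlite3


def open_wiki():
    """Create an in-memory database with a toy page/revision/text schema."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE page (page_id INTEGER PRIMARY KEY,
                           page_title TEXT UNIQUE,
                           page_latest INTEGER);
        CREATE TABLE revision (rev_id INTEGER PRIMARY KEY,
                               rev_page INTEGER,
                               rev_user TEXT,
                               rev_text_id INTEGER);
        CREATE TABLE "text" (old_id INTEGER PRIMARY KEY,
                             old_text BLOB);
    """)
    return db


def save_revision(db, title, user, wikitext):
    """Store the full page text and record a new revision pointing at it."""
    text_id = db.execute(
        'INSERT INTO "text" (old_text) VALUES (?)', (wikitext.encode("utf-8"),)
    ).lastrowid
    row = db.execute(
        "SELECT page_id FROM page WHERE page_title = ?", (title,)
    ).fetchone()
    if row is None:
        page_id = db.execute(
            "INSERT INTO page (page_title) VALUES (?)", (title,)
        ).lastrowid
    else:
        page_id = row[0]
    rev_id = db.execute(
        "INSERT INTO revision (rev_page, rev_user, rev_text_id) VALUES (?, ?, ?)",
        (page_id, user, text_id),
    ).lastrowid
    db.execute("UPDATE page SET page_latest = ? WHERE page_id = ?", (rev_id, page_id))
    return rev_id


def current_text(db, title):
    """Follow page -> latest revision -> text row and decode the BLOB."""
    row = db.execute(
        'SELECT t.old_text FROM page p '
        'JOIN revision r ON r.rev_id = p.page_latest '
        'JOIN "text" t ON t.old_id = r.rev_text_id '
        'WHERE p.page_title = ?',
        (title,),
    ).fetchone()
    return row[0].decode("utf-8")


db = open_wiki()
save_revision(db, "Help:Section", "alice", "==Section==\nfirst draft\n")
save_revision(db, "Help:Section", "bob", "==Section==\nsecond draft\n")
print(current_text(db, "Help:Section"))
```

Each save writes the whole page text again and adds a revision row; the older text rows stay in place, which is what makes the page history possible.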

Hope this helps.

Todd Moses