Wickedly Short Wikis

Author: budu

Well well well, let's be a little more serious this time around and talk about code. This is a technical blog after all (hum... interesting, a stock installation of Emacs flyspell mode doesn't know about words like wiki, blog or even flyspell), we ain't gonna talk about nothing!

So, today, I'll talk about wikis or, to be more precise, the software behind them. We'll dissect a very simple implementation, just to see what they are all about. Basically, a wiki is a very simple concept: reduced to its core, it's the combination of [four principles](http://c2.com/cgi/wiki$?WikiPrinciples):

- Automatic link generation
- Editable content
- Simplified formatting
- Backlinks

Upon realizing this, hordes of hackers from around the world flocked to write the shortest one for glory and fame. And thus the [shortest wiki contest](http://c2.com/cgi/wiki$?ShortestWikiContest) was born. As everybody expected, the *winner* used ~~Perl~~ a mix of Perl and shell script. This all happened a long time ago (in Internet time) and now I'll tear apart one of these very little beasts.

I won't bother with the Perl ones as I'm not a masochist. In fact, I already did it a couple of weeks ago for [`wypy.py`](http://infomesh.net/2003/wypy/). It's the best Python entry and also the best for a language that is not Perl. Even then, I started with the 23-line version so as not to lose too much time, just enough. The actual entry to the contest was 11 lines long and contained only 814 characters, which is still nearly four times larger than the winning one. That's short! In this version, the code is at least divided into four functions.
It begins with a simple one, `load`, which returns the content of a text file if it exists, else an empty string:

```python
# ex is a shortcut for os.path.exists
def load(n): return (ex('w/'+n) and open('w/'+n).read()) or ''
```

This isn't really exciting, just a good use of logical operators. The next one has more meat to it; let's see the code first:

```python
def fs(s): return reduce(lambda s, r: re.sub('(?m)'+r[0], r[1], s), (('\r',''),
    ('(^|[^A-Za-z0-9?])(([A-Z][a-z]+){2,})', lambda m: (m.group(1) + '%s<a hr' \
    'ef=wypy?%s'+m.group(2)+'%s>%s</a>') % ((m.group(2),'p=','&amp;q=e','?'),
    ('','','',m.group(2)))[ex('w/'+m.group(2))]), ('^\{\{$','\n<ul>'),
    ('^\* ','<li>'), ('^}}$','</ul>'), ('^---$','<hr>'), ('\n\n','<p>'),
    ('(ht|f)tp:[^<>\s]+','<a href=\g<0>>\g<0></a>')), cgi.escape(s))
```

It looks awful, but upon closer inspection we see that the noise is mostly regular expressions and some compact
markup in a format string. All in all, it's a [fold](http://en.wikipedia.org/wiki/Fold_%28higher-order_function%29) taking the escaped input string as initial value and reducing a list of tuples, each composed of a regular expression and either a string or a function. These happen to be the first two arguments of `re.sub`, which is the function used in the lambda expression fed to `reduce`. There's one little unexplained detail in there: why is `'(?m)'` prepended to every regular expression? After some googling, I found this explanation:

> Caret and dollar match after and before newlines for the remainder of the regular expression. (Older regex flavors may apply this to the entire regex.)

Well, I'll confess I don't truly understand what this is supposed to mean, but it doesn't look that important anyway, so let's move on.

Now the `do` function, obviously the one actually doing something! It's a kind of dispatching function (using a dictionary) performing a different type of action depending on the value of its first parameter:

```python
def do(m, n): return {'get':'<h1>WyPy:<a href=wypy?p=%s&amp;q=f>%s</a></h1>(' \
    '<a href=wypy?p=%s&amp;q=e>edit me</a>)<p>%s' % (n, n, n, fs(load(n)) or n),
    'edit': '<form action=wypy?%s method=POST><h1>%s<input type=hidden name=p' \
    ' value=%s> <input type=submit></h1><textarea name=t rows=15 cols=80>%s</' \
    'textarea></form>' % (n, fs(n), n, load(n) or "Describe %s" % n), 'find':
    ('<h1>Links: %s</h1>' % fs(n))+fs('{{\n* %s\n}}' % '\n* '.join(filter(
    lambda s: open('w/'+s).read().find(n) > -1, os.listdir('w'))))}.get(m)
```

It is quite simple actually: `get` returns a wiki page, `edit` is obvious and `find` makes a page containing a list of links. The `main` function then wires everything up to be called by a CGI process and also adds a page title.

```python
def main(f):
    n = f.get('p') or env("QUERY_STRING") or ''; n = ('HomePage',n)[n.isalpha()]
    print "Content-type: text/html; charset=utf-8\r\n\r\n<title>%s</title>" % n
    if env("REQUEST_METHOD") == "POST": open('w/'+n, 'w').write(f['t'])
    print do({'e':'edit', 'f':'find'}.get(f.get('q')) or 'get', n)
```

That's it, short and sweet. One last interesting point: the only code that I had trouble understanding was `('HomePage',n)[n.isalpha()]`, which is quite confusing at first sight. It cleverly uses the implicit conversion from boolean to integer to choose between the two items of a tuple. It's in fact an old-school way of simulating the ternary operator, a feature that was only added in Python 2.5, in 2006. That trick might be well known amongst Pythonistas, but I'm not one of them as I rarely code in Python.
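
To make the tuple trick concrete, here is a minimal sketch comparing it with the conditional expression. The function names are made up for this illustration and are not part of wypy. It also shows one difference worth knowing: the tuple form builds both items before indexing, while the conditional expression only evaluates the branch it picks.

```python
def pick_with_tuple(name):
    # bool is a subclass of int: False indexes item 0, True indexes item 1.
    return ("HomePage", name)[name.isalpha()]


def pick_with_ternary(name):
    # Conditional expression (Python 2.5+): only the chosen branch is evaluated.
    return name if name.isalpha() else "HomePage"


def evaluated_branches(use_tuple, flag):
    """Return the labels of the branches that actually got evaluated."""
    seen = []

    def branch(label):
        seen.append(label)
        return label

    if use_tuple:
        # Both tuple items are computed first, then one of them is selected.
        (branch("false-branch"), branch("true-branch"))[flag]
    else:
        # Only the selected side is computed.
        branch("true-branch") if flag else branch("false-branch")
    return seen
```

For plain values like `'HomePage'` and `n` the two forms give the same result, so the golfed version loses nothing. The eager evaluation only matters once a branch has a side effect or is expensive.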

While dissecting this code I've refactored (or might I say *degolfed*) it into a 45-line version that you can grab [here](http://dl.dropbox.com/u/2682770/mu/wypy.py). I've got a nearly finished version of this code translated to Clojure. I'll follow up on this post once it's done, but that may take a while as I'm having fun with the recently released Closure Library.

P.S.: Ward's Wiki is down at the time of this writing, so the first two links of this post might not work for some time. You can try to access them through Google's cached links.
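
Coming back to the fold in `fs` and its `(?m)` prefix, here is a small degolfed sketch in Python 3. It is not the 45-line file linked above, only an illustration under a few assumptions: `RULES` and `format_text` are names invented here, the WikiWord and URL rules are left out, and `html.escape` stands in for `cgi.escape`, which no longer exists in recent Python versions. The `(?m)` prefix is the inline spelling of the `re.MULTILINE` flag, which makes `^` and `$` match at every line boundary instead of only at the two ends of the whole string.

```python
import html
import re
from functools import reduce

# Hypothetical rule table of (pattern, replacement) pairs, loosely modelled
# on the simpler rules in wypy's fs().
RULES = [
    (r"\r", ""),
    (r"^\{\{$", "\n<ul>"),
    (r"^\* ", "<li>"),
    (r"^\}\}$", "</ul>"),
    (r"^---$", "<hr>"),
    (r"\n\n", "<p>"),
]


def format_text(text):
    """Fold the rules over the HTML-escaped text, one re.sub per rule."""
    def apply_rule(acc, rule):
        pattern, replacement = rule
        # re.MULTILINE is the flag form of the inline '(?m)' prefix:
        # '^' and '$' match at each line start and end, not only at the
        # start and end of the whole string.
        return re.sub(pattern, replacement, acc, flags=re.MULTILINE)

    # html.escape also escapes quotes by default, unlike the old cgi.escape.
    return reduce(apply_rule, RULES, html.escape(text))


# Without the flag, '^---$' cannot match a line in the middle of the text.
unchanged = re.sub(r"^---$", "<hr>", "a\n---\nb")
converted = format_text("a\n---\nb")
```

Here `unchanged` is still `'a\n---\nb'` while `converted` is `'a\n<hr>\nb'`. So the flag is what lets line-oriented markup such as `---` or `* item` be recognized anywhere in a page.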