On Mon, 14 May 2012 23:32:35 +0200 Dario Giovannetti wrote:
I would like to propose using pandoc ( http://johnmacfarlane.net/pandoc/ ) instead of the make-doc.sh script for converting the installation guide (Markdown syntax) to the document hosted in the wiki (MediaWiki syntax). I've tested it and the result looks pretty good, with only a few minor manual refinements required (which I volunteer to perform, if needed): instead of the current script, which in practice produces an HTML document, we would get a well-formed, much neater MediaWiki document. This would greatly simplify further improvements in the wikification of the guide, such as adapting it to ArchWiki's style standards.
Thank you
Dario
On 15/05/12 22:40, Dieter Plaetinck wrote:
If you have a patch that I can apply, I'll try it out. If the code using pandoc is more elegant, and the output is comparable (and/or better), I'm up for it.
Dieter
As requested, here's the patch: it's quite radical, partly because the
previous script was creating a header that is not used any longer in the
wiki. As you can see I've rewritten almost everything in Python, since
I'm much more comfortable with regular expressions in that language;
besides, that way the code is much more readable and flexible.
NOTE 1: you will need to install the "pandoc" package, currently in
the AUR: http://aur.archlinux.org/packages.php?ID=32490
NOTE 2: the patch has been committed on the "develop" branch.
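For anyone who wants to try the conversion step on its own before applying the patch, it can be sketched roughly as follows. This is only an illustration and not part of the patch: the names pandoc_command and convert_guide are made up here, and it assumes a pandoc binary on the PATH that offers the "mediawiki" output format through the usual -f/-t options.

```python
import subprocess


def pandoc_command(path):
    """Build the pandoc argument list for a Markdown -> MediaWiki conversion.

    Hypothetical helper: the actual invocation used by the patch is not
    shown in this mail.
    """
    return ["pandoc", "-f", "markdown", "-t", "mediawiki", path]


def convert_guide(path):
    """Run pandoc on the given file and return the MediaWiki text.

    Assumes pandoc is installed; raises CalledProcessError if it fails.
    """
    result = subprocess.run(
        pandoc_command(path),
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode the output as str
        check=True,           # fail loudly on a non-zero exit status
    )
    return result.stdout
```

Calling convert_guide on the guide's Markdown file would return the raw MediaWiki text, which is what the post-processing functions in the patch then clean up.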
From 593842cd1182ae0342efc9356477b16739641455 Mon Sep 17 00:00:00 2001
From: Dario Giovannetti
)( *\n)"
+LIST_REPLACE = r"\g<1>"
+
+# Used in wikify_internal_links
+LINK_REGEXP = r"\[{baseurl}([^\]\s]+?) ([^\]\n]+?)\]"
+LINK_REPLACE = r"[[\g<1>|\g<2>]]"
+
+# If a translation of the guide is added, a proper entry should be added to
+# this dictionary; the key names must be 2-character language tags
+LANGFIXES = {
+    "en": {
+        "baseurl": r"https?://wiki\.archlinux\.org/index\.php/",  # regexp
+        "header": """\
+[[Category:Getting and installing Arch]]
+[[fr:Guide officiel de l'installation]]
+[[ro:Ghid de instalare oficial]]
+{{i18n|Official Installation Guide}}
+""",  # string
+        "intro": """The Official Installation Guide is maintained in [http://projects.archlinux.org/aif.git/ aif.git].
+
+The version included with the latest [http://www.archlinux.org/download/ release] (2011.08.19) can be found [http://projects.archlinux.org/aif.git/plain/doc/official_installation_guide_... here].
+
+The latest version can be found [http://projects.archlinux.org/aif.git/plain/doc/official_installation_guide_... here].
+
+The (unofficial) [[Beginners' Guide]] provides a thorough walkthrough of the installation and configuration process.
+
+""",
+        "summary_heading": None,  # must be None only for English
+        "summary": "'''Article summary'''",  # string
+        "related": "'''Related articles'''",  # string
+        "introduction": "= Introduction =",  # string
+    },
+}
+
+
+def fix_multiline_list_items(text):
+    """
+    pandoc doesn't convert multiline list items correctly, so this function
+    compensates for that.
+    """
+    test = ""
+    # It's necessary to run this multiple times because of how the regular
+    # expression is designed
+    while text != test:
+        test = text
+        text = re.sub(LIST_REGEXP, LIST_REPLACE, text, flags=re.MULTILINE)
+    return text
+
+
+def wikify_internal_links(text, patches):
+    """
+    Turns external links that point to the local subdomain into proper
+    internal links.
+    """
+    regexp = LINK_REGEXP.format(**patches)
+    text = re.sub(regexp, LINK_REPLACE, text)
+    return text
+
+
+def insert_header(text, patches):
+    """
+    Inserts the standard article header.
+    """
+    text = patches["header"] + text
+    return text
+
+
+def assemble_summary(text, patches):
+    """
+    Converts the article summary and related links into a standard summary
+    """
+    # NOTE: this function requires some fixes if more languages are added
+    part_a = text.partition(patches["summary"])
+    part_b = part_a[2].partition(patches["related"])
+    part_c = part_b[2].partition(patches["introduction"])
+    related_links = part_c[0].strip().split("\n")
+    summary_heading = ("|" + patches["summary_heading"]
+                       if (patches["summary_heading"])
+                       else "")
+    summary_text = part_b[0].strip()
+    related = "\n".join(["{{{{Article summary text|1={}}}}}".format(r)
+                         for r in related_links])
+    summary = """{{{{Article summary start{}}}}}
+{{{{Article summary text|1={}}}}}
+{{{{Article summary heading|Related articles}}}}
+{}
+{{{{Article summary end}}}}
+
+""".format(summary_heading, summary_text, related)
+    text = part_a[0] + summary + patches["intro"] + part_c[1] + part_c[2]
+    return text
+
+
+def main(filename, text):
+    """
+    Main function
+    """
+    language = filename[-2:]
+    text = fix_multiline_list_items(text)
+    if language in LANGFIXES:
+        patches = LANGFIXES[language]
+        text = wikify_internal_links(text, patches)
+        text = insert_header(text, patches)
+        text = assemble_summary(text, patches)
+    return text
+
+if __name__ == "__main__":
+    print(main(FILENAME, INPUT))
--
1.7.5.4
Dario
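To make the two regular-expression steps easier to review, here is a small standalone sketch of what they do. The link constants are copied from the patch (written as raw strings). The list pattern is NOT the one from the patch, since LIST_REGEXP is cut off in this mail: DEMO_LIST_REGEXP and DEMO_LIST_REPLACE are hypothetical stand-ins, chosen only to show why the substitution has to be repeated until the text stops changing. The helper name apply_until_stable and the sample strings are likewise made up for illustration.

```python
import re

# Copied from the patch, as raw strings.
BASEURL = r"https?://wiki\.archlinux\.org/index\.php/"
LINK_REGEXP = r"\[{baseurl}([^\]\s]+?) ([^\]\n]+?)\]"
LINK_REPLACE = r"[[\g<1>|\g<2>]]"

# Hypothetical stand-ins for LIST_REGEXP / LIST_REPLACE: join one indented
# continuation line onto the preceding "*" list item.
DEMO_LIST_REGEXP = r"^(\*[^\n]*)\n {2,}(\S)"
DEMO_LIST_REPLACE = r"\g<1> \g<2>"


def wikify_internal_links(text, baseurl=BASEURL):
    """Turn external-style links to the wiki itself into internal links."""
    regexp = LINK_REGEXP.format(baseurl=baseurl)
    return re.sub(regexp, LINK_REPLACE, text)


def apply_until_stable(regexp, replace, text):
    """Repeat the substitution until a pass leaves the text unchanged."""
    previous = None
    while text != previous:
        previous = text
        text = re.sub(regexp, replace, text, flags=re.MULTILINE)
    return text


sample_links = (
    "See [https://wiki.archlinux.org/index.php/Pacman the pacman page] "
    "and [http://www.archlinux.org/download/ the release page]."
)
converted = wikify_internal_links(sample_links)
print(converted)

# Each pass can only join one continuation line per list item, so an item
# spanning three lines needs two passes plus a final pass that changes nothing.
sample_list = "* item one\n  continues\n  and more\n* two"
joined = apply_until_stable(DEMO_LIST_REGEXP, DEMO_LIST_REPLACE, sample_list)
print(joined)
```

Only the link that points at wiki.archlinux.org is rewritten; the link to www.archlinux.org is left as an external link, which is the behaviour the "baseurl" entry is meant to control.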