Migrate from MediaWiki to GitBook

Export data from MediaWiki

Using ‘Special:Export’ to export all pages at MediaWiki

  • Go to Special:Allpages and choose the desired article/file.
  • Copy the list of page names to a text editor
  • Put all page names on separate lines. I did this job with Vim.
  • Go to Special:Export and paste all your page names into the textbox, making sure there are no empty lines.
  • Click ‘Submit query’
  • Save the resulting XML to a file using your browser’s save facility.

Now you can use this XML file to next step

Convert MediaWiki documents to markdown format

We have a big XML file now. It contains all pages in MediaWiki. All the pages are in MediaWiki format. pandoc tool can convert the format to markdown directly. But pandoc can’t parse XML file.

I use tool MediaWiki to Markdown to do the job. It’s written in PHP. The conversion uses an XML export from MediaWiki and converts each wiki page to an individual markdown file.

The tool is easy to use. An example convert all pages to Github Flavoured Markdown format.

php convert.php --filename=mediawiki.xml --output=export --format=gfm 

Import data to GitBook

I got many md files in previous step.

  • Create two folders en and zh for multiple languages support
  • Copy all md files to the two folders. Cleanup files for each language.
  • Copy Main_Page.md to SUMMARY.md. It’s the index file for gitbook. I rewrited this file carefully.
  • Create languages file for gitbook
* [English](en/)
* [Chinese](zh/)
  • Create book file for gitbook, Content for book.json.
{
    "plugins": [
        "language-selected",
        "expandable-chapters-small", 
        "-search", 
        "-livereload", 
        "-fontsettings", 
        "-lunr", 
        "-sharing"
    ]
}
  • Install gitbook and preview

Rewrite rule for nginx

I wrote these rewrite rules for nginx. It make the new URLs compatible with original MediaWiki.

        rewrite ^/wiki/Main_Page$ /_book/index.html last;
        rewrite ^/index.html$ /_book/index.html last;
        rewrite ^/gitbook/(.+)$ /_book/gitbook/$1 last;
        rewrite ^/wiki/(.+).html$ /en/$1.html permanent;
        rewrite ^/wiki/(.+)$ /en/$1.html permanent;

        location /en/ {
                try_files $uri /_book$uri /_book$uri.html;
        }

        location /zh/ {
                try_files $uri /_book$uri /_book$uri.html;
        }

Miscs

I have many markdown files from original MediaWiki. It’s too sad gitbook can’t process markdown file outside the SUMMARY.md. You can find more peoples have same problem at the repo of Gitbook. I use the workground to resolve the problem.

作者: 发表于June 6, 2018 at 3:18 pm

版权信息: 可以任意转载, 转载时请务必以超链接形式标明文章原始出处作者信息及此声明

Tags:

留条评论