Markup - Markdown


Markdown is a lightweight markup language.

By default, GitHub uses its own Markdown dialect, called GitHub Flavored Markdown (GFM).



Little Cheatsheet

# Your title 
#(id=#custom-id tight=true bullet_char=-) Your title 
#{id=#custom-id tight=true bullet_char=-} Your title
  • Image
![Alt text](/path/to/img.jpg)
![Alt text](/path/to/img.jpg "Optional title" =100x20)
![Alt text](/path/to/img.jpg =100)
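
The three image forms above can be recognized with a single regular expression. The following is a minimal sketch, not the grammar of any particular Markdown library: the names `IMAGE_RE` and `parse_image` are hypothetical, and the `=WIDTHxHEIGHT` suffix is assumed to behave exactly as the cheatsheet lines show.

```python
import re
from typing import Optional

# Hypothetical pattern covering the three cheatsheet forms:
#   ![alt](src)   ![alt](src "title" =WxH)   ![alt](src =W)
IMAGE_RE = re.compile(
    r'!\[(?P<alt>[^\]]*)\]'                           # ![Alt text]
    r'\((?P<src>[^\s)]+)'                             # (path
    r'(?:\s+"(?P<title>[^"]*)")?'                     # optional "title"
    r'(?:\s+=(?P<width>\d+)(?:x(?P<height>\d+))?)?'   # optional =W or =WxH
    r'\s*\)'                                          # closing parenthesis
)


def parse_image(markup: str) -> Optional[dict]:
    """Return the parts of an image markup, or None if it does not match."""
    match = IMAGE_RE.fullmatch(markup.strip())
    if match is None:
        return None
    parts = match.groupdict()
    # Convert the optional dimensions to integers when they are present.
    for key in ("width", "height"):
        if parts[key] is not None:
            parts[key] = int(parts[key])
    return parts
```

Calling `parse_image('![Alt text](/path/to/img.jpg =100)')` yields a dictionary whose `width` is `100` and whose `title` and `height` are `None`.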



  • Eclipse: WikiText with an outline view. Just install Eclipse, then F1 > Markdown Markup Cheat Sheet. Tables are implemented as plain HTML.
  • IntelliJ IDEA with a plugin



This framework has no notion of a link to a Markdown (md) file.


Wiki (doc)


MkDocs is geared towards building project documentation.

mkdocs new [dir-name] #- Create a new project.
mkdocs serve #- Start the live-reloading docs server.
mkdocs build #- Build the documentation site into a new directory named site.
mkdocs build --clean #- Delete the content of the site dir before building.
mkdocs build --help #- Print the help message of the build command.
mkdocs help #- Print this help message.

GitHub - GitHub Flavored Markdown Spec

GFM was built on top of Sundown but is now built on top of CommonMark. 1)

Other Library

Parsing / Tokenizing Strategy

This section summarizes the parsing strategies used by several Markdown parsers.

    • Two phases:
      • Block tree creation (headings, paragraphs), where a block may decide, based on the beginning of the next line, that it should not be closed.
      • Inline parsing of the content of the block
    • AST
  • Dokuwiki:
    • Lexer rules: a tree of lexical modes.
      • A mode is:
        • a node
        • unique
        • defined by a regular expression for its pattern (opening, closing, or self-closing)
        • allowed in several branches (also known as chains)
      • A branch concatenates all of its regular expressions into one big expression via alternation (or) and groups. This is called a parallel regexp (i.e. Doku_LexerParallelRegex).
    • Lexer output: a sequence of tokens (not a tree)
      • but tokens have an entry/exit state
      • and there are special modes to handle complex structures such as lists and tables
    • Lexer rules: a tree of lexical modes (they call it a chain, but it is a branch of the tree)
      • A blockquote token may contain paragraph, heading, and list chains.
    • Lexer output: a sequence of tokens (not a tree)
      • but there is a special token, called the inline token, that holds nested tokens: sequences with inline markup (bold, italic, text, …).
    • tokens are in a nested tree structure 2)
    • the mode / token should define whether it is inline or block (but not both)
    • a block mode needs to add its inline text to a queue to be parsed later (i.e. this.lexer.inline(token.text, token.tokens);) 3)
    • recursive block strategy: whenever it detects the start of a new container block, it 4)
      • attempts to find the end of the container block
      • parses out all of the text that the container block contains
      • removes line prefixes related to the container block
      • recursively tokenizes the cleaned contents of the container block
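
The two-phase strategy listed first above (block tree creation, then inline parsing) can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm of any specific parser: only headings, paragraphs, and `*emphasis*` are handled, and the names `parse_blocks`, `parse_inlines`, and `parse` are hypothetical.

```python
import re


def parse_blocks(lines):
    """Phase 1: build the block list and keep the inline content as raw text."""
    blocks = []
    open_paragraph = None
    for line in lines:
        if line.startswith("#"):
            # A heading closes any open paragraph.
            open_paragraph = None
            level = len(line) - len(line.lstrip("#"))
            blocks.append({"type": "heading", "level": level,
                           "raw": line[level:].strip()})
        elif line.strip() == "":
            # A blank line closes the open paragraph.
            open_paragraph = None
        elif open_paragraph is not None:
            # The beginning of this line does not close the paragraph: continue it.
            open_paragraph["raw"] += "\n" + line.strip()
        else:
            open_paragraph = {"type": "paragraph", "raw": line.strip()}
            blocks.append(open_paragraph)
    return blocks


# Assumed inline syntax for this sketch: *emphasis* only.
INLINE_RE = re.compile(r"\*(?P<emph>[^*]+)\*")


def parse_inlines(raw):
    """Phase 2: turn the raw text of one block into inline nodes."""
    nodes, pos = [], 0
    for match in INLINE_RE.finditer(raw):
        if match.start() > pos:
            nodes.append({"type": "text", "value": raw[pos:match.start()]})
        nodes.append({"type": "emph", "value": match.group("emph")})
        pos = match.end()
    if pos < len(raw):
        nodes.append({"type": "text", "value": raw[pos:]})
    return nodes


def parse(text):
    """Run both phases and return a small AST as a list of block nodes."""
    blocks = parse_blocks(text.splitlines())
    for block in blocks:
        block["children"] = parse_inlines(block.pop("raw"))
    return blocks
```

Keeping the raw text on each block until the second phase is what allows the block structure to be decided line by line, without looking at inline markup.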

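The "parallel regexp" idea described for Dokuwiki above (all mode patterns of a branch joined into one alternation of groups) can be illustrated in a few lines. This sketch does not reproduce Doku_LexerParallelRegex itself: the mode names and patterns in `MODES` are assumptions, and the entry/exit state is reduced to a simple toggle per mode.

```python
import re

# Hypothetical modes of one branch: mode name -> delimiter pattern.
MODES = {
    "strong": r"\*\*",
    "emphasis": r"//",
    "monospace": r"''",
}


def build_parallel_regex(modes):
    """Join all mode patterns into one regex, one named group per mode."""
    return re.compile(
        "|".join(f"(?P<{name}>{pattern})" for name, pattern in modes.items())
    )


def scan(text, modes=MODES):
    """Return a flat token sequence; mode tokens carry an enter/exit state."""
    regex = build_parallel_regex(modes)
    tokens, open_modes, pos = [], set(), 0
    for match in regex.finditer(text):
        if match.start() > pos:
            tokens.append(("text", text[pos:match.start()]))
        mode = match.lastgroup  # name of the group (mode) that matched
        if mode in open_modes:
            open_modes.remove(mode)
            tokens.append((mode, "exit"))
        else:
            open_modes.add(mode)
            tokens.append((mode, "enter"))
        pos = match.end()
    if pos < len(text):
        tokens.append(("text", text[pos:]))
    return tokens
```

The output is a sequence rather than a tree: nesting is only implied by the order of the enter and exit tokens.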

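The four steps of the recursive block strategy listed above can be sketched with blockquotes as the only container block. This is a simplified illustration, not the code of the parser the list refers to: the function names and the token layout are hypothetical, and only blockquotes and paragraphs are recognized.

```python
def strip_quote_prefix(line):
    """Remove the leading '>' and at most one following space."""
    rest = line[1:]
    return rest[1:] if rest.startswith(" ") else rest


def tokenize(lines):
    """Tokenize lines into a nested list of block tokens."""
    tokens = []
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith(">"):
            # Step 1: find the end of the container block.
            end = i
            while end < len(lines) and lines[end].startswith(">"):
                end += 1
            # Steps 2 and 3: take the contained text and remove the line prefixes.
            inner = [strip_quote_prefix(l) for l in lines[i:end]]
            # Step 4: recursively tokenize the cleaned contents.
            tokens.append({"type": "blockquote", "tokens": tokenize(inner)})
            i = end
        elif line.strip() == "":
            i += 1  # blank lines only separate blocks
        else:
            # A paragraph runs until a blank line or the start of a blockquote.
            end = i
            while (end < len(lines) and lines[end].strip()
                   and not lines[end].startswith(">")):
                end += 1
            text = " ".join(l.strip() for l in lines[i:end])
            tokens.append({"type": "paragraph", "text": text})
            i = end
    return tokens
```

Because the prefixes are removed before the recursive call, a nested quote such as `> > b` is seen by the inner call as an ordinary `> b` line.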
Example of AST: CommonMark (DTD)

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">
<document xmlns="">
  <list type="ordered" start="1" tight="true" delimiter="period">
    <item>
      <paragraph>
        <text>A paragraph</text>
        <softbreak />
        <text>with two lines.</text>
      </paragraph>
    </item>
  </list>
</document>

Documentation / Reference
