Markup - Markdown

About

Markdown is a lightweight markup language.

GitHub by default uses its own Markdown dialect, called GitHub Flavored Markdown (GFM).

Syntax

Reference

http://commonmark.org/ is the reference (the GitHub Flavored Markdown Spec is based on CommonMark)
http://daringfireball.net/projects/markdown/syntax is the original syntax description (the Dingus is its online demo)
RFC 7763 - The text/markdown Media Type

Little Cheatsheet

Comment: same syntax as an HTML comment

Heading: ATX

# Your title 
#(id=#custom-id tight=true bullet_char=-) Your title 
#{id=#custom-id tight=true bullet_char=-} Your title
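As an illustration of the basic ATX form, the sketch below recognizes a heading line with a regular expression. It is a simplified reading of the rule (1 to 6 `#` characters, whitespace, then the title), not a conforming parser: empty headings, leading indentation, and the attribute forms shown above are not handled. The name `parse_atx_heading` is hypothetical.

```python
import re

# 1 to 6 '#', at least one space or tab, the title, then an optional
# closing run of '#' that is not part of the title.
ATX_HEADING = re.compile(r"^(#{1,6})[ \t]+(.*?)(?:[ \t]+#+)?[ \t]*$")


def parse_atx_heading(line: str):
    """Return (level, title) for an ATX heading line, or None otherwise."""
    match = ATX_HEADING.match(line)
    if match is None:
        return None
    return len(match.group(1)), match.group(2)


if __name__ == "__main__":
    for sample in ["# Your title ", "## Section ##", "#hashtag"]:
        print(repr(sample), "->", parse_atx_heading(sample))
```

The whitespace after the `#` run is what separates a heading from a plain word such as `#hashtag`.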

Image

![Alt text](/path/to/img.jpg)
![Alt text](/path/to/img.jpg "Optional title" =100x20)
![Alt text](/path/to/img.jpg =100)

Tools

Editor

Eclipse: WikiText, with an outline view. Just install Eclipse, then F1 > Markdown Markup Cheat Sheet. Tables are implemented as plain HTML.
IntelliJ IDEA with plugin
VS Code

Generator

Blog

These frameworks have no notion of a link to an md file.

Hugo
Eleventy - https://www.11ty.io/docs/languages/markdown/ - Jekyll replacement

Book

Pandoc - Doc - http://pandoc.org/MANUAL.html

Wiki (doc)

mkdocs

MkDocs is geared towards building project documentation.

mkdocs new [dir-name]  # Create a new project.
mkdocs serve           # Start the live-reloading docs server.

The build creates a new directory named site:

mkdocs build          # Build the documentation site.
mkdocs build --clean  # Remove stale files from the site dir before building.
mkdocs build --help   # Show the options of the build command.

Help message

mkdocs --help  # Print the help message.

GitHub

https://github.github.com/gfm/ - GitHub Flavored Markdown Spec

GFM was built on top of Sundown but is now built on top of CommonMark. ¹⁾

Other Libraries

Java: Pegdown, https://github.com/rjeschke/txtmark
React: https://github.com/rexxars/react-markdown
cmark - the C reference implementation of CommonMark. It is a parsing and rendering library.
sundown - GitHub - Standards-compliant Markdown processing library in C

https://github.com/remarkjs/remark/tree/master/packages/remark-parse - Parser that generates the tree below (mdast)
https://github.com/syntax-tree/mdast#list-of-utilities - Tree
https://github.com/chjj/marked/ - JavaScript parser

Parsing / Tokenizing Strategy

This section summarizes the parsing strategies used by several Markdown parsers.

CommonMark / GFM (GitHub):
- Two phases:
  - Block tree creation (heading, paragraph), where an open block decides whether it stays open based on the beginning of the next line.
  - Inline parsing of the content of each block
- Output: AST
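The two phases can be sketched as follows. This is a minimal illustration, not the CommonMark algorithm: it assumes only headings, paragraphs and `*emphasis*`, and the names `Node`, `parse_blocks` and `parse_inlines` are hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    type: str
    text: str = ""
    children: list = field(default_factory=list)


HEADING = re.compile(r"^(#{1,6})[ \t]+(.*)$")
EMPHASIS = re.compile(r"\*([^*]+)\*")


def parse_blocks(source: str) -> Node:
    """Phase 1: build the block tree, one line at a time."""
    document = Node("document")
    open_paragraph = None  # the paragraph that the next line may continue
    for line in source.splitlines():
        heading = HEADING.match(line)
        if heading:
            open_paragraph = None  # a heading closes the open paragraph
            document.children.append(Node("heading", heading.group(2).strip()))
        elif not line.strip():
            open_paragraph = None  # a blank line closes the open paragraph
        elif open_paragraph is not None:
            open_paragraph.text += "\n" + line.strip()  # the block stays open
        else:
            open_paragraph = Node("paragraph", line.strip())
            document.children.append(open_paragraph)
    return document


def parse_inlines(block: Node) -> None:
    """Phase 2: split the raw text of a leaf block into inline nodes."""
    position = 0
    for match in EMPHASIS.finditer(block.text):
        if match.start() > position:
            block.children.append(Node("text", block.text[position:match.start()]))
        block.children.append(Node("emph", match.group(1)))
        position = match.end()
    if position < len(block.text):
        block.children.append(Node("text", block.text[position:]))


def parse(source: str) -> Node:
    document = parse_blocks(source)
    for block in document.children:
        parse_inlines(block)
    return document


if __name__ == "__main__":
    for block in parse("# Title\n\nA paragraph\nwith *two* lines.").children:
        print(block.type, [(child.type, child.text) for child in block.children])
```

Inline markup is never examined while the block structure is still being decided, which is why the raw text is stored on the block first.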
DokuWiki:
- Lexer rules: tree of lexical modes.
  - A mode is:
    - a node
    - unique
    - with a regular expression for each pattern kind (open, close, or self-closing)
    - that can appear in several branches (also known as chains)
  - A branch concatenates all its regular expressions into one big expression via alternation (or) and groups. This is called a parallel regexp (i.e. Doku_LexerParallelRegex)
- Lexer output: sequence of tokens (not a tree)
  - but tokens have an entry/exit state
  - and there are special modes to handle complex structures such as lists and tables
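The parallel regexp idea can be sketched in a few lines: every pattern of a branch becomes one named group of a single alternation, so one scan of the text reports which mode matched. This is an illustration of the technique only, not DokuWiki's PHP code; the mode names and the helper names `build_parallel_regex` and `tokenize` are hypothetical.

```python
import re

# Hypothetical modes of one branch, each with the pattern that triggers it.
MODE_PATTERNS = {
    "strong": r"\*\*",
    "emphasis": r"//",
    "code": r"''",
}


def build_parallel_regex(patterns):
    """Join every pattern into one alternation of named groups."""
    return re.compile(
        "|".join(f"(?P<{name}>{pattern})" for name, pattern in patterns.items())
    )


def tokenize(text, parallel):
    """Yield (mode, lexeme) pairs; text between matches is reported as 'text'."""
    position = 0
    for match in parallel.finditer(text):
        if match.start() > position:
            yield "text", text[position:match.start()]
        # lastgroup is the name of the alternative that matched.
        yield match.lastgroup, match.group()
        position = match.end()
    if position < len(text):
        yield "text", text[position:]


if __name__ == "__main__":
    parallel = build_parallel_regex(MODE_PATTERNS)
    for token in tokenize("a **b** //c//", parallel):
        print(token)
```

The output is a flat sequence: the same `strong` lexeme appears once for entry and once for exit, and it is up to the consumer to track the state.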
markdown-it:
- Lexer rules: tree of lexical modes (they call it a chain, but it is a branch of the tree)
  - A blockquote token may contain paragraph, heading and list chains.
- Lexer output: sequence of tokens (not a tree)
  - but there is a special token, the inline token, with nested tokens: sequences with inline markup (bold, italic, text, …).
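To make the flat output concrete, the sketch below models a token sequence for a blockquote holding one paragraph. The `Token` class is a hand-written stand-in that only borrows the field names `type`, `nesting`, `content` and `children`; the token list is written by hand for illustration, and `outline` is a hypothetical helper that recovers the nesting from the open/close markers.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    type: str
    nesting: int  # 1 = opening, 0 = self-contained, -1 = closing
    content: str = ""
    children: list = field(default_factory=list)


# Flat sequence: block structure is expressed by open/close pairs,
# and only the inline token carries nested child tokens.
tokens = [
    Token("blockquote_open", 1),
    Token("paragraph_open", 1),
    Token("inline", 0, "Hello *world*", children=[
        Token("text", 0, "Hello "),
        Token("em_open", 1),
        Token("text", 0, "world"),
        Token("em_close", -1),
    ]),
    Token("paragraph_close", -1),
    Token("blockquote_close", -1),
]


def outline(flat, depth=0):
    """Return indented lines that recover the nesting of a flat sequence."""
    lines = []
    for token in flat:
        if token.nesting == -1:
            depth -= 1  # a closing token sits at the level of its opener
        lines.append("  " * depth + token.type)
        if token.children:
            lines.extend(outline(token.children, depth + 1))
        if token.nesting == 1:
            depth += 1  # everything after an opener is one level deeper
    return lines


if __name__ == "__main__":
    print("\n".join(outline(tokens)))
```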
marked:
- tokens in a nested tree structure ²⁾
- each mode / token must declare whether it is inline or block (but not both)
- a block mode must add its inline text to a queue to be parsed later (i.e. this.lexer.inline(token.text, token.tokens);) ³⁾
- recursive block strategy: whenever it detects the start of a new container block, it: ⁴⁾
  - attempts to find the end of the container block
  - parses out all of the text that the container block contains
  - removes line prefixes related to the container block
  - recursively tokenizes the cleaned contents of the container block
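The four steps of the recursive block strategy can be sketched with blockquotes as the only container. This is a simplified illustration in Python, not marked's implementation: lazy continuation lines are ignored, and the names `tokenize_blocks` and `PREFIX` are hypothetical.

```python
import re

PREFIX = re.compile(r"^> ?")  # the line prefix of a blockquote container


def tokenize_blocks(source: str) -> list:
    """Return a nested token tree for paragraphs and blockquotes."""
    tokens = []
    lines = source.split("\n")
    index = 0
    while index < len(lines):
        line = lines[index]
        if line.startswith(">"):
            # 1. Find the end of the container block.
            end = index
            while end < len(lines) and lines[end].startswith(">"):
                end += 1
            # 2. Take all of the text that the container holds.
            raw = lines[index:end]
            # 3. Remove the line prefixes related to the container.
            cleaned = "\n".join(PREFIX.sub("", item) for item in raw)
            # 4. Recursively tokenize the cleaned contents.
            tokens.append({"type": "blockquote", "tokens": tokenize_blocks(cleaned)})
            index = end
        elif line.strip():
            # Paragraph: runs until a blank line or the start of a container.
            end = index
            while end < len(lines) and lines[end].strip() and not lines[end].startswith(">"):
                end += 1
            tokens.append({"type": "paragraph", "text": "\n".join(lines[index:end])})
            index = end
        else:
            index += 1  # skip blank lines between blocks
    return tokens


if __name__ == "__main__":
    print(tokenize_blocks("> outer\n> > inner\nafter"))
```

Because the prefix is stripped before recursing, the inner call never needs to know how deeply it is nested.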

AST

Example of AST: CommonMark (DTD)

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">
<document xmlns="http://commonmark.org/xml/1.0">
  <list type="ordered" start="1" tight="true" delimiter="period">
    <item>
      <paragraph>
        <text>A paragraph</text>
        <softbreak />
        <text>with two lines.</text>
      </paragraph>
    </item>
  </list>
</document>
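Such an XML serialization can be walked with any XML library. The sketch below uses Python's standard `xml.etree.ElementTree` on a copy of the document above, with the XML declaration and DOCTYPE left out to keep the sample small; `outline` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

NAMESPACE = "{http://commonmark.org/xml/1.0}"

XML = """<document xmlns="http://commonmark.org/xml/1.0">
  <list type="ordered" start="1" tight="true" delimiter="period">
    <item>
      <paragraph>
        <text>A paragraph</text>
        <softbreak />
        <text>with two lines.</text>
      </paragraph>
    </item>
  </list>
</document>"""


def outline(element, depth=0):
    """Return one indented line per node, with its text when it has any."""
    tag = element.tag.replace(NAMESPACE, "")  # drop the namespace prefix
    text = (element.text or "").strip()
    lines = ["  " * depth + (f"{tag}: {text}" if text else tag)]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    root = ET.fromstring(XML)
    print("\n".join(outline(root)))
```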

Documentation / Reference

¹⁾ Blog
²⁾ Marked pipeline
³⁾ Example: Add a custom syntax to generate <dl> description lists.
⁴⁾ Marked parser strategy