Markup - Markdown
About
Markdown is a lightweight markup language.
GitHub uses by default its own Markdown dialect, called GitHub Flavored Markdown (GFM).
Syntax
Reference
- http://commonmark.org/ is the reference (the GitHub Flavored Markdown Spec is based on CommonMark)
- RFC 7763 - The text/markdown Media Type
Little Cheatsheet
- Comment: same syntax as an HTML comment (<!-- ... -->)
- heading: Atx
# Your title
#(id=#custom-id tight=true bullet_char=-) Your title
#{id=#custom-id tight=true bullet_char=-} Your title
- Image



Tools
Editor
- Eclipse: Wiki Text, with an outline view. Just install Eclipse, then F1 > Markdown Markup Cheat Sheet. Tables are implemented as plain HTML.
- IntelliJ IDEA with a plugin
Generator
HTML Page
<xmp theme="united" style="display:none;">
# Markdown text goes in here
## Chapter 1
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore
et dolore magna aliqua.
## Chapter 2
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut
aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.
</xmp>
<script src="https://strapdownjs.com/v/0.2/strapdown.js"></script>
Blog
- https://www.11ty.io/docs/languages/markdown/ - Eleventy, a Jekyll replacement. This framework has no notion of a link to an .md file.
Book
- Pandoc - Doc - http://pandoc.org/MANUAL.html
Wiki (doc)
mkdocs
MkDocs is a static site generator geared towards building project documentation.
mkdocs new [dir-name]   # Create a new project.
mkdocs serve            # Start the live-reloading docs server.
mkdocs build            # Build the documentation site into a new directory named site.
mkdocs build --clean    # Delete the content of the site directory before building.
mkdocs build --help     # Print the help message of the build command.
mkdocs help             # Print this help message.
- Homepage: By convention index.md
GitHub
https://github.github.com/gfm/ - GitHub Flavored Markdown Spec
GFM was built on top of Sundown but is now built on top of CommonMark. 1)
Other Libraries
- Java: Pegdown, https://github.com/rjeschke/txtmark
- cmark - the C reference implementation of CommonMark. It is a parsing and rendering library.
- sundown (GitHub) - Standards-compliant Markdown processing library in C
- https://github.com/remarkjs/remark/tree/master/packages/remark-parse - Parser that generates an mdast syntax tree
- https://github.com/chjj/marked/ - JavaScript parser
Parsing / Tokenizing Strategy
This section summarizes the parsing strategies used by several Markdown parsers.
-
  - Two phases:
    - Block tree creation (heading, paragraph), where an open block may decide not to close based on the beginning of the next line.
    - Inline parsing of the content of each block
  - AST
- Dokuwiki:
  - Lexer rules: a tree of lexical modes.
    - A mode is:
      - a node
      - unique
      - defined by a regular expression (open, closed or self-closing)
      - able to appear in several branches (also known as chains)
    - A branch concatenates all its regular expressions into one big expression via alternation (or) and groups. It's called a parallel regexp (i.e. Doku_LexerParallelRegex).
  - Lexer output: a sequence of tokens (not a tree)
    - but tokens have an entry/exit state
    - and there are special modes to handle complex structures such as lists and tables
-
  - Lexer rules: a tree of lexical modes (they call it a chain, but this is a branch of the tree)
    - A blockquote token may contain paragraph, heading and list chains.
  - Lexer output: a sequence of tokens (not a tree)
    - but there is a special token, called the inline token, that holds nested tokens: sequences with inline markup (bold, italic, text, …).
-
  - tokens in a nested tree structure 2)
  - the mode / token should define whether it is inline or block (but not both)
  - a block mode needs to add its inline text to a queue to be parsed later (i.e. this.lexer.inline(token.text, token.tokens);) 3)
  - recursive block strategy: whenever it detects the start of a new container block, it 4)
    - attempts to find the end of the container block
    - parses out all of the text that the container block contains
    - removes line prefixes related to the container block
    - recursively tokenizes the cleaned contents of the container block
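The parallel regexp described for Dokuwiki can be illustrated with a minimal Python sketch. This is not Dokuwiki's PHP implementation: the MODES table, the mode names and the lex function are hypothetical, and only three inline patterns are covered. It only shows the idea of joining one pattern per mode into a single big expression with alternation and groups, and of emitting a flat sequence of tokens.

```python
import re

# Hypothetical modes (name, pattern), chosen for illustration only.
MODES = [
    ("strong", r"\*\*.+?\*\*"),
    ("emphasis", r"//.+?//"),      # Dokuwiki writes italics as //text//
    ("monospace", r"''.+?''"),
]

# The "parallel" regex: every mode pattern becomes a named group,
# and all groups are joined with the alternation operator.
PARALLEL = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in MODES))


def lex(text):
    """Return a flat sequence of (mode, lexeme) tokens, not a tree."""
    tokens = []
    pos = 0
    for match in PARALLEL.finditer(text):
        if match.start() > pos:
            # Anything between two matches is plain text.
            tokens.append(("text", text[pos:match.start()]))
        # lastgroup is the name of the alternative that matched.
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    if pos < len(text):
        tokens.append(("text", text[pos:]))
    return tokens
```

Because the regex engine tries all alternatives at each position in one pass, the lexer does not need to loop over the modes itself.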
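The recursive block strategy and the deferred inline parsing listed above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code of any of the parsers mentioned: the function names (tokenize_blocks, parse_inline, parse) are hypothetical, and only blockquotes, ATX headings, paragraphs and ** bold are recognized.

```python
import re


def tokenize_blocks(lines):
    """Phase 1: turn lines into a nested list of block tokens."""
    tokens = []
    i = 0
    while i < len(lines):
        line = lines[i]
        if not line.strip():
            i += 1  # blank lines only separate blocks
        elif line.startswith(">"):
            # Find the end of the container block.
            j = i
            while j < len(lines) and lines[j].startswith(">"):
                j += 1
            # Take the text it contains and remove the "> " line prefixes.
            inner = [re.sub(r"^> ?", "", l) for l in lines[i:j]]
            # Recursively tokenize the cleaned contents.
            tokens.append({"type": "blockquote", "children": tokenize_blocks(inner)})
            i = j
        elif line.startswith("#"):
            level = len(line) - len(line.lstrip("#"))
            tokens.append({"type": "heading", "level": level, "text": line[level:].strip()})
            i += 1
        else:
            # A paragraph runs until a blank line or the start of another block.
            j = i
            while j < len(lines) and lines[j].strip() and not lines[j].startswith((">", "#")):
                j += 1
            tokens.append({"type": "paragraph", "text": " ".join(lines[i:j])})
            i = j
    return tokens


def parse_inline(text):
    """Phase 2: split text into plain and strong spans."""
    # With one capture group, re.split puts captured text at odd indexes.
    parts = re.split(r"\*\*(.+?)\*\*", text)
    return [{"type": "strong" if k % 2 else "text", "text": p}
            for k, p in enumerate(parts) if p]


def parse(markdown):
    tree = tokenize_blocks(markdown.splitlines())
    queue = list(tree)  # blocks whose inline text still has to be parsed
    while queue:
        token = queue.pop()
        if "text" in token:
            token["inline"] = parse_inline(token["text"])
        queue.extend(token.get("children", []))
    return tree
```

Calling parse("> quoted **bold** text") returns a blockquote token whose single paragraph child carries three inline tokens. Keeping the inline pass separate means a block rule never has to know about inline markup.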
AST
Example of AST: CommonMark (DTD)
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">
<document xmlns="http://commonmark.org/xml/1.0">
<list type="ordered" start="1" tight="true" delimiter="period">
<item>
<paragraph>
<text>A paragraph</text>
<softbreak />
<text>with two lines.</text>
</paragraph>
</item>
</list>
</document>
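As an illustration of how such an AST can be consumed, the following sketch loads the XML above with Python's standard xml.etree.ElementTree and prints its outline. The describe helper is a hypothetical name introduced here; the document is embedded as a string so that the example is self-contained.

```python
import xml.etree.ElementTree as ET

NS = {"cm": "http://commonmark.org/xml/1.0"}

XML = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document SYSTEM "CommonMark.dtd">
<document xmlns="http://commonmark.org/xml/1.0">
<list type="ordered" start="1" tight="true" delimiter="period">
<item>
<paragraph>
<text>A paragraph</text>
<softbreak />
<text>with two lines.</text>
</paragraph>
</item>
</list>
</document>
"""


def describe(node, depth=0):
    """Return one indented line per node, with its text when it has some."""
    tag = node.tag.split("}", 1)[-1]  # strip the "{namespace}" prefix
    text = (node.text or "").strip()
    lines = ["  " * depth + (f"{tag}: {text}" if text else tag)]
    for child in node:
        lines.extend(describe(child, depth + 1))
    return lines


# Parse from bytes so that the encoding declaration is honoured.
root = ET.fromstring(XML.encode("utf-8"))
print("\n".join(describe(root)))

# Attributes of the list node are available as plain strings.
print(root.find("cm:list", NS).attrib)
```

The softbreak node has no text of its own: it only marks the line break between the two text nodes of the paragraph.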