lezer-parser/markdown: A lezer-integrated Markdown parser


Name: lezer-parser/markdown

URL: https://github.com/lezer-parser/markdown

Language: TypeScript 98.6%

Introduction:

lezer-markdown

This is an incremental Markdown (CommonMark with support for extensions) parser that integrates well with the Lezer parser system. It does not in fact use the Lezer runtime (that runs LR parsers, and Markdown can't really be parsed that way), but it produces Lezer-style compact syntax trees and consumes fragments of such trees for its incremental parsing.

Note that this only parses the document, producing a data structure that represents its syntactic form, and doesn't help with outputting HTML. Also, in order to be single-pass and incremental, it doesn't do some things that a conforming CommonMark parser is expected to do. Specifically, it doesn't validate link references, so it'll parse [a][b] and similar as a link, even if no [b] reference is declared.

The @codemirror/lang-markdown package integrates this parser with CodeMirror to provide Markdown editor support.

The code is licensed under an MIT license.

Interface

parser: MarkdownParser: The default CommonMark parser.

class MarkdownParser extends Parser

A Markdown parser configuration.

nodeSet: NodeSet: The parser's syntax node types.
configure(spec: MarkdownExtension) → MarkdownParser: Reconfigure the parser.
parseInline(text: string, offset: number) → Element[]: Parse the given piece of inline text at the given offset, returning an array of Element objects representing the inline content.

interface MarkdownConfig

Objects of this type are used to configure the Markdown parser.

props⁠?: readonly NodePropSource[]: Node props to add to the parser's node set.
defineNodes⁠?: readonly (string | NodeSpec)[]: Define new node types for use in parser extensions.
parseBlock⁠?: readonly BlockParser[]: Define additional block parsing logic.
parseInline⁠?: readonly InlineParser[]: Define new inline parsing logic.
remove⁠?: readonly string[]: Remove the named parsers from the configuration.
wrap⁠?: ParseWrapper: Add a parse wrapper (such as a mixed-language parser) to this parser.

type MarkdownExtension = MarkdownConfig | readonly MarkdownExtension[]: To make it possible to group extensions together into bigger extensions (such as the GitHub-flavored Markdown extension), reconfiguration accepts nested arrays of config objects.
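
As a rough illustration of the nested shape, the sketch below flattens an extension value into a plain list of config objects. ConfigSketch, ExtensionSketch, isGroup, and flattenExtension are hypothetical names used only for this example; they are not part of the package, and the library's own handling of nested arrays may differ.

```typescript
// Stand-in for the documented MarkdownConfig; only two of its fields are modelled.
interface ConfigSketch {
  remove?: readonly string[];
  defineNodes?: readonly string[];
}

// Same recursive shape as the documented MarkdownExtension type.
type ExtensionSketch = ConfigSketch | readonly ExtensionSketch[];

// Type guard that tells a nested group apart from a single config object.
function isGroup(ext: ExtensionSketch): ext is readonly ExtensionSketch[] {
  return Array.isArray(ext);
}

// Collect every config object in depth-first, left-to-right order.
function flattenExtension(
  ext: ExtensionSketch,
  out: ConfigSketch[] = []
): ConfigSketch[] {
  if (isGroup(ext)) {
    for (const nested of ext) flattenExtension(nested, out);
  } else {
    out.push(ext);
  }
  return out;
}
```

Calling flattenExtension on a bundle that itself contains other bundles yields the individual config objects in the order they were written.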

parseCode(config: Object) → MarkdownExtension

Create a Markdown extension to enable nested parsing on code blocks and/or embedded HTML.

config

codeParser?: fn(info: string) → Parser | null: When provided, this will be used to parse the content of code blocks. info is the string after the opening ``` marker, or the empty string if there is no such info or this is an indented code block. If a parser is available for the code, the function should return it; otherwise it should return null.
htmlParser⁠?: Parser: The parser used to parse HTML tags (both block and inline).
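
A minimal sketch of what a codeParser callback could look like is given below. ParserSketch stands in for Lezer's Parser class so that the example has no dependencies, and the languages and aliases tables are hypothetical. Treating only the first word of the info string as the language name is an assumption based on common Markdown practice, not something this document specifies.

```typescript
// Stand-in for Lezer's Parser class, so the sketch does not need the real package.
interface ParserSketch {
  name: string;
}

// Hypothetical registry of available language parsers and their short aliases.
const languages: { [name: string]: ParserSketch } = {
  javascript: { name: "javascript" },
  python: { name: "python" },
};
const aliases: { [alias: string]: string } = { js: "javascript", py: "python" };

// Own-property lookup, so names such as "constructor" do not hit Object.prototype.
function lookup<T>(table: { [key: string]: T }, key: string): T | null {
  if (!Object.prototype.hasOwnProperty.call(table, key)) return null;
  const value = table[key];
  return value === undefined ? null : value;
}

// Same shape as the documented option: info string in, parser or null out.
function codeParser(info: string): ParserSketch | null {
  // Assumption: the first word of the info string names the language.
  const word = info.trim().split(/\s+/)[0] || "";
  if (word === "") return null; // indented block, or a fence without info
  const name = word.toLowerCase();
  return lookup(languages, lookup(aliases, name) || name);
}
```

With the real package, a function of this shape would be passed as the codeParser field of the config object given to parseCode, returning actual Lezer parsers instead of the stand-in objects.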

GitHub Flavored Markdown

GFM: MarkdownConfig[]: Extension bundle containing Table, TaskList and Strikethrough.

Table: MarkdownConfig

This extension provides GFM-style tables, using syntax like this:

| head 1 | head 2 |
| ---    | ---    |
| cell 1 | cell 2 |

TaskList: MarkdownConfig: Extension providing GFM-style task list items, where list items can be prefixed with [ ] or [x] to add a checkbox.
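
The marker check described above can be approximated with a small standalone function. This is only a sketch of the [ ] and [x] prefixes named in the description; taskState is a hypothetical helper, and the extension's real matching rules may be stricter or looser.

```typescript
// Returns true for a leading "[x]", false for a leading "[ ]",
// and null when the list item text does not start with a task marker.
function taskState(itemText: string): boolean | null {
  const match = /^\[([ x])\]/.exec(itemText);
  if (!match) return null;
  return match[1] === "x";
}
```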

Strikethrough: MarkdownConfig: An extension that implements GFM-style Strikethrough syntax using ~~ delimiters.

Other extensions

Subscript: MarkdownConfig: Extension providing Pandoc-style subscript using ~ markers.

Superscript: MarkdownConfig: Extension providing Pandoc-style superscript using ^ markers.

Emoji: MarkdownConfig: Extension that parses two colons with only letters, underscores, and numbers between them as Emoji nodes.
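
The rule stated above (two colons with only letters, underscores, and numbers between them) can be sketched as a standalone matcher. matchEmoji is a hypothetical helper written for this example and is not the extension's actual code; it follows the convention used by inline parsers in this document of returning an end position, or -1 when nothing matches.

```typescript
// Returns the position just past a ":name:" emoji that starts at pos,
// or -1 when no emoji starts there.
function matchEmoji(text: string, pos: number): number {
  if (text.charCodeAt(pos) !== 58 /* ':' */) return -1;
  // At least one letter, digit, or underscore, followed by the closing colon.
  const match = /^[a-zA-Z_0-9]+:/.exec(text.slice(pos + 1));
  return match ? pos + 1 + match[0].length : -1;
}
```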

Extension

The parser can, to a certain extent, be extended to handle additional syntax.

interface NodeSpec

Used in the configuration to define new syntax node types.

name: string: The node's name.
block⁠?: boolean: Should be set to true if this type represents a block node.
composite⁠?: fn(cx: BlockContext, line: Line, value: number) → boolean: If this is a composite block, this should hold a function that, at the start of a new line where that block is active, checks whether the composite block should continue (return value) and optionally adjusts the line's base position and registers nodes for any markers involved in the block's syntax.
style?: Tag | readonly Tag[] | Object<Tag | readonly Tag[]>: Add highlighting tag information for this node. The value of this property may either be a tag or array of tags to assign directly to this node, or an object in the style of styleTags's argument to assign more complicated rules.

class BlockContext implements PartialParse

Block-level parsing functions get access to this context object.

lineStart: number: The start of the current line.
parser: MarkdownParser: The parser configuration used.
depth: number: The number of parent blocks surrounding the current block.
parentType(depth⁠?: number = this.depth - 1) → NodeType: Get the type of the parent block at the given depth. When no depth is passed, return the type of the innermost parent.
nextLine() → boolean: Move to the next input line. This should only be called by (non-composite) block parsers that consume the line directly, or leaf block parser nextLine methods when they consume the current line (and return true).
prevLineEnd() → number: The end position of the previous line.
startComposite(type: string, start: number, value⁠?: number = 0): Start a composite block. Should only be called from block parser functions that return null.
addElement(elt: Element): Add a block element. Can be called by block parsers.
addLeafElement(leaf: LeafBlock, elt: Element): Add a block element from a leaf parser. This makes sure any extra composite block markup (such as blockquote markers) inside the block are also added to the syntax tree.
elt(type: string, from: number, to: number, children?: readonly Element[]) → Element
elt(tree: Tree, at: number) → Element: Create an Element object to represent some syntax node.

interface BlockParser

Block parsers handle block-level structure. There are three general types of block parsers:

Composite block parsers, which handle things like lists and blockquotes. These define a parse method that starts a composite block and returns null when it recognizes its syntax.
Eager leaf block parsers, used for things like code or HTML blocks. These can unambiguously recognize their content from its first line. They define a parse method that, if it recognizes the construct, moves the current line forward to the line beyond the end of the block, adds a syntax node for the block, and returns true.
Leaf block parsers that observe a paragraph-like construct as it comes in, and optionally decide to handle it at some point. This is used for "setext" (underlined) headings and link references. These define a leaf method that checks the first line of the block and returns a LeafBlockParser object if it wants to observe that block.

name: string: The name of the parser. Can be used by other block parsers to specify precedence.
parse⁠?: fn(cx: BlockContext, line: Line) → boolean | null: The eager parse function, which can look at the block's first line and return false to do nothing, true if it has parsed (and moved past a block), or null if it has started a composite block.
leaf⁠?: fn(cx: BlockContext, leaf: LeafBlock) → LeafBlockParser | null: A leaf parse function. If no regular parse functions match for a given line, its content will be accumulated for a paragraph-style block. This method can return an object that overrides that style of parsing in some situations.
endLeaf⁠?: fn(cx: BlockContext, line: Line, leaf: LeafBlock) → boolean: Some constructs, such as code blocks or newly started blockquotes, can interrupt paragraphs even without a blank line. If your construct can do this, provide a predicate here that recognizes lines that should end a paragraph (or other non-eager leaf block).
before⁠?: string: When given, this parser will be installed directly before the block parser with the given name. The default configuration defines block parsers with names LinkReference, IndentedCode, FencedCode, Blockquote, HorizontalRule, BulletList, OrderedList, ATXHeading, HTMLBlock, and SetextHeading.
after⁠?: string: When given, the parser will be installed directly after the parser with the given name.

interface LeafBlockParser

Objects that are used to override paragraph-style blocks should conform to this interface.

nextLine(cx: BlockContext, line: Line, leaf: LeafBlock) → boolean: Update the parser's state for the next line, and optionally finish the block. This is not called for the first line (the object is constructed at that line), but for any further lines. When it returns true, the block is finished. It is okay for the function to consume the current line or any subsequent lines when returning true.
finish(cx: BlockContext, leaf: LeafBlock) → boolean: Called when the block is finished by external circumstances (such as a blank line or the start of another construct). If this parser can handle the block up to its current position, it should finish the block and return true.

class Line

Data structure used during block-level per-line parsing.

text: string: The line's full text.
baseIndent: number: The base indent provided by the composite contexts (that have been handled so far).
basePos: number: The string position corresponding to the base indent.
pos: number: The position of the next non-whitespace character beyond any list, blockquote, or other composite block markers.
indent: number: The column of the next non-whitespace character.
next: number: The character code of the character after pos.
skipSpace(from: number) → number: Skip whitespace after the given position, return the position of the next non-space character or the end of the line if there's only space after from.
moveBase(to: number): Move the line's base position forward to the given position. This should only be called by composite block parsers or markup skipping functions.
moveBaseColumn(indent: number): Move the line's base position forward to the given column.
addMarker(elt: Element): Store a composite-block-level marker. Should be called from markup skipping functions when they consume any non-whitespace characters.
countIndent(to: number, from⁠?: number = 0, indent⁠?: number = 0) → number: Find the column position at to, optionally starting at a given position and column.
findColumn(goal: number) → number: Find the position corresponding to the given column.
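
The difference between string positions and columns matters once tabs are involved. The standalone sketch below mirrors the documented countIndent and findColumn signatures, with the line text passed as an extra first argument. It assumes tab stops at every fourth column, as in the CommonMark specification; whether the library uses exactly this rule is an assumption here.

```typescript
const TAB = 9;

// Column reached at string position `to`, counting from position `from`,
// which is taken to sit at column `indent`.
function countIndent(text: string, to: number, from = 0, indent = 0): number {
  for (let i = from; i < to; i++) {
    // A tab advances to the next multiple of 4; any other character is one column.
    indent += text.charCodeAt(i) === TAB ? 4 - (indent % 4) : 1;
  }
  return indent;
}

// First string position whose column is at least `goal`
// (or the end of the text when the goal is never reached).
function findColumn(text: string, goal: number): number {
  let i = 0;
  for (let indent = 0; i < text.length && indent < goal; i++) {
    indent += text.charCodeAt(i) === TAB ? 4 - (indent % 4) : 1;
  }
  return i;
}
```

For a line that starts with two spaces and a tab, position 3 sits at column 4, because the tab only fills the remaining two columns up to the next tab stop.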

class LeafBlock

Data structure used to accumulate a block's content during leaf block parsing.

parsers: LeafBlockParser[]: The block parsers active for this block.
start: number: The start position of the block.
content: string: The block's text content.

class InlineContext

Inline parsing functions get access to this context, and use it to read the content and emit syntax nodes.

parser: MarkdownParser: The parser that is being used.
text: string: The text of this inline section.
offset: number: The starting offset of the section in the document.
char(pos: number) → number: Get the character code at the given (document-relative) position.
end: number: The position of the end of this inline section.
slice(from: number, to: number) → string: Get a substring of this inline section. Again uses document-relative positions.
addDelimiter(type: DelimiterType, from: number, to: number, open: boolean, close: boolean) → number: Add a delimiter at the given position. open and close indicate whether this delimiter is opening, closing, or both. Returns the end of the delimiter, for convenient returning from parse functions.
addElement(elt: Element) → number: Add an inline element. Returns the end of the element.
findOpeningDelimiter(type: DelimiterType) → number | null: Find an opening delimiter of the given type. Returns null if no delimiter is found, or an index that can be passed to takeContent otherwise.
takeContent(startIndex: number) → Element[]: Remove all inline elements and delimiters starting from the given index (which you should get from findOpeningDelimiter), resolve delimiters inside of them, and return them as an array of elements.
skipSpace(from: number) → number: Skip space after the given (document) position, returning either the position of the next non-space character or the end of the section.
elt(type: string, from: number, to: number, children?: readonly Element[]) → Element
elt(tree: Tree, at: number) → Element: Create an Element for a syntax node.
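
Because char, slice, and skipSpace take document-relative positions while text is section-relative, it can help to see the offset arithmetic spelled out. InlineSectionSketch below is a hypothetical stand-in that models only text, offset, end, and these three methods. Returning -1 from char for positions at or past the end, and the exact set of characters treated as space, are assumptions of this sketch.

```typescript
// Space, tab, newline, or carriage return (assumed whitespace set).
function isSpaceCode(ch: number): boolean {
  return ch === 32 || ch === 9 || ch === 10 || ch === 13;
}

// Stand-in for the documented InlineContext, covering only position handling.
class InlineSectionSketch {
  // Document position just past the last character of the section.
  readonly end: number;

  constructor(readonly text: string, readonly offset: number) {
    this.end = offset + text.length;
  }

  // Character code at a document-relative position; -1 at or past the end.
  char(pos: number): number {
    return pos >= this.end ? -1 : this.text.charCodeAt(pos - this.offset);
  }

  // Substring between two document-relative positions.
  slice(from: number, to: number): string {
    return this.text.slice(from - this.offset, to - this.offset);
  }

  // Next non-space document position at or after `from`, or the section end.
  skipSpace(from: number): number {
    let i = from - this.offset;
    while (i < this.text.length && isSpaceCode(this.text.charCodeAt(i))) i++;
    return i + this.offset;
  }
}
```

For a section with text "*hi*" that starts at document offset 10, the characters h and i are read with slice(11, 13), not slice(1, 3).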

interface InlineParser

Inline parsers are called for every character of parts of the document that are parsed as inline content.

name: string: This parser's name, which can be used by other parsers to indicate a relative precedence.
parse(cx: InlineContext, next: number, pos: number) → number: The parse function. Gets the next character and its position as arguments. Should return -1 if it doesn't handle the character, or add some element or delimiter and return the end position of the content it parsed if it can.
before⁠?: string: When given, this parser will be installed directly before the parser with the given name. The default configuration defines inline parsers with names Escape, Entity, InlineCode, HTMLTag, Emphasis, HardBreak, Link, and Image. When no before or after property is given, the parser is added to the end of the list.
after⁠?: string: When given, the parser will be installed directly after the parser with the given name.
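
The placement rules for before and after can be sketched on a plain list of parser names. installParser and PlacementSketch are hypothetical names for this example. Appending when the named parser is not found, and letting before take priority when both properties are given, are assumptions of the sketch and not documented behavior.

```typescript
// Only the fields that matter for ordering are modelled here.
interface PlacementSketch {
  name: string;
  before?: string;
  after?: string;
}

// Return a new list of parser names with `spec` placed according to
// its before/after property, or at the end when neither applies.
function installParser(
  order: readonly string[],
  spec: PlacementSketch
): string[] {
  const result = order.slice();
  let index = -1;
  if (spec.before !== undefined) {
    index = result.indexOf(spec.before);
  } else if (spec.after !== undefined) {
    const found = result.indexOf(spec.after);
    index = found < 0 ? -1 : found + 1;
  }
  if (index < 0) result.push(spec.name); // no usable anchor: add to the end
  else result.splice(index, 0, spec.name);
  return result;
}
```

Starting from the default inline parser names listed above, a parser declared with before set to Emphasis would sit between HTMLTag and Emphasis, so it gets a chance to handle a character before the emphasis parser does.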

interface DelimiterType

Delimiters are used during inline parsing to store the positions of things that might be delimiters, if another matching delimiter is found. They are identified by objects with these properties.

resolve⁠?: string

If this is given, the delimiter should be matched automatically when a piece of inline content is finished. Such delimiters will be matched with delimiters of the same type according to their open and close properties. When a match is found, the content between the delimiters is wrapped in a node whose name is given by the value of this property.

When this isn't given, you need to match the delimiter eagerly using the findOpeningDelimiter and takeContent methods.

mark⁠?: string

If the delimiter itself should, when matched, create a syntax node, set this to the name of the syntax node.
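
To make the automatic matching of delimiters with a resolve property more concrete, here is a deliberately simplified sketch. It pairs each closing delimiter with the nearest unmatched opening delimiter of the same type and names the wrapping node after resolve. All names in it are hypothetical stand-ins. The library's real resolution is more involved (CommonMark emphasis rules, mark nodes, nesting of child elements), so this should be read as an illustration of the idea only.

```typescript
// Stand-ins for the documented DelimiterType and for recorded delimiters.
interface DelimiterTypeSketch {
  resolve?: string;
  mark?: string; // not used in this sketch
}
interface DelimiterSketch {
  type: DelimiterTypeSketch;
  from: number;
  to: number;
  open: boolean;
  close: boolean;
}
interface NodeSketch {
  name: string;
  from: number;
  to: number;
}

function resolveSketch(delims: readonly DelimiterSketch[]): NodeSketch[] {
  const pending: DelimiterSketch[] = []; // unmatched openers, innermost last
  const nodes: NodeSketch[] = [];
  for (const delim of delims) {
    const name = delim.type.resolve;
    if (name === undefined) continue; // such delimiters are matched eagerly elsewhere
    let matched = false;
    if (delim.close) {
      for (let i = pending.length - 1; i >= 0 && !matched; i--) {
        const opener = pending[i];
        // Types are compared by object identity.
        if (opener !== undefined && opener.type === delim.type) {
          nodes.push({ name, from: opener.from, to: delim.to });
          pending.length = i; // drop the opener and any unmatched openers after it
          matched = true;
        }
      }
    }
    if (!matched && delim.open) pending.push(delim);
  }
  return nodes;
}
```

Inner pairs are emitted before the pairs that enclose them, and an opener that never finds a closing partner produces no node.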

class Element

Elements are used to compose syntax nodes during parsing.

type: number: The node's id.
from: number: The start of the node, as an offset from the start of the document.
to: number: The end of the node.
