path: root/markdown/htmlparser.py
Commit message (Author, Age, Files, Lines)
* Use pyspelling to check spelling. (Waylan Limberg, 2023-04-06, 1 file, -16/+16)
    In addition to checking the spelling in our documentation, we are now also checking the spelling of the README.md and similar files as well as comments in our Python code.
* Fix import issue with importlib.util (Waylan Limberg, 2022-07-15, 1 file, -1/+1)
    Fixes #1274.
* [style]: fix various typos in docstrings and comments (Florian Best, 2022-03-18, 1 file, -1/+1)
* Properly parse unclosed tags in code spans (Waylan Limberg, 2020-11-23, 1 file, -0/+32)
    * fix unclosed pi in code span
    * fix unclosed dec in code span
    * fix unclosed tag in code span

    Closes #1066.
* Properly parse code spans in md_in_html (#1069) (Waylan Limberg, 2020-11-18, 1 file, -2/+8)
    This reverts part of 2766698 and re-implements handling of tails in the same manner as the core. Also, ensure line_offset doesn't raise an error on bad input (see #1066) and properly handle script tags in code spans (same as in the core).

    Fixes #1068.
* Fix issues related to hr tags (Isaac Muse, 2020-10-24, 1 file, -0/+13)
    Ensure that start/end tag handler does not include tags in the previous paragraph.

    Provide special handling for tags like hr that never have content.

    Use sets for block tag lists as they are much faster when comparing if an item is in the list.

    Fixes #1053.
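The point about sets can be illustrated with a minimal sketch. The tag names and the `is_block_tag` helper below are hypothetical and chosen for illustration; they are not the collections or functions used in `htmlparser.py`. A list membership test scans elements one by one, while a set uses a hash lookup, so the gap tends to grow with the size of the collection.

```python
import timeit

# Hypothetical tag collection for illustration only;
# not the actual block tag list used by the library.
BLOCK_TAGS_LIST = [
    "address", "article", "aside", "blockquote", "div",
    "hr", "p", "pre", "section", "table",
]
BLOCK_TAGS_SET = set(BLOCK_TAGS_LIST)


def is_block_tag(tag, block_tags=BLOCK_TAGS_SET):
    """Return True if `tag` is in the (hypothetical) block tag set."""
    return tag in block_tags


# Time the same membership test against the list and the set.
list_time = timeit.timeit(lambda: "table" in BLOCK_TAGS_LIST, number=100_000)
set_time = timeit.timeit(lambda: "table" in BLOCK_TAGS_SET, number=100_000)
print(f"list: {list_time:.4f}s  set: {set_time:.4f}s")
```

The timings depend on the machine and the Python version, so no particular ratio is claimed here.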
* Correctly parse raw `script` and `style` tags. (#1038) (Waylan Limberg, 2020-10-12, 1 file, -0/+70)
    * Ensure unclosed script tags are parsed correctly by providing a workaround for https://bugs.python.org/issue41989.
    * Avoid cdata_mode outside of HTML blocks, such as in inline code spans.

    Fixes #1036.
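For context on the CDATA mode mentioned above, here is a small sketch of how the standard library's `html.parser.HTMLParser` treats the content of a closed `script` element as raw data rather than markup. The `ScriptProbe` class is a hypothetical helper written for this example; it does not reproduce the workaround from the commit and does not cover the unclosed-tag case, whose behavior depends on the Python version.

```python
from html.parser import HTMLParser


class ScriptProbe(HTMLParser):
    """Hypothetical helper: record start tags and raw data chunks."""

    def __init__(self):
        super().__init__()
        self.start_tags = []
        self.data_chunks = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)

    def handle_data(self, data):
        self.data_chunks.append(data)


probe = ScriptProbe()
# The "<p>" inside the script body is not reported as a start tag,
# because the parser is in CDATA mode until it reaches "</script>".
probe.feed('<script>if (a < b) { x = "<p>"; }</script>')
probe.close()

print(probe.start_tags)
print("".join(probe.data_chunks))
```

The data chunks are joined before printing because the parser is free to deliver text in more than one `handle_data` call.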
* Refactor HTML Parser (#803) (Waylan Limberg, 2020-09-22, 1 file, -0/+202)
    The HTML parser has been completely replaced. The new HTML parser is built on Python's html.parser.HTMLParser, which alleviates various bugs and simplifies maintenance of the code.

    The md_in_html extension has been rebuilt on the new HTML Parser, which drastically simplifies it. Note that raw HTML elements with a markdown attribute defined are now converted to ElementTree Elements and are rendered by the serializer. Various bugs have been fixed.

    Link reference parsing, abbreviation reference parsing and footnote reference parsing have all been moved from preprocessors to blockprocessors, which allows them to be nested within other block level elements. Specifically, this change was necessary to maintain the current behavior in the rebuilt md_in_html extension. A few random edge-case bugs (see the included tests) were resolved in the process.

    Closes #595, closes #780, closes #830 and closes #1012.
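As a rough illustration of what building on `html.parser.HTMLParser` looks like, the sketch below subclasses it and records the events it emits for an element carrying a `markdown` attribute. The `TagRecorder` class and its `events` list are hypothetical names for this example only; the real parser in `htmlparser.py` does considerably more and is not reproduced here.

```python
from html.parser import HTMLParser


class TagRecorder(HTMLParser):
    """Hypothetical helper: record parser events in the order they occur."""

    def __init__(self):
        # Keep character references untouched so raw text passes through as-is.
        super().__init__(convert_charrefs=False)
        self.events = []

    def handle_starttag(self, tag, attrs):
        # `attrs` is a list of (name, value) pairs supplied by HTMLParser.
        self.events.append(("start", tag, dict(attrs)))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


recorder = TagRecorder()
recorder.feed('<div markdown="1">*text*</div>')
recorder.close()

for event in recorder.events:
    print(event)
# ('start', 'div', {'markdown': '1'})
# ('data', '*text*')
# ('end', 'div')
```

A subclass like this sees the `markdown` attribute in `handle_starttag`, which is the kind of hook a parser could use to decide whether an element's content should be processed further.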