diff options
author | Lorry Tar Creator <lorry-tar-importer@lorry> | 2013-05-08 22:21:52 +0000 |
---|---|---|
committer | Lorry Tar Creator <lorry-tar-importer@lorry> | 2013-05-08 22:21:52 +0000 |
commit | 2f253cfc85ffd55a8acb988e91f0bc5ab348124c (patch) | |
tree | 4734ccd522c71dd455879162006742002f8c1565 /TODO | |
download | HTML-Parser-tarball-master.tar.gz |
HTML-Parser-3.71HEADHTML-Parser-3.71master
Diffstat (limited to 'TODO')
-rw-r--r-- | TODO | 28 |
1 files changed, 28 insertions, 0 deletions
@@ -0,0 +1,28 @@ +TODO + - Check how we compare to the HTML5 parsing rules + - limit the length of markup elements that never end. Perhaps by + configurable limits on the length that markup can have and still + be recognized. Report stuff as 'text' when this happens? + - remove 255 char limit on literal argspec strings + - implement backslash escapes in literal argspec string + - <![%app1;[...]]> (parameter entities) + - make literal tags configurable. The current list is hardcoded + to be "script", "style", "title", "iframe", "textarea", "xmp", + and "plaintext". + + +SGML FEATURES WE WILL PROBABLY IGNORE FOREVER + - Empty tags: <> </> (repeat previous start tag) + - <foo<bar> (same as <foo><bar>) + - NET tags <name/.../ + + +MINOR "BUGS" (alias FEATURES) + - no way to clear "boolean_attribute_value". + - <style> and <script> do not end with the first "</". + + +MSIE bug compatibility + - recognize server side includes as comments; <% ... %> + if no matching %> found tread "<% ..." as text + - skip quoted strings when looking for PIC |