author Lorry Tar Creator <lorry-tar-importer@lorry> 2013-05-08 22:21:52 +0000
committer Lorry Tar Creator <lorry-tar-importer@lorry> 2013-05-08 22:21:52 +0000
commit 2f253cfc85ffd55a8acb988e91f0bc5ab348124c
tree 4734ccd522c71dd455879162006742002f8c1565 /TODO
Diffstat (limited to 'TODO')
-rw-r--r-- TODO 28
1 files changed, 28 insertions, 0 deletions
diff --git a/TODO b/TODO
new file mode 100644
index 0000000..ea69920
--- /dev/null
+++ b/TODO
@@ -0,0 +1,28 @@
+TODO
+ - Check how we compare to the HTML5 parsing rules
+ - limit the length of markup elements that never end. Perhaps by
+ configurable limits on the length that markup can have and still
+ be recognized. Report stuff as 'text' when this happens?
+ - remove 255 char limit on literal argspec strings
+ - implement backslash escapes in literal argspec string
+ - <![%app1;[...]]> (parameter entities)
+ - make literal tags configurable. The current list is hardcoded
+ to be "script", "style", "title", "iframe", "textarea", "xmp",
+ and "plaintext".
+
+
+SGML FEATURES WE WILL PROBABLY IGNORE FOREVER
+ - Empty tags: <> </> (repeat previous start tag)
+ - <foo<bar> (same as <foo><bar>)
+ - NET tags <name/.../
+
+
+MINOR "BUGS" (alias FEATURES)
+ - no way to clear "boolean_attribute_value".
+ - <style> and <script> do not end with the first "</".
+
+
+MSIE bug compatibility
+ - recognize server side includes as comments; <% ... %>
+ if no matching %> found treat "<% ..." as text
+ - skip quoted strings when looking for PIC
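
Note on the "MSIE bug compatibility" item about server side includes: the rule it describes (a "<% ... %>" span is reported as a comment, and a "<%" with no matching "%>" is left as text) can be sketched as follows. This is only an illustration in Python under stated assumptions. The function name split_asp_comments and the (kind, string) token tuples are hypothetical, are not part of HTML-Parser, and say nothing about how the parser would actually implement the item.

```python
def split_asp_comments(html):
    """Split a string into ("text", s) and ("comment", s) tokens.

    Hypothetical helper for illustration only: a "<% ... %>" span
    becomes a comment token; an unmatched "<%" stays ordinary text.
    """
    tokens = []
    pos = 0
    while True:
        start = html.find("<%", pos)
        if start == -1:
            break
        end = html.find("%>", start + 2)
        if end == -1:
            # No matching "%>": treat "<% ..." as text (handled below).
            break
        if start > pos:
            tokens.append(("text", html[pos:start]))
        # Report the body between the delimiters as a comment.
        tokens.append(("comment", html[start + 2:end]))
        pos = end + 2
    if pos < len(html):
        # Whatever remains, including any unmatched "<%", is text.
        tokens.append(("text", html[pos:]))
    return tokens
```

The search for "%>" starts two characters after "<%", so the three-character input "<%>" is not taken as an empty comment and falls through to the text case.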