summaryrefslogtreecommitdiff
path: root/ext/tidy/README
diff options
context:
space:
mode:
Diffstat (limited to 'ext/tidy/README')
-rw-r--r--ext/tidy/README122
1 files changed, 0 insertions, 122 deletions
diff --git a/ext/tidy/README b/ext/tidy/README
deleted file mode 100644
index 2d4e015176..0000000000
--- a/ext/tidy/README
+++ /dev/null
@@ -1,122 +0,0 @@
-
-README FOR ext/tidy by John Coggeshall <john@php.net>
-
-Tidy Version: 0.7b
-
-Tidy is an extension based on Libtidy (http://tidy.sf.net/) and allows a PHP developer
-to clean, repair, and traverse HTML, XHTML, and XML documents -- including ones with
-embedded scripting languages such as PHP or ASP within them using OO constructs.
-
----------------------------------------------------------------------------------------
-!! Important Note !!
----------------------------------------------------------------------------------------
-At this time libtidy has a small memory leak inside the ParseConfigFileEnc() function
-used to load configuration from a file. If you intend to use this functionality apply
-the "libtidy.txt" patch (cd tidy/src/; patch -p0 < libtidy.txt) to libtidy sources and
-then recompile libtidy.
----------------------------------------------------------------------------------------
-
-The Tidy extension has two separate APIs, one for general parsing, cleaning, and
-repairing and another for document traversal. The general API is provided below:
-
- tidy_create() Reinitialize the tidy engine
- tidy_parse_file($file) Parse the document stored in $file
- tidy_parse_string($str) Parse the string stored in $str
-
- tidy_clean_repair() Clean and repair the document
- tidy_diagnose() Diagnose a parsed document
-
- tidy_setopt($opt, $val) Set a configuration option $opt to $val
- tidy_getopt($opt) Retrieve a configuration option
-
- ** note: $opt is a string representing the option. Although no formal
- documentation yet exists for PHP, you can find a description of many
- of them at http://www.w3.org/People/Raggett/tidy/ and a list of supported
- options in the phpinfo(); output**
-
- tidy_get_output() Return the cleaned tidy HTML as a string
- tidy_get_error_buffer() Return a log of the errors and warnings
- returned by tidy
-
- tidy_get_release() Return the Libtidy release date
- tidy_get_status() Return the status of the document
- tidy_get_html_ver() Return the major HTML version detected for
- the document;
-
- tidy_is_xhtml() Determines if the document is XHTML
- tidy_is_xml() Determines if the document is a generic XML
-
- tidy_error_count() Returns the number of errors in the document
- tidy_warning_count() Returns the number of warnings in the document
- tidy_access_count() Returns the number of accessibility-related
- warnings in the document.
- tidy_config_count() Returns the number of configuration errors found
-
- tidy_load_config($file) Loads the specified configuration file
- tidY_load_config_enc($file,
- $enc) Loads the specified config file using the specified
- character encoding
- tidy_set_encoding($enc) Sets the current character encoding for the document
- tidy_save_config($file) Saves the current config to $file
-
-
-Beyond these general-purpose API functions, Tidy also supports the following
-functions which are used to retrieve an object for document traversal:
-
- tidy_get_root() Returns an object starting at the root of the
- document
- tidy_get_head() Returns an object starting at the <HEAD> tag
- tidy_get_html() Returns an object starting at the <HTML> tag
- tidy_get_body() Returns an object starting at the <BODY> tag
-
-All Navigation of the specified document is done via the PHP5 object constructs.
-There are two types of objects which Tidy can create. The first is TidyNode, which
-represents HTML Tags, Text, and more (see the TidyNode_Type Constants). The second
-is TidyAttr, which represents an attribute within an HTML tag (TidyNode). The
-functionality of these objects is represented by the following schema:
-
-class TidyNode {
-
- public $name; // name of node (i.e. HEAD)
- public $value; // value of node (everything between tags)
- public $type; // type of node (text, php, asp, etc.)
- public $id; // id of node (i.e. TIDY_TAG_HEAD)
-
- public function attributes(); // an array of attributes (see TidyAttr)
- public function children(); // an array of child nodes
-
- function has_siblings(); // any sibling nodes?
- function has_children(); // any child nodes?
-
- function is_comment(); // is node a comment?
- function is_xhtml(); // is document XHTML?
- function is_xml(); // is document generic XML (not HTML/XHTML)
- function is_text(); // is node text?
- function is_html(); // is node an HTML tag?
-
- function is_jste(); // is jste block?
- function is_asp(); // is Microsoft ASP block?
- function is_php(); // is PHP block?
-
- function next(); // returns next node
- function prev(); // returns prev node
-
- /* Searches for a particular attribute in the current node based
- on node ID. If found returns a TidyAttr object for it */
- function get_attr($attr_id);
-
- /*
-}
-
-class TidyAttr {
-
- public $name; // attribute name i.e. HREF
- public $value; // attribute value
- public $id; // attribute id i.e. TIDY_ATTR_HREF
-
-}
-
-Examples of using these objects to navigate the tree can be found in the examples/
-directory (I suggest looking at urlgrab.php and dumpit.php)
-
-E-mail thoughts, suggestions, patches, etc. to <john@php.net>