author | fdrake <fdrake> | 2004-07-23 03:28:08 +0000 |
---|---|---|
committer | fdrake <fdrake> | 2004-07-23 03:28:08 +0000 |
commit | a93f71c24ff5c17787fd51a7e2f7e38f04541745 (patch) | |
tree | 790a43bdbaafcae6648cb78bc8d8fb3d2ac8baa2 /doc | |
parent | 7b8aad62a09e0c154f06327269852ce66efa80f5 (diff) | |
download | libexpat-a93f71c24ff5c17787fd51a7e2f7e38f04541745.tar.gz |
Add basic documentation for the suspend/resume feature.
Closes SF bug #880632.
Diffstat (limited to 'doc')
-rw-r--r-- | doc/reference.html | 265 |
-rw-r--r-- | doc/style.css | 11 |
2 files changed, 276 insertions, 0 deletions
diff --git a/doc/reference.html b/doc/reference.html
index 91ec6d1..2ec252c 100644
--- a/doc/reference.html
+++ b/doc/reference.html
@@ -72,6 +72,9 @@ interface.</p>
       <li><a href="#XML_Parse">XML_Parse</a></li>
       <li><a href="#XML_ParseBuffer">XML_ParseBuffer</a></li>
       <li><a href="#XML_GetBuffer">XML_GetBuffer</a></li>
+      <li><a href="#XML_StopParser">XML_StopParser</a></li>
+      <li><a href="#XML_ResumeParser">XML_ResumeParser</a></li>
+      <li><a href="#XML_GetParsingStatus">XML_GetParsingStatus</a></li>
     </ul>
   </li>
   <li><a href="#setting">Handler Setting Functions</a>
@@ -728,6 +731,149 @@ arguments:</p>
 <p>In order to read an external DTD, you also have to set an external
 entity reference handler as described above.</p>
 
+<h3 id="stop-resume">Temporarily Stopping Parsing</h3>
+
+<p>Expat 1.95.8 introduces a new feature: it's now possible to stop
+parsing temporarily from within a handler function, even if more data
+has already been passed into the parser. Applications for this
+include</p>
+
+<ul>
+  <li>Supporting the <a href= "http://www.w3.org/TR/xinclude/"
+  >XInclude</a> specification.</li>
+
+  <li>Delaying further processing until additional information is
+  available from some other source.</li>
+
+  <li>Adjusting processor load as task priorities shift within an
+  application.</li>
+
+  <li>Stopping parsing completely (simply free or reset the parser
+  instead of resuming in the outer parsing loop). This can be useful
+  if an application-domain error is found in the XML being parsed or if
+  the result of the parse is determined not to be useful after
+  all.</li>
+</ul>
+
+<p>To take advantage of this feature, the main parsing loop of an
+application needs to support this specifically. It cannot be
+supported with a parsing loop compatible with Expat 1.95.7 or
+earlier (though existing loops will continue to work without
+supporting the stop/resume feature).</p>
+
+<p>An application that uses this feature for a single parser will have
+the rough structure (in pseudo-code):</p>
+
+<pre class="pseudocode">
+fd = open_input()
+p = create_parser()
+
+if parse_xml(p, fd) {
+  /* suspended */
+
+  int suspended = 1;
+
+  while (suspended) {
+    do_something_else()
+    if ready_to_resume() {
+      suspended = continue_parsing(p, fd);
+    }
+  }
+}
+</pre>
+
+<p>An application that may resume any of several parsers based on
+input (either from the XML being parsed or some other source) will
+certainly have more interesting control structures.</p>
+
+<p>This C function could be used for the <code>parse_xml</code>
+function mentioned in the pseudo-code above:</p>
+
+<pre class="eg">
+#define BUFF_SIZE 10240
+
+/* Parse a document from the open file descriptor 'fd' until the parse
+   is complete (the document has been completely parsed, or there's
+   been an error), or the parse is stopped. Return non-zero when
+   the parse is merely suspended.
+*/
+int
+parse_xml(XML_Parser p, int fd)
+{
+  for (;;) {
+    int last_chunk;
+    int bytes_read;
+    enum XML_Status status;
+
+    void *buff = XML_GetBuffer(p, BUFF_SIZE);
+    if (buff == NULL) {
+      /* handle error... */
+      return 0;
+    }
+    bytes_read = read(fd, buff, BUFF_SIZE);
+    if (bytes_read < 0) {
+      /* handle error... */
+      return 0;
+    }
+    status = XML_ParseBuffer(p, bytes_read, bytes_read == 0);
+    switch (status) {
+      case XML_STATUS_ERROR:
+        /* handle error... */
+        return 0;
+      case XML_STATUS_SUSPENDED:
+        return 1;
+    }
+    if (bytes_read == 0)
+      return 0;
+  }
+}
+</pre>
+
+<p>The corresponding <code>continue_parsing</code> function is
+somewhat simpler, since it only need deal with the return code from
+<code><a href= "#XML_ResumeParser">XML_ResumeParser</a></code>; it can
+delegate the input handling to the <code>parse_xml</code>
+function:</p>
+
+<pre class="eg">
+/* Continue parsing a document which had been suspended. The 'p' and
+   'fd' arguments are the same as passed to parse_xml(). Return
+   non-zero when the parse is suspended.
+*/
+int
+continue_parsing(XML_Parser p, int fd)
+{
+  enum XML_Status status = XML_ResumeParser(p);
+  switch (status) {
+    case XML_STATUS_ERROR:
+      /* handle error; XML_GetErrorCode(p) yields
+         XML_ERROR_NOT_SUSPENDED when the parser was
+         not suspended, since that is an error code
+         rather than an XML_Status value */
+      return 0;
+    case XML_STATUS_SUSPENDED:
+      return 1;
+  }
+  return parse_xml(p, fd);
+}
+</pre>
+
+<p>Now that we've seen what a mess the top-level parsing loop can
+become, what have we gained? Very simply, we can now use the <code><a
+href= "#XML_StopParser" >XML_StopParser</a></code> function to stop
+parsing, without having to go to great lengths to avoid additional
+processing that we're expecting to ignore. As a bonus, we get to stop
+parsing <em>temporarily</em>, and come back to it when we're
+ready.</p>
+
+<p>To stop parsing from a handler function, use the <code><a href=
+"#XML_StopParser" >XML_StopParser</a></code> function. This function
+takes two arguments: the parser being stopped and a flag indicating
+whether the parse can be resumed in the future.</p>
+
+<!-- XXX really need more here -->
+
+
 <hr />
 <!-- ================================================================ -->
 
@@ -916,6 +1062,125 @@ for (;;) {
 </pre>
 </div>
 
+<pre class="fcndec" id="XML_StopParser">
+enum XML_Status XMLCALL
+XML_StopParser(XML_Parser p,
+               XML_Bool resumable);
+</pre>
+<div class="fcndef">
+
+<p>Stops parsing, causing <code><a href= "#XML_Parse"
+>XML_Parse</a></code> or <code><a href= "#XML_ParseBuffer"
+>XML_ParseBuffer</a></code> to return. Must be called from within a
+call-back handler, except when aborting (when <code>resumable</code>
+is <code>XML_FALSE</code>) an already suspended parser. Some
+call-backs may still follow because they would otherwise get
+lost, including
+<ul>
+  <li> the end element handler for empty elements when stopped in the
+  start element handler,</li>
+  <li> end namespace declaration handler when stopped in the end
+  element handler,</li>
+</ul>
+and possibly others.</p>
+
+<p>This can be called from most handlers, including DTD-related
+call-backs, except when parsing an external parameter entity and
+<code>resumable</code> is <code>XML_TRUE</code>. Returns
+<code>XML_STATUS_OK</code> when successful,
+<code>XML_STATUS_ERROR</code> otherwise. The possible error codes
+are:</p>
+<dl>
+  <dt><code>XML_ERROR_SUSPENDED</code></dt>
+  <dd>when suspending an already suspended parser.</dd>
+  <dt><code>XML_ERROR_FINISHED</code></dt>
+  <dd>when the parser has already finished.</dd>
+  <dt><code>XML_ERROR_SUSPEND_PE</code></dt>
+  <dd>when suspending while parsing an external PE.</dd>
+</dl>
+
+<p>Since the stop/resume feature requires application support in the
+outer parsing loop, it is an error to call this function for a parser
+not being handled appropriately; see <a href= "#stop-resume"
+>Temporarily Stopping Parsing</a> for more information.</p>
+
+<p>When <code>resumable</code> is <code>XML_TRUE</code> then parsing
+is <em>suspended</em>, that is, <code><a href= "#XML_Parse"
+>XML_Parse</a></code> and <code><a href= "#XML_ParseBuffer"
+>XML_ParseBuffer</a></code> return <code>XML_STATUS_SUSPENDED</code>.
+Otherwise, parsing is <em>aborted</em>, that is, <code><a href=
+"#XML_Parse" >XML_Parse</a></code> and <code><a href=
+"#XML_ParseBuffer" >XML_ParseBuffer</a></code> return
+<code>XML_STATUS_ERROR</code> with error code
+<code>XML_ERROR_ABORTED</code>.</p>
+
+<p><strong>Note:</strong>
+This will be applied to the current parser instance only, that is, if
+there is a parent parser then it will continue parsing when the
+external entity reference handler returns. It is up to the
+implementation of that handler to call <code><a href=
+"#XML_StopParser" >XML_StopParser</a></code> on the parent parser
+(recursively), if one wants to stop parsing altogether.</p>
+
+<p>When suspended, parsing can be resumed by calling <code><a href=
+"#XML_ResumeParser" >XML_ResumeParser</a></code>.</p>
+
+<p>New in Expat 1.95.8.</p>
+</div>
+
+<pre class="fcndec" id="XML_ResumeParser">
+enum XML_Status XMLCALL
+XML_ResumeParser(XML_Parser p);
+</pre>
+<div class="fcndef">
+<p>Resumes parsing after it has been suspended with <code><a href=
+"#XML_StopParser" >XML_StopParser</a></code>. Must not be called from
+within a handler call-back. Returns the same status codes as <code><a
+href= "#XML_Parse">XML_Parse</a></code> or <code><a href=
+"#XML_ParseBuffer" >XML_ParseBuffer</a></code>. An additional error
+code, <code>XML_ERROR_NOT_SUSPENDED</code>, is reported (with status
+<code>XML_STATUS_ERROR</code>) if the parser was not suspended.</p>
+
+<p><strong>Note:</strong>
+This must be called on the most deeply nested child parser instance
+first, and on its parent parser only after the child parser has
+finished, to be applied recursively until the document entity's parser
+is restarted. That is, the parent parser will not resume by itself
+and it is up to the application to call <code><a href=
+"#XML_ResumeParser" >XML_ResumeParser</a></code> on it at the
+appropriate moment.</p>
+
+<p>New in Expat 1.95.8.</p>
+</div>
+
+<pre class="fcndec" id="XML_GetParsingStatus">
+void XMLCALL
+XML_GetParsingStatus(XML_Parser p,
+                     XML_ParsingStatus *status);
+</pre>
+<pre class="signature">
+enum XML_Parsing {
+  XML_INITIALIZED,
+  XML_PARSING,
+  XML_FINISHED,
+  XML_SUSPENDED
+};
+
+typedef struct {
+  enum XML_Parsing parsing;
+  XML_Bool finalBuffer;
+} XML_ParsingStatus;
+</pre>
+<div class="fcndef">
+<p>Returns the status of the parser with respect to being initialized,
+parsing, finished, or suspended, and whether the final buffer is being
+processed. The <code>status</code> parameter <em>must not</em> be
+NULL.</p>
+
+<p>New in Expat 1.95.8.</p>
+</div>
+
+
 <h3><a name="setting">Handler Setting</a></h3>
 
 <p>Although handlers are typically set prior to parsing and left alone, an
diff --git a/doc/style.css b/doc/style.css
index 8f19fbd..69df30b 100644
--- a/doc/style.css
+++ b/doc/style.css
@@ -49,6 +49,17 @@ body {
   margin-right: 10%;
 }
 
+.pseudocode {
+  padding-left: 1em;
+  padding-top: .5em;
+  padding-bottom: .5em;
+  border: solid thin;
+  margin: 1em 0;
+  background-color: rgb(250,220,180);
+  margin-left: 2em;
+  margin-right: 10%;
+}
+
 .handler {
   width: 100%;
   border-top-width: thin;