author | Ken Sharp <ken.sharp@artifex.com> | 2016-02-08 13:55:40 +0000 |
---|---|---|
committer | Ken Sharp <ken.sharp@artifex.com> | 2016-02-08 13:55:40 +0000 |
commit | 119e73617fb0f1b20e6d3257d26df0159c4ca81a (patch) | |
tree | dcf3194fa8ab4714fda8354db9dcf85dc58a5df8 | |
parent | 15f8b6ce6d7ae574d7803bb19d2f5cec474f087b (diff) | |
download | ghostpdl-119e73617fb0f1b20e6d3257d26df0159c4ca81a.tar.gz |
PDF interpreter - yet more robust error handling with broken files
Bug #696540 "ps2pdf fails on a file that can be opened by some other viewers"
Two problems here.
First, the file has been truncated, and garbage written after the startxref
token. Previously we used the 'token' operator to try and read both the
'startxref' and the actual offset value. However, while executing the
token operator we reached EOF, and this *closes* the underlying file.
Unsurprisingly this then caused ioerrors on every subsequent operation.
So, define a new routine 'token_no_close' which installs a SubFileDecode
filter on top of the existing file/filter chain and reads from that. We
explicitly set CloseSource to false so that even if we encounter EOF
while executing the file we will only close the filter and not the
underlying file/filter chain.
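
The wrapping technique can be tried in isolation. The following is a minimal sketch, not the code from this commit: `src` is a hypothetical in-memory stand-in for PDFfile, built from a made-up string, and `wrapped` is an illustrative name. The byte count is written as a literal (the length of the sample string) rather than taken from `bytesavailable`.

```postscript
%!PS
% Hypothetical stand-in for PDFfile: a 14-byte in-memory stream.
/src (startxref 1234) /ReusableStreamDecode filter def

% Layer a SubFileDecode filter over the source. With an empty EODString,
% the filter passes exactly EODCount bytes and then signals EOF.
% CloseSource false asks that closing the filter leaves src open.
/wrapped
  src
  << /EODCount 14 /EODString () /CloseSource false >>
  /SubFileDecode filter
def

wrapped token pop pop          % read and discard the name startxref
wrapped token {                % read the offset; EOF is reached on the filter
  ==                           % print whatever token was returned
} {
  (no token found\n) print
} ifelse

wrapped closefile              % harmless if the filter has already closed
src status ==                  % report whether the source is still open
```

The final `status` call is the point of interest: any EOF hit by `token` lands on the throwaway filter, so the source underneath should not be the stream that gets closed.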
However this then exposed a different problem when rebuilding the xref;
we scan backwards looking for a trailer with a /Root key. But if the
trailer was early in the file (< 64Kb), and the block reading worked
out so that the initial block was less than 64Kb, then we would calculate
the offset to the trailer incorrectly, because we assumed the block size
was always 64Kb, which it isn't if we don't have 64Kb to read.
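
A worked example of the arithmetic, as a sketch with made-up numbers: `position`, `block_size`, `new_position` and `remaining` are illustrative names, and the values 1000 and 300 are assumptions chosen so that the block is shorter than 64Kb. The minimum is computed with standard operators instead of the Ghostscript-specific `.min`.

```postscript
%!PS
% Hypothetical values: the backwards scan starts 1000 bytes into the file,
% and 300 bytes of the block remain after the last (trailer) match.
/position 1000 def
/remaining 300 def

% block_size = min(position, 65535)
/block_size position 65535 2 copy gt { exch } if pop def

% The block is read starting from here.
/new_position position block_size sub def

% Old calculation: assumes the block is always 65535 bytes long.
new_position 65535 add remaining sub ==        % prints 65235

% Fixed calculation: uses the size of the block that was actually read.
new_position block_size add remaining sub ==   % prints 700
```

With a full 65535-byte block both expressions agree, which is why the problem only appears when the trailer sits near the start of the file.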
No differences expected
-rw-r--r-- | Resource/Init/pdf_main.ps | 29 |
-rw-r--r-- | Resource/Init/pdf_rbld.ps | 18 |
2 files changed, 39 insertions, 8 deletions
diff --git a/Resource/Init/pdf_main.ps b/Resource/Init/pdf_main.ps
index 257181fda..37fe6970b 100644
--- a/Resource/Init/pdf_main.ps
+++ b/Resource/Init/pdf_main.ps
@@ -1169,6 +1169,33 @@ currentdict /xref-char-dict undef
 currentdict end
 } bind def
 
+%% Executing token on a file will close the file if we reach EOF while
+%% processing. When repairing broken files (or searching for startxref
+%% and the xref offset) we do *NOT* want this to happen, because that
+%% will close PDFfile and we don't have the filename to reopen it.
+/token_no_close { %% -file- token_no_close <any> true | false
+  dup type /filetype eq {
+    <<
+      /EODCount 2 index bytesavailable %% fix data length at underlying bytes
+      /EODString () %% make sure filter passes that many bytes, no EOD
+      /CloseSource false %% Be sure, tell the filter not to close the source file
+    >>
+    /SubFileDecode filter dup %% -filter- -filter-
+    token { %% -filter- <any> true | false
+      %% token returned a value
+      exch %% <any> filter
+      closefile %% <any>
+      true %% <any> true
+    }{
+      %% token didn't find a value
+      closefile %% -
+      false %% false
+    } ifelse
+  } {
+    token
+  } ifelse
+} bind def
+
 % Look for the last (startxref) from the current position
 % of the file. Return the position after (startxref) if found or -1 .
 /find-startxref { % <file> find_eof <file> <position>
@@ -1203,7 +1230,7 @@ currentdict /xref-char-dict undef
 } if
 } if
 2 copy setfileposition
- pop token not { //null } if
+ pop token_no_close not { //null } if
 dup type /integertype ne {
 ( **** Error: invalid token after startxref.\n) pdfformaterror
 ( Output may be incorrect.\n) pdfformaterror
diff --git a/Resource/Init/pdf_rbld.ps b/Resource/Init/pdf_rbld.ps
index a794eda88..cfa6242e1 100644
--- a/Resource/Init/pdf_rbld.ps
+++ b/Resource/Init/pdf_rbld.ps
@@ -165,22 +165,26 @@
 %% choose to enhance this routine further.
 %%
 /search_earlier_trailer {
-{ % position
+ { % position
 dup 0 gt { % position bool
- dup 65535 .min exch % block_size position
+ dup 65535 .min exch % block_size position
 1 index sub % block_size position-block_size
 dup % block_size new_position new_position
 PDFfile exch setfileposition % block_size new position
- exch dup string 0 1 4 -1 roll 1 sub %
+ exch dup % new position block_size block_size
+ dup string 0 1 4 -1 roll 1 sub %
 {2 copy PDFfile read pop put pop } for %
- % new_position (...string from file....)
+ % new_position block size (...string from file....)
 (trailer) search {
 pop
 { search not { exit } if pop } loop
 % determine where the trailer is in the file
 % trailer loc = end loc - remaing string length
- length exch 65535 add exch sub
+ length % new_position block size string length
+ 3 1 roll % string length new_position block size
+ add exch sub % string length - (new_position + block size)
 } {
+ pop % discard old block size
 pop 0
 } ifelse
 } {
@@ -257,7 +261,7 @@
 length sub 9 sub
 % move the file to this position and read startxref and position
 PDFfile exch setfileposition PDFfile token
- pop pop PDFfile token pop
+ pop pop PDFfile token_no_close pop
 dup type /integertype eq not {
 pop
 % startxref not followed by integer. We will search the end of the file for trailer.
@@ -431,7 +435,7 @@
 pdfformaterror
 ( **** Ghostscript will attempt to recover the data.\n)
 pdfformaterror
- ( However, the output may be incorrect.\n) pdfformaterror
+ ( **** However, the output may be incorrect.\n) pdfformaterror
 } bind def
 
 % Attempt to recover the XRef data. This is called if we have a failure