| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Noticed while working on a different issue; we were not cleaning up the
dictionary, nor restoring the PostScript state, after running a
Portfolio (PDF Collection) file.
Fixed by calling runpdfend after running all of the embedded PDF files.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
No file or bug report for this, the customer requested the files be
kept private. However any PDF Collection (Portfolio) file will show
the problem.
GhostPDF supports preserving embedded files from the input, but when
we are processing a PDF Collection we don't want to do that, because
in this case we run each of the embedded files individually. If we
copy the EmbeddedFiles as well then we end up duplicating them in the
output.
So, when processing EmbeddedFiles, check the Catalog to see if there is
a /Collection key; if there is, stop processing EmbeddedFiles.
The customer also pointed out there was no way to avoid embedding any
EmbeddedFiles from the input, so additionally add a new switch
-dPreserveEmbeddedFiles to control this. While we're doing that, add
one to control the preservation of 'DOCVIEW' (PageMode, PageLayout,
OpenAction) as well, -dPreserveDocView.
This then leads on to preventing the EmbeddedFiles in a PDF Collection
from writing their DocView information. If we let them do that then
we end up opening the file incorrectly.
To facilitate similar changes in the future I've rejigged the way
.PDFInit works, so that it calls a helper function to read any
interpreter parameters and applies them to the PDF context. I've also
added a new PostScript operator '.PDFSetParams' which takes a PDF
context and a dictionary of key/value pairs which it applies to the
context.
Sadly I can't actually use that for the docview control, because the
PDF initialisation is what processes the document, so changing it
afterwards is no help. So I've altered runpdfbegin to call a new
function runpdfbegin_with_params and pass an empty dictionary. That then
allows me to call runpdfbegin_with_params from the PDF Collection
processing, and turn off PreserveDocView.
So in summary: new controls PreserveDocView and PreserveEmbeddedFiles
and a new function .PDFSetParams to allow us to alter the PDF
interpreter parameters after .PDFInit is executed. PDF Collections no
longer embed duplicate files.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we load a Type 1 font from a font file (i.e. not an embedded font), we use
the Adobe Glyph List to find if a glyph name has other names (based on the
Unicode code point).
For example, "/ocyrillic" is code point 0x43e, which also commonly maps to the
name "/afii10080".
Previously, we used "forall" to iterate through the CharStrings dictionary,
but that causes two problems. Firstly, and most importantly, when we write new
entries to that dictionary, if the dictionary has to be extended, it ends up
messing with the "forall" indexing. Secondly, it means we do more work than
necessary because we potentially seek out equivalents for names we've just added.
To improve this, populate an array with the original names from the CharStrings
dictionary, and iterate through that - thus the changing contents of the
dictionary doesn't matter.
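The snapshot approach can be sketched in Python, whose dict iteration has the same hazard as PostScript's "forall" here. The alias table below is a hypothetical stand-in for the real Adobe Glyph List lookup:

```python
# Mutating a dictionary while iterating it is unsafe, so iterate over
# a snapshot of the original glyph names instead; entries added during
# the loop then cannot disturb the iteration.
ALIASES = {"ocyrillic": "afii10080"}  # illustrative alias table

def add_alias_names(charstrings):
    original_names = list(charstrings)  # snapshot before we mutate
    for name in original_names:
        alias = ALIASES.get(name)
        if alias is not None and alias not in charstrings:
            # Adding to charstrings here is now safe.
            charstrings[alias] = charstrings[name]
    return charstrings
```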
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug 706544 "Unknown .defaultpapersize: (Letter). error shown at startup"
The problem appears to be that the system libpaper is configured to
have a default paper size of 'Letter' rather than the all lower case
'letter', so Ghostscript doesn't have a matching media size.
libpaper says that lower case is 'preferred' but there's apparently no
reason why we couldn't also have (for example) A4 instead of a4.
We can tackle this by making the system defined paper size lower case
before returning it to Ghostscript. However, a few of the paper sizes
in statusdict do have upper case characters in their name, so we need
to duplicate those few sizes as lower case. The contents of statusdict,
and in particular the defined media sizes, are not specified so we're
OK to do this.
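The matching step amounts to lower-casing the system-supplied name before lookup; a minimal sketch, with an illustrative subset of media names:

```python
# Lower-case the libpaper-supplied name so 'Letter' or 'A4' match the
# lower-case media names Ghostscript defines. Sizes are in points and
# the table here is only an illustrative subset.
KNOWN_SIZES = {"letter": (612, 792), "a4": (595, 842)}

def default_media(libpaper_name):
    return KNOWN_SIZES.get(libpaper_name.lower())
```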
|
| |
|
|
|
|
|
|
| |
Commit 3635f4c75e54e337a4eebcf6db3eef0e60f9cebf removed a bunch of
filters from PostScript, but we still need the MD5Encode filter for
the old PDF interpreter with some kinds of PDF encryption.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Following on from bug #706494, there are a whole bunch of non-standard
filters, some of these are required for the old PDF interpreter
written in PostScript, some appear to have been included just for
symmetry with the Decode filters and some are genuinely used by our
own support PostScript.
This code undefines all the filters we can from filterdict, thus
preventing any of those from being used maliciously. We do have to
retain /ImscaleDecode, /eexecDecode, /PFBDecode and /TBCPDecode as
these are used by the PostScript support files.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As requested by customer 1120. The PSPageOptions array is used to
specify strings to be inserted into the PostScript output of the
ps2write device; the intention is to permit device-specific code to be
placed on each page.
Ordinarily the array contains one string for each page of the output,
and that string is applied individually.
To assist in the common cases where a pattern of different setup is
applied to pages, such as treating each of a pair of pages differently,
for every pair of pages, the processing 'wraps around' the array if
there are more pages than strings. So if we supply two strings the first
string will be applied to pages 1, 3, 5, ... and the second string will
be applied to pages 2, 4, 6, ....
The customer wants to disable the 'wrap around' so that if they supply
fewer strings than pages, the device simply stops adding strings when
it runs out of content in the array.
There is no practical way to do this by altering the array content
because it is actually quite awkward to deal with heterogeneous
parameter arrays. Rather than rewrite the code extensively I've chosen
(reluctantly) to add a new parameter 'PSPageOptionsWrap'. When true
(the default value) then the behaviour is unchanged. When false the
ps2write device no longer wraps around the array, but simply stops
adding content to each page.
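The two behaviours reduce to a small indexing rule; a sketch with illustrative names (the real control lives in the ps2write device parameters):

```python
def page_option(options, page_index, wrap=True):
    """Model of PSPageOptions selection for a 0-based page index.
    wrap=True mirrors the historic wrap-around behaviour;
    wrap=False mirrors PSPageOptionsWrap=false and simply stops
    supplying strings once the array is exhausted."""
    if not options:
        return None
    if wrap:
        return options[page_index % len(options)]
    return options[page_index] if page_index < len(options) else None
```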
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug #706483 "PPI_CUtils Idiom detection causing incorrect output"
The Idiom Recognition substitutes a function for the basic PPI 'Force
Gray' function which additionally turns off JPEG image passthrough while
this feature is in force. This was for bug #702964.
Unfortunately the substituted function was leaving the return values
from .putdeviceparamsonly on the stack, leading to incorrect output.
Just pop the returned values, we don't want them.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug #706441 "AcroForm field Btn with no AP not implemented"
The title is misleading. The actual problem is that when checking to
determine the 'visibility' of an annotation, the NoView bit was being
checked with the wrong bit set in the mask.
This led to the annotation not being visible and not rendering.
From there; the old PDF interpreter used the presence of an OutputFile
on the command line to determine whether or not the output should be
treated as 'printer' or 'viewer'. The display device doesn't take an
OutputFile so we treat that as a viewer. We weren't taking that action
at all internally.
So pass OutputFile in from the PostScript world if it is present, and
look for it on the command line if we are stand-alone. Start by assuming
we are a viewer. If we find an OutputFile, and have not encountered a
'Printed' switch, then assume we are a printer.
Secondly; deal with the warnings. The warnings are real, but this is the
wrong place for them. The problem is that we have an annotation which has an
/AP dictionary:
<<
/D <</Off 723 0 R/renew 724 0 R>>
/N <</renew 722 0 R>>
>>
We pick up the Normal (/N) key/value and see that the value is a
dictionary. So we consult the annotation for an /AS (appearance state)
which in this case is defined as:
/AS/Off
So we then try to find the /Off state in the sub-dictionary. There isn't
one. The specification has nothing to say about what we should do here.
I've chosen to replace the appearance with a null object and alter the
drawing routine to simply silently ignore this case.
Final note; the code is now behaving as it is expected to, but the file
in bug #706441 will still be missing a number of buttons when rendered,
because these buttons are only drawn when the application is a viewer.
In order to have them render Ghostscript must be invoked with:
-dPrinted=false
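The printer/viewer decision described above can be summarised in a few lines; a sketch under the stated behaviour, with hypothetical parameter names:

```python
def is_printer(output_file, printed_switch):
    """An explicit Printed switch (True/False) wins; otherwise the
    presence of an OutputFile implies a printer, and its absence
    (e.g. the display device) implies a viewer."""
    if printed_switch is not None:
        return printed_switch
    return output_file is not None
```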
|
| |
|
|
|
|
| |
Stems from Bug 706267
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main code issue with this bug was that the banner printed on startup is
printed from PostScript, and PostScript's cvs operator doesn't allow control
over the number of digits it outputs, so the number 00 will always end up as
the string "0", and 01 as "1". So our 10.01.0 version would be printed as
"10.1.0".
To address this, add a ".revisionstring" entry to systemdict, created during
startup; the string is created in C, so we control the format.
The remaining issues need to be addressed as part of the release process.
|
|
|
|
|
|
|
| |
We've been using the new 'C' PDF interpreter as the default for the
last two releases, with comparatively few problems. We've decided the
time has come to remove the (deprecated) fallback to the old PostScript
written PDF interpreter.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
No bug report for this one, the patch was submitted by Justin after a
conversation on Discord (#ghostscript, 11th February 2023).
The goal was to extend the 'NoOutputFonts' control, which turns all
text in a PDF file into linework, such that the decision could be
made based on the font name.
The supplied patch does so by adding two new controls, AlwaysOutline
and NeverOutline, which can only be set from PostScript (because
the font names are in an array). Text using fonts appearing in the
AlwaysOutline array will be converted to linework regardless of the
setting of NoOutputFonts. If NoOutputFonts is set to true, then text
using fonts appearing in the NeverOutline array will still use the
font, and will not be converted to linework.
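The interaction of the three controls is a simple precedence rule; a sketch with illustrative names (the real arrays are PostScript arrays of font names):

```python
def use_outline(font_name, no_output_fonts, always_outline, never_outline):
    """Fonts in AlwaysOutline are converted to linework regardless of
    NoOutputFonts; fonts in NeverOutline are kept as fonts even when
    NoOutputFonts is true; otherwise NoOutputFonts decides."""
    if font_name in always_outline:
        return True
    if font_name in never_outline:
        return False
    return no_output_fonts
```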
|
|
|
|
|
|
| |
This got missed in commit bb3319b1a396288c8fc75c7ab198e6e7fd3c734e, we
need to add it to pdf_main.ps for passing from PostScript to the PDF
interpreter.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug #706059 "PDF interpreter does not preserve UserUnit with pdfwrite"
The bug report is in two parts; the first is that pdfwrite does not
automagically insert UserUnit and alter the scaling of files when the
media size exceeds the Acrobat architectural limitation of 14,400. We
do not regard this as a Ghostscript bug, since the PDF format does not
limit the MediaBox; this is a limitation of Acrobat.
The second part correctly noted that the new PDF interpreter was not
preserving any UserUnit from the input file when the output device was
pdfwrite (the only device which supports preserving UserUnit).
This commit preserves any extant UserUnit (and does not scale either the
media or the content) when the output device supports UserUnit and the
-dNoUserUnit switch isn't set. This now works for the case when the PDF
interpreter is built into Ghostscript and when it is a stand-alone
binary.
A couple of files in the test suite show very slight (pixel level)
changes now, because we preserve the UserUnit and so the scaling is not
quite the same.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug #706036 "-dPreserveAnnots=false disregards /ShowAnnotTypes []"
There were a couple of problems here. Firstly the switches were
mis-spelled as SHOWANNOTTYPES and PRESERVEANNOTTYPES in both pdf_main.ps
(passing the values to the PDF interpreter) and in the PDF interpreter
itself. This meant that neither switch was being processed.
In addition the code to read the values was incorrectly assuming that
the value was a 'list' of strings. This is not the case for either
control, and indeed simply isn't possible in PostScript. In both cases
the value should be an array of names.
The code now accepts an array of either names or strings for both keys
from PostScript.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug #705992 "Converting PDF to TIFF with ghostscript causes offset output when using -dFitPage and -sPAPERSIZE options"
The files supplied with the report all have a MediaBox (and other Boxes)
where the origin is not only not 0,0 but is in fact negative. A
possibility I had overlooked previously.
This commit fixes that.
This does lead to some changes in behaviour when compared to the old
PDF interpreter.
The output from the old interpreter was inconsistent with 'landscape'
pages, as evidenced in the bug report. In fact these pages are not
landscape at all, in the sense that the media and content is all defined
in portrait. The reason they are displayed landscape is because the page
contains a /Rotate key with a value of 270.
When the output was to a rendering device, such as TIFF, the landscape
files were sized to fit on Legal media (to satisfy -sPAPERSIZE=legal and
-dFitPage) BUT the output was then rotated, resulting in a landscape
bitmap.
When the output was a high level device, such as pdfwrite, then the
landscape files were written as portrait files, sized for Legal media
and rotated so that the content is 'upside down'.
This also happened with the rendering devices when using
-dDEVICEWIDTHPOINTS and -dDEVICEHEIGHTPOINTS to set the media size to
Legal.
When using -dDEVICEHEIGHTPOINTS and -dDEVICEWIDTHPOINTS to set true
landscape media, and fitting the page to that, the rendering devices
coped correctly, but the pdfwrite output has a badly sized CropBox
leading to the files displaying totally incorrectly in Acrobat.
The new code is consistent across all devices, rotations, orientations
of both actual media and requested media.
The requested media in the bug report is Legal, and Legal media is
portrait, not landscape, so it seems to me that producing a landscape
bitmap is wrong. We now produce portrait legal output. The pdfwrite
device also produces portrait media output (as before) but the content
is the correct way up, not rotated.
To achieve landscape output, you need to specify landscape media (which
can't be done with -sPAPERSIZE as only 11x17 has a defined landscape
version), by using -dDEVICEWIDTHPOINTS and -dDEVICEHEIGHTPOINTS and
setting -dFIXEDMEDIA to prevent the media size being changed to match
the requested input.
|
|
|
|
|
|
|
|
|
|
|
|
| |
To help differentiate between a substituted CIDFont and an embedded one, a
change was made to store the file path in the CIDFont dictionary. That change
failed to account for the possibility that the file object and the CIDFont
dictionary may not be in compatible VM modes.
This adds code to ensure that the string holding the path is in a suitable VM
mode to be stored into the dictionary.
Reported by Richard Lescak <rlescak@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to preserve a consistent page count when using multiple input PDFs,
the PS code defines and maintains a CumulativePageCount value.
This is initially set up in newpdf_runpdfbegin.
But to gain finer grained control over the interpreter, it's necessary to skip
using newpdf_runpdfbegin, meaning newpdf_gather_parameters will throw an
undefined error on CumulativePageCount.
So tweak newpdf_gather_parameters to spot when it isn't defined, and set it to
one.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug #705900 "Ghostscript - parameter PDFSTOPONWARNING doesn't stop
on all warnings and errors"
The first case here is actually in the PostScript interpreter and so
does not strictly affect GhostPDF. When executing 'run' on a PDF file
the PostScript code checks to see if the header was at the start of the
file and emits a warning if not. We now check PDFSTOPONWARNING after
that and exit with an error if it is true.
The xref checking (very early part of the new interpreter, predates the
arguments) was repairing files without first checking if it should
exit (if PDFSTOPONERROR is true).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The way that the graphics library 'caches' ICC profiles when we are
using a clist is somewhat arcane. We retain a profile and link cache in
the device's 'clist' portion of the structure. As we find profiles we
add them to the profile cache and as we create link profiles we add
them to the link cache (profiles are written to and read from the clist).
For subsequent pages we (whether by design or accident) rely on the
device's clist caches remaining unchanged to give us a performance
benefit.
However with pdfi we were grestore'ing back to a point before we had
set 'PageUsesTransparency' in the device, then going back around and
setting it back up again.
Because we use that to control the banding, the fact that it had
changed caused us to throw away the clist, and then recreate it, which
threw away the ICC caches, which meant we had to recreate the link
profiles, from the profiles stored in the clist, on every page.
For /tests_private/pdf/PDFIA1.7_SUBSET/CATX3146.pdf and possibly for
other files this was a very significant portion of the total time
taken for the entire job (it is otherwise a comparatively simple file).
By removing the gsave and grestore pair we can avoid restore'ing back to
a time when the device's PageUsesTransparency flag was false, which
avoids us discarding and recreating the clist, which means we keep the
caches and therefore the file runs faster. This may affect other files
at certain resolutions/configurations.
command line:
-q -dQUIET -dNONATIVEFONTMAP -dSAFER -dBATCH -r72 -dJOBSERVER -sDEVICE=bit -o /dev/null /tests_private/pdf/PDFIA1.7_SUBSET/CATX3146.pdf
|
|
|
|
|
|
|
|
|
|
| |
Bug #705861 "Regression: image not centered when forcing A4 output"
The old PDF interpreter apparently centres the content on the scaled
page when using PDFFitPage, whereas we were maintaining the origin at
0,0.
Add code to duplicate this.
|
|
|
|
|
|
|
|
|
|
|
| |
Bug #705849 "New PDF interpreter errors out with PDFFitPage"
If the input PDF file had square media then we would correctly decide
we did not need to rotate the media for a better fit, but we left the
values we would normally use to calculate if rotation was required on
the operand stack, resulting in an error.
Pop the unused values.
|
|
|
|
| |
Accidentally left two debugging lines in the previous commit.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug #705834 "stack overflow in psi/idict.c:160 dict_alloc (exploitable)"
This is caused by subsequent calls to .PDFInfo causing the Info
dictionary to end up with circular references as we replace indirect
references with PDF objects.
I'd been meaning for some time to revisit the PostScript code and avoid
calling .PDFInfo multiple times just for performance reasons (we have to
convert the PDF dictionary to a PostScript dictionary every time).
This commit uses the stored PostScript dictionary 'PDFInfo' instead of
calling .PDFInfo which avoids the circular reference and is slightly
more efficient.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Arising from Bug #705784, if we hadn't built the PDF interpreter we
would get typecheck errors which were somewhat misleading as to the
source of the problem.
This commit tidies up the error handling in the area of .PDFInit so
that we not only detect the problem there but give a warning that it
occurred.
In addition, add a means to detect if the PDF interpreter is built in
before we start trying to process a PDF file and, if it is not, give
a sensible error message.
Tested with BUILD_PDF 0 and with NEWPDF true and false.
|
|
|
|
| |
Bug 705767 "Dereference of free object 41, next object number as offset failed"
|
|
|
|
| |
For the git prerelease code.
|
| |
|
|
|
|
|
| |
The page device dictionary doesn't necessarily contain an /Overprint key,
so check it exists before we try to use it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Supports out-of-order ranges (if the parser allows it and disables the
PageHandler, i.e., flp device). Also adds support for ranges appended
to the "even" and "odd" keyword following a ":".
As before a trailing "-" in a range implies the last page and, as was
supported by the previous 'gxps' code, a leading "-" also means the last page.
For example, with XPS or PDF: -sPageList=odd:3-7,even:4-8,1-,-1,9
prints pages: 3, 5, 7, 4, 6, 8, 1, 2, ..., last, last, last-1, ..., 1, 9
The PageList string is parsed using C code into an array that consists of
an initial int that is > 0 if the list is ordered, followed by sets of
3 integers per range, even/odd flag (odd=1, even=2), start, end. The
final 3 ints are 0,0,0 as a marker.
The initial int is used by 'pagelist_test_printed' as an index to the next
range to be processed when the PageList is used for languages that can only
be processed sequentially (e.g. PS and PCL) and is updated when the page
passes the end of the current range. A value of -1 means the ranges are
not ordered (not strictly increasing).
The flp_fillpage is changed to ignore errors from processing the PageList
performed by ParsePageList (called from SkipPage) when PageCount is 0
so that parsers that support out of order processing (PDF and XPS) can
continue until later. This should have little or no performance impact
since it is limited to PageCount == 0.
Note that the new PDF parser also uses the C code parser and then uses
the array of ranges returned by ".PDFparsePageList". The old PostScript
based parser has not been updated, although it is easy to do so.
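The array format described above can be modelled in a short Python sketch; the parsing details and the `last` placeholder are illustrative, not the real C implementation:

```python
def parse_pagelist(spec, last=999):
    """Each comma-separated range becomes a triple (even_odd, start, end)
    with odd=1, even=2, neither=0, terminated by (0, 0, 0). The leading
    int is 1 when the ranges are strictly increasing (consumable
    sequentially) and -1 otherwise. 'last' stands in for the real page
    count, which the genuine code resolves against the document."""
    triples = []
    for part in spec.split(","):
        eo = 0
        if part == "odd" or part.startswith("odd:"):
            eo, part = 1, part[4:] or "1-"
        elif part == "even" or part.startswith("even:"):
            eo, part = 2, part[5:] or "1-"
        if "-" in part:
            s, _, e = part.partition("-")
            start = int(s) if s else last  # leading '-' means the last page
            end = int(e) if e else last    # trailing '-' runs to the last page
        else:
            start = end = int(part)
        triples.append((eo, start, end))
    starts = [t[1] for t in triples]
    ordered = 1 if all(a < b for a, b in zip(starts, starts[1:])) else -1
    return ordered, triples + [(0, 0, 0)]
```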
|
|
|
|
|
|
|
|
|
|
|
| |
The Postscript interpreter parses the Fontmap file during initialisation, so
before file access permission enforcement has been activated, so a user
defined fontmap file need not be added to the permit-file-read list.
But pdfi parses the fontmap on demand, thus after file access permissions are
active. So when calling pdfi from Postscript, the Postscript interpreter
needs to add the file and path to the permit-file-read list so pdfi can
access it.
|
|
|
|
|
|
|
|
|
|
|
|
| |
When doing the special processing for EPS files, we store the string from
the %%BoundingBox comment in systemdict so it can be accessed later for
doing fit page functionality.
Problem is, the special EPS handling is done in between save/restore operations,
meaning the string is restored away while a reference to it remains in
systemdict, potentially causing problems.
Moving the string to global VM resolves the problem.
|
|
|
|
|
|
|
|
|
|
|
| |
This was missing from pdfi, so is added here.
Also, change the functionality so setting the SUBSTFONT key to "/None" will
result in any attempt to fall back to the default font to throw an
invalidfont error.
Lastly, fix some typos on the parameters being passed from Postscript into
pdfi.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug #705218 "table of contents/bookmarks wrong after merging PDF files"
Part 2 of 2 for this bug.
The pdfwrite device is capable of processing pdfmarks to produce a
variety of effects in the output. In this case we can add Link
annotations and Outlines to a PDF file. When the PDF interpreter
supports reading these from an input file we also write them to the
output file using (effectively) pdfmark operations.
The problem is that when we have multiple input files and the Dest of
a Link annotation or Outline is a page, that page refers to the page
number in the original file (e.g. 1). That won't be the correct page
number in the output file because it already contains the pages from
any previous files.
So what we need to do is offset the page number of the destination by
the number of pages already in the output. That's what this commit does;
for Ghostscript processing we store the number of pages in the device
and update at the end of each file by the number of pages in that file.
We send the number of pages to the (newly created) interpreter at the
start of every file in the dictionary argument we supply to .PDFInit.
For GhostPDF it is simpler because we handle all the files, we just
update the counter in the PDF context by the number of pages processed
in the file we have just completed.
We simply add the offset to the page number when creating the Dest
pdfmark.
This commit also extends the named destination processing to handle
name trees with more than a single node by recursively processing the
Kids array in each node. We also use the Limits to more quickly
determine if a node contains our target, rather than checking each
entry in the Names array for a match.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we ran the PDF interpreter from the PostScript interface
more or less totally in a stopped context, which meant that it never
exited with an error status, even when the file had an error.
We now use PDFSTOPONERROR to determine whether the PDF interpreter
should run in a stopped context or should actually return a PostScript
error, which makes the interpreter exit with a non-zero status.
This causes two files, output from pdfwrite, to throw errors where they
previously did not when tested on the cluster. This is actually correct
behaviour when compared with the action of the old PDF interpreter.
|
|
|
|
|
|
|
|
|
|
| |
The old code relied upon runpdf closing the file supplied to the /run
(redefined PS operator) function instead of closing the file itself.
For compatibility this now does the same with the new PDF interpreter
code. Note that unless PDFSTOPONERROR is true we will execute the PDF
file in a stopped context, so as to allow us to close the file even in
the event of an error.
|
|
|
|
|
|
|
|
|
|
| |
Sending PageSpotColors to the txtwrite device was causing it to throw
an error on setpagedevice as it did not process the key. While that
shouldn't really cause a problem (and doesn't with the following commit)
we don't really want to set the key for devices which aren't capable of
producing spots anyway.
This only affects the Ghostscript implementation of GhostPDF.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We added /Page# to the dictionary returned from pdfgetpage in order to
get pdf2dsc.ps to work (it relies upon that key being present) and
potentially other PostScript programs as well. The key was added to
the dictionary in pdfdopages.
However, after adding it, we then proceeded to assume it would always be
present, even if we called pdfpage and friends directly, rather than
using pdfdopages.
Fix that assumption here by adding the key to the dictionary in
pdfgetpage instead of pdfdopages. There's no way (using the old
PostScript code) to use any of the other functions without using
pdfgetpage first.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A customer requested that we make pdf_info.ps work with the new
PDF interpreter, and generate the same information.
This commit modifies the way we extract information on a
page-by-page basis to potentially include the names of spot inks
and information about fonts used on the page.
This is now returned to the PostScript environment using a PDF
dictionary instead of a C structure. The pdf_info.ps program has
been updated so that it uses the new information in broadly the
same way as the information from the old PDF interpreter.
There are differences; pdf_info.ps extracts font information
itself, rather than having the interpreter do it. This is not
possible with the new interpreter which is why we have the
PDF interpreter do it for us. In addition the pdf_info.ps
program only descended to the page level whereas the new PDF
interpreter evaluates all objects on the page, potentially
meaning that more fonts (and technically spot inks) might be
detected.
We now have an additional PostScript operator '.PDFPageInfoExt'
which returns 'extended' information about a page. This is the
same as .PDFPageInfo but includes the font and spot ink
information.
Running with -dPDFINFO using either Ghostscript or GhostPDF will
print more information than before, including the spot inks and
considerably more information about fonts than the pdf_info.ps
program emits, including embedding status, descendant fonts
(and their embedding status) and the presence of ToUnicode
CMaps.
Updated documentation for all of the above.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In Type 2 fonts, we were "collapsing" the mapping of the glyph name to GID to
charstring mapping into a single glyph name to charstring map. That is contrary
to the description in the PLRM 3rd Edition.
This becomes a problem when attempting to re-encode the font: when replacing
the glyph name to GID encoding in the CharStrings dictionary, where we expect
to find a charstring, we find an integer (GID).
This changes how we create Type 2 fonts from CFF so it maintains that two
step mapping, and re-encoding will now work.
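The two-step mapping retained by this change can be modelled simply; the data here is illustrative, not real CFF content:

```python
# CharStrings maps glyph name -> GID, and a separate GID -> charstring
# table completes the lookup, rather than collapsing both into a single
# name -> charstring dictionary.
charstrings = {"ocyrillic": 5}            # name -> GID
glyph_data = {5: b"<charstring bytes>"}   # GID -> charstring

def charstring_for(name):
    return glyph_data[charstrings[name]]
```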
|
|
|
|
|
|
|
| |
This makes the procedure style a bit more like the rest of the PS init code,
and makes it both a little more efficient and (I think) a little clearer.
Noticed in passing when working on oss-fuzz 46672.
|
|
|
|
|
|
|
|
| |
As per commit 958c044dbbf140b893874c6d634ac71400ea5a12 but this time for
the definition of knownoget.
In fact, given the way pdf2dsc uses this, it doesn't actually cause a
problem, but leaving junk on the stack is bad practice, so let's fix it.
|
|
|
|
|
|
|
|
|
|
|
| |
The utility program pdf2dsc.ps expects that both pget and knownoget will
be available in the current dictionary after runpdfbegin is executed.
To facilitate this pdfi makes such a definition but, unfortunately, it
had a bug if the key was not found in the dictionary, leaving a copy
of the dictionary on the operand stack when it should not.
Fixed here. Thanks to William Bader for spotting the bug.
|
|
|
|
|
|
|
|
|
| |
When rewriting FitPage for the new interpreter I seem to have left part
of the debugging code in place, which led to two incorrect values on
the stack when we calculated the scale factor. In general these values
were zero, leading to a scale factor of 0 and nothing being drawn.
Fixed here by removing the extraneous values.
|
|
|
|
|
|
|
|
|
|
| |
Thanks to Ken Sharp for providing the solution for this.
If we are doing pattern simulation with the pdf14 device
then the patterns should also do this. Otherwise we can
get seg faults. For example:
-sDEVICE=bitrgbtags -dNEWPDF=false -dOverprint=/simulate
-r72 -o output.ppm -f tests_private/comparefiles/Bug693541.pdf
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The file supplied for this came from a customer who asked that we
delete the file after working on it, so there is no bug report and no
reproducer for this.
The problem is that the PDF file has a font using a CMap, and that CMap
uses /UseCMap (and the usecmap operator) to read a different (Horizontal)
CMap and then modify it with Vertical glyph positions.
The CMap does not have a begincodespacerange, it simply inherits the
ranges from the child CMap. This causes the code added to work around
Bug 690737 to read off the end of the CMap in read_CMap_stream.
Since there is nothing left to process, this causes errors in the CMap
processing.
Following a suggestion from Chris this commit first attempts the same
'discard up to the begincodespacerange' hackery, then checks to see if
the stream has any bytes left. If it does we proceed as before.
If there are no bytes left, then we have discarded all of the content.
So we rewind the stream to the point we were at before we tried to
discard the header, impose a different SubFileDecode, looking this time
for a 'begincmap' and then attempt to process the CMap as normal.
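The try-then-rewind recovery can be sketched roughly as below; the real code layers SubFileDecode filters on the stream rather than searching a byte string, so this is only a model of the control flow:

```python
import io

def position_cmap_stream(stream):
    """First try to discard up to 'begincodespacerange'; if that token
    is absent (so discarding would consume the whole stream), fall back
    to discarding up to 'begincmap' instead."""
    start = stream.tell()
    data = stream.read()
    idx = data.find(b"begincodespacerange")
    if idx == -1:
        idx = data.find(b"begincmap")
        if idx == -1:
            raise ValueError("no CMap content found")
    stream.seek(start + idx)
    return stream
```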
|