| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Noticed while working on a different issue; we were not cleaning up the
dictionary, nor restoring the PostScript state, after running a
Portfolio (PDF Collection) file.
Fixed by calling runpdfend after running all of the embedded PDF files.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
No file or bug report for this, the customer requested the files be
kept private. However any PDF Collection (Portfolio) file will show
the problem.
GhostPDF supports preserving embedded files from the input, but when
we are processing a PDF Collection we don't want to do that, because
in this case we run each of the embedded files individually. If we
copy the EmbeddedFiles as well then we end up duplicating them in the
output.
So, when processing EmbeddedFiles, check the Catalog to see if there is
a /Collection key; if there is, stop processing EmbeddedFiles.
The customer also pointed out there was no way to avoid embedding any
EmbeddedFiles from the input, so additionally add a new switch
-dPreserveEmbeddedFiles to control this. While we're doing that, add
one to control the preservation of 'DOCVIEW' (PageMode, PageLayout,
OpenAction) as well, -dPreserveDocView.
This then leads on to preventing the EmbeddedFiles in a PDF Collection
from writing their DocView information. If we let them do that then
we end up opening the file incorrectly.
To facilitate similar changes in the future I've rejigged the way
.PDFInit works, so that it calls a helper function to read any
interpreter parameters and applies them to the PDF context. I've also
added a new PostScript operator '.PDFSetParams' which takes a PDF
context and a dictionary of key/value pairs which it applies to the
context.
Sadly I can't actually use that for the docview control, because the
PDF initialisation is what processes the document, so changing it
afterwards is no help. So I've altered runpdfbegin to call a new
function runpdfbegin_with_params and pass an empty dictionary. That then
allows me to call runpdfbegin_with_params from the PDF Collection
processing, and turn off PreserveDocView.
So in summary: new controls PreserveDocView and PreserveEmbeddedFiles
and a new function .PDFSetParams to allow us to alter the PDF
interpreter parameters after .PDFInit is executed. PDF Collections no
longer embed duplicate files.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we load a Type 1 font from a font file (i.e. not an embedded font), we use
the Adobe Glyph List to find if a glyph name has other names (based on the
Unicode code point).
For example, "/ocyrillic" is code point 0x43e, which also commonly maps to the
name "/afii10080".
Previously, we used "forall" to iterate through the CharStrings dictionary,
but that causes two problems. Firstly, and most importantly, when we write new
entries to that dictionary, if the dictionary has to be extended, it ends up
messing with the "forall" indexing. Secondly, it means we do more work than
necessary because we potentially seek out equivalents for names we've just added.
To improve this, populate an array with the original names from the CharStrings
dictionary, and iterate through that - thus the changing contents of the
dictionary doesn't matter.
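The snapshot approach can be sketched in Python, whose dict iteration has the same hazard as PostScript's "forall" here. The alias table below is a hypothetical stand-in for the real Adobe Glyph List lookup:

```python
# Mutating a dictionary while iterating it is unsafe, so iterate over
# a snapshot of the original glyph names instead; entries added during
# the loop then cannot disturb the iteration.
ALIASES = {"ocyrillic": "afii10080"}  # illustrative alias table

def add_alias_names(charstrings):
    original_names = list(charstrings)  # snapshot before we mutate
    for name in original_names:
        alias = ALIASES.get(name)
        if alias is not None and alias not in charstrings:
            # Adding to charstrings here is now safe.
            charstrings[alias] = charstrings[name]
    return charstrings
```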
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug 706544 "Unknown .defaultpapersize: (Letter). error shown at startup"
The problem appears to be that the system libpaper is configured to
have a default paper size of 'Letter' rather than the all lower case
'letter', so Ghostscript doesn't have a matching media size.
libpaper says that lower case is 'preferred' but there's apparently no
reason why we couldn't also have (for example) A4 instead of a4.
We can tackle this by making the system defined paper size lower case
before returning it to Ghostscript. However, a few of the paper sizes
in statusdict do have upper case characters in their name, so we need
to duplicate those few sizes as lower case. The contents of statusdict,
and in particular the defined media sizes, are not specified so we're
OK to do this.
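The matching step amounts to lower-casing the system-supplied name before lookup; a minimal sketch, with an illustrative subset of media names:

```python
# Lower-case the libpaper-supplied name so 'Letter' or 'A4' match the
# lower-case media names Ghostscript defines. Sizes are in points and
# the table here is only an illustrative subset.
KNOWN_SIZES = {"letter": (612, 792), "a4": (595, 842)}

def default_media(libpaper_name):
    return KNOWN_SIZES.get(libpaper_name.lower())
```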
|
| |
|
|
|
|
|
|
| |
Commit 3635f4c75e54e337a4eebcf6db3eef0e60f9cebf removed a bunch of
filters from PostScript, but we still need the MD5Encode filter for
the old PDF interpreter with some kinds of PDF encryption.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Following on from bug #706494, there are a whole bunch of non-standard
filters, some of these are required for the old PDF interpreter
written in PostScript, some appear to have been included just for
symmetry with the Decode filters and some are genuinely used by our
own support PostScript.
This code undefines all the filters we can from filterdict, thus
preventing any of those from being used maliciously. We do have to
retain /ImscaleDecode, /eexecDecode, /PFBDecode and /TBCPDecode as
these are used by the PostScript support files.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As requested by customer 1120. The PSPageOptions array is used to
specify strings to be inserted into the PostScript output of the
ps2write device; the intention is to permit device-specific code to be
placed on each page.
Ordinarily the array contains one string for each page of the output,
and that string is applied individually.
To assist in the common cases where a pattern of different setup is
applied to pages, such as treating each of a pair of pages differently,
for every pair of pages, the processing 'wraps around' the array if
there are more pages than strings. So if we supply two strings the first
string will be applied to pages 1, 3, 5, ... and the second string will
be applied to pages 2, 4, 6, ....
The customer wants to disable the 'wrap around' so that if they supply
fewer strings than pages, the device simply stops adding strings when
it runs out of content in the array.
There is no practical way to do this by altering the array content
because it is actually quite awkward to deal with heterogeneous
parameter arrays. Rather than rewrite the code extensively I've chosen
(reluctantly) to add a new parameter 'PSPageOptionsWrap'. When true
(the default value) then the behaviour is unchanged. When false the
ps2write device no longer wraps around the array, but simply stops
adding content to each page.
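The two behaviours reduce to a small indexing rule; a sketch with illustrative names (the real control lives in the ps2write device parameters):

```python
def page_option(options, page_index, wrap=True):
    """Model of PSPageOptions selection for a 0-based page index.
    wrap=True mirrors the historic wrap-around behaviour;
    wrap=False mirrors PSPageOptionsWrap=false and simply stops
    supplying strings once the array is exhausted."""
    if not options:
        return None
    if wrap:
        return options[page_index % len(options)]
    return options[page_index] if page_index < len(options) else None
```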
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug #706483 "PPI_CUtils Idiom detection causing incorrect output"
The Idiom Recognition substitutes a function for the basic PPI 'Force
Gray' function which additionally turns off JPEG image passthrough while
this feature is in force. This was for bug #702964.
Unfortunately the substituted function was leaving the return values
from .putdeviceparamsonly on the stack, leading to incorrect output.
Just pop the returned values, we don't want them.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug #706441 "AcroForm field Btn with no AP not implemented"
The title is misleading. The actual problem is that when checking to
determine the 'visibility' of an annotation, the NoView bit was being
checked with the wrong bit set in the mask.
This led to the annotation not being visible and not rendering.
From there; the old PDF interpreter used the presence of an OutputFile
on the command line to determine whether or not the output should be
treated as 'printer' or 'viewer'. The display device doesn't take an
OutputFile so we treat that as a viewer. We weren't taking that action
at all internally.
So pass OutputFile in from the PostScript world if it is present, and
look for it on the command line if we are stand-alone. Start by assuming
we are a viewer. If we find an OutputFile, and have not encountered a
'Printed' switch, then assume we are a printer.
Secondly; deal with the warnings. The warnings are real, but this is the
wrong place for them. The problem is that we have an annotation which has an
/AP dictionary:
<<
/D <</Off 723 0 R/renew 724 0 R>>
/N <</renew 722 0 R>>
>>
We pick up the Normal (/N) key/value and see that the value is a
dictionary. So we consult the annotation for an /AS (appearance state)
which in this case is defined as:
/AS/Off
So we then try to find the /Off state in the sub-dictionary. There isn't
one. The specification has nothing to say about what we should do here.
I've chosen to replace the appearance with a null object and alter the
drawing routine to simply silently ignore this case.
Final note; the code is now behaving as it is expected to, but the file
in bug #706441 will still be missing a number of buttons when rendered,
because these buttons are only drawn when the application is a viewer.
In order to have them render Ghostscript must be invoked with:
-dPrinted=false
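The printer/viewer decision described above can be summarised in a few lines; a sketch under the stated behaviour, with hypothetical parameter names:

```python
def is_printer(output_file, printed_switch):
    """An explicit Printed switch (True/False) wins; otherwise the
    presence of an OutputFile implies a printer, and its absence
    (e.g. the display device) implies a viewer."""
    if printed_switch is not None:
        return printed_switch
    return output_file is not None
```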
|
| |
|
|
|
|
| |
Stems from Bug 706267
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main code issue with this bug was that the banner printed on startup is
printed from PostScript, and PostScript's cvs operator doesn't allow control
over the number of digits it outputs, so the number 00 will always end up as
the string "0", and 01 as "1". So our 10.01.0 version would be printed as
"10.1.0".
To address this, add a ".revisionstring" entry to systemdict, created during
startup; the string is created in C, so we control the format.
The remaining issues need to be addressed as part of the release process.
|
|
|
|
|
|
|
| |
We've been using the new 'C' PDF interpreter as the default for the
last two releases, with comparatively few problems. We've decided the
time has come to remove the (deprecated) fallback to the old PostScript
written PDF interpreter.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
No bug report for this one, the patch was submitted by Justin after a
conversation on Discord (#ghostscript, 11th February 2023).
The goal was to extend the 'NoOutputFonts' control, which turns all
text in a PDF file into linework, such that the decision could be
made based on the font name.
The supplied patch does so by adding two new controls, AlwaysOutline
and NeverOutline, which can only be set from PostScript (because
the font names are in an array). Text using fonts appearing in the
AlwaysOutline array will be converted to linework regardless of the
setting of NoOutputFonts. If NoOutputFonts is set to true, then text
using fonts appearing in the NeverOutline array will still use the
font, and will not be converted to linework.
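The interaction of the three controls is a simple precedence rule; a sketch with illustrative names (the real arrays are PostScript arrays of font names):

```python
def use_outline(font_name, no_output_fonts, always_outline, never_outline):
    """Fonts in AlwaysOutline are converted to linework regardless of
    NoOutputFonts; fonts in NeverOutline are kept as fonts even when
    NoOutputFonts is true; otherwise NoOutputFonts decides."""
    if font_name in always_outline:
        return True
    if font_name in never_outline:
        return False
    return no_output_fonts
```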
|
|
|
|
|
|
| |
This got missed in commit bb3319b1a396288c8fc75c7ab198e6e7fd3c734e, we
need to add it to pdf_main.ps for passing from PostScript to the PDF
interpreter.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug #706059 "PDF interpreter does not preserve UserUnit with pdfwrite"
The bug report is in two parts; the first is that pdfwrite does not
automagically insert UserUnit and alter the scaling of files when the
media size exceeds the Acrobat architectural limitation of 14,400. We
do not regard this as a Ghostscript bug, since the PDF format does not
limit the MediaBox; this is a limitation of Acrobat.
The second part correctly noted that the new PDF interpreter was not
preserving any UserUnit from the input file when the output device was
pdfwrite (the only device which supports preserving UserUnit).
This commit preserves any extant UserUnit (and does not scale either the
media or the content) when the output device supports UserUnit and the
-dNoUserUnit switch isn't set. This now works for the case when the PDF
interpreter is built into Ghostscript and when it is a stand-alone
binary.
A couple of files in the test suite show very slight (pixel level)
changes now, because we preserve the UserUnit and so the scaling is not
quite the same.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug #706036 "-dPreserveAnnots=false disregards /ShowAnnotTypes []"
There were a couple of problems here. Firstly the switches were
mis-spelled as SHOWANNOTTYPES and PRESERVEANNOTTYPES in both pdf_main.ps
(passing the values to the PDF interpreter) and in the PDF interpreter
itself. This meant that neither switch was being processed.
In addition the code to read the values was incorrectly assuming that
the value was a 'list' of strings. This is not the case for either
control, and indeed simply isn't possible in PostScript. In both cases
the value should be an array of names.
The code now accepts an array of either names or strings for both keys
from PostScript.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug #705992 "Converting PDF to TIFF with ghostscript causes offset output when using -dFitPage and -sPAPERSIZE options"
The files supplied with the report all have a MediaBox (and other Boxes)
where the origin is not only not 0,0 but is in fact negative. A
possibility I had overlooked previously.
This commit fixes that.
This does lead to some changes in behaviour when compared to the old
PDF interpreter.
The output from the old interpreter was inconsistent with 'landscape'
pages, as evidenced in the bug report. In fact these pages are not
landscape at all, in the sense that the media and content is all defined
in portrait. The reason they are displayed landscape is because the page
contains a /Rotate key with a value of 270.
When the output was to a rendering device, such as TIFF, the landscape
files were sized to fit on Legal media (to satisfy -sPAPERSIZE=legal and
-dFitPage) BUT the output was then rotated, resulting in a landscape
bitmap.
When the output was a high level device, such as pdfwrite, then the
landscape files were written as portrait files, sized for Legal media
and rotated so that the content is 'upside down'.
This also happened with the rendering devices when using
-dDEVICEWIDTHPOINTS and -dDEVICEHEIGHTPOINTS to set the media size to
Legal.
When using -dDEVICEHEIGHTPOINTS and -dDEVICEWIDTHPOINTS to set true
landscape media, and fitting the page to that, the rendering devices
coped correctly, but the pdfwrite output has a badly sized CropBox
leading to the files displaying totally incorrectly in Acrobat.
The new code is consistent across all devices, rotations, orientations
of both actual media and requested media.
The requested media in the bug report is Legal, and Legal media is
portrait, not landscape, so it seems to me that producing a landscape
bitmap is wrong. We now produce portrait legal output. The pdfwrite
device also produces portrait media output (as before) but the content
is the correct way up, not rotated.
To achieve landscape output, you need to specify landscape media (which
can't be done with -sPAPERSIZE as only 11x17 has a defined landscape
version), by using -dDEVICEWIDTHPOINTS and -dDEVICEHEIGHTPOINTS and
setting -dFIXEDMEDIA to prevent the media size being changed to match
the requested input.
|
|
|
|
|
|
|
|
|
|
|
|
| |
To help differentiate between a substituted CIDFont and an embedded one, a
change was made to store the file path in the CIDFont dictionary. That change
failed to account for the possibility that the file object and the CIDFont
dictionary may not be in compatible VM modes.
This adds code to ensure that the string holding the path is in a suitable VM
mode to be stored into the dictionary.
Reported by Richard Lescak <rlescak@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to preserve a consistent page count when using multiple input PDFs,
the PS code defines and maintains a CumulativePageCount value.
This is initially set up in newpdf_runpdfbegin.
But to gain finer grained control over the interpreter, it's necessary to skip
using newpdf_runpdfbegin, meaning newpdf_gather_parameters will throw an
undefined error on CumulativePageCount.
So tweak newpdf_gather_parameters to spot when it isn't defined, and set it to
one.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug #705900 "Ghostscript - parameter PDFSTOPONWARNING doesn't stop
on all warnings and errors"
The first case here is actually in the PostScript interpreter and so
does not strictly affect GhostPDF. When executing 'run' on a PDF file
the PostScript code checks to see if the header was at the start of the
file and emits a warning if not. We now check PDFSTOPONWARNING after
that and exit with an error if it is true.
The xref checking (very early part of the new interpreter, predates the
arguments) was repairing files without first checking if it should
exit (if PDFSTOPONERROR is true).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The way that the graphics library 'caches' ICC profiles when we are
using a clist is somewhat arcane. We retain a profile and link cache in
the device's 'clist' portion of the structure. As we find profiles we
add them to the profile cache and as we create link profiles we add
them to the link cache (profiles are written to and read from the clist).
For subsequent pages we (whether by design or accident) rely on the
device's clist caches remaining unchanged to give us a performance
benefit.
However with pdfi we were grestore'ing back to a point before we had
set 'PageUsesTransparency' in the device, then going back around and
setting it back up again.
Because we use that to control the banding, the fact that it had
changed caused us to throw away the clist, and then recreate it, which
threw away the ICC caches, which meant we had to recreate the link
profiles, from the profiles stored in the clist, on every page.
For /tests_private/pdf/PDFIA1.7_SUBSET/CATX3146.pdf and possibly for
other files this was a very significant portion of the total time
taken for the entire job (it is otherwise a comparatively simple file).
By removing the gsave and grestore pair we can avoid restore'ing back to
a time when the device's PageUsesTransparency flag was false, which
avoids us discarding and recreating the clist, which means we keep the
caches and therefore the file runs faster. This may affect other files
at certain resolutions/configurations.
command line:
-q -dQUIET -dNONATIVEFONTMAP -dSAFER -dBATCH -r72 -dJOBSERVER -sDEVICE=bit -o /dev/null /tests_private/pdf/PDFIA1.7_SUBSET/CATX3146.pdf
|
|
|
|
|
|
|
|
|
|
| |
Bug #705861 "Regression: image not centered when forcing A4 output"
The old PDF interpreter apparently centres the content on the scaled
page when using PDFFitPage, whereas we were maintaining the origin at
0,0.
Add code to duplicate this.
|
|
|
|
|
|
|
|
|
|
|
| |
Bug #705849 "New PDF interpreter errors out with PDFFitPage"
If the input PDF file had square media then we would correctly decide
we did not need to rotate the media for a better fit, but we left the
values we would normally use to calculate if rotation was required on
the operand stack, resulting in an error.
Pop the unused values.
|
|
|
|
| |
Accidentally left two debugging lines in the previous commit.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug #705834 "stack overflow in psi/idict.c:160 dict_alloc (exploitable)"
This is caused by subsequent calls to .PDFInfo causing the Info
dictionary to end up with circular references as we replace indirect
references with PDF objects.
I'd been meaning for some time to revisit the PostScript code and avoid
calling .PDFInfo multiple times just for performance reasons (we have to
convert the PDF dictionary to a PostScript dictionary every time).
This commit uses the stored PostScript dictionary 'PDFInfo' instead of
calling .PDFInfo which avoids the circular reference and is slightly
more efficient.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Arising from Bug #705784, if we hadn't built the PDF interpreter we
would get typecheck errors which were somewhat misleading as to the
source of the problem.
This commit tidies up the error handling in the area of .PDFInit so
that we not only detect the problem there but give a warning that it
occurred.
In addition, add a means to detect if the PDF interpreter is built in
before we start trying to process a PDF file and, if it is not, give
a sensible error message.
Tested with BUILD_PDF 0 and with NEWPDF true and false.
|
|
|
|
| |
Bug 705767 "Dereference of free object 41, next object number as offset failed"
|
|
|
|
| |
For the git prerelease code.
|
| |
|
|
|
|
|
| |
The page device dictionary doesn't necessarily contain an /Overprint key,
so check it exists before we try to use it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Supports out-of-order ranges (if the parser allows it and disables the
PageHandler, i.e., flp device). Also adds support for ranges appended
to the "even" and "odd" keyword following a ":".
As before a trailing "-" in a range implies the last page and, as was
supported by the previous 'gxps' code, a leading "-" also means the last page.
For example, with XPS or PDF: -sPageList=odd:3-7,even:4-8,1-,-1,9
prints pages: 3, 5, 7, 4, 6, 8, 1, 2, ..., last, last, last-1, ..., 1, 9
The PageList string is parsed using C code into an array that consists of
an initial int that is > 0 if the list is ordered, followed by sets of
3 integers per range, even/odd flag (odd=1, even=2), start, end. The
final 3 ints are 0,0,0 as a marker.
The initial int is used by 'pagelist_test_printed' as an index to the next
range to be processed when the PageList is used for languages that can only
be processed sequentially (e.g. PS and PCL) and is updated when the page
passes the end of the current range. A value of -1 means the ranges are
not ordered (not strictly increasing).
The flp_fillpage is changed to ignore errors from processing the PageList
performed by ParsePageList (called from SkipPage) when PageCount is 0
so that parsers that support out of order processing (PDF and XPS) can
continue until later. This should have little or no performance impact
since it is limited to PageCount == 0.
Note that the new PDF parser also uses the C code parser and then uses
the array of ranges returned by ".PDFparsePageList". The old PostScript
based parser has not been updated, although it is easy to do so.
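The array format described above can be modelled in a short Python sketch; the parsing details and the `last` placeholder are illustrative, not the real C implementation:

```python
def parse_pagelist(spec, last=999):
    """Each comma-separated range becomes a triple (even_odd, start, end)
    with odd=1, even=2, neither=0, terminated by (0, 0, 0). The leading
    int is 1 when the ranges are strictly increasing (consumable
    sequentially) and -1 otherwise. 'last' stands in for the real page
    count, which the genuine code resolves against the document."""
    triples = []
    for part in spec.split(","):
        eo = 0
        if part == "odd" or part.startswith("odd:"):
            eo, part = 1, part[4:] or "1-"
        elif part == "even" or part.startswith("even:"):
            eo, part = 2, part[5:] or "1-"
        if "-" in part:
            s, _, e = part.partition("-")
            start = int(s) if s else last  # leading '-' means the last page
            end = int(e) if e else last    # trailing '-' runs to the last page
        else:
            start = end = int(part)
        triples.append((eo, start, end))
    starts = [t[1] for t in triples]
    ordered = 1 if all(a < b for a, b in zip(starts, starts[1:])) else -1
    return ordered, triples + [(0, 0, 0)]
```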
|
|
|
|
|
|
|
|
|
|
|
| |
The Postscript interpreter parses the Fontmap file during initialisation, so
before file access permission enforcement has been activated, so a user
defined fontmap file need not be added to the permit-file-read list.
But pdfi parses the fontmap on demand, thus after file access permissions are
active. So when calling pdfi from Postscript, the Postscript interpreter
needs to add the file and path to the permit-file-read list so pdfi can
access it.
|
|
|
|
|
|
|
|
|
|
|
|
| |
When doing the special processing for EPS files, we store the string from
the %%BoundingBox comment in systemdict so it can be accessed later for
doing fit page functionality.
Problem is, the special EPS handling is done in between save/restore operations,
meaning the string is restored away while a reference to it remains in
systemdict, potentially causing problems.
Moving the string to global VM resolves the problem.
|
|
|
|
|
|
|
|
|
|
|
| |
This was missing from pdfi, so is added here.
Also, change the functionality so setting the SUBSTFONT key to "/None" will
result in any attempt to fall back to the default font to throw an
invalidfont error.
Lastly, fix some typos on the parameters being passed from Postscript into
pdfi.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug #705218 "table of contents/bookmarks wrong after merging PDF files"
Part 2 of 2 for this bug.
The pdfwrite device is capable of processing pdfmarks to produce a
variety of effects in the output. In this case we can add Link
annotations and Outlines to a PDF file. When the PDF interpreter
supports reading these from an input file we also write them to the
output file using (effectively) pdfmark operations.
The problem is that when we have multiple input files and the Dest of
a Link annotation or Outline is a page, that page refers to the page
number in the original file (e.g. 1). That won't be the correct page
number in the output file because it already contains the pages from
any previous files.
So what we need to do is offset the page number of the destination by
the number of pages already in the output. That's what this commit does;
for Ghostscript processing we store the number of pages in the device
and update at the end of each file by the number of pages in that file.
We send the number of pages to the (newly created) interpreter at the
start of every file in the dictionary argument we supply to .PDFInit.
For GhostPDF it is simpler because we handle all the files, we just
update the counter in the PDF context by the number of pages processed
in the file we have just completed.
We simply add the offset to the page number when creating the Dest
pdfmark.
This commit also extends the named destination processing to handle
name trees with more than a single node by recursively processing the
Kids array in each node. We also use the Limits to more quickly
determine if a node contains our target, rather than checking each
entry in the Names array for a match.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we ran the PDF interpreter from the PostScript interface
more or less totally in a stopped context, which meant that it never
exited with an error status, even when the file had an error.
We now use PDFSTOPONERROR to determine whether the PDF interpreter
should run in a stopped context or should actually return a PostScript
error, which makes the interpreter exit with a non-zero status.
This causes two files, output from pdfwrite, to throw errors where they
previously did not when tested on the cluster. This is actually correct
behaviour when compared with the action of the old PDF interpreter.
|
|
|
|
|
|
|
|
|
|
| |
The old code relied upon runpdf closing the file supplied to the /run
(redefined PS operator) function instead of closing the file itself.
For compatibility this now does the same with the new PDF interpreter
code. Note that unless PDFSTOPONERROR is true we will execute the PDF
file in a stopped context, so as to allow us to close the file even in
the event of an error.
|
|
|
|
|
|
|
|
|
|
| |
Sending PageSpotColors to the txtwrite device was causing it to throw
an error on setpagedevice as it did not process the key. While that
shouldn't really cause a problem (and doesn't with the following commit)
we don't really want to set the key for devices which aren't capable of
producing spots anyway.
This only affects the Ghostscript implementation of GhostPDF.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We added /Page# to the dictionary returned from pdfgetpage in order to
get pdf2dsc.ps to work (it relies upon that key being present) and
potentially other PostScript programs as well. The key was added to
the dictionary in pdfdopages.
However, after adding it, we then proceeded to assume it would always be
present, even if we called pdfpage and friends directly, rather than
using pdfdopages.
Fix that assumption here by adding the key to the dictionary in
pdfgetpage instead of pdfdopages. There's no way (using the old
PostScript code) to use any of the other functions without using
pdfgetpage first.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A customer requested that we make pdf_info.ps work with the new
PDF interpreter, and generate the same information.
This commit modifies the way we extract information on a
page-by-page basis to potentially include the names of spot inks
and information about fonts used on the page.
This is now returned to the PostScript environment using a PDF
dictionary instead of a C structure. The pdf_info.ps program has
been updated so that it uses the new information in broadly the
same way as the information from the old PDF interpreter.
There are differences; pdf_info.ps extracts font information
itself, rather than having the interpreter do it. This is not
possible with the new interpreter which is why we have the
PDF interpreter do it for us. In addition the pdf_info.ps
program only descended to the page level whereas the new PDF
interpreter evaluates all objects on the page, potentially
meaning that more fonts (and technically spot inks) might be
detected.
We now have an additional PostScript operator '.PDFPageInfoExt'
which returns 'extended' information about a page. This is the
same as .PDFPageInfo but includes the font and spot ink
information.
Running with -dPDFINFO using either Ghostscript or GhostPDF will
print more information than before, including the spot inks and
considerably more information about fonts than the pdf_info.ps
program emits, including embedding status, descendant fonts
(and their embedding status) and the presence of ToUnicode
CMaps.
Updated documentation for all of the above.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In Type 2 fonts, we were "collapsing" the mapping of the glyph name to GID to
charstring mapping into a single glyph name to charstring map. That is contrary
to the description in the PLRM 3rd Edition.
This becomes a problem when attempting to re-encode the font: when replacing
the glyph name to GID encoding in the CharStrings dictionary, where we expect
to find a charstring, we find an integer (GID).
This changes how we create Type 2 fonts from CFF so it maintains that two
step mapping, and re-encoding will now work.
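The two-step mapping retained by this change can be modelled simply; the data here is illustrative, not real CFF content:

```python
# CharStrings maps glyph name -> GID, and a separate GID -> charstring
# table completes the lookup, rather than collapsing both into a single
# name -> charstring dictionary.
charstrings = {"ocyrillic": 5}            # name -> GID
glyph_data = {5: b"<charstring bytes>"}   # GID -> charstring

def charstring_for(name):
    return glyph_data[charstrings[name]]
```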
|
|
|
|
|
|
|
| |
This makes the procedure style a bit more like the rest of the PS init code,
and makes it both a little more efficient and (I think) a little clearer.
Noticed in passing when working on oss-fuzz 46672.
|
|
|
|
|
|
|
|
| |
As per commit 958c044dbbf140b893874c6d634ac71400ea5a12 but this time for
the definition of knownoget.
In fact, given the way pdf2dsc uses this, it doesn't actually cause a
problem, but leaving junk on the stack is bad practice, so let's fix it.
|
|
|
|
|
|
|
|
|
|
|
| |
The utility program pdf2dsc.ps expects that both pget and knownoget will
be available in the current dictionary after runpdfbegin is executed.
To facilitate this pdfi makes such a definition but, unfortunately, it
had a bug if the key was not found in the dictionary, leaving a copy
of the dictionary on the operand stack when it should not.
Fixed here. Thanks to William Bader for spotting the bug.
|
|
|
|
|
|
|
|
|
| |
When rewriting FitPage for the new interpreter I seem to have left part
of the debugging code in place, which led to two incorrect values on
the stack when we calculated the scale factor. In general these values
were zero, leading to a scale factor of 0 and nothing being drawn.
Fixed here by removing the extraneous values.
|
|
|
|
|
|
|
|
|
|
| |
Thanks to Ken Sharp for providing the solution for this.
If we are doing pattern simulation with the pdf14 device
then the patterns should also do this. Otherwise we can
get seg faults. For example:
-sDEVICE=bitrgbtags -dNEWPDF=false -dOverprint=/simulate
-r72 -o output.ppm -f tests_private/comparefiles/Bug693541.pdf
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The file supplied for this came from a customer who asked that we
delete the file after working on it, so there is no bug report and no
reproducer for this.
The problem is that the PDF file has a font using a CMap, and that CMap
uses /UseCMap (and the usecmap operator) to read a different (Horizontal)
CMap and then modify it with Vertical glyph positions.
The CMap does not have a begincodespacerange, it simply inherits the
ranges from the child CMap. This causes the code added to work around
Bug 690737 to read off the end of the CMap in read_CMap_stream.
Since there is nothing left to process, this causes errors in the CMap
processing.
Following a suggestion from Chris this commit first attempts the same
'discard up to the begincodespacerange' hackery, then checks to see if
the stream has any bytes left. If it does we proceed as before.
If there are no bytes left, then we have discarded all of the content.
So we rewind the stream to the point we were at before we tried to
discard the header, impose a different SubFileDecode, looking this time
for a 'begincmap' and then attempt to process the CMap as normal.
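The try-then-rewind recovery can be sketched roughly as below; the real code layers SubFileDecode filters on the stream rather than searching a byte string, so this is only a model of the control flow:

```python
import io

def position_cmap_stream(stream):
    """First try to discard up to 'begincodespacerange'; if that token
    is absent (so discarding would consume the whole stream), fall back
    to discarding up to 'begincmap' instead."""
    start = stream.tell()
    data = stream.read()
    idx = data.find(b"begincodespacerange")
    if idx == -1:
        idx = data.find(b"begincmap")
        if idx == -1:
            raise ValueError("no CMap content found")
    stream.seek(start + idx)
    return stream
```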
|