convert producing invalid pdfs from jpgs

kraftydevil
Posts: 10
Joined: 2017-05-01T22:24:31-07:00

Post by kraftydevil »

I've got 2 Mac OS X machines with ImageMagick installed.

One machine always works and one machine produces unreadable pdfs about 80% of the time. Either they don't open or there are blank pages after a certain point.

I'm using the basic convert command for jpg > pdf:

Code: Select all

convert path/to/images/*.jpg name_of_pdf.pdf
There are too many pdfs to open each one to check, so I am using JHOVE to verify them, as shown here: https://superuser.com/a/1204692/379229.
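
For checking a large batch, the same JHOVE call can be wrapped in a loop that prints only the files that fail. This is a minimal sketch, assuming `jhove` is on the PATH, that its text report carries a `Status:` line like the one in the report posted further down this thread, and that a good file is reported as "Well-Formed and valid". The function names `jhove_status` and `check_pdfs` are made up for illustration.

```shell
#!/bin/sh
# Print the first "Status:" value from a JHOVE text report read on stdin.
jhove_status() {
  sed -n 's/^ *Status: *//p' | head -n 1
}

# Run JHOVE's PDF module on each file and list those not reported as valid.
# Assumption: a fully valid file is reported as "Well-Formed and valid".
check_pdfs() {
  for pdf in "$@"; do
    status=$(jhove -m pdf-hul "$pdf" | jhove_status)
    if [ "$status" != "Well-Formed and valid" ]; then
      printf '%s: %s\n' "$pdf" "${status:-no status reported}"
    fi
  done
}
```

Called as `check_pdfs path/to/pdfs/*.pdf`, it stays silent for files that pass, so any output at all means something needs a closer look.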

At first I thought it might be a version issue, but it always works on an older version:

Machine 1 (always works)

Code: Select all

$ convert --version
Version: ImageMagick 6.9.2-6 Q16 x86_64 2015-11-15 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2015 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Features: Cipher DPC Modules 
Delegates (built-in): bzlib freetype jng jpeg ltdl lzma png tiff xml zlib
$
$ which gs
/usr/local/bin/gs
$ /usr/local/bin/gs --version
9.18
Machine 2 (mostly doesn't work)

Code: Select all

$ convert --version
Version: ImageMagick 7.0.5-5 Q16 x86_64 2017-04-25 http://www.imagemagick.org
Copyright: © 1999-2017 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Features: Cipher DPC HDRI Modules 
Delegates (built-in): bzlib freetype jng jpeg ltdl lzma png tiff xml zlib
$
$ which gs
/usr/local/bin/gs
$ /usr/local/bin/gs --version
9.21
So the question is...

Are there any other dependencies or configurations I should check to ensure I get the same results no matter what machine I'm using?
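
One way to compare the two machines is to capture the same version report on each and diff the results. The sketch below is only an assumption about what is worth collecting (the ImageMagick entry points and Ghostscript); `report_tool` and `env_report` are hypothetical helper names.

```shell
#!/bin/sh
# Print the first line of a tool's output, or note that the tool is missing.
report_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    printf '%s: %s\n' "$1" "$("$@" 2>&1 | head -n 1)"
  else
    printf '%s: not installed\n' "$1"
  fi
}

# Collect the versions involved in a JPEG-to-PDF conversion.
env_report() {
  report_tool convert --version
  report_tool magick --version
  report_tool gs --version
}
```

Running `env_report > machine1.txt` on one machine and `env_report > machine2.txt` on the other, then `diff machine1.txt machine2.txt`, shows the differences at a glance. `convert -list configure` prints more build detail if the version lines match.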
Last edited by kraftydevil on 2017-05-02T03:36:41-07:00, edited 1 time in total.
Bonzo
Posts: 2971
Joined: 2006-05-20T08:08:19-07:00
Location: Cambridge, England

Post by Bonzo »

kraftydevil wrote: At first I thought it might be a version issue, but it always works on an older version:
Bugs can be written in by mistake and it is not as though both versions are V6 versions. Are you using the same version of Ghostscript?

V7 prefers magick rather than convert; give that a go.
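
A script that has to run on both a v6 and a v7 machine can pick whichever command is installed. This is a sketch under the assumption that `magick` exists only on v7 installs and `convert` on both; `im_command` and `jpgs_to_pdf` are illustrative names, not part of ImageMagick.

```shell
#!/bin/sh
# Echo the ImageMagick entry point to use: "magick" (v7) if present,
# otherwise "convert" (v6). Returns non-zero if neither is found.
im_command() {
  if command -v magick >/dev/null 2>&1; then
    echo magick
  elif command -v convert >/dev/null 2>&1; then
    echo convert
  else
    echo "ImageMagick not found" >&2
    return 1
  fi
}

# Build one PDF from JPEGs.
# usage: jpgs_to_pdf out.pdf page1.jpg page2.jpg ...
jpgs_to_pdf() {
  out=$1
  shift
  im=$(im_command) || return 1
  "$im" "$@" "$out"
}
```

For example, `jpgs_to_pdf name_of_pdf.pdf path/to/images/*.jpg`. The shell expands `*.jpg` in sorted filename order, so that order becomes the page order.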
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Location: England, UK

Post by snibgo »

IM uses Ghostscript to read PDFs, but not to write PDFs.

This may be a bug introduced in v7.0.5-5. kraftydevil: can you make a reproducible example? A test command that, when run, makes a bad PDF? Then other people can test.
snibgo's IM pages: im.snibgo.com
kraftydevil
Posts: 10
Joined: 2017-05-01T22:24:31-07:00

Post by kraftydevil »

Bonzo wrote: 2017-05-01T23:47:26-07:00
At first I thought it might be a version issue, but it always works on an older version:
Bugs can be written in by mistake and it is not as though both versions are V6 versions. Are you using the same version of Ghostscript?

V7 prefers magick rather than convert; give that a go.
gs is 9.18 in the working machine and 9.21 in the one that isn't. Updated OP with gs info.

I tried the magick command with the same syntax. It almost worked. For whatever reason the last page simply said "PDF" with a white background on it. When I tried it 2 more times I got a corrupt pdf.
kraftydevil
Posts: 10
Joined: 2017-05-01T22:24:31-07:00

Post by kraftydevil »

snibgo wrote: 2017-05-02T00:17:48-07:00 IM uses Ghostscript to read PDFs, but not to write PDFs.

This may be a bug introduced in v7.0.5-5. kraftydevil: can you make a reproducible example? A test command that, when run, makes a bad PDF? Then other people can test.
This is reproducible about 80% of the time:

Code: Select all

convert path/to/images/*.jpg name_of_pdf.pdf
Will that work? Unfortunately I can't share the image files.
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Location: England, UK

Post by snibgo »

By "reproducible" I mean by other people. If no-one can reproduce your problem, they can't fix it.
Bonzo
Posts: 2971
Joined: 2006-05-20T08:08:19-07:00
Location: Cambridge, England

Post by Bonzo »

kraftydevil wrote: Will that work? Unfortunately I can't share the image files.
Can you make a pdf file from the original software that does not contain sensitive information?
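
If the real images cannot be shared, one option is a test case built from images that ship inside ImageMagick. The sketch below assumes the built-in `rose:`, `logo:` and `wizard:` images are acceptable stand-ins. There is no guarantee they trigger the same fault, and `make_repro` is a hypothetical helper name.

```shell
#!/bin/sh
# Build a small multi-page PDF from ImageMagick's built-in test images,
# so the inputs contain nothing private. Files are written into $1
# (default: the current directory).
make_repro() {
  dir=${1:-.}
  if command -v magick >/dev/null 2>&1; then
    im=magick
  elif command -v convert >/dev/null 2>&1; then
    im=convert
  else
    echo "ImageMagick not found" >&2
    return 1
  fi
  # rose:, logo: and wizard: are images compiled into ImageMagick.
  "$im" rose: "$dir/page1.jpg" &&
  "$im" logo: "$dir/page2.jpg" &&
  "$im" wizard: "$dir/page3.jpg" &&
  "$im" "$dir"/page*.jpg "$dir/repro.pdf"
}
```

If `repro.pdf` opens cleanly, the fault probably depends on something specific to the original JPEGs, which is itself useful to report.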
kraftydevil
Posts: 10
Joined: 2017-05-01T22:24:31-07:00

Post by kraftydevil »

snibgo wrote: 2017-05-02T04:07:51-07:00 By "reproducible" I mean by other people. If no-one can reproduce your problem, they can't fix it.
I suppose I can only share my own experience and how often I hit the issue when reproducing it.

Here's a formal write-up for others to try. It's not much more complex than what I mentioned in the OP:

Environment

Code: Select all

$ convert --version
Version: ImageMagick 7.0.5-5 Q16 x86_64 2017-04-25 http://www.imagemagick.org
Copyright: © 1999-2017 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Features: Cipher DPC HDRI Modules 
Delegates (built-in): bzlib freetype jng jpeg ltdl lzma png tiff xml zlib
$
$ which gs
/usr/local/bin/gs
$ /usr/local/bin/gs --version
9.21
Example Images Used
https://drive.google.com/file/d/0Bx20O2 ... zhpd2RJbWM

Steps to reproduce
  1. Put the jpgs into a directory
  2. cd to the directory
  3. Run:

    Code: Select all

    convert *.jpg Example.pdf
Expected Result
convert creates a pdf document named Example.pdf in the same directory. It should contain one page for each jpg matched by the "*.jpg" argument, and it should be viewable in a pdf reader.

Actual Result
The pdf document is created, but it cannot be opened by the Preview app. Attempting to do so yields an error message: 'The file “Example.pdf” could not be opened. It may be damaged or use a file format that Preview doesn’t recognize.'

The Adobe Reader app can open it, but several pages are blank.

Here's some output from the jhove validation tool I mentioned in my OP:

Code: Select all

$ jhove -m pdf-hul Example.pdf
Jhove (Rel. 1.16.6, 2017-04-27)
 Date: 2017-05-02 08:39:43 EDT
 RepresentationInformation: Example.pdf
  ReportingModule: PDF-hul, Rel. 1.8 (2017-03-14)
  LastModified: 2017-05-02 08:39:03 EDT
  Size: 3640291
  Format: PDF
  Version: 1.3
  Status: Well-Formed, but not valid
  SignatureMatches:
   PDF-hul
  ErrorMessage: Invalid page tree node
   Offset: 1779234
  MIMEtype: application/pdf
  PDFMetadata: 
   Objects: 118
   FreeObjects: 1
   IncrementalUpdates: 0
   DocumentCatalog: 
    PageLayout: SinglePage
    PageMode: UseNone
   Info: 
    Title: Example
    Producer: /usr/local/Cellar/imagemagick/7.0.5-5/share/doc/ImageMagick-7//index.html
    CreationDate: Tue May 02 08:39:03 EDT 2017
    ModDate: Tue May 02 08:39:03 EDT 2017
   ID: 0xb7539159b1a1e74cfd5e38ebde538b4a9caa08e091b6e318b7b4264f57747113, 0xb7539159b1a1e74cfd5e38ebde538b4a9caa08e091b6e318b7b4264f57747113
   Filters: 
    FilterPipeline: DCTDecode
   Images: 
    Image: 
     NisoImageMetadata: 
      CompressionScheme: JPEG
      ImageWidth: 736
      ImageHeight: 1091
      BitsPerSample: 8
      BitsPerSampleUnit: integer
     Name: Im0
    Image: 
     NisoImageMetadata: 
      CompressionScheme: JPEG
      ImageWidth: 1057
      ImageHeight: 1500
      BitsPerSample: 8
      BitsPerSampleUnit: integer
     Name: Im1
    Image: 
     NisoImageMetadata: 
      CompressionScheme: JPEG
      ImageWidth: 1584
      ImageHeight: 2129
      BitsPerSample: 8
      BitsPerSampleUnit: integer
     Name: Im2
    Image: 
     NisoImageMetadata: 
      CompressionScheme: JPEG
      ImageWidth: 1000
      ImageHeight: 1409
      BitsPerSample: 8
      BitsPerSampleUnit: integer
     Name: Im3
   Pages: 
    Page: 
     Sequence: 1
     Thumb: true
    Page: 
     Sequence: 2
     Thumb: true
    Page: 
     Sequence: 3
     Thumb: true
    Page: 
     Sequence: 4
     Thumb: true
The main points I see in jhove's output are that Example.pdf is "Well-Formed, but not valid" and that the error given is "Invalid page tree node". There may be more, but I don't know everything I should be looking for there.
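
The report above also lists only four pages and four images, which can be compared with the number of JPEGs that went in. A minimal sketch, assuming JHOVE prints one `Sequence:` line per page as it does above; `jhove_page_count` and `compare_pages` are illustrative names.

```shell
#!/bin/sh
# Count the pages listed in a JHOVE text report read on stdin
# (one "Sequence: N" line per page).
jhove_page_count() {
  grep -c '^ *Sequence: '
}

# Compare the pages JHOVE lists with the number of input images.
# usage: compare_pages Example.pdf *.jpg
# Returns zero only when the two counts match.
compare_pages() {
  pdf=$1
  shift
  pages=$(jhove -m pdf-hul "$pdf" | jhove_page_count)
  echo "inputs: $#, pages listed by JHOVE: $pages"
  [ "$pages" -eq "$#" ]
}
```

A mismatch would suggest the page tree stops partway through, which fits the "Invalid page tree node" error and the blank pages seen in viewers.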

Maybe it's flat out broken, but I'm guessing it's more related to my environment.

I don't know much about ImageMagick or image processing in general, so if you do, please ask for whatever more specific information you need.
snibgo
Posts: 12159
Joined: 2010-01-23T23:01:33-07:00
Location: England, UK

Post by snibgo »

In your Zip file, you included Example.pdf. Adobe Acrobat Reader DC says "There was a problem reading this document (14)."

When I "convert *.jpg x.pdf" with IM v6.9.5-3, or "magick *.jpg x2.pdf" with v7.0.3-5, Adobe Reader reports no problems.

Can someone confirm that v7.0.5-5 creates a bad PDF?
fmw42
Posts: 25562
Joined: 2007-07-02T17:14:51-07:00
Location: Sunnyvale, California, USA

Post by fmw42 »

I have tested the folder of jpegs with IM 7.0.5-5 Q16 on Mac OS X with GS 9.21, and the command itself seems to run fine: I get no error messages, and identify says there are 8 pages. But when I view the pdf in several viewers, including Acrobat Reader, only the first 4 pages show. The last 4 are blank.

Oddly, it also fails the same way with IM 6.9.8-4 Q16 on Mac OS X and GS 9.21.
GeeMack
Posts: 718
Joined: 2015-12-01T22:09:46-07:00
Location: Central Illinois, USA

Post by GeeMack »

snibgo wrote: 2017-05-02T07:08:50-07:00 Can someone confirm that v7.0.5-5 creates a bad PDF?
I haven't tried the examples above, but I just encountered a possibly related issue today with 7.0.5-5 on Windows 10. I created a four-layer document with Photoshop Elements: four simple 8.5 by 11 inch images. I can read the PSD file with IM and correctly convert it into four PNGs or four JPGs, etc. When I try to convert the PSD to a PDF with a simple, no-frills conversion...

Code: Select all

magick fourlayers.psd[1-4] fourpages.pdf
... the command runs without reporting any errors, and produces a PDF, but the finished file generates an error when opening in Acrobat. Then after I click the "OK" button to clear the error message, the file appears to be four blank pages. The first page is full size, and the remaining three just look like tiny squares vertically aligned below the first page.
d-ph
Posts: 1
Joined: 2017-05-04T11:49:57-07:00

Post by d-ph »

dlemstra
Posts: 1570
Joined: 2013-05-04T15:28:54-07:00

Post by dlemstra »

We can reproduce it and will have a patch in the Git master branch at https://github.com/ImageMagick/ImageMagick later today. The patch should be available in the beta releases of ImageMagick at http://www.imagemagick.org/download/beta/ sometime tomorrow.
.NET + ImageMagick = Magick.NET https://github.com/dlemstra/Magick.NET, @MagickNET, Donate