Convert pdf to png -> exact same aspect

Posted: 2019-02-28T08:43:57-07:00
by zikmout
Hi,

I have been struggling to find an answer to my question, because the keywords I use keep matching unrelated answers.

What I want to do is extremely simple: I want to convert a pdf to png and have the png keep exactly the same aspect (margins, background, proportions). If the original pdf page is portrait, I want the png in portrait; if it is landscape, I want the png in landscape; no cropping, etc.

Basically, I want nothing to change from pdf to png apart from the format. Is that possible?

The idea is to be able to convert the png file back to a pdf file afterwards without changing the position of any of the 'things' on the pages (tables, text, etc.).

Thanks a lot.

Simon

Re: Convert pdf to png -> exact same aspect

Posted: 2019-02-28T08:52:27-07:00
by snibgo
Does this do what you want?

Code:

magick in.pdf out.png
If not, what's wrong with it?

Re: Convert pdf to png -> exact same aspect

Posted: 2019-02-28T09:13:21-07:00
by zikmout
Thanks for your quick reply. I used the command you mentioned.

For instance take this pdf file : https://www.dropbox.com/s/86rmsj6fr69g0 ... s.pdf?dl=0

When I convert it to png I get this one : https://www.dropbox.com/s/pj32zz8bzb4cv ... s.png?dl=0
-> This png already seems to have had its margins removed, so the coordinates of the tables in the original pdf file are not the same as the coordinates of the tables in the generated png file

And when I convert it back (from the generated png to pdf), I get this one : https://www.dropbox.com/s/nrvkmcka1i1lejr/back.pdf?dl=0
-> the generated pdf has a completely different layout from the first pdf

To explain my use case so you can better understand: I want to convert hermes.pdf to hermes.png so that I can do some machine learning on the png file and identify the table bounding boxes. Once I have identified those, I want to use their coordinates to locate the same tables in the original pdf document. I don't necessarily need to convert the png back to pdf; I just want the coordinates of the tables identified in my png to be consistent with the original pdf file it comes from.

Cheers

Re: Convert pdf to png -> exact same aspect

Posted: 2019-02-28T09:53:50-07:00
by snibgo
zikmout wrote:This png already seems to have taken off margins, ...
No, it hasn't. The PDF is transparent beyond the table. Those pixels are transparent in the PNG. We can make them visible...

Code:

magick in.pdf -background Blue -layers Flatten out.png
... but that has no effect on the table coordinates.

IM rasterizes the PDF, converting vector to pixels. Converting that PNG back to PDF only wraps the pixels in a PDF page; it won't vectorize the image.
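
For the coordinate question, here is a minimal sketch in Python of mapping a bounding box found in the PNG back to PDF coordinates. It rests on several assumptions that are not confirmed in this thread: the PNG covers the whole page with no cropping, the page is not rotated, the page box starts at (0, 0), and the page height in points is known from some other tool. ImageMagick rasterizes at 72 dpi unless `-density` is given, and PDF units are 1/72 inch by default, so at the default density one pixel corresponds to one point. The only other difference is that image rows count downward from the top while PDF y coordinates count upward from the bottom. The function name `pixel_bbox_to_pdf` and its parameters are made up for illustration.

```python
from typing import Tuple

# (a, b, c, d) bounding box; meaning depends on the coordinate system below.
BBox = Tuple[float, float, float, float]

# Default PDF user-space unit: 1 point = 1/72 inch.
PDF_POINTS_PER_INCH = 72.0


def pixel_bbox_to_pdf(
    bbox_px: BBox,
    page_height_pt: float,
    density: float = 72.0,
) -> BBox:
    """Map a pixel bbox to PDF points (hypothetical helper).

    bbox_px        -- (left, top, right, bottom) in pixels, origin top-left, y down
    page_height_pt -- page height in PDF points (assumed known)
    density        -- dpi used when rasterizing (assumed; 72 is IM's default)

    Returns (x0, y0, x1, y1) in points, origin bottom-left, y up.
    Assumes no crop, no rotation, and a page box starting at (0, 0).
    """
    scale = PDF_POINTS_PER_INCH / density  # points per pixel
    left, top, right, bottom = bbox_px
    x0 = left * scale
    x1 = right * scale
    # Flip the y axis: the bottom edge in the image becomes the lower y in the PDF.
    y0 = page_height_pt - bottom * scale
    y1 = page_height_pt - top * scale
    return (x0, y0, x1, y1)


if __name__ == "__main__":
    # Example values only: an 842 pt tall page rasterized at 144 dpi.
    print(pixel_bbox_to_pdf((100, 200, 300, 400), page_height_pt=842.0, density=144.0))
```

If the page was rasterized at a higher resolution for the machine learning step, for example with `magick -density 144 in.pdf out.png`, pass the same value as `density` so the scale factor matches.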