Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

linux - Merge PDF's with PDFTK with Bookmarks?

Using pdftk to merge multiple PDFs is working well. However, is there an easy way to create a bookmark for each merged PDF?

I don't see anything in the pdftk docs regarding this, so I don't think it's possible with pdftk.

All of the files we merge will be 1 page each, so I'm wondering whether some other utility can add bookmarks afterwards.

Or is there another Linux-based PDF utility that can merge while specifying a bookmark for each individual PDF?


1 Reply

You can also merge multiple PDFs with Ghostscript. The big advantage of this route is that a solution is easily scriptable, and it does not require a real programming effort:

gswin32c.exe ^
          -dBATCH -dNOPAUSE ^
          -sDEVICE=pdfwrite ^
          -sOutputFile=merged.pdf ^
          [...more Ghostscript options as needed...] ^
          input1.pdf input2.pdf input3.pdf [....]

With Ghostscript you can pass pdfmark statements, which can add a Table of Contents as well as a bookmark for each source file going into the resulting PDF. For example:

gswin32c.exe ^
          -dBATCH -dNOPAUSE ^
          -sDEVICE=pdfwrite ^
          -sOutputFile=merged.pdf ^
          [...more Ghostscript options as needed...] ^
          file-with-pdfmarks-to-generate-a-ToC.ps ^
          -f input1.pdf input2.pdf input3.pdf [....]

or

gswin32c.exe ^
          -dBATCH -dNOPAUSE ^
          -sDEVICE=pdfwrite ^
          -sOutputFile=merged.pdf ^
          [...more Ghostscript options as needed...] ^
          file-with-pdfmarks-to-generate-a-ToC.ps ^
          -f input1.pdf ^
             input2.pdf ^
             input3.pdf [....]

For some introduction to the pdfmark topic, see also Thomas Merz's PDFmark Primer.


Edit:
I had wanted to give you an example for file-with-pdfmarks-to-generate-a-ToC.ps, but somehow forgot it. Here it is:

[/Page 1 /View [/XYZ null null null] /Title (File 1) /OUT pdfmark
[/Page 2 /View [/XYZ null null null] /Title (File 2) /OUT pdfmark
[/Page 3 /View [/XYZ null null null] /Title (File 3) /OUT pdfmark
[/Page 4 /View [/XYZ null null null] /Title (File 4) /OUT pdfmark 

This would create a ToC for the first 4 files == first 4 pages (since you guarantee your ingredient files are 1 page each for your merged output PDF).

  1. The [/XYZ null null null] part makes sure your page viewport and zoom level do not change from the current ones when you follow the link. (To set them explicitly instead, you could say [/XYZ 222 111 2], as an arbitrary example: left 222, top 111, zoom factor 2.)
  2. The /Title (some string you want) thingie determines what text is shown in the ToC.
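
If you have many files, typing these lines by hand gets tedious. Here is a minimal Python sketch that generates them, assuming (as in the question) that every input is exactly 1 page, so file number i lands on page i. The function names are my own, not part of Ghostscript, and I assume plain ASCII titles:

```python
import sys


def escape_ps_string(text):
    """Escape the characters that are special inside a PostScript (...) string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def make_pdfmarks(titles):
    """Return one /OUT pdfmark line per title; title i points at page i (1-based)."""
    lines = []
    for page, title in enumerate(titles, start=1):
        lines.append(
            f"[/Page {page} /View [/XYZ null null null] "
            f"/Title ({escape_ps_string(title)}) /OUT pdfmark"
        )
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # Titles are taken from the command line, in merge order.
    sys.stdout.write(make_pdfmarks(sys.argv[1:]))
```

Redirect the output into file-with-pdfmarks-to-generate-a-ToC.ps and pass that file to Ghostscript in front of the -f switch, as in the commands above. The escaping matters as soon as a title contains parentheses or a backslash.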

And, you could even add these parameters to the Ghostscript commandline directly:

gswin32c.exe ^
       -sDEVICE=pdfwrite ^
       -o merged.pdf ^
       [...more Ghostscript options as needed...] ^
       -c "[/Page 1 /View [/XYZ null null null] /Title (File 1) /OUT pdfmark" ^
       -c "[/Page 2 /View [/XYZ null null null] /Title (File 2) /OUT pdfmark" ^
       -c "[/Page 3 /View [/XYZ null null null] /Title (File 3) /OUT pdfmark" ^
       -c "[/Page 4 /View [/XYZ null null null] /Title (File 4) /OUT pdfmark" ^
       -f input1.pdf ^
          input2.pdf ^
          input3.pdf ^
          input4.pdf [....]
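
The same command line can be assembled from a script. The following is a rough Python sketch, not a tested recipe: build_gs_command is a name I made up, the bookmark title is simply the file name without its extension, and the executable defaults to gs (on Windows you would pass gswin32c.exe). It again assumes each input file is 1 page:

```python
from pathlib import Path


def escape_ps_string(text):
    """Escape the characters that are special inside a PostScript (...) string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def build_gs_command(inputs, output, gs="gs"):
    """Build the argument list for a Ghostscript merge with one bookmark per input.

    Assumes every input PDF has exactly 1 page, so input i ends up on page i.
    """
    cmd = [gs, "-sDEVICE=pdfwrite", "-o", output]
    for page, name in enumerate(inputs, start=1):
        title = escape_ps_string(Path(name).stem)
        cmd += [
            "-c",
            f"[/Page {page} /View [/XYZ null null null] /Title ({title}) /OUT pdfmark",
        ]
    # -f ends the PostScript passed via -c; the input files follow it.
    cmd += ["-f", *inputs]
    return cmd


# Usage (needs Ghostscript installed):
#   import subprocess
#   subprocess.run(build_gs_command(["input1.pdf", "input2.pdf"], "merged.pdf"), check=True)
```

Passing the arguments as a list avoids any shell quoting trouble with the brackets and parentheses in the pdfmark strings.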



'nother Edit:

Oh, and by the way: Ghostscript does preserve the bookmarks when you use it to merge two PDF files into one -- pdftk.exe does not. Let's use the one generated by the command of my first edit (effectively concatenating 2 copies of the same file):

 gswin32c ^
    -sDEVICE=pdfwrite ^
    -o doublemerged.pdf ^
     merged.pdf ^
     merged.pdf

The file doublemerged.pdf will now have 2*4 = 8 bookmarks.

  • As expected, bookmarks 1, 2, 3 and 4 link to pages 1, 2, 3 and 4.
  • The problem is that bookmarks 5, 6, 7 and 8 also link to pages 1, 2, 3 and 4.

The reason is that the pre-existing bookmarks address their link targets by absolute page numbers. To work around that (and make bookmarks work in merged files), one would have to generate bookmarks which point to their link targets by named destinations (and make sure these names are unique across the documents being merged).
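
To illustrate what such bookmarks could look like: the pdfmark syntax lets you define a named destination with /DEST and refer to it from a bookmark with /Dest. Below is a Python sketch that generates both lines per page. The function name and the prefix-pN naming scheme are my own invention, and I have not verified how any particular Ghostscript version treats named destinations in a second merge, so take it as a starting point only:

```python
import re


def escape_ps_string(text):
    """Escape the characters that are special inside a PostScript (...) string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def make_named_pdfmarks(titles, prefix):
    """Return pdfmark lines that bookmark page i via a named destination.

    `prefix` should differ for every document that may later be merged,
    so that the destination names stay unique.
    """
    if not re.fullmatch(r"[A-Za-z0-9_.-]+", prefix):
        raise ValueError("prefix must be usable as a PostScript name")
    lines = []
    for page, title in enumerate(titles, start=1):
        dest = f"{prefix}-p{page}"
        # Define the named destination on its page ...
        lines.append(
            f"[/Dest /{dest} /Page {page} /View [/XYZ null null null] /DEST pdfmark"
        )
        # ... and let the bookmark refer to the name, not to a page number.
        lines.append(f"[/Title ({escape_ps_string(title)}) /Dest /{dest} /OUT pdfmark")
    return "".join(line + "\n" for line in lines)
```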

(This approach also works on linux, just use gs instead of gswin32c.)


Appendix

The command lines above use [...more Ghostscript options as needed...] as a placeholder for more options.

If you do not use other options, Ghostscript will apply its built-in defaults for various parameters. However, this may give you results which are not to your liking. Since Ghostscript generates a completely new PDF based on the input, some of the original objects may be changed. This is true for color spaces and for image compression levels.

How to apply parameters which leave the originally embedded images unchanged can be seen over at SuperUser: "Use Ghostscript, but tell it to not reprocess images".

