Jun 09, 2013
 

PDFtk, or the PDF Toolkit, is an open source, cross-platform tool for manipulating PDF documents. pdftk is basically a front end to the iText library (compiled to native code using GCJ), capable of splitting, merging, encrypting, decrypting, uncompressing, recompressing, and repairing PDFs.

If pdf is electronic paper, then pdftk is an electronic stapler-remover, hole-punch, binder, secret-decoder-ring, and x-ray-glasses.
pdftk is a simple tool for doing everyday things with pdf documents. Keep one in the top drawer of your desktop and use it to:

  • merge pdf documents
  • split pdf pages into a new document
  • decrypt input as necessary (password required)
  • encrypt output as desired
  • fill pdf forms with fdf data and/or flatten forms
  • apply a background watermark
  • report on pdf metrics, including metadata and bookmarks
  • update pdf metadata
  • attach files to pdf pages or the pdf document
  • unpack pdf attachments
  • burst a pdf document into single pages
  • uncompress and re-compress page streams
  • repair corrupted pdf (where possible)



Installation

The pdftk package should be available in the repositories of most common distributions, so you can usually install it with your package manager:

Debian, Ubuntu, Mint:

apt-get install pdftk

Fedora:

yum install pdftk

Arch Linux:

yaourt -Sy pdftk

Basic Usage

The basic syntax of pdftk is the following:

pdftk input_files operation [operation_arguments] output output_file

The operation is the action that you want to perform on the files:

  • cat: Concatenates pages from one or more input PDFs
  • burst: Splits a single input PDF document into individual pages
  • dump_data: Extracts metadata, bookmarks and page labels from a PDF
  • uncompress: Decompresses page streams (given as an output option, after the output file name)
  • attach_files: Includes attachments in a PDF document
  • unpack_files: Extracts attachments from a PDF document
  • fill_form: Fills in PDF forms with FDF or XFDF data
  • background: Applies a PDF watermark to the background of a single input PDF
  • stamp: Behaves just like the background operation, except that it overlays the stamp PDF page on top of the input PDF document’s pages
  • generate_fdf: Reads a single input PDF file and generates an FDF file suitable for fill_form
  • dump_data_fields: Reads a single input PDF file and reports its form field statistics
  • update_info: Updates metadata
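The dump_data and update_info operations work well as a pair: dump the metadata to a text file, edit it, then write it back. Here is a minimal sketch of that round trip. The helper names export_pdf_info and apply_pdf_info and all file names are made up for illustration; only the pdftk operations themselves come from the list above.

```shell
# Hypothetical helpers; the names are illustrative and not part of pdftk.

# Write the metadata, bookmarks and page information of a PDF to a text file.
# Usage: export_pdf_info input.pdf info.txt
export_pdf_info() {
    pdftk "$1" dump_data output "$2"
}

# Apply an (edited) info file to the PDF and write the result to a new file.
# Usage: apply_pdf_info input.pdf info.txt updated.pdf
apply_pdf_info() {
    pdftk "$1" update_info "$2" output "$3"
}
```

A typical session would be export_pdf_info report.pdf info.txt, a quick edit of the InfoKey/InfoValue pairs in info.txt, and then apply_pdf_info report.pdf info.txt report_fixed.pdf. The input file is never modified in place.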

PDFtk Examples

Joining files

Suppose that we want to merge two documents (1.pdf and 2.pdf) into a single file (both.pdf). The command is:

pdftk 1.pdf 2.pdf cat output both.pdf
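The same command takes any number of input files, so it also works with a shell wildcard. As a small sketch, the hypothetical helper merge_pdfs below (the name is ours, not pdftk's) takes the output name first and then the inputs, which are merged in the order given.

```shell
# Hypothetical helper; the name is illustrative and not part of pdftk.
# Merge every PDF listed after the output name, in argument order.
# Usage: merge_pdfs ../combined.pdf chapter-*.pdf
merge_pdfs() {
    out=$1
    shift
    pdftk "$@" cat output "$out"
}
```

Two things to keep in mind with wildcards: the shell expands them in alphabetical order, so 10.pdf sorts before 2.pdf unless the names are zero-padded, and the output file should not match the pattern itself, since pdftk will not write over one of its inputs.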

The cat operation assembles pages from the input PDFs to create a new PDF. Use cat to merge PDF pages or to split PDF pages from documents. You can also use it to rotate PDF pages. The order of the page ranges you give determines the page order of the new PDF, which is useful in cases like the following.

Merge specific pages from different files

Suppose that we want to merge two documents (1.pdf and 2.pdf) in this way: the first 2 pages of 1.pdf followed by the even pages from 10 to 20 of 2.pdf. The command is:

pdftk A=1.pdf B=2.pdf cat A1-2 B10-20even output out.pdf
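With a single input file the handle (A, B) can be left out, and the same page range syntax extracts part of a document. A minimal sketch, assuming a made-up helper name extract_pages:

```shell
# Hypothetical helper; the name is illustrative and not part of pdftk.
# Copy pages FIRST through LAST of a PDF into a new file.
# LAST may be the pdftk keyword "end", meaning the final page.
# Usage: extract_pages input.pdf 3 7 excerpt.pdf
extract_pages() {
    pdftk "$1" cat "$2-$3" output "$4"
}
```

For example, extract_pages guide.pdf 14 end rest.pdf would keep everything from page 14 onward.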

Splitting files

It’s also possible to split PDF files with pdftk. The burst operation breaks a PDF into multiple files, one file for each page:

pdftk mylong_guide.pdf burst

This command creates one file per page, named after the page number, such as pg_0001.pdf to pg_0125.pdf. It also writes a report named doc_data.txt with the same content that dump_data produces.
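If you would rather not fill the current directory with page files, burst accepts a printf-style pattern after output. The sketch below uses a hypothetical helper burst_into (our name, not pdftk's) and an arbitrary page_%03d.pdf pattern to put the pages in a directory of their own.

```shell
# Hypothetical helper; the name and the page_%03d.pdf pattern are illustrative.
# Split a PDF into one file per page inside the given directory.
# Usage: burst_into mylong_guide.pdf pages
burst_into() {
    mkdir -p "$2" || return 1
    # %03d is replaced by the zero-padded page number: page_001.pdf, page_002.pdf, ...
    pdftk "$1" burst output "$2/page_%03d.pdf"
}
```

Choose a pattern wide enough for the document: three digits keep the files sorted correctly up to 999 pages.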

Security

Encrypt a PDF document with a 128-bit key and remove all rights (default):

pdftk mydoc.pdf output mycrypted_doc.128.pdf owner_pw foo

Same as above, except that the password baz must also be used to open the output PDF:

pdftk mydoc.pdf output mycrypted_doc.128.pdf owner_pw foo user_pw baz

Decrypt a PDF:

pdftk mycrypted_doc.128.pdf input_pw foo output decrypted.pdf
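The examples above put the passwords on the command line, where they end up in the shell history. According to the pdftk manual, the keyword PROMPT can be given in place of a password to be asked for it interactively, and allow restores individual rights such as printing. A minimal sketch combining the two, with a made-up helper name encrypt_printable:

```shell
# Hypothetical helper; the name is illustrative and not part of pdftk.
# Encrypt a PDF, ask for the owner password interactively, and keep
# printing allowed (all other rights stay removed by default).
# Usage: encrypt_printable mydoc.pdf mydoc.protected.pdf
encrypt_printable() {
    pdftk "$1" output "$2" owner_pw PROMPT allow printing
}
```

If your pdftk build behaves differently with PROMPT, check its man page; the rest of the command is the same as in the first encryption example.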

Adding attachments

This feature is useful for shipping a document in another format, images, or additional information along with a published PDF.
pdftk can attach binary and text files to a PDF with ease. You can even specify which page of the PDF the attachment should appear on, with a command like this:

pdftk html_tidy.pdf attach_files command_ref.html to_page 24 output html_tidy_book.pdf

This attaches the HTML file command_ref.html to page 24 of the input and writes the result to html_tidy_book.pdf.
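Getting the attachments back out is the job of unpack_files, which copies them to the current directory or to a directory given after output. A small sketch, assuming a hypothetical helper name unpack_attachments:

```shell
# Hypothetical helper; the name is illustrative and not part of pdftk.
# Copy all attachments of a PDF into the given directory.
# Usage: unpack_attachments html_tidy_book.pdf attachments
unpack_attachments() {
    mkdir -p "$2" || return 1
    pdftk "$1" unpack_files output "$2"
}
```

Running unpack_attachments html_tidy_book.pdf attachments on the file created above should leave command_ref.html in the attachments directory.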

Conclusions

You probably don’t need this functionality every day, but if you have to manipulate PDF files in Linux, pdftk is your tool.
With it you can handle merging, splitting, encryption and attachments from the command line with very little effort.


