wkhtmltopdf (1) - Linux Manuals

Name

wkhtmltopdf - html to pdf converter

Synopsis


  wkhtmltopdf [GLOBAL OPTION]... [OBJECT]... <output file>

Document objects

wkhtmltopdf is able to put several objects into the output file, an object is either a single webpage, a cover webpage or a table of contents. The objects are put into the output document in the order they are specified on the command line, options can be specified on a per object basis or in the global options area. Options from the Global Options section can only be placed in the global options area.

A page objects puts the content of a single webpage into the output document.


  (page)? <input url/file name> [PAGE OPTION]...

Options for the page object can be placed in the global options and the page options areas. The applicable options can be found in the Page Options and Headers And Footer Options sections.

A cover objects puts the content of a single webpage into the output document, the page does not appear in the table of contents, and does not have headers and footers.


  cover <input url/file name> [PAGE OPTION]...

All options that can be specified for a page object can also be specified for a cover.

A table of contents object inserts a table of contents into the output document.


  toc [TOC OPTION]...

All options that can be specified for a page object can also be specified for a toc, further more the options from the TOC Options section can also be applied. The table of contents is generated via XSLT which means that it can be styled to look however you want it to look. To get an idea of how to do this you can dump the default xslt document by supplying the --dump-default-toc-xsl, and the outline it works on by supplying --dump-outline, see the Outline Options section.

Description

Converts one or more HTML pages into a PDF document, not using wkhtmltopdf patched qt.

Global Options

--collate
Collate when printing multiple copies
--no-collate
Do not collate when printing multiple copies
--cookie-jar <path>
Read and write cookies from and to the supplied cookie jar file
--copies <number>
Number of copies to print into the pdf file
-d, --dpi <dpi>
Change the dpi explicitly (this has no effect on X11 based systems)
-H, --extended-help
Display more extensive help, detailing less common command switches
-g, --grayscale
PDF will be generated in grayscale
-h, --help
Display help
--htmldoc
Output program html help
--license
Output license information and exit
--log-level <level>
Set log level to: none, error, warn or info
-l, --lowquality
Generates lower quality pdf/ps. Useful to shrink the result document space
--manpage
Output program man page
-B, --margin-bottom <unitreal>
Set the page bottom margin
-L, --margin-left <unitreal>
Set the page left margin
-R, --margin-right <unitreal>
Set the page right margin
-T, --margin-top <unitreal>
Set the page top margin
-O, --orientation <orientation>
Set orientation to Landscape or Portrait
--page-height <unitreal>
Page height
-s, --page-size <Size>
Set paper size to: A4, Letter, etc.
--page-width <unitreal>
Page width
-q, --quiet
Be less verbose, maintained for backwards compatibility; Same as using --log-level none
--read-args-from-stdin
Read command line arguments from stdin
--readme
Output program readme
--title <text>
The title of the generated pdf file (The title of the first document is used if not specified)
-V, --version
Output version information and exit

Page Options

--allow <path>
Allow the file or files from the specified folder to be loaded (repeatable)
--background
Do print background
--no-background
Do not print background
--bypass-proxy-for <value>
Bypass proxy for host (repeatable)
--cache-dir <path>
Web cache directory
--checkbox-checked-svg <path>
Use this SVG file when rendering checked checkboxes
--checkbox-svg <path>
Use this SVG file when rendering unchecked checkboxes
--cookie <name> <value>
Set an additional cookie (repeatable), value should be url encoded.
--custom-header <name> <value>
Set an additional HTTP header (repeatable)
--custom-header-propagation
Add HTTP headers specified by --custom-header for each resource request.
--no-custom-header-propagation
Do not add HTTP headers specified by --custom-header for each resource request.
--debug-javascript
Show javascript debugging output
--no-debug-javascript
Do not show javascript debugging output
--encoding <encoding>
Set the default text encoding, for input
--images
Do load or print images
--no-images
Do not load or print images
-n, --disable-javascript
Do not allow web pages to run javascript
--enable-javascript
Do allow web pages to run javascript
--javascript-delay <msec>
Wait some milliseconds for javascript finish
--load-error-handling <handler>
Specify how to handle pages that fail to load: abort, ignore or skip
--load-media-error-handling <handler>
Specify how to handle media files that fail to load: abort, ignore or skip
--disable-local-file-access
Do not allowed conversion of a local file to read in other local files, unless explicitly allowed with --allow
--enable-local-file-access
Allowed conversion of a local file to read in other local files.
--minimum-font-size <int>
Minimum font size
--page-offset <offset>
Set the starting page number
--password <password>
HTTP Authentication password
--disable-plugins
Disable installed plugins
--enable-plugins
Enable installed plugins (plugins will likely not work)
--post <name> <value>
Add an additional post field (repeatable)
--post-file <name> <path>
Post an additional file (repeatable)
-p, --proxy <proxy>
Use a proxy
--proxy-hostname-lookup
Use the proxy for resolving hostnames
--radiobutton-checked-svg <path>
Use this SVG file when rendering checked radiobuttons
--radiobutton-svg <path>
Use this SVG file when rendering unchecked radiobuttons
--run-script <js>
Run this additional javascript after the page is done loading (repeatable)
--ssl-crt-path <path>
Path to the ssl client cert public key in OpenSSL PEM format, optionally followed by intermediate ca and trusted certs
--ssl-key-password <password>
Password to ssl client cert private key
--ssl-key-path <path>
Path to ssl client cert private key in OpenSSL PEM format
--stop-slow-scripts
Stop slow running javascripts
--no-stop-slow-scripts
Do not Stop slow running javascripts
--user-style-sheet <path>
Specify a user style sheet, to load with every page
--username <username>
HTTP Authentication username
--window-status <windowStatus>
Wait until window.status is equal to this string before rendering page
--zoom <float>
Use this zoom factor

Specifying A Proxy

By default proxy information will be read from the environment variables: proxy, all_proxy and http_proxy, proxy options can also by specified with the -p switch


  <type> := "http://" | "socks5://"
  <serif> := <username> (":" <password>)? "@"
  <proxy> := "None" | <type>? <string>? <host> (":" <port>)?

Here are some examples (In case you are unfamiliar with the BNF):


  http://user:password@myproxyserver:8080
  socks5://myproxyserver
  None

Reduced Functionality

This version of wkhtmltopdf has been compiled against a version of QT without the wkhtmltopdf patches. Therefore some features are missing, if you need these features please use the static version.

Currently the list of features only supported with patch QT includes:


 * Printing more than one HTML document into a PDF file.
 * Running without an X11 server.
 * Adding a document outline to the PDF file.
 * Adding headers and footers to the PDF file.
 * Generating a table of contents.
 * Adding links in the generated PDF file.
 * Printing using the screen media-type.
 * Disabling the smart shrink feature of WebKit.

Page sizes

The default page size of the rendered document is A4, but by using the --page-size option this can be changed to almost anything else, such as: A3, Letter and Legal. For a full list of supported pages sizes please see <https://qt-project.org/doc/qt-4.8/qprinter.html#PaperSize-enum>.

For a more fine grained control over the page size the --page-height and --page-width options may be used

Reading arguments from stdin

If you need to convert a lot of pages in a batch, and you feel that wkhtmltopdf is a bit too slow to start up, then you should try --read-args-from-stdin,

When --read-args-from-stdin each line of input sent to wkhtmltopdf on stdin will act as a separate invocation of wkhtmltopdf, with the arguments specified on the given line combined with the arguments given to wkhtmltopdf

For example one could do the following:


  echo "https://qt-project.org/doc/qt-4.8/qapplication.html qapplication.pdf" >> cmds
  echo "cover google.com https://en.wikipedia.org/wiki/Qt_(software) qt.pdf" >> cmds
  wkhtmltopdf --read-args-from-stdin --book < cmds

Page Breaking

The current page breaking algorithm of WebKit leaves much to be desired. Basically WebKit will render everything into one long page, and then cut it up into pages. This means that if you have two columns of text where one is vertically shifted by half a line. Then WebKit will cut a line into to pieces display the top half on one page. And the bottom half on another page. It will also break image in two and so on. If you are using the patched version of QT you can use the CSS page-break-inside property to remedy this somewhat. There is no easy solution to this problem, until this is solved try organizing your HTML documents such that it contains many lines on which pages can be cut cleanly.

Contact

If you experience bugs or want to request new features please visit <https://wkhtmltopdf.org/support.html>

Authors


  
  Jakob Truelsen              <antialize [at] gmail.com>
  Ashish Kulkarni             <ashish [at] kulkarni.dev>
  Jan Habermann               <jan [at] habermann24.com>
  Pablo Ruiz García           <pablo.ruiz [at] gmail.com>
  Trevor North                <trevor [at] blubolt.com>
  Nate Pinchot                <nate.pinchot [at] gmail.com>
  pussbb                      <pussbb [at] gmail.com>
  Aaron Stone                 <aaron [at] serendipity.cx>
  Patrick Widauer             @a-ctor
  Peter van der Tak           <pta [at] ibuildgreen.eu>
  Benjamin Sinkula            <bsinky [at] gmail.com>
  Kasper F. Brandt            <poizan [at] poizan.dk>
  Michael Nitze               <michael.nitze [at] online.de>
  Rok Dvojmoc                 <rok.dvojmoc [at] gmail.com>
  theirix                     <theirix [at] gmail.com>
  Tomsgu                      <tomasjakll [at] gmail.com>
  Artem Butusov               <art.sormy [at] gmail.com>
  Christian Sciberras         <uuf6429 [at] gmail.com>
  Daniel M. Lambea            <dmlambea [at] gmail.com>
  Douglas Bagnall             <douglas [at] paradise.net.nz>
  peterrehm                   <peter.rehm [at] renvest.de>
  Renan Gonçalves             <renan.saddam [at] gmail.com>
  Ruslan Grabovoy             <kudgo.test [at] gmail.com>
  Sander Kleykens             <sander.kleykens [at] avnu.be>
  Adam Thorsen                <adam.thorsen [at] gmail.com>
  Albin Kerouanton            <albin.kerouanton [at] knplabs.com>
  Alejandro Dubrovsky         <alito [at] organicrobot.com>
  Arthur Cinader              @acinader
  Benoit Garret               <benoit.garret [at] gmail.com>
  Bill Kuker                  <bkuker [at] billkuker.com>
  cptjazz                     <alexander [at] jesner.eu>
  daigot                      <daigot [at] rayze.com>
  Destan Sarpkaya             @destan
  Duncan Smart                <duncan.smart [at] gmail.com>
  Emil Lerch                  <emil [at] lerch.org>
  Erik Hyrkas                 <erik.hyrkas [at] thomsonreuters.com>
  Erling Linde                <erlingwl [at] gmail.com>
  Fábio C. Barrionuevo da Luz <bnafta [at] gmail.com>
  Fr33m1nd                    <lukion [at] gmx.de>
  Frank Groeneveld            <frank [at] frankgroeneveld.nl>
  Immanuel Häussermann        <haeussermann [at] gmail.com>
  Jake Petroules              <jake.petroules [at] petroules.com>
  James Macdonald             <james [at] kingfisher-systems.co.uk>
  Jason Smith                 <JasonParallel [at] gmail.com>
  John Muccigrosso            @Jmuccigr
  Julien Le Goff              <julego [at] gmail.com>
  Kay Lukas                   <kay.lukas [at] gmail.com>
  Kurt Revis                  <krevis [at] snoize.com>
  laura                       @holamon
  Marc Laporte                <marc [at] laporte.name>
  Matthew M. Boedicker        <matthewm [at] boedicker.org>
  Matthieu Bontemps           <matthieu.bontemps [at] gmail.com>
  Max Sikstrom                <max.sikstrom [at] op5.com>
  Nolan Neustaeter            <github [at] noolan.ca>
  Oleg Kostyuk                <cub.uanic [at] gmail.com>
  Pankaj Jangid               <pankaj.jangid [at] gmail.com>
  robinbetts                  <robinbetts [at] yahoo.com>
  Sem                         <spam [at] esemi.ru>
  Stefan Weil                 <sw [at] weilnetz.de>
  Stephen Kennedy             <sk4425 [at] gmail.com>
  Steve Shreeve               <steve.shreeve [at] gmail.com>
  Sven Nierlein               <sven [at] nierlein.org>
  Tobin Juday                 <tobinibot [at] gmail.com>
  Todd Fisher                 <todd.fisher [at] gmail.com>
  Костадин Дамянов            <maxmight [at] gmail.com>
  Emmanuel Bouthenot          <kolter [at] openics.org>
  Rami                        @icnocop
  Khodeir-hubdoc              @Khodeir-hubdoc
  Jonathan Jefferies          @jjok
  Joe Ayers                   <joseph.ayers [at] crunchydata.com>
  Jeffrey Cafferata           <jeffrey [at] jcid.nl>
  rainabba
  Mehdi Abbad
  Lyes Amazouz
  Pascal Bach
  Mário Silva