Creating PDF documents using wkhtmltopdf

October 17, 2022

I often use wkhtmltopdf to generate PDFs from web content; a common use case is generating invoices. More recently, I have also used it to generate data remediation documents that help businesses comply with some of their regulatory requirements.

The project's website is https://wkhtmltopdf.org/

The project consists of two utilities: wkhtmltopdf to generate PDFs and wkhtmltoimage to generate images. wkhtmltopdf describes itself as:

... an open source (LGPLv3) command line tool to render HTML into PDF and various image formats using the Qt WebKit rendering engine. It runs entirely "headless" and does not require a display or display service.

However, on our Linux virtual machines in Azure, I've always installed xvfb (X virtual framebuffer) to provide a virtual graphics buffer, which is perfect for VMs that don't have a graphics adapter or screen.

It is easy enough to install from the command line using:

sudo apt install xvfb

Then, install wkhtmltopdf:

sudo apt install wkhtmltopdf
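
Before relying on either tool from code, it can help to confirm that both binaries are actually on the PATH. The following is a minimal sketch, not part of either package: commandExists() is a hypothetical helper name, and it assumes a POSIX shell where the `command -v` builtin is available (as it is on typical Linux VMs).

```php
<?php
// Hypothetical helper: returns true if $name resolves to a command on the PATH.
// Assumes exec() runs through a POSIX shell that provides `command -v`.
function commandExists(string $name): bool
{
    exec('command -v '.escapeshellarg($name).' 2>/dev/null', $output, $exitCode);
    return $exitCode === 0;
}

// Report on the two tools installed above.
foreach (['xvfb-run', 'wkhtmltopdf'] as $tool) {
    echo $tool.': '.(commandExists($tool) ? 'found' : 'missing').PHP_EOL;
}
```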

In most cases, I call it via an exec-style function, which most programming languages provide. Always be careful about what you are executing before actually running these commands, particularly when any part of the command comes from user input!

// Build the command that will run via PHP.
// escapeshellarg() passes each value as a single quoted argument, but you should
// still validate $path and $filename before using them!
$wkcmd = 'xvfb-run wkhtmltopdf '.escapeshellarg($path).' '.escapeshellarg($filename);
// Execute the command.
exec($wkcmd);
// In this instance, I'm using PDFBox to encrypt the output PDF with the user's ID number as the password.
exec('java -jar pdfbox-app-2.0.21.jar Encrypt -U '.escapeshellarg($pii['id_number']).' '.escapeshellarg($filename));
// Do whatever you need to do with the generated PDF, e.g. email it or store it.
// Don't forget to clean up the file system afterwards!
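
The snippet above can be hardened a little further. The following is a minimal sketch of one way to validate the inputs and check whether the conversion worked. The function names (assertSafeInputs, buildWkCommand, renderPdf) are hypothetical, and the rules are assumptions you should adapt: only http(s) sources are accepted, so a caller cannot point wkhtmltopdf at local files, and the output file name is restricted to a conservative character set.

```php
<?php
// Hypothetical validation rules: adjust them to your own application.
function assertSafeInputs(string $source, string $target): void
{
    // Assumption: only remote http(s) pages should ever be rendered.
    $scheme = parse_url($source, PHP_URL_SCHEME);
    if (!in_array($scheme, ['http', 'https'], true)) {
        throw new InvalidArgumentException('Only http(s) sources are allowed');
    }
    // Assumption: output names are simple, e.g. invoice_42.pdf.
    if (!preg_match('/\A[A-Za-z0-9][A-Za-z0-9_-]*\.pdf\z/', basename($target))) {
        throw new InvalidArgumentException('Unexpected output file name');
    }
}

// Builds the command line, quoting each value as a single shell argument.
function buildWkCommand(string $source, string $target): string
{
    return 'xvfb-run wkhtmltopdf '.escapeshellarg($source).' '.escapeshellarg($target);
}

// Runs the conversion and fails loudly if it did not produce a PDF.
function renderPdf(string $source, string $target): void
{
    assertSafeInputs($source, $target);
    // 2>&1 folds error messages into $output so they can be reported.
    exec(buildWkCommand($source, $target).' 2>&1', $output, $exitCode);
    if ($exitCode !== 0 || !is_file($target)) {
        throw new RuntimeException(
            "wkhtmltopdf failed (exit code $exitCode):\n".implode("\n", $output)
        );
    }
}
```

A call might look like renderPdf('https://example.com/invoice/42', '/tmp/invoice_42.pdf'), where both values are placeholders. Treating every non-zero exit code as a failure is a deliberately strict choice for this sketch; you may prefer to log it and continue if the PDF was still written.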

Most of the code I have written uses PHP to generate the HTML and render it into a PDF in a temporary directory and, if necessary, makes use of PDFBox to apply additional transformations, like password-protecting the PDF.
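
That temporary-directory workflow might be sketched as follows. This is an illustration under stated assumptions rather than the code used in production: htmlToPdf() is a hypothetical helper, the optional $render parameter exists only so the converter can be swapped out (for example in tests), and the files are written to the system temp directory with a random name. The input file is given an .html extension so that it is treated as HTML.

```php
<?php
// Hypothetical helper: renders an HTML string and returns the PDF bytes.
function htmlToPdf(string $html, ?callable $render = null): string
{
    // Default renderer shells out to wkhtmltopdf through xvfb-run.
    $render = $render ?? function (string $in, string $out): void {
        $cmd = 'xvfb-run wkhtmltopdf '.escapeshellarg($in).' '.escapeshellarg($out);
        exec($cmd.' 2>&1', $output, $exitCode);
        if ($exitCode !== 0) {
            throw new RuntimeException("wkhtmltopdf failed (exit code $exitCode)");
        }
    };

    // Random base name in the system temp directory.
    $base = sys_get_temp_dir().'/wk_'.bin2hex(random_bytes(8));
    $htmlFile = $base.'.html';
    $pdfFile = $base.'.pdf';

    try {
        file_put_contents($htmlFile, $html);
        $render($htmlFile, $pdfFile);
        if (!is_file($pdfFile)) {
            throw new RuntimeException('No PDF was produced');
        }
        return file_get_contents($pdfFile);
    } finally {
        // Clean up the file system whether or not rendering succeeded.
        foreach ([$htmlFile, $pdfFile] as $file) {
            if (is_file($file)) {
                unlink($file);
            }
        }
    }
}
```

The returned string can then be emailed or stored. If the PDF needs to be password-protected with PDFBox, that step would go between rendering and reading the file back, before the cleanup runs.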