Custom HTML to PDF document generation with iTextSharp

A common requirement is the ability to create PDF documents in code, usually from some HTML source. This example shows how this can be achieved with iTextSharp's XMLWorkerHelper, which ships in the XML Worker package (namespace iTextSharp.tool.xml).

To begin, you will need to install the iTextSharp NuGet package:

PM> Install-Package iTextSharp

You will also need to install the XML Worker package:

PM> Install-Package itextsharp.xmlworker

There are a few caveats to this method, however. The main one is that the HTML must be valid XHTML, which makes sense since we are actually using an XML worker.
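
Since malformed markup is the most likely cause of a failed render, it can help to check the input before handing it to iTextSharp. The following is a minimal sketch that uses only System.Xml; the class and method names (XhtmlCheck, IsWellFormed) are hypothetical and not part of iTextSharp. It only tests that the markup is well-formed XML, which is a weaker condition than valid XHTML, and named entities such as &nbsp; will be reported as errors because the DOCTYPE is ignored.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XhtmlCheck
{
    // Hypothetical helper (not part of iTextSharp): returns true when the
    // markup parses as well-formed XML; otherwise returns false and
    // exposes the parser's message through the out parameter.
    public static bool IsWellFormed(string markup, out string error)
    {
        error = null;

        // Ignore any DOCTYPE instead of trying to resolve its DTD.
        var settings = new XmlReaderSettings { DtdProcessing = DtdProcessing.Ignore };

        try
        {
            using (var reader = XmlReader.Create(new StringReader(markup), settings))
            {
                // Walk the whole document; the reader throws on the first error.
                while (reader.Read()) { }
            }
            return true;
        }
        catch (XmlException ex)
        {
            error = ex.Message;
            return false;
        }
    }
}
```

For example, IsWellFormed("<p>Hello<br/>world</p>", out error) returns true, while the same markup with an unclosed <br> returns false and fills in the error message.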

In the most basic case, let's assume everything is contained in one chunk of valid XHTML, with all the styles inline and no external assets. The following block of code parses the XHTML and writes it to the PDF document.

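using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

public byte[] Render(string contentHtml)
{
    byte[] data;

    using (MemoryStream ms = new MemoryStream())
    {
        using (Document doc = new Document())
        {
            // The writer streams the PDF into the MemoryStream.
            PdfWriter writer = PdfWriter.GetInstance(doc, ms);

            doc.Open();

            using (var srHtml = new StringReader(contentHtml))
            {
                // Parse the XHTML and add the resulting elements to the document.
                XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, srHtml);
            }
        }
        // Read the bytes only after the document has been disposed (closed),
        // otherwise the PDF is incomplete.
        data = ms.ToArray();
    }
    return data;
}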

Because the source is XHTML, it can easily be generated or manipulated in code, or taken from some other source, so long as it remains valid markup.
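
As an illustration of generating the markup in code, here is a small sketch that builds an XHTML fragment with LINQ to XML. The XhtmlBuilder class and its Build method are hypothetical names chosen for this example. Building the markup from XElement objects rather than string concatenation means special characters are escaped and tags are always closed.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XhtmlBuilder
{
    // Hypothetical helper: builds a heading and a bullet list as an XHTML
    // fragment with an inline style on the heading.
    public static string Build(string title, IEnumerable<string> items)
    {
        var body = new XElement("div",
            new XElement("h1", new XAttribute("style", "color:#333333;"), title),
            new XElement("ul", items.Select(i => new XElement("li", i))));

        // DisableFormatting keeps the output on one line with no added whitespace.
        return body.ToString(SaveOptions.DisableFormatting);
    }
}
```

Calling XhtmlBuilder.Build("A & B", new[] { "x", "y" }) produces markup in which the ampersand has been escaped as &amp;amp;, and the resulting string could then be passed to the Render method above as contentHtml.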

Another common requirement is for custom headers or footers, again using XHTML as the source. iTextSharp provides page events that you can hook into to render header and footer content; all you need to do is extend PdfPageEventHelper like so:

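public class PDFHeaderFooter : PdfPageEventHelper
{
    protected ElementList header;

    public PDFHeaderFooter(string headerHtml)
    {
        // Parse the header XHTML once; no separate CSS is supplied.
        header = XMLWorkerHelper.ParseToElementList(headerHtml, null);
    }

    public override void OnStartPage(PdfWriter writer, Document document)
    {
        base.OnStartPage(writer, document);

        PdfPTable headerTbl = new PdfPTable(1);
        // WriteSelectedRows positions the table absolutely and needs an
        // absolute TotalWidth; WidthPercentage is not used here.
        headerTbl.TotalWidth = document.Right - document.Left;

        PdfPCell cell = new PdfPCell();
        cell.Border = 0;

        foreach (IElement e in header)
        {
            cell.AddElement(e);
        }

        headerTbl.AddCell(cell);

        // The table is drawn downwards from the top edge of the page, so the
        // document's top margin must be tall enough to hold the header.
        headerTbl.WriteSelectedRows(0, -1, document.Left, document.PageSize.Top, writer.DirectContent);
    }
}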

Now attach the custom page event to the writer. The PDF generation method then looks as follows:

public byte[] Render(string contentHtml, string headerHtml)
{
    byte[] data;

    using (MemoryStream ms = new MemoryStream())
    {
        using (Document doc = new Document())
        {
            PdfWriter writer = PdfWriter.GetInstance(doc, ms);
            // Register the page event before opening the document so the
            // header is drawn on the first page as well.
            writer.PageEvent = new PDFHeaderFooter(headerHtml);

            doc.Open();

            using (var srHtml = new StringReader(contentHtml))
            {
                XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, srHtml);
            }
        }

        // Read the bytes only after the document has been disposed (closed).
        data = ms.ToArray();
    }

    return data;
}

That is, of course, a very simplistic case, but the setup can be extended to implement extra features as required.
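
One such extension is a footer. The sketch below is an assumption about how this might be done rather than part of the original example: a hypothetical PageNumberFooter class overrides OnEndPage and draws a centred page number below the bottom margin with ColumnText.ShowTextAligned. The 20 point offset is an arbitrary value and would need adjusting to suit the document's margins.

```csharp
using iTextSharp.text;
using iTextSharp.text.pdf;

// Hypothetical page event that prints "Page N" centred in the bottom margin.
public class PageNumberFooter : PdfPageEventHelper
{
    public override void OnEndPage(PdfWriter writer, Document document)
    {
        base.OnEndPage(writer, document);

        // Horizontal centre of the content area.
        float x = (document.Left + document.Right) / 2f;
        // 20 points below the bottom margin line (arbitrary offset).
        float y = document.Bottom - 20f;

        ColumnText.ShowTextAligned(
            writer.DirectContent,
            Element.ALIGN_CENTER,
            new Phrase("Page " + writer.PageNumber),
            x, y, 0f);
    }
}
```

It would be attached in the same way as the header event, by assigning an instance to writer.PageEvent before the document is opened.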