Open source
Background processing for long docs
Plug-in based pipelines
AI optional
Link and image handling
Built from real-world PDFs and automated tests
What it does
Lots of important information still arrives as PDFs. This tool helps you move that content onto the website as proper pages, without spending hours manually reformatting.
With the PDF importer you can:
- upload a PDF
- extract text, restore links, and pull out images into Drupal Media
- (optionally) use AI to add structure like headings, lists, tables, sensible pagination, and page titles
- save the result into Drupal in a consistent, reviewable format
Why it matters
PDFs can be hard to read on mobile, difficult to keep up to date, and risky for accessibility. Converting a single document into clean, structured HTML can take hours (sometimes days) if you’re doing it by hand. This module reduces that grind, so teams can publish faster and spend more time improving content quality.
“PDFs are chaos.”
That’s why the importer has automated tests using a library of real-world PDFs, so we can keep improving reliability as more organisations adopt it.
Built to be reusable across Drupal
This isn’t a one-off script or a brittle one-client solution. We’re actively working to uncouple the importer from LocalGov Drupal so it can benefit more Drupal sites and distributions.
The goal is a flexible “import engine” that can:
- support different content models (pages, publications, documents, knowledge bases)
- be configured for different document types and organisational needs
- remain open source and community-driven
The importer is up for an award!
How it works
At its simplest:
- Upload a PDF
- The importer runs its pipeline (in the background for longer documents)
- Editors review the output in Drupal and publish
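The steps above can be pictured as a small chain of interchangeable stages. The sketch below is illustrative only and is not the module's actual code (the module itself is a Drupal project): the `Document` class, the stage functions, and the 20-page background threshold are all hypothetical names and values chosen to show the plug-in idea.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Document:
    """Hypothetical import result that is handed from stage to stage."""
    source: str
    text: str = ""
    images: List[str] = field(default_factory=list)
    log: List[str] = field(default_factory=list)


# A stage is any callable that takes a Document and returns a Document.
Stage = Callable[[Document], Document]


def extract_text(doc: Document) -> Document:
    # Placeholder: a real stage would read the PDF here.
    doc.text = f"text extracted from {doc.source}"
    doc.log.append("extract_text")
    return doc


def extract_images(doc: Document) -> Document:
    # Placeholder: a real stage would save images as media items.
    doc.images.append("figure-1.png")
    doc.log.append("extract_images")
    return doc


def run_pipeline(doc: Document, stages: List[Stage]) -> Document:
    """Run each stage in order; stages can be added, removed, or swapped."""
    for stage in stages:
        doc = stage(doc)
    return doc


# Assumed cut-off for illustration; the real module's rule is not documented here.
BACKGROUND_PAGE_THRESHOLD = 20


def should_run_in_background(page_count: int,
                             threshold: int = BACKGROUND_PAGE_THRESHOLD) -> bool:
    """Longer documents are queued instead of processed during the upload request."""
    return page_count > threshold
```

Because every stage shares the same shape, an optional step such as AI structuring could be appended to the list for one site and left out for another without touching the rest of the chain.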
A key design decision is to run the AI structuring step once over the whole document (not page-by-page). That improves consistency (titles, headings, page breaks) and avoids awkward splits mid-table or mid-list.
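A toy example shows why the whole-document approach helps. The sketch below uses a simple rule-based grouping function as a stand-in for the AI step (an assumption made purely for illustration; `group_list_items` and the sample pages are hypothetical). A list that runs across a page break is split in two when each page is processed alone, but stays intact when the pages are joined first.

```python
from typing import List


def group_list_items(lines: List[str]) -> List[List[str]]:
    """Group consecutive lines that start with '- ' into lists."""
    groups: List[List[str]] = []
    current: List[str] = []
    for line in lines:
        if line.startswith("- "):
            current.append(line[2:])
        elif current:
            # A non-list line ends the list being collected.
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups


# Two PDF pages; the list continues over the page break.
pages = [
    ["Opening hours:", "- Monday", "- Tuesday"],
    ["- Wednesday", "Contact us for details."],
]

# Page-by-page: each page is structured in isolation.
per_page = [group for page in pages for group in group_list_items(page)]

# Whole document: pages are joined before structuring.
whole_document = group_list_items([line for page in pages for line in page])

print(per_page)        # [['Monday', 'Tuesday'], ['Wednesday']]
print(whole_document)  # [['Monday', 'Tuesday', 'Wednesday']]
```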
Open source and actively funded by partners
This module is being built in the open, with funding and collaboration:
- Prototype funded by Chicken
- v1.0 funded by Southwark Council
- v1.1 funded by West Lindsey District Council
If you want this tool to exist (and get better), partner funding is what makes that possible.
Want a demo or to explore co-funding?
Get in touch and we’ll show you the importer workflow, what it already handles well, and what we’d like to build next with partners.