About this Catalog

The Alfresco Addons Catalog is a community-driven space where customers, partners, and community members can showcase add-ons, solutions, and real-world use cases built around Alfresco. Browse the catalog to discover extensions and ideas that may inspire or accelerate your projects.


Want to share your own work?

Click “+ Submit Entry” to suggest a new listing. Every submission is reviewed by the Hyland Team before it appears here.

Alfresco OCR Transform Engine

by Angel Borroy

Community

Alfresco Transform Engine that converts PDF files to searchable, text-layer PDFs using OCR (ocrmypdf / Tesseract). Compatible with ACS 7.0+ as a local or async T-Engine.

Compatibility: ACS 7.x, ACS 23.x, ACS 25.x

License: GPL-2.0

Keywords: transformer, ocr, pdf, tesseract, text-layer

Download

About

Runs ocrmypdf (a Tesseract wrapper) inside an Alfresco Transform Engine to produce OCR’d PDFs.

  • Original PDF kept as version 1.0; OCR’d version saved as 1.1
  • Includes a companion embed-metadata Repository add-on that adds the OCR action to Alfresco Share folder rules
  • Configurable ocrmypdf arguments (e.g. --skip-text, --force-ocr, language)
  • Deployable as a local T-Engine (Community) or async T-Engine via ActiveMQ (Enterprise)
  • Community: add localTransform.ocr.url=http://transform-ocr:8090/ to Alfresco JAVA_OPTS
  • Enterprise: register URL + queue (ocr-engine-queue) with the Transform Router
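The configurable ocrmypdf arguments listed above can be illustrated with a minimal Python sketch that assembles an ocrmypdf command line. This is not the add-on's actual code: the function name build_ocrmypdf_command, its parameters, and the file names are hypothetical, and only the ocrmypdf flags themselves (-l, --skip-text, --force-ocr) come from the ocrmypdf CLI.

```python
import shlex

# Hypothetical helper for illustration; not part of the add-on.
# --skip-text and --force-ocr are alternative ocrmypdf modes,
# so at most one of them is added to the command.
OCR_MODES = {
    "default": None,            # ocrmypdf's standard behaviour
    "skip-text": "--skip-text", # leave pages that already contain text untouched
    "force-ocr": "--force-ocr", # rasterize and OCR every page
}


def build_ocrmypdf_command(source, target, language="eng", mode="skip-text"):
    """Return the argument list for a single ocrmypdf run."""
    if mode not in OCR_MODES:
        raise ValueError(f"unknown OCR mode: {mode!r}")
    command = ["ocrmypdf", "-l", language]
    flag = OCR_MODES[mode]
    if flag is not None:
        command.append(flag)
    command += [source, target]
    return command


if __name__ == "__main__":
    # Print the command as it would be typed in a shell.
    print(shlex.join(build_ocrmypdf_command("input.pdf", "output.pdf")))
```

The resulting list could be passed to subprocess.run on a machine where ocrmypdf is installed. Keeping the mode as a single choice avoids passing both flags at once, which ocrmypdf rejects.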
