Extract Text With OCR

This workflow uses Optical Character Recognition (OCR) to extract text from images, including image-only PDFs. It can also segment the extracted text, if necessary, and add embeddings to ApertureDB using a pre-trained model. You can then search with a query text or image to find similar text segments, and from those find the underlying image or PDF document.
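To make the segment, embed, and search idea concrete, here is a minimal, self-contained sketch. It is not the workflow's implementation: the OCR output is hypothetical sample data, `segment_text` is one simple windowing strategy chosen for illustration, and `toy_embed` is a bag-of-words stand-in for the pre-trained embedding model that the workflow would actually use.

```python
import math
import re
from collections import Counter


def segment_text(text, max_words=8, overlap=2):
    """Split text into overlapping word windows (an illustrative strategy)."""
    words = text.split()
    step = max_words - overlap
    segments = []
    for start in range(0, len(words), step):
        segments.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return segments


def toy_embed(text):
    """Stand-in for a pre-trained model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, index, k=1):
    """Return the k indexed segments most similar to the query text."""
    q = toy_embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item["embedding"]), reverse=True)
    return ranked[:k]


# Hypothetical OCR output, keyed by the source image or PDF.
ocr_output = {
    "invoice.pdf": "Invoice number 1042 total amount due 250 dollars payable within thirty days",
    "menu.png": "Lunch menu tomato soup grilled cheese sandwich fresh lemonade",
}

# Each segment keeps a link back to its source, so a match leads to the document.
index = []
for source, text in ocr_output.items():
    for segment in segment_text(text):
        index.append({"source": source, "text": segment, "embedding": toy_embed(segment)})

best = search("tomato soup", index, k=1)[0]
print(best["source"], "->", best["text"])
```

Keeping the source name alongside each segment is what lets a similarity match lead back to the underlying image or PDF. In ApertureDB, the segments and embeddings are stored in the database rather than in an in-memory list.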

This video shows how to use this workflow to extract text from images and PDFs.

Creating the workflow

Creating and deleting workflows

For general information on creating workflows in ApertureDB Cloud, see Creating and Deleting Workflows.

Choose which media you want to extract text from: images, PDFs, or both. You must select at least one. Note that text PDFs are ignored; only image-only PDFs are processed by this workflow.
Select an OCR method from the available models.
In addition to extracting text, you can optionally segment the text and add embeddings.
Click the blue button at the bottom.

See the GitHub repository for more information

For more detail about what this workflow does, additional information about its parameters, and how to run the workflow outside of ApertureDB Cloud, see the ocr-extraction documentation on GitHub.

See the results

If you go to the "My Instances" page and click on "Connect" for the instance you used, you will see an option to go to the Web UI for your instance. You will see the number of descriptors in the database increase as the workflow runs. Click on the refresh button to update the count.
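The descriptor count can also be checked programmatically. The sketch below is hedged: it assumes the ApertureDB query language's `FindDescriptor` command with a `count` result, and it assumes `client` is any object whose `query` method returns a `(response, blobs)` pair, such as a connector from the `aperturedb` Python package. The helper names `build_count_query` and `count_descriptors`, and the exact response shape, are illustrative assumptions and not taken from this page.

```python
def build_count_query(set_name=None):
    """Build a query asking for the number of descriptors.

    Assumption: FindDescriptor accepts an optional "set" name and a
    "results" block with "count"; check the query language reference.
    """
    params = {"results": {"count": True}}
    if set_name is not None:
        params["set"] = set_name
    return [{"FindDescriptor": params}]


def count_descriptors(client, set_name=None):
    """Run the count query on a client exposing query() -> (response, blobs)."""
    response, _blobs = client.query(build_count_query(set_name))
    # Assumed response shape: [{"FindDescriptor": {"count": <int>, ...}}]
    return response[0]["FindDescriptor"]["count"]
```

Calling `count_descriptors(client)` repeatedly while the workflow runs should show the same growing number as the refresh button in the Web UI, provided the assumptions above hold for your instance.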

You can also run a "find similar" search using the "SEMANTIC SEARCH" tab in the Web UI.
