news: 6

| rowid | date | body |
| --- | --- | --- |
| 6 | 2022-06-30 | [s3-ocr](https://datasette.io/tools/s3-ocr) is a new tool which can run OCR (via Amazon Textract) against every PDF file in an S3 bucket and write the results to a searchable SQLite database, ready to use with Datasette. Read more about it in [s3-ocr: Extract text from PDF files stored in an S3 bucket](https://simonwillison.net/2022/Jun/30/s3-ocr/). |