We have two automations that can help you extract text from attachments:
Extract Text From PDFs: This automation works only with PDF files and extracts only the text embedded in the file. It cannot read text inside images within a PDF, for example when a document has been scanned as pictures.
Analyze Airtable Attachments with GPT Assistants: This automation uses OpenAI's GPT Assistants API and can analyze many kinds of attachments. It supports image files (e.g. png, jpg, gif, and webp), PDFs, Word files (.doc, .docx), text files, and several others. You can find out more about how this works here. The linked article also links to OpenAI's documentation, which has current lists of supported file types and known limitations. As of November 2024, GPT Assistants still do not support images within PDF files.
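
As a rough illustration of how to choose between the two automations, the rules above can be written as a small lookup. This is only a sketch: the function name `suitable_automations`, the `pdf_is_scanned` flag, and the `.txt` extension for "text files" are assumptions made for this example and are not part of either automation. The extension sets cover only the types named above, so check OpenAI's documentation for the current list.

```python
from pathlib import Path
from typing import List

PDF_AUTOMATION = "Extract Text From PDFs"
GPT_AUTOMATION = "Analyze Airtable Attachments with GPT Assistants"

# Only the file types named in the text above; ".txt" is an assumed
# extension for "text files". The real list of supported types is longer.
IMAGE_EXTENSIONS = {".png", ".jpg", ".gif", ".webp"}
DOCUMENT_EXTENSIONS = {".doc", ".docx", ".txt"}


def suitable_automations(filename: str, pdf_is_scanned: bool = False) -> List[str]:
    """Return the automations that could extract text from this attachment.

    An empty list means "not covered by the rules above", not necessarily
    "unsupported": the GPT Assistants API accepts other file types too.
    """
    ext = Path(filename).suffix.lower()
    if ext == ".pdf":
        # Neither automation reads text stored as images inside a PDF
        # (as of November 2024).
        if pdf_is_scanned:
            return []
        return [PDF_AUTOMATION, GPT_AUTOMATION]
    if ext in IMAGE_EXTENSIONS or ext in DOCUMENT_EXTENSIONS:
        return [GPT_AUTOMATION]
    return []
```

For example, `suitable_automations("invoice.pdf")` returns both automations, while `suitable_automations("scan.pdf", pdf_is_scanned=True)` returns an empty list because the text exists only as images inside the PDF.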