We have two automations that can help you extract text from attachments:
Extract Text From PDFs: This automation works only with PDF files and extracts only embedded text. It cannot read text inside images in a PDF, for example pages that were scanned as pictures. It handles one attachment field at a time. To extract text from PDFs spread across multiple fields, either set up a separate instance of this automation for each field or first combine the attachment fields into one.
Analyze Airtable Attachments with GPT Assistants: This automation uses OpenAI's GPT Assistants API and is well suited to analyzing attachments. It can handle multiple attachment fields at a time and supports image files (e.g. png, jpg, gif, and webp), PDFs, Word files (.doc, .docx), text files, and several others. You can find out more about how this works here. The linked article also links to OpenAI's documentation, which has current lists of supported file types and known limitations. As of November 2024, GPT Assistants still do not support images within PDF files.
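
If you want to combine several attachment fields into one before running either automation, the logic could look roughly like the sketch below. It is only an illustration, not part of either automation. It assumes each attachment is a dictionary with "url" and "filename" keys, and the function names, field names, and the ".txt" extension are hypothetical. The extension set contains only the types named above, so anything outside it is marked for checking against OpenAI's documentation rather than treated as unsupported.

```python
from pathlib import PurePosixPath

# Only the file types named in the text above; ".txt" is an assumed
# stand-in for "text files". OpenAI's documentation has the full list.
ASSISTANT_EXTENSIONS = {".png", ".jpg", ".gif", ".webp", ".doc", ".docx", ".txt"}


def merge_attachment_fields(fields, field_names):
    """Combine the attachments of several fields into one list, skipping duplicate URLs."""
    merged = []
    seen_urls = set()
    for name in field_names:
        # A missing or empty field is treated as having no attachments.
        for attachment in fields.get(name) or []:
            url = attachment["url"]
            if url not in seen_urls:
                seen_urls.add(url)
                merged.append(attachment)
    return merged


def group_by_file_type(attachments):
    """Group attachments into PDFs, other named types, and files to check in the docs."""
    groups = {"pdf": [], "assistants_only": [], "check_docs": []}
    for attachment in attachments:
        ext = PurePosixPath(attachment["filename"]).suffix.lower()
        if ext == ".pdf":
            # Either automation accepts PDFs, but neither reads images inside them.
            groups["pdf"].append(attachment)
        elif ext in ASSISTANT_EXTENSIONS:
            groups["assistants_only"].append(attachment)
        else:
            groups["check_docs"].append(attachment)
    return groups


if __name__ == "__main__":
    # Hypothetical record with two populated attachment fields and one empty one.
    record_fields = {
        "Contracts": [{"url": "https://example.com/a.pdf", "filename": "a.pdf"}],
        "Scans": [{"url": "https://example.com/b.PNG", "filename": "b.PNG"}],
        "Notes": None,
    }
    merged = merge_attachment_fields(record_fields, ["Contracts", "Scans", "Notes"])
    print(group_by_file_type(merged))
```

Keeping the merge separate from the grouping means the merged list can be written back to a single attachment field for Extract Text From PDFs, while the grouping shows which files would need the GPT Assistants automation instead.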