The Document to Text step lets you extract the text content from a document file. We use deterministic AI models to parse content from a variety of file types, including PDFs, Word documents, and more.

Document to Text step

Options

Name | Type | Description
File | Document File | The document you want to extract text from.

Outputs

Name | Type | Description
File Contents | Plain Text | The text extracted from the document.

Tips

  • If the document contains structured elements, such as tables or multiple side-by-side elements, the extracted text may be parsed incorrectly. We do our best to remediate these issues, but if this happens consistently, we recommend cleaning up the output with a Generate Text step.
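
If you ever handle the extracted text in your own code rather than in a Generate Text step, a rough sketch of a simple cleanup might look like the following. This is only an illustration under stated assumptions: the function name `tidy_extracted_text` and the sample input are hypothetical and not part of the product. It only normalizes the irregular spacing that tables and side-by-side layouts tend to leave behind, and it does not restore the original reading order.

```python
import re


def tidy_extracted_text(text: str) -> str:
    """Hypothetical helper: normalize spacing in text extracted from a document."""
    cleaned_lines = []
    for line in text.splitlines():
        # Replace runs of spaces and tabs with a single space, then trim the ends.
        cleaned_lines.append(re.sub(r"[ \t]+", " ", line).strip())
    # Collapse three or more consecutive newlines into a single blank line.
    return re.sub(r"\n{3,}", "\n\n", "\n".join(cleaned_lines)).strip()


if __name__ == "__main__":
    # Made-up sample resembling a table that lost its layout during extraction.
    sample = "Name   Qty\t Price\n\n\n\nApple    3   1.20  "
    print(tidy_extracted_text(sample))
```

A rule-based pass like this only handles whitespace. Content that was parsed out of order, such as interleaved columns, is the kind of problem the Generate Text step mentioned above is better suited to.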