The Document to Text step lets you extract the text content from a document file. We use deterministic AI models to parse content from a variety of file types, including PDFs, Word documents, and more.
Options
| Name | Type | Description |
|---|---|---|
| File | Document File | The document you want to extract text from. |
Outputs
| Name | Type | Description |
|---|---|---|
| File Contents | Plain Text | The text extracted from the document. |
Tips
- If the document has structured elements, such as tables or multiple side-by-side elements, the extracted text may be parsed incorrectly. We try our best to remediate these issues, but if it happens consistently we recommend cleaning up the output with a Generate Text step.
Supported File Types
The only MIME types supported for OCR are:
- application/pdf
- image/gif
- image/tiff
- image/jpeg
- image/png
- image/bmp
- image/webp
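
If you want to check a file against this list before passing it to the step, a minimal sketch in Python could look like the following. It is not part of Respell: the names `OCR_MIME_TYPES` and `is_ocr_supported` are hypothetical, and the check relies on Python's standard `mimetypes` module, which guesses the type from the file extension only (it does not inspect the file contents, and its results can vary with the local MIME table). It only covers the OCR list above.

```python
import mimetypes

# MIME types listed above as supported for OCR (copied from this page).
OCR_MIME_TYPES = {
    "application/pdf",
    "image/gif",
    "image/tiff",
    "image/jpeg",
    "image/png",
    "image/bmp",
    "image/webp",
}


def is_ocr_supported(filename: str) -> bool:
    """Return True if the MIME type guessed from the filename is in the OCR list.

    Hypothetical helper: the guess is based on the file extension only.
    """
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type in OCR_MIME_TYPES


if __name__ == "__main__":
    for name in ["scan.pdf", "photo.jpg", "notes.txt"]:
        print(name, is_ocr_supported(name))
```

Because the guess comes from the extension, a renamed or mislabeled file can pass this check and still fail to parse, so treat it as a convenience filter rather than validation.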

