EXTRACTTEXT
Overview
The EXTRACTTEXT workflow application extracts text content from an input file (.pdf, .docx, or .txt) and returns the extracted text and its length. It supports optional parameters for maximum file size, trimming, and text normalization (Unix-style line breaks).
Required parameters
Parameter | Type    | Direction | Description
FILE      | FILE    | IN        | The file from which to extract the text (must be .pdf, .docx, or .txt)
TEXT      | TEXT    | OUT       | The extracted (and possibly normalized/trimmed) text
LENGTH    | NUMERIC | OUT       | The length (number of characters) of the extracted text
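The relationship between the FILE input and the TEXT and LENGTH outputs can be illustrated with a minimal Python sketch. This is not the application's implementation: the helper names `is_supported` and `build_outputs` are hypothetical, and matching the extension case-insensitively is an assumption that the documentation does not confirm.

```python
from pathlib import Path

# Extensions listed in the FILE parameter description.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt"}


def is_supported(filename: str) -> bool:
    """Return True if the file name ends in one of the documented extensions.

    Assumption: the comparison ignores case (".PDF" is treated like ".pdf").
    """
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def build_outputs(extracted: str) -> dict:
    """Pair the extracted text with its character count, mirroring TEXT and LENGTH."""
    return {"TEXT": extracted, "LENGTH": len(extracted)}
```

Here LENGTH is simply the number of characters in the returned TEXT, so it reflects any trimming or normalization already applied to that text.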
Optional parameters
Parameter     | Type    | Direction | Description
MAX_FILE_SIZE | NUMERIC | IN        | Maximum allowed file size in MB
TRIM_SIZE     | NUMERIC | IN        | Maximum number of characters to keep from the extracted text
NORMALIZE     | TEXT    | IN        | Whether to normalize line endings. Possible values: Y, N, true, false
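As a rough illustration of how the optional parameters might interact, the Python sketch below models the size check, line-ending normalization, and trimming. It rests on several assumptions the documentation does not state: that 1 MB means 1,048,576 bytes, that NORMALIZE values are matched case-insensitively and default to no normalization when omitted, that normalization runs before trimming, and that TRIM_SIZE is non-negative. All function names are hypothetical.

```python
MB = 1024 * 1024  # assumption: MAX_FILE_SIZE counts mebibytes


def parse_flag(value):
    """Map the documented NORMALIZE values (Y, N, true, false) to a bool.

    Assumptions: matching is case-insensitive; a missing value means False.
    """
    if value is None:
        return False
    v = value.strip().lower()
    if v in ("y", "true"):
        return True
    if v in ("n", "false"):
        return False
    raise ValueError(f"unsupported NORMALIZE value: {value!r}")


def within_size_limit(size_bytes, max_file_size_mb=None):
    """Return True if the file size respects MAX_FILE_SIZE (no limit if omitted)."""
    if max_file_size_mb is None:
        return True
    return size_bytes <= max_file_size_mb * MB


def postprocess(text, trim_size=None, normalize=None):
    """Apply normalization, then trimming; return (text, length)."""
    if parse_flag(normalize):
        # Convert Windows (CRLF) and old Mac (CR) line breaks to Unix (LF).
        text = text.replace("\r\n", "\n").replace("\r", "\n")
    if trim_size is not None:
        # Keep at most trim_size characters (assumed non-negative).
        text = text[: int(trim_size)]
    return text, len(text)
```

For example, `postprocess("a\r\nb\rc", trim_size=4, normalize="Y")` normalizes the input to `"a\nb\nc"` and then keeps its first four characters. If the application trims before normalizing, the resulting text and length could differ for inputs containing CRLF line breaks.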