Text Cleanup
This task replaces commonly incompatible characters in a given field.
Configuration
To use this task, go to the task tab in your job. Select the task from the drop-down and click the plus circle to configure it. After making any changes, click done to save them.
Condition check
The task executes when the condition evaluates to 'true', 't', 'on', '1', or 'yes' (case-insensitive). If the condition is left empty, the task runs for every document. The condition is evaluated separately for each document.
Example: To run this task only for PDF documents, use the expression: equals('#{rd.mimetype}',"application/pdf")
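The rule above can be sketched in Python as follows. This is only an illustration of the documented truthy values, not the product's implementation: the helper name should_run is hypothetical, and passing None to represent an empty condition is an assumption.

```python
from typing import Optional

# Values the documentation lists as meaning "execute the task".
TRUTHY_RESULTS = {"true", "t", "on", "1", "yes"}


def should_run(condition_result: Optional[str]) -> bool:
    """Hypothetical helper: decide whether the task runs for one document.

    condition_result is the evaluated condition for that document, or None
    when no condition was configured (an assumption about representation).
    """
    if condition_result is None:
        return True  # empty condition: the task runs for every document
    # The comparison is case-insensitive, as the documentation states.
    return condition_result.lower() in TRUTHY_RESULTS
```

With this sketch, should_run("Yes") and should_run(None) both return True, while should_run("0") returns False.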
Input Field
The field to perform a text cleanup on.
Output field
The field where the resulting cleaned up text will be saved.
Replace with closest Latin character
Normalizes characters using the Unicode normalization forms described in Unicode Standard Annex #15, Unicode Normalization Forms (https://www.unicode.org/reports/tr15/).
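To make the idea concrete, here is a minimal Python sketch of one common way to map characters to a close Latin equivalent. It assumes NFKD decomposition followed by removal of combining marks; the documentation does not say which normalization form the task actually applies, and the function name to_closest_latin is hypothetical.

```python
import unicodedata


def to_closest_latin(text: str) -> str:
    """Illustrative only: decompose with NFKD, then drop combining marks."""
    # NFKD splits an accented letter into a base letter plus combining marks
    # and expands compatibility characters such as ligatures.
    decomposed = unicodedata.normalize("NFKD", text)
    # unicodedata.combining() returns a non-zero class for combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

Under these assumptions, to_closest_latin("Café") gives "Cafe". Characters with no decomposition (for example "ø") pass through unchanged in this sketch.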
Replace filename incompatible characters
Replaces filename incompatible characters (/, \, *, >, ", :, |, <) with a given text.
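A rough Python equivalent of this option is shown below. The character set is taken from the list above, assuming the quotation mark in that list is the straight double quote; the function name and the default replacement text are hypothetical.

```python
import re

# Characters listed in the documentation: / \ * > " : | <
FILENAME_INCOMPATIBLE = re.compile(r'[/\\*>":|<]')


def replace_filename_incompatible(text: str, replacement: str = "_") -> str:
    """Illustrative only: swap each listed character for the given text."""
    # A function is used as the replacement so that backslashes in the
    # replacement text are inserted literally rather than parsed as escapes.
    return FILENAME_INCOMPATIBLE.sub(lambda _match: replacement, text)
```

For example, replace_filename_incompatible("report: Q1/Q2 <draft>.pdf") returns "report_ Q1_Q2 _draft_.pdf".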
Replace whitespace characters
Replaces whitespace characters with a given text.
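As a sketch of what this option might do, the Python snippet below replaces every whitespace character individually. Which characters the task treats as whitespace, and whether runs of whitespace are collapsed, is not documented; this example assumes Python's \s class and one replacement per character, and the function name is hypothetical.

```python
import re


def replace_whitespace(text: str, replacement: str = " ") -> str:
    """Illustrative only: replace each whitespace character with the given text."""
    # \s matches spaces, tabs, newlines and other Unicode whitespace in Python.
    return re.sub(r"\s", lambda _match: replacement, text)
```

Called as replace_whitespace("a\tb\nc d", "_"), this returns "a_b_c_d".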
Replace non-printable characters
Replaces non-printable characters with a given text.
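The following Python sketch shows one possible reading of this option. It relies on Python's str.isprintable() to decide what counts as non-printable, which is an assumption: under that definition tabs and newlines are also replaced, and the task's own definition may differ. The function name is hypothetical.

```python
def replace_non_printable(text: str, replacement: str = "") -> str:
    """Illustrative only: replace characters Python considers non-printable."""
    # str.isprintable() is False for control characters such as NUL or BEL,
    # and also for tab and newline; an ordinary space counts as printable.
    return "".join(ch if ch.isprintable() else replacement for ch in text)
```

For example, replace_non_printable("a\x00b\x07c", "?") returns "a?b?c".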
API Keys
Processor: textCleanupTask
Key | Display Name | Type |
---|---|---|
use_condition | Check a condition before executing this task. | Boolean |
task_condition | Condition | String |
task_stop_proc | Stop Processing | Boolean |
input_field | Input Field | String |
output_field | Output field | String |
replace_latin_closest | Replace with closest Latin character | Boolean |
replace_filename_incompatible | Replace filename incompatible characters | Boolean |
filename_incompatible_replacement_text | Replace filename incompatible characters with | String |
replace_whitespace | Replace whitespace characters | Boolean |
whitespace_replacement_text | Replace whitespace characters with | String |
replace_non_printable | Replace non-printable characters | Boolean |
non_printable_replacement_text | Replace non-printable characters with | String |
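The keys above could be assembled into a settings mapping like the one sketched below. Only the key names and types come from the table; the field names "title" and "clean_title", the chosen values, the validate helper, and the idea of passing the settings as a flat dictionary are all assumptions made for illustration.

```python
# Key -> expected Python type, copied from the API Keys table.
KEY_TYPES = {
    "use_condition": bool,
    "task_condition": str,
    "task_stop_proc": bool,
    "input_field": str,
    "output_field": str,
    "replace_latin_closest": bool,
    "replace_filename_incompatible": bool,
    "filename_incompatible_replacement_text": str,
    "replace_whitespace": bool,
    "whitespace_replacement_text": str,
    "replace_non_printable": bool,
    "non_printable_replacement_text": str,
}

# Hypothetical settings for a textCleanupTask; all values are examples.
task_settings = {
    "use_condition": True,
    "task_condition": "equals('#{rd.mimetype}',\"application/pdf\")",
    "task_stop_proc": False,
    "input_field": "title",         # hypothetical field name
    "output_field": "clean_title",  # hypothetical field name
    "replace_latin_closest": True,
    "replace_filename_incompatible": True,
    "filename_incompatible_replacement_text": "_",
    "replace_whitespace": True,
    "whitespace_replacement_text": " ",
    "replace_non_printable": True,
    "non_printable_replacement_text": "",
}


def validate(settings: dict) -> list:
    """Hypothetical check: report unknown keys and values of the wrong type."""
    problems = []
    for key, value in settings.items():
        expected = KEY_TYPES.get(key)
        if expected is None:
            problems.append(f"unknown key: {key}")
        elif not isinstance(value, expected):
            problems.append(
                f"{key}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems
```

Running validate(task_settings) returns an empty list, which is a quick way to catch a misspelled key or a Boolean supplied as a string before the settings are submitted.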