Processors

Overview

Tasks run once for each document that has been read. Some tasks can remove documents from the queue if they do not meet certain criteria, and documents removed this way are not processed by any later tasks. Note that Calculated Fields are calculated before tasks run, while Field Mappings are applied after tasks.
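
The ordering described above can be sketched in a few lines of Python. This is only an illustration of the sequence (calculated fields, then tasks, then field mappings), not 3Sixty's implementation. The `run_pipeline` function, the dict-based document, and the two sample tasks are all hypothetical names introduced for this example.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for a repository document: a plain dict of fields.
Document = Dict[str, object]
# A task returns the (possibly modified) document, or None to remove it from the queue.
Task = Callable[[Document], Optional[Document]]


def run_pipeline(
    doc: Document,
    calculated_fields: Dict[str, Callable[[Document], object]],
    tasks: List[Task],
    field_mappings: Dict[str, str],
) -> Optional[Document]:
    # 1. Calculated fields are evaluated before any task runs.
    for name, compute in calculated_fields.items():
        doc[name] = compute(doc)
    # 2. Tasks run in order; a None result removes the document,
    #    so no later task ever sees it.
    for task in tasks:
        result = task(doc)
        if result is None:
            return None
        doc = result
    # 3. Field mappings (source field -> target field) are applied last.
    return {target: doc.get(source) for source, target in field_mappings.items()}


def skip_blank_name(doc: Document) -> Optional[Document]:
    """Sample filtering task: drop documents without a file name."""
    return doc if doc.get("name") else None


def uppercase_title(doc: Document) -> Optional[Document]:
    """Sample modifying task: upper-case the title field."""
    doc["title"] = str(doc["title"]).upper()
    return doc
```

In this sketch, a document with a blank `name` is dropped by `skip_blank_name`, so `uppercase_title` and the mapping step never run for it.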

The following is an updated list of Job Tasks for 3Sixty. Follow the details link after each definition to learn its purpose, how and when to use it, and how to configure it. The fields in each task are validated.


Commonly Used

General

  • Basic JDBC: Retrieve metadata from an outside SQL database during a migration, adding it to the repository document before mappings are performed.

  • Buffer Content to File System: Buffer files to a temporary directory.

  • Convert Array Value To String: Converts a property on a repository document from an array to a String.

  • Convert String to Boolean: Check the value of a string and return true if it matches the expected value.

  • Convert To UTC: Convert date fields to the given format, with the given offset.

  • Date-Based Folder Path: Take one of the date fields on the Repository Document and use it to generate the parent folder path for the document.

  • Field Lookup: Perform a look-up operation and update the matching fields of the repository documents.

  • HTTP: Execute a GET or POST HTTP call for each repository document. If the response status is 200 (OK), processing continues; otherwise, the document is skipped.

  • JavaScript Processing: Run JavaScript against a repository document during processing.

  • Pause: This task can be useful when outputting to a repository with rate limits, such as Box or SharePoint.

  • Remove Mappings: Remove mappings from a repository document if the field has no values.

  • Two Way Sync: Filter out any unnecessary documents when doing an incremental sync between two systems.
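
The continue-or-skip rule of the HTTP task in the list above can be illustrated with a short Python sketch. It uses Python's standard `urllib` rather than anything from 3Sixty, and the names `http_status`, `filter_by_http`, `url_for`, and `send` are hypothetical. The `send` parameter exists only so the rule can be exercised without a live server.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List, Optional


def http_status(url: str, method: str = "GET", body: Optional[bytes] = None) -> int:
    """Make one GET or POST call and return the HTTP status code."""
    request = urllib.request.Request(url, data=body, method=method)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Non-2xx responses raise HTTPError; the status code is still available.
        return err.code


def filter_by_http(
    docs: Iterable[dict],
    url_for: Callable[[dict], str],
    send: Callable[[str], int] = http_status,
) -> List[dict]:
    """Keep documents whose call returns 200 (OK); skip all others."""
    kept = []
    for doc in docs:
        if send(url_for(doc)) == 200:
            kept.append(doc)
    return kept
```

Connection failures (for example, an unreachable host) are not handled here and would raise `urllib.error.URLError`; how the real task treats them is not covered by this sketch.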

Metadata

File

  • Attach Content - External Repository: Read metadata from any repository connector and attach the file that matches the given metadata field.

  • Attach Content - File System: Read metadata from the file system connector and attach the file that matches the given metadata field.

  • Attach Content - FTP: Sets the content of a document to the content found on an FTP server at the location found using the Path Expression.

  • Attach Content - S3: Attach the content to the repository document from an S3 bucket.

  • Filter Expression: Allows you to remove files based on the expression used.

  • Generate Thumbnail: Generate a thumbnail for a repository document and add it as a rendition.

  • Hash Value Generator: Creates a hash of the document content and sets it on the repository document.

  • HTML to PDF: This task takes a single argument, which is a file path to an XHTML stylesheet (*.xsl). The binary stream from each repository document is converted to a PDF using that template.

  • Property XML Parser Job

  • Remove Renditions Matching Binary Mime Type: Remove all renditions that have the same MimeType as the Original Document Binary.

  • Skip Blank File Name: Skips a file during migration if its file name is blank.

  • Skip On Empty Field: This task will skip a document if the supplied fields are all blank.

  • Text to PDF: Convert text binaries from a Repository Document into a PDF file on output.

  • Text/HTML to EML: Converts text or HTML Files to email messages.

  • Unzip: Unzips compressed files.
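
As an illustration of the Hash Value Generator entry in the list above, the following Python sketch hashes a document's content and stores the digest on the document. The source does not say which algorithm or field name the task uses, so SHA-256 and the `content_hash` field are assumptions, as are the function names.

```python
import hashlib
import io
from typing import BinaryIO


def hash_stream(stream: BinaryIO, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Hash a binary stream in chunks so large files need not fit in memory."""
    hasher = hashlib.new(algorithm)
    # Read fixed-size chunks until read() returns an empty bytes object.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
    return hasher.hexdigest()


def add_content_hash(doc: dict, content: BinaryIO, field: str = "content_hash") -> dict:
    """Set the content digest on the document (field name is an assumption)."""
    doc[field] = hash_stream(content)
    return doc


# Example: hash an in-memory binary and attach the digest to a dict-based document.
example_doc = add_content_hash({"name": "report.pdf"}, io.BytesIO(b"example content"))
```

Two documents with identical content produce the same digest, which is what makes a stored hash useful for spotting duplicates or verifying a transfer.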

Text Extraction

  • AWS Textract: Extracts text from PNGs, JPGs, and PDFs and stores it on the repository document in the simflofy_ai_text field.

  • Google Vision Text Extraction: Extracts text from .tiff, .PDF and .gif files and stores it on the repository document in the simflofy_ai_texts field.

  • Tesseract Text Extraction: Use Tesseract OCR to scan for text in images and PDF files, saving that text to a field on the repository document called simflofy_ai_texts.

Image Analysis

  • AWS Image Recognition: Detects real world objects in images and adds these labels to the repository document on the field simflofy_ai_labels using the AWS Rekognition system.

  • Google Vision Image Labels: Detects real world objects in images and adds these labels to the repository document on the field simflofy_ai_labels using Google Vision.

  • Watson Image Analysis: Use IBM Watson Image Analysis to analyse an image, adding its response to a specified field.

Alfresco

  • Alfresco Job Run History Nodes: Get the Alfresco node reference from the current Job Run History in order to update an existing document from a previous job run in Alfresco, rather than creating a new one.

  • Alfresco Property Mapping Nodes: Get existing node references in an Alfresco instance in order to update them, rather than creating new ones, if the existing file has moved from its original ingestion location.

ACL

  • CMIS ACL Modification: When used with a CMIS Repository connection, this task uses the Repository Document ID to gather the current ACL for the document, then generates a new ACL based on the task parameters.

  • FileNet ACL Modification: Change the permission lists of integrated documents in the IBM FileNet Repository.

  • File System ACL Extraction: Extracts ACLs from the Windows or Linux filesystem document and adds them to the repository document.

  • Generic ACL Mapper: Create simple rules for matching principals and permissions from one system to another.
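
The idea behind rule-based ACL mapping, as in the Generic ACL Mapper entry above, can be sketched as two lookup tables applied to each ACL entry. This is a simplified illustration, not the task's actual rule format. The `map_acl` function, the tuple-based ACL entries, and the sample principal and permission names are all hypothetical, and passing unmapped values through unchanged is an assumption.

```python
from typing import Dict, List, Tuple

# Hypothetical ACL entry: (principal, permission).
AclEntry = Tuple[str, str]


def map_acl(
    entries: List[AclEntry],
    principal_map: Dict[str, str],
    permission_map: Dict[str, str],
) -> List[AclEntry]:
    """Translate each entry using the rule tables.

    Principals or permissions without a matching rule are passed through
    unchanged (an assumption made for this sketch).
    """
    mapped = []
    for principal, permission in entries:
        mapped.append(
            (
                principal_map.get(principal, principal),
                permission_map.get(permission, permission),
            )
        )
    return mapped


# Example rules from a source system to a target system (sample values only).
source_acl = [("DOMAIN\\alice", "Read"), ("bob", "FullControl")]
target_acl = map_acl(
    source_acl,
    principal_map={"DOMAIN\\alice": "alice@example.com"},
    permission_map={"Read": "Consumer", "FullControl": "Coordinator"},
)
```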

Classifier

  • Open AI Compatible Chat Completion Task: Use artificial intelligence to process documents in many ways through OpenAI Chat Completions. Q&A, metadata extraction, text summarization, and PII detection are a few example use cases for this task.

  • Open AI Compatible Embeddings Task: Create embeddings using OpenAI Embeddings. Embeddings can be used to enable clustering, semantic search, and many other NLP tasks.

  • Redactor: Search the content of PDF documents during migration and redact any words, phrases, or patterns based on the targets set in the task's configuration.

Others

  • File Format Converter: This task will convert files of one format to another format during migration. It can also convert images or non text files into searchable content PDFs.

  • Index User Groups: Used for Search Security to index user and group information onto each document.

  • Lookup Destination Id From Job Run History Task: Get the rd destination Id from the current Job Run History in order to update an existing document from a previous job run, rather than creating a new one.

  • Remove Empty Fields: Removes fields from the metadata of a document if there is no value set in the field.
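
The Remove Empty Fields behavior above amounts to filtering a document's metadata. The Python sketch below shows one way to do that. What counts as "no value" is an assumption here (None, blank strings, and empty collections), and the function names are hypothetical.

```python
from typing import Any, Dict


def is_empty(value: Any) -> bool:
    """Treat None, blank strings, and empty collections as 'no value' (an assumption)."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    if isinstance(value, (list, tuple, dict, set)):
        return len(value) == 0
    # Numbers and booleans, including 0 and False, count as real values.
    return False


def remove_empty_fields(metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the metadata without the empty fields."""
    return {name: value for name, value in metadata.items() if not is_empty(value)}


# Example: only the fields that carry a value survive.
cleaned = remove_empty_fields(
    {"title": "Report", "author": "", "tags": [], "pages": 0, "note": "  ", "draft": False, "owner": None}
)
```

In this sketch `pages` (0) and `draft` (False) are kept because they are values, not missing data.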


 

Related Articles:

Getting Started With 3Sixty

Adding Tasks to Integration Jobs

Federated Discovery Widgets