Elasticsearch

ElasticSearch is a search engine based on the Lucene library. It provides a distributed, multi-tenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. ElasticSearch is developed in Java.

  • Version Support: 3Sixty currently supports ElasticSearch 7.17 and later 7.x releases only. Version 8 is not supported.

  • AWS Compatibility: As of September 2021 this connector will not work with AWS instances of ElasticSearch. AWS has its own version, now called OpenSearch, which is incompatible with current ElasticSearch libraries.

Tip:  To configure ElasticSearch to handle larger file sizes:

  1. Open elasticsearch/config/elasticsearch.yml in your ElasticSearch installation.

  2. Set http.max_content_length to a value greater than 100MB, for example http.max_content_length: 200mb

  3. See https://www.elastic.co/guide/en/elasticsearch/reference/7.17/modules-network.html for more options.


Authentication Connection

  • Name: Unique connection name

  • Username: Username for Authentication or blank when no auth needed.

  • Password: Password for Authentication or blank when no auth needed.

  • Server URL: Server URL with protocol, host and port, for example http://127.0.0.1:9200/

  • Socket Timeout in milliseconds: How long to wait before requests fail


Integration Connection

Integration Connection Configuration

  • Connection Name: This is a unique name given to the connector instance upon creation.

  • Description: A description of the connector to help identify it better.

  • Authentication Connector: Your ElasticSearch Authentication Connection


Job Configuration

Note:  ID ENCODING
3Sixty uses the source repository id of a document as a default value for the id in ElasticSearch. These can sometimes contain illegal characters, especially if they are file paths, such as from a Filesystem or Amazon S3. As part of the indexing process, the value of this field will be encoded to ensure its validity. Currently, only slashes, spaces and apostrophes are encoded, but this will likely change to full encoding in the future to better support non-standard character sets.
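
As a rough illustration of the note above, the following Python sketch encodes only slashes, spaces and apostrophes in a source id. The use of percent-encoding and the function name encode_document_id are assumptions made for this example. The encoding scheme 3Sixty actually applies is not stated here.

```python
# Hypothetical sketch: percent-encoding is assumed purely for illustration.
# Only the three character classes named in the note are encoded.
ENCODED_CHARS = {"/": "%2F", " ": "%20", "'": "%27"}


def encode_document_id(source_id: str) -> str:
    """Encode slashes, spaces and apostrophes; leave every other character as-is."""
    return "".join(ENCODED_CHARS.get(ch, ch) for ch in source_id)


# Example: a file path used as the source repository id.
print(encode_document_id("/shared/Bob's report.pdf"))
```

Characters outside these three are passed through unchanged, which is why non-standard character sets may still need the fuller encoding mentioned above.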

Note:  FILE CONTENT
If Include Binaries is checked in the Details tab, the connector will convert the file content to a base64 encoded string and store it in the binaryData field.
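
A minimal sketch of what storing file content as base64 text could look like. The binaryData field name comes from the note above. The build_document helper and the metadata keys are hypothetical and are not part of 3Sixty.

```python
import base64


def build_document(metadata: dict, content: bytes) -> dict:
    """Return a copy of the metadata with the file content added as base64 text."""
    doc = dict(metadata)
    # b64encode returns bytes, so decode to get a JSON-friendly string.
    doc["binaryData"] = base64.b64encode(content).decode("ascii")
    return doc


doc = build_document({"name": "hello.txt"}, b"hello")
print(doc["binaryData"])

# A consumer can recover the original bytes from the stored string.
original = base64.b64decode(doc["binaryData"])
```

Base64 increases the stored size by roughly one third, which is one reason the http.max_content_length setting may need to be raised for large files.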

Note:  ElasticSearch does not support writing multiple versions of a document. Only the latest version picked up in a migration is written. All other versions are ignored: they are not audited and are not counted as Skipped.

  • ID Attribute: The field that will be used to set the document id

  • Index Name: The name of the collection where the indexes will be created.

    • If the collection already exists and does not have the required mappings, 3Sixty will attempt to update the mappings

  • Batch size: The number of documents to generate before sending a request.

  • Output Renditions as array to the renditionData field: If there are multiple renditions, they will be stored as a list of base64 encoded strings.

  • Term Vectors: Term vectors increase the size of an index but are required for highlighting and More Like This searches.

    • All text-based default 3Sixty fields are included by default

    • Term vectors can only be applied to text fields.

    • Term vectors will be enabled for any custom text field added to mappings
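
For orientation, the sketch below builds an ElasticSearch mapping body that enables term vectors on text fields. The term_vector mapping parameter and its with_positions_offsets value are standard ElasticSearch options. The field names and the choice of that particular value are assumptions, since the mapping 3Sixty generates is not shown here.

```python
import json


def term_vector_mappings(text_fields):
    """Build a mapping body that enables term vectors on the given text fields."""
    return {
        "mappings": {
            "properties": {
                # term_vector is only valid on fields of type "text".
                name: {"type": "text", "term_vector": "with_positions_offsets"}
                for name in text_fields
            }
        }
    }


# "content" and "title" are hypothetical field names.
body = term_vector_mappings(["content", "title"])
print(json.dumps(body, indent=2))
```

A body of this shape is what would be sent when creating an index. Storing positions and offsets is what makes the index larger, as noted above.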


Content Search Connection

A Content View Connector defines the who, what and how of search. A better term may be "Data Set" because the data you search and find is based on the configuration of the Content View Connection.

Search Configuration

Legacy Fields: All other fields in this tab are legacy features used for the Solr Search Connection and will be removed in future releases.

  • Collection: The name of the collection to query against. ElasticSearch refers to these as "Indexes", but for our purposes they are collections.

  • Sort Field/Order: Will contain the values in your field list. Allows you to choose which field to sort on and whether to sort ascending or descending.

  • Facet Fields: Facet fields are simply occurrence counts for the entered fields. Content type counting is the most common example. Facet fields are required for a number of sidebar widgets.

  • Field List: The field values to return in a result set. Similar to the SELECT Field1, Field2 clause in SQL.

  • Result Link: Used on the Discovery UI to determine what to do when a user clicks on the link to the document.

  • Facet Limit: Maximum number of facet values to return.

  • Highlight: Yes if you want contextual highlighting, No otherwise.

  • Highlighted Fields: Comma delimited list of fields for highlighting (e.g. content).

  • Highlight Field Length: The maximum number of characters to highlight.

  • External Links: Setup external links for the search results. The widget is not
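
To show how the search settings in this section relate to an ElasticSearch request, here is a hedged sketch that assembles a search body from them. The _source, sort, aggs and highlight keys are standard ElasticSearch search options. The function name, the query_string query and the mapping from each 3Sixty field to each option are assumptions, not a description of what 3Sixty sends.

```python
import json


def build_search_body(query_text, field_list, sort_field, sort_order,
                      facet_fields, facet_limit, highlight_fields, fragment_size):
    """Assemble a search body from the settings described above."""
    body = {
        "query": {"query_string": {"query": query_text}},
        # Field List: which fields to return for each hit.
        "_source": field_list,
        # Sort Field/Order: "asc" or "desc".
        "sort": [{sort_field: {"order": sort_order}}],
        # Facet Fields and Facet Limit: one terms aggregation per field.
        "aggs": {
            field: {"terms": {"field": field, "size": facet_limit}}
            for field in facet_fields
        },
    }
    if highlight_fields:
        # Highlighted Fields and Highlight Field Length.
        body["highlight"] = {
            "fields": {
                field: {"fragment_size": fragment_size}
                for field in highlight_fields
            }
        }
    return body


# All field names below are hypothetical.
body = build_search_body(
    query_text="invoice",
    field_list=["title", "content_type", "modified"],
    sort_field="modified",
    sort_order="desc",
    facet_fields=["content_type"],
    facet_limit=10,
    highlight_fields=["content"],
    fragment_size=150,
)
print(json.dumps(body, indent=2))
```

Note that ElasticSearch sorts and aggregates on keyword, numeric or date fields rather than analyzed text fields, so the fields chosen for sorting and facets need a suitable mapping.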

Search Security

Only one of these options may be selected at a time:

  • Filter: The authenticated user's group id is added to each search request. Used in tandem with the User group index task to only allow specified ids to search indexed content

  • Restrict: The restricted users or groups cannot use this connector. Views that use it will not be visible to them, and they will not be able to use it through the Search APIs
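
The Filter option can be pictured as wrapping each query in a bool query with a filter clause on the user's group ids. The sketch below is an assumption about the general shape only. The allowed_groups field name and the apply_group_filter helper are hypothetical, and the field actually written by the User group index task is not specified here.

```python
def apply_group_filter(query: dict, group_ids: list,
                       acl_field: str = "allowed_groups") -> dict:
    """Wrap a query so that only documents matching one of the group ids are returned."""
    return {
        "bool": {
            "must": [query],
            # A filter clause restricts results without affecting relevance scores.
            "filter": [{"terms": {acl_field: group_ids}}],
        }
    }


user_query = {"match": {"content": "invoice"}}
secured = apply_group_filter(user_query, ["finance", "audit"])
print(secured)
```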

 

Viewing Indexed Content

  1. Create a Search Connection for ElasticSearch if you have not already. Use the authentication connection you used for indexing

  2. Using the configuration section above, pick the fields you wish to see and get counts for.

    1. You can add the basic 3Sixty metadata by clicking Add All Default Fields

  3. Under the Federation Menu > Content Views, Create A New Content View.


Content Service Connector

This section covers the specific configuration of the Content Service Connector. For a description of how to set up a content services connector generically see Content Service Connectors.

Supported Methods

  • createFile

  • createFolder

  • deleteObjectByID

  • getFileContent

  • getObjectProperties

  • getTypes

  • listFolderItems

  • updateFile

  • updateProperties


API Keys

Elasticsearch Connector: Read=true: Write=true: MIP=false

Repo (Read) Specs

Key | Description | Data Type
modInfo | This repository does not use the start and end time fields in the Details tab. Include any date checks in your query. | Info
indexes | Indexes to crawl | String
elQuery | Query to run against each index. Leave blank to gather all documents within date range. | String
pathfield | The field that contains the absolute path to a file. If present, it will be used to set the parent path. | String
prependIndex | Prepend index name to parent path | Boolean
pathincludesfiles | If true and Path Field is not empty, the path will be used to set the file name as well. | Boolean
filefield | If set, the field will be used to set the file name. Most outputs will fail documents if the file name is not set. | String
datecreatedfield | Created Date Field | String
mimetypefield | The field which contains the mimetype. If not set, the mimetype will be guessed based on file extension. | String
filelengthfield | The field which contains file length metadata. The size will be 0 otherwise. | String
datemodifiedfield | Modified Date Field | String
binaryField | If including file content, what field contains the file binary data? | String
includeRenditions | Retrieve Renditions | Boolean
renditionFields | The fields that contain renditions. They will be checked in order. If the field is an array, the content will be processed in array order. | String
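
Because this repository ignores the start and end time fields, any date restriction has to live in the query itself. The sketch below shows one way a set of read spec values might look, using the keys listed above. It assumes that elQuery accepts an ElasticSearch Query DSL clause as a JSON string, which is not confirmed here, and every index and field name is hypothetical.

```python
import json

# A standard ElasticSearch range clause standing in for the date check.
date_check = {
    "range": {
        "modified": {
            "gte": "2021-01-01T00:00:00Z",
            "lt": "2022-01-01T00:00:00Z",
        }
    }
}

# Keys are taken from the read specs above; values are illustrative only.
read_specs = {
    "indexes": "documents",
    "elQuery": json.dumps(date_check),
    "pathfield": "path",
    "prependIndex": False,
    "pathincludesfiles": True,
    "datecreatedfield": "created",
    "datemodifiedfield": "modified",
    "mimetypefield": "mimetype",
    "filelengthfield": "length",
    "binaryField": "binaryData",
    "includeRenditions": False,
}

print(json.dumps(read_specs, indent=2))
```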

Output (Write) Specs

Key | Description | Data Type
id_attribute | ID Attribute | String
index_name | Index Name (must be lower case or the job will fail) | String
elBatchSize | Batch size | Integer
includedUnMapped | Include Un-Mapped Properties | Boolean
outputRenditions | Output Renditions as array to the renditionData field | Boolean
es_vectorlist | Term Vector Fields | String
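
Since a job fails when index_name is not lower case, it can help to check the name before running it. The following sketch applies the index naming rules documented by ElasticSearch for 7.x. The validate_index_name helper is hypothetical and is not part of 3Sixty, which may apply different or additional checks.

```python
# Characters ElasticSearch 7.x does not allow in index names.
INVALID_CHARS = set('\\/*?"<>| ,#:')


def validate_index_name(name: str) -> list:
    """Return a list of problems with the index name; an empty list means it looks valid."""
    problems = []
    if not name:
        problems.append("must not be empty")
        return problems
    if name != name.lower():
        problems.append("must be lower case")
    if any(ch in INVALID_CHARS for ch in name):
        problems.append("contains an invalid character")
    if name.startswith(("-", "_", "+")):
        problems.append("must not start with -, _ or +")
    if name in (".", ".."):
        problems.append("must not be . or ..")
    if len(name.encode("utf-8")) > 255:
        problems.append("must not be longer than 255 bytes")
    return problems


print(validate_index_name("documents-2021"))
print(validate_index_name("My Documents"))
```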

