More Like This Widget

Overview

A facet-type search widget that returns more results similar to the ones selected. It can be used to identify duplicate or similar documents across connectors, based on each file's name and content.

Tutorial Example: Setting up Elasticsearch to use MLT Widget


Configuration

  • Name: More Like This

  • Widget Definition Id: MLTWidget

  • Field: na

  • Label: More Like This

  • RefDocs: The number of sample documents to use. The MLT search needs a source set of documents to use as its search criteria.

  • maxqt: The maximum number of query terms that will be selected, that is, how many terms are taken from the source documents to build the search. Increasing this value improves accuracy at the expense of query execution speed.

  • mintf: The minimum term frequency (how often a word or phrase appears in the input document). Terms that occur fewer times than this in the input document are ignored. A setting of 1 means that a term appearing only once in the input document can still be used in the search.

  • mindf: The minimum document frequency (how many documents contain the term). Terms from the input document that appear in fewer documents than this are ignored.

  • minwl: The minimum word length below which the terms will be ignored.

  • maxwl: The maximum word length above which the terms will be ignored. Defaults to 0 (unbounded).

  • mltfl: A comma-separated list of fields checked for similarities. The default values are fields that have term vectors by default.

  • btnLabel: The label for the More Like This search button.
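
The way the thresholds above interact can be illustrated with a small sketch. This is a simplified toy model of MLT-style term selection, not the widget's actual implementation: the function name select_query_terms, the doc_freq lookup table, and the TF-IDF style scoring are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def select_query_terms(source_text, doc_freq, num_docs,
                       maxqt=25, mintf=1, mindf=1, minwl=0, maxwl=0):
    """Pick the terms an MLT-style search might query for (toy model).

    doc_freq is a hypothetical lookup: term -> number of indexed
    documents containing that term.
    """
    # Term frequency: how often each word occurs in the source document.
    term_freq = Counter(re.findall(r"\w+", source_text.lower()))

    scored = []
    for term, freq in term_freq.items():
        if freq < mintf:
            continue  # occurs too rarely in the source document
        df = doc_freq.get(term, 0)
        if df < mindf:
            continue  # found in too few documents in the index
        if len(term) < minwl:
            continue  # word is too short
        if maxwl > 0 and len(term) > maxwl:
            continue  # word is too long (0 means no upper limit)
        # Smoothed inverse document frequency; rarer terms score higher.
        idf = math.log((num_docs + 1) / (df + 1)) + 1.0
        scored.append((freq * idf, term))

    # Highest score first, ties broken alphabetically; keep at most maxqt.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:maxqt]]


doc_freq = {"invoice": 3, "the": 100, "acme": 2, "q3": 1}
terms = select_query_terms(
    "The Acme invoice, the Q3 invoice",
    doc_freq, num_docs=100,
    maxqt=2, mintf=1, mindf=2, minwl=3,
)
print(terms)  # ['invoice', 'acme']
```

In this run "q3" is dropped by both minwl and mindf, and "the" passes every filter but scores lowest because it appears in every document, so maxqt=2 cuts it off.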


Configuration Example

  • Name: More Like This

  • Widget Definition Id: MLTWidget

  • Field: na

  • Label: More Like This

  • RefDocs: 5

  • maxqt: 25

  • mintf: 1

  • mindf: 1

  • minwl: 0

  • maxwl: 0

  • mltfl: content,simflofy_filename

  • btnLabel: Perform Full 'More Like This' Search
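
For readers working with Elasticsearch directly, the sketch below shows one way the example values above could map onto the body of an Elasticsearch more_like_this query. The mapping is an assumption, not the widget's documented behavior: build_mlt_query, the index name "documents", and the document ids are hypothetical, as is treating RefDocs as a cap on the number of seed documents.

```python
import json

# Values copied from the configuration example above.
widget_config = {
    "RefDocs": 5,
    "maxqt": 25,
    "mintf": 1,
    "mindf": 1,
    "minwl": 0,
    "maxwl": 0,
    "mltfl": "content,simflofy_filename",
}


def build_mlt_query(config, index, doc_ids):
    """Build a more_like_this query body from widget-style settings."""
    # Assumption: RefDocs limits how many selected documents seed the search.
    seeds = doc_ids[: config["RefDocs"]]
    return {
        "query": {
            "more_like_this": {
                "fields": [f.strip() for f in config["mltfl"].split(",")],
                "like": [{"_index": index, "_id": d} for d in seeds],
                "max_query_terms": config["maxqt"],
                "min_term_freq": config["mintf"],
                "min_doc_freq": config["mindf"],
                "min_word_length": config["minwl"],
                "max_word_length": config["maxwl"],
            }
        }
    }


# Hypothetical index name and document ids, for illustration only.
body = build_mlt_query(widget_config, "documents", ["doc-1", "doc-2"])
print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent as the body of a search request; no Elasticsearch client is needed to build or inspect it.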


Preview