Reference library

Article number: 301930

Sitekit CMS FAQ

How does an indexed search work?

From 10.4 onward, a Sitekit CMS indexed search uses the following logic.

The indexed search supports double quotes for exact phrase matching, so searching for "jobs essex" (with the double quotes) returns all results that contain the phrase 'jobs essex'. Terms outside double quotes are searched for separately and combined with a logical 'or', so searching for 'jobs essex' (typed without any quotes) returns all results that contain 'jobs' or 'essex'.
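The phrase and 'or' behaviour described above can be sketched as follows. This is only an illustration, not the Sitekit CMS implementation: the function names split_query and matches_any are hypothetical, and plain substring matching is an assumption made to keep the example short.

```python
import re


def split_query(query: str) -> list[str]:
    """Split a query into search units (hypothetical helper).

    A double-quoted phrase stays whole; every other word is its own unit.
    """
    units = []
    for phrase, word in re.findall(r'"([^"]+)"|(\S+)', query):
        units.append((phrase or word).lower())
    return units


def matches_any(text: str, query: str) -> bool:
    """Logical 'or': the text matches if it contains at least one unit.

    Assumption: a unit matches when it appears anywhere in the text.
    """
    lowered = text.lower()
    return any(unit in lowered for unit in split_query(query))
```

With this sketch, the text "Jobs in Kent" matches the query jobs essex because it contains 'jobs', but it does not match "jobs essex" because the exact phrase is absent.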

It also supports the + and - symbols as prefixes to a search term. The plus symbol forces the results to contain a certain word (e.g. when searching for 'jobs +essex', only results that contain "essex", or both "essex" and "jobs", will be returned, but not pages containing only "jobs"). The minus symbol excludes results that contain the search term (e.g. when searching for 'jobs -essex', all search results containing the word "essex" will be excluded).
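As a rough sketch of the + and - rules above, the check for a single page might look like this. The names parse_operators and page_matches are hypothetical, whole-word matching is an assumption, and quoted phrases are left out for brevity.

```python
import re


def parse_operators(query: str) -> tuple[list[str], list[str], list[str]]:
    """Sort query terms into required (+), excluded (-) and optional groups."""
    required, excluded, optional = [], [], []
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:])
        elif token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])
        else:
            optional.append(token)
    return required, excluded, optional


def page_matches(text: str, query: str) -> bool:
    """Apply the + and - rules to one page (assumes whole-word matching)."""
    words = set(re.findall(r"\w+", text.lower()))
    required, excluded, optional = parse_operators(query)
    if any(term in words for term in excluded):
        return False  # an excluded term always removes the page
    if required:
        # Required terms decide the match; optional terms are not needed.
        return all(term in words for term in required)
    return any(term in words for term in optional)
```

For the query jobs +essex, this returns True for "Jobs in Essex" and "Essex news", and False for "Jobs in Kent".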

Results are weighted so that the result at the top will be the one that contains the most terms, closest to the start of the indexed text. The weighting calculation is essentially the sum of the positions of the first instance of each term within the indexed text, so a lower total ranks higher. An absent term is given a weight equal to the length of the indexed text, to penalise it.
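The weighting calculation can be illustrated with a minimal sketch. It assumes that positions are character offsets and that results are sorted by ascending total; the article does not confirm either detail, and the names weight and rank are hypothetical.

```python
def weight(indexed_text: str, terms: list[str]) -> int:
    """Sum the position of the first instance of each term.

    An absent term adds the full length of the indexed text as a penalty.
    Assumption: positions are character offsets, starting at 0.
    """
    text = indexed_text.lower()
    total = 0
    for term in terms:
        position = text.find(term.lower())
        total += position if position >= 0 else len(text)
    return total


def rank(pages: dict[str, str], terms: list[str]) -> list[str]:
    """Return page names ordered by ascending weight (best match first)."""
    return sorted(pages, key=lambda name: weight(pages[name], terms))


pages = {
    "a": "essex jobs board",
    "b": "news about jobs",
    "c": "jobs in essex today",
}
print(rank(pages, ["jobs", "essex"]))  # prints ['a', 'c', 'b']
```

Page "b" comes last because 'essex' is absent, so its total includes the full text length of 15 on top of the position of 'jobs'.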

The search algorithm ignores insignificant small search terms. Any one- or two-letter terms and common small words like 'the', 'to', 'his', 'her', 'this', 'for', 'all' and 'and' will be ignored unless they are inside double quotes. If the entire search text is composed of 'insignificant' terms, then none of them will be ignored; for example, a search for 'to be or not to be' would keep all of its terms.

The current ignore list is as follows:

  1. All one and two letter words: "a", "if", "to", "at" etc.
  2. All of the following longer words in common usage: "the", "not", "his", "her", "this", "for", "all", "and", "that", "but", "are", "you", "any", "can", "was", "our", "has", "its", "too", "she", "use", "put", "let", "did", "with", "they", "have", "from", "had", "some", "what", "there", "other", "were", "your", "when", "how", "which", "their", "will", "them", "then", "these", "than", "may", "been", "came", "very", "through", "just", "much"
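The ignore rules above can be sketched as a small filter. The word list is copied from the article; the names IGNORED_WORDS, is_insignificant and significant_units are hypothetical, and the sketch is not the Sitekit CMS code itself.

```python
import re

# Longer common words from the ignore list in the article.
IGNORED_WORDS = frozenset(
    """the not his her this for all and that but are you any can was our
    has its too she use put let did with they have from had some what
    there other were your when how which their will them then these than
    may been came very through just much""".split()
)


def is_insignificant(term: str) -> bool:
    """True for one- and two-letter terms and for words on the ignore list."""
    return len(term) <= 2 or term in IGNORED_WORDS


def significant_units(query: str) -> list[str]:
    """Drop insignificant terms, except quoted phrases.

    If nothing would be left, keep every term instead.
    """
    units = []
    for phrase, word in re.findall(r'"([^"]+)"|(\S+)', query.lower()):
        units.append((phrase or word, bool(phrase)))
    kept = [text for text, quoted in units if quoted or not is_insignificant(text)]
    return kept if kept else [text for text, _ in units]
```

For example, jobs in the essex area reduces to 'jobs', 'essex' and 'area', while to be or not to be keeps all six terms because every one of them is insignificant.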