...

For a detailed overview of the scoring algorithm used by the Lucene search engine, see https://lucene.apache.org/core/3_6_0/scoring.html

 


...

Index structure

The XperienCentral search engine has a variable set of fields, depending on the types of documents that have been indexed. Below is a list of the most important fields, which are always part of the search index:

...

| Field Name | Description | Example Values |
| --- | --- | --- |
| Children | Contains URLs to the child pages of this document. | http://127.0.0.1/web/show/id=26111/langid=43/dbid=2/typeofpage=75501, 127.0.0.1:9000/web/show/id=26111/langid=43/channel=pdf |
| Contenttype | The content type. | Possible values include: page, element_holder, image, flash, product, jellyfishdownload, jellyfishdocument |
| Description | The description of the document, taken from the HTML description meta tag. | This combination enables continuous web innovation |
| Hostname | The hostname of the document. | 127.0.0.1 |
| Keyword | A keyword. | WebManager |
| Keywords | Meta keywords taken from the HTML keywords meta tag. | WebManager |
| Langid | The language ID of the document. | 43 (= Dutch), 42 (= English) |
| Location | The URL of the document. | http://127.0.0.1:8080/web/News/import-wcm.swf.htm |
| Longdate | The date the document was created (mainly relevant for Word documents or PDFs, less so for HTML pages). | 20080922102830396 |
| Modified | The date the document was last modified. | 20080922102830396 |
| Indexed | The date the document was first indexed. | 20080922102830396 |
| Pagepath | The combination of WebIDs that lead to the document. | p26111p70532 (Explanation: the document is below a subpage (id=70532), which is itself below the homepage (id=26111).) |
| Pagepath_00_name | The name of the root page of the document. | Home |
| Pagepath_00_url | The URL of the root page of the document. | http://127.0.0.1:8080/web/Home.htm |
| Pagepath_xx_name | The name of the level xx page that leads to the document. The range of xx is between 00 and the depth of the website. | DeveloperWeb |
| Pagepath_xx_url | The URL of the level xx page that leads to the document. | http://127.0.0.1:8080/web/DeveloperWeb.htm |
| WebID | The ID of the web initiative to which the document belongs. | 26111 |
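
The date fields (Longdate, Modified, Indexed) and the Pagepath field store compact encoded values. The following is a minimal sketch of how such values could be decoded on the client side. It assumes, based only on the example values above, that dates are 17-digit strings in the form year, month, day, hour, minute, second, millisecond, and that a Pagepath is a sequence of IDs each prefixed with the letter p. The function names are hypothetical and are not part of XperienCentral or Lucene.

```python
import re
from datetime import datetime


def parse_index_timestamp(value: str) -> datetime:
    """Decode a date field value such as '20080922102830396'.

    Assumed layout (inferred from the example, not confirmed):
    yyyyMMddHHmmss followed by three millisecond digits.
    """
    if len(value) != 17 or not (value.isascii() and value.isdigit()):
        raise ValueError(f"expected 17 digits, got {value!r}")
    # Parse the first 14 digits down to the second, then add milliseconds.
    base = datetime.strptime(value[:14], "%Y%m%d%H%M%S")
    return base.replace(microsecond=int(value[14:]) * 1000)


def parse_pagepath(value: str) -> list[int]:
    """Split a Pagepath value such as 'p26111p70532' into its IDs.

    Assumed layout (inferred from the example): one or more groups,
    each consisting of the letter 'p' followed by a numeric ID,
    ordered from the root page downwards.
    """
    if not re.fullmatch(r"(?:p\d+)+", value):
        raise ValueError(f"unexpected pagepath format: {value!r}")
    return [int(page_id) for page_id in re.findall(r"p(\d+)", value)]
```

With the example values from the table, `parse_index_timestamp("20080922102830396")` yields 22 September 2008, 10:28:30.396, and `parse_pagepath("p26111p70532")` yields the IDs 26111 and 70532, in that order. Whether the stored timestamps are in local time or UTC is not stated here, so the sketch returns a datetime without time zone information.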

...