A module for full text indexed searching based on Lucene.
ft:close() as empty-sequence()
Close the current Lucene document and flush it to disk. Subsequent calls to ft:index will write to a new Lucene document.
ft:facets($nodes as node()*, $dimension as xs:string) as map(*)
Return a map of facet labels and counts for the result of a Lucene query.
$nodes* | A sequence of nodes for which facet counts should be returned. If the nodes in the sequence resulted from different Lucene queries, their facet counts will be merged. If no node in the sequence has facets attached, or the sequence is empty, an empty map is returned. |
$dimension | The facet dimension. This should correspond to a dimension defined in the index configuration. |
ft:facets($nodes as node()*, $dimension as xs:string, $count as xs:integer?) as map(*)
Return a map of facet labels and counts for the result of a Lucene query.
$nodes* | A sequence of nodes for which facet counts should be returned. If the nodes in the sequence resulted from different Lucene queries, their facet counts will be merged. If no node in the sequence has facets attached, or the sequence is empty, an empty map is returned. |
$dimension | The facet dimension. This should correspond to a dimension defined in the index configuration. |
$count? | The number of facet labels to be returned. Facets with more occurrences in the result will be returned first. |
ft:facets($nodes as node()*, $dimension as xs:string, $count as xs:integer?, $paths as xs:string+) as map(*)
Return a map of facet labels and counts for the result of a Lucene query.
$nodes* | A sequence of nodes for which facet counts should be returned. If the nodes in the sequence resulted from different Lucene queries, their facet counts will be merged. If no node in the sequence has facets attached, or the sequence is empty, an empty map is returned. |
$dimension | The facet dimension. This should correspond to a dimension defined in the index configuration. |
$count? | The number of facet labels to be returned. Facets with more occurrences in the result will be returned first. |
$paths+ | For hierarchical facets, specify a sequence of paths leading to the position in the hierarchy you would like to get facet counts for. |
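The following is a minimal sketch of how facet counts might be read from a query result. The collection path, the `article` element and the `category` dimension are hypothetical; they stand in for whatever your own index configuration defines.

```xquery
xquery version "3.1";

(: Assumptions for illustration only: the collection path, the 'article'
   element and the 'category' facet dimension are hypothetical and must
   match the index configuration of your own collection. :)
let $hits := collection("/db/apps/example/data")//article[ft:query(., "lucene")]

(: Ask for at most 5 labels of the 'category' dimension. :)
let $facets := ft:facets($hits, "category", 5)

(: Turn each label/count entry of the returned map into an element. :)
return
    map:for-each($facets, function($label, $count) {
        <facet label="{$label}" count="{$count}"/>
    })
```

For a hierarchical dimension, the four-argument form would take the path as an extra argument, for example `ft:facets($hits, "date", 10, ("2020", "06"))`, where the `date` dimension and its path are again hypothetical.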
ft:field($node as node(), $field as xs:string) as xs:string*
Returns the value of a field attached to a particular node obtained via a full text search. Only fields listed in the 'fields' option of ft:query will be attached to the query result.
$node | the context node to check for attached fields |
$field | name of the field |
ft:field($node as node(), $field as xs:string, $type as xs:string) as item()*
Returns the value of a field attached to a particular node obtained via a full text search. Only fields listed in the 'fields' option of ft:query will be attached to the query result. Accepts an additional parameter naming the target type into which the field value should be cast. This is mainly relevant for fields whose type is not xs:string: because Lucene does not record type information, numbers or dates would otherwise be returned as strings.
$node | the context node to check for attached fields |
$field | name of the field |
$type | intended target type to cast the field value to. Casting may fail with a dynamic error. |
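As a rough sketch, field values might be retrieved as shown here. It assumes that the 'fields' option is passed to ft:query as a map, which the `item()?` type of `$options` permits and which recent eXist-db releases accept; check this against your version. The collection path, the `article` element and the field names `title` and `published` are hypothetical.

```xquery
xquery version "3.1";

(: Assumptions: hypothetical collection path, element name and field
   names; the 'fields' option is given as a map (verify for your version). :)
for $hit in collection("/db/apps/example/data")//article[
    ft:query(., "lucene", map { "fields": ("title", "published") })
]
return
    <hit>
        <!-- two-argument form: value returned as xs:string -->
        <title>{ ft:field($hit, "title") }</title>
        <!-- three-argument form: value cast to xs:date; the cast may
             raise a dynamic error if the stored value is not a date -->
        <published>{ ft:field($hit, "published", "xs:date") }</published>
    </hit>
```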
ft:get-field($path as xs:string*, $field as xs:string) as xs:string*
Retrieve the stored content of a field.
$path* | URI paths of documents or collections in database. Collection URIs should end with a '/'. |
$field | Name of the field whose stored content should be retrieved |
ft:has-index($path as xs:string) as xs:boolean*
Check if the given document has a Lucene index defined on it. This function returns true both for indexes created via collection.xconf and for index fields added manually to the document with ft:index.
$path | Full path to the resource to check |
ft:highlight-field-matches($node as node(), $field as xs:string) as element()?
Highlights matches for the last executed Lucene query within the value of a field attached to a particular node obtained via a full text search. Only fields listed in the 'fields' option of ft:query will be available for highlighting.
$node | the context node to check for attached fields which should be highlighted |
$field | name of the field to highlight |
ft:index($documentPath as xs:string, $solrExression as node()) as empty-sequence()
Index an arbitrary chunk of (non-XML) data with Lucene. Syntax is inspired by Solr.
$documentPath | URI path of document in database. |
$solrExression | XML syntax expected by Solr's add expression. The element should be called 'doc', e.g. <doc> <field name="field1">data1</field> <field name="field2" boost="value">data2</field> </doc> |
ft:index($documentPath as xs:string, $solrExression as node(), $close as xs:boolean) as empty-sequence()
Index an arbitrary chunk of (non-XML) data with Lucene. Syntax is inspired by Solr.
$documentPath | URI path of document in database. |
$solrExression | XML syntax expected by Solr's add expression. The element should be called 'doc', e.g. <doc> <field name="field1">data1</field> <field name="field2" boost="value">data2</field> </doc> |
$close | If true, close the Lucene document. Subsequent calls to ft:index will thus add to a new Lucene document. If false, the document remains open and is not flushed to disk. Call the ft:close function to explicitly close and flush the current document. |
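A minimal sketch of indexing a binary resource might look as follows. The document path, the field names and the field contents are hypothetical; in practice the text would be extracted from the binary resource beforehand.

```xquery
xquery version "3.1";

(: Hypothetical path of a non-XML resource stored in the database. :)
let $path := "/db/apps/example/docs/report.pdf"

(: Solr-style 'doc' element; field names and contents are illustrative. :)
let $fields :=
    <doc>
        <field name="title">Annual report</field>
        <field name="body">Plain text extracted from the PDF goes here.</field>
    </doc>

(: Third argument true(): close the Lucene document after adding the fields. :)
return ft:index($path, $fields, true())
```

Passing `false()` instead keeps the Lucene document open, so a later call to `ft:close()` is needed to flush it to disk.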
ft:index-keys-for-field($field as xs:string, $start-value as xs:string?, $function-reference as function(*), $max-number-returned as xs:int?) as item()*
Similar to the util:index-keys functions, but returns index entries for a field associated with a Lucene index.
$field | The name of the field |
$start-value? | Only keys starting with the given prefix are reported. |
$function-reference | A function reference. It can be an arbitrary user-defined function, but it should take exactly 2 arguments: 1) the current index key as found in the range index as an atomic value, 2) a sequence containing three int values: a) the overall frequency of the key within the node set, b) the number of distinct documents in the node set the key occurs in, c) the current position of the key in the whole list of keys returned. |
$max-number-returned? | The maximum number of keys to return |
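The callback contract described above can be sketched like this. The field name `title` and the prefix `lu` are hypothetical, and the callback parameters are left loosely typed because the exact atomic types passed in are not stated here.

```xquery
xquery version "3.1";

(: Callback with the two arguments described above:
   $key   - the current index key
   $stats - three values: overall frequency, number of distinct
            documents, position of the key in the returned list :)
declare function local:term($key as item(), $stats as item()*) as element(term) {
    <term frequency="{$stats[1]}" documents="{$stats[2]}" position="{$stats[3]}">{ $key }</term>
};

(: Report at most 10 keys of the hypothetical 'title' field that start with 'lu'. :)
ft:index-keys-for-field("title", "lu", local:term#2, 10)
```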
ft:optimize() as empty-sequence()
Calls Lucene's optimize method to merge all index segments into a single one. This is a costly operation and should not be used except for data sets which can be expected to remain unchanged for a while. The optimize will block the index for other write operations and may take some time. You need to be a user in group dba to call this function.
ft:query($nodes as node()*, $query as item()?) as node()*
Queries a node set using a Lucene full text index. A Lucene index must already be defined on the nodes: if no index is available on a node, nothing will be found. Indexes on descendant nodes are not used. The context of the Lucene query is determined by the given input node set. The query is specified either as a query string based on Lucene's default query syntax or as an XML fragment. See http://exist-db.org/lucene.html#N1029E for complete documentation.
$nodes* | The node set to search using a Lucene full text index which is defined on those nodes |
$query? | The query to search for, provided either as a string or text in Lucene's default query syntax or as an XML fragment to bypass Lucene's default query parser |
ft:query($nodes as node()*, $query as item()?, $options as item()?) as node()*
Queries a node set using a Lucene full text index. A Lucene index must already be defined on the nodes: if no index is available on a node, nothing will be found. Indexes on descendant nodes are not used. The context of the Lucene query is determined by the given input node set. The query is specified either as a query string based on Lucene's default query syntax or as an XML fragment. See http://exist-db.org/lucene.html#N1029E for complete documentation.
$nodes* | The node set to search using a Lucene full text index which is defined on those nodes |
$query? | The query to search for, provided either as a string or text in Lucene's default query syntax or as an XML fragment to bypass Lucene's default query parser |
$options? | An XML fragment containing options to be passed to Lucene's query parser. The following options are supported (a description can be found in the docs): <options> <default-operator>and|or</default-operator> <phrase-slop>number</phrase-slop> <leading-wildcard>yes|no</leading-wildcard> <filter-rewrite>yes|no</filter-rewrite> </options> |
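The sketch below shows one way the options fragment might be passed. The collection path and the `p` element are hypothetical, and a Lucene index on `p` is assumed to exist in the collection configuration.

```xquery
xquery version "3.1";

(: Options fragment using only the elements listed above. :)
let $options :=
    <options>
        <default-operator>and</default-operator>
        <phrase-slop>1</phrase-slop>
        <leading-wildcard>no</leading-wildcard>
        <filter-rewrite>yes</filter-rewrite>
    </options>

(: Hypothetical collection path and element name. With the default
   operator set to 'and', both terms must occur in a matching node. :)
return
    collection("/db/apps/example/data")//p[ft:query(., "lucene index", $options)]
```

The query is evaluated against the `p` elements themselves because indexes on descendant nodes are not used.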
ft:query-field($field as xs:string*, $query as item()) as node()*
Queries a Lucene field, which has to be explicitly created in the index configuration.
$field* | The Lucene field name. |
$query | The query to search for, provided either as a string or text in Lucene's default query syntax or as an XML fragment to bypass Lucene's default query parser |
ft:query-field($field as xs:string*, $query as item(), $options as node()?) as node()*
Queries a Lucene field, which has to be explicitly created in the index configuration.
$field* | The Lucene field name. |
$query | The query to search for, provided either as a string or text in Lucene's default query syntax or as an XML fragment to bypass Lucene's default query parser |
$options? | An XML fragment containing options to be passed to Lucene's query parser. The following options are supported (a description can be found in the docs): <options> <default-operator>and|or</default-operator> <phrase-slop>number</phrase-slop> <leading-wildcard>yes|no</leading-wildcard> <filter-rewrite>yes|no</filter-rewrite> </options> |
ft:remove-index($documentPath as xs:string) as empty-sequence()
Remove any (non-XML) Lucene index associated with the document identified by the path parameter. This function will only remove indexes which were manually created by the user via the ft:index function. Indexes defined in collection.xconf will NOT be removed. They are maintained automatically by the database. Please note that non-XML indexes will also be removed automatically if the associated document is deleted.
$documentPath | URI path of document in database. |
ft:score($node as node()) as xs:float*
Returns a computed relevance score for the given node. The score is the sum of all relevance scores provided by Lucene for the node and its descendants. In general, the score will be a number between 0.0 and 1.0 if the query had $node as context. If the query targeted multiple descendants of $node (e.g. 'title' and 'author' within a 'book'), the score will be the sum of all sub-scores and may thus be greater than 1.
$node | the context node |
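A common use of the score is to order hits by relevance. In this minimal sketch the collection path and the `book` element are hypothetical, and a Lucene index on `book` is assumed.

```xquery
xquery version "3.1";

(: Hypothetical collection path and element name. :)
for $book in collection("/db/apps/example/data")//book[ft:query(., "lucene")]
(: Relevance score computed for this hit. :)
let $score := ft:score($book)
order by $score descending
return
    <result score="{$score}">{ $book/title }</result>
```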
ft:search($path as xs:string*, $query as xs:string, $fields as xs:string*, $options as node()?) as node()
Search for (non-XML) data with Lucene.
$path* | URI paths of documents or collections in database. Collection URIs should end with a '/'. |
$query | query string |
$fields* | Fields to return in search results |
$options? | An XML fragment containing options to be passed to Lucene's query parser. The following options are supported (a description can be found in the docs): <options> <default-operator>and|or</default-operator> <phrase-slop>number</phrase-slop> <leading-wildcard>yes|no</leading-wildcard> <filter-rewrite>yes|no</filter-rewrite> </options> |
ft:search($path as xs:string*, $query as xs:string, $fields as xs:string*) as node()
Search for (non-XML) data with Lucene.
$path* | URI paths of documents or collections in database. Collection URIs should end with a '/'. |
$query | query string |
$fields* | Fields to return in search results |
ft:search($path as xs:string*, $query as xs:string) as node()
Search for (non-XML) data with Lucene.
$path* | URI paths of documents or collections in database. Collection URIs should end with a '/'. |
$query | query string |
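A minimal sketch of searching manually indexed, non-XML data follows. It assumes that fields named `title` and `body` were previously added to resources in the collection with ft:index; the collection path and both field names are hypothetical.

```xquery
xquery version "3.1";

(: Hypothetical collection path; note the trailing '/' for collections.
   The query uses Lucene's field:term syntax against the assumed 'body'
   field and asks for the assumed 'title' field in the search results. :)
ft:search("/db/apps/example/docs/", "body:report", ("title"))
```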