<xqdoc:xqdoc>
<xqdoc:module type="library">
<xqdoc:uri>http://exist-db.org/xquery/text</xqdoc:uri>
<xqdoc:comment>
<xqdoc:description>Extension functions for text searching</xqdoc:description>
</xqdoc:comment>
</xqdoc:module>
<xqdoc:functions>
<xqdoc:function>
<xqdoc:name>fuzzy-match-all</xqdoc:name>
<xqdoc:signature>text:fuzzy-match-all($a as node()*, $b as xs:string, ...) node()*</xqdoc:signature>
<xqdoc:comment>
<xqdoc:description>Fuzzy keyword search, which compares strings based on the Levenshtein distance (or edit distance). The function tries to match each of the keywords specified in the keyword string $b against the string value of each item in the sequence $a.</xqdoc:description>
</xqdoc:comment>
</xqdoc:function>
<xqdoc:function>
<xqdoc:name>fuzzy-match-any</xqdoc:name>
<xqdoc:signature>text:fuzzy-match-any($a as node()*, $b as xs:string, ...) node()*</xqdoc:signature>
<xqdoc:comment>
<xqdoc:description>Fuzzy keyword search, which compares strings based on the Levenshtein distance (or edit distance). The function tries to match any of the keywords specified in the keyword string $b against the string value of each item in the sequence $a.</xqdoc:description>
</xqdoc:comment>
</xqdoc:function>
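<!--
The fuzzy-match functions above rely on Levenshtein (edit) distance. As a rough illustration of that idea only, and not of eXist's actual implementation, the following Python sketch computes the distance and applies it to a keyword string. The names levenshtein and fuzzy_match, the whitespace tokenisation, and the max_distance threshold are assumptions made for this example; the documentation does not state how the similarity threshold is set.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance between a and b (insert, delete, substitute)."""
    # prev[j] is the distance between the previous prefix of a and the first j chars of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from the first i chars of a to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute, or keep if equal
        prev = curr
    return prev[-1]


def fuzzy_match(text: str, keywords: str, max_distance: int = 1,
                require_all: bool = True) -> bool:
    """Hypothetical helper: match keywords against the words of text by edit distance."""
    words = text.lower().split()
    hits = [any(levenshtein(k, w) <= max_distance for w in words)
            for k in keywords.lower().split()]
    # require_all mirrors fuzzy-match-all; otherwise behave like fuzzy-match-any.
    return all(hits) if require_all else any(hits)
```

With require_all set to True the sketch mirrors the "each of the keywords" wording of fuzzy-match-all; with False it mirrors the "any of the keywords" wording of fuzzy-match-any.
-->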
<xqdoc:function>
<xqdoc:name>fuzzy-index-terms</xqdoc:name>
<xqdoc:signature>text:fuzzy-index-terms($a as xs:string?) xs:string*</xqdoc:signature>
<xqdoc:comment>
<xqdoc:description>Compares the specified argument against the contents of the fulltext index. Returns a sequence of strings which are similar to the argument. Similarity is based on Levenshtein distance. This function may not be useful in its current form and is subject to change.</xqdoc:description>
</xqdoc:comment>
</xqdoc:function>
<xqdoc:function>
<xqdoc:name>text-rank</xqdoc:name>
<xqdoc:signature>text:text-rank($a as node()?) xs:double</xqdoc:signature>
<xqdoc:comment>
<xqdoc:description>This is just a skeleton for a possible ranking function. Don't use this.</xqdoc:description>
</xqdoc:comment>
</xqdoc:function>
<xqdoc:function>
<xqdoc:name>match-count</xqdoc:name>
<xqdoc:signature>text:match-count($a as node()?) xs:integer</xqdoc:signature>
<xqdoc:comment>
<xqdoc:description>Counts the number of fulltext matches within the nodes and subnodes in $a.</xqdoc:description>
</xqdoc:comment>
</xqdoc:function>
<xqdoc:function>
<xqdoc:name>index-terms</xqdoc:name>
<xqdoc:signature>text:index-terms($a as node()*, $b as xs:string?, $c as function, $d as xs:int) item()*</xqdoc:signature>
<xqdoc:comment>
<xqdoc:description>This function can be used to collect some information on the distribution of index terms within a set of nodes. The set of nodes is specified in the first argument $a. The function returns term frequencies for all terms in the index found in descendants of the nodes in $a. The second argument $b specifies a start string. Only terms starting with the specified character sequence are returned. If $a is the empty sequence, all terms in the index will be selected. $c is a function reference, which points to a callback function that will be called for every term occurrence. $d defines the maximum number of terms that should be reported. The function reference for $c can be created with the util:function function. It can be an arbitrary user-defined function, but it should take exactly 2 arguments: 1) the current term as found in the index as xs:string, 2) a sequence containing four int values: a) the overall frequency of the term within the node set, b) the number of distinct documents in the node set the term occurs in, c) the current position of the term in the whole list of terms returned, d) the rank of the current term in the whole list of terms returned.</xqdoc:description>
</xqdoc:comment>
</xqdoc:function>
<xqdoc:function>
<xqdoc:name>index-terms</xqdoc:name>
<xqdoc:signature>text:index-terms($a as node()*, $b as xs:QName+, $c as xs:string?, $d as function, $e as xs:int) item()*</xqdoc:signature>
<xqdoc:comment>
<xqdoc:description>This version of the index-terms function is to be used with indexes that were defined on a specific element or attribute QName. The second argument $b lists the QNames of the elements or attributes for which occurrences should be returned. Otherwise, the function behaves like the 4-argument version.</xqdoc:description>
</xqdoc:comment>
</xqdoc:function>
<xqdoc:function>
<xqdoc:name>highlight-matches</xqdoc:name>
<xqdoc:signature>text:highlight-matches($a as text()*, $b as function, $c as item()*) node()*</xqdoc:signature>
<xqdoc:comment>
<xqdoc:description>Highlights matching strings within text nodes that resulted from a fulltext search. When searching with one of the fulltext operators or functions, eXist keeps track of the fulltext matches within the text. Usually, the serializer marks those matches by enclosing them in an 'exist:match' element. One can then use an XSLT stylesheet to replace those match elements and highlight matches to the user. However, this is not always possible. Instead of using XSLT to post-process the serialized output, the highlight-matches function provides direct access to the matching portions of the text within XQuery. The function takes a sequence of text nodes as its first argument $a and a callback function (defined with util:function) as its second argument $b. $c may contain a sequence of additional values that will be passed to the callback function's third parameter. Text nodes without matches are returned as they are. However, if the text contains a match marker, the matching character sequence is reported to the callback function, and the result of the function call is inserted into the resulting node set where the matching sequence occurred. For example, you can use this to mark all matching terms with a <span class="highlight">abc</span>.</xqdoc:description>
</xqdoc:comment>
</xqdoc:function>
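<!--
The callback mechanism described for highlight-matches can be pictured with a small Python sketch. It is an illustration of the splicing behaviour only, under stated assumptions: matches are given as non-overlapping (offset, length) pairs, results are plain strings rather than nodes, and the names highlight_matches and as_span are hypothetical. The three callback parameters (matched string, text, extra values) follow the description above.

```python
from typing import Callable, Sequence

Span = tuple[int, int]  # (offset, length) of a match inside the text


def highlight_matches(text: str, spans: Sequence[Span],
                      callback: Callable[[str, str, Sequence], str],
                      extra: Sequence = ()) -> list[str]:
    """Replace each matched span by the callback result; keep other text as is."""
    if not spans:
        return [text]  # text without matches is returned unchanged
    pieces, pos = [], 0
    for offset, length in sorted(spans):
        if offset > pos:
            pieces.append(text[pos:offset])  # unmatched text before this match
        # Report the matching character sequence to the callback.
        pieces.append(callback(text[offset:offset + length], text, extra))
        pos = offset + length
    if pos < len(text):
        pieces.append(text[pos:])  # unmatched tail
    return pieces


def as_span(match: str, node_text: str, extra: Sequence) -> str:
    """Example callback: wrap the match in a span, class taken from extra if given."""
    css_class = extra[0] if extra else "highlight"
    return f'<span class="{css_class}">{match}</span>'
```

Calling highlight_matches("fuzzy search works", [(6, 6)], as_span) yields the leading text, the wrapped match, and the trailing text as three separate pieces.
-->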
<xqdoc:function>
<xqdoc:name>kwic-display</xqdoc:name>
<xqdoc:signature>text:kwic-display($a as text()*, $b as xs:positiveInteger, $c as function, $d as item()*) node()*</xqdoc:signature>
<xqdoc:comment>
<xqdoc:description>This function takes a sequence of text nodes in $a, containing matches from a fulltext search. It highlights matching strings within those text nodes in the same way as the text:highlight-matches function. However, only a defined portion of the text surrounding the first match (and possibly following matches) is returned. If the text preceding the first match is longer than the width specified in the second argument $b, it will be truncated to fill no more than (width - keyword-length) / 2 characters. Likewise, the text following the match will be truncated in such a way that the whole string sequence fits into width characters. The third parameter $c is a callback function (defined with util:function). $d may contain an additional sequence of values that will be passed to the last parameter of the callback function. Any matching character sequence is reported to the callback function, and the result of the function call is inserted into the resulting node set where the matching sequence occurred. For example, you can use this to mark all matching terms with a <span class="highlight">abc</span>. The callback function should take 3 or 4 arguments: 1) the text sequence corresponding to the match as xs:string, 2) the text node to which this match belongs, 3) the sequence passed as last argument to kwic-display. If the callback function accepts 4 arguments, the last argument will contain additional information on the match as a sequence of 4 integers: a) the number of the match if there's more than one match in a text node - the first match will be numbered 1; b) the offset of the match into the original text node string; c) the length of the match as reported by the index.</xqdoc:description>
</xqdoc:comment>
</xqdoc:function>
<xqdoc:function>
<xqdoc:name>kwic-display</xqdoc:name>
<xqdoc:signature>text:kwic-display($a as text()*, $b as xs:positiveInteger, $c as function, $d as function, $e as item()*) node()*</xqdoc:signature>
<xqdoc:comment>
<xqdoc:description>This function takes a sequence of text nodes in $a, containing matches from a fulltext search. It highlights matching strings within those text nodes in the same way as the text:highlight-matches function. However, only a defined portion of the text surrounding the first match (and possibly following matches) is returned. If the text preceding the first match is longer than the width specified in the second argument $b, it will be truncated to fill no more than (width - keyword-length) / 2 characters. Likewise, the text following the match will be truncated in such a way that the whole string sequence fits into width characters. The third parameter $c is a callback function (defined with util:function). In this version, the fourth parameter $d is a second function reference, and $e may contain an additional sequence of values that will be passed to the last parameter of the callback function. Any matching character sequence is reported to the callback function, and the result of the function call is inserted into the resulting node set where the matching sequence occurred. For example, you can use this to mark all matching terms with a <span class="highlight">abc</span>. The callback function should take 3 or 4 arguments: 1) the text sequence corresponding to the match as xs:string, 2) the text node to which this match belongs, 3) the sequence passed as last argument to kwic-display. If the callback function accepts 4 arguments, the last argument will contain additional information on the match as a sequence of 4 integers: a) the number of the match if there's more than one match in a text node - the first match will be numbered 1; b) the offset of the match into the original text node string; c) the length of the match as reported by the index.</xqdoc:description>
</xqdoc:comment>
</xqdoc:function>
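<!--
The truncation arithmetic described for kwic-display can be worked through in a short Python sketch. It is a simplified reading of the description, not eXist's code: it handles a single match, uses integer division for (width - keyword-length) / 2, and trims the preceding text whenever it exceeds that budget. The function name kwic and its parameters are hypothetical.

```python
def kwic(text: str, offset: int, length: int, width: int) -> tuple[str, str, str]:
    """Return (before, keyword, after) so that the three parts fit into width chars."""
    keyword = text[offset:offset + length]
    before = text[:offset]
    # Budget for the left context: (width - keyword-length) / 2, rounded down.
    left_budget = max((width - length) // 2, 0)
    if len(before) > left_budget:
        # Keep only the characters closest to the keyword.
        before = before[len(before) - left_budget:]
    # The right context receives whatever remains of the total width.
    right_budget = max(width - len(before) - length, 0)
    after = text[offset + length:][:right_budget]
    return before, keyword, after
```

For the sentence "the quick brown fox jumps over the lazy dog" with the match "fox" at offset 16 and a width of 15, the left budget is (15 - 3) / 2 = 6 characters, which leaves 6 characters for the text after the match.
-->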
<xqdoc:function>
<xqdoc:name>filter</xqdoc:name>
<xqdoc:signature>text:filter($a as xs:string, $b as xs:string) xs:string*</xqdoc:signature>
<xqdoc:comment>
<xqdoc:description>Filter substrings that match the regular expression $b in text $a.</xqdoc:description>
</xqdoc:comment>
</xqdoc:function>
<xqdoc:function>
<xqdoc:name>groups</xqdoc:name>
<xqdoc:signature>text:groups($a as xs:string, $b as xs:string) xs:string*</xqdoc:signature>
<xqdoc:comment>
<xqdoc:description>Tries to match the string in $a to the regular expression in $b. Returns an empty sequence if the string does not match, or a sequence whose first item is the entire string, and whose following items are the matched groups.</xqdoc:description>
</xqdoc:comment>
</xqdoc:function>
<xqdoc:function>
<xqdoc:name>groups</xqdoc:name>
<xqdoc:signature>text:groups($a as xs:string, $b as xs:string, $c as xs:string) xs:string*</xqdoc:signature>
<xqdoc:comment>
<xqdoc:description>Tries to match the string in $a to the regular expression in $b, using the flags specified in $c. Returns an empty sequence if the string does not match, or a sequence whose first item is the entire string, and whose following items are the matched groups.</xqdoc:description>
</xqdoc:comment>
</xqdoc:function>
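<!--
The return shape described for text:groups (empty sequence on failure, otherwise the entire string followed by the matched groups) can be sketched with Python's re module. This is only an analogy: eXist uses its own regular expression engine, the requirement that the whole string must match is an assumption drawn from the wording "first item is the entire string", and the single supported flag "i" (case-insensitive) is a hypothetical stand-in because the documentation does not list the flags accepted in $c.

```python
import re


def groups(text: str, pattern: str, flags: str = "") -> list[str]:
    """Return [] if text does not match, else [entire string, group 1, group 2, ...]."""
    re_flags = 0
    if "i" in flags:  # assumed flag letter, not taken from the documentation
        re_flags |= re.IGNORECASE
    m = re.fullmatch(pattern, text, re_flags)
    if m is None:
        return []
    # Groups that did not participate in the match are reported as empty strings.
    return [m.group(0), *(g if g is not None else "" for g in m.groups())]
```

For example, matching a date such as 2024-05-17 against a pattern with three capturing groups returns four items: the date itself, then year, month, and day.
-->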
<xqdoc:function>
<xqdoc:name>make-token</xqdoc:name>
<xqdoc:signature>text:make-token($a as xs:string) xs:string*</xqdoc:signature>
<xqdoc:comment>
<xqdoc:description>Splits the string $a into a sequence of tokens.</xqdoc:description>
</xqdoc:comment>
</xqdoc:function>
<xqdoc:function>
<xqdoc:name>match-all</xqdoc:name>
<xqdoc:signature>text:match-all($a as node()*, $b as xs:string+) node()*</xqdoc:signature>
<xqdoc:comment>
<xqdoc:description>Tries to match each of the regular expression strings passed in $b against the keywords contained in the fulltext index. The keywords found are then compared to the node set in $a. Every node containing ALL of the keywords is copied to the result sequence. By default, a keyword is considered to match the pattern if any substring of the keyword matches. To change this behaviour, use the 3-argument version of the function and specify flag 'w'. With 'w' specified, the regular expression is matched against the entire keyword; for example, 'explain.*' will match 'explained', but not 'unexplained'.</xqdoc:description>
</xqdoc:comment>
</xqdoc:function>
<xqdoc:function>
<xqdoc:name>match-all</xqdoc:name>
<xqdoc:signature>text:match-all($a as node()*, $b as xs:string+, $c as xs:string) node()*</xqdoc:signature>
<xqdoc:comment>
<xqdoc:description>Tries to match each of the regular expression strings passed in $b against the keywords contained in the fulltext index. The keywords found are then compared to the node set in $a. Every node containing ALL of the keywords is copied to the result sequence. By default, a keyword is considered to match the pattern if any substring of the keyword matches. To change this behaviour, specify flag 'w' in $c. With 'w' specified, the regular expression is matched against the entire keyword; for example, 'explain.*' will match 'explained', but not 'unexplained'.</xqdoc:description>
</xqdoc:comment>
</xqdoc:function>
<xqdoc:function>
<xqdoc:name>match-any</xqdoc:name>
<xqdoc:signature>text:match-any($a as node()*, $b as xs:string+) node()*</xqdoc:signature>
<xqdoc:comment>
<xqdoc:description>Tries to match each of the regular expression strings passed in $b against the keywords contained in the fulltext index. The keywords found are then compared to the node set in $a. Every node containing ANY of the keywords is copied to the result sequence. By default, a keyword is considered to match the pattern if any substring of the keyword matches. To change this behaviour, use the 3-argument version of the function and specify flag 'w'. With 'w' specified, the regular expression is matched against the entire keyword; for example, 'explain.*' will match 'explained', but not 'unexplained'.</xqdoc:description>
</xqdoc:comment>
</xqdoc:function>
<xqdoc:function>
<xqdoc:name>match-any</xqdoc:name>
<xqdoc:signature>text:match-any($a as node()*, $b as xs:string+, $c as xs:string) node()*</xqdoc:signature>
<xqdoc:comment>
<xqdoc:description>Tries to match each of the regular expression strings passed in $b against the keywords contained in the fulltext index. The keywords found are then compared to the node set in $a. Every node containing ANY of the keywords is copied to the result sequence. By default, a keyword is considered to match the pattern if any substring of the keyword matches. To change this behaviour, specify flag 'w' in $c. With 'w' specified, the regular expression is matched against the entire keyword; for example, 'explain.*' will match 'explained', but not 'unexplained'.</xqdoc:description>
</xqdoc:comment>
</xqdoc:function>
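<!--
The difference between the default substring matching and the 'w' flag, and between match-all and match-any, can be illustrated with Python's re module. This is a sketch of the semantics as described above, not of how eXist consults its fulltext index; the helper names keyword_matches and node_matches, and the representation of a node as a plain list of its keywords, are assumptions made for the example.

```python
import re


def keyword_matches(pattern: str, keyword: str, whole_word: bool = False) -> bool:
    """Substring match by default; with whole_word (flag 'w') match the entire keyword."""
    if whole_word:
        return re.fullmatch(pattern, keyword) is not None
    return re.search(pattern, keyword) is not None


def node_matches(keywords: list[str], patterns: list[str],
                 require_all: bool, whole_word: bool = False) -> bool:
    """require_all=True mirrors match-all, require_all=False mirrors match-any."""
    hits = (any(keyword_matches(p, k, whole_word) for k in keywords)
            for p in patterns)
    return all(hits) if require_all else any(hits)
```

With the pattern 'explain.*', the default mode accepts both 'explained' and 'unexplained', while the whole-keyword mode accepts only 'explained'.
-->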
<xqdoc:function>
<xqdoc:name>filter-nested</xqdoc:name>
<xqdoc:signature>text:filter-nested($a as node()*) node()*</xqdoc:signature>
<xqdoc:comment>
<xqdoc:description>Filters out all nodes in the node set $a that have descendant nodes in the same node set. This is useful if you run a combined query like //(a|b)[. &= $terms] and some 'b' nodes are nested within 'a' nodes, but you only want to see the innermost matches, i.e. the 'b' nodes, not the 'a' nodes containing 'b' nodes.</xqdoc:description>
</xqdoc:comment>
</xqdoc:function>
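<!--
The effect of filter-nested, keeping only the innermost matches, can be sketched in Python by identifying each node with a path string. This representation and the function name filter_nested are assumptions for illustration; eXist works on real node identities, not on path strings.

```python
def filter_nested(paths: list[str]) -> list[str]:
    """Drop every path that has a descendant path in the same list."""
    def has_descendant_in_set(path: str) -> bool:
        # A descendant path starts with the ancestor path followed by a slash.
        prefix = path.rstrip("/") + "/"
        return any(other != path and other.startswith(prefix) for other in paths)

    return [p for p in paths if not has_descendant_in_set(p)]
```

Given an 'a' node, a 'b' node nested inside it, and an unrelated 'b' node, only the two 'b' nodes remain, which matches the use case described above.
-->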
</xqdoc:functions>
</xqdoc:xqdoc>