The xmldb module
(2Q19)
The xmldb module (function namespace http://exist-db.org/xquery/xmldb) contains functions for manipulating database contents. The full list of functions and their documentation can be found in the Function Documentation Library. This article covers some of the highlights and main uses of this module.
Manipulating Database Contents
The xmldb functions can be used to create new database collections or documents.
To illustrate this, suppose we have a large file containing several RDF metadata records. Our application expects each record to have its own document, so we do not want to keep all the records in a single file and have to divide the document into smaller units. This can be done with the following XQuery:
xquery version "3.0";

declare namespace rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#";
import module namespace xmldb="http://exist-db.org/xquery/xmldb";

(: Log in as a dba user, then create the target collection /db/output :)
let $log-in := xmldb:login("/db", "admin", "")
let $create-collection := xmldb:create-collection("/db", "output")
(: Wrap each child of the top-level rdf:RDF element in its own document :)
for $record in doc('/db/records.rdf')/rdf:RDF/*
let $split-record :=
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
        {$record}
    </rdf:RDF>
(: The rdf:about URI is unique, so its MD5 hash is used as the file name :)
let $about := $record/@rdf:about
let $filename := util:hash($about/string(), "md5") || ".xml"
return
    xmldb:store("/db/output", $filename, $split-record)
Let's look at this example in some detail:
- First, since the functions xmldb:create-collection() and xmldb:store() require the user to be logged in as a member of the dba group, we must log in using xmldb:login().
- Once logged in, we can create a new sub-collection output using xmldb:create-collection().
- The for loop iterates over all child elements of the top RDF element.
- In each iteration, we use xmldb:store() to write the current child element, wrapped in a new rdf:RDF element, to a new document.
- Since a unique document name is required, we need a way to generate unique names. The URI contained in the rdf:about attribute is unique, so we compute an MD5 hash from it, append .xml, and use this as the document's name.
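One way to check the result of such a split is to compare the number of records in the source file with the number of documents in the target collection. The following is only a sketch: it assumes that /db/records.rdf and /db/output are the paths used in the example above, and that /db/output held no other documents before the split was run.

```xquery
xquery version "3.0";

declare namespace rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#";

(: Number of records in the original file (path taken from the example above) :)
let $source := count(doc('/db/records.rdf')/rdf:RDF/*)
(: Number of stored documents whose root element is rdf:RDF :)
let $stored := count(collection('/db/output')/rdf:RDF)
return
    <check source-records="{$source}"
           stored-documents="{$stored}"
           match="{$source eq $stored}"/>
```

If the two counts differ, one possible cause is that two records produced the same file name, so that a later call to xmldb:store() replaced an earlier document.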
Specifying the Input Document Set
A database can contain a virtually unlimited set of collections and documents. Four functions are available to restrict the input document set to a user-defined set of documents or collections: doc(), collection(), xmldb:document(), and xmldb:xcollection(). The first two are standard XPath functions; the other two are eXist-db specific extensions.
The differences between the XPath and the eXist-db specific functions are:
- doc() vs. xmldb:document()
  While doc() is restricted to a single document-URI argument, xmldb:document() accepts multiple document paths to be included in the input node set. Calling xmldb:document() without an argument includes every document in the database. Some examples:
  doc("/db/apps/demo/data/hamlet.xml")//SPEAKER
  xmldb:document('/db/test/abc.xml', '/db/test/def.xml')//title
- collection() vs. xmldb:xcollection()
  The collection() function specifies the collection of documents to be included in the query evaluation. By default, documents found in sub-collections of the specified collection are also included. For example, suppose we have a collection /db/test that contains two sub-collections, /db/test/abc and /db/test/def. In this case, the function call collection('/db/test') will include all of the resources found in /db/test, /db/test/abc and /db/test/def.
  The function xmldb:xcollection() does not include sub-collections.
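The difference between the two collection functions can be made visible by counting the documents each one returns. This is a minimal sketch that assumes the collection /db/test and its sub-collections from the example above exist and contain XML documents; the element and attribute names in the result are made up for illustration.

```xquery
xquery version "3.0";

import module namespace xmldb="http://exist-db.org/xquery/xmldb";

(: Documents in /db/test and in all of its sub-collections :)
let $recursive := collection('/db/test')
(: Documents stored directly in /db/test only :)
let $direct := xmldb:xcollection('/db/test')
return
    <counts recursive="{count($recursive)}" direct="{count($direct)}"/>
```

When the sub-collections contain documents, the recursive count should be larger than the direct count.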
Without a URI scheme in front (like file: or http:), eXist-db interprets the arguments to collection() and doc() as absolute or relative paths leading to a collection or document within the database. For example:
- doc("/db/collection1/collection2/resource.xml")
  This refers to a resource called resource.xml stored in /db/collection1/collection2.
- doc("resource.xml")
  This references a resource relative to the base URI property defined in the static XQuery context. This property contains an XML:DB URI pointing to the base collection (see below) for the current query context, for instance xmldb:exist:///db.
The base collection depends on how the query context was initialized. If you call a query
via the XML:DB API, the base collection is the collection from which the query service was
obtained. All relative URLs will be resolved relative to that collection. If a stored query is
executed via REST, the base collection is the collection in which the XQuery source resides.
In most other cases, the base collection will point to the database root /db.
As it might not always be clear what the base collection is, we recommend always using absolute paths. This allows the same query to be used through different interfaces.
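To see how a given interface resolves relative paths, a query can report its own static base URI and test both forms of a path. The sketch below uses only the standard functions static-base-uri() and doc-available(). The document path is the illustrative one used above and the result element names are made up, so both need to be adapted to a real database.

```xquery
xquery version "3.0";

<paths>
    <!-- Base URI against which relative paths are resolved -->
    <base-uri>{ static-base-uri() }</base-uri>
    <!-- An absolute path does not depend on the base collection -->
    <absolute>{ doc-available('/db/collection1/collection2/resource.xml') }</absolute>
    <!-- A relative path depends on how the query was invoked -->
    <relative>{ doc-available('resource.xml') }</relative>
</paths>
```

Running the same query through different interfaces, for example via REST and via the XML:DB API, may give different values for the base URI and for the relative lookup, while the absolute lookup should not change.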
You can also pass a full URI to the doc() function:
doc("http://localhost:8080/exist/servlet/db/test.xml")
The data at that URI is retrieved and stored in a temporary document in the database.