Versioning Extensions

(2Q19)


eXist-db provides a basic document versioning extension. This extension tracks all changes to a document by storing the differences between the revisions. Older versions can be restored on the fly and even queried in memory. There's also basic support to detect and intercept conflicting writes.

The versioning extension was created with human editors in mind. These will typically change documents through an editor or some form-based front-end. It should work well with documents up to several megabytes in size.

eXist-db has no control over the client. It does not know where a document update comes from and cannot directly communicate with the user. The versioning extension should therefore be seen more like a toolbox than a complete solution. Advanced functionality (merging, conflict resolution, etc.) will require support from the end-user applications.

Warning:

The versioning extension does not track machine-generated, node-level edits made through XUpdate or the XQuery update extensions.

Components

The versioning extension has the following components:

VersioningTrigger

A trigger (to be registered with a collection) that implements the core versioning functionality.

VersioningFilter

A serialization filter which adds special version attributes to every document. These attributes are used to detect conflicting writes.

versioning.xqm

An XQuery module which provides a function library for end-user applications, including functions like v:doc (restore a given revision on the fly).

Setup

Versioning can be enabled for individual collections in the collection hierarchy. To enable it, register a trigger with the topmost collection you want versioned. This is done through the same collection configuration files, collection.xconf, that are used for defining indexes.

Register the versioning trigger

To enable versioning for a collection, you have to edit the collection's collection.xconf configuration file. This file must be stored below the /db/system/config collection. As described in the Configuring Indexes document, the /db/system/config collection mirrors the hierarchical structure of the main collection tree.

Within collection.xconf, you must register the trigger class org.exist.versioning.VersioningTrigger for the create, update, delete, copy and move events:

<collection xmlns="http://exist-db.org/collection-config/1.0">
  <index/>
  <triggers>
    <trigger event="create,delete,update,copy,move" class="org.exist.versioning.VersioningTrigger">
      <parameter name="overwrite" value="yes"/>
    </trigger>
  </triggers>
</collection>

If you store the above document as /db/system/config/db/collection.xconf, it enables versioning for the entire database.

Warning:

A collection.xconf at a lower level in the hierarchy overrides any configuration at higher levels, including these trigger definitions. Triggers are not inherited from ancestor configurations. If the lower-level configuration doesn't define a trigger, the trigger map for that collection will be empty.

When working with nested collection configurations, you need to make sure that the trigger definitions are present in all collection.xconf files.
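
As a sketch of what this means in practice, suppose a hypothetical sub-collection /db/shakespeare needs its own index configuration. Its configuration file, stored as /db/system/config/db/shakespeare/collection.xconf, would have to repeat the trigger definition. The collection path and the placeholder index section are assumptions made for illustration only:

```xml
<collection xmlns="http://exist-db.org/collection-config/1.0">
  <index>
    <!-- index definitions specific to this sub-collection go here -->
  </index>
  <!-- Repeated here because triggers are not inherited from the parent configuration -->
  <triggers>
    <trigger event="create,delete,update,copy,move" class="org.exist.versioning.VersioningTrigger">
      <parameter name="overwrite" value="yes"/>
    </trigger>
  </triggers>
</collection>
```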

VersioningTrigger accepts one parameter, overwrite. If it is set to no, the trigger checks for potential write conflicts. For example, two users may open the same document and edit it at the same time. The first user saves their changes without the second user noticing. If eXist then allowed the second user to store their version, it would silently overwrite the modifications already committed by the first user.

The overwrite="no" setting prevents this. However, since eXist does not know where the conflicting document came from, all it can do is reject the write attempt and raise an error, which the client must then handle. Right now there are no clients that support this, and more work will be required in this area. However, clients can already use the supplied XQuery functions to check for write conflicts (see below).
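
For reference, a sketch of the trigger registration with conflict checking switched on. It is the collection.xconf shown earlier with only the parameter value changed:

```xml
<collection xmlns="http://exist-db.org/collection-config/1.0">
  <index/>
  <triggers>
    <trigger event="create,delete,update,copy,move" class="org.exist.versioning.VersioningTrigger">
      <!-- "no": reject writes that are based on an outdated revision -->
      <parameter name="overwrite" value="no"/>
    </trigger>
  </triggers>
</collection>
```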

Enabling the serialization filter

In order to detect conflicting writes, the versioning extension needs to keep track of the base revision to which changes were applied. It does this by inserting special metadata attributes into a document when it is retrieved from the database. For this purpose, a custom filter has to be registered with eXist's serializer.

This is done in the <serializer> section of the main configuration file, conf.xml. Add a <custom-filter> child element to the <serializer> element and set its class attribute to org.exist.versioning.VersioningFilter.
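
A minimal sketch of the resulting fragment follows. The attributes that <serializer> already carries in your conf.xml are omitted here and should be left unchanged:

```xml
<!-- conf.xml: keep the existing attributes of <serializer>; only the child element is new -->
<serializer>
  <custom-filter class="org.exist.versioning.VersioningFilter"/>
</serializer>
```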

eXist must be restarted for the versioning filter to become active.

Accessing the versioning information

The versioning extension uses the collection /db/system/versions to store base revisions and differences. The collection hierarchy below /db/system/versions mirrors the main collection tree. For each versioned resource, you'll find a document with the suffix .base, which contains the base revision (the first version of the document). Each revision is stored in a document whose name starts with the original document name and ends with the revision number, for instance hamlet.xml.35.

eXist provides an XQuery module to access the revision history or restore a given revision. For example, to view the history of a resource:

import module namespace v="http://exist-db.org/versioning";
v:history(doc("/db/shakespeare/plays/hamlet.xml"))

This returns an XML fragment like this:

<v:history>
  <v:document>/db/shakespeare/plays/hamlet.xml</v:document>
  <v:revisions>
    <v:revision rev="35">
      <v:date>2009-08-22T22:19:33.777+02:00</v:date>
      <v:user>admin</v:user>
    </v:revision>
    <v:revision rev="36">
      <v:date>2009-08-22T22:38:41.629+02:00</v:date>
      <v:user>admin</v:user>
    </v:revision>
  </v:revisions>
</v:history>

The most important function is v:doc, which restores an arbitrary revision of a document on the fly. You can use it much like the standard fn:doc function to query that revision. For example:

import module namespace v="http://exist-db.org/versioning";
v:doc(doc("/db/shakespeare/plays/hamlet.xml"), 35)//SPEECH[SPEAKER="HAMLET"]

This restores revision 35 of hamlet.xml and then finds all <SPEECH> elements with a <SPEAKER> called "HAMLET". Because the revision is rebuilt in memory, no indexes are available to the query engine when it processes a restored document.
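
The two functions can also be combined. The following sketch walks over the revision list returned by v:history and runs the same query against each restored revision. It relies only on v:history, v:doc and the result structure shown above; the <revision> wrapper element in the output is a hypothetical name chosen for this example:

```xquery
import module namespace v="http://exist-db.org/versioning";

(: Document path taken from the examples above :)
let $doc := doc("/db/shakespeare/plays/hamlet.xml")
for $revision in v:history($doc)/v:revisions/v:revision
let $rev := xs:integer($revision/@rev)
return
    (: <revision> is an illustrative wrapper, not part of the versioning module :)
    <revision rev="{$rev}" user="{normalize-space($revision/v:user)}">{
        (: Each revision is restored in memory, so this count runs without indexes :)
        count(v:doc($doc, $rev)//SPEECH[SPEAKER = "HAMLET"])
    }</revision>
```

Since every revision has to be rebuilt before it can be queried, a loop like this is likely to be slow for documents with a long history.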

Detecting write conflicts

To avoid a user overwriting the changes made by another user, eXist needs to know upon which revision the user's changes are based. To make this possible, the versioning filter adds a number of metadata attributes to the root element of a document when it is serialized (for instance when opening it in an editor). The inserted metadata attributes are all in a separate versioning namespace and will never be stored in the database. The following fragment shows the added attributes:

<PLAY xmlns:v="http://exist-db.org/versioning" v:revision="36" v:key="12343e4940b24" v:path="/db/shakespeare/plays/hamlet.xml">
  ...
</PLAY>

When eXist detects a potential write conflict, it cannot do more than reject the update and raise an error. However, there's an XQuery function to check if newer revisions exist. You pass it the revision number and the unique key as given in the versioning attributes of the document root element. If the function returns the empty sequence, no newer revisions exist in the database. Otherwise, the function returns the version documents for each newer revision.

import module namespace v="http://exist-db.org/versioning";
v:find-newer-revision(doc("/db/shakespeare/plays/hamlet.xml"), 36, "12343e4940b24")

Once you have made sure that you really want to store the document and overwrite any newer revisions, simply remove the version attributes from the root element before storing it.
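
A sketch of how an application might put these steps together is shown below. It assumes the edited document arrives with its version attributes intact and uses eXist's xmldb:store function to write it back. The helper local:strip-version-attributes, the variable $local:force-overwrite and the stand-in document are hypothetical names introduced for this example. As written, the query would replace the stored play with the stand-in, so treat it as an illustration only:

```xquery
import module namespace v="http://exist-db.org/versioning";

(: Hypothetical switch: set to true() once the user has confirmed the overwrite :)
declare variable $local:force-overwrite as xs:boolean := false();

(: Hypothetical helper: copies the root element without its versioning attributes :)
declare function local:strip-version-attributes($root as element()) as element() {
    element { node-name($root) } {
        $root/@*[namespace-uri(.) ne "http://exist-db.org/versioning"],
        $root/node()
    }
};

(: Stand-in for a document sent back by an editor; attribute values as in the fragment above :)
let $edited :=
    <PLAY xmlns:v="http://exist-db.org/versioning" v:revision="36" v:key="12343e4940b24"
          v:path="/db/shakespeare/plays/hamlet.xml">
        <TITLE>Edited by the user</TITLE>
    </PLAY>
let $newer := v:find-newer-revision(
    doc($edited/@v:path), xs:integer($edited/@v:revision), string($edited/@v:key))
return
    if (empty($newer)) then
        (: No newer revision: store the document as it is :)
        xmldb:store("/db/shakespeare/plays", "hamlet.xml", $edited)
    else if ($local:force-overwrite) then
        (: Deliberate overwrite: store without the version attributes :)
        xmldb:store("/db/shakespeare/plays", "hamlet.xml",
            local:strip-version-attributes($edited))
    else
        (: Conflict: hand the newer revisions back to the application :)
        $newer
```

Returning the newer revisions in the conflict case leaves the decision (merge, discard or overwrite) to the end-user application, in line with the toolbox approach described at the beginning.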