| Module | Status | Priority | Test Coverage | Progress | Who |
|---|---|---|---|---|---|
| 1. Document Storage | |||||
| 1.1. File size/complexity limits | Stable | x | Tested | 100% | wolf |
|
The numbering scheme at the core of eXist 1.0 limited the maximum size of a document that could be stored in the database. This is fixed in eXist 1.1 and later releases. |
|||||
| 1.2. Collection storage | Stable, but subject to redesign | High | No tests | 0% | wolf |
|
The current organization of collections and resources causes a number of problems with respect to (a) locking, (b) query performance, and (c) update performance. Right now, documents are tightly bound to the collection in which they are contained. Any operation on a document has to go through its parent collection. As a result, locking and access control become quite complex, as we need to take care of both the document and the collection. There's also a direct dependency between the size of a collection (in terms of the number of documents stored in it) and document update speed. If a collection has a large number of documents, removing a single document becomes very slow. This problem can be solved by physically decoupling documents and collections.
|
|||||
| 1.3. DOM Redesign | Open | Avg | N/A | 0% | |
|
eXist currently uses two DOM (document object model) implementations: one for nodes stored in the db, and a different one for in-memory nodes constructed during an XQuery. The two models cannot be mixed. To process an XQuery expression on an in-memory node, the query engine needs to create a temporary persistent copy. This costs performance and has caused stability issues in the past. To solve those problems, we are currently redesigning the in-memory DOM to implement the same interfaces as the persistent DOM. The query engine should then be able to mix nodes of both models. |
|||||
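The idea of giving both node models the same interface can be illustrated with a minimal sketch. All names here (`SketchNode`, `PersistentSketchNode`, `MemorySketchNode`, `DomSketch`) are hypothetical and do not correspond to eXist's actual DOM classes; the point is only that code written against the shared interface no longer needs a temporary persistent copy.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative only: processes nodes through a shared interface. */
final class DomSketch {

    /** Concatenates text values without caring where each node lives. */
    static String joinText(List<SketchNode> nodes) {
        StringBuilder sb = new StringBuilder();
        for (SketchNode node : nodes) {
            sb.append(node.textValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<SketchNode> mixed = Arrays.asList(
                new PersistentSketchNode(42L, "title", "stored "),
                new MemorySketchNode("note", "constructed"));
        System.out.println(joinText(mixed));
    }
}

/** Hypothetical common interface for both node models. */
interface SketchNode {
    String name();
    String textValue();
}

/** Stands in for a node read from database storage. */
final class PersistentSketchNode implements SketchNode {
    private final long storageAddress; // placeholder for a storage pointer
    private final String name;
    private final String text;

    PersistentSketchNode(long storageAddress, String name, String text) {
        this.storageAddress = storageAddress;
        this.name = name;
        this.text = text;
    }

    long storageAddress() { return storageAddress; }
    @Override public String name() { return name; }
    @Override public String textValue() { return text; }
}

/** Stands in for a node constructed in memory during a query. */
final class MemorySketchNode implements SketchNode {
    private final String name;
    private final String text;

    MemorySketchNode(String name, String text) {
        this.name = name;
        this.text = text;
    }

    @Override public String name() { return name; }
    @Override public String textValue() { return text; }
}
```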
| 1.4. Allow metadata to be associated with a document | Open | Avg | N/A | 0% | |
|
Metadata could include system properties like last-modification date or user-defined metadata. Preferably, metadata records should be ordinary XML documents. The format should not be restricted. |
|||||
| 2. Indexing | |||||
|
Since version 1.2, alternative index configuration methods are available which support the optimizer in rewriting a query for best performance. A new modularized indexing architecture allows arbitrary new indexes to be plugged into the indexing pipeline. An N-gram and a spatial index module were added as prototypes to test the new architecture. Other index types will be added in the future.
|
|||||
| 2.1. Full text indexing | Stable, but subject to redesign | Avg | Tested | 75% | perig, wolf |
|
(Align with the XQuery Full-text specification.) The interfaces to the indexing system need to be redesigned to support the query engine in index selection. This includes, for example, statistical information about the frequency of index items. The redesign of the indexing system is thus a necessary foundation for the query optimizer. The current architecture is also too limited with respect to text analysis. The general-purpose tokenizer is not suitable for language-dependent analysis. The plan is to replace these classes with Lucene's analyzers. Lucene offers a pluggable architecture in which multiple analyzers can be combined. |
|||||
| 2.2. Range indexing | Stable | x | Partially tested | 100% | |
|
No remarks available. |
|||||
| 2.3. Indexes on xml:id | Stable, but subject to redesign | Avg | Tested | 90% | |
|
Currently stored in the structural index. Should be moved to the range index. |
|||||
| 2.4. N-gram | Stable | x | Tested | 90% | wolf |
|
When dealing with texts in many non-European languages, the token-based full-text index produces insufficient results. Tokenization is currently based on Unicode code points. Most Chinese characters, for example, are thus stored as single tokens. Users have to abuse the near() or phrase() functions to search for character sequences consisting of more than one character, which is quite slow. It also means that real proximity searches are not available. An N-gram based index would be much more suitable for these languages. It would also allow additional functionality to be implemented, e.g. to deal with varying spellings. The main question is how the N-gram index would integrate conceptually with the existing full-text functions. An N-gram index based on the new modularized indexing architecture is in SVN trunk as of July 2007 and in eXist 1.2 and later releases. |
|||||
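To illustrate what an N-gram index stores, here is a minimal sketch of N-gram generation. It is not eXist's index code; the class and method names are assumptions made for this example. The text is split by Unicode code point, so a sequence of Chinese characters is cut into overlapping multi-character keys in the same way as the ASCII string used below.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative n-gram tokenizer; not eXist's actual index implementation. */
final class NGramSketch {

    /** Splits text into overlapping n-grams, counted in Unicode code points. */
    static List<String> ngrams(String text, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        int[] codePoints = text.codePoints().toArray();
        List<String> grams = new ArrayList<>();
        // Slide a window of n code points over the text, one position at a time.
        for (int i = 0; i + n <= codePoints.length; i++) {
            grams.add(new String(codePoints, i, n));
        }
        return grams;
    }

    public static void main(String[] args) {
        // A five-character string yields three overlapping trigrams.
        System.out.println(ngrams("exist", 3));
    }
}
```

A search for a multi-character sequence can then look up the corresponding N-grams directly instead of combining single-character tokens through a phrase search.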
| 2.5. Integration of other index types (e.g. Spatial indexes, external indexes) | Beta | Avg | N/A | 75% | |
|
eXist now offers spatial indexes in SVN trunk as of July 2007 and in version 1.2 and later releases. |
|||||
| 2.6. Index-support for order-by, distinct-values | Open | Avg | N/A | 0% | |
|
Order-by expressions and other functions that need to access atomized nodes are not supported by indexes. |
|||||
| 2.7. Collation-driven indexing | Open | Avg | N/A | 0% | |
|
May become part of the full-text index redesign. |
|||||
| 3. Transactions and Recovery | |||||
|
The journal log and the recovery manager should be stable and are covered by extensive tests. Recovery failures cannot be excluded entirely, however, because the tests can't reproduce every possible real-world scenario. Some steps also remain for eXist to become a fully transactional database system. Transaction support is currently limited to the functionality needed for crash recovery. Though we maintain transactions internally, they are currently not exposed to applications. Also, read operations are not transactional right now. In order to allow user-defined ACID transactions with support for rollback, all index files would need to be protected by the journaling log. The required functionality is basically available, but the feature is currently not regarded as high-priority. |
|||||
| 3.1. Journal log | Stable | x | Tested | 100% | |
|
No remarks available. |
|||||
| 3.2. Recovery | Stable | x | Tested | 100% | |
|
No remarks available. |
|||||
| 3.3. Internal transaction management | Stable | x | Tested | 100% | |
|
Transactions are maintained internally, but they are not exposed to applications. eXist does not yet support full ACID transactions. Read-only operations bypass the transaction system. |
|||||
| 3.4. User-definable transactions | Open | Low | N/A | 0% | |
|
Journal logs are limited to critical data required for recovery. No transaction rollbacks. |
|||||
| 4. Backup / Restore | |||||
| 4.1. Backup / Restore Tool | Stable | x | No tests | 100% | |
|
No remarks available. |
|||||
| 4.2. DB repair tool | Open | Avg | N/A | 0% | wolf |
|
Create a DB repair tool which can handle and resolve inconsistencies in the database structure. It should be possible to recreate the db if at least dom.dbx, collections.dbx and symbols.dbx are more or less intact. If a single document is damaged, it could be filtered out. We should also think about storing redundant copies of some of the vital data blocks, so we could reconstruct the collection hierarchy from dom.dbx alone. |
|||||
| 5. Configuration | |||||
| 5.1. Dynamic configuration of the database via Java Management Extensions (JMX) | Open | x | No tests | 0% | |
|
Main problem: access control and security. |
|||||
| 6. Node-level updates | |||||
| 6.1. XUpdate | Stable | x | Tested | 100% | |
|
No remarks available. |
|||||
| 6.2. XQuery Update Extensions | Stable, but subject to redesign | x | Tested | 75% | |
|
The W3C released yet another working draft on August 28, 2007, which seems to be closing in on a final specification. No merge plan yet. |
|||||
| 7. Access-Control | |||||
|
The currently implemented Unix-like access control scheme is sufficient to protect resources and collections in a multi-user environment. However, it might be too coarse-grained for some types of applications. A more dynamic ACL implementation could help here. Right now, security management forms part of the database core. This is unnecessary. A more modular architecture would allow different security managers to be plugged in. It would be the responsibility of the security manager implementation to handle ACLs. Since version 1.1, eXist supports the XACML standard for fine-grained access control to stored XQueries, Java classes etc. |
|||||
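As a rough illustration of why a Unix-like scheme is coarse-grained, the following sketch checks a request against owner, group, and other permission bits. It uses the generic Unix read/write/execute flags as an assumption; the flags eXist actually stores, and all names in the code, may differ.

```java
import java.util.Collections;
import java.util.Set;

/** Generic Unix-style permission check; an illustration, not eXist's security code. */
final class UnixPermissionSketch {
    static final int READ = 4;
    static final int WRITE = 2;
    static final int EXECUTE = 1; // assumed flag, following plain Unix conventions

    /**
     * Returns true if the user holds all wanted flags on a resource
     * with the given mode (e.g. 0640), owner, and owning group.
     */
    static boolean isAllowed(int mode, String owner, String group,
                             String user, Set<String> userGroups, int wanted) {
        int shift;
        if (user.equals(owner)) {
            shift = 6;          // owner bits
        } else if (userGroups.contains(group)) {
            shift = 3;          // group bits
        } else {
            shift = 0;          // bits for everyone else
        }
        int granted = (mode >> shift) & 7;
        return (granted & wanted) == wanted;
    }

    public static void main(String[] args) {
        Set<String> groups = Collections.singleton("editors");
        // Mode 0640: owner may read and write, group may read, others get nothing.
        System.out.println(isAllowed(0640, "wolf", "editors", "ann", groups, READ));
        System.out.println(isAllowed(0640, "wolf", "editors", "ann", groups, WRITE));
    }
}
```

Every user falls into exactly one of three classes here, so a rule such as "one additional user may write" cannot be expressed without changing group membership. That is the kind of case an ACL would cover.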
| 7.1. User management | Stable | x | No tests | 100% | |
|
No remarks available. |
|||||
| 7.2. Access control on resources and collections | Stable, but subject to redesign | Avg | No tests | 100% | |
|
Need more dynamic ACL structures that can adapt to varying requirements. |
|||||
| 7.3. Access control on stored XQueries, XQuery functions and modules | Stable | x | incomplete | 100% | |
|
No remarks available. |
|||||
| 7.4. Java binding | Stable | x | N/A | 100% | |
|
No remarks available. |
|||||
| 8. Schema Validation | |||||
| 8.1. Validate document against schema when indexing | Stable | x | No tests | 100% | |
|
No remarks available. |
|||||
| 8.2. Validate document after node-level updates | Open | Avg | N/A | 0% | |
|
No remarks available. |
|||||
| 8.3. Locate schemas and DTDs stored in database | Beta | High | x | 90% | |
|
No remarks available. |
|||||
| 8.4. Support for catalog files in database | Beta | High | x | 90% | |
|
No remarks available. |
|||||
| 8.5. Manual validation against schema | Beta | High | Tested | 75% | |
|
No remarks available. |
|||||
| 8.6. XQuery validation features | Open | Avg | N/A | 0% | |
|
No remarks available. |
|||||
| 8.7. Store PSVI with the node tree in the database | Open | Low | N/A | 0% | dizzzz |
|
No remarks available. |
|||||
| 8.8. Static typing based on PSVI | Open | Low | N/A | 0% | dizzzz |
|
No remarks available. |
|||||
| 8.9. Support for RelaxNG and Schematron | Open | Low | No tests | 0% | dizzzz |
|
No remarks available. |
|||||
| 9. XQuery | |||||
|
The XQuery engine as well as the standard function libraries should be updated to align with the XQuery 1.0 recommendation. Almost all core language features are implemented, excluding schema-related features, which are currently beyond eXist's scope. XQuery support in eXist is covered by the official W3C XQuery Test Suite (XQTS) 1.02. Implementing the official XQTS was a top priority in order to guarantee standard conformance and avoid future regressions. |
|||||
| 9.1. Core XPath and XQuery | Stable | x | tested | 100% | |
|
Updated to the XPath 2.0 and XQuery 1.0 recommendations. Stable, excluding schema-related features. |
|||||
| 9.2. XPath and XQuery atomic value types | Stable | Avg | tested | 99.4% | |
| 9.3. XPath and XQuery function libraries | Stable | High | tested | 99.4% | |
|
Updated to XPath 2.0 and XQuery 1.0 recommendations. Stable, excluding schema-related features. |
|||||
| 9.4. XPath and XQuery function libraries | Stable | High | tested | 99.4% | |
|
Updated to XPath 2.0 and XQuery 1.0 recommendations. |
|||||
| 9.5. XQuery serialization | Stable, but subject to redesign | Avg | tested | 80% | |
|
Though we implement most of the serialization options specified in the XQuery and XSLT serialization spec, some options need to be reworked and should be covered by tests. |
|||||
| 9.6. XQuery test suite – XQTS | Stable | x | N/A | 100% | |
| 9.7. XQuery Optimizer | Beta | High | tested | 80% | wolf, perig |
|
With the 1.2 release, eXist features a new query-rewriting optimizer. It analyzes the query at compile time and searches for optimizable subexpressions within the query tree. If it finds an optimizable expression, the optimizer will modify the query and wrap some special instructions around the optimizable code block. Together with the new indexing features (see blog article), the optimizer can achieve dramatic improvements. However, the optimizer is currently limited to predicate expressions. It does not optimize e.g. "where" clauses in a FLWOR statement. The query rewriting should thus be extended to recognize other types of expressions beyond predicate statements. In short, we need a better static analysis of the query. Based on eXist's current indexes, it is also often difficult to decide whether a certain optimization path leads to performance improvements or not. Better index statistics could help here. Also, there's a wide range of performance optimizations which could be applied if we had appropriate statistics on node distribution and frequency. |
|||||
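The rewriting step described above can be pictured with a toy expression tree. This is only a sketch under simplifying assumptions: the node kinds (`path`, `predicate`, `where`, `cmp`) and the `optimized` wrapper are invented for illustration and say nothing about how eXist's optimizer represents queries internally.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy query-tree rewriter; node kinds and names are hypothetical. */
final class RewriteSketch {

    /** Minimal expression node: a kind label plus child expressions. */
    static final class Expr {
        final String kind;
        final List<Expr> children = new ArrayList<>();

        Expr(String kind, Expr... kids) {
            this.kind = kind;
            for (Expr kid : kids) {
                children.add(kid);
            }
        }

        @Override
        public String toString() {
            if (children.isEmpty()) {
                return kind;
            }
            StringBuilder sb = new StringBuilder(kind).append('(');
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(children.get(i));
            }
            return sb.append(')').toString();
        }
    }

    /** Returns a copy of the tree in which every predicate is wrapped in an "optimized" node. */
    static Expr rewrite(Expr e) {
        Expr copy = new Expr(e.kind);
        for (Expr child : e.children) {
            copy.children.add(rewrite(child));
        }
        // Only predicates are recognized; "where" clauses pass through unchanged.
        return "predicate".equals(e.kind) ? new Expr("optimized", copy) : copy;
    }

    public static void main(String[] args) {
        Expr query = new Expr("path",
                new Expr("predicate", new Expr("cmp")),
                new Expr("where", new Expr("cmp")));
        System.out.println(rewrite(query));
    }
}
```

In this sketch the comparison inside the predicate is wrapped while the identical comparison inside the `where` node is left alone, which mirrors the limitation noted above.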
| 9.8. Error reporting | Stable, but subject to redesign | Avg | N/A | 75% | |
|
Error reports by the XQuery parser and compiler need to be improved. |
|||||
| 9.9. Make function calls tail-recursive | Stable | x | 70 | 100% | |
|
Recursive functions may trigger a StackOverflowError. We need to handle tail recursion. |
|||||
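One common way to run tail-recursive calls in constant stack space on the JVM is a trampoline. The sketch below shows the general technique only; it is not taken from eXist's query engine, and the names `Step`, `run`, and `sumTo` are made up for the example.

```java
import java.util.function.Supplier;

/** Trampoline sketch: tail calls are returned as deferred steps and run in a loop. */
final class TrampolineSketch {

    /** A step is either a finished value or a deferred next call. */
    static final class Step {
        final long value;
        final Supplier<Step> next; // null when the computation is done

        private Step(long value, Supplier<Step> next) {
            this.value = value;
            this.next = next;
        }

        static Step done(long value) {
            return new Step(value, null);
        }

        static Step call(Supplier<Step> next) {
            return new Step(0L, next);
        }
    }

    /** Runs steps iteratively, so the Java call stack does not grow with recursion depth. */
    static long run(Step step) {
        while (step.next != null) {
            step = step.next.get();
        }
        return step.value;
    }

    /** Sum of 1..n in tail-recursive form; the recursive call is deferred, not executed. */
    static Step sumTo(long n, long acc) {
        if (n == 0) {
            return Step.done(acc);
        }
        return Step.call(() -> sumTo(n - 1, acc + n));
    }

    public static void main(String[] args) {
        // One million nested calls would likely overflow the stack if made directly.
        System.out.println(run(sumTo(1_000_000L, 0L)));
    }
}
```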
| 9.10. XQuery Debugger | Open | High | N/A | 0% | ljo |
|
Remote debugging protocol. Command-line debugger similar to jdb as a prototype. Decide which functionality to expose. |
|||||
| 9.11. Drop and deprecate xmldb:collection() | Stable | x | tested | 100% | delirium |
| 9.12. Move fn:document() to xmldb:document() and deprecate | Stable | x | tested | 100% | perig |
| 10. XInclude | Stable | Low | No tests | 80% | |
|
XInclude expansion happens at serialization time. Queries across the included document fragments are not possible. Stable, but limited. |
|||||
| 11. Interfaces | |||||
| 11.1. XML:DB API | Stable | x | Tested | 100% | |
|
No remarks available. |
|||||
| 11.2. XML-RPC | Stable | x | Partially tested | 100% | |
|
Exposes the entire database functionality. |
|||||
| 11.3. REST | Stable, but subject to redesign | Low | Partially tested | 90% | delirium |
|
Does not cover administrative functions, e.g. user-management and permissions. Stable, but further functionality could be exposed. |
|||||
| 11.4. SOAP | Stable | Low | No tests | 90% | |
| 11.5. Cocoon Integration | Stable | x | No tests | 100% | |
|
General functionality tests are required. |
|||||
| 11.6. XQJ XQuery API for Java (JSR-225) | Beta | Low | N/A | 70% | allad, perig, ljo |
|
Could be a simpler alternative to the now somewhat bloated XML:DB API |
|||||
| 11.7. XForms filter | Beta | Low | N/A | 80% | delirium |
|
No remarks available. |
|||||
| 12. Documentation | |||||
| 12.1. XQuery on the Web | Partial | Low | N/A | 90% | |
|
Should explain in more depth how one can write webapps in XQuery, using the XQueryGenerator with Cocoon or stored XQueries. |
|||||
| 12.2. XQuery stored modules | Stable | x | N/A | 100% | |
|
Calling XQuery scripts stored in the DB; importing stored modules into a query passed to the DB. |
|||||
| 12.3. WebDAV | Beta | Avg | N/A | 90% | |
|
No remarks available. |
|||||
| 12.4. Deployment | Stable | x | N/A | 100% | |
|
Integration with a servlet engine, Cocoon, stand-alone server, embedded use. |
|||||
| 12.5. Index creation, index configuration and query rewriting | Stable | x | N/A | 100% | |
|
No remarks available. |
|||||
| 12.6. Validation | Stable | x | N/A | 100% | dizzzz |
|
No remarks available. |
|||||
| 12.7. Trigger | Beta | Low | N/A | 0% | delirium |
|
No remarks available. |
|||||
| 12.8. Searchable Documentation | Incomplete | Avg | N/A | 70% | |
|
XQuery functions search was improved. Missing: search in the documentation. All docs are available in XML docbook. |
|||||
| 12.9. XQDoc integration | Alpha | High | Partially tested | 60% | ljo, wolf |
|
Migrate the function documentation to XQDoc. Use XQDoc to better document all XQuery examples. |
|||||
| 13. Releases | |||||
| 13.1. 1.2 | Stable | High | N/A | 99% | |
|
Essential release with new features from 2006–2007, replacing the 1.0 line. The last release based on Java 1.4. |
|||||
| 13.2. 1.4 | Open | Low | N/A | 0% | |
|
Based on Java 5. |
|||||
| 14. Other Tasks | |||||
| 14.1. I18n | Open | Low | N/A | 20% | |
|
Provide translations for error messages, console outputs etc. At least, resource bundles should be used, so others can translate them if they want. |
|||||
| 14.2. Clean up/upgrade libraries | Beta | Low | N/A | 60% | |
|
All libraries included with eXist need to be checked. |
|||||
| 14.3. Move to Java 5 | Open | Low | N/A | 0% | |
|
Change the build target for development branch 1.3 to Java 5 after eXist version 1.2 is released. This will make a number of things easier, especially since the JMX monitors and JUnit 4 tests already require Java 5. |
|||||
| 14.4. Move to ANTLR3 parser | Unstable | High | N/A | 60% | ljo |
|
Change the parser to ANTLR3, which performs better, supports LL(*) lookahead, and handles whitespace better than ANTLR2, which we currently use. Maybe use gUnit for testing? |
|||||
| 14.5. Move to AtomicWiki | Stable | High | N/A | 100% | wolf |
|
Change from the current spam-ridden and unmaintained wiki to our own Atom-based AtomicWiki. |
|||||
| Percentage | Description |
|---|---|
| 0 | work not started |
| 20 | 1-20 Percentage of completion |
| 40 | 21-40 Percentage of completion |
| 60 | 41-60 Percentage of completion |
| 80 | 61-80 Percentage of completion |
| 99 | 81-99 Percentage of completion |
| Done | 100 Percentage of completion |

| Priority | Description |
|---|---|
| 1. Highest | |
| 2. High | |
| 3. Avg | |
| 4. Low | |
| 5. x | |