Stanford CoreNLP annotation pipeline XQuery Module
An XQuery module that integrates the Stanford CoreNLP annotation pipeline library
into eXist-db. The package can be installed via the package manager in the eXist-db dashboard, or you can build
it yourself.
Examples
The module currently supports creating a Named Entity Recognition (NER) classifier model and running some of the pipeline tools, including the NER classifier, on your documents.
Named entity recognition and classification
To create an NER classifier based on your own document's data, follow these steps:
- Upload your word processor document for tokenization and formatting ("tokenize and format"). You receive a spreadsheet in return.
- Annotate the spreadsheet you received in the previous step.
  - If you do not annotate the whole text, create a new spreadsheet document that contains only the annotated part.
- Upload the annotated spreadsheet to train the classifier model ("train the classifier"). The classifier model is returned to you within a minute or two, depending on the size of the text you provided.
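
For orientation, Stanford NER classifiers are usually trained on plain tab-separated text with one token and its label per line, where unlabelled tokens carry the background label `O`. The sketch below shows how annotated rows could map to that format. It is only an illustration: the two-column layout (token, label), the helper name `rows_to_ner_tsv`, and the sample labels are assumptions, and this module may well perform the conversion for you when you upload the spreadsheet.

```python
def rows_to_ner_tsv(rows, background="O"):
    """Turn (token, label) rows into token<TAB>label lines.

    Hypothetical helper: assumes each spreadsheet row holds one token
    and an optional label; rows without a label get the background label.
    """
    lines = []
    for token, label in rows:
        token = token.strip()
        if not token:
            continue  # skip empty spreadsheet rows
        lines.append(f"{token}\t{label.strip() or background}")
    return "\n".join(lines) + "\n"


# Made-up sample rows standing in for the annotated part of a spreadsheet.
sample = [
    ("Ada", "PERSON"),
    ("Lovelace", "PERSON"),
    ("visited", ""),
    ("London", "LOCATION"),
    (".", ""),
]

tsv = rows_to_ner_tsv(sample)
print(tsv, end="")
```

This also shows why a partially annotated text should be trimmed before training: any token left without a label would otherwise be treated as a confirmed non-entity.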