XWiki and semantic technologies

01 Apr 2011 5 min read

At XWiki we do research and take part in several research projects. This gives us the opportunity to experiment with the latest technologies and to develop prototypes that will eventually improve XWiki products.

One of these projects is SCRIBO: Semi-automatic and Collaborative Retrieval of Information Based on Ontologies. The project is now over; its goal was to provide algorithms and collaborative free software for the automatic extraction of knowledge from digital documents.

Thanks to SCRIBO we were able to develop the annotation system that has been available in XWiki Enterprise since the early versions of the 2.x series. This feature was described in a previous post, which includes a video recording of the presentation Anca gave at FOSDEM 2011.

Besides annotations, we developed a backend based on the UIMA architecture that lets us plug in semantic analysis components, analyze the content of the wiki, and display the results as "automatically generated" annotations.

With this backend it is possible to enrich the wiki with information that is implicitly present in the content (text and attachments) but not directly exploitable by the system.

scribo1.png

Extracted information is stored in a knowledge base using RDF. This allows developers to use the backend to perform complex semantic queries on the extracted content (e.g., all the documents, whether pages or attachments, that mention a city).
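To give a feel for what such a query looks like, here is a minimal sketch in plain Python. It is not the SCRIBO backend or its API: the triples, the `ex:` vocabulary, the document identifiers and the helper names `match` and `documents_mentioning` are all made up for illustration. It only shows how "all the documents that mention a city" boils down to matching a couple of triple patterns.

```python
from typing import Iterable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical knowledge base. The identifiers and the "ex:" vocabulary
# are assumptions for this sketch, not the schema used by SCRIBO.
TRIPLES: Set[Triple] = {
    ("page:Main.Travel", "rdf:type", "ex:Page"),
    ("page:Main.Recipes", "rdf:type", "ex:Page"),
    ("attachment:Main.Travel/report.pdf", "rdf:type", "ex:Attachment"),
    ("page:Main.Travel", "ex:mentions", "entity:Paris"),
    ("page:Main.Recipes", "ex:mentions", "entity:Basil"),
    ("attachment:Main.Travel/report.pdf", "ex:mentions", "entity:Lyon"),
    ("entity:Paris", "rdf:type", "ex:City"),
    ("entity:Lyon", "rdf:type", "ex:City"),
    ("entity:Basil", "rdf:type", "ex:Plant"),
}


def match(triples: Iterable[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def documents_mentioning(triples: Iterable[Triple], entity_type: str) -> List[str]:
    """Documents (pages or attachments) that mention an entity of the given type."""
    triples = list(triples)
    # Pattern 1: ?entity rdf:type <entity_type>
    entities = {s for s, _, _ in match(triples, p="rdf:type", o=entity_type)}
    # Pattern 2: ?doc ex:mentions ?entity, joined on ?entity
    docs = {s for s, _, o in match(triples, p="ex:mentions") if o in entities}
    return sorted(docs)


if __name__ == "__main__":
    for doc in documents_mentioning(TRIPLES, "ex:City"):
        print(doc)
```

A real RDF store would express the same join declaratively, typically in SPARQL, instead of with hand-written loops.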

scribo2.png

The quality and the detail of the extracted knowledge depend on the semantic analysis components that are plugged into the framework. This is easily configurable from within XWiki, and the UIMA architecture makes it easy to reuse existing components available on the market.
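The idea of pluggable analysis components can be illustrated with a small sketch. This is plain Python, not UIMA's Java API and not the code we shipped: `Annotation`, `year_analyzer`, `make_gazetteer_analyzer` and `run_pipeline` are hypothetical names, and the two analyzers are deliberately naive. The point is only that each component turns text into annotations, and that the set of components is a matter of configuration.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Annotation:
    start: int   # offset of the first character in the analyzed text
    end: int     # offset just past the last character
    label: str   # kind of information found, e.g. "City"
    text: str    # the annotated span itself


# A component is anything that maps text to a list of annotations.
Analyzer = Callable[[str], List[Annotation]]


def year_analyzer(text: str) -> List[Annotation]:
    """Naive component: tag four-digit years from 1900 to 2099."""
    return [Annotation(m.start(), m.end(), "Year", m.group())
            for m in re.finditer(r"\b(?:19|20)\d{2}\b", text)]


def make_gazetteer_analyzer(label: str, names: List[str]) -> Analyzer:
    """Build a component that tags exact occurrences of known names."""
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b")

    def analyze(text: str) -> List[Annotation]:
        return [Annotation(m.start(), m.end(), label, m.group())
                for m in pattern.finditer(text)]

    return analyze


def run_pipeline(text: str, analyzers: List[Analyzer]) -> List[Annotation]:
    """Run every configured component and merge results in document order."""
    annotations: List[Annotation] = []
    for analyze in analyzers:
        annotations.extend(analyze(text))
    return sorted(annotations, key=lambda a: (a.start, a.end))


if __name__ == "__main__":
    pipeline = [year_analyzer, make_gazetteer_analyzer("City", ["Brussels", "Paris"])]
    for a in run_pipeline("Anca spoke at FOSDEM 2011 in Brussels.", pipeline):
        print(a.label, a.text, a.start, a.end)
```

Swapping a naive component for a better one changes the quality of the annotations without touching the rest of the pipeline, which is the property the backend relies on.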

Now that the SCRIBO project is over, we will work to make these prototypes part of the standard XWiki distribution and to empower XWiki with semantic technologies. Stay tuned!
