This section explains how to get started with Xemeiah, either as an XSLT Processor or as a complete XML Database & WebServer.
See http://sourceforge.net/projects/xemeiah/ for more information on Xemeiah.
Since 0.5.1, a repository is available for both Debian unstable and Ubuntu karmic :
Add the following line to your /etc/apt/sources.list for Debian :
deb http://xemeiah.sf.net/debian unstable main
or for Ubuntu :
deb http://xemeiah.sf.net/debian karmic main
Then install one of the packages (using apt-get install or your preferred dpkg frontend) :
xemeiah-xsl for the XSL-only processor
xemeiah-webserver for the Xemeiah WebServer suite
xemeiah-media-player for the Web-based Xemeiah Media Player
No other operating system or Linux distribution (RedHat, Windows, *BSD, ...) is supported yet, but help is welcome !
The freshest sources are always retrievable from the SVN repository :
See the INSTALL file for further information on the compile/install procedure.
To process an XSL stylesheet, just run :
xem xsl 'stylesheet.xsl' 'file.xml'
xem writes the result to stdout, unless an xsl:result-document has been specified in the stylesheet.
One can redirect output to a file using :
xem xsl 'stylesheet.xsl' 'file.xml' > result.html
To use EXSLT extensions, add the 'exslt' module to the parameters :
xem xsl --module exslt 'stylesheet.xsl' 'file.xml'
Xemeiah WebServer uses the 'persistence' library as its command-line handler. As a result, the first argument provided to xem must be pers or persistence.
Xemeiah WebServer also uses XProcessor libraries : webserver, xemfs, xemprocessor, ...
xem pers --store='path-to-store' format
If no --store argument is provided, then the default store file used is xem-main.xem.
The services to run are set in the procedure file defined in procedure-aliases.xml, for alias
One can edit their own version of the startup procedure and define it in
First connect to http://localhost:1789/browse.
As no collection is set yet, the collection configuration page is shown by default. Set the path to the Media Collection and start scanning files.
Since the beginning of the project, each minor version series had the following focus :
Introducing Xemeiah's Document memory model (page-based segmentation, ease of binary serialization, COW mechanism).
DOM implemented using stack-optimized references to these large page-based documents.
Introducing XPath's binary format (called Xem::XPathSegment & Xem::XpathStep).
Introducing generic XML-based recursive processing, with stacked garbage collection.
Demonstrating the capabilities of Xem::XProcessor, the generic XML processor.
Demonstrating large-document capabilities (> 4 GBytes)
As of 0.5.1, the following development roadmap is planned :
Bindings to other languages are planned :
Java : using JNI (Java Native Interface), provide an org.w3c.dom.* implementation named org.xemeiah.dom.*. The main advantage is that Java code benefits from the memory optimizations of the C++ implementation when handling large documents.
org.xemeiah.dom.* may include :
DOM Node, Element, Attribute, ...
XPath implementation (standardized as org.w3c.dom.xpath.*)
Python bindings : if someone is interested...
0.5.x will also include a large cleanup of the WebServer configuration and bootstrap procedure, including the xem-standard directory, for a more flexible, modular structure of the Xem XML code provided for the WebServer to work.
A security model must be introduced for two different aspects :
These version series will focus on extending Persistence Layer design & functionalities, including :
Crash recovery :
Branch Commit & Merge :
Xem::NetStore and distributed computing : the Xem::BranchManager model allows extensions to a distributed network of Xemeiah servers, with automatic synchronization between server nodes. A technology preview is planned for these releases.
This section details Xemeiah's main concepts and overall internal architecture. It may be a prerequisite for starting to develop with Xemeiah's framework.
Detailed and up-to-date Xemeiah API documentation may be found here : http://xemeiah.sourceforge.net/doc/html.
Xemeiah's foundation class is Xem::Store, which provides access to most of Xemeiah's core functionalities.
Xem::Store is in charge of :
providing and bookkeeping lower-level resources, such as low-level memory management,
referencing all XML QNames (markup names and namespaces) in use, using the Xem::KeyCache class,
referencing and garbage-collecting all instantiated documents, directly or using its dedicated Xem::BranchManager class.
Xem::Document is the class instantiated each time an XML document is opened. This class provides access to the optimized in-memory XML document structure, and provides the document-wide functionalities necessary to build up the DOM layer.
The Xem::Document class uses a specialized Xem::DocumentAllocator class for in-document memory management.
Xemeiah's overall design is based on the idea that references shall be instantiated on the stack, with a short lifetime, and that access to the contents pointed to by these references shall be handled by a fast low-level layer.
As a result, a Xem::NodeRef contains no information except an internal pointer (the Xem::SegmentPtr), which is used by the Xem::DocumentAllocator layer to fetch the actual data for this node (the ElementSegment structure for an element, the Xem::AttributeSegment structure for an attribute).
This implies that each Xem::NodeRef must be instantiated with a proper reference to a Xem::Document, and this reference shall remain unchanged for the whole Xem::NodeRef lifetime.
Note : Because attributes are always retrieved from elements, a Xem::AttributeRef reference stores two pointers : one providing access to the attribute contents, and one referencing the element which holds this attribute.
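The reference model described above can be illustrated with a minimal sketch. Every type below (Document, ElementRef, AttributeRef, the segment structs and the integer SegmentPtr) is a hypothetical stand-in written for this illustration, not Xemeiah's actual classes : the point is only that a reference carries a document binding plus one or two pointers, and that all node data is fetched through the document.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; NOT Xemeiah's real classes.
using SegmentPtr = std::uint64_t;   // 1-based index of a segment, 0 means "null"

struct ElementSegment {             // the actual data of one element
    std::string name;
    SegmentPtr firstAttribute;
};

struct AttributeSegment {           // the actual data of one attribute
    std::string name;
    std::string value;
};

// Plays the role of the document and its allocator: it owns every segment
// and resolves a SegmentPtr into the data it designates.
class Document {
public:
    SegmentPtr addAttribute(std::string name, std::string value) {
        attributes_.push_back({std::move(name), std::move(value)});
        return attributes_.size();
    }
    SegmentPtr addElement(std::string name, SegmentPtr firstAttribute) {
        elements_.push_back({std::move(name), firstAttribute});
        return elements_.size();
    }
    const ElementSegment& element(SegmentPtr p) const {
        assert(p != 0 && p <= elements_.size());
        return elements_[static_cast<std::size_t>(p - 1)];
    }
    const AttributeSegment& attribute(SegmentPtr p) const {
        assert(p != 0 && p <= attributes_.size());
        return attributes_[static_cast<std::size_t>(p - 1)];
    }
private:
    std::vector<ElementSegment> elements_;
    std::vector<AttributeSegment> attributes_;
};

// A reference holds no node data: only its document and one pointer.
class ElementRef {
public:
    ElementRef(const Document& doc, SegmentPtr ptr) : doc_(doc), ptr_(ptr) {}
    const std::string& name() const { return doc_.element(ptr_).name; }
    SegmentPtr firstAttribute() const { return doc_.element(ptr_).firstAttribute; }
    const Document& document() const { return doc_; }
    SegmentPtr pointer() const { return ptr_; }
private:
    const Document& doc_;   // a C++ reference member cannot be rebound
    SegmentPtr ptr_;
};

// An attribute reference stores two pointers: the attribute and its owner element.
class AttributeRef {
public:
    AttributeRef(const ElementRef& owner, SegmentPtr attributePtr)
        : doc_(owner.document()), elementPtr_(owner.pointer()), attributePtr_(attributePtr) {}
    const std::string& value() const { return doc_.attribute(attributePtr_).value; }
    ElementRef element() const { return ElementRef(doc_, elementPtr_); }
private:
    const Document& doc_;
    SegmentPtr elementPtr_;
    SegmentPtr attributePtr_;
};
```

In this sketch the document binding is a C++ reference member, which makes the "unchanged for the whole lifetime" constraint something the compiler enforces. The references are cheap to copy and are meant to live on the stack for a short time, while the segments stay inside the document.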
Attributes stored inside of Xemeiah are not restricted to base types (String, Number, Integer, QNames, …).
Xemeiah offers the opportunity to build feature-rich typed attributes which can store a large amount of data.
As a result, an element can contain various attributes which share the same name but have different types (see next section on XPath processing for further details).
For example, the Xem::SKMapRef class and its subclasses provide access to a skip-list-based implementation of various associative containers, each of which is stored as an attribute.
This includes :
hash-based maps to other elements (see Xem::ElementMapRef and Xem::ElementMultiMapRef)
lists of parsed QNames (Xem::QNameListRef)
Binary Large Object (BLOB) with fast seeking algorithm (Xem::BlobRef)
Xem::XPathParser is the class responsible for parsing, validating and optimizing XPath expressions. Each parsed XPath expression is stored in memory using Xem::XPathSegment structures.
An XPath expression can (and generally will) be stored as an attribute, on the same element which provided the XPath expression as a string attribute.
Let's consider the following example :
<xsl:if test="count(*) > 3"> ... </xsl:if>
The 'xsl:if' element has a string-typed attribute 'test', which contains the XPath expression. When the XPath class is instantiated for this attribute :
KeyId testKeyId = getKeyCache().getLocalKeyId("test");
Xem::XPath xpathExpression ( xslIfElement, testKeyId );
The Xem::XPath constructor searches for an AttributeType_XPath attribute named 'test' in the xslIfElement, and calls Xem::XPathParser if it finds none. Xem::XPathParser in turn stores the parsed form back on the xslIfElement element for future reuse.
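The parse-once behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not Xemeiah's API : Element, AttributeType, parseXPath and getParsedXPath are hypothetical names, and the "parsed form" is just a tagged string. What the sketch does show is that attributes keyed by (name, type) let a string attribute and a parsed XPath attribute share the name 'test' on the same element, and that the parser runs only on the first lookup.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration only; not Xemeiah's real API.
enum class AttributeType { String, XPath };

using AttributeKey = std::pair<std::string, AttributeType>;

struct Element {
    // Keyed by (name, type): one name may carry several typed attributes.
    std::map<AttributeKey, std::string> attributes;
};

static int parseCount = 0;   // how many times the stand-in parser has run

// Stand-in for a real parser: the "parsed form" is a tagged copy of the source.
std::string parseXPath(const std::string& source) {
    ++parseCount;
    return "parsed:" + source;
}

// Returns the parsed form of attribute `name`. The string attribute is parsed
// only when no XPath-typed attribute of that name is stored on the element yet;
// the result is then stored back on the element for later lookups.
const std::string& getParsedXPath(Element& element, const std::string& name) {
    const AttributeKey parsedKey = std::make_pair(name, AttributeType::XPath);
    auto cached = element.attributes.find(parsedKey);
    if (cached != element.attributes.end()) {
        return cached->second;                      // already parsed: reuse it
    }
    auto source = element.attributes.find(std::make_pair(name, AttributeType::String));
    if (source == element.attributes.end()) {
        throw std::invalid_argument("no string attribute named " + name);
    }
    auto inserted = element.attributes.emplace(parsedKey, parseXPath(source->second));
    return inserted.first->second;
}
```

Calling getParsedXPath twice on the same element triggers a single parse; the second call finds the XPath-typed attribute and returns it directly. Storing the parsed form on the element itself, rather than in a side table, means the cache follows the element wherever the document goes.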
XPath evaluation is built upon the following mechanisms :
XPath expressions are split into several atomic steps. These steps represent both steps through the document (such as 'grandfather/father/child/grandchild') and all other kinds of expression components, such as comparators, XPath functions,
all environment variables (the $variable resolution) are delegated to the Xem::XProcessor class (see below)