MRML: the Multimedia Retrieval Markup Language

MRML Specification: Version 0.1

Aims | Requirements | Introduction to MRML | Creating a connection | Opening a session | Configuring the interface | Queries and responses | Property sheets | Extensions | MRML DTD | MRML state machine | Example messages | Complements

  • Status: Final - Reformulated and superseded by version 1.0.
  • Requirements
  • Hard copy: Appeared as [1]
  • Structure (DTD)
  • Comment: This version is the first formal specification issued by the Viper team at the University of Geneva. It has been the basis for all client-server implementations.

Aims

MRML is the Multimedia Retrieval Markup Language. Its aim is to standardize access to multimedia retrieval software components. MRML is an XML-based protocol whose messages conform to a well-defined DTD (or, equivalently, a schema).

Requirements

  • Simplicity: The first aim is to keep MRML simple, both to understand and to develop software for.
  • Extensibility: Our main concern was to provide a framework which permits independent growth of the products of different research groups (followed by periodic code merging).
  • No preferred implementation language: We want to leave the developer free to choose the implementation language. A standard like this is unlikely to be adopted by the research community if it works only with a given "mainstream" computing environment.
  • Independence of third-party libraries: We want the use of the communication protocol to be as independent of third-party libraries as possible. A group should be able to provide its own tools within finite time.

Introduction to MRML

MRML-based communications have the structure of a remote procedure call: the client connects to the server, sends a request, and stays connected to the server until the server breaks the connection. The server shuts down the connection after sending the MRML message which answers the request.

MRML, in its current specification (and implementation) state, supports the following features:
  • request of a capability description from the server,
  • selection of a data collection classified by query paradigm; it is possible to request collections which can be queried in a certain manner,
  • selection and configuration of a query processor, also classified by query paradigm; MRML also permits the configuration of meta-queries during run time,
  • formulation of QBE queries,
  • transmission of user interaction data.
The final feature reflects our strong belief that affective computing [2] will soon play a role in the field of content-based multimedia retrieval. MRML already supports this by allowing some user interaction data to be logged. In particular, this is the case for the history-forward and history-backward functions of the SnakeCharmer interface.

Creating a connection

An MRML connection is typically initiated by the client. An MRML server listens for MRML messages on a TCP port. On connecting, the client requests the basic properties of the server and waits for an answer. Skipping standard XML headers, the corresponding MRML code is:
 
<mrml >
      <get-server-properties />
</mrml>
The server then informs the client of its capabilities. This message is empty in this version of MRML, but it allows for the extension of the protocol:
 
<mrml >
      <server-properties />
</mrml>
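As a rough illustration of the exchange above, here is a minimal Python sketch using only the standard library. MRML does not prescribe an implementation language; the function name exchange and its host, port and timeout parameters are assumptions made for this example, and the XML headers are omitted as in the listings above (a real server may require them). The sketch relies on the behaviour described in the introduction: the server closes the connection once it has sent its answer.

```python
import socket

# The request shown above, without the standard XML headers.
GET_SERVER_PROPERTIES = "<mrml>\n  <get-server-properties />\n</mrml>\n"


def exchange(host: str, port: int, message: str, timeout: float = 10.0) -> str:
    """Send one MRML message and return the reply as text.

    Hypothetical helper: host, port and timeout are not defined by MRML.
    The reply is considered complete when the server closes the connection.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(message.encode("utf-8"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # empty read: the server has closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8")
```

A call such as exchange(host, port, GET_SERVER_PROPERTIES) would then be expected to return the server-properties message, with host and port chosen to match the server installation.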

Opening a session

Using similar simple messages, the client can request a list of the collections available on the server, together with descriptions of the ways in which they can be queried.
 
<mrml >
      <open-session  
        user-name = "anonymous"  
        session-name = "charmer_default_session" />
      <get-collections />
      <get-algorithms />
</mrml>
The client can then open a session on the server and configure it according to the needs of its user (interactive client) or its own needs (e.g. meta-query agents). The client can also request the algorithms which can be used with a given collection:
 
<mrml >
      <get-algorithms   collection-id = "collection-1" />
</mrml>
This request is answered by sending the corresponding list of algorithms. This handshaking mechanism allows both interactive clients and programs (such as meta-query agents or automatic benchmarkers) to obtain information describing the server. In a similar simple manner, the client can open and close sessions for a user, and configure the algorithms chosen by the user. This enables multi-user servers and also on-the-fly learning by the query processor.
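The session-opening message shown earlier in this section can also be assembled programmatically. The following Python sketch uses the standard xml.etree.ElementTree module; the function name build_open_session is hypothetical, and the element and attribute names are copied from the listing above.

```python
import xml.etree.ElementTree as ET


def build_open_session(user_name: str, session_name: str) -> str:
    """Build an <mrml> message that opens a session and asks for
    the available collections and algorithms (hypothetical helper)."""
    root = ET.Element("mrml")
    ET.SubElement(
        root,
        "open-session",
        {"user-name": user_name, "session-name": session_name},
    )
    ET.SubElement(root, "get-collections")
    ET.SubElement(root, "get-algorithms")
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(root, encoding="unicode")


message = build_open_session("anonymous", "charmer_default_session")
```

Building the message from a tree rather than by string concatenation leaves attribute escaping to the library.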

Configuring the interface

The client can request property sheet descriptions from the server. Different algorithms will have different relevant parameters which should be user-configurable (e.g. feature sets, speed vs. quality). Viper, for example, offers several weighting functions and a variety of methods for, and levels of, pruning. All these parameters are irrelevant for CIRCUS. Thanks to MRML property sheets, the interface can adapt itself to these specific parameters. At the same time, MRML specifies how the interface will turn these data into XML to send them back to the server. Here is a short example of interface configuration:
 
<property-sheet  
  property-sheet-id = "s1"  
  type = "numeric"  
  numeric-from = "1"  
  numeric-to = "100"  
  numeric-step = "1"  
  caption = "% features evaluated"  
  send-type = "attribute"  
  send-name = "cui-percentage-features" />
This specifies a display element which will allow the user to enter an attribute with the caption "% features evaluated". The values the user will be able to enter are integers between 1 and 100 inclusive. The value will be sent as an attribute named cui-percentage-features. This mechanism allows the use of complex property sheets, which can send XML text containing multiple elements.
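To make the mechanism concrete, the following Python sketch shows how a client might turn a value entered by the user into the attribute text described above. The function name render_numeric and the validation rule (the value must lie on the grid defined by numeric-from, numeric-to and numeric-step) are assumptions for illustration, not part of the specification.

```python
import xml.etree.ElementTree as ET

# The property sheet from the example above, on one line.
SHEET = (
    '<property-sheet property-sheet-id="s1" type="numeric" '
    'numeric-from="1" numeric-to="100" numeric-step="1" '
    'caption="% features evaluated" send-type="attribute" '
    'send-name="cui-percentage-features" />'
)


def render_numeric(sheet_xml: str, value: int) -> str:
    """Return the attribute text a client would send for `value`
    (hypothetical helper; raises ValueError for values off the grid)."""
    sheet = ET.fromstring(sheet_xml)
    low = int(sheet.get("numeric-from"))
    high = int(sheet.get("numeric-to"))
    step = int(sheet.get("numeric-step"))
    if not low <= value <= high or (value - low) % step != 0:
        raise ValueError(f"{value} is outside {low}..{high} in steps of {step}")
    return f'{sheet.get("send-name")}="{value}"'
```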

Queries and responses

Query formulation
The query step is dependent on the query paradigms offered by the interface and the search engine. MRML currently includes only QBE, but it has been designed to be extensible to other paradigms. A basic QBE query consists of a list of images and the corresponding relevance levels assigned to them by the user. In the following example, the user has marked two images, the image 1.jpg positive (user-relevance="1") and the image 2.jpg negative (user-relevance="-1"). All query images are referred to by their URLs.
 
<mrml  
  session-id = "1"  
  transaction-id = "44" >
      <query-step  
        session-id = "1"  
        resultsize = "30"  
        algorithm-id = "algorithm-default" >
            <user-relevance-list >
                  <user-relevance-element  
                    image-location = "http://viper.unige.ch/1.jpg"  
                    user-relevance = "1" />
                  <user-relevance-element  
                    image-location = "http://viper.unige.ch/2.jpg"  
                    user-relevance = "-1" />
            </user-relevance-list>
      </query-step>
</mrml>
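A query of this shape can be generated from a simple mapping of image URLs to relevance levels. The Python sketch below is one possible way to do so; the function name build_query_step and its parameter names are hypothetical, while the element and attribute names follow the listing above.

```python
import xml.etree.ElementTree as ET


def build_query_step(session_id, transaction_id, algorithm_id, result_size, relevance):
    """Build a QBE query message (hypothetical helper).

    `relevance` maps image URLs to relevance levels, e.g. 1 or -1.
    """
    root = ET.Element(
        "mrml", {"session-id": session_id, "transaction-id": transaction_id}
    )
    step = ET.SubElement(
        root,
        "query-step",
        {
            "session-id": session_id,
            "resultsize": str(result_size),
            "algorithm-id": algorithm_id,
        },
    )
    rel_list = ET.SubElement(step, "user-relevance-list")
    for url, level in relevance.items():
        ET.SubElement(
            rel_list,
            "user-relevance-element",
            {"image-location": url, "user-relevance": str(level)},
        )
    return ET.tostring(root, encoding="unicode")


query = build_query_step(
    "1", "44", "algorithm-default", 30,
    {"http://viper.unige.ch/1.jpg": 1, "http://viper.unige.ch/2.jpg": -1},
)
```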

Query result
The server will then return the retrieval result as a list of images, again represented by their URLs.
 
<mrml >
      <query-result >
            <query-result-element-list >
                  <query-result-element  
                    calculated-similarity = "0.484531"  
                    image-location = "1.jpg" />
                  <query-result-element  
                    calculated-similarity = "0.476623"  
                    image-location = "2.jpg" />
                  <query-result-element  
                    calculated-similarity = "0.437316"  
                    image-location = "4.jpg" />
                  <query-result-element  
                    calculated-similarity = "0.312506"  
                    image-location = "3.jpg" />
            </query-result-element-list>
      </query-result>
</mrml>
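On the client side, such a reply can be reduced to a ranked list of image locations. The following Python sketch assumes the reply has the structure shown above; the function name parse_query_result is hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_query_result(reply: str):
    """Return (image-location, similarity) pairs, best match first
    (hypothetical helper)."""
    root = ET.fromstring(reply)
    hits = [
        (element.get("image-location"), float(element.get("calculated-similarity")))
        for element in root.iter("query-result-element")
    ]
    # Sort explicitly rather than relying on the order used by the server.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


REPLY = """
<mrml>
  <query-result>
    <query-result-element-list>
      <query-result-element calculated-similarity="0.312506" image-location="3.jpg" />
      <query-result-element calculated-similarity="0.484531" image-location="1.jpg" />
    </query-result-element-list>
  </query-result>
</mrml>
"""
ranked = parse_query_result(REPLY)
```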
Queries can be grouped into transactions. This allows the formulation and logging of complex queries. This may be applied in systems which process a single query using a variety of algorithms, such as the split-screen version of TrackingViper [3] or the system described by Lee et al. [4]. In these cases it is important to preserve in the logs the knowledge that two queries are logically related to one another.

Property sheets

MRML property sheets are a way to work around the fact that a common set of configuration parameters for image databases is difficult to find and would probably be awkward to use. We suggest achieving this by sending code which describes how to build GUIs (i.e. the subset needed to configure an algorithm), along with a specification of how to generate pieces of XML code from the GUI's state. This code is XML and it will not be executed, so, to our knowledge, there is no inherent security hole.
A very simple example
Viper is a system which uses inverted files to index images. Each image is translated into a variable-length sequence of features which describe the image. Each feature is assigned a weight which depends on the frequency of the feature within the image and within the collection. How exactly this is done depends on the weighting function. Both the retrieval performance and the processing speed of the system depend on the weighting function. Viper makes it possible to choose the weighting function at run time, using an attribute cui-weighting-function of the algorithm element. The following property sheet offers a choice between two weighting functions. The "basic need" of a system would be to specify the collection, i.e. the database on which the retrieval is to be performed. For testing and comparison it would also be interesting to have a choice between several algorithms (e.g. based on wavelet coefficients or color histograms). A choice out of a list of two elements:
 
<property  
  id = "p1"  
  type = "subset"  
  caption = "Weighting function"  
  visibility = "visible"  
  sendtype = "attribute"  
  sendname = "cui-weighting-function"  
  minsubsetsize = "1"  
  maxsubsetsize = "1" >
      <property  
        id = "p2"  
        type = "setelement"  
        caption = "Best fully weighted"  
        visibility = "visible"  
        sendtype = "value"  
        sendvalue = "best-fully"  
        defaultstate = "selected" />
      <property  
        id = "p3"  
        type = "setelement"  
        caption = "Classical IDF"  
        visibility = "visible"  
        sendtype = "value"  
        sendvalue = "classical-idf"  
        defaultstate = "unselected" />
</property>
What does this do exactly? It defines a list from which the user is allowed to choose a subset of size between 1 and 1, i.e. an exclusive choice. When asked for its state, this list will generate an attribute, i.e. the text given by sendname, plus "=": cui-weighting-function=. The value of the attribute is determined as follows. The property elements p2 and p3 are identical in structure. They denote the elements of our set, each of which can be either selected or unselected. If selected, an element sends a text which is placed as the attribute value (sendtype "value"). This text will be "best-fully" or "classical-idf", depending on which of the two list items is chosen by the user. As a result, the piece of MRML above enables the interface to set up a property sheet which comprises a list of two items, of which one can be selected. Depending on the selection, the interface will send either cui-weighting-function="best-fully" or cui-weighting-function="classical-idf" to the server. The MRML client will use this property sheet when generating a configure-session message.
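The evaluation of such a subset property can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name render_subset is hypothetical, the attribute names are those of the listing above (sendname, sendvalue, and so on), and joining several selected values with a space is a guess, since the text only covers the exclusive-choice case.

```python
import xml.etree.ElementTree as ET

PROPERTY = """
<property id="p1" type="subset" caption="Weighting function"
          sendtype="attribute" sendname="cui-weighting-function"
          minsubsetsize="1" maxsubsetsize="1">
  <property id="p2" type="setelement" caption="Best fully weighted"
            sendtype="value" sendvalue="best-fully" defaultstate="selected" />
  <property id="p3" type="setelement" caption="Classical IDF"
            sendtype="value" sendvalue="classical-idf" defaultstate="unselected" />
</property>
"""


def render_subset(property_xml: str, selected_ids=None) -> str:
    """Return the attribute text generated for the current selection.

    With selected_ids=None the default states from the sheet are used.
    """
    subset = ET.fromstring(property_xml)
    elements = subset.findall("property")
    if selected_ids is None:
        selected_ids = {
            e.get("id") for e in elements if e.get("defaultstate") == "selected"
        }
    chosen = [e for e in elements if e.get("id") in selected_ids]
    low = int(subset.get("minsubsetsize"))
    high = int(subset.get("maxsubsetsize"))
    if not low <= len(chosen) <= high:
        raise ValueError(f"selection size must be between {low} and {high}")
    # Assumption: multiple values are joined by a space (not specified above).
    value = " ".join(e.get("sendvalue") for e in chosen)
    return f'{subset.get("sendname")}="{value}"'
```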

More complex: generate XML subtrees
The following example describes the generation of whole document subtrees. This feature is not yet immediately useful for Viper or CIRCUS. However, it explains how the text generated in the previous example is included in the configure-session message. More importantly, it provides a general framework for describing (GUI) entities which can send XML. Consider the following example: imagine an algorithm which runs the query images through a series of filters before running them through a simple query processor. Since this is a research system, we would like these filters to be configurable at run time. Since each filter needs some parameters, and the number of filters is variable, we simply need to define some new MRML tags which permit us to describe the sequence of filters. We would like an output like the one given below:
 
<cui-filter-list >
      <cui-filter  
        cui-filter-type = "horizontal-gabor"  
        cui-filter-gabor-sdev = "50"  
        cui-filter-gabor-wavelength = "10" />
      <cui-filter  
        cui-filter-type = "gauss"  
        cui-filter-gauss-sdev = "5" />
</cui-filter-list>
The corresponding property sheet would look like:
 
<property  
  id = "p1"  
  type = "panel"  
  caption = "Filter Sequence"  
  visibility = "invisible"  
  send-type = "element"  
  send-name = "cui-filter-sequence" >
      <property  
        id = "p1"  
        type = "multi-set"  
        caption = "Filter"  
        visibility = "visible"  
        send-type = "element"  
        send-name = "cui-filter"  
        minsubsetsize = "0"  
        maxsubsetsize = "5" >
            <property  
              id = "p11"  
              type = "set-element"  
              caption = "Gaussian blur"  
              visibility = "pop-up"  
              sendtype = "attribute"  
              sendname = "cui-filter-type"  
              sendvalue = "gauss"  
              defaultstate = "selected" >
                  <property  
                    id = "p111"  
                    type = "numeric"  
                    caption = "Standard deviation"  
                    visibility = "pop-up"  
                    sendtype = "attribute"  
                    sendname = "cui-filter-gauss-sdev" />
            </property>
            <property  
              id = "p12"  
              type = "set-element"  
              caption = "Horizontal Gabor"  
              visibility = "pop-up"  
              sendtype = "attribute"  
              sendname = "cui-filter-type"  
              sendvalue = "horizontal-gabor"  
              defaultstate = "selected" >
                  <property  
                    id = "p121"  
                    type = "numeric"  
                    caption = "Tile size"  
                    visibility = "pop-up"  
                    sendtype = "attribute"  
                    sendname = "cui-filter-gabor-sdev"  
                    numeric-from = "5"  
                    numeric-to = "100"  
                    numeric-step = "5" />
                  <property  
                    id = "p122"  
                    type = "numeric"  
                    caption = "Tile size"  
                    visibility = "pop-up"  
                    sendtype = "attribute"  
                    sendname = "cui-filter-gabor-wavelength"  
                    numeric-from = "2"  
                    numeric-to = "20"  
                    numeric-step = "1" />
            </property>
      </property>
</property>
The example above shows exactly the scenario described: the user can choose sequences of 0 to 5 filters. The filters can be either Gaussian blur or horizontal Gabor filters (yes, this is a toy example). The Gaussian blur can be configured by giving a number between 1 and 100, which will be sent as an attribute (cui-filter-gauss-sdev). The Gabor filter can be configured using the two parameters cui-filter-gabor-sdev and cui-filter-gabor-wavelength. Both configuration panels will pop up when the corresponding filter has been selected in the sequence. In the following section we describe how the text is actually generated, and how dialog dynamics are specified.
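For illustration, the desired output (the cui-filter-list shown earlier in this section) could be produced from the state of such a dialog as in the Python sketch below. The function name build_filter_list and the representation of the dialog state as a list of dictionaries are assumptions; the limit of five filters is taken from maxsubsetsize in the property sheet.

```python
import xml.etree.ElementTree as ET

MAX_FILTERS = 5  # maxsubsetsize of the "Filter" multi-set above


def build_filter_list(filters) -> str:
    """Turn a list of attribute dictionaries, one per selected filter,
    into a cui-filter-list subtree (hypothetical helper)."""
    if len(filters) > MAX_FILTERS:
        raise ValueError(f"at most {MAX_FILTERS} filters are allowed")
    root = ET.Element("cui-filter-list")
    for attributes in filters:
        ET.SubElement(
            root, "cui-filter", {name: str(value) for name, value in attributes.items()}
        )
    return ET.tostring(root, encoding="unicode")


subtree = build_filter_list([
    {"cui-filter-type": "horizontal-gabor",
     "cui-filter-gabor-sdev": 50,
     "cui-filter-gabor-wavelength": 10},
    {"cui-filter-type": "gauss", "cui-filter-gauss-sdev": 5},
])
```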

Extensions

In order to demonstrate how easily MRML can be extended to other query paradigms, we give as an example QBE for images with user annotation. We assume that the user is invited to associate textual comments with the images he or she marks as relevant or irrelevant. Since a tag for this purpose does not yet exist in MRML, we add an attribute cui-user-annotation="..." to the user-relevance-element element. The prefix cui- is added to avoid name clashes with extensions from other groups which use MRML.
 
<user-relevance-list >
      <user-relevance-element  
        image-location = "file:/images/1.jpg"  
        user-relevance = "1"  
        cui-user-annotation = "tropical fish" />
</user-relevance-list>
It is important to note here that servers which do not recognize the cui-user-annotation attribute still can make use of the remaining information contained in the user-relevance-element element. As an example of how not to extend MRML, we give an extension with the same semantics but which does not respect the principle of graceful degradation:
 
<user-relevance-list >
      <cui-user-relevance-element  
        image-location = "file:/images/1.jpg"  
        user-relevance = "1"  
        user-annotation = "tropical fish" />
</user-relevance-list>
Instead of adding an attribute to an existing MRML element (user-relevance-element), a new element was defined that contained the same kind of extension, namely cui-user-relevance-element. Consequently, servers which do not recognize this element will not be able to exploit any relevance information.
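The difference between the two extensions can be seen from the point of view of a server which knows only standard MRML. The Python sketch below models such a server with a hypothetical function read_relevance that looks only for user-relevance-element elements and the two standard attributes.

```python
import xml.etree.ElementTree as ET

GOOD = """
<user-relevance-list>
  <user-relevance-element image-location="file:/images/1.jpg"
      user-relevance="1" cui-user-annotation="tropical fish" />
</user-relevance-list>
"""

BAD = """
<user-relevance-list>
  <cui-user-relevance-element image-location="file:/images/1.jpg"
      user-relevance="1" user-annotation="tropical fish" />
</user-relevance-list>
"""


def read_relevance(xml_text: str) -> dict:
    """Read relevance levels as a server unaware of any extension would:
    unknown attributes are ignored, unknown elements are never visited."""
    root = ET.fromstring(xml_text)
    return {
        element.get("image-location"): int(element.get("user-relevance"))
        for element in root.iter("user-relevance-element")
    }


good_result = read_relevance(GOOD)  # relevance information survives
bad_result = read_relevance(BAD)    # relevance information is lost
```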

MRML DTD

see this separate page

MRML state machine

Client-server communication in MRML is a sequence of connections. In each connection, a single request or a small group of requests is answered by the server with a single message or a small group of messages. The state machine in the figure below describes the communication, starting at the point where the client first makes contact with the server.
The client establishes first contact with the server by sending a get-server-properties message. In response, the client receives a server-properties message, which is empty in standard MRML. However, this message is an important stub for extensions which concern the connection itself (e.g. finding out whether the server is able to run a session over a permanent connection).
After receiving the configuration description, the client asks for the list of sessions of a user, using the get-sessions tag. The reply is a session-list. The client then opens one session using open-session and receives an acknowledge-session-op in return. The opened session is required to have a sensible default state, i.e. a state which allows queries. When the session is opened, the server receives the user's name, password and the session-id, i.e. it has all the information necessary to know which collections and algorithms the user should see. Note that no one is forced to do user-dependent configuration of the system, but MRML makes it possible.
After opening the session, the client can request the lists of both collections and algorithms. Both collections and algorithms are described by the query paradigms they allow, as well as by some other parameters. In particular, an algorithm can contain as an attribute the ID of a collection on which it will be used.
Having received both a list of collections and a list of algorithms, the client has enough information to configure the opened session. To do so, it sends a configure-session message which contains an algorithm element with the attributes algorithm-id and algorithm-type set. The collection-id attribute of this element names the collection on which the algorithm will be used. After this, the session is fully configured and can be queried (using query-step). Queries can be grouped into transactions for logging and learning purposes. Intermixed with the queries, the client is able to send user-data to the server. These user-data tags contain user interaction information for logging and learning purposes.
[Figure: MRML state machine]
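The sequence of requests described in this section can be written down as a small transition table. The Python sketch below is only one reading of the prose: the state names are invented for this example, the request names are those used in the text, and the table is not a transcription of the state machine figure.

```python
# Hypothetical state names; request names are taken from the text.
TRANSITIONS = {
    ("start", "get-server-properties"): "properties-known",
    ("properties-known", "get-sessions"): "sessions-listed",
    ("sessions-listed", "open-session"): "session-open",
    # An opened session has a default state which already allows queries.
    ("session-open", "get-collections"): "session-open",
    ("session-open", "get-algorithms"): "session-open",
    ("session-open", "query-step"): "session-open",
    ("session-open", "user-data"): "session-open",
    ("session-open", "configure-session"): "configured",
    ("configured", "query-step"): "configured",
    ("configured", "user-data"): "configured",
}


def advance(state: str, request: str) -> str:
    """Return the state reached after `request`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, request)]
    except KeyError:
        raise ValueError(f"{request!r} is not allowed in state {state!r}") from None


def run(requests, state: str = "start") -> str:
    """Apply a sequence of requests and return the final state."""
    for request in requests:
        state = advance(state, request)
    return state
```

A client could use such a table to reject out-of-order requests before they are sent.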

Example messages

Below are examples of actual messages exchanged between an MRML-compliant server and an MRML-compliant client.
Listing collections
 
<collection-list >
      <collection  
        collection-id = "TSR500"  
        collection-name = "TSR500"  
        cui-algorithm-id-list-id = "ail-inverted-file"  
        cui-base-dir = "/home/viper/databases/TSR500/"  
        cui-feature-description-location = "InvertedFileFeatureDescription.db"  
        cui-feature-file-location = "url2fts.xml"  
        cui-inverted-file-location = "InvertedFile.db"  
        cui-number-of-images = "500"  
        cui-offset-file-location = "InvertedFileOffset.db" >
            <query-paradigm-list >
                  <query-paradigm   cui-indexing-type = "inverted_file" />
            </query-paradigm-list>
      </collection>
      <collection  
        collection-id = "c-1-Lausanne"  
        collection-name = "Lausanne 6100"  
        cui-algorithm-id-list-id = "ail-inverted-file"  
        cui-base-dir = "/home/viper/databases/Lausanne6100/"  
        cui-feature-description-location = "InvertedFileFeatureDescription.db"  
        cui-feature-file-location = "url2fts.xml"  
        cui-inverted-file-location = "InvertedFile.db"  
        cui-number-of-images = "6100"  
        cui-offset-file-location = "InvertedFileOffset.db" >
            <query-paradigm-list >
                  <query-paradigm   cui-indexing-type = "inverted_file" />
            </query-paradigm-list>
      </collection>
</collection-list>

Listing algorithms
This setup configures the interface, as it stands in the Viper demo.
 
<algorithm-list >
      <algorithm  
        algorithm-id = "adefault"  
        algorithm-name = "Classical IDF (sep. norm.)"  
        algorithm-type = "adefault"  
        collection-id = "c-1-Lausanne"  
        cui-base-type = "multiple"  
        cui-block-color-blocks = "no"  
        cui-block-color-histogram = "no"  
        cui-block-texture-blocks = "no"  
        cui-block-texture-histogram = "no"  
        cui-uses-result-urls = "yes"  
        cui-weighting-function = "ClassicalIDF" >
            <property-sheet  
              maxsubsetsize = "4"  
              minsubsetsize = "1"  
              property-sheet-id = "cui-p1"  
              property-sheet-type = "subset"  
              send-type = "none" >
                  <property-sheet  
                    caption = "Colour histogram"  
                    property-sheet-id = "cui-p11"  
                    property-sheet-type = "set-element"  
                    send-boolean-inverted = "yes"  
                    send-name = "cui-block-color-histogram"  
                    send-type = "attribute"  
                    send-value = "yes" />
                  <property-sheet  
                    caption = "Colour blocks"  
                    property-sheet-id = "cui-p12"  
                    property-sheet-type = "set-element"  
                    send-boolean-inverted = "yes"  
                    send-name = "cui-block-color-blocks"  
                    send-type = "attribute"  
                    send-value = "yes" />
                  <property-sheet  
                    caption = "Gabor histogram"  
                    property-sheet-id = "cui-p13"  
                    property-sheet-type = "set-element"  
                    send-boolean-inverted = "yes"  
                    send-name = "cui-block-texture-histogram"  
                    send-type = "attribute"  
                    send-value = "yes" />
                  <property-sheet  
                    caption = "Gabor blocks"  
                    property-sheet-id = "cui-p14"  
                    property-sheet-type = "set-element"  
                    send-boolean-inverted = "yes"  
                    send-name = "cui-block-texture-blocks"  
                    send-type = "attribute"  
                    send-value = "yes" />
                  <property-sheet  
                    caption = "Prune at % of features"  
                    from = "20"  
                    property-sheet-id = "cui-p15"  
                    property-sheet-type = "numeric"  
                    send-name = "cui-pr-percentage-of-features"  
                    send-type = "attribute"  
                    send-value = "70"  
                    step = "5"  
                    to = "100" />
            </property-sheet>
            <query-paradigm-list >
                  <query-paradigm   interaction-type = "qbe" />
            </query-paradigm-list>
            <algorithm  
              algorithm-id = "a1"  
              algorithm-name = "Color histogram"  
              cui-base-type = "inverted_file"  
              cui-block-color-blocks = "yes"  
              cui-block-texture-blocks = "yes"  
              cui-block-texture-histogram = "yes"  
              cui-more-config = "p-i-1"  
              cui-pr-percentage-of-features = "100"  
              cui-weighting-function = "ClassicalIDF" />
            <algorithm  
              algorithm-id = "a2"  
              algorithm-name = "Color blocks"  
              cui-base-type = "inverted_file"  
              cui-block-color-histogram = "yes"  
              cui-block-texture-blocks = "yes"  
              cui-block-texture-histogram = "yes"  
              cui-more-config = "p-i-1"  
              cui-weighting-function = "ClassicalIDF" />
            <algorithm  
              algorithm-id = "a3"  
              algorithm-name = "Texture histogram"  
              cui-base-type = "inverted_file"  
              cui-block-color-blocks = "yes"  
              cui-block-color-histogram = "yes"  
              cui-block-texture-blocks = "yes"  
              cui-more-config = "p-i-1"  
              cui-pr-percentage-of-features = "100"  
              cui-weighting-function = "ClassicalIDF" />
            <algorithm  
              algorithm-id = "a4"  
              algorithm-name = "Texture blocks"  
              cui-base-type = "inverted_file"  
              cui-block-color-blocks = "yes"  
              cui-block-color-histogram = "yes"  
              cui-block-texture-histogram = "yes"  
              cui-more-config = "p-i-1"  
              cui-weighting-function = "ClassicalIDF" />
      </algorithm>
</algorithm-list>

Complements

query-paradigms
Algorithms are described by their algorithm-id and their algorithm-type, as well as by their query-paradigm-list. A query-paradigm-list contains query-paradigm elements, each of which carries an unspecified number of attributes. One of these can be the attribute query-mode, which at present has the possible values "qbe" or "browsing". All other attributes are presently extensions. The main use of the query-paradigm-list is to enable clients to determine which collection can be used with which algorithm. In short, an algorithm can be used with a collection if their query-paradigm-list elements match. Two query-paradigm-list elements L1 and L2 match if there is at least one pair of query-paradigm elements, E1 in L1 and E2 in L2, such that E1 and E2 match. Two query-paradigm elements E1 and E2 match if, for their sets of attribute-value pairs S1 and S2, the following holds for every attribute a:
(a, V1) in S1 and (a, V2) in S2 implies V1 = V2
In other words, attributes present in only one of the two elements are ignored, and attributes present in both must have equal values. In particular, a query-paradigm tag without attributes matches any other query-paradigm tag.
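The matching rule translates directly into code. In the Python sketch below, each query-paradigm is represented as a dictionary of its attributes and each query-paradigm-list as a list of such dictionaries; this representation and the function names are assumptions made for the example.

```python
def paradigms_match(e1: dict, e2: dict) -> bool:
    """Two query-paradigm elements match if every attribute they
    share has the same value in both."""
    return all(e2[name] == value for name, value in e1.items() if name in e2)


def lists_match(l1, l2) -> bool:
    """Two query-paradigm-lists match if at least one pair of
    elements, one from each list, matches."""
    return any(paradigms_match(e1, e2) for e1 in l1 for e2 in l2)


collection = [{"cui-indexing-type": "inverted_file"}]
algorithm = [{"query-mode": "qbe"}]
usable = lists_match(collection, algorithm)  # no shared attribute, so they match
```

Note that under this rule two elements with no attribute in common always match, which is what makes an attribute-free query-paradigm a wildcard.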

algorithms
As stated above, algorithms are described by their algorithm-id and their algorithm-type, as well as by their query-paradigm-list and (optionally) an allows-children element (which in turn contains another query-paradigm-list). As described in the last section, the first query-paradigm-list specifies which collections can be queried with this algorithm, and it informs the client about the algorithm's properties. The client or its user can then decide whether or not to proceed. It is possible to specify algorithms recursively. Algorithms can contain other algorithms, possibly several of the same algorithm-type; the algorithm-id, however, has to be unique within one configure-session statement. It is thus possible to let the client specify meta-queries. Which kinds of meta-queries can be built is decided by the allows-children tag. An algorithm A1 is allowed to contain another algorithm A2 if the query-paradigm-list contained in the allows-children tag of A1 matches the query-paradigm-list of A2.
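The uniqueness requirement on algorithm-id can be checked mechanically before a configure-session statement is sent. The Python sketch below walks the nested algorithm elements and reports any identifier used more than once; the function name duplicate_algorithm_ids and the sample message are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_algorithm_ids(configure_session_xml: str):
    """Return the algorithm-id values that occur more than once,
    at any nesting depth (hypothetical helper)."""
    root = ET.fromstring(configure_session_xml)
    counts = Counter(
        algorithm.get("algorithm-id")
        for algorithm in root.iter("algorithm")
        if algorithm.get("algorithm-id") is not None
    )
    return sorted(identifier for identifier, n in counts.items() if n > 1)


SAMPLE = """
<configure-session>
  <algorithm algorithm-id="adefault" algorithm-type="adefault">
    <algorithm algorithm-id="a1" algorithm-type="inverted_file" />
    <algorithm algorithm-id="a1" algorithm-type="inverted_file" />
    <algorithm algorithm-id="a2" algorithm-type="inverted_file" />
  </algorithm>
</configure-session>
"""
duplicates = duplicate_algorithm_ids(SAMPLE)
```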

References


[1] Wolfgang Müller, Zoran Pečenović, Arjen P. de Vries, David McG. Squire, Henning Müller, Thierry Pun. MRML: Towards an extensible standard for multimedia querying and benchmarking (Draft Proposal). Technical report No. 99.04, Computer Vision Group, Computing Centre, University of Geneva, rue Général Dufour, 24, CH-1211 Genève, Switzerland, October 1999.
[2]
[3]
[4]
(c) CUI
04/12/2004

Site maintained by Stéphane Marchand-Maillet