|
| |
MRML Specification: Version 0.1
|
|
- Status: Final - Reformulated and superseded by version 1.0.
- Requirements
- Hard copy: Appeared as [1]
- Structure (DTD)
- Comment: This version is the first formal specification issued by the Viper team at the University of Geneva. It has been the basis for all client-server implementations.
|
MRML is the Multimedia Retrieval Markup Language. The aim is to standardize access to Multimedia Retrieval software components.
MRML is an XML-based protocol. It corresponds to a well-defined DTD (resp. schema).
|
|
- Simplicity: The first aim is to keep both the understanding of MRML and the development of related software simple.
- Extensibility: Our main concern was to provide a framework which permits independent growth of the products of different research groups (followed by periodic code merging).
- No preferred implementation language: We want to leave the developer free to choose the implementation language. A standard like this is unlikely to be adopted by the research community if it works only with a given "mainstream" computing environment.
- Independence of third-party libraries: We want the use of the communication protocol to be as independent of third-party libraries as possible. A group should be able to provide its own tools within finite time.
|
|
MRML-based communications have the structure of a remote procedure call:
the client connects to the server, sends a request, and stays connected to
the server until the server breaks the connection. The server shuts down
the connection after sending the MRML message which answers the request.
MRML, in its current specification (and implementation) state, supports
the following features:
- request of a capability description from the server,
- selection of a data collection classified by query paradigm; it
is possible to request collections which can be queried in
a certain manner,
- selection and configuration of a query processor, also classified
by query paradigm; MRML also permits the configuration of meta-queries
during run time,
- formulation of QBE queries,
- transmission of user interaction data.
The final feature reflects our strong belief that affective computing [2]
will soon play a role in the field of content-based
multimedia retrieval. MRML already supports this by allowing the logging of
some user interaction data. In particular, this is the case for the
history-forward and history-backward functionalities of the SnakeCharmer
interface.
|
|
An MRML connection is typically initiated by the client.
An MRML server listens for MRML messages on a given TCP port. When connecting, the client requests the basic properties of the server and waits for an answer. Skipping standard XML headers, the corresponding MRML code is:
| |
<mrml
>
<get-server-properties
/>
</mrml>
|
The server then informs the client of its capabilities. This message is
empty in this version of MRML, but it allows for the extension of the protocol:
| |
<mrml
>
<server-properties
/>
</mrml>
|
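The exchange above can be sketched in Python as follows. This is a minimal sketch rather than part of the specification: the function names `build_request` and `send_mrml` are hypothetical, UTF-8 encoding is an assumption, and the client relies on the server closing the connection once its answer has been sent, as described earlier.

```python
import socket
import xml.etree.ElementTree as ET


def build_request(*element_names):
    """Wrap empty request elements (e.g. get-server-properties) in <mrml>."""
    root = ET.Element("mrml")
    for name in element_names:
        ET.SubElement(root, name)
    return ET.tostring(root, encoding="unicode")


def send_mrml(host, port, message, timeout=10.0):
    """Send one MRML message and read until the server closes the connection."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(message.encode("utf-8"))  # assumption: UTF-8 on the wire
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server shut the connection down: answer complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8")
```

A call such as `send_mrml(host, port, build_request("get-server-properties"))`, with the host and port of an actual MRML server, would then return the `server-properties` answer as text.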
|
|
Using similar simple messages, the client can request a list of the
collections available on the server, together with descriptions of the ways
in which they can be queried.
| |
<mrml
>
<open-session
user-name = "anonymous"
session-name = "charmer_default_session"
/>
<get-collections
/>
<get-algorithms
/>
</mrml>
|
The client can then open a session on the server, and configure it
according to the needs of its user (interactive client) or its own needs
(e.g. meta-query agents). The client can also request the algorithms which
can be used with a given collection:
| |
<mrml
>
<get-algorithms
collection-id = "collection-1"
/>
</mrml>
|
This request is answered by sending the corresponding list of algorithms.
This handshaking mechanism allows both interactive clients and programs
(such as meta-query agents or automatic benchmarkers) to obtain information
describing the server.
In a similar simple manner, the client can open and close sessions for a
user, and configure the algorithms chosen by the user. This enables
multi-user servers and also on-the-fly learning by the query processor.
|
| Configuring the interface |
|
|
The client can request property sheet descriptions from the server.
Different algorithms will have different relevant parameters which should
be user-configurable (e.g. feature sets, speed vs. quality). Viper, for
example, offers several weighting functions and a variety of
methods for, and levels of, pruning. All these parameters are irrelevant
for CIRCUS. Thanks to MRML property sheets, the interface can adapt itself
to these specific parameters. At the same time, MRML specifies the way the
interface will turn these data into XML to send them back to the server.
Here is a short example of interface configuration:
| |
<property-sheet
property-sheet-id = "s1"
type = "numeric"
numeric-from = "1"
numeric-to = "100"
numeric-step = "1"
caption = "% features evaluated"
send-type = "attribute"
send-name = "cui-percentage-features"
/>
|
This specifies a display element which will allow the user to enter
a value under the caption "% features evaluated".
The values the user will be able to enter are integers
between 1 and 100 inclusive. The value will be sent as an attribute named
cui-percentage-features.
This mechanism allows the use of complex property sheets, which can send XML
text containing multiple elements.
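As an illustration of how an interface might turn such a numeric property sheet into text, here is a minimal Python sketch. The function name `render_numeric` is hypothetical, and rejecting values that do not fall on the step grid is an assumption of this sketch, not something the specification states.

```python
import xml.etree.ElementTree as ET

SHEET = """<property-sheet
    property-sheet-id="s1" type="numeric"
    numeric-from="1" numeric-to="100" numeric-step="1"
    caption="% features evaluated"
    send-type="attribute" send-name="cui-percentage-features" />"""


def render_numeric(sheet_xml, value):
    """Validate a user-entered value and render it as an XML attribute."""
    sheet = ET.fromstring(sheet_xml)
    low = int(sheet.get("numeric-from"))
    high = int(sheet.get("numeric-to"))
    step = int(sheet.get("numeric-step"))
    # Assumption: values must lie in the range and on the step grid.
    if not low <= value <= high or (value - low) % step != 0:
        raise ValueError(f"{value} is outside {low}..{high} (step {step})")
    if sheet.get("send-type") != "attribute":
        raise ValueError("this sketch only handles send-type='attribute'")
    return f'{sheet.get("send-name")}="{value}"'
```

For a user entry of 70, the interface would send the text `cui-percentage-features="70"`.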
|
|
Query formulation
The query step is dependent on the query paradigms offered by the interface
and the search engine. MRML currently includes only QBE, but it has been
designed to be extensible to other paradigms.
A basic QBE query consists of a list of images and the corresponding
relevance levels assigned to them by the user. In the following example,
the user has marked two images, the image 1.jpg positive
(user-relevance="1") and the image 2.jpg
negative (user-relevance="-1"). All query images are
referred to by their URLs.
| |
<mrml
session-id = "1"
transaction-id = "44"
>
<query-step
session-id = "1"
resultsize = "30"
algorithm-id = "algorithm-default"
>
<user-relevance-list
>
<user-relevance-element
image-location = "http://viper.unige.ch/1.jpg"
user-relevance = "1"
/>
<user-relevance-element
image-location = "http://viper.unige.ch/2.jpg"
user-relevance = "-1"
/>
</user-relevance-list>
</query-step>
</mrml>
|
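A client could assemble such a message programmatically. The following Python sketch rebuilds the query above from a list of (URL, relevance) pairs; the function name `build_query_step` and its default arguments are hypothetical and simply mirror the values of the example.

```python
import xml.etree.ElementTree as ET


def build_query_step(session_id, transaction_id, judgements,
                     result_size=30, algorithm_id="algorithm-default"):
    """Build a QBE query-step message from (image URL, relevance) pairs."""
    root = ET.Element("mrml", {"session-id": session_id,
                               "transaction-id": transaction_id})
    step = ET.SubElement(root, "query-step", {
        "session-id": session_id,
        "resultsize": str(result_size),
        "algorithm-id": algorithm_id,
    })
    relevance_list = ET.SubElement(step, "user-relevance-list")
    for url, relevance in judgements:
        ET.SubElement(relevance_list, "user-relevance-element", {
            "image-location": url,
            "user-relevance": str(relevance),
        })
    return ET.tostring(root, encoding="unicode")


message = build_query_step("1", "44", [
    ("http://viper.unige.ch/1.jpg", 1),    # marked positive
    ("http://viper.unige.ch/2.jpg", -1),   # marked negative
])
```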
The server will then return the retrieval result as a list of images, again
represented by their URLs.
| |
<mrml
>
<query-result
>
<query-result-element-list
>
<query-result-element
calculated-similarity = "0.484531"
image-location = "1.jpg"
/>
<query-result-element
calculated-similarity = "0.476623"
image-location = "2.jpg"
/>
<query-result-element
calculated-similarity = "0.437316"
image-location = "4.jpg"
/>
<query-result-element
calculated-similarity = "0.312506"
image-location = "3.jpg"
/>
</query-result-element-list>
</query-result>
</mrml>
|
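On the client side, such a result can be read with any XML parser. The Python sketch below (the function name `parse_query_result` is hypothetical) extracts the image locations and similarities from the message above; sorting by similarity is a choice of this sketch, since the specification text does not say whether the server guarantees an order.

```python
import xml.etree.ElementTree as ET

RESULT = """<mrml>
  <query-result>
    <query-result-element-list>
      <query-result-element calculated-similarity="0.484531" image-location="1.jpg" />
      <query-result-element calculated-similarity="0.476623" image-location="2.jpg" />
      <query-result-element calculated-similarity="0.437316" image-location="4.jpg" />
      <query-result-element calculated-similarity="0.312506" image-location="3.jpg" />
    </query-result-element-list>
  </query-result>
</mrml>"""


def parse_query_result(xml_text):
    """Return (image-location, similarity) pairs, most similar first."""
    root = ET.fromstring(xml_text)
    hits = [(e.get("image-location"), float(e.get("calculated-similarity")))
            for e in root.iter("query-result-element")]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```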
Queries can be grouped into transactions. This allows the formulation and
logging of complex queries. This may be applied in systems which process a
single query using a variety of algorithms, such as the split-screen version
of TrackingViper [3] or the system described by Lee et al. [4].
It is important in these cases to preserve in the logs the knowledge that
two queries are logically related to one another.
|
|
MRML property sheets are a way to work around the fact that a common set of
configuration parameters for image databases is difficult to find and would probably be awkward
to use.
We suggest achieving this by sending code which allows the client to build GUIs (i.e. the
subset needed for the configuration of an algorithm), along with a specification of
how to generate pieces of XML code from the GUI's state. This code is XML and it will
not be executed, so, to our knowledge, there is no inherent security hole.
A very simple example
Viper is a system which uses inverted files for indexing images. Each image
is translated into a variable-length sequence of features which describe the image. Each
feature is assigned a weight which depends on the frequency of the feature within
the image and within the collection. Exactly how this is done depends on the weighting
function. Both the retrieval performance and the processing speed of the system depend on the
weighting function.
Viper makes it possible to choose the weighting function at run time, using an
attribute cui-weighting-function of the algorithm element. The following property sheet
offers the choice between two weighting functions.
The basic "need" of a system would be to specify the collection, i.e. the database on
which the retrieval is to be performed. For testing and comparison it would be interesting
to have the choice between several algorithms (e.g. wavelet coefficient/color histogram
based).
A choice out of a list of two elements:
| |
<property
id = "p1"
type = "subset"
caption = "Weighting function"
visibility = "visible"
sendtype = "attribute"
sendname = "cui-weighting-function"
minsubsetsize = "1"
maxsubsetsize = "1"
>
<property
id = "p2"
type = "setelement"
caption = "Best fully weighted"
visibility = "visible"
sendtype = "value"
sendvalue = "best-fully"
defaultstate = "selected"
/>
<property
id = "p3"
type = "setelement"
caption = "Classical IDF"
visibility = "visible"
sendtype = "value"
sendvalue = "classical-idf"
defaultstate = "unselected"
/>
</property>
|
What does this do exactly?
It defines a list from which the user is allowed to choose a subset of size between 1 and
1, i.e. an exclusive choice.
When asked for its state, this list will generate an attribute, i.e. the text given by
send-name, plus =: cui-weighting-function=.
The value of the attribute will be determined as follows: the property elements
p2 and p3 are identical in structure. They denote the elements of our set,
which can be either selected or unselected. If selected, they send a text which
will be placed like an attribute value (value). This text will be "best-fully" or
"classical-idf", depending on which of the two list items is chosen by the user.
As a result: the piece of MRML above will enable the interface to set up a property
sheet which comprises a list of two items, of which one can be selected. Depending on the
selection, the interface will send either
cui-weighting-function="best-fully"
or
cui-weighting-function="classical-idf"
to the server. The MRML client will use this property sheet when generating a configure-session
message.
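The text generation just described can be sketched in Python. This is a hedged illustration only: the function name `render_subset` is hypothetical, the attribute names follow the listing above (`sendname`, `sendvalue`, `defaultstate`), the visibility attributes are omitted for brevity, and joining several selected values with spaces is an assumption that the example itself never exercises.

```python
import xml.etree.ElementTree as ET

SHEET = """<property id="p1" type="subset" caption="Weighting function"
    sendtype="attribute" sendname="cui-weighting-function"
    minsubsetsize="1" maxsubsetsize="1">
  <property id="p2" type="setelement" caption="Best fully weighted"
      sendtype="value" sendvalue="best-fully" defaultstate="selected" />
  <property id="p3" type="setelement" caption="Classical IDF"
      sendtype="value" sendvalue="classical-idf" defaultstate="unselected" />
</property>"""


def render_subset(sheet_xml, selected_ids=None):
    """Render the selected set elements of a subset property as an attribute."""
    sheet = ET.fromstring(sheet_xml)
    elements = sheet.findall("property")
    if selected_ids is None:  # fall back to the defaults declared in the sheet
        chosen = [e for e in elements if e.get("defaultstate") == "selected"]
    else:
        chosen = [e for e in elements if e.get("id") in selected_ids]
    low, high = int(sheet.get("minsubsetsize")), int(sheet.get("maxsubsetsize"))
    if not low <= len(chosen) <= high:
        raise ValueError(f"select between {low} and {high} items")
    # Assumption: several selected values would be joined with spaces.
    value = " ".join(e.get("sendvalue") for e in chosen)
    return f'{sheet.get("sendname")}="{value}"'
```

With no explicit selection the default state yields `cui-weighting-function="best-fully"`; selecting `p3` yields `cui-weighting-function="classical-idf"`.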
More complex: generate XML subtrees
The following example describes the generation of whole document subtrees. This feature
is not yet immediately useful for Viper or CIRCUS. However, it explains
how the text generated in the previous example is included in a configure-session message.
More important is the fact that it provides a general framework for describing (GUI)
entities which can send XML.
Consider the following example: imagine an algorithm which runs the query image
through a series of filters before running it through a simple query processor. This being a
research system, we would like these filters to be configurable at run time. Since each filter needs
some parameters, and the number of filters is variable, we simply need to define some
new MRML tags which permit us to describe the sequence of filters. We would like an output
like the one given below:
| |
<cui-filter-list
>
<cui-filter
cui-filter-type = "horizontal-gabor"
cui-filter-gabor-sdev = "50"
cui-filter-gabor-wavelength = "10"
/>
<cui-filter
cui-filter-type = "gauss"
cui-filter-gauss-sdev = "5"
/>
</cui-filter-list>
|
The corresponding property sheet would look like:
| |
<property
id = "p1"
type = "panel"
caption = "Filter Sequence"
visibility = "invisible"
send-type = "element"
send-name = "cui-filter-sequence"
>
<property
id = "p1"
type = "multi-set"
caption = "Filter"
visibility = "visible"
send-type = "element"
send-name = "cui-filter"
minsubsetsize = "0"
maxsubsetsize = "5"
>
<property
id = "p11"
type = "set-element"
caption = "Gaussian blur"
visibility = "pop-up"
sendtype = "attribute"
sendname = "cui-filter-type"
sendvalue = "gauss"
defaultstate = "selected"
>
<property
id = "p111"
type = "numeric"
caption = "Standard deviation"
visibility = "pop-up"
sendtype = "attribute"
sendname = "cui-filter-gauss-sdev"
/>
</property>
<property
id = "p12"
type = "set-element"
caption = "Horizontal Gabor"
visibility = "pop-up"
sendtype = "attribute"
sendname = "cui-filter-type"
sendvalue = "horizontal-gabor"
defaultstate = "selected"
>
<property
id = "p121"
type = "numeric"
caption = "Tile size"
visibility = "pop-up"
sendtype = "attribute"
sendname = "cui-filter-gabor-sdev"
numeric-from = "5"
numeric-to = "100"
numeric-step = "5"
/>
<property
id = "p122"
type = "numeric"
caption = "Tile size"
visibility = "pop-up"
sendtype = "attribute"
sendname = "cui-filter-gabor-wavelength"
numeric-from = "2"
numeric-to = "20"
numeric-step = "1"
/>
</property>
</property>
</property>
|
The example above shows exactly the scenario described: the user can
use sequences of 0 to 5 filters. Each filter can be either a Gaussian blur or a horizontal
Gabor filter (yes, this is a toy example).
The Gaussian blur can be configured by giving a number between 1 and 100, which
will be sent as an attribute (cui-filter-gauss-sdev). The Gabor filter can be configured
using the two parameters cui-filter-gabor-sdev and cui-filter-gabor-wavelength.
Both configuration panels will pop up when the corresponding filter has been
selected in the sequence.
In the following section we describe how the text is actually generated, and how dialog
dynamics are specified.
|
|
In order to demonstrate how easily MRML can be extended to other query
paradigms, we give as an example QBE for images with user annotation. We
assume that the user is invited to associate textual comments with images
he or she marks as relevant or irrelevant. Since a tag for this purpose
does not yet exist in MRML, we add an
attribute cui-user-annotation="..." to the user-relevance-element element. The
prefix cui- is added to avoid name clashes with extensions
from other groups which use MRML.
| |
<user-relevance-list
>
<user-relevance-element
image-location = "file:/images/1.jpg"
user-relevance = "1"
cui-user-annotation = "tropical fish"
/>
</user-relevance-list>
|
It is important to note here that servers which do not recognize the
cui-user-annotation attribute still can make use of the
remaining information contained in the user-relevance-element element.
As an example of how not to extend MRML, we give an extension with
the same semantics but which does not respect the principle of graceful
degradation:
| |
<user-relevance-list
>
<cui-user-relevance-element
image-location = "file:/images/1.jpg"
user-relevance = "1"
user-annotation = "tropical fish"
/>
</user-relevance-list>
|
Instead of adding an attribute to an existing MRML element (user-relevance-element),
a new element was defined that
contained the same kind of extension, namely cui-user-relevance-element.
Consequently, servers which do not
recognize this element will not be able to exploit any relevance
information.
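The difference between the two extensions can be made concrete with a small Python sketch of a server that knows only standard MRML. The function name `read_relevance` is hypothetical; the two inputs are the listings above.

```python
import xml.etree.ElementTree as ET

GOOD = """<user-relevance-list>
  <user-relevance-element image-location="file:/images/1.jpg"
      user-relevance="1" cui-user-annotation="tropical fish" />
</user-relevance-list>"""

BAD = """<user-relevance-list>
  <cui-user-relevance-element image-location="file:/images/1.jpg"
      user-relevance="1" user-annotation="tropical fish" />
</user-relevance-list>"""


def read_relevance(xml_text):
    """Read relevance judgements as a server unaware of cui- extensions might."""
    root = ET.fromstring(xml_text)
    judgements = []
    # Only the standard element name is looked up; unknown elements are skipped.
    for element in root.findall("user-relevance-element"):
        # Only the standard attributes are read; extra attributes are ignored.
        judgements.append((element.get("image-location"),
                           int(element.get("user-relevance"))))
    return judgements
```

The attribute-based extension still yields one judgement for `file:/images/1.jpg`, whereas the element-based extension yields an empty list.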
|
|
Client-server communication in MRML is a sequence of
connections. In each connection a single request or a small group of
requests is answered by the server using a single message or a small
group of messages. The state machine in the figure below
describes the communication starting at the point where the client first
makes contact with the server.
The client establishes first contact with the server by sending a
get-server-properties message. In response, the client receives
a server-properties message, which is empty in standard
MRML. However, this message is an important stub for extensions which
concern the connection itself (e.g. finding out whether the server is able
to conduct a session over a permanent connection).
After receiving the configuration description, the client will ask
for a list of sessions for a user, using the get-sessions
tag. The reply is a session-list. The client will now open
one session using open-session, getting an
acknowledge-session-op in return. The opened session is
required to have a sensible default state, i.e. a state which allows
queries.
On opening the session, the server has received the user's name, password
and the session-id, i.e. it has all the information necessary for
knowing which collections and algorithms the user should see. Please
note that no one is forced to do user-dependent configuration of the
system, but MRML makes it possible. So, after opening
the session, the client can request lists of both
collections and algorithms.
Both collections and algorithms are described by the query-paradigms
they allow, as well as some other parameters. In particular, an
algorithm can contain as an attribute the ID of a collection on which
it will be used.
Getting both a list of
collections and a list of algorithms, the client has enough
information to configure the session which has been opened: when
configuring the session, the client sends a configure-session
signal which contains an algorithm with the attributes
algorithm-id and algorithm-type set. The attribute
collection-id specifies the collection on which this algorithm is to be used.
After this, the session is fully configured and can be
queried (using query-step). Queries can be grouped into
transactions for logging and learning purposes.
Intermixed with the queries, the client is able to send
user-data to the server. These user-data tags contain user
interaction information for logging and learning purposes.
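The order of requests described above can be summarized as a small transition table. The Python sketch below is one reading of the prose, not the normative state machine: the state names and the function `check_sequence` are hypothetical, and allowing `query-step` directly after `open-session` follows from the requirement that an opened session has a default state which allows queries.

```python
# Requests allowed in each client state, and the state each one leads to.
# Assumption: this table is one reading of the prose, not the normative figure.
TRANSITIONS = {
    "start": {"get-server-properties": "properties-known"},
    "properties-known": {"get-sessions": "sessions-listed"},
    "sessions-listed": {"open-session": "session-open"},
    "session-open": {
        "get-collections": "session-open",
        "get-algorithms": "session-open",
        "configure-session": "session-open",
        "query-step": "session-open",
        "user-data": "session-open",
    },
}


def check_sequence(requests):
    """Return the final state, or raise ValueError at the first illegal request."""
    state = "start"
    for request in requests:
        allowed = TRANSITIONS[state]
        if request not in allowed:
            raise ValueError(f"{request!r} is not allowed in state {state!r}")
        state = allowed[request]
    return state
```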
|
|
Below are examples of actual messages exchanged between an MRML-compliant server and an MRML-compliant client.
Listing collections
| |
<collection-list
>
<collection
collection-id = "TSR500"
collection-name = "TSR500"
cui-algorithm-id-list-id = "ail-inverted-file"
cui-base-dir = "/home/viper/databases/TSR500/"
cui-feature-description-location = "InvertedFileFeatureDescription.db"
cui-feature-file-location = "url2fts.xml"
cui-inverted-file-location = "InvertedFile.db"
cui-number-of-images = "500"
cui-offset-file-location = "InvertedFileOffset.db"
>
<query-paradigm-list
>
<query-paradigm
cui-indexing-type = "inverted_file"
/>
</query-paradigm-list>
</collection>
<collection
collection-id = "c-1-Lausanne"
collection-name = "Lausanne 6100"
cui-algorithm-id-list-id = "ail-inverted-file"
cui-base-dir = "/home/viper/databases/Lausanne6100/"
cui-feature-description-location = "InvertedFileFeatureDescription.db"
cui-feature-file-location = "url2fts.xml"
cui-inverted-file-location = "InvertedFile.db"
cui-number-of-images = "6100"
cui-offset-file-location = "InvertedFileOffset.db"
>
<query-paradigm-list
>
<query-paradigm
cui-indexing-type = "inverted_file"
/>
</query-paradigm-list>
</collection>
</collection-list>
|
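A client might reduce such a collection list to the pieces it needs for display and for matching against algorithms. The Python sketch below uses an abridged copy of the listing above (the file-location attributes are left out for brevity); the function name `list_collections` and the shape of the returned dictionary are hypothetical.

```python
import xml.etree.ElementTree as ET

COLLECTIONS = """<collection-list>
  <collection collection-id="TSR500" collection-name="TSR500"
      cui-number-of-images="500">
    <query-paradigm-list>
      <query-paradigm cui-indexing-type="inverted_file" />
    </query-paradigm-list>
  </collection>
  <collection collection-id="c-1-Lausanne" collection-name="Lausanne 6100"
      cui-number-of-images="6100">
    <query-paradigm-list>
      <query-paradigm cui-indexing-type="inverted_file" />
    </query-paradigm-list>
  </collection>
</collection-list>"""


def list_collections(xml_text):
    """Map each collection-id to its name and query paradigms."""
    result = {}
    for collection in ET.fromstring(xml_text).findall("collection"):
        paradigms = [dict(p.attrib) for p in collection.iter("query-paradigm")]
        result[collection.get("collection-id")] = {
            "name": collection.get("collection-name"),
            "paradigms": paradigms,
        }
    return result
```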
Listing algorithms
This setup configures the interface, as it stands in the Viper demo.
| |
<algorithm-list
>
<algorithm
algorithm-id = "adefault"
algorithm-name = "Classical IDF (sep. norm.)"
algorithm-type = "adefault"
collection-id = "c-1-Lausanne"
cui-base-type = "multiple"
cui-block-color-blocks = "no"
cui-block-color-histogram = "no"
cui-block-texture-blocks = "no"
cui-block-texture-histogram = "no"
cui-uses-result-urls = "yes"
cui-weighting-function = "ClassicalIDF"
>
<property-sheet
maxsubsetsize = "4"
minsubsetsize = "1"
property-sheet-id = "cui-p1"
property-sheet-type = "subset"
send-type = "none"
>
<property-sheet
caption = "Colour histogram"
property-sheet-id = "cui-p11"
property-sheet-type = "set-element"
send-boolean-inverted = "yes"
send-name = "cui-block-color-histogram"
send-type = "attribute"
send-value = "yes"
/>
<property-sheet
caption = "Colour blocks"
property-sheet-id = "cui-p12"
property-sheet-type = "set-element"
send-boolean-inverted = "yes"
send-name = "cui-block-color-blocks"
send-type = "attribute"
send-value = "yes"
/>
<property-sheet
caption = "Gabor histogram"
property-sheet-id = "cui-p13"
property-sheet-type = "set-element"
send-boolean-inverted = "yes"
send-name = "cui-block-texture-histogram"
send-type = "attribute"
send-value = "yes"
/>
<property-sheet
caption = "Gabor blocks"
property-sheet-id = "cui-p14"
property-sheet-type = "set-element"
send-boolean-inverted = "yes"
send-name = "cui-block-texture-blocks"
send-type = "attribute"
send-value = "yes"
/>
<property-sheet
caption = "Prune at % of features"
from = "20"
property-sheet-id = "cui-p15"
property-sheet-type = "numeric"
send-name = "cui-pr-percentage-of-features"
send-type = "attribute"
send-value = "70"
step = "5"
to = "100"
/>
</property-sheet>
<query-paradigm-list
>
<query-paradigm
interaction-type = "qbe"
/>
</query-paradigm-list>
<algorithm
algorithm-id = "a1"
algorithm-name = "Color histogram"
cui-base-type = "inverted_file"
cui-block-color-blocks = "yes"
cui-block-texture-blocks = "yes"
cui-block-texture-histogram = "yes"
cui-more-config = "p-i-1"
cui-pr-percentage-of-features = "100"
cui-weighting-function = "ClassicalIDF"
/>
<algorithm
algorithm-id = "a2"
algorithm-name = "Color blocks"
cui-base-type = "inverted_file"
cui-block-color-histogram = "yes"
cui-block-texture-blocks = "yes"
cui-block-texture-histogram = "yes"
cui-more-config = "p-i-1"
cui-weighting-function = "ClassicalIDF"
/>
<algorithm
algorithm-id = "a3"
algorithm-name = "Texture histogram"
cui-base-type = "inverted_file"
cui-block-color-blocks = "yes"
cui-block-color-histogram = "yes"
cui-block-texture-blocks = "yes"
cui-more-config = "p-i-1"
cui-pr-percentage-of-features = "100"
cui-weighting-function = "ClassicalIDF"
/>
<algorithm
algorithm-id = "a4"
algorithm-name = "Texture blocks"
cui-base-type = "inverted_file"
cui-block-color-blocks = "yes"
cui-block-color-histogram = "yes"
cui-block-texture-histogram = "yes"
cui-more-config = "p-i-1"
cui-weighting-function = "ClassicalIDF"
/>
</algorithm>
</algorithm-list>
|
|
|
query-paradigms
Algorithms are described by their algorithm-id and their
algorithm-type, as well as by their query-paradigm-list.
A query-paradigm-list contains query-paradigm elements,
each of which contains an unspecified number of attributes. One of these
can be the attribute query-mode, which at present has
the possible values "qbe" or "browsing". All other attributes
are presently extensions.
The main use of the query paradigm list is to enable clients to determine
which collection can be used with which algorithm. In short, an
algorithm can be used with a collection if their
query-paradigm-lists match.
Two query-paradigm-lists L1 and L2 match if there is at
least one pair of query-paradigm elements, E1 in L1 and E2 in L2,
such that E1 and E2 match. Two query-paradigm elements E1 and E2
match if the following holds for the sets S1 and S2 of their attribute-value pairs:
|
if (a, V1) is in S1 and (a, V2) is in S2, then V1 = V2
|
In particular, a query-paradigm tag without attributes matches
any other query-paradigm tag.
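The matching rule can be written down in a few lines of Python. In this sketch each query-paradigm element is represented as a dictionary of its attributes; the function names and that representation are choices of the sketch, not of the specification.

```python
def paradigms_match(e1, e2):
    """Two query-paradigm attribute dicts match if shared attributes agree."""
    return all(e1[name] == e2[name] for name in e1.keys() & e2.keys())


def lists_match(l1, l2):
    """Two query-paradigm-lists match if at least one pair of elements matches."""
    return any(paradigms_match(e1, e2) for e1 in l1 for e2 in l2)


# Paradigm lists taken from the collection and algorithm listings above.
collection = [{"cui-indexing-type": "inverted_file"}]
algorithm = [{"interaction-type": "qbe"}]
```

Note that `lists_match(collection, algorithm)` is true here: the two elements have no attribute in common, so no pair of values can disagree.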
algorithms
As stated above, algorithms are described by their algorithm-id
and their algorithm-type, as well as by their
query-paradigm-list and (optionally) an allows-children
element (which in turn contains another query-paradigm-list).
As described in the last section, the first query-paradigm-list
specifies which collections can be queried with this algorithm,
and it informs the client about its properties. The client or
its user can then decide whether or not to proceed.
It is possible to specify algorithms recursively. Algorithms can
contain other algorithms, possibly several of the same
algorithm-type; the algorithm-id, however, has to be
unique within one configure-session statement. It is thus possible
to let the client specify meta-queries. Which kinds of meta-queries can
be built is decided by the allows-children tag. An algorithm
A1 is allowed to contain another algorithm A2 if the
query-paradigm-list contained in the allows-children tag
of A1 matches the query-paradigm-list of A2.
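The containment rule can be sketched in Python as follows. This sketch reads the rule as comparing the allows-children list of the containing algorithm with the query-paradigm-list of the contained one. The algorithm IDs and types in the sample data are hypothetical, as is the decision to treat an algorithm without an allows-children element as accepting no children.

```python
import xml.etree.ElementTree as ET


def paradigm_lists_match(l1, l2):
    """Lists match if some pair of elements agrees on all shared attributes."""
    return any(all(e1[k] == e2[k] for k in e1.keys() & e2.keys())
               for e1 in l1 for e2 in l2)


def paradigms(parent):
    """Attribute dicts of the query-paradigm-list directly below parent."""
    plist = parent.find("query-paradigm-list")
    return [] if plist is None else [dict(p.attrib) for p in plist]


def may_contain(a1, a2):
    """True if a1's allows-children list matches a2's own query-paradigm-list."""
    allowed = a1.find("allows-children")
    if allowed is None:  # assumption: no allows-children means no children
        return False
    return paradigm_lists_match(paradigms(allowed), paradigms(a2))


PARENT = ET.fromstring("""<algorithm algorithm-id="meta" algorithm-type="meta">
  <query-paradigm-list><query-paradigm query-mode="qbe" /></query-paradigm-list>
  <allows-children>
    <query-paradigm-list><query-paradigm query-mode="qbe" /></query-paradigm-list>
  </allows-children>
</algorithm>""")
QBE_CHILD = ET.fromstring("""<algorithm algorithm-id="a1" algorithm-type="simple">
  <query-paradigm-list><query-paradigm query-mode="qbe" /></query-paradigm-list>
</algorithm>""")
BROWSING_CHILD = ET.fromstring("""<algorithm algorithm-id="a2" algorithm-type="simple">
  <query-paradigm-list><query-paradigm query-mode="browsing" /></query-paradigm-list>
</algorithm>""")
```

Here `PARENT` may contain `QBE_CHILD` but not `BROWSING_CHILD`, because the latter disagrees on the shared `query-mode` attribute.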
|
References
[1] Wolfgang Müller, Zoran Pečenović, Arjen P. de Vries, David McG. Squire, Henning Müller, Thierry Pun. MRML: Towards an extensible standard for multimedia querying and benchmarking (Draft Proposal). Technical report No. 99.04, Computer Vision Group, Computing Centre, University of Geneva, rue Général Dufour, 24, CH-1211 Genève, Switzerland, October 1999.
[2]
[3]
[4]
|
|