Content-Based Search and Annotations in Multimedia Digital Libraries
Laboratory of Interactive and Cooperative Technologies, Universidad de las Américas, Puebla, México
Abstract
This paper describes a solution for the organization and management of multimedia collections in digital libraries. The Video U-DL-A (VUDLA) system allows for storage, indexing and annotation of multimedia documents in such a way that text- and image-based queries can be issued in order to retrieve specific scenes from digital video collections. Technologies such as image and speech processing, video streaming, multimedia databases, information retrieval and graphical user interfaces are integrated to produce a novel multimedia, multimodal environment which re-valuates text as an important knowledge transmission medium. We have developed a fully operational testbed to explore multimedia data properties and organization possibilities as well as a wide range of services, producing "knowledge centers" in which authoring, reading and viewing multimedia documents occur seamlessly along with activities such as searching, annotating and linking.
1. Introduction
Internet information is much more than just text: increasingly, multimedia resources are becoming available through global networks, including documents with images, audio and video. At the same time, fast data access from any place at any time is motivating developments in several directions to organize huge multimedia information repositories and to make them accessible to users in a simple and effective way [15]. The challenge we are facing is to move ahead from a text-based information retrieval paradigm that has been derived from and applied to conventional databases onto new retrieval alternatives that include audio and video as data sources, and generally to develop mechanisms for handling these richer data formats.

In this paper we describe the design and implementation of VUDLA, a system component that endows a digital library with capabilities for managing multimedia collections. In particular, VUDLA introduces a data model for digital video as well as content-based search and annotation mechanisms. In order to extend the digital library architecture, VUDLA integrates diverse technologies such as speech and image processing, video streaming, multimedia databases and information retrieval (publications concerning this work can be found at [9]).

The remainder of the paper is organized as follows: Section 2 provides some additional background on the context of the project. Next, we describe the design of a video digital library and discuss the main aspects of its prototypical implementation in Sections 3 and 4, respectively. We present preliminary results in Section 5, refer to salient related work in Section 6, and discuss ongoing work and conclusions in Sections 7 and 8.
2. Context

2.1 Multimedia objects in databases

For some time now, commercial database management systems have been incorporating technologies for multimedia management, extending regular data models [17]. In limited ways, commercial products have proposed storage and temporal functions as well as annotating services for multimedia materials. Exploiting or customizing this functionality is not trivial, so projects supporting multimedia technology are usually based on commercial products extended with proprietary software. Moreover, projects involving multimedia management usually need to cope with growing hardware and software requirements [6].

2.2 Streaming media

Streaming media technology continuously sends data from a server to a client. Depending on available bandwidth, the server may adjust the transfer rate, allowing for real-time visualization on the client side, which in this way does not need to receive a complete copy of the material before playback begins.
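As a rough illustration of this adaptation logic, the following sketch (a minimal example in Python, not the protocol of any particular streaming server; the encoding rates and the headroom factor are illustrative assumptions) selects the highest encoding rate a client link can sustain:

    # Sketch: choose a transfer rate for a streaming session from the
    # bandwidth measured for a client. Rates and the 20% headroom are
    # illustrative, not taken from any specific server.

    ENCODINGS_KBPS = [56, 128, 350]  # available encoding rates

    def select_rate(measured_bandwidth_kbps):
        # Keep ~20% headroom so delivery stays ahead of playback.
        usable = measured_bandwidth_kbps * 0.8
        candidates = [r for r in ENCODINGS_KBPS if r <= usable]
        return max(candidates) if candidates else min(ENCODINGS_KBPS)

    print(select_rate(450))  # 350: the full-quality stream fits the link
    print(select_rate(100))  # 56: fall back to the lowest rate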
2.3 Content-based search

Traditional functions over multimedia utilize what may be referred to as their syntactical properties: moving forward and rewinding functions manipulate the temporal structure of these data types [3]. Nevertheless, searching for particular events or automatic summarizing requires semantic document analysis. Currently, semantic analysis is performed by humans in tasks such as interactive CD creation, but for digital libraries the amount of data to be processed makes it necessary to define automatic techniques for generating content representations of multimedia information. Metadata generation for multimedia data can be accomplished using speech recognition [31] and image processing as indexing technologies, whereas information retrieval models may be used to organize metadata and facilitate searching through the resulting index space.
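As a concrete (and deliberately simplified) illustration of this pipeline, the sketch below organizes words spotted by a speech recognizer into an inverted index that maps each term to (video, time) occurrences, so that a text query can retrieve specific instants in a collection. The identifiers and timestamps are invented for the example:

    # Sketch: an inverted index over time-stamped spotted words.
    from collections import defaultdict

    index = defaultdict(list)  # term -> list of (video_id, seconds)

    def add_spotted_word(term, video_id, seconds):
        index[term.lower()].append((video_id, seconds))

    def search(term):
        # Returns every instant at which the term was spotted.
        return index.get(term.lower(), [])

    add_spotted_word("bibliotecas", "interview-01", 12.4)
    add_spotted_word("bibliotecas", "lecture-07", 301.0)
    print(search("Bibliotecas"))  # both occurrences, with timestamps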
2.4 Annotations

Enabling text and graphic annotations in video materials is important in order to analyze and share views on the contents presented in multimedia format. Simone [24] observes that writing has allowed for more articulate, refined and complex forms of expression, whereas knowledge expressed or acquired from audiovisual media tends to be less articulate and less subtle. Some of the advantages of the cognitive processes associated with reading are the following:

Pacing. In general, the pace of reading is determined by the reader ("pulling" text as desired), whereas the pace of viewing a video is decided by its author, who "pushes" contents at a predetermined rate.

Corrigibility. A reader may stop at any point in time to reflect on the text just read; a viewer cannot do this easily.

Encyclopedic references. As a result of its user-controlled pace, reading allows users to stop and use complementary sources, whereas this cannot be done easily when viewing without disrupting the intended rhythm of the materials.

Citability. A text that has been read can be easily cited or even quoted. What has been viewed does not exhibit this property.

Simone explains that nowadays reading is clearly losing adherents, mainly due to the effortless way in which audiovisual information is acquired ("the effort of reading cannot compete with the ease of viewing"). We conceive of linking, annotating and searching in VUDLA as an effort to combine the sequential intelligence derived from reading with the simultaneous intelligence fostered by viewing.
3. Adding video to a digital library
We have extended the architecture of a digital library to include multimedia collections and services. Our work is framed by an initiative we termed University Digital Libraries for All (U-DL-A). U-DL-A has produced a highly distributed digital library that now comprises a wide range of collections, services and user interfaces. Collections include theses and dissertations, university publications and historic archives. Services vary from information retrieval methods to agent services and interoperability mechanisms. Finally, user interfaces include personal and group spaces, visualization aids and user agents.

3.1 Digital library architecture

The general architecture of VUDLA, including the new components designed to support multimedia functionality, is illustrated in Figure 1. The architecture is an evolution of the one proposed at the inception of our digital libraries program [19]. As can be observed, this is a layered, extensible, client-server architecture comprising collections, data management and modeling facilities as well as a variety of services on the server side, and various user interfaces and work environments on the client side. In Figure 1, the major components that needed to be developed to extend the existing digital library (DL) architecture are presented in a darker shade, whereas components that were adapted or used directly by VUDLA (as described in the next section) are presented in a lighter shade.

Figure 1. General DL architecture with video management capabilities (digital library clients with multimedia interfaces; a digital library server relying on standards such as XML, Dublin Core, DTDs and OpenGIS; and digital collections of theses, images, videos, etc.).

The storage and retrieval of digital video constitute a first step towards the exploration of video in a digital library. Visualizing, controlling, annotating, querying, and linking video segments to and from other resources are among the desirable capabilities. Thus, VUDLA has been designed to include the following functionality:

• Interfaces that allow diverse users to maintain multimedia collections as well as to visualize, search, broadcast (or multicast), annotate, link from and refer to any portion of their contents.

• A video streaming server to provide efficient delivery of video content.

• A speech recognition service to generate textual transcriptions of the audio track that accompanies the video (word spotting). These metadata can be used later on when searching the video collections for the occurrence of specific utterances.

• An image processing service to create vector representations of the key video frames based on their color and texture contents. This also produces helpful metadata for searching the video collections for the occurrence of images or frames with certain characteristics.

• An information retrieval server to provide rich access means to the metadata generated by the other services. Various information retrieval models should be available so the user may be presented with multiple paths to explore the multimedia collections.

3.2 Digital Video Data Modeling

There has been a significant amount of work in the area of digital video data modeling that is applicable when defining the structure of a multimedia collection. Approaches vary in such aspects as the levels of granularity (from frames to segments) or content considerations (from purely physical features to more elaborate models that involve the notions of scenes, subjects or hyper-linking provisions). Some relevant work in this area can be found, for example, in [27] [8] [10]. Progress in modeling and manipulating multimedia collections has permeated research and commercial database systems. In the domain-independent data model that we opted for in our DL design, multiple metadata layers can be associated with any instant of a given video segment, so applications can be developed to manipulate multimedia collections in flexible and varied manners. Figure 2 exemplifies a video segment with which five layers of metadata have been associated: three words have been spotted by a speech processing service; four images have been taken from video frames and processed as color and texture vectors; a website is associated with the second word; two textual annotations have been typed and should be displayed at specific times; and, in addition, a graphical annotation appears before the second textual comment is presented.

Figure 2. Metadata layers of a video segment.
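The layered model lends itself to a simple representation. The sketch below (field and stratum names are our own illustration, not the schema of the actual implementation) shows a segment carrying named metadata strata, each holding entries anchored to instants of the segment's timeline:

    # Sketch of the layered metadata model: a segment carries named
    # strata (spotted words, key-frame vectors, web links, annotations),
    # each a list of entries anchored to instants of the timeline.
    from dataclasses import dataclass, field

    @dataclass
    class Entry:
        seconds: float   # instant of the segment the entry refers to
        payload: object  # word, feature vector, URL, annotation text...

    @dataclass
    class VideoSegment:
        video_id: str
        strata: dict = field(default_factory=dict)  # name -> [Entry]

        def add(self, stratum, seconds, payload):
            self.strata.setdefault(stratum, []).append(Entry(seconds, payload))

        def at(self, stratum, seconds, window=1.0):
            # Entries of a stratum within `window` seconds of an instant.
            return [e for e in self.strata.get(stratum, [])
                    if abs(e.seconds - seconds) <= window]

    seg = VideoSegment("interview-01")
    seg.add("words", 12.4, "bibliotecas")
    seg.add("weblinks", 12.4, "http://ict.udlap.mx")
    print(seg.at("weblinks", 12.0))  # links anchored near second 12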
4. The VUDLA prototype
4.1 Supporting Technologies

As noted previously, we conceive of the construction of VUDLA as the result of integrating various technologies. Our prototype has resulted from integrating the following:

• A graphical user interface (GUI) implementing most of the desired functionality. We relied for this on Java and its specialized classes for graphical user interfaces (Swing), Internet communication (RTP), and media handling (JMF) [25].

• A speech processing component for generating metadata from the accompanying audio track. Since our materials are mostly in Spanish, we used a speech recognizer based on neural networks that has been trained for Mexican Spanish [23]. This work specializes a speech processing toolkit known as CSLU [26], of which a "shell" version has been used [22]. The corpus used to train the three-layer neural network was generated in Mexico at the speech recognition lab of the University.

• The Ozono media streaming server [13], a video-on-demand component based on Java JMF. We are using the Solaris-based version 1.1.5 of Ozono, which provides guaranteed quality of service at 350 kbps.

• The Hermes Information Retrieval Server, which provides access to several information retrieval models and facilitates access to collections in the library [14]. At present, Hermes implements three models for information retrieval: vector space, extended Boolean and latent semantic indexing [2].

• The Informix RDBMS and its multimedia extensions (Video and Image "DataBlades") [12] [30]. These modules provide data types and SQL extensions that allow for video management and image processing [4] using the data model described in Section 3.2.

We describe the operation and properties of VUDLA next.

4.2 Adding and managing multimedia content

VUDLA offers user interfaces so multimedia content can be easily added to the collections. Every time a video document is added, the speech recognition and image processing systems generate the metadata that will be used to support video querying and viewing. All the words recognized by the speech processing system are organized as a metadata stratum, whereas image processing generates color and texture indexes from key images taken from video frames.

An interface is also available that allows specialized users to create annotations associated with stored multimedia objects. Users may add textual or graphical annotations or specify web links that will be attached to materials at specific points in time. These analytic and supplementary metadata are also organized as metadata strata.

The information retrieval component takes audio transcriptions resulting from speech processing and text annotations provided by users. Lemmatization and stopword elimination processes are applied to create the indexes used for searching. The text search module uses the temporal DBMS component to manipulate time pointers to video segments. These functions allow VUDLA to establish relationships among the different metadata layers. For example, the Overlap function reports if two segments have a time point in common, as the sketch below illustrates.
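A minimal sketch of this relationship (intervals in seconds over the same timeline; the endpoints are illustrative, and this is a sketch rather than the DBMS implementation itself) follows. The same test underlies the "validity" times of web links described next:

    # Sketch of the Overlap relationship: two time intervals overlap
    # when they share at least one time point. Endpoints illustrative.
    def overlap(a_start, a_end, b_start, b_end):
        return a_start <= b_end and b_start <= a_end

    # An annotation spanning seconds 10-20 overlaps a word spotted at 15:
    print(overlap(10.0, 20.0, 15.0, 15.0))  # True
    print(overlap(10.0, 20.0, 25.0, 30.0))  # False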
Web links are presented as URLs and their contents can be displayed in a web browser, according to user preferences. Moreover, when materials of the collection are visualized, all web links associated with a video document are active and can be selected through the visual component when the current display time matches the "validity" time of a link attached to that material. This is possible because our data model includes a metadata stratum for web links referring to video objects.
For image searching, the interface allows the user to load a target image from the local file system. This image's color and texture components are compared with those derived from the key images stored in the corresponding metadata stratum. In our current implementation, when the comparison results in a similarity measure of 95% or higher in at least one of the two criteria, the video segment associated with the corresponding key image is incorporated into the result set. The level of similarity should be tuned according to application and user requirements.
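Under the stated rule, a hedged sketch of the comparison is shown below. Cosine similarity over feature vectors is our assumption for illustration; the similarity measure actually computed by the image processing extensions is not detailed here:

    # Sketch: a key frame is a hit when color OR texture similarity
    # reaches the threshold. Vectors and threshold are illustrative.
    import math

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv) if nu and nv else 0.0

    def matches(query, key_frame, threshold=0.95):
        return (cosine(query["color"], key_frame["color"]) >= threshold or
                cosine(query["texture"], key_frame["texture"]) >= threshold)

    q  = {"color": [0.8, 0.1, 0.1], "texture": [0.2, 0.5, 0.3]}
    kf = {"color": [0.7, 0.2, 0.1], "texture": [0.9, 0.05, 0.05]}
    print(matches(q, kf))  # True: the color vectors are similar enough

The threshold argument corresponds to the tunable similarity level mentioned above.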
4.3 VUDLA Interfaces

The main user interface of VUDLA is illustrated in Figure 3. The three tabs at the top allow users to switch from navigating the collections to searching their contents or to receiving a multicast video for one-way videoconferencing. Though the latter is a useful feature, we focus our discussion on the first two modes (navigation and search), as they present very similar interfaces and demonstrate our approach.

The figure shows the results of searching the video collection for utterances that are related to "libraries" ("bibliotecas" in Spanish). The metadata of video segments that matched the queries, as well as the relevant time periods, are returned by the information retrieval service and listed in the top left portion of the interface. In the figure, the user has selected one of the segments and is viewing its contents, as supplied by the Ozono media server, on the bottom left panel. Naturally, the user may easily move back and forward in the video segment or control the audio volume by using the buttons and sliders displayed in this part of the interface. On the panels to the right of the main interface, three major metadata strata are displayed as they appear in the timeline when the video is played. In the figure, a textual annotation is being displayed when the video reaches a specific instant. Similarly, a reference to a website and a related image are shown. The user may choose to stop the video and check the annotations, visit the referenced websites, or just continue to examine the video segment. The numbers to the right of the annotations provide handles to multiple associated metadata so they can be revisited by the user at will.

Figure 3. Client user interface of VUDLA.

5. Preliminary results

The VUDLA prototype has been undergoing tests in two major areas: functionality and usability. In the first area, the main challenge is to integrate diverse technologies and get them to work seamlessly. In the second area, we aim at demonstrating that VUDLA enables the use of digital video in new, much richer ways when compared to its analog equivalent, and that users are able to perform tasks involving multimedia resources not only without the limitations referred to in Section 2, but with the added value of functionality made possible only by the digital nature of the medium.

The current implementation of VUDLA has been in use for one semester. Our video collection is mostly academic, completely in Spanish, and it includes interviews with researchers from various areas, lectures and software presentations. Its use has been restricted to our research center and a few volunteer users, but is now open for beta testing at http://ict.udlap.mx/people/anibal/.

5.1 Speech recognition

Tests were carried out to determine the reliability of the speech recognition system. Using a general-purpose corpus of 300 voices sampled at 8 kHz in a noise-free environment, a threshold was established to make decisions from the ranking that the system gives to the words of the vocabulary. We defined a vocabulary for each video in the repository by choosing a set of words that might identify its contents.
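The decision procedure can be sketched as follows (the scores and the threshold value are invented for illustration; the actual threshold was established empirically from the corpus):

    # Sketch: accept a vocabulary word as occurring in the audio track
    # only if the recognizer ranks it above a decision threshold.
    THRESHOLD = 0.7  # illustrative; the real value was tuned empirically

    def spotted_words(rankings, threshold=THRESHOLD):
        # rankings: dict mapping vocabulary word -> recognizer score
        return [w for w, score in rankings.items() if score >= threshold]

    scores = {"bibliotecas": 0.91, "videotecas": 0.74, "discotecas": 0.35}
    print(spotted_words(scores))  # ['bibliotecas', 'videotecas']
    # Phonetically similar words can also exceed the threshold, which
    # is the substitution problem discussed below.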
In general, with our current settings, 60% of the words in the audio track were correctly ranked (detected as part of the audio track or not) by the system and can be used for content-based queries. When analyzing recognition problems, we found that most of them were substitution errors due to the fact that the recognizer assigns a high ranking to words that are phonetically similar to those in the vocabulary. If more precision is needed, the speech processing component allows for finer precision settings.

5.2 Video Streaming

Communication with the Ozono media streaming server allows VUDLA users to navigate through the video collections and within each video segment in a transparent fashion. Video is delivered in response to requests submitted from the graphical user interface and can be displayed according to specified parameters (e.g. starting at a given time point).

5.3 Information Retrieval

This is one of the most robust system components. Textual audio transcriptions, user annotations and other metadata are used extensively in the application of the available information retrieval models. The way metadata and rankings returned by the Hermes server are displayed has allowed users to identify video segments that best fit their needs.
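As an illustration of the simplest of the three models Hermes implements, the sketch below ranks (invented) transcription fragments against a query with the vector space model using raw term frequencies; Hermes's actual weighting scheme is not specified here:

    # Sketch: vector space retrieval over transcriptions/annotations.
    import math
    from collections import Counter

    def vectorize(text):
        return Counter(text.lower().split())

    def cosine(u, v):
        dot = sum(u[t] * v[t] for t in set(u) & set(v))
        nu = math.sqrt(sum(c * c for c in u.values()))
        nv = math.sqrt(sum(c * c for c in v.values()))
        return dot / (nu * nv) if nu and nv else 0.0

    docs = {"seg-1": "bibliotecas digitales y video",
            "seg-2": "procesamiento de imagenes"}
    q = vectorize("bibliotecas digitales")
    ranked = sorted(docs, key=lambda d: cosine(q, vectorize(docs[d])),
                    reverse=True)
    print(ranked)  # ['seg-1', 'seg-2']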
5.4 Image Processing

Image characteristics are highly variable, and factors such as lighting, focus, or subject position or motion greatly influence any color or texture analysis. In our current implementation, color and texture comparisons have been useful only when images used in queries come directly from (or are very similar to) frames in video sequences in the collection.

5.5 Usability

From the most preliminary interface designs, usability has been the most important area to evaluate [16]. Around ten important usability problems were initially detected and corrected, some of them requiring important development time but substantially improving the system's user interface.

What is most important for our research interests, we have been able to observe users interact with the new multimedia collections as we expected. It is useful to revisit the relevant media traits introduced in Section 2.4, which guided our design:

Pace. In VUDLA, the interaction with resources in all available media is completely under the user's control. The user determines the pace at which temporal media are played.

Corrigibility. Users of VUDLA may stop at any point in time to reflect on resources being examined, regardless of their type. If audio or video are momentarily suspended, other activities may take place and streaming will continue when the user so decides.

Encyclopedic references. Not only can complementary sources be used while examining multimedia resources in VUDLA, but those resources may be associated directly with a section of a document and become available immediately for examination.

Citability. Content-based search over digital video using text or images, as well as the possibility to point to any instant within a video segment, make multimedia collections citable and quotable.

VUDLA redefines the concepts of reading and viewing. In the digital library, reading may be an isolated activity, but the reader may also choose to be aware of the presence of other readers and decide to interact with them by taking advantage of the library's collaboration facilities. Though videoconferencing has been mentioned as one of the features of VUDLA, other communication and awareness facilities are available in the overarching digital library (as described, for example, in [20] and [21]).
6. Related work

Given its integrative nature, our project builds upon a significant number of related efforts. Work on multimedia has been undertaken from several perspectives, and much progress has been made in areas such as compression, transmission and storage techniques. Semantic analysis and indexing make it possible to perform content-based search and summarization in multimedia documents.

Informedia [29][28] is one of the most complex projects in the area of multimedia information management. Its first stage also focused on the integration of speech, image and language processing for creating digital video libraries. About 1500 hours of TV news have been segmented into some 40,000 independent segments and used to explore issues in storage, retrieval and applications of digital video. A second stage of the project is focusing on summarizing and visualizing video information as well as on spatial and temporal analysis for query processing. The important ramifications and accomplishments of the Informedia project have strongly influenced work in this area, including our own.

The QBIC (query by image content) system [5] was developed to retrieve images based on their visual characteristics (color, texture and form). The user is able to specify search parameters in queries. For video management, the system has functions for scene detection, histogram change analysis, and camera movements.

In order to search in multimedia documents where most of the information is in audio format, researchers of the VideoMail project [11] designed a speech recognition system based on Hidden Markov Models and a text retrieval engine. The user interface presented multimedia information to the user even if all the background processing was text-based.

An interesting development in multimodal interfaces is the Music Library [32], a system with tune recognition functionality. Acoustic input mechanisms allow users to sing or whistle a melody to search through more than 10,000 tunes.

In [18] the author attempts to describe human behavior in a video sequence. Based upon silhouette and facial expression recognition, an estimation model is constructed to establish probabilities of what the person in the video is doing.

Our project has been strongly influenced by Informedia and other projects. We also aim at integrating technologies and generally exploring digital video as a medium with properties that are different from text, still images and analog video. However, we have emphasized issues that have not appeared in other projects or have received only marginal attention from other research groups. From technical and cultural viewpoints, we are interested in developing video corpora in Spanish and exploring mechanisms for retrieval that are particularly appropriate for our collections. From a more philosophical perspective, we are interested in the impact of the deployment of multimedia collections in the context of communities with a strong, reading-based research tradition. Our proposal ultimately aims at contributing to make digital libraries a catalyst for the transformation of work practices in knowledge-intensive environments.

7. Ongoing work

VUDLA is a relatively young project and can use that fact to its advantage, as much of the ongoing work in the area (e.g., projects referred to in the previous section) can be thought of as an opportunity for improvement. For instance, visual information retrieval does not yet produce impressive results in VUDLA. In that respect, new processing techniques, including contour and form detection, are being studied and simulated. In that direction, the Video-Cuebik system [7] appears as a promising alternative to explore.

Some of the existing functionality in our prototypical implementation has ample room for improvement. For example, we plan to provide various formatting options for textual annotations that we think will improve text readability. We will also implement frame hotspots, which should map specific frame regions to annotations or web links. We have also started work to explore heuristics and formal models for relationships among video metadata [1]. Other scheduled evolutions include a mechanism that allows annotations to be kept personal rather than shared, means for dealing with an ever-growing number of annotations, and the exploration of new video data models.

As mentioned previously, VUDLA was originally developed using the Informix RDBMS and its extensions for multimedia management. This made our prototype strongly dependent on proprietary data types and functions. In order to generalize the operation of the system so that any RDBMS can be used, we are in the process of migrating our current implementation to MySQL. The process has required the implementation of functions at the various architectural levels. About 80% of the functionality available in the Informix-based version is now operational with MySQL.

Overall, we think of VUDLA as a testbed to explore the potential of multimedia digital libraries. Systematic user involvement and observation, corpus construction, and technology integration are driving issues in our agenda.
8. Conclusions
Multimedia collections are increasingly popular and in many ways are becoming a fundamental means for knowledge dissemination. New digital video collections are created every day and occupy huge amounts of disk space. Novel digital libraries integrate technologies and provide interfaces for seamless media integration that foster rich interactions between users and media and among media themselves. We have described VUDLA, a research and development project that integrates image and speech processing with database and information retrieval techniques to produce a DL with desirable media properties.
One of the concerns regarding the prevalence of video and the abandonment of reading is that video appears as a regression from the structured, sequential reasoning promoted by reading to more primitive, sensorial forms of perception. In multimedia digital libraries, content is experienced by the user's senses, but associative thinking is also promoted. While some of the mind-structuring properties of knowledge construction based primarily on reading may be at risk, new and potentially more powerful cognitive processes are triggered by multimedia DLs. Our work makes a contribution to the study of the properties and means for organization, manipulation and presentation of multimedia collections in digital libraries.

9. Acknowledgments

We would like to express our appreciation to Hugo López and Arturo Lazo of our Videoconferencing Department for their support in the construction of the Ozono Media Streaming Server. We also thank our colleagues in the Tlatoa Speech Processing Lab, whose support allowed us to use the CSLU Toolkit. We appreciate the collaboration of Fernanda Maldonado in the design and implementation of the Hermes server. This work has received financial support from the National Council of Science and Technology (Conacyt Projects No. 35804-A and G33009-A).

10. References

[1] Ahanger, G., Little, T.D.C. Data semantics for improving retrieval performance of digital news video systems. In Database Semantics. Kluwer Academic Publishers, 1999, 47-.

[2] Baeza-Yates, R. and Ribeiro-Neto, B. Modern Information Retrieval. Addison-Wesley, 1999.

[3] Bernard, M., Dubois, F. An Agent-based Architecture for Content-Based Multimedia Browsing. In Intelligent Multimedia Information Retrieval.

[4] Del Bimbo, A. Visual Information Retrieval. Morgan Kaufmann, 1999.

[5] Flickner, M., Sawhney, H., Niblack, W. Query by Image and Video Content: The QBIC System. IEEE Computer 28, 9 (1995), 23-32.

[6] Grosky, W. I. Managing Multimedia Information in Database Systems. Communications of the ACM 40, 12 (December 1997), 72-.

[7] Hauptmann, A., Papernick, N. Video-Cuebik: Adapting Image Search to Video Shots. In Proceedings of ACM JCDL'02 (Portland, Oregon, July 13-17), 2002, 156-157.

[8] Hjelsvold, R., Midtstraum, R. Modelling and querying video data. In Proceedings of the 20th VLDB Conference, 1994, 686-694.

[9] ICT. Laboratory of Interactive and Cooperative Technologies. Universidad de las Américas, Puebla, México. http://ict.udlap.mx.

[10] Jiang, J., Elmagarmid, A. K. Spatial and temporal content-based access to hypervideo databases. The VLDB Journal 7, 4 (1998), 226-238.

[11] Jones, G. and Foote, J. Video Mail Retrieval Using Voice: An Overview of the Stage 2 System. In Proceedings of the MIRO Workshop, University of Glasgow, 1995.

[12] Informix. Informix Video Foundation DataBlade Module.

[13] López, H., Lazo, A. The Ozono Digital Media Server. Internal report, Special Projects and Videoconferencing Department, Universidad de las Américas, Puebla, México, 2002. Available from http://www.udlap.mx/~proyesp.

[14] Maldonado, F. Hermes: Servidor y biblioteca de modelos de recuperación de información. B.Eng. thesis, Universidad de las Américas, Puebla, México.

[15] Maybury, M. T. (Ed.). Intelligent Multimedia Information Retrieval. The MIT Press, Bedford, Mass., 1997.

[16] Nielsen, J. and Molich, R. Heuristic evaluation of user interfaces. In Proceedings of ACM CHI'90 (Seattle, WA, April 1-5), 1990, 249-256.

[17] Ozsu, T. M. Issues in Multimedia Database Management. http://www.cs.ualberta.ca/~database/multimedia.

[18] Pentland, A. Perceptual intelligence. Communications of the ACM 43, 3 (2000), 35-44.

[19] Sánchez, J. A., Leggett, J. J. Agent services for users of digital libraries. Journal of Network and Computer Applications.

[20] Sánchez, J. A., García, A. J., Proal, C., Fernández, L. Enabling the collaborative construction and reuse of knowledge through a virtual reference environment. In Proceedings of the Seventh International Workshop on Groupware (Darmstadt, Germany, Sept. 6-8). IEEE Computer Society Press, Los Alamitos, Calif., 2001, 90-97.

[21] Sánchez, J. A., Proal, C., Carballo, A., Pérez, D. Personal and group spaces: Integrating resources for users of digital libraries. In Proceedings of the Workshop on Human Factors in Computing Systems (IHC 2001, Florianópolis, SC, Brazil, Oct. 15-17), 2001.

[22] Schalkwyk, J., Colton, D., Fanty, M. The CSLUsh toolkit for automatic speech recognition. Technical Report No. CSLU-011-96, Center for Spoken Language Understanding, Oregon Graduate Institute of Science & Technology, 1996.

[23] Serridge, B., Cole, R., Barbosa, A., Munive, N., Vargas, A. Creating a Mexican Spanish version of the CSLU toolkit. In Proceedings of the International Conference on Spoken Language Processing (ICSLP), Sydney, Australia, 1998.

[24] Simone, R. La Terza Fase. Laterza, Rome, Italy, 2000.

[25] Sun Microsystems. Java Media Framework API Guide. Sun Microsystems, 1999.

[26] Sutton, S., Cole, R., de Villiers, J., Schalkwyk, J., Vermeulen, P., Macon, M., Yan, Y., Kaiser, E., Rundle, B., Shobaki, K., Hosom, P., Kain, A., Wouters, J., Massaro, M., Cohen, M. Universal Speech Tools: The CSLU Toolkit. In Proceedings of the International Conference on Spoken Language Processing (ICSLP), Sydney, Australia, November 1998, 3221-3224.

[27] Tusch, R., Kosch, H., Böszörményi, L. VIDEX: An integrated generic video indexing approach. In Proceedings of the Eighth ACM International Conference on Multimedia, 2000, 448-.

[28] Wactlar, H. D. Informedia - Search and Summarization in the Video Medium. In Proceedings of the Imagina 2000 Conference.

[29] Wactlar, H. D., Kanade, T., Smith, M. A., Stevens, S. M. Intelligent Access to Digital Video: Informedia Project. IEEE Computer 29, 5 (1996), 46-52.

[30] Westermann, U. and Klas, W. Architecture of a DataBlade Module for the Integrated Management of Multimedia Assets. In Proceedings of the Workshop on Multimedia Intelligent Storage and Retrieval Management (MISRM, Orlando, Florida), October 1999.

[31] Witbrock, M. and Hauptmann, A. Speech recognition for a digital video library. http://www.informedia.cs.cmu.edu.

[32] Witten, I., McNab, R., Jones, S. Managing Complexity in a Distributed Digital Library. IEEE Computer 32, 2 (1999), 74-79.