MNEMOSYNE Project

MNEMOSYNE is a three-year research project co-funded by MICC – University of Florence and the Tuscany Region – European Social Fund. The project concerns the study and experimentation of smart environments for the protection and promotion of artistic and cultural heritage. It adopts natural interaction paradigms for accessing and manipulating multimedia information, and relies on the passive analysis of visitors’ behavior [1] to estimate their interests.

Computer vision techniques currently applied in cultural heritage environments such as museums usually provide protection by detecting situations of potential risk and notifying the operators responsible for safety. The idea of the project is to use techniques derived from video-surveillance scenarios to design an automatic profiling system capable of understanding the personal interests of many visitors.

The computer vision system monitors and analyzes the movements and behaviors of visitors in the museum (through fixed cameras) in order to extract an interest profile for each visitor. This profile is then used to personalize the delivery of in-depth multimedia content, enabling an augmented museum experience. Visitors interact with the multimedia content through a large interactive table installed inside the museum. The project also includes the integration of mobile devices (such as smartphones and tablets), offering a take-away summary of the visit and suggesting theme-related paths through the museum collection or other places in the city.
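
To make the passive profiling idea concrete, here is a minimal sketch of how an interest profile could be derived from a visitor’s track, using dwell time near each artwork as a proxy for interest. This is an illustration only, not the actual MNEMOSYNE implementation: the artwork positions, attention radius, and frame rate are hypothetical placeholders.

import math
from collections import defaultdict

# Hypothetical artwork positions on the museum floor plan (in metres).
ARTWORKS = {"artwork_a": (2.0, 3.5), "artwork_b": (7.0, 1.0), "artwork_c": (4.5, 6.0)}
ATTENTION_RADIUS = 1.5   # a visitor within this distance is assumed to be attending
FRAME_PERIOD = 1.0 / 25  # seconds per frame, assuming a 25 fps camera

def interest_profile(track):
    """Turn a visitor track [(x, y), ...] into a normalized interest profile."""
    dwell = defaultdict(float)
    for x, y in track:
        for name, (ax, ay) in ARTWORKS.items():
            if math.hypot(x - ax, y - ay) <= ATTENTION_RADIUS:
                dwell[name] += FRAME_PERIOD  # accumulate dwell time near this artwork
    total = sum(dwell.values())
    return {name: t / total for name, t in dwell.items()} if total else {}

# Example: a visitor lingers near artwork_a, stops briefly at artwork_c, then leaves.
track = [(2.1, 3.4)] * 200 + [(4.6, 5.9)] * 50 + [(9.0, 9.0)] * 30
print(interest_profile(track))  # {'artwork_a': 0.8, 'artwork_c': 0.2}

In the deployed system the track itself comes from the person detection, tracking, and re-identification components described below, and a dwell-based profile of this kind can then drive which in-depth content the interactive table offers first.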

My work in this project was to build the back-end and computer vision systems, in particular the passive profiling of visitors inside the museum, together with the related research themes of re-identification [2, 3, 4], tracking [5], and person detection [6]. The complete MNEMOSYNE system [7] is currently being deployed at the Bargello Museum in Florence.

Related publications

[1] [pdf] [doi] S. Karaman, A. D. Bagdanov, G. D’Amico, L. Landucci, A. Ferracani, D. Pezzatini, and A. Del Bimbo, “Passive Profiling and Natural Interaction Metaphors for Personalized Multimedia Museum Experiences,” in MM4CH’13 – New Trends in Image Analysis and Processing – ICIAP 2013, Naples, Italy: Springer, 2013, pp. 247–256.
[Bibtex]
@incollection{karaman2013passive,
title = {Passive Profiling and Natural Interaction Metaphors for Personalized Multimedia Museum Experiences},
author = {Karaman, Svebor and Bagdanov, Andrew D and D’Amico, Gianpaolo and Landucci, Lea and Ferracani, Andrea and Pezzatini, Daniele and Del Bimbo, Alberto},
booktitle = {MM4CH'13 - New Trends in Image Analysis and Processing -- ICIAP 2013},
doi = {10.1007/978-3-642-41190-8_27},
pages = {247--256},
address = {Naples, Italy},
year = {2013},
note={Oral Presentation},
publisher = {Springer}
}
[2] [pdf] [doi] S. Karaman and A. D. Bagdanov, “Identity Inference: Generalizing Person Re-identification Scenarios,” in Computer Vision – ECCV 2012. Workshops and Demonstrations, A. Fusiello, V. Murino, and R. Cucchiara, Eds., Firenze, Italy: Springer Berlin Heidelberg, 2012, vol. 7583, pp. 443-452.
[Bibtex]
@incollection{karamanIdInf2012,
isbn={978-3-642-33862-5},
booktitle={Computer Vision – ECCV 2012. Workshops and Demonstrations},
volume={7583},
series={Lecture Notes in Computer Science},
editor={Fusiello, Andrea and Murino, Vittorio and Cucchiara, Rita},
doi={10.1007/978-3-642-33863-2_44},
title={Identity Inference: Generalizing Person Re-identification Scenarios},
url={http://dx.doi.org/10.1007/978-3-642-33863-2_44},
publisher={Springer Berlin Heidelberg},
author={Karaman, Svebor and Bagdanov, Andrew D.},
pages={443-452},
address = {Firenze, Italy},
note={Oral Presentation. Best Paper Award},
year={2012}
}
[3] [pdf] [doi] S. Karaman, G. Lisanti, A. D. Bagdanov, and A. Del Bimbo, “From Re-identification to Identity Inference: Labeling Consistency by Local Similarity Constraints,” in Person Re-Identification, S. Gong, M. Cristani, S. Yan, and C. C. Loy, Eds., Springer London, 2014, pp. 287-307.
[Bibtex]
@incollection{KaramanReID2014,
author = {Karaman, Svebor and Lisanti, Giuseppe and Bagdanov, Andrew D. and Del Bimbo, Alberto},
title = {From Re-identification to Identity Inference: Labeling Consistency by Local Similarity Constraints},
booktitle = {Person Re-Identification},
series = {Advances in Computer Vision and Pattern Recognition},
editor = {Gong, Shaogang and Cristani, Marco and Yan, Shuicheng and Loy, Chen Change},
isbn = {978-1-4471-6295-7},
doi = {10.1007/978-1-4471-6296-4_14},
url = {http://dx.doi.org/10.1007/978-1-4471-6296-4_14},
publisher = {Springer London},
keywords = {Re-identification; Identity inference; Conditional random fields; Video surveillance},
pages = {287-307},
language = {English},
year = {2014}
}
[4] [pdf] [doi] S. Karaman, G. Lisanti, A. D. Bagdanov, and A. Del Bimbo, “Leveraging local neighborhood topology for large scale person re-identification,” Pattern Recognition, vol. 47, iss. 12, pp. 3767-3778, 2014.
[Bibtex]
@article{karaman2014leveraging,
title = "Leveraging local neighborhood topology for large scale person re-identification ",
journal = "Pattern Recognition ",
volume = "47",
number = "12",
pages = "3767 - 3778",
year = "2014",
note = "",
issn = "0031-3203",
doi = "10.1016/j.patcog.2014.06.003",
url = "http://www.sciencedirect.com/science/article/pii/S0031320314002258",
author = "Svebor Karaman and Giuseppe Lisanti and Andrew D. Bagdanov and Alberto Del Bimbo",
keywords = "Re-Identification",
keywords = "Conditional Random Field",
keywords = "Semi-supervised",
keywords = "\{ETHZ\}",
keywords = "\{CAVIAR\}",
keywords = "3DPeS",
keywords = "\{CMV100\} "
}
[5] [pdf] [doi] A. D. Bagdanov, A. Del Bimbo, D. Di Fina, S. Karaman, G. Lisanti, and I. Masi, “Multi-Target Data Association using Sparse Reconstruction,” in Proc. of International Conference on Image Analysis and Processing (ICIAP), Naples, Italy, 2013, pp. 239-248.
[Bibtex]
@inproceedings{DBLMKD13,
author = {Bagdanov, Andrew D. and Del Bimbo, Alberto and Di Fina, Dario and Karaman, Svebor and Lisanti, Giuseppe and Masi, Iacopo},
title = {Multi-Target Data Association using Sparse Reconstruction},
booktitle = {Proc. of International Conference on Image Analysis and Processing (ICIAP)},
year = {2013},
address = {Naples, Italy},
pages = {239-248},
note={Poster},
doi = {10.1007/978-3-642-41184-7_25},
publisher = {Springer Berlin Heidelberg},
keywords = {Data association; multi-target tracking; sparse methods; video surveillance},
url = {http://www.micc.unifi.it/publications/2013/DBLMKD13}
}
[6] [pdf] F. Bartoli, G. Lisanti, S. Karaman, A. D. Bagdanov, and A. Del Bimbo, “Unsupervised scene adaptation for faster multi-scale pedestrian detection,” in 22nd International Conference on Pattern Recognition (ICPR), Stockholm, Sweden, 2014.
[Bibtex]
@InProceedings{bartoliicpr2014,
author = {Bartoli, Federico and Lisanti, Giuseppe and Karaman, Svebor and Bagdanov, Andrew D. and Del Bimbo, Alberto},
title = {Unsupervised scene adaptation for faster multi-scale pedestrian detection},
note = {Oral presentation},
booktitle = {22nd International Conference on Pattern Recognition (ICPR)},
address = {Stockholm, Sweden},
year = {2014}
}
[7] [pdf] [doi] S. Karaman, A. D. Bagdanov, L. Landucci, G. D’Amico, A. Ferracani, D. Pezzatini, and A. Del Bimbo, “Personalized multimedia content delivery on an interactive table by passive observation of museum visitors,” Multimedia Tools and Applications, pp. 1-25, 2014.
[Bibtex]
@article{karaman2014mtap,
year={2014},
issn={1380-7501},
journal={Multimedia Tools and Applications},
doi={10.1007/s11042-014-2192-y},
title={Personalized multimedia content delivery on an interactive table by passive observation of museum visitors},
url={http://dx.doi.org/10.1007/s11042-014-2192-y},
publisher={Springer US},
keywords={Computer vision; Video surveillance; Cultural heritage; Multimedia museum; Personalization; Natural interaction; Passive profiling},
author={Karaman, Svebor and Bagdanov, Andrew D. and Landucci, Lea and D’Amico, Gianpaolo and Ferracani, Andrea and Pezzatini, Daniele and Del Bimbo, Alberto},
pages={1-25},
language={English}
}

IMMED Project

The research themes of my PhD thesis [1] were related to those of the IMMED project [2]. This three-year project was funded by the French National Research Agency (ANR) under grant ANR-09-BLAN-0165-01. The acronym IMMED stands for “Indexing MultiMEdia data from wearable sensors for diagnostics and treatment of Dementia” (“Indexation de données MultiMédia Embarquées pour le diagnostic et le suivi des traitements des Démences” in French).

The project was motivated by a major public health challenge raised by the aging population: keeping elderly people at home. Aging is accompanied by an increased prevalence of Alzheimer’s disease and related sources of loss of autonomy. Detecting problems in everyday life at home, in the so-called instrumental activities of daily living, could help avoid accidents and hospitalizations that are costly both psychologically and physically (for the patient and their family) and economically. Until now, the assessment of daily capacities has relied on questionnaires completed by the patient or relatives. With a wearable camera attached to the patient’s shoulder, the space where instrumental activities occur can instead be recorded. The activities of daily living recorded at home are then evaluated a posteriori by a medical practitioner. Thanks to the video indexing process and the visualization software, the specialist can easily access the key moments of the recording. This observation works around forgetfulness and denial, classical phenomena in dementia, and helps reduce delays in diagnosis as well as risk-taking at home (medicine management, falls, etc.).

Presentation of the IMMED project

Within the IMMED project, I developed tools and methods for indexing activities of daily living in videos acquired from wearable cameras, applied in the context of dementia diagnosis by doctors. The project partners designed a lightweight wearable device that records patients’ activities in their own homes, allowing the medical specialist to spot meaningful events. The recording setting poses great challenges since the video data consists of a single continuous sequence shot in which strong motion and sharp lighting changes often appear. Because of the length of the recordings, tools for efficient navigation in terms of activities of interest are crucial. The work conducted during my PhD introduced a video structuring approach that combines automatic motion-based segmentation of the video [3] with activity recognition by a hierarchical two-level Hidden Markov Model [4]. We leveraged a multimodal description space [5] over visual and audio features, including mid-level features such as motion, location, speech, and noise detections, producing cross-modal algorithms for the automatic indexing of daily living activities [6]. These tools were integrated into a video consultation software tool (developed under my supervision) used by the clinical partner to evaluate the effects of the disease on real data obtained from more than 50 recording sessions at patients’ homes [7].
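
As a rough illustration of the decoding step, the sketch below runs Viterbi decoding over a flat set of activity states given a sequence of quantized segment observations. It is a toy stand-in rather than the IMMED implementation: the real system used a hierarchical two-level HMM trained on multimodal features, while the activity labels, transition matrix, and emission table here are hand-set hypothetical values.

import numpy as np

# Hypothetical activities and hand-set model parameters (illustration only).
ACTIVITIES = ["cooking", "housework", "resting"]
log_pi = np.log([0.4, 0.3, 0.3])        # initial activity probabilities
log_A = np.log([[0.70, 0.15, 0.15],     # activity transition matrix
                [0.15, 0.70, 0.15],
                [0.15, 0.15, 0.70]])
log_B = np.log([[0.8, 0.1, 0.1],        # emissions: P(codeword | activity), with one
                [0.1, 0.8, 0.1],        # quantized audio/visual codeword observed per
                [0.1, 0.1, 0.8]])       # motion-based video segment

def viterbi(obs):
    """Most likely activity sequence for a list of segment codeword indices."""
    n, T = len(ACTIVITIES), len(obs)
    delta = np.zeros((T, n))             # best log-probability of ending in each state
    psi = np.zeros((T, n), dtype=int)    # backpointers
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A   # scores[i, j]: state i -> state j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):                # backtrack from the best final state
        path.append(int(psi[t][path[-1]]))
    return [ACTIVITIES[s] for s in reversed(path)]

print(viterbi([0, 0, 1, 1, 2, 2, 2]))
# ['cooking', 'cooking', 'housework', 'housework', 'resting', 'resting', 'resting']

In the hierarchical model, the role of log_A above is played by the top level (transitions between activities), while each activity expands into its own bottom-level HMM over the segment observations.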

This project has been featured in MIT Technology Review and on French national television. The European FP7 project Dem@care, initiated by the IMMED consortium, continues this line of work. A collaboration between project participants and LabEx Brain has also been established to study Parkinson’s disease.

Related publications

[1] [pdf] S. Karaman, “Indexing of Activities in Wearable Videos : Application to Epidemiological Studies of Aged Dementia,” PhD Thesis, 2011.
[Bibtex]
@phdthesis{karaman2011phd,
title={Indexing of Activities in Wearable Videos : Application to Epidemiological Studies of Aged Dementia},
author={Karaman, Svebor},
year={2011},
school={Universit{\'e} Sciences et Technologies-Bordeaux I}
}
[2] [pdf] [doi] R. Mégret, V. Dovgalecs, H. Wannous, S. Karaman, J. Benois-Pineau, E. El Khoury, J. Pinquier, P. Joly, R. André-Obrecht, Y. Gaëstel, and J. Dartigues, “The IMMED Project: Wearable Video Monitoring of People with Age Dementia,” in Proceedings of the International Conference on Multimedia (ACMMM), Firenze, Italy, 2010, pp. 1299–1302.
[Bibtex]
@inproceedings{Megret2010,
author = {M{\'e}gret, R{\'e}mi and Dovgalecs, Vladislavs and Wannous, Hazem and Karaman, Svebor and Benois-Pineau, Jenny and El Khoury, Elie and Pinquier, Julien and Joly, Philippe and Andr{\'e}-Obrecht, R{\'e}gine and Ga\"{e}stel, Yann and Dartigues, Jean-Fran\c{c}ois},
title = {The IMMED Project: Wearable Video Monitoring of People with Age Dementia},
booktitle = {Proceedings of the International Conference on Multimedia (ACMMM)},
series = {MM '10},
year = {2010},
isbn = {978-1-60558-933-6},
address = {Firenze, Italy},
pages = {1299--1302},
numpages = {4},
url = {http://doi.acm.org/10.1145/1873951.1874206},
doi = {10.1145/1873951.1874206},
acmid = {1874206},
note = {Video program},
publisher = {ACM},
keywords = {audio and video indexing, patient monitoring, wearable camera}
}
[3] [pdf] [doi] S. Karaman, J. Benois-Pineau, R. Mégret, J. Pinquier, Y. Gaestel, and J.-F. Dartigues, “Activities of daily living indexing by hierarchical HMM for dementia diagnostics,” in 9th International Workshop on Content-Based Multimedia Indexing (CBMI), Madrid, Spain, 2011, pp. 79-84.
[Bibtex]
@INPROCEEDINGS{karamanCBMI2011,
author={Karaman, S. and Benois-Pineau, J. and Mégret, R. and Pinquier, J. and Gaestel, Y. and Dartigues, J.-F.},
booktitle={9th International Workshop on Content-Based Multimedia Indexing (CBMI)},
title={Activities of daily living indexing by hierarchical HMM for dementia diagnostics},
year={2011},
month={June},
address = {Madrid, Spain},
pages={79-84},
abstract={This paper presents a method for indexing human activities in videos captured from a wearable camera being worn by patients, for studies of progression of the dementia diseases. Our method aims to produce indexes to facilitate the navigation throughout the individual video recordings, which could help doctors search for early signs of the disease in the activities of daily living. The recorded videos have strong motion and sharp lighting changes, inducing noise for the analysis. The proposed approach is based on a two steps analysis. First, we propose a new approach to segment this type of video, based on apparent motion. Each segment is characterized by two original motion descriptors, as well as color, and audio descriptors. Second, a Hidden-Markov Model formulation is used to merge the multimodal audio and video features, and classify the test segments. Experiments show the good properties of the approach on real data.},
keywords={hidden Markov models;image colour analysis;image segmentation;indexing;medical diagnostic computing;medical disorders;video recording;audio descriptors;color descriptors;daily living indexing;dementia diagnostics;dementia diseases;hidden-Markov model formulation;hierarchical HMM;human activities indexing;multimodal audio features;original motion descriptors;recorded videos;test segments;two steps analysis;video features;video recordings;wearable camera;Accuracy;Cameras;Dynamics;Hidden Markov models;Histograms;Motion segmentation;Videos},
doi={10.1109/CBMI.2011.5972524},
note={Oral Presentation},
ISSN={1949-3983}
}
[4] [pdf] [doi] S. Karaman, J. Benois-Pineau, R. Mégret, V. Dovgalecs, J.-F. Dartigues, and Y. Gaëstel, “Human Daily Activities Indexing in Videos from Wearable Cameras for Monitoring of Patients with Dementia Diseases,” in 20th International Conference on Pattern Recognition (ICPR), Istanbul, Turkey, 2010, pp. 4113-4116.
[Bibtex]
@INPROCEEDINGS{karamanICPR2010,
author={Karaman, S. and Benois-Pineau, J. and Mégret, R. and Dovgalecs, V. and Dartigues, J.-F. and Gaëstel, Y.},
booktitle={20th International Conference on Pattern Recognition (ICPR)},
title={Human Daily Activities Indexing in Videos from Wearable Cameras for Monitoring of Patients with Dementia Diseases},
year={2010},
month={Aug},
pages={4113-4116},
abstract={Our research focuses on analysing human activities according to a known behaviorist scenario, in case of noisy and high dimensional collected data. The data come from the monitoring of patients with dementia diseases by wearable cameras. We define a structural model of video recordings based on a Hidden Markov Model. New spatio-temporal features, color features and localization features are proposed as observations. First results in recognition of activities are promising.},
keywords={feature extraction;hidden Markov models;image colour analysis;image motion analysis;video cameras;video recording;video signal processing;activity recognition;behaviorist scenario;color features;dementia disease patients;hidden Markov model;human activity indexing;localization features;patient monitoring;spatiotemporal features;video recordings;wearable cameras;Biomedical monitoring;Cameras;Hidden Markov models;Histograms;Image color analysis;Motion segmentation;Videos;Bag of Features;HMM;Localization;Monitoring;Video Indexing},
doi={10.1109/ICPR.2010.999},
note={Oral Presentation},
ISSN={1051-4651},
address={Istanbul, Turkey}
}
[5] [pdf] J. Pinquier, S. Karaman, L. Letoupin, P. Guyot, R. Megret, J. Benois-Pineau, Y. Gaestel, and J.-F. Dartigues, “Strategies for multiple feature fusion with Hierarchical HMM: Application to activity recognition from wearable audiovisual sensors,” in 21st International Conference on Pattern Recognition (ICPR), Tsukuba, Japan, 2012, pp. 3192-3195.
[Bibtex]
@INPROCEEDINGS{Pinquier2012,
author={Pinquier, J. and Karaman, S. and Letoupin, L. and Guyot, P. and Megret, R. and Benois-Pineau, J. and Gaestel, Y. and Dartigues, J.-F.},
booktitle={21st International Conference on Pattern Recognition (ICPR)},
title={Strategies for multiple feature fusion with Hierarchical HMM: Application to activity recognition from wearable audiovisual sensors},
year={2012},
month={Nov},
pages={3192-3195},
abstract={In this paper, we further develop the research on recognition of activities, in videos recorded with wearable cameras, with Hierarchical Hidden Markov Model classifiers. The visual scenes being of a strong complexity in terms of motion and visual content, good performances have been obtained using multiple visual and audio cues. The adequate fusion of features from physically different description spaces remains an open issue not only for this particular task, but in multiple problems of pattern recognition. A study of optimal fusion strategies in the HMM framework is proposed. We design and exploit early, intermediate and late fusions with emitting states in the H-HMM. The results obtained on a corpus recorded by healthy volunteers and patients in a longitudinal dementia study allow choosing optimal fusion strategies as a function of target activity.},
keywords={gesture recognition;hidden Markov models;image fusion;video signal processing;H-HMM;activity recognition;description spaces;early fusions;healthy volunteers;hierarchical HMM classifier;hierarchical hidden Markov model classifiers;intermediate fusions;late fusions;longitudinal dementia study;motion content;multiple feature fusion;optimal fusion strategies;pattern recognition;strong complexity;target activity;visual content;visual scenes;wearable audiovisual sensors;wearable cameras;Cameras;Hidden Markov models;Multimedia communication;Pattern recognition;Streaming media;Videos;Visualization},
url = {http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=6460843},
note={Poster},
address = {Tsukuba, Japan},
ISSN={1051-4651}
}
[6] [pdf] [doi] S. Karaman, J. Benois-Pineau, V. Dovgalecs, R. Mégret, J. Pinquier, R. André-Obrecht, Y. Gaëstel, and J. Dartigues, “Hierarchical Hidden Markov Model in detecting activities of daily living in wearable videos for studies of dementia,” Multimedia Tools and Applications (MTAP), vol. 69, iss. 3, pp. 1–29, 2012.
[Bibtex]
@article{karaman2012hierarchical,
title={Hierarchical Hidden Markov Model in detecting activities of daily living in wearable videos for studies of dementia},
author={Karaman, Svebor and Benois-Pineau, Jenny and Dovgalecs, Vladislavs and M{\'e}gret, R{\'e}mi and Pinquier, Julien and Andr{\'e}-Obrecht, R{\'e}gine and Ga{\"e}stel, Yann and Dartigues, Jean-Fran{\c{c}}ois},
journal={Multimedia Tools and Applications (MTAP)},
pages={1--29},
year={2012},
volume={69},
number={3},
doi={10.1007/s11042-012-1117-x},
publisher={Springer}
}
[7] [pdf] Y. Gaëstel, S. Karaman, R. Megret, C. Onifade-Fagbe, F. Trophy, J. Benois-Pineau, and J. Dartigues, “Autonomy at home and early diagnosis in Alzheimer’s Disease: Utility of video indexing applied to clinical issues, the IMMED project,” in Alzheimer’s Association International Conference on Alzheimer’s Disease (AAICAD), Paris, France, 2011, p. S245.
[Bibtex]
@inproceedings{gaestel2011,
hal_id = {hal-00978228},
url = {http://hal.archives-ouvertes.fr/hal-00978228},
title = {Autonomy at home and early diagnosis in Alzheimer's Disease: Utility of video indexing applied to clinical issues, the IMMED project},
author = {Ga{\"e}stel, Yann and Karaman, Svebor and Megret, R{\'e}mi and Onifade-Fagbe, Cherifa and Trophy, Francoise and Benois-Pineau, Jenny and Dartigues, Jean-Fran{\c c}ois},
abstract = {With ageing of the population in the world, patients with Alzheimer's disease (AD) consequently increase. People suffering from this pathology show early modifications in their "activities of daily living". Those abilities modifications are part of the dementia diagnosis, but are often not reported by the patients or their families. Being able to capture these early signs of autonomy loss could be a way to diagnose earlier dementia and to prevent insecurity at home. We first developed a wearable camera (shoulder mounted) to capture people's activity at home in a non-invasive manner. We then developed a video-indexing methodology to help physicians explore their patients' home-recorded video. This video indexing system requires video and audio analyses to automatically identify and index activities of interest where insecurity or risks could be highlightened. Patients are recruited among the Bagatelle (Talence, France) Memory clinic department patients and are suffering from mild cognitive impairments or very mild AD. We met ten patients at home and we recorded one hour of daily activities for each. The data (video and questionnaires: Activities of Daily Living/Instrumental Activities of Daily Living) are now collected on an extended sample of people suffering from mild cognitive impairments and from very mild AD. We aimed at evaluating behavioral modifications and ability loss detection by comparing the subjects' self reported questionnaires and the video analyses. This project is a successful collaboration between various fields of research. Here, technology is developed to be helpful in everyday challenges that people suffering from dementia of the Alzheimer type are faced with. The automation of the video indexing could be a great step forward in video analysis if it could reduce the time needed to embrace the patient's lifestream, helping in early diagnosis of dementia and becoming a very useful tool to keep individuals safe at home. In fact, many goals could be reached with such video analyses: an early diagnosis of dementia of the Alzheimer type, avoiding danger in home living and evaluating the progression of the disease or the effects of the various therapies (drug-therapy and others).},
language = {English},
affiliation = {Institut de Sant{\'e} Publique, d'Epid{\'e}miologie et de D{\'e}veloppement - ISPED, Laboratoire Bordelais de Recherche en Informatique - LaBRI, Laboratoire de l'int{\'e}gration, du mat{\'e}riau au syst{\`e}me - IMS, MSPB Bagatelle - MSPB, Epid{\'e}miologie et Biostatistique},
booktitle = {{Alzheimer's Association International Conference on Alzheimer's Disease (AAICAD)}},
pages = {S245},
address = {Paris, France},
editor = {Alzheimer's \& Dementia: The Journal of the Alzheimer's Association},
audience = {International},
note = {Poster presentation. Abstract published in Alzheimer's \& Dementia, volume 7(4), p. S245, July 2011},
collaboration = {IMMED},
year = {2011},
month = {Jul}
}