
IMMED Project

The research themes of my PhD thesis [1] were closely related to those of the IMMED project [2]. This three-year project was funded by the French National Research Agency (ANR) under the identifier ANR-09-BLAN-0165-01. The acronym IMMED stands for “Indexing MultiMEdia data from wearable sensors for diagnostics and treatment of Dementia” (in French, “Indexation de données MultiMédia Embarquées pour le diagnostic et le suivi des traitements des Démences”).

The project started from the observation that population aging raises a major public health problem: helping elderly people continue to live at home. Aging is accompanied by an increased prevalence of Alzheimer’s disease and of related causes of loss of autonomy. Detecting difficulties in everyday life at home, in the so-called instrumental activities of daily living, could help avoid accidents and hospitalizations that are costly both economically and in terms of the psychological and physical safety of patients and their families. Until now, the assessment of daily capacities has relied on questionnaires completed by the patient or by relatives. With a wearable camera attached to the patient’s shoulder, the space where instrumental activities take place is recorded. The activities of daily living recorded at home are then evaluated a posteriori by a medical practitioner. Thanks to video indexing and the visualization software, the specialist can easily access the key moments of the recording. This direct observation is a way around forgetfulness and denial, which are classical phenomena in dementia, and it can reduce delays in diagnosis and risk-taking at home (medication management, falls, etc.).

Presentation of the IMMED project

Within the IMMED project, I developed tools and methods for indexing activities of daily living in videos acquired from wearable cameras, for use by doctors in the diagnosis of dementia. The project partners designed a lightweight wearable device that records patients’ activities in their own homes, allowing the medical specialist to spot meaningful events. This recording mode poses great challenges, since the video data consists of a single sequence shot in which strong motion and sharp lighting changes often appear. Because the recordings are long, tools for efficient navigation by activity of interest are crucial. The work conducted during my PhD introduces a video structuring approach that combines automatic motion-based segmentation of the video [3] with activity recognition by a hierarchical two-level Hidden Markov Model [4]. We leveraged a multi-modal description space [5] over visual and audio features, including mid-level features such as motion, location, speech and noise detections, producing cross-modal algorithms for the automatic indexing of daily living activities [6]. These tools were included in a video consultation software (developed under my supervision) that the clinical partner used to evaluate the effects of the disease on real data from more than 50 recording sessions at home [7].
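To give a rough idea of how per-segment audio and video evidence can be turned into a sequence of activity labels, here is a minimal sketch. It is not the IMMED code: it assumes a flat, single-level HMM instead of the hierarchical two-level model, a simple late fusion that sums log-likelihoods across modalities, and toy numbers. The activity names, the helper functions `fuse_modalities`, `viterbi` and `to_log`, and all probabilities are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical activity labels; the real IMMED label set is not given here.
ACTIVITIES = ["cooking", "washing", "reading"]


def fuse_modalities(*modalities):
    """Late fusion: add per-modality log-likelihoods for each segment and state.

    Each modality is a list over segments of per-state log-likelihoods.
    Summing assumes the modalities are independent given the activity.
    """
    return [[sum(values) for values in zip(*rows)] for rows in zip(*modalities)]


def viterbi(log_start, log_trans, log_emit):
    """Return the most likely state sequence of a flat HMM.

    log_start[i]    : log P(first state = i)
    log_trans[i][j] : log P(next state = j | current state = i)
    log_emit[t][i]  : log P(observation of segment t | state = i)
    """
    n_states = len(log_start)
    # Best log-score of any path ending in each state at the current segment.
    score = [log_start[i] + log_emit[0][i] for i in range(n_states)]
    backpointers = []
    for t in range(1, len(log_emit)):
        new_score, pointers = [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: score[i] + log_trans[i][j])
            new_score.append(score[best_i] + log_trans[best_i][j] + log_emit[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def to_log(table):
    """Convert a table of probabilities to log-probabilities."""
    return [[math.log(p) for p in row] for row in table]


# Toy per-segment likelihoods (rows: segments, columns: ACTIVITIES).
video = to_log([[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.3, 0.4, 0.3],
                [0.1, 0.2, 0.7], [0.1, 0.1, 0.8]])
audio = to_log([[0.6, 0.3, 0.1], [0.5, 0.4, 0.1], [0.4, 0.3, 0.3],
                [0.2, 0.2, 0.6], [0.1, 0.2, 0.7]])

n = len(ACTIVITIES)
log_start = [math.log(1.0 / n)] * n
# "Sticky" transitions: an activity tends to continue over consecutive segments.
log_trans = [[math.log(0.8 if i == j else 0.1) for j in range(n)] for i in range(n)]

path = viterbi(log_start, log_trans, fuse_modalities(video, audio))
labels = [ACTIVITIES[s] for s in path]
print(labels)
```

The sticky transition matrix is what smooths the labelling: an ambiguous segment is attributed to the activity of its neighbours unless the observations clearly say otherwise. Other fusion strategies (early or intermediate) would change only how `log_emit` is built, not the decoding step.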

This project has been featured in MIT Technology Review and on French national television. A European project, « FP7 PI Dem@care », initiated by the IMMED consortium, now continues this work. A synergy between the project participants and the « LabEx Brain » has also been established for the study of Parkinson’s disease.

Related publications

[1] S. Karaman, “Indexing of Activities in Wearable Videos: Application to Epidemiological Studies of Aged Dementia,” PhD thesis, Université Sciences et Technologies - Bordeaux I, 2011.
[2] R. Mégret, V. Dovgalecs, H. Wannous, S. Karaman, J. Benois-Pineau, E. El Khoury, J. Pinquier, P. Joly, R. André-Obrecht, Y. Gaëstel, and J.-F. Dartigues, “The IMMED Project: Wearable Video Monitoring of People with Age Dementia,” in Proceedings of the International Conference on Multimedia (ACMMM), Firenze, Italy, 2010, pp. 1299–1302. ACM, video program. doi:10.1145/1873951.1874206
[3] S. Karaman, J. Benois-Pineau, R. Mégret, J. Pinquier, Y. Gaëstel, and J.-F. Dartigues, “Activities of daily living indexing by hierarchical HMM for dementia diagnostics,” in 9th International Workshop on Content-Based Multimedia Indexing (CBMI), Madrid, Spain, 2011, pp. 79–84. Oral presentation.
Abstract: This paper presents a method for indexing human activities in videos captured from a wearable camera being worn by patients, for studies of progression of the dementia diseases. Our method aims to produce indexes to facilitate the navigation throughout the individual video recordings, which could help doctors search for early signs of the disease in the activities of daily living. The recorded videos have strong motion and sharp lighting changes, inducing noise for the analysis. The proposed approach is based on a two steps analysis. First, we propose a new approach to segment this type of video, based on apparent motion. Each segment is characterized by two original motion descriptors, as well as color, and audio descriptors. Second, a Hidden-Markov Model formulation is used to merge the multimodal audio and video features, and classify the test segments. Experiments show the good properties of the approach on real data.
[4] S. Karaman, J. Benois-Pineau, R. Mégret, V. Dovgalecs, J.-F. Dartigues, and Y. Gaëstel, “Human Daily Activities Indexing in Videos from Wearable Cameras for Monitoring of Patients with Dementia Diseases,” in 20th International Conference on Pattern Recognition (ICPR), Istanbul, Turkey, 2010, pp. 4113–4116. Oral presentation.
Abstract: Our research focuses on analysing human activities according to a known behaviorist scenario, in case of noisy and high dimensional collected data. The data come from the monitoring of patients with dementia diseases by wearable cameras. We define a structural model of video recordings based on a Hidden Markov Model. New spatio-temporal features, color features and localization features are proposed as observations. First results in recognition of activities are promising.
[5] J. Pinquier, S. Karaman, L. Letoupin, P. Guyot, R. Mégret, J. Benois-Pineau, Y. Gaëstel, and J.-F. Dartigues, “Strategies for multiple feature fusion with Hierarchical HMM: Application to activity recognition from wearable audiovisual sensors,” in 21st International Conference on Pattern Recognition (ICPR), Tsukuba, Japan, 2012, pp. 3192–3195. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=6460843
Abstract: In this paper, we further develop the research on recognition of activities, in videos recorded with wearable cameras, with Hierarchical Hidden Markov Model classifiers. The visual scenes being of a strong complexity in terms of motion and visual content, good performances have been obtained using multiple visual and audio cues. The adequate fusion of features from physically different description spaces remains an open issue not only for this particular task, but in multiple problems of pattern recognition. A study of optimal fusion strategies in the HMM framework is proposed. We design and exploit early, intermediate and late fusions with emitting states in the H-HMM. The results obtained on a corpus recorded by healthy volunteers and patients in a longitudinal dementia study allow choosing optimal fusion strategies as a function of target activity.
[6] S. Karaman, J. Benois-Pineau, V. Dovgalecs, R. Mégret, J. Pinquier, R. André-Obrecht, Y. Gaëstel, and J.-F. Dartigues, “Hierarchical Hidden Markov Model in detecting activities of daily living in wearable videos for studies of dementia,” Multimedia Tools and Applications (MTAP), vol. 69, no. 3, pp. 1–29, 2012.
[7] Y. Gaëstel, S. Karaman, R. Mégret, C. Onifade-Fagbe, F. Trophy, J. Benois-Pineau, and J.-F. Dartigues, “Autonomy at home and early diagnosis in Alzheimer’s Disease: Utility of video indexing applied to clinical issues, the IMMED project,” in Alzheimer’s Association International Conference on Alzheimer’s Disease (AAICAD), Paris, France, July 2011, p. S245. Poster presentation; abstract published in Alzheimer’s & Dementia: The Journal of the Alzheimer’s Association, vol. 7, no. 4, p. S245, July 2011. HAL: hal-00978228, http://hal.archives-ouvertes.fr/hal-00978228
Abstract: With ageing of the population in the world, patients with Alzheimer's disease (AD) consequently increase. People suffering from this pathology show early modifications in their "activities of daily living". Those abilities modifications are part of the dementia diagnosis, but are often not reported by the patients or their families. Being able to capture these early signs of autonomy loss could be a way to diagnose earlier dementia and to prevent insecurity at home. We first developed a wearable camera (shoulder mounted) to capture people's activity at home in a non-invasive manner. We then developed a video-indexing methodology to help physicians explore their patients' home-recorded video. This video indexing system requires video and audio analyses to automatically identify and index activities of interest where insecurity or risks could be highlighted. Patients are recruited among the Bagatelle (Talence, France) Memory clinic department patients and are suffering from mild cognitive impairments or very mild AD. We met ten patients at home and we recorded one hour of daily activities for each. The data (video and questionnaires: Activities of Daily Living/Instrumental Activities of Daily Living) are now collected on an extended sample of people suffering from mild cognitive impairments and from very mild AD. We aimed at evaluating behavioral modifications and ability loss detection by comparing the subjects' self reported questionnaires and the video analyses. This project is a successful collaboration between various fields of research. Here, technology is developed to be helpful in everyday challenges that people suffering from dementia of the Alzheimer type are faced with. The automation of the video indexing could be a great step forward in video analysis if it could reduce the time needed to embrace the patient's lifestream, helping in early diagnosis of dementia and becoming a very useful tool to keep individuals safe at home. In fact, many goals could be reached with such video analyses: an early diagnosis of dementia of the Alzheimer type, avoiding danger in home living and evaluating the progression of the disease or the effects of the various therapies (drug-therapy and others).

About me

I am a French Computer Vision and Machine Learning researcher, currently a Research Manager at Dataminr. Previously, I spent three years as a postdoc at the MICC (Media Integration and Communication Center) of the University of Florence in Italy, and five years as an Associate Research Scientist in the DVMM Lab at Columbia University.

Research themes

My research themes are image and video analysis, computer vision, and machine learning. I am particularly interested in semantic concept recognition in images and videos.

I did my Ph.D. at the LaBRI – University of Bordeaux, under the supervision of Jenny Benois-Pineau and Rémi Mégret. During my Ph.D. thesis, I worked on human activity recognition by Hidden Markov Models (HMM) in videos recorded from a wearable device within the IMMED project. I have also developed an object recognition approach in the Bag-of-Visual-Words framework which integrates spatial information within semi-local features: the Graph-Words. I defended my Ph.D. entitled “Indexing of Activities in Wearable Videos: Application to Epidemiological Studies of Aged Dementia” in 2011.

While at the MICC, I was heavily involved in the MNEMOSYNE project. In this project, multiple aspects of computer vision such as person detection, person tracking, and re-identification are used to passively profile the interests of museum visitors in order to deliver personalized multimedia content. I also worked on more general image and video classification problems.

At the DVMM Lab, I worked mostly on large-scale image indexing and retrieval problems, but I also published work on other topics such as social media understanding, grounding, scene graph generation, visual parsing, and GAN detection.

At Dataminr, I’m working on computer vision and multimodal-related problems.


Computer Vision, Machine Learning, Image Analysis, Video Analysis, Video Indexing, Object Recognition, Person Detection, Re-Identification, Passive Profiling, Behavior Analysis, Action Recognition…