PhD Thesis

The research for my PhD thesis [1] was carried out in the context of wearable video monitoring of patients with age-related dementia. The goal was to give medical practitioners a new tool for the early diagnosis of age-related dementia such as Alzheimer's disease [2]. More precisely, Instrumental Activities of Daily Living (IADL) had to be indexed in videos recorded with a wearable recording device.

Such videos present specific characteristics, e.g. strong motion and strong lighting changes. Furthermore, the recognition task targets high-level semantic concepts. In this difficult context, the first step of the analysis was to define an equivalent of the notion of “shots” in edited videos. We therefore developed a method for partitioning continuous video streams into viewpoints according to the motion observed in the image plane [3]. For the recognition of IADLs, we developed a solution based on the formalism of Hidden Markov Models (HMM) [4]. We introduced a hierarchical HMM with two levels, modeling semantic activities and intermediate states [5]. A rich set of features (dynamic, static, low-level, mid-level) was proposed, and the most effective description spaces were identified experimentally [6].
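As an illustration of HMM-based decoding, here is a minimal sketch of Viterbi decoding on a flat, single-level HMM with discrete observations. It is not the hierarchical model of [5]: the state names, the observation symbols and all probabilities below are made-up assumptions chosen only to show how a most likely activity sequence is recovered from a sequence of quantized observations.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely state sequence for the observations `obs`.

    All probabilities are assumed to be strictly positive; scores are
    accumulated in the log domain to avoid numerical underflow.
    """
    log = math.log
    # Initialisation: start probability times emission of the first symbol.
    score = {s: log(start_p[s]) + log(emit_p[s][obs[0]]) for s in states}
    back = []  # one back-pointer table per transition
    for o in obs[1:]:
        prev, score, ptr = score, {}, {}
        for s in states:
            # Best predecessor of state s at this time step.
            best = max(states, key=lambda r: prev[r] + log(trans_p[r][s]))
            score[s] = prev[best] + log(trans_p[best][s]) + log(emit_p[s][o])
            ptr[s] = best
        back.append(ptr)
    # Backtrack from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path

# Hypothetical toy model: two activities, three quantized observation symbols.
STATES = ["cooking", "washing"]
START = {"cooking": 0.6, "washing": 0.4}
TRANS = {"cooking": {"cooking": 0.7, "washing": 0.3},
         "washing": {"cooking": 0.4, "washing": 0.6}}
EMIT = {"cooking": {"a": 0.5, "b": 0.4, "c": 0.1},
        "washing": {"a": 0.1, "b": 0.3, "c": 0.6}}

print(viterbi(["a", "b", "c"], STATES, START, TRANS, EMIT))
```

In a two-level hierarchy, each activity would itself contain a small set of intermediate states; one common way to decode such a model is to flatten it into a single state space and run the same recursion, although this sketch does not attempt that.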

For the mid-level features used in activity recognition, we focused on the semantic objects the person manipulates in the camera view. We proposed a new concept for object/image description using local features (SURF) and the underlying semi-local connected graphs. We introduced a nested approach to graph construction, in which the same scene is described by layers of graphs with an increasing number of nodes. We build these graphs by Delaunay triangulation on SURF points, thus preserving the good properties of local features, i.e. invariance to affine transformations of the image plane: rotation, translation and zoom. We use the graph features in the Bag-of-Visual-Words framework, hence introducing the Graph Words [7]. This raises the problem of defining a distance or dissimilarity between graphs for clustering and recognition. We propose a dissimilarity measure based on the Context Dependent Kernel of H. Sahbi and show its relation to the classical entry-wise norm when comparing trivial graphs (SURF points).
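The nested graph construction can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method of [7]: keypoints are given as hypothetical (x, y, response) tuples, each layer simply keeps the strongest keypoints by response, and the Delaunay triangulation is computed by a brute-force empty-circumcircle test that is only practical for a handful of points in general position.

```python
from itertools import combinations

def in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circle through a, b and c."""
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    # The sign of det depends on the orientation of (a, b, c).
    orient = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return det * orient > 0

def delaunay_edges(points):
    """Brute-force Delaunay edges, as index pairs into `points`."""
    n = len(points)
    edges = set()
    for i, j, k in combinations(range(n), 3):
        a, b, c = points[i], points[j], points[k]
        orient = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
        if orient == 0:
            continue  # collinear triple, no circumcircle
        others = (points[m] for m in range(n) if m not in (i, j, k))
        if any(in_circumcircle(a, b, c, p) for p in others):
            continue  # circumcircle is not empty: not a Delaunay triangle
        edges.update({(i, j), (i, k), (j, k)})
    return edges

def nested_graphs(keypoints, sizes):
    """Build one graph per layer from the strongest `size` keypoints.

    Assumption: `keypoints` holds (x, y, response) tuples and node sets
    are nested because each layer extends the previous ranking prefix.
    """
    ranked = sorted(keypoints, key=lambda kp: kp[2], reverse=True)
    layers = []
    for size in sizes:
        nodes = [(x, y) for x, y, _ in ranked[:size]]
        layers.append((nodes, delaunay_edges(nodes)))
    return layers
```

For example, `nested_graphs(kps, [3, 4])` returns two graphs over the same scene, the second one adding a node to the first. Only the node sets are guaranteed to be nested here: adding a point can remove an edge that existed in a smaller layer.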

Related publications

[1] S. Karaman, "Indexing of Activities in Wearable Videos : Application to Epidemiological Studies of Aged Dementia," PhD Thesis, Université Sciences et Technologies-Bordeaux I, 2011.
[2] Y. Gaëstel, S. Karaman, R. Mégret, O. Cherifa, T. Francoise, B. Jenny, and J. Dartigues, "Autonomy at home and early diagnosis in Alzheimer's Disease: Utility of video indexing applied to clinical issues, the IMMED project," in Alzheimer's Association International Conference on Alzheimer's Disease (AAICAD), p. S245. Paris, France, 2011. Poster presentation. Abstract published in Journal of Alzheimer's and Dementia, volume 7 (4), p. S245, July 2011.
[3] S. Karaman, J. Benois-Pineau, R. Mégret, J. Pinquier, Y. Gaëstel, and J.-F. Dartigues, "Activities of daily living indexing by hierarchical HMM for dementia diagnostics," in 9th International Workshop on Content-Based Multimedia Indexing (CBMI), pp. 79-84. Madrid, Spain, 2011. Oral Presentation.
[4] S. Karaman, J. Benois-Pineau, R. Mégret, V. Dovgalecs, J.-F. Dartigues, and Y. Gaëstel, "Human Daily Activities Indexing in Videos from Wearable Cameras for Monitoring of Patients with Dementia Diseases," in 20th International Conference on Pattern Recognition (ICPR), pp. 4113-4116. Istanbul, Turkey, 2010. Oral Presentation.
[5] S. Karaman, J. Benois-Pineau, V. Dovgalecs, R. Mégret, J. Pinquier, R. André-Obrecht, Y. Gaëstel, and J. Dartigues, "Hierarchical Hidden Markov Model in detecting activities of daily living in wearable videos for studies of dementia," Multimedia Tools and Applications (MTAP), vol. 69, iss. 3, pp. 1-29, 2012.
[6] J. Pinquier, S. Karaman, L. Letoupin, P. Guyot, R. Mégret, J. Benois-Pineau, Y. Gaëstel, and J.-F. Dartigues, "Strategies for multiple feature fusion with Hierarchical HMM: Application to activity recognition from wearable audiovisual sensors," in 21st International Conference on Pattern Recognition (ICPR), pp. 3192-3195. Tsukuba, Japan, 2012. Poster.
[7] S. Karaman, J. Benois-Pineau, R. Mégret, and A. Bugeau, "Multi-layer Local Graph Words for Object Recognition," in Advances in Multimedia Modeling, K. Schoeffmann, B. Merialdo, A. Hauptmann, C. Ngo, Y. Andreopoulos, and C. Breiteneder, Eds., Springer Berlin Heidelberg, vol. 7131, pp. 29-39. Klagenfurt, Austria, 2012. Oral Presentation.

About me

I am a French computer vision and machine learning researcher, currently a postdoctoral researcher in the DVMM Lab at Columbia University. Previously, I spent three great years at the MICC (Media Integration and Communication Center) of the University of Florence in Italy.

Research themes

My research themes are image and video analysis, computer vision and machine learning. I am particularly interested in the recognition of semantic concepts in images and videos.

I did my PhD at the LaBRI – University of Bordeaux, under the supervision of Jenny Benois-Pineau and Rémi Mégret. During my PhD, I worked on human activity recognition with Hidden Markov Models (HMM) in videos recorded from a wearable device, within the IMMED project. I also developed an object recognition approach in the Bag-of-Visual-Words framework that integrates spatial information within semi-local features: the Graph Words. I defended my PhD thesis, entitled “Indexing of Activities in Wearable Videos : Application to Epidemiological Studies of Aged Dementia”, in 2011.

While at the MICC, I was highly involved in the MNEMOSYNE project. In this project, multiple aspects of computer vision, such as person detection, person tracking and re-identification, are used to passively profile the interests of museum visitors in order to deliver personalized multimedia content. I also continued to work on more general image and video classification problems.

I have joined the DVMM Lab, where I will be working on face recognition and on hashing for image retrieval.


