The research for my PhD thesis [1] was carried out in the context of wearable video monitoring of patients with age-related dementia. The goal was to provide medical practitioners with a new tool for the early diagnosis of dementia in the elderly, such as Alzheimer's disease [2]. More precisely, Instrumental Activities of Daily Living (IADLs) had to be indexed in videos recorded with a wearable device.
Such videos have specific characteristics, e.g. strong motion and strong lighting changes. Furthermore, the recognition task operates at a high semantic level. In this difficult context, the first step of the analysis was to define an equivalent of the notion of “shots” in edited videos. We therefore developed a method for partitioning continuous video streams into viewpoints according to the motion observed in the image plane [3]. For the recognition of IADLs, we developed a solution based on the formalism of Hidden Markov Models (HMMs) [4]. A two-level hierarchical HMM, modeling semantic activities and intermediate states, was introduced [5]. A complex set of features (dynamic, static, low-level, mid-level) was proposed, and the most effective description spaces were identified experimentally [6].
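To make the HMM-based recognition step more concrete, the following is a minimal sketch of Viterbi decoding for a flat HMM, written in Python with the standard library only. It is not the hierarchical two-level model of the thesis: the two states, the transition probabilities and the per-segment likelihoods are hypothetical values chosen for illustration, standing in for activities and for the likelihoods a feature model would give each video segment.

```python
import math


def viterbi(log_start, log_trans, log_emit):
    """Return the most likely state sequence of a flat HMM.

    log_start[i]    : log P(state_0 = i)
    log_trans[i][j] : log P(state_t = j | state_{t-1} = i)
    log_emit[t][i]  : log P(observation_t | state_t = i)
    """
    n_states = len(log_start)
    score = [log_start[i] + log_emit[0][i] for i in range(n_states)]
    back = []  # back[t-1][j] = best predecessor of state j at time t
    for t in range(1, len(log_emit)):
        prev = score
        pointers, score = [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: prev[i] + log_trans[i][j])
            pointers.append(best_i)
            score.append(prev[best_i] + log_trans[best_i][j] + log_emit[t][j])
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(range(n_states), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical example: two activities (0 and 1) with "sticky" transitions.
log = math.log
start = [log(0.5), log(0.5)]
trans = [[log(0.9), log(0.1)],
         [log(0.1), log(0.9)]]
# Assumed per-segment likelihoods of the observed features under each activity.
emit = [[log(0.9), log(0.1)],
        [log(0.8), log(0.2)],
        [log(0.4), log(0.6)],
        [log(0.1), log(0.9)],
        [log(0.2), log(0.8)]]

print(viterbi(start, trans, emit))  # [0, 0, 1, 1, 1]
```

Because the assumed transitions favor staying in the same state, the decoded sequence switches activity only once, even though the third segment alone is ambiguous.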
For the mid-level features used in activity recognition, we focused on the semantic objects that the person manipulates in the camera view. We proposed a new concept for object and image description that uses local features (SURF) and the semi-local connected graphs built on them. We introduced a nested approach to graph construction, in which the same scene can be described by several levels of graphs with an increasing number of nodes. We build these graphs by Delaunay triangulation of SURF points, thus preserving the good properties of local features, i.e. invariance with respect to affine transformations of the image plane: rotation, translation and zoom. We use the graph features in the Bag-of-Visual-Words framework, hence introducing Graph Words [7]. This naturally raises the problem of defining a distance or dissimilarity between graphs for clustering or recognition. We propose a dissimilarity measure based on the Context-Dependent Kernel of H. Sahbi and show its relation to the classical entry-wise norm when comparing trivial graphs (SURF points).
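The nested graph levels can be illustrated with a small Python sketch. It assumes the keypoints are already detected and given as 2D coordinates, and that each level simply uses the first k points of the list. The level sizes, the point ordering and the helper names (`circumcircle`, `delaunay_edges`, `nested_graphs`) are hypothetical and not taken from the thesis. The Delaunay triangulation is computed by brute force (keep every triangle whose circumcircle contains no other point), which is only practical for a handful of points in general position.

```python
from itertools import combinations


def circumcircle(a, b, c):
    """Return (center_x, center_y, radius_squared), or None if collinear."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy, (ax - ux) ** 2 + (ay - uy) ** 2


def delaunay_edges(points):
    """Brute-force Delaunay edges as a set of index pairs (i, j) with i < j."""
    n = len(points)
    if n == 2:
        return {(0, 1)}
    edges = set()
    for i, j, k in combinations(range(n), 3):
        circle = circumcircle(points[i], points[j], points[k])
        if circle is None:
            continue
        ux, uy, r2 = circle
        # A triangle is Delaunay if no other point lies inside its circumcircle.
        empty = all(
            (points[m][0] - ux) ** 2 + (points[m][1] - uy) ** 2 >= r2 - 1e-9
            for m in range(n) if m not in (i, j, k)
        )
        if empty:
            edges.update({(i, j), (i, k), (j, k)})
    return edges


def nested_graphs(points, sizes=(1, 3, 4)):
    """One graph per level, built on the first `size` points (assumed ordering)."""
    return [(size, delaunay_edges(points[:size]))
            for size in sizes if size <= len(points)]


# Hypothetical keypoint coordinates; the first point plays the role of the seed.
keypoints = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (5.0, 4.0)]
for size, edges in nested_graphs(keypoints):
    print(size, sorted(edges))
```

The first level is a trivial graph (a single point with no edges), the second is one triangle, and the third adds a node and the edges of a second triangle. The node sets are nested by construction. The edge sets happen to be nested in this example, but this is not guaranteed in general when a Delaunay triangulation is recomputed on a larger point set.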
