Sukhendu Das
Preferred name: Sukhendu Das
Official Name: Sukhendu Das
Alternative Name: Das, Sukhendu
Now showing 1 - 10 of 115 results
- Publication: A motion-sketch based video retrieval using MST-CSS representation (01-12-2012)
  Co-author: Chattopadhyay, Chiranjoy
  In this work, we propose a framework for a robust Content Based Video Retrieval (CBVR) system with free-hand query sketches, using the Multi-Spectro Temporal-Curvature Scale Space (MST-CSS) representation. Our designed interface allows sketches to be drawn to depict the shape of the object in motion and its trajectory. We obtain the MST-CSS feature representation from these cues and match it with a set of MST-CSS features generated offline from the video clips in the database (gallery). Results are displayed in rank order of similarity. Experimentation with benchmark datasets shows promising results. © 2012 IEEE.

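As a rough illustration of the ranked-retrieval step described above (a query feature matched against MST-CSS features precomputed offline for the gallery clips), here is a minimal nearest-neighbour ranking sketch. The feature extraction itself is abstracted away; the function names, dimensions and random data below are placeholders, not details from the paper.

```python
import numpy as np

def rank_gallery(query_feat: np.ndarray, gallery_feats: np.ndarray, top_k: int = 10):
    """Rank gallery clips by Euclidean distance to the query feature vector.

    query_feat   : (d,) feature vector computed from the sketched shape/trajectory
    gallery_feats: (n, d) matrix of features precomputed offline for the video clips
    Returns the indices of the top_k most similar clips (smallest distance first).
    """
    dists = np.linalg.norm(gallery_feats - query_feat[None, :], axis=1)
    order = np.argsort(dists)
    return order[:top_k], dists[order[:top_k]]

# Example with random placeholder features
rng = np.random.default_rng(0)
gallery = rng.normal(size=(115, 64))      # 115 clips, 64-D features (illustrative)
query = rng.normal(size=64)
idx, d = rank_gallery(query, gallery, top_k=5)
print(idx, d)
```
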
- Publication: Supervised framework for automatic recognition and retrieval of interaction: A framework for classification and retrieving videos with similar human interactions (01-04-2016)
  Co-author: Chattopadhyay, Chiranjoy
  This study presents a supervised framework for automatic recognition and retrieval of interactions (SAFARRI), a supervised learning framework to recognise interactions such as pushing, punching, and hugging between a pair of human performers in a video shot. The primary contribution of the study is to extend the vectors of locally aggregated descriptors (VLAD) as a compact and discriminative video encoding representation, to solve the complex class partitioning problem of recognising human interaction. An initial codebook is generated from the training set of video shots, by extracting feature descriptors around the spatiotemporal interest points computed across frames. A bag of action words is generated by encoding the first-order statistics of the visual words using VLAD. Support vector machine classifiers (one against all) are trained using these codebooks. The authors have verified SAFARRI's accuracy for classification and retrieval (query by example). SAFARRI is free from tracking or recognition of body parts and is capable of identifying the region of interaction in video shots. It gives superior retrieval and classification performance over recently proposed methods on two publicly available human interaction datasets.

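The encoding-and-classification pipeline summarised above (a k-means codebook over local descriptors, VLAD aggregation of first-order residuals, then one-against-all linear SVMs) can be sketched as follows. This is a generic VLAD implementation on synthetic descriptors, not the authors' code; the interest-point detection, the descriptor choice, the codebook size and all names are placeholder assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC

def vlad_encode(descriptors: np.ndarray, kmeans: KMeans) -> np.ndarray:
    """Encode the local descriptors (n, d) of one video shot as a VLAD vector (k*d,).

    For each visual word, the residuals of the descriptors assigned to it are summed;
    the concatenated residuals are then power- and L2-normalised.
    """
    centers = kmeans.cluster_centers_               # (k, d)
    assignments = kmeans.predict(descriptors)       # nearest visual word per descriptor
    k, d = centers.shape
    vlad = np.zeros((k, d))
    for i in range(k):
        assigned = descriptors[assignments == i]
        if len(assigned):
            vlad[i] = (assigned - centers[i]).sum(axis=0)
    vlad = vlad.ravel()
    vlad = np.sign(vlad) * np.sqrt(np.abs(vlad))    # power normalisation
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad

# Illustrative pipeline: descriptors around spatio-temporal interest points per shot,
# a k-means codebook from the training shots, then a one-against-all linear SVM.
rng = np.random.default_rng(0)
train_descs = [rng.normal(size=(200, 32)) for _ in range(20)]   # 20 shots, 32-D descriptors
labels = rng.integers(0, 3, size=20)                            # 3 interaction classes
kmeans = KMeans(n_clusters=16, n_init=10, random_state=0).fit(np.vstack(train_descs))
X = np.array([vlad_encode(d, kmeans) for d in train_descs])
clf = LinearSVC().fit(X, labels)                                # one-vs-rest by default
```
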
- Publication: Trajectory representation using Gabor features for motion-based video retrieval (15-07-2009)
  Co-author: Dyana, A.
  This paper proposes a Gabor filter based representation of motion trajectory, for the purpose of motion-based video retrieval. We propose a spectro-temporal representation of the trajectory, which involves the process of detecting a set of salient points from the peaks (locally) of the Gabor filter responses. The change in trajectory direction is also represented by observing the clockwise or anti-clockwise change in direction at the salient points. The feature set (formed by the frequency, temporal location and turning direction at each salient point) provides a semantic representation of the trajectory. Our approach is a global trajectory representation where matching is performed based on edit distance and is shown to perform well even for partial trajectory matching. The system is tested using two benchmark databases of trajectories, as well as various hand-drawn and partial trajectories. We have also experimented on real world videos. Experimental results have shown better performance than existing systems based on Fourier descriptors, polynomial representation and two state-of-the-art methods of symbolic representations based on PCA and characteristics of movement. © 2009 Elsevier B.V. All rights reserved.

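Matching of the resulting salient-point sequences is reported to use edit distance, which also handles partial trajectories. A minimal dynamic-programming edit distance over symbol sequences looks like the sketch below; the symbols and costs are illustrative stand-ins, not the feature set or cost model of the paper.

```python
def edit_distance(seq_a, seq_b, sub_cost=1, gap_cost=1):
    """Classic dynamic-programming edit distance between two symbol sequences.

    Here each symbol could stand for one salient point of a trajectory
    (e.g. its turning direction); the costs are illustrative.
    """
    n, m = len(seq_a), len(seq_b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap_cost
    for j in range(1, m + 1):
        dp[0][j] = j * gap_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if seq_a[i - 1] == seq_b[j - 1] else sub_cost
            dp[i][j] = min(dp[i - 1][j] + gap_cost,      # delete
                           dp[i][j - 1] + gap_cost,      # insert
                           dp[i - 1][j - 1] + cost)      # match / substitute
    return dp[n][m]

# 'C' = clockwise turn, 'A' = anti-clockwise turn (illustrative symbols)
print(edit_distance("CACCA", "CACA"))   # a partial trajectory still matches closely
```
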
- Publication: Graph-based clustering for apictorial jigsaw puzzles of hand shredded content-less pages (01-01-2017)
  Co-author: Lalitha, K. S.
  Reassembling hand shredded content-less pages is a challenging task, with applications in forensics and fun games. This paper proposes an efficient iterative framework to solve apictorial jigsaw puzzles of hand shredded content-less pages, using only the shape information. The proposed framework consists of four phases. In the first phase, normalized shape features are extracted from the fragment contours. In the second phase, for every possible match between a pair of fragments, the transformation parameters for aligning the fragments and three goodness scores are estimated. In the third phase, incorrect matches are eliminated based on the score values, and the alignments are refined by pruning the set of pairwise matched fragments. Finally, a modified graph-based framework for agglomerative clustering is used to globally reassemble the page(s). Experimental evaluation of our proposed framework on an annotated dataset of shredded documents shows its efficiency in reconstructing multiple content-less pages from arbitrarily torn fragments.

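A heavily simplified stand-in for the last two phases (eliminating low-scoring pairwise matches, then grouping fragments globally) is sketched below using a plain union-find. The paper's modified graph-based agglomerative clustering and its three goodness scores are not reproduced here; the threshold and the scores shown are invented for illustration.

```python
class UnionFind:
    """Minimal disjoint-set structure used to group fragments greedily."""
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]   # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

def group_fragments(n_fragments, pairwise_matches, score_threshold=0.8):
    """pairwise_matches: list of (i, j, score) tuples; scores assumed in [0, 1].

    Matches below the threshold are discarded (the 'incorrect match elimination'
    phase); the rest are merged in decreasing score order.
    """
    uf = UnionFind(n_fragments)
    for i, j, score in sorted(pairwise_matches, key=lambda t: -t[2]):
        if score >= score_threshold:
            uf.union(i, j)
    groups = {}
    for f in range(n_fragments):
        groups.setdefault(uf.find(f), []).append(f)
    return list(groups.values())

print(group_fragments(5, [(0, 1, 0.95), (1, 2, 0.9), (3, 4, 0.85), (0, 3, 0.3)]))
# -> two clusters: [[0, 1, 2], [3, 4]]
```
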
- Publication: System-on-programmable-chip implementation for on-line face recognition (01-02-2007)
  Co-author: Pavan Kumar, A.
  In this paper, the design of a parallel architecture for on-line face recognition using weighted modular principal component analysis (WMPCA) and its system-on-programmable-chip (SoPC) implementation are discussed. The WMPCA methodology, proposed by us earlier, is based on the assumption that the rates of variation of the different regions of a face are different due to variations in expression and illumination. Given a database of sample faces for training and a query face for recognition, the WMPCA methodology involves division of the face into horizontal regions. Each of these regions is analyzed independently by computing its eigenfeatures and comparing them with the corresponding eigenfeatures of the faces stored in the sample database to calculate the corresponding error. The final decision of the face recognizer is based on the weighted sum of the errors computed from each of the regions. These weights are calculated based on the extent to which the various samples of the subject are spread in the eigenspace. The WMPCA methodology has a better recognition rate compared to the modular PCA approach developed by Rajkiran and Vijayan [Rajkiran, G., Vijayan, K., 2004. An improved face recognition technique based on modular PCA approach. Pattern Recognition Letters, 25(4), 429-436]. The methodology also has wide scope for parallelism. We present an architecture that exploits this parallelism and implement it as a system-on-programmable-chip on an ALTERA-based field programmable gate array (FPGA) platform. The implementation achieves a processing speed of about 26 frames per second at an operating frequency of 33.33 MHz. © 2006 Elsevier B.V. All rights reserved.

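The weighted modular PCA decision rule described in this entry (and in the 2004 entry further down) can be sketched roughly as follows: one PCA per horizontal face region, per-region matching errors in eigenspace, and a weighted sum for the final decision. The weights are simply passed in by the caller here, whereas the papers derive them from the spread of a subject's samples in eigenspace; the region count, component count and random data are illustrative placeholders.

```python
import numpy as np
from sklearn.decomposition import PCA

def fit_region_pcas(train_faces, n_regions=4, n_components=10):
    """Split each face (h, w) into horizontal strips and fit one PCA per strip.

    Returns the fitted PCA models and the projected training features per region.
    """
    strips = [np.array_split(f, n_regions, axis=0) for f in train_faces]
    pcas, feats = [], []
    for r in range(n_regions):
        X = np.stack([s[r].ravel() for s in strips])
        pca = PCA(n_components=n_components).fit(X)
        pcas.append(pca)
        feats.append(pca.transform(X))
    return pcas, feats

def recognise(query_face, pcas, feats, weights):
    """Weighted sum of per-region nearest-neighbour errors; smallest total wins."""
    n_regions = len(pcas)
    strips = np.array_split(query_face, n_regions, axis=0)
    total = np.zeros(feats[0].shape[0])
    for r in range(n_regions):
        q = pcas[r].transform(strips[r].ravel()[None, :])
        errors = np.linalg.norm(feats[r] - q, axis=1)   # per-region eigenspace error
        total += weights[r] * errors
    return int(np.argmin(total))

# Illustrative run with random "faces"
rng = np.random.default_rng(0)
gallery = [rng.normal(size=(64, 64)) for _ in range(30)]
pcas, feats = fit_region_pcas(gallery)
print(recognise(gallery[7], pcas, feats, weights=[0.25, 0.25, 0.25, 0.25]))  # -> 7
```
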
- Publication: Predicting Video Frames Using Feature Based Locally Guided Objectives (01-01-2019)
  Co-author: Bhattacharjee, Prateep
  This paper presents a feature reconstruction based approach using Generative Adversarial Networks (GANs) to solve the problem of predicting future frames from natural video scenes. Recent GAN-based methods often generate blurry outcomes and fail miserably in the case of long-range prediction. Our proposed method incorporates an intermediate feature-generating GAN to minimize the disparity between the ground truth and predicted outputs. For this, we propose two novel objective functions: (a) Locally Guided Gram Loss (LGGL) and (b) Multi-Scale Correlation Loss (MSCL) to further enhance the quality of the predicted frames. LGGL aids the feature-generating GAN in maximizing the similarity between the intermediate features of the ground truth and the network output by constructing Gram matrices from locally extracted patches over several levels of the generator. MSCL incorporates a correlation-based objective to effectively model the temporal relationships between the predicted and ground-truth frames at the frame-generating stage. Our proposed model is end-to-end trainable and exhibits superior performance compared to the state of the art on four real-world benchmark video datasets.

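One plausible reading of a locally guided Gram loss, Gram matrices built from locally extracted feature patches and compared between prediction and ground truth, is sketched below in PyTorch. This is only an interpretation for illustration, not the paper's exact formulation; the patch size, scaling and MSE comparison are assumptions.

```python
import torch
import torch.nn.functional as F

def local_gram_loss(feat_pred, feat_true, patch=4, stride=4):
    """Compare Gram matrices of local patches from two feature maps (N, C, H, W).

    Gram matrices are computed patch-by-patch rather than over the whole feature
    map, and the mean squared difference between corresponding patch Grams is
    returned. A rough sketch, not the authors' loss.
    """
    n, c, _, _ = feat_pred.shape

    def patch_grams(feat):
        cols = F.unfold(feat, kernel_size=patch, stride=stride)          # (N, C*p*p, L)
        cols = cols.view(n, c, patch * patch, -1).permute(0, 3, 1, 2)    # (N, L, C, p*p)
        grams = cols @ cols.transpose(-1, -2)                            # (N, L, C, C)
        return grams / (c * patch * patch)                               # scale for stability

    return F.mse_loss(patch_grams(feat_pred), patch_grams(feat_true))

# Illustrative call with random tensors standing in for generator activations
pred = torch.randn(2, 64, 16, 16, requires_grad=True)
true = torch.randn(2, 64, 16, 16)
loss = local_gram_loss(pred, true)
loss.backward()
```
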
- Publication: A novel hyperstring based descriptor for an improved representation of motion trajectory and retrieval of similar video shots with static camera (01-12-2012)
  Co-author: Chattopadhyay, Chiranjoy
  A framework has been proposed for representing the trajectory of a moving object, using a novel hyperstring based approach for efficient retrieval of video shots. The hyperstring based model unifies both the structural and kinematic features for an improved representation of the trajectory. A Constraint-driven Adjacency Graph matching (CAGM) algorithm has been proposed to measure the similarity between a pair of query and model hyperstrings. Experiments have been performed on benchmark datasets of trajectories (one synthetic and three real-world video shots), to assess the performance (using Precision-Recall metric) of the proposed model. Results have been compared with two similar published works on video retrieval using trajectories, to demonstrate the superiority of our proposed framework. © 2012 IEEE.

- Publication: SLAR (Simultaneous Localization and Recognition) framework for smart CBIR (25-01-2012)
  Co-authors: Dwivedi, Gyanesh; Rakshit, Subrata; Vora, Megha; Samanta, Suranjana
  In traditional content-based image retrieval (CBIR) methods, features are extracted from the entire image for computing similarity with the query. It is necessary to design a smart object-centric CBIR to retrieve gallery images having objects similar to the one present in the foreground of the query image. We propose a model for a novel SLAR (Simultaneous Localization And Recognition) framework for solving this problem of smart CBIR, to simultaneously: (i) detect the location and (ii) recognize the type (ID or class) of the foreground object in a scene. The framework integrates both unsupervised and supervised methods of foreground segmentation and object classification. This model is motivated by cognitive models of human visual perception, which generalize from examples to simultaneously locate and categorize objects. Experimentation has been done on six categories of objects and the results have been compared with a contemporary work on CBIR. © 2012 Springer-Verlag.

- Publication: Face recognition using weighted modular principal component analysis (01-12-2004)
  Co-author: Pavan Kumar, A.
  A method of face recognition using weighted modular principal component analysis (WMPCA) is presented in this paper. The proposed methodology has a better recognition rate, when compared with conventional PCA, for faces with large variations in expression and illumination. The face is divided into horizontal sub-regions such as forehead, eyes, nose and mouth. Each of them is then separately analyzed using PCA. The final decision is taken based on a weighted sum of the errors obtained from each sub-region. A method is proposed to calculate these weights, based on the assumption that different regions in a face vary at different rates with expression, pose and illumination. © Springer-Verlag Berlin Heidelberg 2004.

- Publication: Face recognition on low quality surveillance images, by compensating degradation (19-07-2011)
  Co-author: Rudrani, Shiva
  Face images obtained by an outdoor surveillance camera are often confronted with severe degradations (e.g., low resolution, low contrast, blur and noise). This significantly limits the performance of face recognition (FR) systems. This paper presents a framework to overcome the degradation in images obtained by an outdoor surveillance camera, so as to improve the performance of FR. We have defined a measure, based on the difference in intensity histograms of face images, to estimate the amount of degradation. In the past, super-resolution techniques have been proposed to increase the image resolution for face recognition. In this work, we attempt a combination of partial restoration (using super-resolution, interpolation, etc.) of probe samples (long-distance outdoor shots) and simulated degradation of gallery samples (indoor shots). Due to the unavailability of any benchmark face database with such gallery and probe images, we have built our own realistic surveillance face database and conducted experiments on it. PCA and FLDA have been used as baseline face recognition classifiers. The aim is to illustrate the effectiveness of our proposed method of compensating the degradation in surveillance data, rather than designing a specific classifier space suited for degraded test probes. The efficiency of the method is shown by the improvement in face classification accuracy when classifiers trained on the acquired indoor gallery samples are tested on the outdoor probes. © 2011 Springer-Verlag.

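The two ideas of simulating degradation on the indoor gallery (blur, resolution loss, noise) and measuring degradation via intensity-histogram differences can be sketched as below. The parameter values, the specific histogram distance and the degradation model are placeholders, not the ones estimated in the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def simulate_degradation(face, blur_sigma=2.0, scale=0.25, noise_std=5.0):
    """Roughly mimic outdoor surveillance conditions on an indoor gallery image:
    Gaussian blur, down/up-sampling to lose resolution, and additive noise.
    The parameter values are placeholders, not those estimated in the paper."""
    low = zoom(zoom(gaussian_filter(face, blur_sigma), scale, order=1), 1.0 / scale, order=1)
    noisy = low + np.random.default_rng(0).normal(0, noise_std, low.shape)
    return np.clip(noisy, 0, 255)

def histogram_difference(img_a, img_b, bins=64):
    """A simple intensity-histogram distance, in the spirit of the degradation
    measure described above (the exact measure in the paper may differ)."""
    ha, _ = np.histogram(img_a, bins=bins, range=(0, 255), density=True)
    hb, _ = np.histogram(img_b, bins=bins, range=(0, 255), density=True)
    return np.abs(ha - hb).sum()

# Illustrative run on a random image standing in for an indoor gallery face
gallery_face = np.random.default_rng(1).uniform(0, 255, size=(96, 96))
probe_like = simulate_degradation(gallery_face)
print(histogram_difference(gallery_face, probe_like))
```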