Kaushik Mitra
Preferred name: Kaushik Mitra
Official Name: Kaushik Mitra
Alternative Name: Mitra, Kaushik
Main Affiliation:
Email:
ORCID:
Scopus Author ID:
Google Scholar ID:
9 results
- Publication: Joint optic disc and cup segmentation using fully convolutional and adversarial networks (01-01-2017)
  Shankaranarayana, Sharath M.; Ram, Keerthi
  Glaucoma is a highly threatening and widespread ocular disease which may lead to permanent loss of vision. One of the important parameters used for glaucoma screening is the cup-to-disc ratio (CDR), which requires accurate segmentation of the optic cup and disc. We explore fully convolutional networks (FCNs) for the task of joint segmentation of the optic cup and disc. We propose a novel improved architecture building upon FCNs by using the concept of residual learning. Additionally, we explore whether adversarial training helps in improving the segmentation results. The method does not require any complicated preprocessing techniques for feature enhancement. We learn a mapping between the retinal images and the corresponding segmentation maps using fully convolutional and adversarial networks. We perform extensive experiments with various models on a set of 159 images from the RIM-ONE database and provide extensive comparisons. The proposed method outperforms state-of-the-art methods on various evaluation metrics for both disc and cup segmentation.
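  For context, the cup-to-disc ratio mentioned above is typically derived from the segmented cup and disc regions. The sketch below shows one common convention (the vertical extent of each binary mask); the helper names and the toy masks are illustrative, not the paper's exact evaluation protocol.

```python
import numpy as np

def vertical_extent(mask):
    """Height (in pixels) of the foreground region of a binary mask."""
    rows = np.where(mask.any(axis=1))[0]
    return int(rows.max() - rows.min() + 1) if rows.size else 0

def cup_to_disc_ratio(cup_mask, disc_mask):
    """Vertical CDR: ratio of cup height to disc height from binary masks."""
    disc_h = vertical_extent(disc_mask)
    return vertical_extent(cup_mask) / disc_h if disc_h else float("nan")

# Toy example: an 8x8 disc mask with a smaller cup mask inside it.
disc = np.zeros((8, 8), dtype=bool); disc[1:7, 1:7] = True
cup = np.zeros((8, 8), dtype=bool); cup[3:6, 3:6] = True
print(cup_to_disc_ratio(cup, disc))  # 3 / 6 = 0.5
```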
- Publication: LWGNet - Learned Wirtinger Gradients for Fourier Ptychographic Phase Retrieval (01-01-2022)
  Saha, Atreyee; Khan, Salman S.; Sehrawat, Sagar; Prabhu, Sanjana S.
  Fourier Ptychographic Microscopy (FPM) is an imaging procedure that overcomes the traditional limit on the Space-Bandwidth Product (SBP) of conventional microscopes through computational means. It utilizes multiple images captured using a low numerical aperture (NA) objective and enables high-resolution phase imaging through frequency-domain stitching. Existing FPM reconstruction methods can be broadly categorized into two approaches: iterative optimization-based methods, which are based on the physics of the forward imaging model, and data-driven methods, which commonly employ a feed-forward deep learning framework. We propose a hybrid model-driven residual network that combines knowledge of the forward imaging system with a deep data-driven network. Our proposed architecture, LWGNet, unrolls the traditional Wirtinger flow optimization algorithm into a novel neural network design that enhances the gradient images through complex convolutional blocks. Unlike other conventional unrolling techniques, LWGNet uses fewer stages while performing on par with or even better than existing traditional and deep learning techniques, particularly for low-cost and low dynamic range CMOS sensors. This improvement in performance for low-bit-depth and low-cost sensors has the potential to bring down the cost of an FPM imaging setup significantly. Finally, we show consistently improved performance on our collected real data. The code is available at https://github.com/at3e/LWGNet.git.
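  LWGNet unrolls the classical Wirtinger flow update into a network. Purely to illustrate the underlying update being unrolled (not the learned complex convolutional blocks), the sketch below applies one amplitude-based, Wirtinger-style gradient step to the object spectrum for a single LED measurement; the crop-based sub-aperture model, FFT normalization, and step size are simplifying assumptions.

```python
import numpy as np

def wirtinger_step(obj_spec, pupil, meas_amp, center, step=1.0):
    """One amplitude-based Wirtinger-style gradient step on the object spectrum.

    obj_spec : complex high-resolution object spectrum (H x W)
    pupil    : complex pupil function (h x w), with h <= H and w <= W
    meas_amp : measured low-resolution amplitude for one LED (h x w)
    center   : (row, col) of the spectrum patch selected by that LED
    FFT normalization constants are folded into the step size.
    """
    h, w = pupil.shape
    r0, c0 = center[0] - h // 2, center[1] - w // 2
    sub = obj_spec[r0:r0 + h, c0:c0 + w]               # spectrum patch seen by this LED
    field = np.fft.ifft2(sub * pupil)                   # estimated low-res complex field
    residual = field - meas_amp * np.exp(1j * np.angle(field))  # amplitude mismatch
    grad = np.conj(pupil) * np.fft.fft2(residual)       # back-project through the model
    new_spec = obj_spec.copy()
    new_spec[r0:r0 + h, c0:c0 + w] = sub - step * grad
    return new_spec

# Toy usage with random data (shapes only, not a physical simulation).
spec = np.fft.fft2(np.random.rand(64, 64))
pupil = np.ones((32, 32), dtype=complex)
meas_amp = np.abs(np.fft.ifft2(spec[16:48, 16:48] * pupil))
spec = wirtinger_step(spec, pupil, meas_amp, center=(32, 32))
```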
- Publication: A Bottom-Up Saliency Estimation Approach for Neonatal Retinal Images (01-01-2018)
  Shankaranarayana, Sharath M.; Ram, Keerthi; Vinekar, Anand
  Retinopathy of Prematurity (ROP) is a potentially blinding disease occurring primarily in prematurely born neonates. Staging, or classification of ROP into various stages, depends mainly on the presence of a ridge or demarcation line and its distance with respect to the optic disc. Computer-aided diagnosis of ROP therefore requires a method to automatically detect the ridge. To this end, a new bottom-up saliency estimation method for neonatal retinal images is proposed. The method consists of first obtaining a depth map of the neonatal retinal image via an image restoration scheme based on a physical model. The obtained depth map is then converted to a saliency map. The image is further processed to even out illumination and contrast variations and to remove border artifacts. Next, two additional saliency maps are estimated from the processed image using gradient and appearance cues. The obtained saliency maps are then fused using pixel-wise multiplication and addition operators. The final saliency map facilitates the detection of the demarcation line and is qualitatively shown to be more suitable for neonatal retinal images than state-of-the-art saliency estimation techniques. This method could thus serve as a tool for improved and faster diagnosis. Additionally, we explore the usefulness of saliency maps for the task of classifying ROP into four stages.
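  Since the abstract describes fusing the depth-, gradient-, and appearance-based saliency maps with pixel-wise multiplication and addition, a minimal sketch of such a fusion follows; the per-map normalization and the equal weighting are assumptions, not the exact rule used in the paper.

```python
import numpy as np

def normalize(m):
    """Scale a map to [0, 1] so the cues are comparable before fusion."""
    m = m.astype(np.float64)
    rng = m.max() - m.min()
    return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

def fuse_saliency(depth_map, gradient_map, appearance_map):
    """Pixel-wise multiplicative and additive fusion of three saliency cues."""
    d, g, a = (normalize(m) for m in (depth_map, gradient_map, appearance_map))
    fused = d * g * a + (d + g + a) / 3.0  # multiplication rewards agreement,
    return normalize(fused)                # addition keeps weaker single-cue evidence
```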
- Publication: High-Speed HDR Video Reconstruction from Hybrid Intensity Frames and Events (01-01-2023)
  Samra, Rishabh; Shedligeri, Prasan
  An effective way to generate high dynamic range (HDR) videos is to capture a sequence of low dynamic range (LDR) frames with alternating exposures and interpolate the intermediate frames. Video frame interpolation techniques can help reconstruct missing information from neighboring images of different exposures. Most conventional video frame interpolation techniques compute optical flow between successively captured frames and linearly interpolate them to obtain the intermediate frames. However, these techniques fail when there is nonlinear motion or sudden brightness changes in the scene. Event sensors are a new class of sensors that asynchronously measure per-pixel brightness changes and offer advantages such as high temporal resolution, high dynamic range, and low latency. For HDR video reconstruction, we recommend using a hybrid imaging system consisting of a conventional camera, which captures alternate-exposure LDR frames, and an event camera, which captures high-speed events. We interpolate the missing frames for each exposure using an event-based interpolation technique that takes in the nearest image frames corresponding to that exposure and the high-speed event data between those frames. At each timestamp, once we have interpolated all the LDR frames for the different exposures, we use a deep learning-based algorithm to obtain the HDR frame. We compare our results with those of non-event-based interpolation methods and find that event-based techniques perform better when a large number of frames need to be interpolated.
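  The pipeline above interpolates aligned LDR frames for every exposure and then merges them into an HDR frame with a learned network. As a rough stand-in for that final merging step only, the sketch below performs a classical weighted HDR merge of aligned, radiometrically linear frames; the triangle weighting and exposure normalization are illustrative choices, not the paper's learned model.

```python
import numpy as np

def merge_hdr(frames, exposure_times):
    """Weighted merge of spatially aligned, linear LDR frames with values in [0, 1].

    A triangle weight favors well-exposed pixels; each frame is divided by its
    exposure time to bring it to a common radiance scale before averaging.
    """
    num = np.zeros_like(frames[0], dtype=np.float64)
    den = np.zeros_like(frames[0], dtype=np.float64)
    for img, t in zip(frames, exposure_times):
        w = 1.0 - np.abs(2.0 * img - 1.0)  # low weight near under- and over-exposure
        num += w * img / t
        den += w
    return num / np.maximum(den, 1e-8)
```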
- Publication: Deep Atrous Guided Filter for Image Restoration in Under Display Cameras (01-01-2020)
  Sundar, Varun; Hegde, Sumanth; Kothandaraman, Divya
  Under-Display Cameras present a promising opportunity for phone manufacturers to achieve bezel-free displays by positioning the camera behind semi-transparent OLED screens. Unfortunately, such imaging systems suffer from severe image degradation due to light attenuation and diffraction effects. In this work, we present the Deep Atrous Guided Filter (DAGF), a two-stage, end-to-end approach for image restoration in UDC systems. A Low-Resolution Network first restores image quality at low resolution, and its output is then used by the Guided Filter Network as a filtering input to produce a high-resolution output. Beyond the initial downsampling, our low-resolution network uses multiple parallel atrous convolutions to preserve spatial resolution and emulate multi-scale processing. Our approach's ability to train directly on megapixel images results in significant performance improvement. We additionally propose a simple simulation scheme to pre-train our model and boost performance. Our overall framework ranks 2nd and 5th in the RLQ-TOD'20 UDC Challenge for POLED and TOLED displays, respectively.
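  To illustrate the "multiple parallel atrous convolutions" used in DAGF's low-resolution network, the PyTorch sketch below builds a parallel dilated-convolution block that preserves spatial resolution; the dilation rates, channel counts, and residual fusion are assumptions rather than the released DAGF architecture.

```python
import torch
import torch.nn as nn

class ParallelAtrousBlock(nn.Module):
    """Parallel atrous (dilated) convolutions that keep the input resolution.

    Dilation rates and channel counts are illustrative choices, not the exact
    configuration of DAGF's low-resolution network.
    """
    def __init__(self, channels: int = 32, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        ])
        self.fuse = nn.Conv2d(channels * len(dilations), channels, kernel_size=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        multi_scale = torch.cat([self.act(branch(x)) for branch in self.branches], dim=1)
        return x + self.fuse(multi_scale)  # residual connection keeps the input content

# Toy usage: a 32-channel feature map keeps its 64x64 spatial size.
feats = torch.randn(1, 32, 64, 64)
print(ParallelAtrousBlock()(feats).shape)  # torch.Size([1, 32, 64, 64])
```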
- Publication: Synthesizing Light Field Video from Monocular Video (01-01-2022)
  Govindarajan, Shrisudhan; Shedligeri, Prasan; Sarah
  The hardware challenges associated with light-field (LF) imaging have made it difficult for consumers to access its benefits, such as applications in post-capture focus and aperture control. Learning-based techniques which solve the ill-posed problem of LF reconstruction from sparse (1, 2, or 4) views have significantly reduced the need for complex hardware. LF video reconstruction from sparse views poses a special challenge, as acquiring ground truth for training these models is hard. Hence, we propose a self-supervised learning-based algorithm for LF video reconstruction from monocular videos. We use self-supervised geometric, photometric, and temporal consistency constraints inspired by a recent learning-based technique for LF video reconstruction from stereo video. Additionally, we propose three key techniques that are relevant to our monocular video input. We propose an explicit disocclusion handling technique that encourages the network to use information from adjacent input temporal frames for inpainting disoccluded regions in an LF frame. This is crucial for a self-supervised technique, as a single input frame does not contain any information about the disoccluded regions. We also propose an adaptive low-rank representation that provides a significant boost in performance by tailoring the representation to each input scene. Finally, we propose a novel refinement block that is able to exploit the available LF image data using supervised learning to further refine the reconstruction quality. Our qualitative and quantitative analysis demonstrates the significance of each of the proposed building blocks, as well as superior results compared to previous state-of-the-art monocular LF reconstruction techniques. We further validate our algorithm by reconstructing LF videos from monocular videos acquired using a commercial GoPro camera. An open-source implementation is available at https://github.com/ShrisudhanG/Synthesizing-Light-Field-Video-from-Monocular-Video.
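  The adaptive low-rank representation mentioned above can be illustrated by the generic idea of expressing the many angular views of a light field as combinations of a few basis images with per-view coefficients. The sketch below shows only that synthesis step; the rank, view count, and naming are assumptions, not the paper's exact formulation.

```python
import numpy as np

def synthesize_light_field(basis, coeffs):
    """Build sub-aperture views as linear combinations of a few basis images.

    basis  : (K, H, W) basis images, with small rank K (e.g. 3)
    coeffs : (V, K)    per-view mixing weights for V angular views
    returns: (V, H, W) synthesized sub-aperture views
    """
    return np.einsum('vk,khw->vhw', coeffs, basis)

# Toy usage: a rank-3 representation of a 7x7 = 49-view light field.
views = synthesize_light_field(np.random.rand(3, 32, 32), np.random.rand(49, 3))
print(views.shape)  # (49, 32, 32)
```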
- Publication: Pyramidal Edge-Maps and Attention Based Guided Thermal Super-Resolution (01-01-2020)
  Gupta, Honey
  Guided super-resolution (GSR) of thermal images using visible-range images is challenging because of the difference in spectral range between the images. This means there is significant texture mismatch between the images, which manifests as blur and ghosting artifacts in the super-resolved thermal image. To tackle this, we propose a novel algorithm for GSR based on pyramidal edge maps extracted from the visible image. Our proposed network has two sub-networks. The first sub-network super-resolves the low-resolution thermal image, while the second obtains edge maps from the visible image at a growing perceptual scale and integrates them into the super-resolution sub-network with the help of attention-based fusion. Extraction and integration of multi-level edges allow the super-resolution network to process texture-to-object-level information progressively, enabling more straightforward identification of overlapping edges between the input images. Extensive experiments show that our model outperforms state-of-the-art GSR methods, both quantitatively and qualitatively.
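  As a rough illustration of the pyramidal edge maps that feed the attention-based fusion described above, the sketch below extracts edge maps from the visible image at progressively coarser scales; the box downsampling and finite-difference gradients are illustrative stand-ins for the perceptual edge features learned inside the network.

```python
import numpy as np

def gradient_magnitude(img):
    """Finite-difference edge map of a grayscale image."""
    gy, gx = np.gradient(img.astype(np.float64))
    return np.hypot(gx, gy)

def pyramidal_edge_maps(visible, levels=3):
    """Edge maps of the visible image at progressively coarser scales."""
    maps, img = [], visible.astype(np.float64)
    for _ in range(levels):
        maps.append(gradient_magnitude(img))
        h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
        img = img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))  # 2x2 box downsample
    return maps

# Toy usage: three edge maps at 64x64, 32x32 and 16x16 resolution.
edges = pyramidal_edge_maps(np.random.rand(64, 64))
print([e.shape for e in edges])  # [(64, 64), (32, 32), (16, 16)]
```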
- Publication: UDC 2020 Challenge on Image Restoration of Under-Display Camera: Methods and Results (01-01-2020)
  Zhou, Yuqian; Kwan, Michael; Tolentino, Kyle; Emerton, Neil; Lim, Sehoon; Large, Tim; Fu, Lijiang; Pan, Zhihong; Li, Baopu; Yang, Qirui; Liu, Yihao; Tang, Jigang; Ku, Tao; Ma, Shibin; Hu, Bingnan; Wang, Jiarong; Puthussery, Densen; Hrishikesh, P. S.; Kuriakose, Melvin; Jiji, C. V.; Sundar, Varun; Hegde, Sumanth; Kothandaraman, Divya; Jassal, Akashdeep; Shah, Nisarg A.; Nathan, Sabari; Rahel, Nagat Abdalla Esiad; Chen, Dafan; Nie, Shichao; Yin, Shuting; Ma, Chengconghui; Wang, Haoran; Zhao, Tongtong; Zhao, Shanshan; Rego, Joshua; Chen, Huaijin; Li, Shuai; Hu, Zhenhua; Lau, Kin Wai; Po, Lai Man; Yu, Dahai; Rehman, Yasar Abbas Ur; Li, Yiqun; Xing, Lianping
  This paper reports on the first Under-Display Camera (UDC) image restoration challenge, held in conjunction with the RLQ workshop at ECCV 2020. The challenge is based on a newly collected database of Under-Display Camera images. The challenge tracks correspond to two types of display: a 4K Transparent OLED (T-OLED) and a phone Pentile OLED (P-OLED). About 150 teams registered for the challenge, and eight and nine teams submitted results during the testing phase for the two tracks, respectively. The results in the paper represent the state of the art in Under-Display Camera restoration. Datasets and the paper are available at https://yzhouas.github.io/projects/UDC/udc.html.