Now showing 1 - 10 of 41
  • Publication
    Joint optic disc and cup segmentation using fully convolutional and adversarial networks (01-01-2017)
    Shankaranarayana, Sharath M.; Ram, Keerthi
    Glaucoma is a highly threatening and widespread ocular disease which may lead to permanent loss of vision. One of the important parameters used for glaucoma screening is the cup-to-disc ratio (CDR), which requires accurate segmentation of the optic cup and disc. We explore fully convolutional networks (FCNs) for the task of joint segmentation of the optic cup and disc. We propose a novel improved architecture building upon FCNs by using the concept of residual learning. Additionally, we explore whether adversarial training helps in improving the segmentation results. The method does not require any complicated preprocessing techniques for feature enhancement. We learn a mapping between retinal images and the corresponding segmentation maps using fully convolutional and adversarial networks. We perform extensive experiments with various models on a set of 159 images from the RIM-ONE database, along with extensive comparisons. The proposed method outperforms state-of-the-art methods on various evaluation metrics for both disc and cup segmentation.
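    The abstract names two ingredients: residual learning on top of an FCN, and an adversarial training signal. The PyTorch sketch below illustrates both; the layer sizes and the exact loss form are illustrative assumptions, not the paper's actual architecture.

    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        """Residual learning: the block predicts a correction added back to its input."""
        def __init__(self, channels):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, 3, padding=1),
            )

        def forward(self, x):
            return torch.relu(x + self.body(x))

    def adversarial_loss(d_fake_scores):
        # Adversarial term for the segmenter (hypothetical form): a discriminator D
        # scores (image, predicted mask) pairs, and the segmenter is trained to
        # make D label its masks as real.
        return nn.functional.binary_cross_entropy_with_logits(
            d_fake_scores, torch.ones_like(d_fake_scores))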
  • Publication
    Learning light field reconstruction from a single coded image (13-12-2018)
    Vadathya, Anil Kumar; Cholleti, Saikiran; Ramajayam, Gautham; Kanchana, Vijayalakshmi
    Light field imaging is a rich way of representing the 3D world around us. However, due to limited sensor resolution, capturing light field data inherently poses a spatio-angular resolution trade-off. In this paper, we propose a deep-learning-based solution to tackle this trade-off. Specifically, we reconstruct a full-sensor-resolution light field from a single coded image. We propose to do this in three stages: 1) reconstructing the center view from the coded image; 2) estimating a disparity map from the coded image and the center view; 3) warping the center view using the disparity to generate the light field. We propose three neural networks for these stages. Our disparity estimation network is trained in an unsupervised manner, alleviating the need for ground-truth disparity. Our results demonstrate better recovery of parallax from the coded image. We also obtain better results than dictionary-learning approaches on simulated data.
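    Stage 3 is a classical disparity-driven warp. Below is a minimal PyTorch sketch of such a backward warp, assuming the common convention that a view at angular offset (du, dv) from the center is sampled along the scaled disparity; the function name and offset convention are hypothetical.

    import torch
    import torch.nn.functional as F

    def warp_center_view(center, disparity, du, dv):
        # center: (B, 3, H, W), disparity: (B, 1, H, W)
        # (du, dv): angular offset of the target light-field view
        B, _, H, W = center.shape
        ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
        xs = xs.float().expand(B, H, W) + du * disparity[:, 0]
        ys = ys.float().expand(B, H, W) + dv * disparity[:, 0]
        # normalize sampling coordinates to [-1, 1] for grid_sample
        grid = torch.stack([2 * xs / (W - 1) - 1, 2 * ys / (H - 1) - 1], dim=-1)
        return F.grid_sample(center, grid, align_corners=True)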
  • Publication
    FlatNet3D: intensity and absolute depth from single-shot lensless capture (01-10-2022)
    Bagadthey, Dhruvjyoti; Prabhu, Sanjana; Khan, Salman S.; Fredrick, D. Tony; Boominathan, Vivek; Veeraraghavan, Ashok
    Lensless cameras are ultra-thin imaging systems that replace the lens with a thin passive optical mask and computation. Passive mask-based lensless cameras encode depth information in their measurements over a certain depth range. Early works have shown that this encoded depth can be used to perform 3D reconstruction of close-range scenes. However, these approaches to 3D reconstruction are typically optimization-based and require strong hand-crafted priors and hundreds of iterations to converge. Moreover, the reconstructions suffer from low resolution, noise, and artifacts. In this work, we propose FlatNet3D, a feed-forward deep network that can estimate both depth and intensity from a single lensless capture. FlatNet3D is an end-to-end trainable deep network that directly reconstructs depth and intensity from a lensless measurement using an efficient physics-based 3D mapping stage and a fully convolutional network. Our algorithm is fast and produces high-quality results, which we validate using both simulated and real scenes captured using PhlatCam.
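    One way such a physics-based 3D mapping stage can work (an assumption modeled on standard multi-depth Wiener deconvolution, not necessarily the paper's exact formulation) is to deconvolve the measurement with a calibrated PSF for each candidate depth, producing a depth-indexed stack for the network that follows:

    import torch

    def multi_depth_wiener(meas, psfs, snr=1e3):
        # meas: (H, W) lensless measurement; psfs: (D, H, W), one calibrated
        # PSF per candidate depth; snr: assumed signal-to-noise ratio
        M = torch.fft.fft2(meas)
        P = torch.fft.fft2(psfs)
        wiener = torch.conj(P) / (P.abs() ** 2 + 1.0 / snr)
        # (D, H, W) stack of per-depth intensity hypotheses
        return torch.fft.ifft2(wiener * M).real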
  • Publication
    Toward Unaligned Guided Thermal Super-Resolution (01-01-2022)
    Gupta, Honey
    Thermography is a useful imaging technique as it works well in poor visibility conditions. However, high-resolution thermal imaging sensors are usually expensive, which limits the general applicability of such imaging systems. Many thermal cameras are accompanied by a high-resolution visible-range camera, which can be used as a guide to super-resolve the low-resolution thermal images. However, the thermal and visible images form a stereo pair, and the difference in their spectral ranges makes it very challenging to align the two images pixel-wise. Existing guided super-resolution (GSR) methods assume aligned image pairs and hence are not appropriate for this task. In this paper, we attempt to remove the necessity of pixel-to-pixel alignment for GSR by proposing two models: the first employs a correlation-based feature-alignment loss to reduce the misalignment in the feature space itself, and the second includes a misalignment-map estimation block as part of an end-to-end framework that adequately aligns the input images for performing guided super-resolution. We conduct multiple experiments to compare our methods with existing state-of-the-art single and guided super-resolution techniques and show that our models are better suited for the task of unaligned guided super-resolution from very low-resolution thermal images.
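    As a rough illustration of the first model's idea, a correlation-style feature-alignment loss compares deep features rather than raw pixels, which is more tolerant of misalignment. The specific form below (per-location cosine similarity between feature maps) is an assumption for illustration, not the paper's exact loss.

    import torch
    import torch.nn.functional as F

    def feature_alignment_loss(feat_thermal, feat_guide):
        # feat_*: (B, C, H, W) feature maps from a shared feature extractor
        a = F.normalize(feat_thermal, dim=1)
        b = F.normalize(feat_guide, dim=1)
        corr = (a * b).sum(dim=1)   # per-pixel cosine similarity in [-1, 1]
        return (1 - corr).mean()    # high correlation -> low loss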
  • Publication
    Live demonstration: Joint estimation of optical flow and intensity image from event sensors (01-06-2019)
    Shedligeri, Prasan
    This demonstration will show a deep-learning-based method to predict intensity images and optical flow from an event-based sensor. The deep neural network is trained to take as input a sequence of event frames and output a sequence of intensity images corresponding to the event frames, along with the optical flow between successive event frames. This setup will allow visitors to see that neural-network-based methods can be used to process event sequences efficiently.
  • Publication
    Real-Time Restoration of Dark Stereo Images (01-01-2023)
    Lamba, Mohit; Suhas Kumar, M. V.A.
    Low-light image enhancement has been an actively researched area for decades and has produced excellent night-time single-image, video, and light field restoration methods. Despite these advances, the problem of extreme low-light stereo image restoration has been mostly ignored, and addressing it can bring night-time capabilities to several applications such as smartphones and self-driving cars. We propose an especially lightweight and fast hybrid U-Net architecture for extreme low-light stereo image restoration. In the initial few scale spaces, we process the left and right features individually, because the two features do not align well due to large disparity. At coarser scale spaces, the disparity between left and right features decreases and the network's receptive field increases. We use this fact to reduce computation by processing the left and right features jointly, which also benefits epipole preservation. As our architecture avoids 3D convolutions for fast inference, we use a Depth-Aware loss module to train our network. This module computes quick, coarse depth estimates to better enforce the stereo epipolar constraints. Extensive benchmarking in terms of visual enhancement and downstream depth estimation shows that our architecture not only restores dark stereo images faithfully but also offers a 4-60× speed-up with 15-100× fewer floating-point operations, as required for real-world applications.
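    A minimal PyTorch sketch of the hybrid layout described above: per-view encoding with shared weights at fine scales, then joint processing at a coarse scale where disparity has shrunk. Channel widths and depths are illustrative assumptions, not the paper's configuration.

    import torch
    import torch.nn as nn

    class HybridStereoEncoder(nn.Module):
        def __init__(self):
            super().__init__()
            self.fine = nn.Sequential(    # applied to each view independently
                nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            )
            self.coarse = nn.Sequential(  # applied to both views jointly
                nn.Conv2d(128, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            )

        def forward(self, left, right):
            fl, fr = self.fine(left), self.fine(right)      # per-view processing
            return self.coarse(torch.cat([fl, fr], dim=1))  # joint processing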
  • Publication
    LWGNet - Learned Wirtinger Gradients for Fourier Ptychographic Phase Retrieval (01-01-2022)
    Saha, Atreyee; Khan, Salman S.; Sehrawat, Sagar; Prabhu, Sanjana S.
    Fourier Ptychographic Microscopy (FPM) is an imaging procedure that overcomes the traditional limit on the Space-Bandwidth Product (SBP) of conventional microscopes through computational means. It utilizes multiple images captured using a low numerical aperture (NA) objective and enables high-resolution phase imaging through frequency-domain stitching. Existing FPM reconstruction methods can be broadly categorized into two approaches: iterative optimization-based methods, which are based on the physics of the forward imaging model, and data-driven methods, which commonly employ a feed-forward deep learning framework. We propose a hybrid model-driven residual network that combines knowledge of the forward imaging system with a deep data-driven network. Our proposed architecture, LWGNet, unrolls the traditional Wirtinger flow optimization algorithm into a novel neural network design that enhances the gradient images through complex convolutional blocks. Unlike other conventional unrolling techniques, LWGNet uses fewer stages while performing on par with or even better than existing traditional and deep learning techniques, particularly for low-cost and low-dynamic-range CMOS sensors. This improvement in performance for low-bit-depth and low-cost sensors has the potential to bring down the cost of an FPM imaging setup significantly. Finally, we show consistently improved performance on our collected real data (the code is available at https://github.com/at3e/LWGNet.git).
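    To make the unrolling concrete, here is a highly simplified sketch of one stage in PyTorch: the analytic Wirtinger gradient of an intensity data-fit term is computed and applied with a learned step size. In LWGNet the gradient images are further refined by complex convolutional blocks, which are omitted here; the generic operator A and the loss form are assumptions for illustration.

    import torch
    import torch.nn as nn

    def wirtinger_grad(x, A, y):
        # Data fit f(x) = || |Ax|^2 - y ||^2 with complex x.
        # Gradient wrt conj(x), up to a constant factor: A^H((|Ax|^2 - y) * Ax)
        Ax = A @ x
        return A.conj().T @ ((Ax.abs() ** 2 - y) * Ax)

    class UnrolledStage(nn.Module):
        def __init__(self):
            super().__init__()
            self.step = nn.Parameter(torch.tensor(0.1))  # learned step size

        def forward(self, x, A, y):
            return x - self.step * wirtinger_grad(x, A, y)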
  • Publication
    Harnessing Multi-View Perspective of Light Fields for Low-Light Imaging (01-01-2021)
    Lamba, Mohit; Rachavarapu, Kranthi Kumar
    Light field (LF) imaging offers unique advantages such as post-capture refocusing and depth estimation, but low-light conditions severely limit these capabilities. To restore low-light LFs, we should harness the geometric cues present in different LF views, which is not possible using single-frame low-light enhancement techniques. We propose a deep neural network, L3Fnet, for Low-Light Light Field (L3F) restoration, which not only performs visual enhancement of each LF view but also preserves the epipolar geometry across views. We achieve this by adopting a two-stage architecture for L3Fnet. Stage-I looks at all the LF views to encode the LF geometry. This encoded information is then used in Stage-II to reconstruct each LF view. To facilitate learning-based techniques for low-light LF imaging, we collected a comprehensive LF dataset of various scenes. For each scene, we captured four LFs: one with near-optimal exposure and ISO settings, and the others at different levels of low-light conditions varying from low to extremely low light. The effectiveness of the proposed L3Fnet is supported by both visual and numerical comparisons on this dataset. To further analyze the performance of low-light restoration methods, we also propose the L3F-wild dataset, which contains LFs captured late at night with almost zero lux values; no ground truth is available for this dataset. To perform well on the L3F-wild dataset, a method must adapt to the light level of the captured scene. To do this, we use a pre-processing block that makes L3Fnet robust to various degrees of low-light conditions. Lastly, we show that L3Fnet can also be used for low-light enhancement of single-frame images, despite being engineered for LF data. We do so by converting the single-frame DSLR image into a form suitable for L3Fnet, which we call a pseudo-LF. Our code and dataset are available for download at https://mohitlamba94.github.io/L3Fnet/
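    The two-stage split described above can be sketched as follows in PyTorch; the view count, channel widths, and layer depths are illustrative assumptions, not L3Fnet's actual configuration.

    import torch
    import torch.nn as nn

    class L3FSketch(nn.Module):
        def __init__(self, n_views=49):
            super().__init__()
            self.stage1 = nn.Sequential(  # Stage-I: all views stacked along channels
                nn.Conv2d(3 * n_views, 64, 3, padding=1), nn.ReLU(inplace=True))
            self.stage2 = nn.Sequential(  # Stage-II: one view + global LF encoding
                nn.Conv2d(3 + 64, 64, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(64, 3, 3, padding=1))

        def forward(self, views):         # views: (B, n_views, 3, H, W)
            B, V, C, H, W = views.shape
            geom = self.stage1(views.reshape(B, V * C, H, W))
            return torch.stack(
                [self.stage2(torch.cat([views[:, v], geom], dim=1)) for v in range(V)],
                dim=1)                    # restored views: (B, n_views, 3, H, W)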
  • Publication
    Multi-Patch Aggregation Models for Resampling Detection (01-05-2020)
    Lamba, Mohit
    Images captured nowadays are of varying dimensions, with smartphones and DSLRs allowing users to choose from a list of available image resolutions. It is therefore imperative for forensic algorithms such as resampling detection to scale well to images of varying dimensions. However, in our experiments we observed that many state-of-the-art forensic algorithms are sensitive to image size and their performance quickly degenerates when operated on images of diverse dimensions, despite re-training them using multiple image sizes. To handle this issue, we propose two novel deep neural networks: the Iterative Pooling Network (IPN), which does not assume any prior information about the original image size, and the Branched Network (BN), which uses this prior knowledge to produce better results. IPN adopts a novel iterative pooling strategy that converts tensors of multiple sizes to tensors of a fixed size, as required by deep learning models with fully connected layers. BN instead adopts a branched architecture with dedicated pathways for images of different sizes. The effectiveness of the proposed solution is demonstrated on two problems, resampling detection and photorealism detection, which are generally solved as independent problems with different deep learning models. The code is available at https://github.com/MohitLamba94/Iterative-Pooling.
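    The iterative pooling idea can be sketched in a few lines of PyTorch: pool repeatedly until the spatial size reaches a fixed target that a fully connected head can consume. The pooling kernel and target size below are assumptions for illustration.

    import torch
    import torch.nn.functional as F

    def iterative_pool(x, target=8):
        # x: (B, C, H, W) with arbitrary H, W (assumed >= target)
        while x.shape[-1] > target or x.shape[-2] > target:
            x = F.avg_pool2d(x, kernel_size=2, ceil_mode=True)  # halve until small
        return F.adaptive_avg_pool2d(x, target)  # snap exactly to (target, target)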
  • Publication
    Video reconstruction by spatio-temporal fusion of blurred-coded image pair (01-01-2020)
    Anupama, S.; Shedligeri, Prasan; Pal, Abhishek
    Learning-based methods have enabled the recovery of a video sequence from a single motion-blurred image or a single coded-exposure image. Recovering video from a single motion-blurred image is a severely ill-posed problem, and the recovered video usually has many artifacts. In addition, the direction of motion is lost, resulting in motion ambiguity. However, a blurred image has the advantage of fully preserving the information in the static parts of the scene. The traditional coded-exposure framework is better posed, but it samples at best 50% of the space-time volume. Here, we propose to use the complementary information present in the fully-exposed (blurred) image along with the coded-exposure image to recover a high-fidelity video without any motion ambiguity. Our framework consists of a shared encoder followed by an attention module that selectively combines the spatial information from the fully-exposed image with the temporal information from the coded image, which is then super-resolved to recover a non-ambiguous high-quality video. The input to our algorithm is a fully-exposed and coded image pair. Such an acquisition system already exists in the form of a coded-two-bucket (C2B) camera. We demonstrate that our proposed deep learning approach using the blurred-coded image pair produces much better results than those from just a blurred image or just a coded image.
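    A minimal PyTorch sketch of the attention-based fusion described above: a learned per-pixel gate mixes spatial detail from the blurred-image features with temporal detail from the coded-image features. The gate design and channel width are illustrative assumptions, not the paper's exact module.

    import torch
    import torch.nn as nn

    class AttentionFusion(nn.Module):
        def __init__(self, channels=64):
            super().__init__()
            self.gate = nn.Sequential(
                nn.Conv2d(2 * channels, channels, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, 3, padding=1), nn.Sigmoid())

        def forward(self, feat_blurred, feat_coded):
            # a in (0, 1): how much to take from the blurred-image features
            a = self.gate(torch.cat([feat_blurred, feat_coded], dim=1))
            return a * feat_blurred + (1 - a) * feat_coded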