Browsing Publication8 by Author "Abburi, Kiran Kumar"
- Publication: A scalable LDPC decoder on GPU (25-03-2011)
  Abburi, Kiran Kumar
  This paper presents a flexible and scalable approach to LDPC decoding on a CUDA-based Graphics Processing Unit (GPU). Layered decoding is a popular method for LDPC decoding and is known for its fast convergence. However, implementing the layered decoding algorithm efficiently on a GPU is challenging because the algorithm offers only a limited amount of data parallelism. To overcome this problem, a kernel execution configuration that decodes multiple codewords simultaneously on the GPU is developed. The paper proposes a compact data packing scheme that reduces the number of global memory accesses and a parity-check matrix representation that reduces constant memory latency. Global memory bandwidth efficiency is improved by coalescing the simultaneous memory accesses of the threads in a half-warp into a single memory transaction. Asynchronous data transfers hide host memory latency by overlapping kernel execution with data transfers between CPU and GPU. The proposed GPU implementation of the LDPC decoder runs two orders of magnitude faster than an LDPC decoder on a CPU and four times faster than the previously reported LDPC decoder on a GPU. It achieves a throughput of 160 Mbps, which is comparable to dedicated hardware solutions. © 2011 IEEE.
- Publication: Cell processor based LDPC encoder/decoder for WiMAX applications (23-05-2012)
  Abburi, Kiran Kumar
  The encoder and decoder are the two most important and complex components of a wireless transceiver. Traditionally, dedicated hardware solutions are used because these components rely on computationally intensive algorithms. This paper presents an alternative software-based solution that has several advantages over dedicated hardware solutions. LDPC codes are chosen for their excellent error-correcting performance, and the Cell processor is chosen for its tremendous computational power. The sparse and structural properties of LDPC codes are exploited to reduce computation and memory requirements. Several optimization techniques suited to the Cell processor architecture, such as multi-threading, vectorization, and loop unrolling, are used to improve performance. The proposed solution achieves a significant performance improvement over existing software and dedicated hardware solutions. © 2012 Springer India Pvt. Ltd.
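
Both abstracts above revolve around LDPC decoding, and the first names layered decoding and multi-codeword decoding specifically. The following is a minimal sketch of those two ideas, not the authors' implementation: it assumes a min-sum check-node update (the abstracts do not say which update rule is used), a toy 3x6 parity-check matrix invented for illustration, and hypothetical names (`H_ROWS`, `layered_min_sum`, `decode_batch`). The matrix is stored sparsely as lists of column indices, and each codeword in a batch is decoded independently, which is the property a GPU kernel could exploit by assigning codewords to separate groups of threads.

```python
# Hypothetical toy parity-check matrix, stored sparsely:
# each row lists the column indices of its nonzero entries.
H_ROWS = [
    [0, 1, 3],
    [1, 2, 4],
    [0, 2, 5],
]


def layered_min_sum(llr, rows=H_ROWS, max_iters=10):
    """Decode one codeword from channel LLRs (positive LLR means bit 0)."""
    post = list(llr)                        # posterior LLR per bit
    msgs = [[0.0] * len(r) for r in rows]   # check-to-variable messages
    bits = [1 if v < 0 else 0 for v in post]
    for _ in range(max_iters):
        for r, cols in enumerate(rows):     # one layer = one check row
            # Subtract this row's previous messages from the posteriors.
            ext = [post[c] - msgs[r][k] for k, c in enumerate(cols)]
            for k, c in enumerate(cols):
                others = ext[:k] + ext[k + 1:]
                sign = 1.0
                for v in others:
                    if v < 0:
                        sign = -sign
                # Min-sum rule: product of signs times minimum magnitude.
                msgs[r][k] = sign * min(abs(v) for v in others)
                # Layered schedule: later rows see this update immediately.
                post[c] = ext[k] + msgs[r][k]
        bits = [1 if v < 0 else 0 for v in post]
        # Stop early once every parity check is satisfied.
        if all(sum(bits[c] for c in cols) % 2 == 0 for cols in rows):
            break
    return bits


def decode_batch(llr_batch):
    """Decode several codewords; each one is independent of the others."""
    return [layered_min_sum(llr) for llr in llr_batch]


if __name__ == "__main__":
    batch = [
        [2.0, 2.0, -1.0, 2.0, 2.0, 2.0],    # all-zero codeword, bit 2 unreliable
        [-2.0, 2.0, 2.0, -2.0, 2.0, -2.0],  # clean codeword 1 0 0 1 0 1
    ]
    for decoded in decode_batch(batch):
        print(decoded)
```

Here the batch loop runs sequentially; the point of the sketch is only that the per-codeword work shares no state, so the limited parallelism inside one layered decode can be compensated by decoding many codewords at once.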