Veezhinathan Kamakoti
Preferred name
Veezhinathan Kamakoti
Official Name
Veezhinathan Kamakoti
Alternative Name
Veezhinathan, Kamakoti
Kamakoti, V.
Kamakoti, Veezhinathan
10 results
- Publication: MemMap-pd: Performance driven technology mapping algorithm for FPGAs with embedded memory blocks (01-01-2004)
Manimegalai, R.; Manoj Kumar, A.; Jayaram, B.
Modern Field Programmable Gate Arrays (FPGAs) include, in addition to look-up tables (LUTs), reasonably large configurable Embedded Memory Blocks (EMBs) to cater to the on-chip memory requirements of the systems and applications mapped onto them. While mapping applications onto such FPGAs, some of the EMBs may be left unused. This paper presents a methodology to utilize such unused EMBs as large look-up tables to map multi-output combinational sub-circuits of the application, with depth minimization as the main objective along with area minimization in terms of the number of LUTs used. It presents a new algorithm for technology mapping onto heterogeneous architectures containing LUTs and embedded memory blocks. For the first time, the concept of reconvergence is used in the field of FPGA mapping and is shown to be effective. The algorithm consists of four main stages, namely Pre-Processing, Reconvergence Analysis, Memory Mapping and LUT Mapping. Experimental results show that the proposed methodology, when employed on popular benchmark circuits, leads to up to 14% reduction in depth compared with DAG-Map, along with a comparable reduction in area.
Pre-Processing: In the first stage of the algorithm, the given circuit is converted into an equivalent two-input network. It has been shown that this conversion leads to better mapping of the circuit into LUTs by minimizing the overall depth of the decomposed circuit.
Reconvergence Analysis: In this stage, the circuit obtained from the pre-processing stage is analyzed for reconvergence, and overlapping reconvergent regions are identified for mapping into embedded memories.
Memory Mapping: A two-phase heuristic is used to select appropriate regions for memory mapping. In the first phase, the overlapping reconvergent regions that can be mapped to the memory blocks are expanded until they just satisfy the pin constraint imposed by the memory arrays. In the next phase, the best among the expanded regions are selected based on the potential depth reduction obtained by mapping the region onto embedded memory blocks.
LUT Mapping: This is the final stage of the algorithm, in which the residual circuit left after mapping onto memory blocks is mapped into LUTs. The DAG-Map algorithm is used to implement this mapping.
- Publication: A hardware-directed face recognition system based on local eigen-analysis with PCNN (01-12-2004)
Siva Sai Prasanna, C.; Sudha, N.
A new face recognition system based on eigenface analysis on segments of face images is discussed in this paper. The eigenfaces are extracted using principal component neural networks. The proposed recognition system can tolerate local variations in the face such as expression changes and directional lighting. Further, the system can be easily mapped onto the hardware. © Springer-Verlag Berlin Heidelberg 2004.
- Publication: Face recognition using weighted modular principle component analysis (01-12-2004)
Pavan Kumar, A.
A method of face recognition using weighted modular principal component analysis (WMPCA) is presented in this paper. The proposed methodology has a better recognition rate than conventional PCA for faces with large variations in expression and illumination. The face is divided into horizontal sub-regions such as the forehead, eyes, nose and mouth, and each sub-region is analyzed separately using PCA. The final decision is based on a weighted sum of the errors obtained from each sub-region. A method is proposed to calculate these weights, based on the assumption that different regions of a face vary at different rates with expression, pose and illumination. © Springer-Verlag Berlin Heidelberg 2004.
- Publication: VNF-DOC: A Dynamic Overload Controller for Virtualized Network Functions in Cloud (01-01-2020)
Murugasen, Sudhakar; Raman, Shankar
Network Function Virtualization (NFV) helps enterprises and service providers build reliable network services in a cost-effective way. Such network services are created by combining one or more Virtual Network Functions (VNFs) hosted in private or public cloud infrastructure. However, uncontrolled VNF overload is a major cause of network service failure in NFV. Overload conditions negatively impact throughput, and hence the resiliency requirements of NFV. The ability to detect and mitigate an overload quickly while ensuring high throughput under varying overload conditions is critical. Existing solutions are unable to meet these combined objectives in VNFs. In this paper, we propose a Dynamic Overload Controller for VNFs (VNF-DOC), which uses a VNF's current and predicted load in every sampling interval to decide on a mitigation action. It mitigates both transient and sustained overload by dynamically using cloud auto scaling, a Virtual Machine buffer pool, and traffic throttling. We evaluate our solution on an NFV-based IP multimedia system hosted in the AWS cloud environment. The results show that VNF-DOC mitigates high-capacity overload without any adverse side effects and achieves at least 94% throughput. VNF-DOC is robust in handling varying overload with negligible performance overhead.
- Publication: Toward optimal player weights in secure distributed protocols (01-01-2001)
Srinathan, K.
A secure threshold protocol for $n$ players tolerating an adversary structure $A$ is feasible iff $\max_{a \in A} |a| < n/c$, where $c = 2$ or $c = 3$ depending on whether the adversary is eavesdropping (passive) or Byzantine (active), respectively [1]. However, there are situations where a threshold protocol $\Pi$ for $n$ players tolerating $A$ is not feasible, but by letting each player $P_i$ act for a number of similar players, say $w_i$, a new secure threshold protocol $\Pi'$ tolerating $A$ can be devised. Note that the new protocol $\Pi'$ has $N = \sum_{i=1}^{n} w_i$ players and works with the same adversary structure $A$ used in $\Pi$. The integers $w_i$ are called weights, and we are interested in computing them so that (1) $\Pi'$ tolerates $A$ even if $\Pi$ does not, and (2) $N = \sum_{i=1}^{n} w_i$ is minimum. Since the best known secure threshold protocol over $N$ players has a communication complexity of $O(mN^2 \lg |F|)$ bits [9], where $m$ is the number of multiplication gates in the arithmetic circuit, over the finite field $F$, that describes the functionality of the protocol, it is evident that the weights assigned to the players have a direct influence on the complexity of the resulting secure weighted threshold protocol. In this work, we focus on computing the optimum $N$. We show that computing the optimum $N$ is NP-hard. Furthermore, we prove that this problem is inapproximable within (Formula Presented) for any $\varepsilon > 0$ (and hence inapproximable within $\Omega(\lg |A|)$), unless $\mathrm{NP} \subset \mathrm{DTIME}(n^{\log \log n})$, where $N^*$ is the optimum solution.
- Publication: LFSR based stream ciphers are vulnerable to power attacks (01-01-2007)
Burman, Sanjay; Mukhopadhyay, Debdeep
Linear Feedback Shift Registers (LFSRs) are used as building blocks for many stream ciphers, wherein an n-degree primitive connection polynomial is used as a feedback function to realize an n-bit LFSR. This paper shows that such LFSRs are susceptible to power-analysis-based Side Channel Attacks (SCA). The major contribution of this paper is the observation that the state of an n-bit LFSR can be determined by making O(n) power measurements. Interestingly, neither the primitive polynomial nor the value of n need be known to the adversary launching the proposed attack. The paper also proposes a simple countermeasure for the SCA that uses n additional flip-flops. © Springer-Verlag Berlin Heidelberg 2007.
- Publication: Reconstructing hardware transactional memory for workload optimized systems (30-09-2011)
Korgaonkar, Kunal; Jain, Prabhat; Tomar, Deepak; Garimella, Kashyap
Workload optimized systems consisting of a large number of general and special purpose cores, with support for shared memory programming, are slowly becoming prevalent. One of the major impediments to effective parallel programming on these systems is lock-based synchronization. An alternative synchronization solution called Transactional Memory (TM) is currently being explored. We observe that most of the TM design proposals in the literature are tailored to the constraints of general purpose computing platforms. Given that workload optimized systems utilize wider hardware design spaces and on-chip parallelism, we argue that Hardware Transactional Memory (HTM) can be a suitable implementation choice for these systems. We re-evaluate the criteria to be satisfied by an HTM and identify possible scope for relaxations in the context of workload optimized systems. Based on the relaxed criteria, we demonstrate the scope for building HTM design variants such that each variant caters to a specific workload requirement. We carry out suitable experiments to bring out the trade-offs between the design variants. Overall, we show how knowledge about the workload is extremely useful for making appropriate design choices in a workload optimized HTM. © 2011 Springer-Verlag.
- Publication: The colored sector search tree: A dynamic data structure for efficient high dimensional nearest-foreign-neighbor queries (01-01-1998)
Graf, T.; Janaki Latha, N. S.
In this paper we present a new data structure, the Colored Sector Search Tree (CSST), for solving the Nearest-Foreign-Neighbor Query Problem (NFNQP): given a set $S$ of $n$ colored points in $\mathbb{R}^D$, where $D \geq 2$ is a constant, and a subset $S' \subset S$ stored in a CSST, for any colored query point $q \in \mathbb{R}^D$ a nearest foreign neighbor in $S'$, i.e. a closest point with a different color, can be reported in $O(\log n (\log \log n)^{D-1})$ time w.r.t. a polyhedral distance function that is defined by a star-shaped polyhedron with $O(1)$ vertices; note that this includes the Minkowski metrics $d_1$ and $d_\infty$. It takes a preprocessing time of $O(n (\log n)^{D-1})$ to construct the CSST. Points from $S$ can be inserted into the set $S'$ and removed from $S'$ in $O(\log n (\log \log n)^{D-1})$ time. The CSST uses $O(n (\log n)^{D-1})$ space. We present an application of the data structure in the parallel simulation of solute transport in aquifer systems by particle tracking. Other applications may be found in GIS (geo information systems) and in CAD (computer aided design). To our knowledge the CSST is the first data structure to be reported for the NFNQP.
- Publication: Testable clock routing architecture for field programmable gate arrays (01-01-2003)
Kumar, L. Kalyan; Mupid, Amol J.; Ramani, Aditya S.
This paper describes an efficient methodology for testing dedicated clock lines in Field Programmable Gate Arrays (FPGAs). An H-tree based clocking architecture is proposed along with a test scheme. The H-tree architecture provides optimal clock skew characteristics and consumes at least 25% fewer routing resources than conventional clock routing schemes. A testing scheme is proposed that detects and locates faults in the clock lines by utilizing the partial reconfiguration capabilities of FPGAs through selective re-programming of the Complex Logic Blocks. © Springer-Verlag Berlin Heidelberg 2003.
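
The skew property mentioned in the abstract above can be illustrated with a small geometric sketch. This is not the architecture or test scheme from the paper; it is an idealized H-tree in which wire length stands in for clock delay (an assumption), and the names `h_tree_leaves` and `clock_skew` are hypothetical. It shows that every leaf of such a tree is reached over the same wire length from the root.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def h_tree_leaves(center: Point, half: float, levels: int,
                  length: float = 0.0) -> List[Tuple[Point, float]]:
    """Return (leaf position, wire length from the root) for an idealized H-tree.

    Each level draws one 'H' centred at `center`: a horizontal bar of
    half-width `half` and two uprights of half-height `half`. The four
    corners of the H become the centres of the next, half-sized level.
    """
    if levels == 0:
        return [(center, length)]
    cx, cy = center
    leaves: List[Tuple[Point, float]] = []
    for dx in (-half, half):
        for dy in (-half, half):
            # Wire from the H centre to a corner: `half` along the bar,
            # then `half` along the upright.
            leaves.extend(
                h_tree_leaves((cx + dx, cy + dy), half / 2, levels - 1,
                              length + 2 * half)
            )
    return leaves


def clock_skew(leaves: List[Tuple[Point, float]]) -> float:
    """Skew under the wire-length-as-delay assumption: longest minus shortest path."""
    lengths = [wire for _, wire in leaves]
    return max(lengths) - min(lengths)


if __name__ == "__main__":
    leaves = h_tree_leaves((0.0, 0.0), 4.0, 3)
    print(len(leaves), "leaves, skew =", clock_skew(leaves))
```

Because each recursion step adds the same length to all four branches, the skew in this model is zero by construction. Real clock networks also depend on buffer placement, load and process variation, which this sketch ignores.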