  • Publication
    Combining confidence score and Mal-rule filters for automatic creation of Bangla error corpus: Grammar checker perspective
    (21-03-2012)
    Kundu, Bibekananda
    ;
    ;
    Choudhury, Sanjay Kumar
    This paper describes a novel approach for the automatic creation of a Bangla error corpus for training and evaluation of grammar checker systems. The procedure begins with the automatic creation of a large number of erroneous sentences from a set of grammatically correct sentences. A statistical Confidence Score Filter has been implemented to select proper samples from the generated erroneous sentences such that sentences with less probable word sequences get lower confidence scores and vice versa. A rule-based Mal-rule filter with an HMM-based semi-supervised POS tagger has been used to collect the sentences having improper tag sequences. The combination of these two filters ensures the robustness of the proposed approach such that no valid construction is selected within the synthetically generated error corpus. Though the present work focuses on the most frequent grammatical errors in Bangla written text, a detailed taxonomy of grammatical errors in Bangla is also presented here, with the aim of increasing the coverage of the error corpus in future. The proposed approach is language independent and could be easily applied for creating similar corpora in other languages. © 2012 Springer-Verlag.
  • Publication
    Application of Supervised Machine Learning to Extract Brain Connectivity Information from Neuroscience Research Articles
    (01-12-2021)
    Sharma, Ashika
    ;
    Jayakumar, Jaikishan
    ;
    Mitra, Partha P.
    ;
    ;
    Kumar, P. Sreenivasa
    Understanding the complex connectivity structure of the brain is a major challenge in neuroscience. A vast and ever-expanding literature about neuronal connectivity between brain regions already exists in published research articles and databases. However, with the continuing growth in published articles and repositories, it becomes difficult for a neuroscientist to engage with the breadth and depth of any given field within neuroscience. Natural Language Processing (NLP) techniques can be used to mine ‘Brain Region Connectivity’ information from published articles to build a centralized connectivity resource, helping neuroscience researchers gain quick access to research findings. Manually curating and continuously updating such a resource involves significant time and effort. This paper presents an application of supervised machine learning algorithms that perform shallow and deep linguistic analysis of text to automatically extract connectivity between brain region mentions. Our proposed algorithms are evaluated using benchmark datasets collated from PubMed and our own dataset of full text articles annotated by a domain expert. We also present a comparison with state-of-the-art methods including BioBERT. The proposed methods achieve the best recall and F2 scores, negating the need for any domain-specific predefined linguistic patterns. Our paper presents a novel effort towards automatically generating interpretable patterns of connectivity for extracting connected brain region mentions from text and can be expanded to include any other domain-specific information.
  • Publication
    Exploiting the interplay among products for efficient recommendations
    (01-01-2019)
    Sekar, Anbarasu
    ;
    Recommender systems are built with the aim of reducing the cognitive load on the user. An efficient recommender system should ensure that a user spends minimal time in the process. Conversational Case-Based Recommender Systems (CCBR-RSs) depend on the feedback provided by the user to learn about the preferences of the user. Our goal is to use the feedback provided by the user effectively by exploiting the interplay among the products to build an efficient CCBR-RS. In this work, we propose two ways towards achieving that goal. In the first method, we utilize the higher-order similarity and trade-off relationship among the products to propagate the evidence obtained through user feedback. In our second method, we utilize the diversity among cases/products along with the similarity and trade-off relationship to make the best use of the feedback provided by the user.
  • Publication
    Thinking Fast and Slow: A CBR Perspective
    (01-01-2021)
    Kaurav, Srashti
    ;
    Ganesan, Devi
    ;
    Deepak, P.
    ;
    In a path-breaking work, Kahneman characterized human cognition as a result of two modes of operation, Fast Thinking and Slow Thinking. Fast thinking involves quick, intuitive decision making and slow thinking is deliberative conscious reasoning. In this paper, for the first time, we draw parallels between this dichotomous model of human cognition and decision making in Case-based Reasoning (CBR). We observe that fast thinking can be operationalized computationally as the fast decision making by a trained machine learning model, or a parsimonious CBR system that uses few attributes. On the other hand, a full-fledged CBR system may be seen as similar to the slow thinking process. We operationalize such computational models of fast and slow thinking and switching strategies, as Models 1 and 2. Further, we explore the adaptation process in CBR as a slow thinking manifestation, leading to Model 3. Through an extensive set of experiments on real-world datasets, we show that such realizations of fast and slow thinking are useful in practice, leading to improved accuracies in decision-making tasks.
  • Publication
    Spreading activation way of knowledge integration
    (01-01-2015)
    Shekhar, Shubhranshu
    ;
    ;
    Search and recommender systems benefit from effective integration of two different kinds of knowledge. The first is introspective knowledge, typically available in feature-theoretic representations of objects. The second is external knowledge, which could be obtained from how users rate (or annotate) items, or collaborate over a social network. This paper presents a spreading activation model that is aimed at a principled integration of these two sources of knowledge. In order to empirically evaluate our approach, we restrict the scope to text classification tasks, where we use the category knowledge of the labeled set of examples as an external knowledge source. Our experiments show significantly improved classification effectiveness on hard datasets, where feature-value representations, on their own, are inadequate in discriminating between classes.
  • Publication
    Competence guided model for casebase maintenance
    (01-01-2017)
    Mathew, Ditty
    ;
    A competence-guided casebase maintenance algorithm retains a case in the casebase if it is useful for solving many problems, and ensures that the casebase is highly competent. In this paper, we address the compositional adaptation process (of which single case adaptation is a special case) during casebase maintenance by proposing a case competence model, together with a measure called retention score that estimates the retention quality of a case. We also propose a revised algorithm based on the retention score to estimate the competent subset of a casebase. We used synthetic datasets to test the effectiveness of the competent subset obtained from the proposed model. We also applied this model in a tutoring application and analyzed the competent subset of concepts in tutoring resources. Empirical results show that the proposed model is effective and overcomes the limitation of the footprint-based competence model in compositional adaptation applications.
  • Publication
    Towards compiling textbooks from Wikipedia
    (01-01-2018)
    Mathew, Ditty
    ;
    In this paper, we explore challenges in compiling a pedagogic resource like a textbook on a given topic from relevant Wikipedia articles, and present an approach towards assisting humans in this task. We present an algorithm that attempts to suggest the textbook structure from Wikipedia based on a set of seed concepts (chapters) provided by the user. We also conceptualize a decision support system where users can interact with the proposed structure and the corresponding Wikipedia content to improve its pedagogic value. The proposed algorithm is implemented and evaluated against the outline of online textbooks on five different subjects. We also propose a measure to quantify the pedagogic value of the suggested textbook structure.
  • Publication
    An Optimal Case-Base Maintenance Method for Compositional Adaptation Applications
    (01-01-2019)
    Mathew, Ditty
    ;
    A case-base maintenance method aims at maintaining a compressed case-base that is useful for solving future problems effectively. In this paper, we propose an optimization formulation to arrive at a compressed case-base that can find a solution for the rest of the cases in the case-base when a compositional adaptation process is involved. The objective of the optimization problem is to minimize the footprint set size and maximize the quality of solutions that can be adapted from the footprint set. We empirically studied the proposed formulation on four different datasets, and the results show that the proposed model is effective and overcomes the limitation of the existing optimal footprint method in compositional adaptation applications.
  • Publication
    Flexible and dynamic compromises for effective recommendations
    (11-12-2013)
    Gupta, Saurabh
    ;
    Conversational Recommendation mimics the kind of dialog that takes place between a customer and a shopkeeper, involving multiple interactions where the user can give feedback at every interaction. Single Shot Retrieval, by contrast, corresponds to a scheme where the system retrieves a set of items in response to a user query in a single interaction. Compromise refers to a particular user preference which the recommender system failed to satisfy. But in the context of conversational systems, where the user's preferences keep on evolving as she interacts with the system, what constitutes a compromise for her also keeps on changing. Typically, in Single Shot Retrieval, the notion of compromise is characterized by the assignment of a particular feature to a particular dominance group such as MIB (higher value is better) or LIB (lower value is better), and this assignment remains true for all the users who use the system. In this paper, we propose a way to realize the notion of compromise in a conversational setting. Our approach, Flexi-Comp, introduces the notion of dynamically assigning a feature to two dominance groups simultaneously, which is then used to redefine the notion of compromise. We show experimentally that a utility function based on this notion of compromise outperforms the existing conversational recommenders in terms of recommendation efficiency. Copyright 2013 ACM.
  • Publication
    Parallels between linguistics and biology
    (01-01-2013)
    Tendulkar, Ashish Vijay
    ;
    In this paper, we take a fresh look at parallels between linguistics and biology. We expect that this new line of thinking will propel cross-fertilization of the two disciplines and open up new research avenues.