Recent News

  • [4 Dec, 2019] Our NeurIPS 2018 paper “On Controllable Sparse Alternatives to Softmax” received the IBM Research India Distinguished Paper Award for the year 2018.

  • [17 Sep, 2019] Our paper “Scalable Micro-planned Generation of Discourse from Structured Data” was accepted for publication in the Computational Linguistics journal, 2019. arXiv.

  • [28 Jul, 2019] We presented a tutorial titled “Storytelling from Structured Data and Knowledge Graphs: An NLG Perspective” at ACL 2019 in Florence, Italy. Details.

  • [14 May, 2019] Our paper “Unsupervised Neural Text Simplification” was accepted at ACL 2019. arXiv.

  • [5 Feb, 2019] Our tutorial proposal “Storytelling from Structured Data and Knowledge Graphs: An NLG Perspective” was accepted for ACL 2019. The tutorial will be held on July 28th, 2019 in Florence, Italy. More details coming soon!

  • [6 Dec, 2018] The poster for our work “On Controllable Sparse Alternatives to Softmax” was presented at the NeurIPS conference in Montreal, Canada.

  • [18 Oct, 2018] The pre-print version of our work “Unsupervised Neural Text Simplification” is available now on arXiv.

  • [5 Oct, 2018] The pre-print version of our work “Scalable Micro-planned Generation of Discourse from Structured Data” is available now on arXiv.

  • [5 Sep, 2018] Our paper “On Controllable Sparse Alternatives to Softmax” was accepted at NeurIPS 2018. Link.

  • [16 Jul, 2018] Our ACL 2017 paper “Diversity driven Attention Model for Query-based Abstractive Summarization” received the IBM Research India Distinguished Paper Award for the year 2017.

  • [20 Jun, 2018] IBM Debater participated in a live debate against human debaters.
    News coverage of the debate: (NYT, Forbes, The Guardian).
    Details on contributions: IBM Debater.

Publications

Scalable Micro-planned Generation of Discourse from Structured Data. In Computational Linguistics (CL), 2019.

Preprint PDF Code Dataset

Unsupervised Neural Text Simplification. In ACL, 2019.

Preprint PDF Code Dataset

On Controllable Sparse Alternatives to Softmax. In NeurIPS, 2018.

PDF Code Poster Slides Video

Generating Descriptions from Structured Data Using a Bifocal Attention Mechanism and Gated Orthogonalization. In NAACL-HLT, 2018.

PDF Code Dataset Slides

A Mixed Hierarchical Attention based Encoder-Decoder Approach for Standard Table Summarization. In NAACL-HLT (Short), 2018.

PDF Poster

Diversity driven Attention Model for Query-based Abstractive Summarization. In ACL, 2017.

PDF Code Dataset Slides Video

Joint Learning of Correlated Sequence Labelling Tasks Using Bidirectional Recurrent Neural Networks. In INTERSPEECH, 2017.

PDF Code Poster

Story Generation from Sequence of Independent Short Descriptions. In SIGKDD Workshop on Machine Learning for Creativity (ML4Creativity), 2017.

PDF

A Machine Learning Approach for Evaluating Creative Artifacts. In SIGKDD Workshop on Machine Learning for Creativity (ML4Creativity), 2017.

PDF Poster

Selected Talks

  • [28 Jul, 2019] ACL 2019 Tutorial: “Storytelling from Structured Data and Knowledge Graphs: An NLG Perspective”. Venue: ACL 2019, Florence, Italy. Details: Link

  • [07 Mar, 2018] Generating Natural Language Descriptions from Structured Data. Venue: IBM Research-IISc Workshop on Knowledge and Learning organised at IISc. Details: Link

  • [20 Jan, 2018] Tutorial on Natural Language Processing and Generation. Venue: Technical Talk on the broader theme of ‘Cognitive Analytics and NLP’ as part of the annual science and cultural fest at IISc, Bangalore.

Projects

  • Controllable Sparse Alternatives to Softmax: We study techniques for converting a real vector to a probability distribution. We propose a unified framework that helps explain existing approaches, including softmax. Our framework also provides explicit controls for varying the sparsity of the output distribution. Additionally, our proposed convex loss functions achieve greater sparsity in the multi-label classification task. We also show encouraging results for sparse attention in NLG tasks. NeurIPS ’18 Paper.

  • Structured Data to Text Generation: This work falls into two paradigms: (a) supervised end-to-end attention-based seq2seq approaches for summarization of tabular data (NAACL ’18 Paper 1, Paper 2), and (b) unsupervised generation of coherent descriptions from tabular data, in a way that is scalable, adaptable to newer domains, and does not require parallel data (CL ’19 Paper).

  • Query-based Abstractive Summarization: In this work, we propose a model for the query-based summarization task based on the encode-attend-decode paradigm, with two key additions: (i) a query attention model (in addition to the document attention model), which learns to focus on different portions of the query at different time steps instead of using a static representation of the query, and (ii) a new diversity-based attention model, which aims to alleviate the problem of repeating phrases in the summary. ACL ’17 Paper.

  • Sequence classification for claim and evidence detection in argument mining: We empirically evaluated multiple RNN and CNN architectures for bi-sequence classification and applied them to context-based claim and evidence detection. COLING ’16 Paper.

  • Punctuation and Case prediction for ASR output: Automatic Speech Recognition (ASR) engines are good at recognizing words from human speech; however, ASR output needs to be formatted with punctuation and case to be useful for downstream NLP tasks. In this work, we employed a multi-task approach to simultaneously produce punctuation and casing labels for a stream of words. This solution enabled mining of claims and evidence from human debate recordings. INTERSPEECH ’17 Paper.

  • Story generation from incoherent descriptions: We proposed the new task of generating a coherent story from disconnected text snippets. For this, we implemented an RNN-based encoder-decoder solution that produces a comprehensive, coherent story. Paper.

  • ML Framework for evaluating creativity: We identified important metrics for measuring creativity and proposed a regression-based learning framework for evaluating these metrics and combining them to produce a creativity score for an artifact. Paper.

  • Gene Prioritization from Heterogeneous Data Sources: In this work, we use graph-based learning-to-rank methods to learn a ranking of genes from each individual data source represented as a graph, and then apply rank aggregation methods to aggregate these rankings into a single ranking over the genes. MS Thesis.

  • Investigation of incentive-compatible scoring rules for prediction markets: We applied prediction markets in the IISc Prediction League, an IISc-wide web application based on the open-source Zocalo framework, to predict winners of the DLF Indian Premier League 2012. Project Report.
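
As a rough illustration of the first project above (converting a real vector to a probability distribution, with or without sparsity), the sketch below contrasts softmax with sparsemax, a well-known sparse alternative that projects the input onto the probability simplex. This is not the formulation proposed in the NeurIPS paper and it has no sparsity control knob; the function names are illustrative assumptions.

```python
import math


def softmax(z):
    """Dense mapping: every output entry is strictly positive."""
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(z):
    """Euclidean projection of z onto the probability simplex.

    Entries that fall below the threshold tau become exactly zero,
    so the output can be sparse, unlike softmax.
    """
    z_sorted = sorted(z, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, v in enumerate(z_sorted, start=1):
        cumsum += v
        # The support is the largest k satisfying this condition;
        # it always holds for k = 1, so tau is always set.
        if 1 + k * v > cumsum:
            tau = (cumsum - 1) / k
    return [max(v - tau, 0.0) for v in z]


if __name__ == "__main__":
    scores = [1.0, 0.8, 0.1]
    print("softmax  :", softmax(scores))
    print("sparsemax:", sparsemax(scores))
```

For the scores in the example, softmax assigns non-zero mass to all three entries, while sparsemax zeroes out the smallest one and splits the mass between the other two.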

Contact