The Homepage of Debopam Das

Debopam Das



Department of English and American Studies
Humboldt University of Berlin
Unter den Linden 6
10099 Berlin, Germany

Phone: +49 15214159339
Email: dasdebop@hu-berlin.de, ddas@sfu.ca

ABOUT ME

I am primarily a linguist, specializing in discourse analysis. A major part of my research concentrates on topics pertaining to the discourse structure of texts, such as discourse relations (relations between propositions or speech acts) and discourse signals (e.g., the discourse connective "if" for Condition relations). Presently, I work as a postdoc (Wissenschaftlicher Mitarbeiter) in the Department of English and American Studies, Humboldt University of Berlin, Germany. I completed my PhD in Linguistics in 2014 at Simon Fraser University (SFU). My PhD dissertation focused on how discourse relations are indicated by a wide variety of signals (such as lexical, semantic, syntactic and genre features, in addition to discourse connectives). The outcome of the project was a corpus of discourse signals, called the RST Signalling Corpus. In the past, I worked in areas such as sentiment analysis, discourse parsing and lexicography (bilingual and multilingual dictionaries). More recently, I have worked on developing NLP resources such as lexicons of discourse connectives in English and Bangla (an Indo-Aryan language). At present, my colleague Dr. Markus Egg and I are investigating the role of continuity or discontinuity (along dimensions like time, space, action, modality or speech act) in the interpretation of discourse relations.

Download my CV.

RESEARCH INTERESTS

DISCOURSE ANALYSIS: explicit and implicit discourse relations, signalling in discourse, discourse markers, continuity in discourse, discourse parsing, sentiment analysis

CORPUS LINGUISTICS: corpus annotation, corpus development

LEXICOGRAPHY: discourse connective lexicons, bilingual dictionaries, translation of technical terms

THEORETICAL FRAMEWORKS

Rhetorical Structure Theory, PDTB framework, Appraisal Theory, Systemic Functional Linguistics

PUBLICATIONS

DISSERTATION

Das, Debopam. (2014). Signalling of Coherence Relations in Discourse. Ph.D. dissertation. Simon Fraser University, Canada.

JOURNAL ARTICLES

Das, Debopam, & Egg, Markus. (in preparation). Continuity in discourse.

Egg, Markus, & Das, Debopam. (to appear). Multiple signals of coherence relations. Linguistics Vanguard (Special issue: Natural Language Conditionals and Conditional Reasoning).

Das, Debopam, & Taboada, Maite. (2019). Multiple signals of coherence relations. Discours, 24 (online).

Das, Debopam, & Taboada, Maite. (2018). RST Signalling Corpus: A corpus of signals of coherence relations. Language Resources & Evaluation, 52(1), 149-184.

Das, Debopam, & Taboada, Maite. (2018). Signalling of coherence relations in discourse, beyond discourse markers. Discourse Processes, 55(8), 743-770.

Trnavac, Radoslava, Das, Debopam, & Taboada, Maite. (2016). Coherence relations and evaluation. Corpora, 11(2), 169-190.

Taboada, Maite, & Das, Debopam. (2013). Annotation upon annotation: Adding signalling information to a corpus of discourse relations. Dialogue and Discourse, 4(2), 249-281.

CONFERENCE AND WORKSHOP PROCEEDINGS

Das, Debopam, Stede, Manfred, Ghosh, Soumya Sankar, & Chatterjee, Lahari. (2020). DiMLex-Bangla: A Lexicon of Bangla Discourse Connectives. In proceedings of LREC 2020. Online.

Das, Debopam. (2019). Nuclearity in RST and signals of coherence relations. In proceedings of the workshop on Discourse Relation Parsing and Treebanking 2019 (NAACL 2019), Minneapolis, USA.

Scheffler, Tatjana, Aktas, Berfin, Das, Debopam, & Stede, Manfred. (2019). Annotating Shallow Discourse Relations in Twitter Conversations. In proceedings of the workshop on Discourse Relation Parsing and Treebanking 2019 (NAACL 2019), Minneapolis, USA.

Das, Debopam, Scheffler, Tatjana, Bourgonje, Peter, & Stede, Manfred. (2018). Constructing a Lexicon of English Discourse Connectives. In proceedings of SIGDIAL 2018, Melbourne, Australia.

Das, Debopam. (2018). Discourse Segmentation in Bangla. In proceedings of the 4th workshop on Indian Language Data: Resources and Evaluation (LREC 2018), Miyazaki, Japan.

Das, Debopam, & Stede, Manfred. (2018). Developing the Bangla RST Discourse Treebank. In proceedings of LREC 2018, Miyazaki, Japan.

Das, Debopam, Taboada, Maite, & Stede, Manfred. (2017). The Good, the Bad, and the Disagreement: Complex ground truth in rhetorical structure analysis. In proceedings of the workshop on Recent Advances in RST and Related Formalisms (EMNLP 2017), Santiago de Compostela, Spain.

Das, Debopam, & Taboada, Maite. (2013). Explicit and Implicit Coherence Relations: A Corpus Study. In proceedings of the Canadian Linguistic Association (CLA) Conference, University of Victoria, Canada.

Das, Debopam. (2012). Investigating the Role of Discourse Markers in Signalling Coherence Relations: A Corpus Study. In proceedings of the 28th Northwest Linguistics Conference, University of Washington, Seattle, USA.

Das, Debopam. (2010). The Uses and Distribution of Non-progressive Verbs in Progressive Forms in English: A Corpus-based Study. In proceedings of the 26th Northwest Linguistics Conference, Simon Fraser University, Canada.

TECHNICAL REPORTS

Stede, Manfred, & Das, Debopam. (2018). Bangla RST Discourse Treebank: Annotation Guidelines. Manuscript. University of Potsdam, Germany.

Stede, Manfred, Taboada, Maite, & Das, Debopam. (2017). Annotation Guidelines for Rhetorical Structure. Manuscript. University of Potsdam, Germany and Simon Fraser University, Canada.

Das, Debopam. (2010). Computational Analysis of Text Sentiment: A Report on Extracting Contextual Information about the Occurrence of Discourse Markers. Technical Report, Computational Analysis of Text Sentiment, Simon Fraser University, Canada.

CONFERENCE AND WORKSHOP PRESENTATIONS

August 2021
Continuative and Contrastive Discourse Relations across Discourse Domains: Cognitive and Cross-Linguistic Approaches (workshop at the 54th Annual Meeting of the Societas Linguistica Europaea). Online. Das, D. & Egg, M. (2021) Continuity in discourse: A case study on discourse relations.

February 2021
The Semantics and Pragmatics of Conditional Connectives (workshop at the 43rd Annual Conference of the DGfS). Online. Egg, M. & Das, D. (2021) Signalling Conditional Relations.

September 2019
XPrag workshop on Contrasting Underspecification and Overspecification of Discourse Relations. ZAS Berlin, Germany. Das, D. & Egg, M. (2019) Caught in the middle with you: Between under- and overspecification of discourse relations.

June 2019
The Workshop on Discourse Relation Parsing and Treebanking 2019. Minneapolis, USA. Das, D. (2019) Nuclearity in RST and signals of coherence relations.

June 2019
The Workshop on Discourse Relation Parsing and Treebanking 2019. Minneapolis, USA. Das, D. (2019) Annotating Shallow Discourse Relations in Twitter Conversations.

May 2018
The workshop on Implicit and explicit marking of discourse relations: the comparison between causals and conditionals. Osnabrueck University, Germany. Das, D. (2018) Multiple Signals of Coherence Relations.

May 2018
The 4th Workshop on Indian Language Data: Resources and Evaluation (WILDRE-4). Miyazaki, Japan. Das, D. (2018) Discourse Segmentation in Bangla.

May 2018
LREC 2018. Miyazaki, Japan. Das, D. (2018) Developing the Bangla RST Discourse Treebank.

September 2017
The 6th Workshop on Recent Advances in RST and Related Formalisms. Santiago de Compostela, Spain. Das, D. (2017) The Good, the Bad, and the Disagreement: Complex ground truth in rhetorical structure analysis.

July 2013
2013 LACUS Conference. Brooklyn College, Brooklyn, New York, USA. Das, D. (2013) Signalling Subject Matter and Presentational Coherence Relations in Discourse: A Corpus Study.

June 2013
Canadian Linguistic Association (CLA) Conference. University of Victoria, Canada. Das, D. (2013) Explicit and Implicit Coherence Relations: A Corpus Study.

April 2012
2012 Northwest Linguistics Conference. University of Washington, Seattle, USA. Das, D. (2012) Investigating the Role of Discourse Markers in Signalling Coherence Relations: A Corpus Study.

May 2010
26th Northwest Linguistics Conference. Simon Fraser University, Canada. Das, D. (2010) The Uses and Distribution of Non-progressive Verbs in Progressive Forms in English: A Corpus-based Study.

October 2008
30th All India Conference of Linguists, Deccan College Post-Graduate & Research Institute, Pune, India. Das, D. (2008) Technical Terms and Vernacular: Some Notes on Linguistic Terminology in Bangla.

CORPORA

Das, Debopam, & Stede, Manfred (partially completed). The Bangla RST Discourse Treebank. University of Potsdam, Germany.

Das, Debopam, Taboada, Maite, & McFetridge, Paul. (2015). RST Signalling Corpus. LDC2015T10. Distributed through the Linguistic Data Consortium.

LEXICONS

Das, Debopam, Ghosh, Soumya Sankar, & Chatterjee, Lahari (2018). DiMLex-Bangla: A lexicon of Bangla discourse connectives. University of Potsdam, Germany.

Das, Debopam, Scheffler, Tatjana, Bourgonje, Peter, & Stede, Manfred (2018). DiMLex-Eng: A lexicon of English discourse connectives. University of Potsdam, Germany.

EDUCATION

2009 – 2014
Simon Fraser University, Burnaby, Canada
PhD in Linguistics (completed in August 2014)
Dissertation Topic: Signalling of Coherence Relations in Discourse

2004 – 2007
University of Calcutta, Kolkata, India
MA in Linguistics

2000 – 2004
University of Calcutta, Kolkata, India
BA in English Language and Literature

PROFESSIONAL EXPERIENCE

RESEARCH EXPERIENCE

November 2020 – Present
Postdoctoral researcher, Humboldt University of Berlin and University of Potsdam, Germany
Continuity in Discourse Relations
Project PIs: Dr. Debopam Das and Dr. Markus Egg
This project investigates the role of (dis)continuity in discourse relations (relations between propositions or speech acts, such as Condition or Claim-Argument). The notion of (dis)continuity in discourse occupies a central place in deictic shift theory. Discourse relations are considered either continuous (e.g., Continuation, Elaboration) or discontinuous (e.g., Contrast, Comparison), based on whether they preserve or shift deictic centres along dimensions such as spatio-temporal setting, topicalized referents or perspective. In our present work, we re-evaluate the definition of continuity in discourse relations, and examine Givón’s (1993) seven dimensions of deictic shifts (time, space, reference, action, perspective, modality and speech act). Our preliminary results, based on the analysis of Causal and Contrastive relations in the RST Discourse Treebank, show that (dis)continuity in coherence relations operates more as a multifaceted phenomenon than a categorical one. A relation can show evidence of discontinuity along certain dimensions but not necessarily along others (e.g., Contrast, otherwise deemed to be a discontinuous relation, exhibits referential continuity). Also, discourse relations show different degrees of (dis)continuity, and continuity functions more as a gradient phenomenon than a bipolar one. In the next phase of this work, we will investigate the influence of (dis)continuity on the signalling of discourse relations.

May 2017 – July 2018
Postdoctoral researcher, University of Potsdam, Germany
The Bangla RST Discourse Treebank
Project PIs: Dr. Debopam Das and Dr. Manfred Stede
This project aims to develop a corpus in Bangla (an Indo-Aryan language) annotated for coherence relations (according to RST) and relational signals. The corpus contains 266 texts, comprising 71,009 words, with an average of 267 words per text. The corpus represents the newspaper genre. The texts have been collected from Anandabazar Patrika, a popular Bangla daily published in India. The corpus started with the annotation of 16 texts, which were evaluated for agreement among the annotators. The present work includes annotation of the remaining 250 texts, representative of different sub-genres within the newspaper genre.

June 2017 – July 2018
Postdoctoral researcher, University of Potsdam, Germany
The Bangla Discourse Connective Lexicon
Project PIs: Dr. Debopam Das and Dr. Manfred Stede
This project develops a lexicon of discourse connectives for Bangla. Discourse connectives are lexical expressions that express a two-place relation and take abstract objects (propositions, events, states, or processes) as their arguments. We compile a list of over 100 Bangla connectives, and provide information on their syntactic categories, discourse semantics and non-connective uses (if any). The format follows the German connective lexicon DiMLex, which provides a crosslinguistically applicable XML schema.

September 2016 – September 2017
Postdoctoral researcher, University of Potsdam, Germany
Underspecification and RST
Project PI: Dr. Manfred Stede
This project examines disagreement in Rhetorical Structure Theory annotation, taking into account what we consider "legitimate" disagreements. In rhetorical analysis, as in many other pragmatic annotation tasks, a certain amount of disagreement is to be expected, and it is important to distinguish true mistakes from legitimate disagreements due to different possible interpretations of the structure and intention of a text. Using different sets of annotations in German and English, we present an analysis of such possible disagreements, and propose an underspecified representation that captures the disagreements.

September 2014 – August 2016
Research Assistant, Simon Fraser University, Canada
Discourse Parsing for Sentiment Extraction
Project Supervisor: Dr. Maite Taboada
This project investigates the relationship between coherence relations (relations between propositions) and appraisal. In particular, we examine the role of coherence relations in the interpretation of evaluative words. By combining Rhetorical Structure Theory and Appraisal Theory, we analyze how different types of coherence relations influence the evaluative content expressed by nouns, adjectives, adverbs and verbs found in the relational unit. We find that relations such as Concession, Elaboration, Evaluation, Evidence and Restatement most frequently intensify the polarity of opinion words. We also find that most opinion words (about 70 percent) are positioned in the nucleus.

September 2009 – August 2014
Research Assistant, Simon Fraser University, Canada
Signalling of Coherence Relations in Discourse
Project Supervisors: (The late) Dr. Paul McFetridge and Dr. Maite Taboada
This project (also my PhD project) investigates how coherence relations are signalled in discourse, and what signals are used to indicate them. A secondary goal of this study is to examine whether coherence relations are more frequently explicit or implicit in terms of the type of signalling involved. I conducted a corpus study, examining the RST Discourse Treebank, which includes a collection of 385 Wall Street Journal articles annotated for rhetorical (or coherence) relations. I examined every relation in that corpus, identified the signals for those relations, and finally added a new layer of annotation to include signalling information. Results from my corpus study show that the majority of relations (over 90%) in a discourse are signalled (sometimes by multiple signals), and also that the majority of signalled relations (over 80%) are indicated by signals other than discourse markers, such as lexical, semantic, syntactic and graphical features.

September 2009 – December 2012
Research Assistant, Simon Fraser University, Canada
Computational Analysis of Text Sentiment
Project Supervisor: Dr. Maite Taboada
The goal of this project is to develop a computational system for automatically extracting sentiment from any given text. Sentiment is characterized as positive or negative views expressed by the subjective content of a text (e.g., an opinion piece in a newspaper or a movie review). We hypothesize that, given a text, we can determine whether it contains sentiment or subjective content, and if it does, we can also determine the type of the sentiment (categorically positive or negative), based on the analysis of the discourse structure of the text. In this project, my contributions were related to developing resources for discourse parsing. Specifically, I conducted a corpus study in order to extract relevant linguistic signals (e.g., discourse markers) of coherence relations, and then formulated rules for identifying coherence relations in unseen texts based on the contextual information about the occurrence of those signals.

January 2009 – July 2009
Research Fellow, The Asiatic Society, India
A Modern Dictionary for Readers with Vernacular Different than Bengali
Project Supervisor: Dr. Pabitra Sarkar
This project developed a detailed encyclopedic bilingual dictionary (Bengali to English) in six volumes, with a view to facilitating understanding of the Bengali language by providing elaborate but precise information on Bengali words and their usages. In this project, I worked on entries dealing with biographical sketches of important personalities who made significant social, cultural and political contributions to Bengal and its people.

August 2007 – December 2008
Project Fellow, University of Calcutta, India
Defining Key Concepts in Linguistics: A Bilingual Approach with Text-Machine Interface
Project Supervisor: Dr. Krishna Bhattacharya
This project developed a precise and convenient bilingual dictionary (in Bengali and English) on Linguistics to cover common concepts and frequently used terms in that discipline, citing examples from Indian as well as foreign languages to illustrate concepts. In addition, it addressed the problems of standardizing linguistic terminology in Bengali. In this project, my contributions were related to (i) collecting, scrutinizing and justifying the English and Bengali entries (relevant linguistic key terms) for the dictionary, (ii) defining those entries in both English and Bengali, and (iii) citing appropriate examples from various languages to illustrate those linguistic concepts.

TEACHING EXPERIENCE

INSTRUCTOR (at Humboldt University of Berlin)
Discourse Structure and Processing (scheduled for Winter 2021-22)
This course provides an introduction to how discourse is structured, primarily through discourse coherence and discourse relations, and also to how discourse is processed by humans. Discourse is defined as language above the level of the sentence and also the use of language in context. In this course, we will learn about a wide range of phenomena included in the study of discourse structure and discourse processing from linguistic and psycholinguistic points of view. Students will read original and recent work in these areas. They will also examine different forms of discourse and analyze important aspects of discourse processing. The classes will take the form of seminars and include engaging discussions on significant topics in the field. The students will be evaluated based on their performance in a number of activities, such as writing summaries and reviews, and writing (and presenting) a short term paper.

Pragmatics (from Winter 2018-19 to Summer 2021) Syllabus

INSTRUCTOR (at the University of Potsdam)
Coherence Relations (Summer 2018 and Summer 2017) Syllabus
Human Discourse Processing (Summer 2018 and Summer 2017) Syllabus
Foundations of Linguistics (Winter 2017/2018 and Winter 2016/2017) Syllabus
Introduction to Discourse Analysis (Winter 2017/2018 and Winter 2016/2017) Syllabus

SESSIONAL INSTRUCTOR (at Simon Fraser University)
LING 100: Communication and Language (Summer 2016, Summer 2015 and Spring 2015) Syllabus
LING 160: Language, Culture and Society (Fall 2015) Syllabus

TEACHING ASSISTANT (at Simon Fraser University)
LING 482W: Discourse Analysis (Fall 2012)
LING 323: Morphology (Summer 2010)
LING 321: Phonology (Summer 2014, Summer 2011, Fall 2010)
LING 301W: Linguistic Argumentation (Spring 2013)
LING 222: Introduction to Syntax (Spring 2013, Fall 2011, Summer 2011, Spring 2011)
LING 221: Introduction to Phonetics and Phonology (Spring 2011, Fall 2010)
LING 220: Introduction to Linguistics (Fall 2014, Fall 2012)
LING 200: Introduction to the Description of English Grammar (Summer 2014, Summer 2013)
LING 110: The Wonders of Words (Spring 2014, Fall 2013, Summer 2012, Spring 2012, Fall 2011, Summer 2011, Spring 2011, Fall 2010, Summer 2010, Spring 2010)
LING 100: Communication and Language (Summer 2014, Spring 2014)

THESIS SUPERVISION

September 2017 – December 2017
Sebastian Golly (Bachelor’s thesis, Co-supervisor, University of Potsdam)

September 2017 – October 2017
Danny Belitz (Bachelor’s thesis, Co-supervisor, University of Potsdam)

HONORS

SCHOLARSHIPS & AWARDS

Spring 2014
Graduate Fellowship, Dean of Graduate Studies, Simon Fraser University

Spring 2014
Community Trust Endowment Fund Graduate Fellowship, Dean of Graduate Studies, Simon Fraser University

Fall 2013
Travel and Minor Research Award, Dean of Graduate Studies, Simon Fraser University

Summer 2013
Graduate Student Research Award (GSRA), Dean of Graduate Studies, Simon Fraser University

Spring 2013
President's PhD Scholarship, Dean of Graduate Studies, Simon Fraser University

Fall 2012
Graduate Fellowship, Dean of Graduate Studies, Simon Fraser University

Spring 2012
Graduate Fellowship, Dean of Graduate Studies, Simon Fraser University

Fall 2011
Community Trust Endowment Fund Graduate Fellowship, Dean of Graduate Studies, Simon Fraser University

Fall 2009
Community Trust Endowment Fund Graduate Fellowship, Dean of Graduate Studies, Simon Fraser University

PRIZES

2013
Winner of the Three Minute Thesis Competition in the Faculty of Arts and Social Sciences, Simon Fraser University

2013
People’s Choice Winner of the Three Minute Thesis Competition in the Faculty of Arts and Social Sciences, Simon Fraser University

PROFESSIONAL ACTIVITIES

SERVICE

November 2016 – November 2017
Member, Habilitation Colloquium Committee of Dr. Alexander Geyken, University of Potsdam

Spring 2014 – Summer 2014
Manager, Discourse Research Group, Department of Linguistics, Simon Fraser University

Spring 2012 – Summer 2014
Treasurer, Linguistics Graduate Studies Association, Simon Fraser University

Fall 2009 – Fall 2011
Member, Colloquium Committee, Department of Linguistics, Simon Fraser University

REVIEWING

Journals: Pragmatics, Dialogue and Discourse, Current Psychology, Discourse Processes, Corpus Linguistics and Linguistic Theory, Corpus Pragmatics, International Journal of Corpus Linguistics, Language Resources and Evaluation, Linguistics Vanguard, Semantic Web Journal

Conferences: EACL 2021 Conference, COLING 2020 Conference, STARSEM 2020 Conference, ACL 2020 Conference, LREC 2020 Conference, DISRPT (RST workshop 2019)

ORGANIZER

Organizer, Explicit and implicit relations: Different, but how exactly? Humboldt University of Berlin, Germany; January 17-18, 2020

Co-organizer, DISRPT - Discourse Relation Parsing and Treebanking (7th Workshop on Rhetorical Structure Theory and Related Formalisms) Minneapolis, USA; June 6, 2019

Abstract reviewer, 30th Northwest Linguistics Conference, SFU, Vancouver; April 26-27, 2014

Student volunteer, North American Computational Linguistics Olympiad (NACLO); 2012, 2013, 2014

Student volunteer, Western Conference on Linguistics, Simon Fraser University, Vancouver; November 18-20, 2011

WORKSHOPS

2018
Participant. Moderating Masterclass: Key skills in English for chairpersons, moderators and discussion leaders, Potsdam Graduate School, Potsdam. March, 2018.

2017-18
Participant. International Teaching Professional Program, Potsdam Graduate School, Potsdam. Aug, 2017 - Nov, 2018.

2013
Instructor. Practice Session for 2013 North American Computational Linguistics Olympiad (NACLO), Simon Fraser University. Jan 19, 2013.

2012
Participant. Workshops for instructors in writing courses, The Faculty of Arts and Social Sciences, Simon Fraser University. Oct 10 and Nov 6, 2012.

LANGUAGES

Bengali (Native)
English (Fluent)
Hindi (Fluent)
German (Beginner)
Spanish (Beginner)

REFERENCES

Available upon request