Surprisal and Interference Effects of Case Markers in Hindi Word Order

Faculty: Sumeet Agarwal


Based on the Production-Distribution-Comprehension (PDC) account of language processing, we formulate two distinct hypotheses about case marking, word order choices and processing in Hindi. Our first hypothesis is that Hindi tends to optimize for processing efficiency at both lexical and syntactic levels. We quantify the role of case markers in this process. When a machine learning model is used to predict the reference sentence occurring in a corpus (amidst meaning-equivalent grammatical variants), surprisal estimates from an artificial version of the language (i.e., Hindi without any case markers) result in lower prediction accuracy than estimates from natural Hindi. Our second hypothesis is that Hindi tends to minimize interference due to case markers while ordering preverbal constituents. We show that Hindi tends to avoid placing constituents whose heads are marked by identical case inflections next to each other. Our findings are consistent with PDC assumptions, and we discuss their implications for language production, learning and universals.
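The interference measure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the input representation (one case marker per preverbal constituent head, with None for an unmarked head) and the two example variants are assumptions made purely for illustration.

```python
from typing import Optional, Sequence


def adjacent_same_case_pairs(markers: Sequence[Optional[str]]) -> int:
    """Count adjacent preverbal constituents whose heads carry the same
    overt case marker. None stands for a head with no case marker and
    never counts as a match (an assumption of this sketch)."""
    return sum(
        1
        for left, right in zip(markers, markers[1:])
        if left is not None and left == right
    )


# Hypothetical orderings of the same three preverbal constituents,
# given as the case markers on their heads.
variant_a = ["ne", "ko", "ko"]  # the two ko-marked constituents are adjacent
variant_b = ["ko", "ne", "ko"]  # the ko-marked constituents are separated

print(adjacent_same_case_pairs(variant_a))
print(adjacent_same_case_pairs(variant_b))
```

Under the second hypothesis, an ordering like variant_b, with fewer adjacent identical markers, would be expected to be preferred over variant_a, other factors being equal.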

Ranjan, S., Rajkumar, R., & Agarwal, S. (2019, June). Surprisal and Interference Effects of Case Markers in Hindi Word Order. In Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics (pp. 30-42). Minneapolis, Minnesota: Association for Computational Linguistics.

PDF available at: