IIT Delhi researchers in a collaborative study have comprehensively analyzed the growth of Hinglish, a fusion language derived from monolingual Hindi and English, on social media, particularly ‘X’ (formerly Twitter). The study has found that Hinglish helps users engage more effectively with a broader audience, with Bollywood significantly boosting its popularity.
The study titled “Social, economic, and Demographic Factors Drive the Emergence of Hinglish Code-mixing on Social Media” has recently been published in Humanities and Social Sciences Communications under the prestigious Nature publication.
The researchers at the Laboratory of Computational Social Systems (LCS2) led by Prof. Tanmoy Chakraborty conducted a thorough statistical and econometric study covering 260,000 posts on ‘X’ by 17,000 users and showed that more than 34% of users prefer Hinglish to connect more effectively with their followers.
The study reveals that the Hinglish population has evolved steadily between 2014 and 2022, with an annualized growth rate of 1.2%, and the usage of Hinglish on ‘X’ has increased annually by 2%. This growth is driven by the desire for broader audience engagement and relatability, researchers say.
The researchers have also elaborated on Bollywood’s influence on Hinglish evolution, with frequent references to famous actors contributing to Hinglish’s spread.
The study highlights socio-economic factors, such as living standards and internet activity, as key drivers of Hinglish adoption.
“Considering these external factors, we developed an econometric model to predict Hinglish’s future evolution. This model helps understand the broader implications of socio-economic conditions on language usage,” said said Prof. Tanmoy Chakraborty, Electrical Engg., IIT Delhi.
The researchers also delved into the dynamics of language use, showing that not all Hindi words are equally likely to be mixed with English. The context of the conversation often changes how words are used, with political ‘X’ posts exhibiting the highest level of code-mixing, followed by Bollywood and sports.
This insight into the contextual nature of Hinglish provides a deeper understanding of how language evolves in different settings. The researchers further addressed the challenges that Hinglish presents for natural language processing (NLP) technologies. Traditional NLP models, designed for monolingual data, struggle with the complexities of code-mixed languages like Hinglish.
They underscored the need for updated models that can handle the evolving semantics of Hinglish, highlighting the necessity for advancements in NLP to accommodate new language trends.