Bharatiya Bhasha Diwas 2024

Bharatiya Bhasha Diwas will be observed on 11th December, 2024 at IIIT Bangalore. The focus will be on technologies in, for and through Indian languages. The event will include several technical talks and demonstrations focusing on Indian language technologies by eminent speakers from industry and academia.

Bharatiya Bhasha Diwas honours the Janma Jayanti (birth anniversary) of Mahakavi Subramania Bharati, celebrates India’s rich linguistic heritage, and nurtures multilingualism. With the growth of Artificial Intelligence (AI) across the globe, India is on the verge of a digital revolution that aims to bridge linguistic and regional gaps through tasks such as text generation, machine translation, question answering, voice recognition, and conversational AI. This one-day event aims to bring together enthusiasts from diverse backgrounds, including AI practitioners, linguists, and social scientists. The goal is to promote research and innovation, build an inclusive and collaborative community, and foster student engagement. We welcome students, researchers, practitioners, and anyone keen to contribute to the growth of technologies in Indian languages to join the event by registering here.

Time : 10:00 – 16:30 IST

Venue : In-person – R-103, Ramanujan, International Institute of Information Technology Bangalore (IIITB) 26/C, Hosur Rd, Electronics City Phase 1, Electronic City, Bengaluru, Karnataka 560100

Online – meeting link will be sent to registered attendees

Prof. Pushpak Bhattacharyya (Keynote talk)

Title: Low Resource Machine Translation of Indic Languages

Abstract: Indic languages provide a diverse and exciting panorama of linguistic phenomena. Translation among these languages (and English) involves several linguistic and resource challenges. In this talk, we discuss techniques for handling the challenges of low resources in Indic MT, along with their performance and analysis. The techniques include subwording, pivoting, phrase table injection, use of translationese, multilingual training, and post-editing, among others. The discussions are based on our work reported in top-quality conferences and journals.

Speaker Bio: Prof. Pushpak Bhattacharyya (http://www.cse.iitb.ac.in/~pb) is Bhagat Singh Rekhi Chair Professor of Computer Science and Engineering at IIT Bombay. He has done extensive research in Natural Language Processing and Machine Learning. Some of his noteworthy contributions are Sarcasm Metaphor Hyperbole Detection, IndoWordnet, Cognitive NLP, Low Resource MT, and Knowledge Graph-Deep Learning Synergy in Information Extraction and Question Answering. He has published more than 450 research papers (https://scholar.google.co.in/citations?user=vvg-pAkAAAAJ&hl=en, 17K+ citations and h-index 62 as on Oct 24), has authored/co-authored 8 books including a textbook on machine translation (2015) and one on NLP (2023), and has guided close to 400 students for their Ph.D., Masters, and Undergraduate thesis. Prof. Bhattacharyya has been a visiting researcher at MIT and a visiting faculty at Stanford. He is a Fellow of the National Academy of Engineering, an Abdul Kalam National Fellow, a Distinguished Alumnus of IIT Kharagpur, an ex-director of IIT Patna, and a past President of ACL (Association for Computational Linguistics).

Dr. Shakira Jabeen

Title: Multiple Languages of India, Multilingualism, Constitutional Guarantees and Preventing Language Death

Abstract: The talk aims to explain crucial concepts pertaining to social behaviour towards language/s. The age-old language scene of India is used as a diving board to focus on existing language issues. An effort is made to draw a distinction between the multilingualism of the West and Indian multilinguality. Analyzing the issue of language death, the talk focuses on ways to prevent the loss of languages. This framework is designed to address graduate and postgraduate students of technical and managerial streams.

Revendranath T

Title: LLMs for Indic Languages: Challenges & Opportunities

Abstract: The talk covers use cases of LLMs for Indic languages, challenges in developing solutions or products for Indic languages, and opportunities for business adaptation.

Speaker Bio: Revendra works as a Project Manager at Next Labs, a Research & Innovation vertical of Mphasis. He believes in the promise of AI and Quantum technologies in transforming human and business experiences. Revendra worked in IT services delivery and product development for 4 years and has 11 years of experience in research. He also has experience consulting for non-profit organisations and government agencies.

Schedule

Time | Speaker | Title
10:00 – 10:45 | Inauguration
10:45 – 11:00 | Break
11:00 – 12:00 | Prof. Pushpak Bhattacharyya (Keynote talk) | Low Resource Machine Translation of Indic Languages
12:00 – 13:00 | Dr. Shakira Jabeen | Multiple Languages of India, Multilingualism, Constitutional Guarantees and Preventing Language Death
13:00 – 14:00 | Lunch Break
14:00 – 15:00 | Revendranath T (Mphasis) | LLMs for Indic Languages: Challenges & Opportunities
15:00 – 16:00 | Niharikasri Parasa | Indic NLP: Progress, Gaps and Future Directions
16:00 – 16:15 | Closing remarks

WSL Research Workshop: 16 December 2024

The Web Science Lab (WSL), IIIT-B conducts a biannual research workshop in which WSL research scholars share the latest developments in their work and the insights gained from it. The event encourages interactive discussions on ongoing research problems and includes brainstorming sessions. This time, along with the full-day workshop, we will also host a fireside chat with experts on 16th December, 2024.

Schedule

S.No. | Time | Speaker | Title | Session Chair
1 | 10:00 – 10:20 | Dev Shinde | Rethinking Cross-Border Data Sharing for the Digital Age | Praseeda
2 | 10:20 – 10:40 | Bhoomika A P | Comparing Visual Scene Understanding Approaches for Diverse Road Conditions
3 | 10:40 – 11:00 | Rishita Patel | Multi-modal content generation for Navigated Learning: Text-to-image using Stable Diffusion
Break | 11:00 – 11:15 | BREAK
5 | 11:15 – 11:30 | Sachin and Meghana | IUDX Project Showcase | Rishita Patel
6 | 11:30 – 12:00 | Praseeda | Understanding and Representing Diverse Assimilation Patterns of Learning
7 | 12:00 – 12:30 | Asilata Karandikar | Building an Ontology for Categories of Consent Transactions on the Consent Matrix
Lunch Break | 12:30 – 2:00 | LUNCH BREAK
8 | 2:00 – 3:00 | Sridhar Mandyam | FIRESIDE CHAT. Topic: Navigating Careers in the Age of Artificial Intelligence. Compere: Prof. Srinath | Dev Shinde
Break | 3:00 – 3:10 | TEA BREAK
9 | 3:10 – 3:30 | Ashashree Sarma | Learning Intervention for Network Synchrony | Suhan Roy
10 | 3:30 – 3:50 | Sharath Srivatsa | Rural Colloquial Knowledge Management
11 | 3:50 – 4:10 | Suhan Roy | Markov Chain Modeling to predict the next activity of the learner
12 | 4:10 – 4:30 | Prof. Srinath & Prof. Sushree | Closing Remarks

To Join the Meet Online

WSL Research Workshop: 16 December 2024

10:00 – 16:30 (IST)

Microsoft Teams meeting

Meeting ID: 933 284 533 971
Passcode: JW3EU6

Meeting link: https://teams.live.com/meet/933284533971?p=yGKx9L3ngQftdUySZ3

Fireside chat

Topic : Navigating Careers in the Age of Artificial Intelligence

Register for Fireside chat here

From its early days as a concept to today’s sophisticated applications, the field of Artificial Intelligence (AI) is transforming the landscape of multiple industries including healthcare, technology, finance, marketing, agriculture, and education. AI is not only introducing extraordinary automation capabilities and efficiency but is also influencing the skills required in the workforce. The growing significance of AI has also led professionals to adapt to the changes brought by the new technology. As AI continues to evolve and integrate into our day-to-day lives, it is essential to shed light on the advancements, develop relevant skills, and address the challenges and opportunities presented by AI in the modern workforce.

Join us for an insightful fireside chat on “Navigating Careers in the Age of AI”, where our distinguished experts will discuss how AI has progressed, and how this advancement has redefined and affected our learning and career paths. This event is for students, professionals, or anyone who is simply interested in the impact of AI.

We look forward to welcoming you to an afternoon of learning, discussion, and preparation for lifelong learning.

Speakers

Sridhar Mandyam K

Sridhar is a Network Science researcher with experience of 30+ years as an IT/analytics professional in Research and Development in academics and industry. He is currently associated with Web Science Lab at IIIT-B as visiting faculty.

His current research is focused on models and approaches to study social learning and collective behavior in the world of social networks, and how businesses and other entities are seeking to reach and serve this vast virtual society. Research in these directions is aimed at developing an understanding of how network structure impacts opinion dynamics and the emergence of different types of group behaviors, and the possibilities for creation of solutions that yield economic or other benefits by engendering cooperative, collective choices. He has previously been with C-DAC, India’s national initiative in supercomputing, heading its systems software group. He has also been with IBM’s supercomputing division in the US, as part of the Technical Strategy and Architecture Group. He has also been an entrepreneur for over a decade, co-founding an R&D flavored analytics firm in the late ‘90s, which developed tools for identity data management.

Sridhar holds bachelor’s and master’s degrees in Physics from IIT Kharagpur and IIT Madras respectively, an M.Tech in Physical Engineering from the Indian Institute of Science (IISc), Bangalore, and a Ph.D. in the area of parallel computing from the Department of Electrical Engineering, IISc, Bangalore, India. He has also held several visiting positions at research establishments in India and overseas, including the Department of Electrical Engineering at Queen’s University Belfast, Northern Ireland, UK; the Department of Computer Science at the University of Texas at Austin, as an invited scholar under the Fulbright Program of the US; and the Center for Information-Enhanced Medicine (CiEMED), Institute of Systems Science, NUS, Singapore.

Abhijith Neerkaje

Abhijith is a Data Science Leader with over 20 years of experience leading high-performing teams across sectors including retail, semiconductor manufacturing, and energy. He is currently head of data science and analytics at Falabella India Pvt Ltd (a Latin American retailer). In his role, Abhijith builds products that leverage machine learning algorithms to enhance the capabilities and efficiency of Falabella’s marketplace. Prior to Falabella, Abhijith worked as a techno-functional manager at Target, Walmart Ecommerce and Sandisk. Abhijith received his bachelor’s degree in engineering from PESIT in Bangalore. He holds postgraduate degrees from the Indian Institute of Science, Bangalore and the Massachusetts Institute of Technology.

Ramya

Ramya is a product management professional with a background in engineering. She holds a Master’s degree in Technology (MTech) from IIITB and has built an extensive career in the tech industry. Ramya is currently with Lowe’s, a home improvement retail company, focusing on modernizing pricing and installation systems. Before that, she held various roles, including Head of Product at a startup specializing in air quality and pollen data services, as well as engineering positions at companies like Apple and Qualcomm.

Outside of her professional life, she enjoys solving puzzles and playing board games. Looking ahead, Ramya aspires to create and develop her own game, combining her love for technology and gaming.

Parichaya

The Parichaya project aims to capture indigenous oral traditional knowledge about sandalwood in rural communities and make it available through an interface that lets users interact with audio content, in order to support broader cultural awareness, decision-making, and cultivation practices, and to promote community involvement.

Objectives

Sandalwood plays an important role in Indian cultural, religious, and therapeutic practices. It is extremely important to capture relevant indigenous knowledge about the tree from the aging populations of rural communities, support preservation, and renew cultivation efforts. In addition, the verbal knowledge transferred through multiple generations in rural communities is largely uncodified. The project aims at shaping initial ontologies for a knowledge base about sandalwood.

The Parichaya application contains two interfaces:

  1. The first interface enables browsing the content using frequent keywords and their context words, which represent critical aspects of the information in the corpus and provide a good overview of the content.
  2. The second interface supports question answering: a user can post a question, get a summary answer, and listen to the audio content that answers the question.
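
As a rough illustration of the first interface, the sketch below shows one way frequent keywords and their context words could be derived from transcribed text. It is a minimal sketch, not the Parichaya implementation: the function names, the stop-word list, the window size, and the English sample sentences are all assumptions made for illustration (the actual corpus is colloquial audio content).

```python
from collections import Counter

# Hypothetical stop-word list; not taken from the Parichaya project.
STOP_WORDS = {"the", "a", "an", "of", "in", "is", "and", "to", "for", "on"}

# Made-up sample transcripts, used only to exercise the functions below.
SAMPLE_TRANSCRIPTS = [
    "Sandalwood grows slowly in dry soil.",
    "Farmers protect sandalwood saplings from grazing.",
    "Sandalwood oil is used in rituals.",
]


def tokenize(text):
    """Lowercase a transcript and keep alphabetic tokens that are not stop words."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in words if w.isalpha() and w not in STOP_WORDS]


def keyword_index(transcripts, top_k=3, window=2):
    """Map each of the top_k most frequent tokens to a Counter of nearby words."""
    docs = [tokenize(t) for t in transcripts]
    freq = Counter(tok for doc in docs for tok in doc)
    keywords = [w for w, _ in freq.most_common(top_k)]
    index = {k: Counter() for k in keywords}
    for doc in docs:
        for i, tok in enumerate(doc):
            if tok in index:
                # Context = tokens within `window` positions on either side.
                lo, hi = max(0, i - window), i + window + 1
                index[tok].update(w for w in doc[lo:hi] if w != tok)
    return index


# Build an index over the sample data, keeping only the single top keyword.
index = keyword_index(SAMPLE_TRANSCRIPTS, top_k=1)
```

A browsing view could then list each keyword alongside its most common context words, for example via `index["sandalwood"].most_common(5)`.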

Funding Agency

Mphasis F1 Foundation

Demos

Parichaya interface : http://103.156.19.244:33404/
(username: guest, password: guest123)

Parichaya demo video :

Publications

Sharath Srivatsa, Aparna M, Samarth P, Malavika V, and Srinath Srinivasa. 2025. Parichaya: Rural Colloquial Knowledge AI Interface. In Proceedings of the 8th Joint International Conference on Data Science & Management of Data (12th ACM IKDD CODS and 30th COMAD) (CODS-COMAD ’25). Association for Computing Machinery, New York, NY, USA. [to appear]

WSL Research Workshop May 2024

The Web Science Lab (WSL), IIIT-B conducts a biannual research workshop where research scholars share their knowledge and the latest developments in their work. The event includes interactive brainstorming sessions and encourages discussions that give a fresh perspective on the ongoing research problems.

Date : May 14, 2024

Venue: Hybrid (Web Science Lab, A-132 & Online)

Schedule:

Sl.No. | Speaker | Time | Title
1 | Praseeda | 10:30 – 10:50 | Representing individualistic assimilation patterns through learning map
2 | Pooja | 10:50 – 11:10 | Intervention Science for Sustainable Development
3 | Asilata | 11:10 – 11:30 | What Makes Consent Meaningful? Situating meaningful consent within a social contract framework for data privacy
Break
4 | Bhoomika | 11:40 – 12:00 | Video Based Event Detection and Captioning for Vehicular Traffic to aid Scenario Search
5 | Anurag | 12:00 – 12:20 | Eduembedd – Knowledge Graph Embedding for Education domain
6 | Aparna | 12:20 – 12:40 | Retrieval Augmented Generation using Community Knowledge Corpus
Lunch Break
7 | Balambiga | 2:00 – 2:20 | Policy-based Consent Management Service for open ended dissemination of data in Digital Public Infrastructures
8 | Rohith | 2:20 – 2:40 | Ownership and Information Flow Primitives for Digital Public Infrastructures
Break
9 | Apurva | 2:50 – 3:10 | Accessing Data Through the Lens of SDGs
10 | Sarvesh, Manavi | 3:10 – 3:30 | Dashboard for Learning Map
11 | Prof. Srinath & Prof. Sushree | 3:30 – 4:30 | Closing Remarks

Online attendees can join using the following link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_MTNhN2VjZjYtODdhMS00NWFiLTlkMzAtOTA3ZDgwZjJmNWI0%40thread.v2/0?context=%7b%22Tid%22%3a%2282a84c22-47b2-4612-b9f7-860f39eb9b12%22%2c%22Oid%22%3a%22c0cab96e-1626-4396-8188-c75dea19f8af%22%7d

Meeting ID: 457 818 670 744

Passcode: wfBifg

IndicNLP

The IndicNLP project focuses on building a knowledge management framework for oral community knowledge in the low-resource, colloquial Kannada language.

Background

Knowledge in rural communities is largely created, preserved, and transferred verbally, and it is limited. This information is valuable to these communities, and managing it and making it available digitally with state-of-the-art approaches enriches the awareness and collective knowledge of the people in these communities. The large amounts of data and information produced on the Internet are inaccessible to the population in these rural communities due to factors such as lack of infrastructure and connectivity, and limited literacy. Knowledge internal to rural communities is also neither conserved nor made available in any global Big Data information systems.

Artificial Intelligence (AI) technologies such as Automatic Speech Recognition (ASR) and Natural Language Processing (NLP) provide substantial assistance when vast quantities of data, like Big Data, are available to build solutions. For low-resource languages like Kannada and for rural colloquial dialects, publicly available corpora are significantly scarcer. Building state-of-the-art AI solutions is challenging in this context, and we address this problem in this work.

Knowledge management in rural communities requires a low-cost and efficient approach that social workers can use. Organizations such as Namma Halli Radio have collected an audio corpus of a few hours containing community interactions spoken in colloquial language. We propose an architecture for oral knowledge management for rural communities speaking colloquial Kannada using audio recordings.

Funding Agency

Mphasis F1 Foundation

Publications

Aparna, M., Srivatsa, S., Sai Madhavan, G., Dinesh, T.B., Srinivasa, S. (2024). AI-Based Assistance for Management of Oral Community Knowledge in Low-Resource and Colloquial Kannada Language. In: Sachdeva, S., Watanobe, Y. (eds) Big Data Analytics in Astronomy, Science, and Engineering. BDA 2023. Lecture Notes in Computer Science, vol 14516. Springer, Cham. https://doi.org/10.1007/978-3-031-58502-9_1

Sharath Srivatsa, Aparna M, Sai Madhavan G, and Srinath Srinivasa. 2024. Knowledge Management Framework Over Low Resource Indian Colloquial Language Audio Contents. In Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD) (CODS-COMAD ’24). Association for Computing Machinery, New York, NY, USA, 553–557. https://doi.org/10.1145/3632410.3632483 

Aparna M and Srinath Srinivasa. 2023. Active learning for Named Entity Recognition in Kannada. TechRxiv. Preprint. https://doi.org/10.36227/techrxiv.24580582.v1

Media Mentions

Demo

Graama-Kannada Audio Search webapp : http://103.156.19.244:33035/
(username: guest, password: guest123)

Graama-Kannada demo video:

People

Research Scholars

Project Students

  • Goutham U R
  • Ram Sai Koushik Polisetti
  • Sai Madhavan G
  • Kappagantula Lakshmi Abhigna
  • Manuj Malik
  • Debmalya Sen
  • Vikram Adithya C P
  • Venumula Sai Sumanth Reddy