WSL Research Workshop May 2024

The Web Science Lab (WSL), IIIT-B conducts a biannual research workshop where research scholars share their knowledge and the latest developments in their work. The event includes interactive brainstorming sessions and encourages discussions that give a fresh perspective on the ongoing research problems.

Date: May 14, 2024

Venue: Hybrid (Web Science Lab, A-132 & Online)

Schedule:

Sl. No. | Speaker | Time | Title
1 | Praseeda | 10:30 – 10:50 | Representing individualistic assimilation patterns through learning map
2 | Pooja | 10:50 – 11:10 | Intervention Science for Sustainable Development
3 | Asilata | 11:10 – 11:30 | What Makes Consent Meaningful? Situating meaningful consent within a social contract framework for data privacy
Break
4 | Bhoomika | 11:40 – 12:00 | Video Based Event Detection and Captioning for Vehicular Traffic to aid Scenario Search
5 | Anurag | 12:00 – 12:20 | Eduembedd – Knowledge Graph Embedding for Education domain
6 | Aparna | 12:20 – 12:40 | Retrieval Augmented Generation using Community Knowledge Corpus
Lunch Break
7 | Balambiga | 2:00 – 2:20 | Policy-based Consent Management Service for open ended dissemination of data in Digital Public Infrastructures
8 | Rohith | 2:20 – 2:40 | Ownership and Information Flow Primitives for Digital Public Infrastructures
Break
9 | Apurva | 2:50 – 3:10 | Accessing Data Through the Lens of SDGs
10 | Sarvesh, Manavi | 3:10 – 3:30 | Dashboard for Learning Map
11 | Prof. Srinath & Prof. Sushree | 3:30 – 4:30 | Closing Remarks

Online attendees can join using the following link:

https://teams.microsoft.com/l/meetup-join/19%3ameeting_MTNhN2VjZjYtODdhMS00NWFiLTlkMzAtOTA3ZDgwZjJmNWI0%40thread.v2/0?context=%7b%22Tid%22%3a%2282a84c22-47b2-4612-b9f7-860f39eb9b12%22%2c%22Oid%22%3a%22c0cab96e-1626-4396-8188-c75dea19f8af%22%7d

Meeting ID: 457 818 670 744

Passcode: wfBifg

WS4D 2024 Poster Challenge on “Digital Public Infrastructure Thinking”

The link to the WS4D workshop is here

Problem Statement

Students, universities, examination boards, and employers need access to the academic transcripts of students who are applying for higher education or employment, or transferring from one university to another. Such documents can be easily stored and shared online. However, sharing them online as PDFs or similar files raises three problems: verifying authenticity, enabling privacy-preserving access, and preventing fraudulent or double use of transcripts.

Consider an example. After completing a 4-year degree program, a student graduates from a university and receives a degree and an academic transcript showing their performance. The student is now the legitimate owner of these documents. However, this ownership comes with a clause of immutability: the student cannot unilaterally change the contents of the certificate and transcripts. The university, as the issuer, is empowered to alter these documents (and in some cases, even rescind certificates issued to students). However, the university is not the owner of these documents and cannot consent to their use by a third party without the knowledge and consent of the student. Similarly, an employer who uses these documents to grant employment to the student may have a clause that the same documents may not be used to gain alternate employment while the current one is active. The employer may sometimes need to share these documents further with potential clients and/or regulators. However, this cannot be done without the knowledge and informed consent of the owner of these documents.

A DPI architecture needs to provide underlying mechanisms that reliably support nuances of ownership and entitlement such as those above. Privileges, obligations and liabilities should be made known to each stakeholder separately, and enforced by the architecture. Such an architecture should ideally be decentralized, and should not empower a single entity to enforce these semantics, since that entity would become a single point of failure and compromise.

We invite you to design a solution using building blocks of the DPI architecture for managing, verifying, and sharing academic transcripts of students with other universities, companies, examination boards, etc. This solution should enable exam boards or universities to make available academic transcripts which students can share with other universities or companies at the time of applying for higher studies or employment. The solution should prevent fraudulent use as well as double use of documents, e.g. if a student has secured admission in a university, the student cannot, while holding on to the seat, apply to another university for an equivalent degree course.
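
To make the requirements above concrete, the following is a minimal sketch in Python, not a reference solution. It assumes a hypothetical `TranscriptRegistry` with illustrative method names (`issue`, `grant_consent`, `verify`, `claim`, `release`, `revoke`), none of which come from any DPI specification. It uses a SHA-256 fingerprint to detect altered copies, lets only the owner consent to sharing, lets only the issuer rescind, and blocks a second party from using the same transcript for the same purpose while an earlier claim is active. For brevity it is a single in-memory object, so it does not meet the decentralization goal stated above.

```python
import hashlib
from dataclasses import dataclass, field


def fingerprint(document: bytes) -> str:
    """Return a SHA-256 digest used to detect any change to the document."""
    return hashlib.sha256(document).hexdigest()


@dataclass
class TranscriptRecord:
    issuer: str                                         # party that issued the document
    owner: str                                          # student who owns the document
    digest: str                                         # fingerprint of the issued copy
    revoked: bool = False                               # set when the issuer rescinds it
    consented: set = field(default_factory=set)         # parties the owner has allowed
    active_claims: dict = field(default_factory=dict)   # purpose -> current holder


class TranscriptRegistry:
    """Hypothetical in-memory registry (assumption: a real DPI would be decentralized)."""

    def __init__(self):
        self._records = {}

    def issue(self, record_id, issuer, owner, document):
        self._records[record_id] = TranscriptRecord(issuer, owner, fingerprint(document))

    def grant_consent(self, record_id, actor, party):
        record = self._records[record_id]
        if actor != record.owner:
            raise PermissionError("only the owner can consent to sharing")
        record.consented.add(party)

    def revoke(self, record_id, actor):
        record = self._records[record_id]
        if actor != record.issuer:
            raise PermissionError("only the issuer can rescind a document")
        record.revoked = True

    def verify(self, record_id, party, document):
        """True only if the copy is authentic, not rescinded, and shared with consent."""
        record = self._records[record_id]
        return (
            not record.revoked
            and party in record.consented
            and record.digest == fingerprint(document)
        )

    def claim(self, record_id, party, purpose, document):
        """Record that a party is using the transcript for a purpose; blocks double use."""
        record = self._records[record_id]
        if not self.verify(record_id, party, document):
            raise PermissionError("document not verifiable for this party")
        holder = record.active_claims.get(purpose)
        if holder is not None and holder != party:
            raise ValueError(f"already in use for {purpose!r} by {holder}")
        record.active_claims[purpose] = party

    def release(self, record_id, party, purpose):
        """Free the purpose, e.g. when the student gives up a seat."""
        record = self._records[record_id]
        if record.active_claims.get(purpose) == party:
            del record.active_claims[purpose]
```

In this sketch, once one university has claimed the transcript for a purpose such as "admission", a claim by a second university for the same purpose raises `ValueError` until the first calls `release`. A poster submission would need to replace the single registry with federated components and address privacy of the stored metadata.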

Authors of the selected posters will have an opportunity to present their ideas on the day of the workshop.

More information on Digital Public Infrastructures:

Digital Public Infrastructures (DPIs) enable countries, communities and businesses to achieve societal outcomes at scale using open and interoperable technologies to promote innovation and inclusion. They employ the following architectural principles to achieve this:

DPI Tech Architecture Principles:

Interoperability – Prevent information/data silos, monopolization, and walled gardens by publishing protocols, standards and specifications for the ecosystem to adopt and comply with.

Minimalist & Reusable Building Blocks – A full-solution approach assumes that one solution fits all, whereas a DPI approach uses minimalist components that each perform one function well, to catalyze combinatorial innovation and user-centric solutions.

Diverse, Inclusive Innovation – Promote inclusion through the use of open APIs and other standards.

Federated & Decentralized by Design – Avoid centralisation and over-aggregation of information by using solutions such as wrapper APIs across disparate systems.

Security & Privacy by Design – Build a ‘trust no one’ architecture that operates on optimal ignorance: each system should know as little as possible.

Please refer to this website to familiarize yourself with DPI thinking and how you can build your own DPIs.

Workshop on Web Science for Development (WS4D) 2024

REGISTRATION*

Call for POSTER CHALLENGE: WS4D DPI thinking competition!

The Web Science for Development (WS4D 2024) workshop is part of the Web Science research initiative at IIIT Bangalore. WS4D, started in 2019, brings together professionals from several domains, addressing different thematic concerns pertaining to the use of web and mobile technologies in developmental efforts. The theme of this year’s WS4D workshop is Digital Public Infrastructures. The workshop is part of a week-long DPI Conclave, with the next independent event following from 11 June. Details are given here.

WS4D 2024 is organised as a one-day event on June 10, 2024. The workshop features invited talks by visiting researchers from the Centre of Data for Public Good, IISc Bangalore, and invitees from other parts of India. The workshop will also include a DPI thinking poster presentation competition. Interactive sessions and panel discussions are also planned.

The workshop aims to foster a community of practitioners, researchers, entrepreneurs and students to jointly address socially relevant opportunities and challenges arising from web and mobile technologies.

Tentative Schedule

Time | Event | Speaker
9:00 – 9:30 | Tea and Registration | NA
9:30 – 9:45 | Inauguration and Address by Director, IIIT-B and Dean (Academics), IIIT-B | Prof. Debabrata Das, Prof. Chandrashekar Ramanathan
9:45 – 10:15 | DPI and the Four Internets | Srinath Srinivasa, IIITB
10:15 – 10:45 | Digital Twin for City-Scale Mobility Management and Planning | Raghuram Krishnapuram, CDPG, IISc
10:45 – 11:15 | Privacy-Preservation ID Verification using Third Party Cloud Services and FHE | Srinivas Vivek, IIITB
11:15 – 11:30 | Tea Break |
11:30 – 12:00 | Driving the Digital Public Infrastructure growth story with Data Exchanges | Jyotirmoy Dutta, CDPG, IISc
12:00 – 12:30 | Harnessing IoT, Big Data, and Cloud Computing for Smart City Innovation | Suresh Kumar, CDPG, IISc
12:30 – 13:00 | | Dr. Mukund Raj, NABARD
13:00 – 14:00 | Lunch Break |
14:00 – 14:30 | Lightning Presentations |
14:30 – 15:00 | Managing privacy in digital applications for public health | Srikanth T K, IIITB
15:00 – 15:15 | Tea Break |
15:15 – 15:45 | Building an Inclusive ID Systems | Sasikumar Ganesan, MOSIP
15:45 – 16:30 | Poster Presentations |
16:30 – 16:45 | Closing Remarks and Vote of Thanks, Certificate Distribution | NA
16:45 – 17:00 | Networking and High Tea | NA

Registration* Details

  • There is no registration fee.
  • The total number of participants is capped at 30, and registrations will be approved on a first-come, first-served basis.
  • The workshop will be in hybrid mode. You can choose to attend either in person or online.
  • No accommodation is available for outstation participants.

Talk and Speaker Details

1. Title: DPI and the Four Internets

Abstract: One of the fundamental challenges in DPI design is to resolve issues of ownership. No technological solutions are possible until we have a sound understanding of what ownership entails. In this talk, we will look into how the term ownership is interpreted in different contexts, and the implications of these interpretations for public infrastructure design. When it comes to the Internet itself, there are at least four different ways in which ownership of the information space is addressed. This is exemplified by the “Four Internets” model proposed by O’Hara and Hall. In this talk, we will also address the Four Internets paradigm and its implications for DPI design.

Srinath Srinivasa heads the Web Science lab and is the Dean (R&D) at the International Institute of Information Technology – Bangalore (IIITB), India. Srinath holds a Ph.D. (magna cum laude) from the Berlin Brandenburg Graduate School for Distributed Information Systems (GkVI), Germany, an M.S. (by Research) from the Indian Institute of Technology – Madras (IITM), and a B.E. in Computer Science and Engineering from The National Institute of Engineering (NIE), Mysore.

His research interests are in the area of Web Science: understanding how the WWW is affecting humanity, and how the web can enable social empowerment and capability building. Srinath has participated in several initiatives for technology-enhanced education, including the Edusat program by the Visvesvaraya Technological University, the National Programme for Technology Enhanced Learning (NPTEL), a Switzerland-based online MBA school called Educatis, and IIITB’s educational outreach program with Upgrad. He has served on various technical and organizational committees for international conferences such as the International Conference on Weblogs and Social Media (ICWSM), ACM Hypertext, the International Conference on Management of Data and Data Science (COMAD/CoDS), the International Conference on Ontologies, Databases and Applications of Semantics (ODBASE), the International Conference on Big Data Analytics (BDA), and ACM Web Science. As part of academic community outreach, Srinath has served on the Board of Studies of Goa University and as a member of the Academic Council of the National Institute of Engineering, Mysore. He has served as a technical reviewer for various journals, including the VLDB Journal, IEEE Transactions on Knowledge and Data Engineering, and IEEE Transactions on Cloud Computing. He has also served as an Associate Editor of the journal Sadhana from the Indian Academy of Sciences. He is the recipient of various national and international grants and awards from foundations and companies such as EU Horizon 2020, UK Royal Academy of Engineering, Research Councils UK, MEITy, DST, Siemens, Intel, Mphasis, EMC and Gooru. Currently, Srinath also heads the AI initiative for the “Karnataka Data Lake” project by the Planning Dept of the Govt of Karnataka, to promote data and evidence-based planning and decision-making.

2. Title: Digital Twin for City-Scale Mobility Management and Planning

Abstract: Urban mobility has become a major challenge in many Indian cities. Bangalore has been ranked the most congested city in India in terms of traffic for several years in a row. Digital twins can provide a way to improve urban mobility by enabling what-if analyses (e.g. making a street “one way”) and by helping city planners to make better infrastructure decisions (such as adding a new metro line). However, digital twins require city-scale traffic models which are based on a good understanding of the mobility patterns and mode choices of the citizens. Recent advances in AI indicate that with sufficient sensor data about traffic volumes, speeds and other information about the road network, it is possible to build city-scale mobility models to achieve congestion mitigation and transport optimization. Such AI-driven digital twin models leverage computer vision, graph neural networks, transformers and agent-based simulations. This talk presents an overview of a digital twin solution for urban mobility based on the IUDX (India Urban Data Exchange) platform. IUDX is a MoHUA sponsored project as part of the Smarter Cities Initiative of the Government of India. The talk also describes our initial efforts in building a traffic model for Bangalore in collaboration with the Bangalore Traffic Police.

Speaker Bio: Raghu Krishnapuram is a Senior Scientist at IUDX (India Urban Data Exchange), FSID, Indian Institute of Science. His work experience spans both academia and industry across continents over almost four decades. Raghu is an alumnus of IIT-Bombay and received his PhD from Carnegie Mellon University in 1987. He worked in academia in the US until the year 2000. Between 2000 and 2015, he held various technical leadership positions at IBM Research India and IBM T J Watson Centre, NY, USA, where he led projects in the area of ‘Knowledge, Information, and Smarter Planet Solutions’ and ‘Cognitive Computing,’ with a particular focus on emerging markets. Raghu was with Xerox Research Centre – India, during 2015-16. Most recently, he was with the Robert Bosch Centre for Cyber-Physical Systems, Indian Institute of Science, Bangalore, and ARTPRAK, IISc, Bangalore.

Raghu’s research encompasses many aspects of machine learning, computer vision, text analytics, artificial intelligence, and data mining. Many of his publications have a very high citation count, with the overall count exceeding 15,000. Raghu is a recipient of many best paper awards, including the IEEE Neural Network Council Best Paper. He has been recognized as a Master Inventor by IBM and has filed over 40 patent disclosures at the US Patent Office. Raghu is a Fellow of IEEE and the Indian National Academy of Engineering (INAE).


3. Title: Practical Privacy-Preserving Identity Verification using Third-Party Cloud Services and FHE

Abstract: National digital identity verification systems have played a critical role in the effective distribution of goods and services, particularly, in developing countries. Due to the cost involved in deploying and maintaining such systems, combined with the lack of in-house technical expertise, governments seek to outsource this service to third-party cloud service providers to the extent possible. This leads to increased concerns regarding the privacy of users’ personal data. In this work, we propose a practical privacy-preserving digital identity verification protocol where the third-party cloud services process the identity data encrypted using a (single-key) Fully Homomorphic Encryption (FHE) scheme. Though the role of a trusted entity such as government is not completely eliminated, our protocol significantly reduces the computation load on such parties. We implement our protocol using the Microsoft SEAL FHE library and demonstrate that secure demographic and biometric matching queries and age comparisons can be efficiently performed on batched FHE ciphertexts.

Speaker Bio: Dr. Srinivas Vivek is currently the Infosys Foundation Career Development chair professor at IIIT Bangalore. Previously, he was a postdoctoral researcher in the Cryptography group at the University of Bristol. He obtained his doctoral degree from the University of Luxembourg, his master’s degree from IISc, and his bachelor’s degree from NITK Surathkal.

He has served as a member of the editorial board/PC of IACR Trans. CHES, CARDIS, AsianHOST, Indocrypt, and other venues. He is also a recipient of the DST INSPIRE faculty award from the Govt. of India. His research is focused on the design, analysis, and implementation of countermeasures against side-channel attacks, and on homomorphic encryption schemes and their applications.

4. Title: Driving the Digital Public Infrastructure Growth Story with Data Exchanges

Abstract: India has witnessed an unprecedented digital revolution through the ‘India Stack’ that is unlocking the economic primitives of identity and payments at population scale through Aadhaar and UPI. The third and final piece of the stack is establishing a new governance model through easy access and exchange of data.

The real power of data is realized when data flows seamlessly from data providers to data consumers and data silos are broken. How data is managed, exchanged and used is crucial to successfully resolving complex problems in domains such as agriculture, urban, geospatial and healthcare. A data exchange is a platform that acts as an enabler to solve these problems and build innovative solutions. When developed for public good, data exchange platforms are ideally open source, based on standardized data models, and have robust security, privacy and accounting mechanisms that facilitate their easy adoption across the digital ecosystem.

This talk will focus on the idea of Digital Public Infrastructures, and Data Exchanges in particular, and how they are a critical tool for the digital transformation of a society. I will discuss the benefits of seamless flow of data from providers to consumers with some use cases, and how business cases could be developed around these.

Speaker Bio: Jyotirmoy has close to twenty years of experience holding key positions in academia, industry, and research, including six years in digital transformation. He has taught at institutes of national importance, contributed to research projects with international collaborators, contributed to technical standards and policy frameworks, and consulted for startups in the emerging technology space.

At present he works as a Senior Scientist at the Centre of Data for Public Good, FSID, Indian Institute of Science Bengaluru. Working at the intersection of technology and policy, he contributes to large-scale digital public infrastructures in the Agriculture, Urban, and Geospatial domains. He holds a PhD in Microsystems from the Shiv Nadar Institution of Eminence, Delhi-NCR, with several peer-reviewed publications and conference presentations.


5. Title: Harnessing IoT, Big Data, and Cloud Computing for Smart City Innovation

Abstract: This talk presents typical applications in the Smart City domain and the crucial roles of IoT, big data, and cloud computing in realizing them. It explores the types of data generated by these applications and the role of data exchange in fostering open innovation and developing AI/ML applications. Additionally, it discusses the data-driven applications being built and their extension to domains like agriculture, e-governance, and more at the state and national levels.

Speaker Bio: Suresh has over three decades of experience in defense, telecom, automotive, semiconductor, IoT, cloud computing, and smart city domains. He has worked across startups, government agencies, and European and American organizations, building many award-winning teams, products, and businesses.

Suresh is currently working at the Indian Institute of Science (IISc) Bengaluru, leading the Data Exchange project, which is an open-source, cloud-based platform developed and deployed at the national level to facilitate the discovery, exchange, and use of data from various sources, promoting open innovation and application development.

Suresh holds an engineering degree in Electronics and Communication from Kerala University, Software specialization from the Indian Institute of Science, and Business Management from the Indian Institute of Management Bengaluru and is also the author of the book “7 Steps to Joyful Living”. 


6. Title: Managing privacy in digital applications for public health

Abstract: DPIs provide an efficient path to rolling out robust citizen-centric applications at scale. Ensuring privacy of personal data for such applications is a key requirement, and this often requires domain and region-specific approaches. Public health applications can benefit from the services and capabilities of DPI and similar platforms, especially mechanisms for data protection and consent management. However, deploying applications that ensure privacy of health data across a variety of healthcare settings and demographics poses a number of challenges. In this talk, we explore some of these challenges, and discuss approaches for managing privacy that can be enabled by digital platforms, as well as limitations to such approaches.

Dr. T K Srikanth is a Professor in Computer Science at IIIT Bangalore. His research interests are in systems for the management and analysis of healthcare data, data privacy, as well as computer graphics. He is a co-convenor of the E-Health Research Center at IIITB, and is actively involved in multiple collaborative research projects in healthcare.

Prior to joining IIITB, he had extensive experience in the software industry, in India and in the US. He has a Ph.D. in Computer Science from Cornell University and a B.Tech. in Mechanical Engineering from IIT Madras.


7. Title: Building an Inclusive ID Systems

Abstract: In today’s digitally connected world, identity is one of the key factors that the environment consumes to make decisions. From our social media handles to email addresses, our identities are multiple and they differ based on the context. With identity at the centre of this revolution, governments across the globe are attempting to leapfrog into the digital age. Governments are looking for a faster and more efficient way to deliver services to their citizens and residents. Foundational identity is one of the easiest yet most effective ways to achieve this goal. While this explosion is good, it has also created a digital entry barrier for many. This barrier is a curse when it comes to digital governance and service delivery across the world. We will discuss several ideas and strategies adopted by the open-source project MOSIP to achieve the goal.

We will explore what goes into building, designing and securing inclusive digital Identities across multiple cultures.

Speaker Bio: Sasikumar Ganesan has over 20 years of experience in building DPI, platforms, security and robotics. He is a security advisor for several large Government of India projects and the former Chief Security Architect of Aadhaar. He co-authored the e-sign paper on converting old-style digital and electronic signatures to new cloud-based, identity-driven digital signatures, and is the author of the open-source “Rahasya – Advanced cryptography for forward secrecy”, used by all account aggregators.

Sasi is actively involved in building digital public infrastructure and advises on security for most of the government initiatives under Digital India. He actively supports MOSIP, a foundational identity project that aims to provide secure and privacy-enabled digital identity across the world.

Sasi has co-authored and published various standards and RFCs for large-scale adoption of secure digital initiatives. His designs for delivering services over PKI are the most widely adopted across the Government of India and its ecosystem partners. He is also part of various forums that define security standards.

As a software architect, Sasi has architected large-scale platforms spanning identity, security analytics, IoT device identity, data classification, and DRM solutions, and has scaled these solutions to meet high-performance requirements. Sasikumar holds a Master’s in Computer Technology from Coimbatore Institute of Technology.

ADCOM 2024

The Advanced Computing and Communication Society (ACCS), in association with The International Institute of Information Technology Bangalore, announces the 29th annual International Conference on Advanced Computing and Communications (ADCOM 2024), to be held in Bangalore from 18th to 20th December 2024.

ADCOM, the flagship Systems Conference of the ACCS, is a major annual international meeting that draws leading scientists and researchers in computational and communications engineering from across industry and academia. ADCOM highlights the growing importance of Large-Scale Systems Engineering and provides the platform to share, discuss and witness leading edge research and trends.

ADCOM 2024 seeks to share insights for a ‘Responsible AI’ framework to develop Artificial Intelligence safely and ethically based on the five core principles of fairness, transparency, accountability, privacy and safety. Sustainable, reliable, effective, and human-centered AI systems require responsible AI principles and perspectives not only from technological but also from ethical, legal, and socio-economic domains. This edition of ADCOM will also explore as a sub theme the democratization of access to AI with a need for Trust, Risk and Security Management (TRiSM) to enhance positive performance and societal gains that AI enables. 

More details can be found on the ADCOM 2024 website.

Workshop on Vision for Megaregional Mobility

Jointly organized by Web Science Lab, IIIT-Bangalore and BeST Cluster, IISc

14th February 2024, IIIT-Bangalore

Workshop Registration: https://forms.office.com/r/q4wERGtxEb

About the Workshop

Today’s highly clustered knowledge economy is centered in and around global cities. And it is not just individual cities and metropolitan areas that power the world economy. Increasingly, the real driving force is larger combinations of cities and metro areas called megaregions. Developing megaregions, made up of a network of interconnected cities with rapid transport options between them, not only serves to decongest existing urban centres but also acts as a driver of economic growth.

Karnataka is among the fastest growing states in India, with a compounded annual growth rate of 9% in the year 2021-22. However, Karnataka also has one of the highest disparities in terms of population distribution. The population of the second biggest city in Karnataka is less than 10% of the population of the biggest city in the state, with the population of the biggest urban conglomeration, the metropolitan area of Bengaluru, growing at a compounded annual rate of 3.5%.

Traffic woes and mobility crises in Bengaluru are a daily affair, and they only worsen during the monsoon. Despite several initiatives, Bengaluru continues to crawl and has the dubious distinction of being the second slowest city in the world. There is a dire need to develop a megaregion around Bengaluru with a network of multiple growth centres to bring about long-term sustainable solutions. A megaregion is a network of interconnected cities, each of which is an independently administered economic growth centre. They would be interconnected with rapid transport options for both freight and people, enabling different growth centres to balance out one another.

In this workshop, we are looking for multiple stakeholders from this proposed megaregion to come together to form a vision committee and create a detailed roadmap for designing this megaregion. Some of the key discussion points for this vision committee include the following:

  • Identification of key growth centres in this megaregion, including proposals for formation of new townships to promote economic growth.
  • Detailed proposals for the nature of the RRTS system connecting this megaregion, including the different modalities involved, and identification of key transit hubs.
  • Identification of specific agencies, including private partners who can play key roles in the design of this megaregion.
  • Inputs for specific policy changes and/or interventions to facilitate creation of this megaregion.
  • Identification and protection of specific environmentally and ecologically sensitive zones in the design of this megaregion.
  • Creation of roadmap, timelines, and expected outcomes.

Key Co-ordinators

  • Prof. Srinath Srinivasa, Professor and Dean (R&D), Web Science Lab, IIIT-Bangalore
  • Prof. Abdul Pinjari, IISc Bangalore

Agenda

Forenoon Session: R109, Ramanujan Block
Time | Details
11.00 – 11.15 | Introductory remarks
11.15 – 11.30 | Welcome address by Director, IIIT-B
11.30 – 12.00 | Setting the context: Megaregion mobility around Bengaluru
12.00 – 01.15 | Roundtable Discussion
01.00 – 02.00 | Networking lunch
Afternoon Session: A307, Aryabhata Block
02.00 – 03.00 | Breakout sessions and Report-outs
03.00 – 03.10 | High Tea and Closing

Resources

Slide deck for introductory presentation

WSL Research Workshop December 2023

The Web Science Lab at IIIT-B conducts a biannual research workshop that helps research scholars share and present the latest developments in their field. These interactive brainstorming sessions encourage new ways of thinking and give a fresh perspective on ongoing research problems.

Date: December 15th 2023

Venue: Web Science Lab, A-132

S.No | Time | Speaker | Talk Title | Session Chair
1. | 10:00 – 11:00 | Brian Gillikin | Keynote Talk: The Design of Data Science Things | Prof. Srinath
2. | 11:00 – 11:20 | Praseeda | Understanding and Representing Assimilation Patterns | Aparna
3. | 11:20 – 11:40 | Jayati | The Philosophy of Transcendence | Aparna
Break
4. | 12:00 – 12:20 | Pooja | Thinking in Systems towards Sustainability | Rohith
5. | 12:20 – 12:40 | Anurag | EduEmbedd – A Knowledge Graph Embedding for Education | Rohith
6. | 12:40 – 1:00 | Aparna | Named Entity Recognition in Kannada using Active Learning | Rohith
Lunch Break
7. | 2:30 – 2:50 | Rohith | Federated Consent Service for Data Trusts | Jayati
8. | 2:50 – 3:10 | Asilata | A Typology of Consent for Information-Sharing and Management | Jayati
9. | 3:10 – 3:30 | Bhoomika | Vector-Based Semantic Scenario Search for Vehicular Traffic | Jayati
10. | 3:30 – 3:50 | Abraham | Karnataka Data Lake (Analytics & Visualization) | Jayati
Break
11. | 4:10 – 4:30 | Balambiga | Policy-Based Consent Management for Data Trusts (Online) | Bhoomika
12. | 4:30 – 4:50 | Apurva | Administrative Data Twin | Bhoomika
13. | 4:50 – 5:10 | Sharath | | Bhoomika
14. | 5:10 – 5:30 | Prof. Srinath | Closing Remarks |

CIKM 2023 Workshop on Enterprise Knowledge Graphs Using Large Language Models

22nd October 2023, University of Birmingham, UK

Authors of the accepted papers can register using the link: https://uobevents.eventsair.com/cikm2023/cikmauthpreandmain

About the Workshop

Knowledge graphs are used for organizing and connecting individual entities to integrate the information extracted from different data sources. Typically, knowledge graphs are used to connect various real-world entities like persons, places, things, actions, etc. For the knowledge graphs created using the enterprise data, the knowledge graph entities can be of different types—static entities (e.g., people, projects), communication entities (e.g., emails, meetings, documents), derived entities (e.g., rules, definitions, entities from emails), etc. The graphs are used to connect these entities with enriched context (as edges and node attributes) and used for powering various search and recommendations applications.

With the advent of large language models, the whole lifecycle of knowledge graphs (information extraction, graph construction, application of graphs, querying knowledge graphs, using the graph for recommendations, etc.) is impacted. With large language models such as GPT, LLaMA, PaLM, etc., entity and relationship extraction can be improved. Similarly, LLMs can answer different types of queries that were very difficult to answer without them. This workshop is about improving enterprise knowledge graphs and their applications using large language models.

Enterprise graphs can be of different scopes—whether it contains data from individual users/customers, a sub-organization, or the whole enterprise. This workshop will also cover various privacy and access control related issues which are typical for any enterprise graph. These include privacy preserving federated learning, using LLMs to extract information from private data, querying the knowledge graph in a privacy preserving manner, etc.

Workshop Objectives, Goals, and Outcomes

Knowledge graphs can integrate diverse data sources and provide a holistic view to the downstream applications. By virtue of being structured, knowledge graphs offer transparency and interpretability to the search and recommendations applications. As per one prediction, this connected data with semantically enriched context applications and graph mining will grow 100% annually. This workshop is about creating and using knowledge graphs on the enterprise data. This data is the internal data of the enterprises—of their employees and/or their customers. Unlike the graphs of open-web entities, enterprise knowledge graphs (EKG) connect the entities specific to the enterprises. For example, all the employee emails, meetings, documents, projects, etc., can be used to create a graph and this graph can be used to summarize the interaction between two employees, identify close collaborators, identify documents which should be attached to an email, documents associated with a project, etc. Similarly, there can be a knowledge graph of items, suppliers, teams, regions, etc., and the graph can be used to recommend suppliers for a particular requirement.

In this workshop we will be covering how large language models can help with the construction and usage of these enterprise knowledge graphs. This involves improving all aspects of the EKG workflow using large language models: entity extraction, entity enrichment, EKG construction, querying EKG for search and recommendations, scenario-specific EKG, etc. Besides the well-known challenges associated with knowledge graphs, EKGs raise other issues. How can entities be extracted from private enterprise data? How can large language models be used in a privacy-aware manner? How can relationships between different entities be created while preserving privacy? How can an EKG be created with internal (e.g., employees) data and external (e.g., suppliers) data? How is access control maintained in an EKG where data is from different divisions of the enterprise? How are enterprise recommendation applications different from, say, movie or product recommendations? How can one integrate an EKG with large language models for a particular application? To ensure privacy and separation of access, one may need to use federated graph learning while developing applications over EKGs. How can federated learning be used in large language models? Through this workshop we would like to highlight research issues specific to the integration of enterprise knowledge graphs with large language models and associated applications. We aim to achieve this by bringing together researchers (from academia as well as industry) and practitioners (mainly from industry).

Workshop Themes

  • Enterprise Knowledge Graph (EKG) design and implementation
  • Scalable extraction of enterprise entities using LLMs
  • Building EKGs for specific domains or applications
  • Natural Language Processing (NLP) algorithms to build EKGs
  • Relationship extraction using large language models
  • Federated graph learning with LLMs
  • Privacy in graph algorithms
  • Privacy preserving graph construction and mining
  • Semantic reasoning based on deep learning on graphs
  • Industrial applications of EKGs: banking, financing, retail, healthcare, medicine, etc.
  • Explainable AI based on EKG
  • Use of EKG and LLMs for search and recommendations

Target Audience

Researchers and practitioners from industry and academia. Practitioners and researchers from industry are likely to present their domains and the graphs they are building using LLMs for various applications, whereas participants from academia are likely to identify research problems of common interest and advise appropriately.


Workshop on Web Science for Development (WS4D) 2023

IIIT-Bangalore, 17-18 Mar 2023

WS4D 2023 Registration: Link

The Web Science for Development (WS4D 2023) workshop is part of the web science research initiative at IIIT Bangalore. WS4D, started in 2019, brings together professionals from several domains, addressing different thematic concerns pertaining to the use of web and mobile technologies in developmental efforts. The theme of this year’s WS4D workshop is sustainable development and digital capabilities.

Prof. Dame Wendy Hall, Executive Director of the Web Science Institute at the University of Southampton, will deliver the keynote address and IIITB Silver Jubilee Lecture on the first day. On the second day, Prof. Noshir Contractor, Jane S. & William J. White Professor of Behavioral Sciences (McCormick, SoC, Kellogg), will deliver the keynote address.

WS4D 2023 is organised as a two-day event on March 17-18, 2023. The workshop features invited talks by visiting researchers from the Web Science Trust (WSTNet) and invitees from other parts of India. The workshop will also include tutorial sessions and a research colloquium for research scholars. Interactive sessions and panel discussions are also planned over the two days.

The aim of the workshop is to foster a community of practitioners, researchers, entrepreneurs, students, and policy makers to jointly address socially relevant opportunities and challenges from the web and mobile technologies.


Agenda

Day 1: March 17th, 2023
Time Details
0930-1000 Tea and Registration
1000-1015 Inauguration and Address by Prof. Debabrata Das
1015-1115 Keynote 1 and IIITB Silver Jubilee Lecture – Prof. Wendy Hall
          Title: Four Internets: Data, Geopolitics and the Governance of Cyberspace
1115-1130 Tea Break
1130-1200 The Applications of Generative AI – IT industry perspectives by Dr. Archisman
1200-1230 Preserving Narrative Diversity on the Web by Prof. Srinath
1230-1400 Lunch Break
1400-1430 What will it take for democratic innovation and data science / AI to positively reinforce one another? by Prof. Matt Ryan
1430-1500 Recovering Food Narratives and Reimagining Health by Prof. Janaki and Ms. Sudha Nagavarapu
1500-1530 Smart City IoT systems and Ethics by Dr. Vinay Reddy
1530-1545 Tea Break
1545-1700 1-Day PM Activity & Presentations [Activity Details]
1700-1730 High Tea and Closing

Day 2: March 18th, 2023
Time Details
1000-1015 Address by Prof. Srinath Srinivasa
1015-1115 Keynote 2 – Prof. Noshir Contractor
          Title: People Analytics: Using Digital Exhaust from the Web to Leverage Network Insights in the Algorithmically Infused Workplace
1115-1130 Tea Break
1130-1200 Are Models Trained on Indian Legal Data Fair? by Prof. Ravindran
1200-1230 Towards decentralised webs by Dr. TB Dinesh
1245-1400 Lunch Break
1400-1515 Research Scholars Colloquium [Paramita, Meera, Apurva, Pooja, Jayati]
1515-1545 Tea Break
1545-1645 Panel Discussion – Characterizing the Swing State of the Internet
          Panelists: Prof. Wendy, Prof. Noshir, TB Dinesh, Prof. Bidisha, Prof. Srinath (Moderator)
1645-1700 Vote of Thanks and Closing Remarks
1700-1730 High Tea

Talk and Speaker Details

Prof. Dame Wendy Hall

Title: Four Internets: Data, Geopolitics and the Governance of Cyberspace.

Abstract: There is no doubt that the world is very dependent on the Internet these days. If it wasn’t obvious before, we certainly realised our dependency during the Covid-19 pandemic. Also, when the whole world piled onto the Internet in order to do anything during the lockdowns, it stayed up and running, which is a huge testament to the foresight of the Internet pioneers in terms of its design and in-built resilience and scalability. But the Internet has never been under such threat and its whole future as a globally interconnected system is in much doubt for many different reasons. In this talk we will explore the future of the Internet through the perspective of geopolitics and data governance. We will argue that through this lens we see at least four internets, maybe more, rather than just one interconnected ecosystem. We will explore what aspects of the governance of cyberspace we must protect the most in order for us to continue to use the technical infrastructure of the Internet that we all rely on to support cloud and data services. https://www.southampton.ac.uk/wsi/research/four-internets.page

Dame Wendy Hall, DBE, FRS, FREng is Regius Professor of Computer Science, Associate Vice President (International Engagement), and is an Executive Director of the Web Science Institute at the University of Southampton. One of the first computer scientists to undertake serious research in multimedia and hypermedia, she has been at its forefront ever since. The influence of her work has been significant in many areas including digital libraries, the development of the Semantic Web, and the emerging research discipline of Web Science. She is well known for her development of the Microcosm hypermedia system in the mid-1980s, which was a forerunner to the World Wide Web.

In addition to playing a prominent role in the development of her subject, she also helps shape science and engineering policy and education. Through her leadership roles on national and international bodies, she has shattered many glass ceilings, readily deploying her position on numerous national and international bodies to promote the role of women in SET and acting as an important role model for others. With Sir Tim Berners-Lee and Sir Nigel Shadbolt she co-founded the Web Science Research Initiative in 2006 and is the Managing Director of the Web Science Trust, which has a global mission to support the development of research, education and thought leadership in Web Science. She became a Dame Commander of the British Empire in the 2009 UK New Year’s Honours list and is a Fellow of the Royal Society. Wendy’s previous roles include President of the ACM, President of BCS, Senior Vice President of the Royal Academy of Engineering, member of the UK Prime Minister’s Council for Science and Technology, founding member of the European Research Council, Chair of the European Commission’s ISTAG, member of the Global Commission on Internet Governance, and member of the World Economic Forum’s Global Futures Council on the Digital Economy. Dame Wendy was co-Chair of the UK government’s Artificial Intelligence Review, which was published in October 2017, is the UK government’s first Skills Champion for AI and is a member of the newly formed AI Council. In May 2020, she was appointed Chair of the Ada Lovelace Institute and joined the BT Technology Advisory board in January 2021.

Prof. Noshir Contractor

Title: People Analytics: Using Digital Exhaust from the Web to Leverage Network Insights in the Algorithmically Infused Workplace

Abstract: Organizations need to do more than analyze data on demographic attributes to bring the performance of people analytics in the algorithmically infused workplace up — and in line with the hype. We need to focus not only on who people are but also on who they know. The potential for social network analysis to identify “high potentials,” who has good ideas, who is influential, and what teams will get work done efficiently and effectively is well established based on decades of research. The challenge has been collecting network data via time-consuming surveys, which elicit low response rates, and have high obsolescence. This talk presents empirical examples ranging from corporate enterprises to simulated long-duration space exploration to demonstrate how we can leverage people analytics – and in particular relational analytics – to mine “digital exhaust”— data created by individuals every day in their digital transactions, such as e‐mails, chats, “likes,” “follows,” @mentions, and file collaboration— to address challenges they face with issues such as team assembly and team conflict.

Noshir Contractor is the Jane S. & William J. White Professor of Behavioral Sciences in the McCormick School of Engineering & Applied Science, the School of Communication and the Kellogg School of Management and Director of the Science of Networks in Communities (SONIC) Research Group at Northwestern University. He is also the President of the International Communication Association (ICA). Additionally, he is the host of a podcast series titled “Untangling the Web,” where he engages in conversations with thought leaders to explore how the Web is shaping society, and how society in turn is shaping the Web.

Professor Contractor has been at the forefront of three emerging interdisciplines: network science, computational social science and web science. He is investigating how social and knowledge networks form – and perform – in contexts including business, scientific communities, healthcare and space travel.  His research has been funded continuously for 25 years by the U.S. National Science Foundation with additional funding from the U.S. National Institutes of Health, NASA, DARPA, Army Research Laboratory and the Bill & Melinda Gates Foundation.

His book Theories of Communication Networks (co-authored with Peter Monge) received the 2003 Book of the Year award from the Organizational Communication Division of the National Communication Association and the 2021 Fellows Book Award from the International Communication Association (ICA).  He is a Fellow of the American Association for the Advancement of Science (AAAS), the Association for Computing Machinery (ACM), the Network Science Society, and the International Communication Association (ICA).  He also received the Distinguished Scholar Award from the National Communication Association, the Lifetime Service Award from the Communication, Digital Technology, & Organization Division of the Academy of Management, and the Simmel Award from the International Network for Social Network Analysis (INSNA). In 2018 he received the Distinguished Alumnus Award from the Indian Institute of Technology, Madras where he received a Bachelor’s in Electrical Engineering. He received his Ph.D. from the Annenberg School of Communication at the University of Southern California.

Dr. Archisman Majumdar

Title: The Applications of Generative AI – IT industry perspectives.

Archisman Majumdar is an assistant vice president and lead for applied AI at Mphasis Next Labs, where he conceptualizes, develops, and leads multiple products in the analytics R&D space. Archisman is responsible for the research, innovation, and go-to-market for the products and solutions. His areas of expertise are business analytics, machine learning, product management, and information systems research. He holds a PhD in quantitative methods and information systems from the Indian Institute of Management Bangalore (IIMB).

Dr. Mukund Raj

Title: Bite and Bight the byte

Abstract: Abundance of data creates opportunities and confrontation. There is a dire need for ToT in AI and to explore technology leaps to provide sustainable solutions to real-world complex issues by developing sustainable compute models.

He is currently employed at the United Nations Development Programme (UNDP) as Project Head for the Sustainable Development Goals Coordination Center (SDGCC), Karnataka. He obtained his PhD in Economics from Rushmore University (2003), a PGD in Cyber Security and Cyber Law from National Law School of India University, Bengaluru (2019) (1998), and a Masters in Business Administration from Rushmore University (2001). His research focuses on agriculture, economy, technology and sustainable development. He has 14 years of international work experience in technology and IT domains and has also served as Chief Information/Security Officer. He has been an IT consultant to various departments of the Government of Karnataka, involved in various IT initiatives, data analysis, and future technology initiatives.

Prof. Srinath Srinivasa

Title: Preserving Narrative Diversity on the Web 

Abstract: The human mind is known to be a “story engine” where it interprets and operates within the framework of mental constructs called narratives. The web has added a new dimension to how narratives are diffused and consumed in social settings. In this talk we will look at the role of narratives in the way we interpret facts and the way we commit resources to action. We also address issues of clash of narratives and how it goes on to shape collective behaviour. We finally stress upon the importance of preserving narrative diversity in online discourses. 

Srinath Srinivasa heads the Web Science lab and is the Dean (R&D) at IIIT Bangalore, India. Srinath holds a Ph.D (magna cum laude) from the Berlin Brandenburg Graduate School for Distributed Information Systems (GkVI) Germany, an M.S. (by Research) from IIT-Madras and B.E. in Computer Science and Engineering from The National Institute of Engineering (NIE) Mysore.

He works in the area of Web Science, which models the impact of the web on humanity. Technology for educational outreach and social empowerment has been a primary motivation driving his research. He has participated in several initiatives for technology enhanced education including the VTU Edusat program, The National Programme for Technology Enhanced Learning (NPTEL) and an educational outreach program in collaboration with Upgrad. He is a member of various technical and organizational committees for international conferences like International Conference on Weblogs and Social Media (ICWSM), ACM Hypertext, COMAD/CoDS, ODBASE, etc. He is also a life member of the Computer Society of India (CSI). As part of academic community outreach, Srinath has served on the Board of Studies of Goa University and as a member of the Academic Council of the National Institute of Engineering, Mysore. He has served as a technical reviewer for various journals like the VLDB journal, IEEE Transactions on Knowledge and Data Engineering, and IEEE Transactions on Cloud Computing. He is also the recipient of various national and international grants for his research activities.

Prof. Matt Ryan

Title: What will it take for democratic innovation and data science/AI to positively reinforce one another?

Abstract: The explosion of AI and data science, and exploitation of a web-based data exhaust has raised several ethical concerns about how machines (intentionally or unintentionally) increase societal inequalities or oppress and destroy social goods. Though there have always been worries about design and especially the use of new technologies by undemocratic actors, until recently waves of democratisation seemed to proceed in mutualistic symbiosis with rapid technological advance. These appear to have stagnated and recent advances in AI have been portrayed more often as especially parasitic to democratic societies. In particular, changes in what information we access and how we communicate online have posed challenges for democratic politics, with increasing disaffection with politics, polarisation of views, mistrust and hate speech. Nevertheless new technologies also provide significant opportunities to aid democratic processes, innovating in who can be included in political discussion, and increasing civic activity online, often with transformative democratic outcomes. When and how should we use AI to buttress democracy, and aid informed, fair, inclusive, empowering and evidence-based reasoning? In this talk, I will offer some ideas about how socio-technical expertise can inform development of democratic norms by design, and how the power of novel computing technology can be harnessed to reinforce the most important technology humans have ever developed – that of solving their disputes peacefully.

Matthew Ryan is Associate Professor in Governance and Public Policy. His research on democratic innovations tries to figure out how people can have control over the decisions that affect their lives. His research crosses several disciplinary boundaries with a focus on innovative research methods. Since January 2020 he has been a UKRI Future Leaders Fellow leading the Rebooting Democracy project which aims to understand which innovations in public participation restore and sustain democracy.

He is Co-Director of the Centre for Democratic Futures, bringing together academics from across the University of Southampton who have an interest in how people make collective decisions both in the UK and internationally. Since January 2021 he has been Policy Director at the Web Science Institute, an established world-leading institute dedicated to bringing socio-technical expertise to explore the development of the Web. Matt has expertise in fields related to data science and artificial intelligence and in October 2021 became a Turing Fellow, collaborating and advancing research with the national Alan Turing Institute.

Prof. Janaki and Ms. Sudha Nagavarapu

Title: Recovering Food Narratives and Reimagining Health: Digital documentation of cultural practices around food, farming, dietary transitions and nutrition in western Avadh, Uttar Pradesh, India

Abstract: Digital technologies have been heralded for some time as having the potential to democratize the doing and consumption of historical research (Bolick 2006). On the one hand, material in a digital archive can be accessed from a wider range of locations, thereby potentially expanding the consumption of such research. But, as important, digital tools can help democratise history through “the inclusion of all histories” (Thomas 1999) by facilitating the collection and archiving of oral histories from diverse constituents. Our ongoing project, the foodcultures.org portal, is an attempt to build and leverage a digital repository towards these goals in the context of food histories in western Avadh, UP. By collecting and juxtaposing diverse narratives of diets, hunger, farming, and health in this region, we hope to build a rich repository that can be leveraged by students, researchers and policy makers alike towards a fuller understanding of the historical trajectory of a food production and consumption ecosystem.

Our project builds on a two-year, collaborative and multidisciplinary study (2017-18) conducted by Sangtin, a farmer-labourer collective in Sitapur district, Uttar Pradesh, and the Indian Institute of Technology Delhi, that documented historical and current diets, dietary and agricultural transitions and their drivers in the region, and people’s experiences of hunger and health using oral history interviews and archival research. The findings from this research were shared with academic audiences through seminars and publications (Kumar et al. 2018, Nagavarapu et al. 2019). For the current project, this research team collaborated with IIIT-Bangalore and Design Beku to take these findings to several other audiences, including local residents at the one end and global audiences at the other. We see the juxtaposition of these diverse measures of food production and consumption trajectories in the region as one way for our knowledge on the historical trajectories of agriculture and nutrition in the region, and their linkages, to be expanded, critiqued and analysed. This, in turn, could shape more regionally-informed policymaking. It could also potentially contribute towards reshaping people’s dietary choices and improving their understanding of the linkages between food, farming and health. Finally, such a portal could also serve as a pedagogical tool for teaching about broader conceptual linkages as well as regional particularities of agricultural transformations. In this talk, we will walk you through the portal and our goals for it.

Janaki Srinivasan’s research examines the political economy of information technology-based development initiatives. She uses ethnographic research to examine how gender, caste and class shape the use of such technologies. Her work has explored these interests in the context of Indian digital inclusion initiatives focussed on community computer centres, mobile phones, identity systems and open information systems. Currently, she is exploring privacy, algorithmic control and the role of intermediaries in digital transactions, with an emphasis on the domains of financial inclusion and work automation. Janaki has a PhD in Information Management and Systems from UC Berkeley and Masters degrees in Physics and in Information Technology from IIT Delhi and IIIT Bangalore.

Sudha Nagavarapu supports grassroots organizations in India in the areas of food systems, sustainable agriculture, health, livelihoods and related issues. She has coordinated community-driven, collaborative research into maternal health, health systems, food cultures and agrarian histories, and is also involved in developing a portal at foodcultures.org. She works with Sangtin Kisan Mazdoor Sangathan (SKMS), Uttar Pradesh and various organizations and networks in Karnataka.

Dr. Vinay Reddy

Title: Smart City IoT systems and Ethics

Abstract: The talk explores the following ethical dimensions in Smart City IoT systems: Justice and Equity, Trust, Fairness, and Dignity of Life and Work. To this end, we rely on a case study of an Integrated Command and Control Centre (ICCC), an IoT platform for Smart Cities in India, developed from interviews and a literature review.

Dr. Vinay is working as a post-doctoral fellow in the Centre for Internet of Ethical Things (CIET), IIITB. He is currently involved in a project whose aim is to arrive at an ethical governance framework for public purpose IoT projects.
Education: B.Tech (EE), IIT Kharagpur; M.S. (Development Practice), TISS Mumbai; PhD (Public Policy), IIM Bangalore.
Work Experience: Qualcomm Bangalore (2011 to 2014; 2016 to 2017); Prime Minister’s Rural Development Fellow (2014 to 2016).
Research Interests: Emerging Technologies (AI, IoT, and Blockchain), Public Policy, and Information Systems.

Prof. Balaraman Ravindran

Title: Are Models Trained on Indian Legal Data Fair?

Abstract: Recent advances and applications of language technology and artificial intelligence have enabled intelligent automation across a wide variety of domains such as law, health care, FinTech, etc. Particularly for legal systems, AI-based language models have recently been proposed to understand legal language and documents, attempting to predict the judgment. While these models demonstrate acceptable performance on judgement prediction problems, they also carry encoded social biases learned from the training data. The concepts of bias and fairness within machine learning models have been widely studied across the NLP community, but most studies limit themselves to Western contexts. In this work, we present an initial investigation of fairness and bias in language models designed to understand legal documents from the Indian perspective. We highlight the presence of learnt algorithmic biases in InLegalBERT, a language model finetuned on legal documents in the Indian context. We show that InLegalBERT shows stereotypical preference in the axes of disparities such as Religion, Caste & Gender and anti-stereotypical nature in the case of the Region axis of disparity. On average, the bias shown by InLegalBERT is around 12.55% higher compared to a standard BERT model. Additionally, we highlight the requirements of research in the direction of understanding bias in language models trained on Indian legal documents and its removal, which can potentially assist legal practitioners in future.

Professor B. Ravindran heads the Robert Bosch Centre for Data Science and Artificial Intelligence, a WSTNet laboratory, and the Centre for Responsible AI (CeRAI) at IIT Madras. He is the Mindtree Faculty Fellow and Professor in the Department of Computer Science and Engineering at IIT Madras. He has held visiting positions at the Indian Institute of Science, Bangalore, India, the University of Technology, Sydney, Australia and Google Research. Currently, his research interests are centred on learning from and through interactions and span the areas of geometric deep learning and reinforcement learning. He currently serves on the editorial boards of ACM Transactions on Intelligent Systems, Machine Learning Journal, Journal of AI Research, PLOS One, and Frontiers in Big Data and AI. He has published more than 100 papers in premier journals and conferences. His work with students has won multiple best paper awards, the most recent being the best application paper at PAKDD 2021. He was elected ACM Distinguished Member (2021) for his significant contributions to computing. He was recognized, in 2020, as a Senior member of AAAI (Association for Advancement of AI) for his long-standing contributions to AI.

Dr. TB Dinesh

Title: Towards decentralised webs

Abstract: We have been part of the COWMesh setup near Bangalore. COW – community owned wifimesh interconnects 4 villages to our rural research lab campus called iruWay. We are an internet-independent community network space with several services such as community radio, media sharing, media archival and fragment annotations, etc. Our goal is to be a “decentralised Web” setup where we become document centric rather than document location centric like the current web.

T B Dinesh is a community media activist with a background in Computer Science. The recent focus of his work is on storytelling methods and encouraging people from marginalised communities to tell their own stories and document their ways of life. T B Dinesh is a founder of Janastu in Bangalore, India. See open.janastu.org


Research Scholars Colloquium

Paramita Das

Title: Diversity matters: Robustness of bias measurements in Wikidata 

Abstract: With the widespread use of knowledge graphs (KG) in various automated AI systems and applications, it is very important to ensure that information retrieval algorithms leveraging them are free from societal biases. Previous works have depicted biases that persist in KGs, as well as employed several metrics for measuring the biases. However, such studies lack a systematic exploration of the sensitivity of the bias measurements to varying sources of data or to the embedding algorithms used. To address this research gap, in this work, we present a holistic analysis of bias measurement on the knowledge graph. First, we attempt to reveal data biases that surface in Wikidata for thirteen different demographics selected from seven continents. Next, we attempt to unfold the variance in detecting biases by two different knowledge graph embedding algorithms – TransE and ComplEx. We conducted our extensive experiments on a large number of professions sampled from the thirteen demographics with respect to the sensitive attribute, i.e., gender. Our results show that the inherent data bias that persists in KG can be revised by specific algorithm bias as incorporated by KG embedding learning algorithms. Further, we show that the choice of the state-of-the-art KG embedding algorithm has a strong impact on the ranking of biased professions irrespective of gender. In particular, we find that the embedding algorithm ComplEx is more robust to the choice of demographics compared to TransE. Subsequently, we observe that the similarity of the biased professions across demographics is minimal which possibly reflects the socio-cultural differences around the globe. This is often overlooked by most of the coarse-grained approaches working at the aggregate level.
We believe that this full-scale audit of the bias measurement pipeline will raise awareness among the community while deriving insights related to design choices of both data and algorithms, and will help the community refrain from the popular dogma of “one-size-fits-all”.

Bio: I am pursuing my Ph.D. under the guidance of Prof. Animesh Mukherjee at the Department of Computer Science and Engineering at IIT Kharagpur. My work builds on state-of-the-art approaches in social computing, machine learning, and natural language processing. My research focuses on quality issues of collaborative platforms, mostly Wikipedia, and on human-curated societal biases that persist in crowd-sourced systems. Before joining the Ph.D. program, I completed my M.Tech in Computer Science and Engineering at IIEST Shibpur and my B.Tech at Maulana Abul Kalam Azad University of Technology, West Bengal. Please find my list of publications here.

Meera Muthukrishnan

Title: COVID-19 data infrastructure in India: politics of knowing and governing the pandemic

Abstract: Most governmental and popular responses to the COVID-19 pandemic have been dominated by a data-driven approach. The different kinds of data available about the virus and its impact (from positivity rate to mortality rate, from recovered cases to active cases) have not only shaped public health policies and government actions but have also largely shaped our understanding of the global health crisis. Lack of data, lack of trust in data, and counter-data practices have all come to the fore of the public debate around the pandemic. Using a critical data studies lens, we examine a series of COVID-19 datasets on and in India and how the data and the socio-political response to the crisis co-constitute each other. We ask what kind of data gets captured, who the actors are in setting up and maintaining this data infrastructure, and who has access to it. Through this analysis, we bring to light the practices and politics around data infrastructures employed during the pandemic and call for a more critical understanding of the truth claims they make.

Bio: Meera Muthukrishnan was a software professional for around 19 years, delivering enterprise-scale software solutions, and holds Bachelor's and Master's degrees in Computer Science. She joined the Master of Science by Research program in the IT and Society Department at IIIT Bangalore in 2020. She is interested in the research and development of ICT projects at the intersection of technology and public service delivery. Many of her research projects have been in the area of public health care. She is interested in both qualitative and quantitative research to understand and develop methods that support design for inclusive and participatory development.

Apurva Kulkarni

Title: Ontology Augmented Data Lake System for SDGs

Abstract: Analytics of Big Data in the absence of an accompanying framework of metadata can be quite a daunting task. While it is true that statistical algorithms can perform large-scale analyses on diverse data with little support from metadata, using such methods on widely dispersed, extremely diverse, and dynamic data may not necessarily produce trustworthy findings. One such task is identifying the impact of indicators for various Sustainable Development Goals (SDGs). One method to analyze impact is to develop a Bayesian network that helps the policymaker make informed decisions under uncertainty. Policymakers worldwide have a key interest in relying on such models to decide the new policies of a state or a country (https://sdgs.un.org/2030agenda). The accuracy of the models can be improved by considering enriched data, often obtained by incorporating pertinent data from multiple sources. However, due to the challenges associated with the volume, variety, veracity, and structure of the data, traditional data lake systems fall short of identifying information that is syntactically diverse yet semantically connected. In this research work, we propose a Data Lake (DL) framework that ingests and processes data like any traditional DL and, in addition, is capable of performing data retrieval for applications such as Policy Support Systems (where the selection of data greatly affects the interpretation of the output) by using ontologies as the intermediary. We also discuss the proof of concept and the preliminary results (IIITB Data Lake project website: http://cads.iiitb.ac.in/wordpress/) based on data collected from the agriculture department of the Government of Karnataka (GoK).

Bio: Apurva Kulkarni is a PhD Research Scholar at IIIT-Bangalore. She is working under the guidance of Prof. Chandrashekar Ramanathan in the field of Heterogeneous Data Modeling. Her interests include semantic-based document retrieval systems, semantic document linking, querying heterogeneous documents, and database systems. She holds a bachelor’s and a master’s degree from Mumbai University. She has 5 years of academic experience.

Pooja Bassin

Title: Intervention Modeling for Sustainable Development

Abstract: To solve numerous problems in social, economic and environmental domains, the Sustainable Development Goals (SDGs) were adopted by the United Nations in 2015. The SDGs, with a new vision and many new challenges, replaced the Millennium Development Goals (MDGs) that had a successful run from 2000 to 2015. The concept of SDG localization or domestication focuses on implementing locally appropriate actions for achieving SDGs. We argue that this notion of localization requires conceptualization and a defined framework, the current lack of which hinders computational efforts towards representing and reasoning about sustainability. The talk will give a brief overview of some of the hermeneutic challenges encountered in our ongoing work on designing an AI-based Policy Support System.

Bio: Pooja Bassin is a PhD candidate at the Web Science Lab (WSL), IIITB, working under the supervision of Prof. Srinath Srinivasa. Her work involves constructing and representing intervention models based on the Sustainable Development Goals, applying the underlying principle of sustainability. Prior to undertaking her doctoral journey, she worked as a Research Associate at the WSL and was part of the Cogno Web Observatory project, which involved understanding online social cognition, that is, the way social discourses lead to the formation of collective worldviews. Her research interests include semantic web, network science, and causal inference. She obtained her Master of Technology in Computer Science (2014) and Master of Computer Application degree (2011) from Banasthali Vidyapith, Jaipur. She has 4+ years of teaching experience at various degree colleges in Jaipur and Mumbai.

Jayati Deshmukh

Title: Responsible Agency and Sustainability

Abstract: In this talk, we will discuss responsible agency and our proposed model for designing responsible autonomous agents, and finally compare it with sustainability principles. In our agent design, rather than imposing constraints or external reinforcements, agents are endowed with an elastic “sense of self”, or an elastic identity, that they curate based on rational considerations. This approach is called “computational transcendence (CT).” We show that agents using this model make responsible choices, i.e., choices for collective welfare instead of individual benefit. We demonstrate CT in the game-theoretic context of the Prisoner’s Dilemma. Finally, we discuss the similarity between the principles of CT and the principles of sustainability and show that our proposed CT framework is one way of designing responsible and sustainable autonomous agents.
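To give a feel for how identifying with another player can change a rational choice in the Prisoner's Dilemma, here is a deliberately simplified sketch. It is not the CT model presented in the talk: the payoff values are the textbook ones, and `identity_weight` is a hypothetical stand-in for how strongly an agent counts the other player's payoff as its own.

```python
# Textbook Prisoner's Dilemma payoffs (assumed values):
# (my_action, other_action) -> (my_payoff, other_payoff); "C" = cooperate, "D" = defect.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def perceived_utility(my_action, other_action, identity_weight):
    """Own payoff plus the other player's payoff scaled by identity_weight (hypothetical)."""
    mine, theirs = PAYOFFS[(my_action, other_action)]
    return mine + identity_weight * theirs

def best_response(other_action, identity_weight):
    """Action that maximizes perceived utility against a fixed action of the other player."""
    return max("CD", key=lambda a: perceived_utility(a, other_action, identity_weight))

# Print the best response to each opponent action for three identity weights.
for weight in (0.0, 0.5, 1.0):
    print(weight, {other: best_response(other, weight) for other in "CD"})
```

With a weight of 0 the agent is purely self-interested and defection is the best response to either action; with a weight of 1 cooperation is the best response to either action. The single scalar weight is only an illustration of the general idea of choosing for collective welfare.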

Bio: Jayati Deshmukh is a Ph.D. research scholar at the Web Science Lab, IIIT-Bangalore, under the guidance of Prof. Srinath Srinivasa. Her research interests lie broadly in artificial intelligence and specifically in autonomous systems, multi-agent systems, network science and game theory. She holds an M.Tech. in Data Science from IIIT-Bangalore and a B.E. in Computer Engineering from Gujarat Technological University. She was a gold medalist in both her undergraduate and graduate degrees. She has 4+ years of work experience at Accenture Technology Labs, Bangalore, where she did research and development in natural language processing, deep learning and knowledge graphs, which resulted in many successful PoCs as well as patents and publications.


WSL Research Workshop – May’22

The Web Science Lab at IIIT-B conducts a biannual research workshop that helps research scholars share and present the latest developments in their field of work. These interactive brainstorming sessions encourage new ways of thinking and bring a fresh perspective to ongoing research problems.

After two years in online mode, the research workshop will once again be held in person.

Date: 20th May, 2022

Time: 10:00AM – 7:00PM

Venue: WSL Lab  

Agenda

| Time | Speaker | Title | Session Chair |
|---|---|---|---|
| 10:00 – 10:30 | Pooja | Network Learning in Open Data to aid Policy Making | Jayati |
| 10:30 – 11:00 | Aparna | Named Entity Recognition in Kannada | Jayati |
| 11:00 – 11:30 | Break | | |
| 11:30 – 12:00 | Jayati | Computational Transcendence in Supply Chain | Balambiga |
| 12:00 – 12:30 | Chaitali | AI based Narrative Arc Generation for Coherent and Engaging Learning Experience | Balambiga |
| 12:30 – 2:00 | Lunch Break | | |
| 2:00 – 2:30 | Praseeda | Navigated Learning a two dimensional learning map | Chaitali |
| 2:30 – 3:00 | Anurag | EduEmbed – Embeddings for Education | Chaitali |
| 3:00 – 3:30 | Shyam | Competency Maps – A Measure space for Online Learning | Chaitali |
| 3:30 – 3:45 | Break | | |
| 3:45 – 4:15 | Balambiga | Policy Based Consent Management for Data Trusts | Pooja |
| 4:15 – 4:45 | Prakhar | TBA | Pooja |
| 4:45 – 5:15 | Niharika | TBA | Pooja |
| 5:15 – 5:30 | Break | | |
| 5:30 – 6:00 | Jaya | TBA | Aparna |
| 6:00 – 6:30 | Sharath | TBA | Aparna |
| 6:30 – 7:00 | Prof. Srinath and Prof. Sridhar | Closing Remarks | |

WSL Research Workshop December 2021

The Web Science Lab at IIIT-B conducts a biannual research workshop that helps research scholars share and present the latest developments in their field of work. These interactive brainstorming sessions encourage new ways of thinking and bring a fresh perspective to ongoing research problems.

Due to the present COVID-19 situation, the workshop will be a 2-day event and shall be hosted virtually.

Date: 16th Dec’2021 – 17th Dec’2021

Time: 2:00 – 5:00 p.m.

Venue: MS Teams Link

Day 1 – Dec 16th, 2021

| Time | Speaker | Title |
|---|---|---|
| 2:00 – 2:30 p.m. | Prakhar | A Semi-automatic Approach for Generating Academic Trailers for Learning Pathways and Resources |
| 2:30 – 3:00 p.m. | Chaitali | AI-based Narrative Arc Generation for Engaging Learning Experience |
| 3:00 – 3:30 p.m. | Shyam | Competency Maps – Measurement Space & Polyline Algebra |
| 3:30 – 4:00 p.m. | Anurag | EduEmbed – A Knowledge Graph Embedding for Education |
| 4:00 – 4:30 p.m. | Niharika | |
| 4:30 – 5:00 p.m. | Praseeda | A Semantic Two Dimensional Space to Navigate the Learner |

Day 2 – Dec 17th, 2021

| Time | Speaker | Title |
|---|---|---|
| 2:00 – 2:30 p.m. | Jayati | Computational Transcendence in a Network |
| 2:30 – 3:00 p.m. | Pooja | Network Learning in Open Data to aid Policy Making |
| 3:00 – 3:30 p.m. | Sharath | |
| 3:30 – 4:00 p.m. | Aparna | Named Entity Recognition in Dravidian Languages |
| 4:00 – 4:30 p.m. | Balambiga | Consent Management for Non-Personal Data |
| 4:30 – 5:00 p.m. | Prof. Srinath | Closing Remarks |