The Web Science for Development (WS4D) workshop, an outreach activity of the esteemed Web Science research initiative at IIIT Bangalore, is a platform that started in 2019 to bring together professionals from domains such as data science and public policy, among others. This year, we are excited to launch the fifth iteration of the workshop with the theme ‘AI for Public Good’.
Keynote Addresses
The keynote addresses this year will be delivered by Dame Wendy Hall (Director of the Web Science Institute, University of Southampton) and Prof. Noshir Contractor (Faculty, Kellogg School of Management, and Director of the Science of Networks in Communities (SONIC) Research Group, Northwestern University).
Workshop Activities
As part of the workshop, we are announcing a call for extended abstracts for researchers from across the country to present their work on the theme of AI for Public Good during a dedicated workshop session. Apart from this, interactive sessions are planned over the day.
We are interpreting the broad theme of the workshop to include the following as some of the possible directions:
AI for accessibility
AI for inclusion and mainstreaming of marginalised voices in knowledge production and dissemination
AI for safety, diversity and mitigation of bias
Why Should You Attend?
The workshop aims to foster a vibrant community of practitioners, researchers, entrepreneurs, students, and policymakers to meet the challenges and opportunities in harnessing the power of AI for the Public Good.
We are excited to announce a call for extended abstracts for the upcoming in-person WS4D 2025 workshop to be held on 27 February 2025. The workshop will be focused on the transformative potential of artificial intelligence (AI) in promoting social impact. This workshop aims to explore innovative applications of AI that enhance accessibility, inclusion, and representation while addressing critical issues such as safety and bias mitigation.
Important Dates
Abstract Submission Deadline: 15 January 2025
Notification of Acceptance: 7 February 2025
Workshop Date: 27 February 2025
Workshop Themes
We invite contributions that align with, but are not limited to, the following themes:
AI for Accessibility: Exploring how AI technologies can be leveraged to create more accessible environments and tools for individuals with disabilities.
AI for Inclusion: Discussing strategies and applications that ensure diverse populations are included in AI development and deployment processes.
AI for Mainstreaming Marginalized Voices: Investigating how AI can facilitate the representation of marginalized communities in knowledge production and dissemination.
AI for Safety: Examining the role of AI in enhancing safety measures across various sectors, including public safety, cybersecurity, and personal security.
AI for Diversity and Mitigation of Bias: Addressing the challenges of bias in AI systems and proposing solutions to foster diversity in AI applications.
Submission Guidelines
We welcome abstracts from researchers, practitioners, policymakers, and advocates. Submissions should include:
Title of the presentation
An extended abstract (2 pages) outlining the key ideas and relevance to the workshop themes
Author(s) name(s), affiliation(s), and contact information
Selected authors will be invited to present their research during a dedicated session as part of the workshop. The option of online presentations for authors unable to travel to the workshop will be explored, and we will keep you posted regarding the same.
How to Submit
Please submit your abstracts as PDF files using this Google Form.
Ensure that your submission is in PDF format and that it follows the guidelines outlined above. For any inquiries, please contact wsllabiiitb@gmail.com.
I am currently pursuing a Master's by Research in the Web Sciences Lab here at IIIT Bangalore. Prior to joining IIITB, I worked at Capgemini Engineering for three years as a Senior Associate and as a Research Assistant in The Visual Conception Group Lab at IIIT Delhi. I am currently working on the Online Learning Navigator Project in partnership with Gooru Labs. My research interests are optimizations in machine learning, deep learning and vision language models.
In the Predictive Impact Analysis dashboard, we can see the impact of different factors on Maternal Deaths (MD). We can intervene on a factor by selecting it from the dropdown menu and changing it by any amount (e.g., +10%, -10%), and we can then see the corresponding changes in MD at the district level.
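The what-if step described above can be sketched as follows. This is only an illustration under stated assumptions, not the dashboard's actual model: we assume MD is predicted by a simple linear model, and the factor names (`anc_coverage`, `anaemia_rate`), the coefficients, the district names and all numbers are made up for the example.

```python
from typing import Dict

def predict_md(intercept: float, slopes: Dict[str, float],
               factors: Dict[str, float]) -> float:
    """Assumed linear model: intercept + sum(slope * factor value)."""
    return intercept + sum(slopes[name] * factors[name] for name in slopes)

def intervene(factors: Dict[str, float], name: str,
              pct_change: float) -> Dict[str, float]:
    """Return a copy of the factors with one factor changed by pct_change percent."""
    changed = dict(factors)
    changed[name] = factors[name] * (1 + pct_change / 100.0)
    return changed

def md_change_by_district(intercept: float, slopes: Dict[str, float],
                          districts: Dict[str, Dict[str, float]],
                          name: str, pct_change: float) -> Dict[str, float]:
    """Change in predicted MD per district after intervening on one factor."""
    result = {}
    for district, factors in districts.items():
        before = predict_md(intercept, slopes, factors)
        after = predict_md(intercept, slopes, intervene(factors, name, pct_change))
        result[district] = after - before
    return result

# Hypothetical coefficients and district data, for illustration only.
INTERCEPT = 50.0
SLOPES = {"anc_coverage": -0.5, "anaemia_rate": 0.8}
DISTRICTS = {
    "District A": {"anc_coverage": 60.0, "anaemia_rate": 40.0},
    "District B": {"anc_coverage": 80.0, "anaemia_rate": 30.0},
}

# Increase the hypothetical factor anc_coverage by 10% in every district.
changes = md_change_by_district(INTERCEPT, SLOPES, DISTRICTS, "anc_coverage", 10)
for district, delta in changes.items():
    print(f"{district}: change in predicted MD = {delta:+.2f}")
```

With a linear model, the change in predicted MD for a district is simply the slope times the change in the factor, so districts with a larger baseline value of the factor see a larger change for the same percentage intervention.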
In the Prescriptive Modelling dashboard, we can set a target MD, and the model outputs the prescribed values of the different factors needed to achieve it. We can also see the corresponding change in MD at the district level when these prescribed values are adopted. The dashboard also shows the sensitivity of each factor, which expresses the importance of the factor and ranges from 0 to 1. If a domain expert deems a specific factor unimportant, they can assign it a sensitivity value of 0. For factors considered partially important, a sensitivity value of 0.5 can be assigned. If the expert believes the factor unquestionably plays a role, they have the option to set its sensitivity to 1.
In the prescriptive modelling dashboard itself, there is a box displaying the state stability score after intervention. There is also a scatter plot showing the relation between impact and stability with districts represented as points.
Finally, the dashboard includes a feature for budget allocation. Positioned at the top is a pie chart derived from the slopes obtained from multiple linear regression. The methodology systematically distributes the budget to address the requirements of the various districts. Here, as well, we can see the sensitivities of the factors. If the policy maker or domain expert thinks that a particular factor plays no role, they can set its sensitivity value to 0, and the budget allocation model is automatically re-adjusted.
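One way such a re-adjustment could work is sketched below. This is an assumption for illustration rather than the dashboard's documented method: each factor's budget share is taken to be proportional to the absolute value of its regression slope multiplied by its sensitivity, and the factor names and numbers are hypothetical.

```python
from typing import Dict

def budget_shares(slopes: Dict[str, float],
                  sensitivities: Dict[str, float]) -> Dict[str, float]:
    """Share per factor, assumed proportional to |slope| * sensitivity."""
    weights = {
        name: abs(slope) * sensitivities.get(name, 1.0)
        for name, slope in slopes.items()
    }
    total = sum(weights.values())
    if total == 0:
        raise ValueError("all weights are zero; nothing to allocate")
    return {name: weight / total for name, weight in weights.items()}

def allocate(total_budget: float, slopes: Dict[str, float],
             sensitivities: Dict[str, float]) -> Dict[str, float]:
    """Split total_budget across factors according to their shares."""
    shares = budget_shares(slopes, sensitivities)
    return {name: total_budget * share for name, share in shares.items()}

# Hypothetical slopes from a multiple linear regression, for illustration only.
SLOPES = {"factor_a": -2.0, "factor_b": 1.0, "factor_c": 1.0}

# All factors fully important.
print(allocate(900.0, SLOPES, {"factor_a": 1.0, "factor_b": 1.0, "factor_c": 1.0}))

# The expert decides factor_c plays no role: its share drops to zero and
# the remaining shares are renormalised.
print(allocate(900.0, SLOPES, {"factor_a": 1.0, "factor_b": 1.0, "factor_c": 0.0}))
```

Because the shares are renormalised after every change, setting one sensitivity to 0 redistributes that factor's portion across the remaining factors without changing the total budget.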
Workshop Venue: Teaching and Learning Building (M208/M209) at the University of Birmingham.
The schedule below is in the UK time zone (UTC+1).
Session 1: 9.00-10.30
Introduction & Initial announcements: 9.00-9.30
Lessons from the age of user-generated content for the age of AI-generated content (Prof. Nishanth Sastry: 9.30-10.30)
Refreshment Break: 10.30-11.00
Session 2: 11.00-12.30
Enhancing Enterprise Knowledge Base Construction with Fine-Tuned Generative Language Models (Liana Mikaelyan: 11.00-11.45)
Research Session 1: 11.45-12.30
1. Related Table Search for Numeric data using Large Language Models and Enterprise Knowledge Graphs (Pranav Subramaniam, Udayan Khurana, Kavitha Srinivas and Horst Samulowitz)
2. Cognitive Retrieve: Empowering Document Retrieval with Semantics and Domain Specific Knowledge Graph (Apurva Kulkarni, Chandrashekar Ramanathan and Vinu E Venugopal)
Lunch: 12.30-14.00
Session 3: 14.00-15.30
Building Knowledge Graph for Products at Scale and infusing it into LLMs (Dr. Manoj Agarwal: 14.00-14.45)
Research Session 2: 14.45-15.30
1. EduEmbedd – A Knowledge Graph Embedding for Education (Anurag Mohanty)
2. CRUSH: Cybersecurity Research using Universal LLMs and Semantic Hypernetworks (Mohit Sewak, Vamsi Emani and Annam Naresh)
Refreshment Break: 15.30-16.00
Session 4: 16.00-17.00
LLMs for Social Networks: Applications, Challenges and Solutions (Bojan Babic: 16.00-17.00)
Lessons from the age of user-generated content for the age of AI-generated content
Prof. Nishanth Sastry, Director of Research of the Department of Computer Science, University of Surrey.
Abstract:
The past decade and more has been defined by the rise and near universal adoption of user-generated content (UGC) on social media. Initial excitement about the promise of UGC has since become tempered by concerns about misinformation, hate speech and other online harms. We are now witnessing a similar enthusiasm for content generated by Large Language Models. This talk will draw parallels between the two, and extract lessons about the perils, potentials and pitfalls awaiting us in the future age of AI-generated content.
Biography:
Prof. Nishanth Sastry is the Director of Research of the Department of Computer Science, University of Surrey. His research spans a number of topics relating to social media, content delivery and networking, and online safety and privacy. He is joint Head of the Distributed and Networked Systems Group and co-leads the Pan University Surrey Security Network. He is also a Surrey AI Fellow and a Visiting Researcher at the Alan Turing Institute, where he is a co-lead of the Social Data Science Special Interest Group.
Can machines discover new knowledge?
Dr. Fabio Petroni, Co-Founder & CTO at Samaya AI
Abstract:
For many years, the quest to determine the most efficacious representations of knowledge for machines has been at the forefront of research. Historically, this focus has centered on knowledge retrieval, whether from unstructured text corpora, structured collections (e.g., knowledge graphs, key-value memories), or the parameters of a neural model. How can we evolve these representations to not just retrieve, but actively discover new knowledge?
Biography:
Dr. Fabio Petroni is the Co-Founder & CTO at Samaya AI, building an AI-powered knowledge-discovery platform. Before that, he was a Researcher at FAIR and Thomson Reuters, focusing on representing, gathering, extracting, using, reasoning on and creating world knowledge using AI.
Enhancing Enterprise Knowledge Base Construction with Fine-Tuned Generative Language Models
Ms. Liana Mikaelyan, Research Software Development Engineer in the Alexandria team, Microsoft Research Cambridge, UK.
Abstract:
In this talk, we will present our latest work on leveraging the power of generative language models for knowledge base construction. We have fine-tuned a generative LLM to extract entities and their relevant properties from text passages and represent them in a structured JSON format. This task was accomplished by creating a dataset of short passages and corresponding JSON outputs using GPT4, which was then used to fine-tune the OpenLlama 3B model on a single A100 GPU. Our approach has demonstrated superior performance compared to the existing template matching algorithm in Alexandria, both in terms of precision and coverage, as well as extracting a richer set of properties from the text. Furthermore, the addition of new properties to the knowledge base has been significantly simplified. Future work involves exploring ways to improve the generation time as well as investigating other models to further enhance our system’s performance.
Biography:
Ms. Liana Mikaelyan is a Research Software Development Engineer in the Alexandria team at Microsoft Research Cambridge, UK. Before joining Microsoft Research Cambridge, she worked on various machine learning projects, mainly in speech synthesis and recognition. She completed her MSc in Machine Learning at UCL with a background in mathematics.
LLMs for Social Networks: Applications, Challenges and Solutions
Bojan Babic, Nextdoor.
Abstract:
Over the last couple of years, we have witnessed an explosion of Generative AI research and applications that are simultaneously transforming how companies operate internally and how they communicate with their customers.
In this talk, we will present the work of the Nextdoor GenAI team and its LLM applications in social networks in areas such as knowledge tasks, engagement tasks and governance. We will cover what we have tried, what works and what does not work. We will also present the framework that helped us iterate fast and systematically improve each of the product areas.
Biography:
Bojan Babic is currently working on various Generative AI problems at the social media platform Nextdoor. Before this position, he worked on search and information retrieval, ads and recommendations, and their applications in areas ranging from e-commerce to social media.
Building Knowledge Graph for Products at Scale and infusing it into LLMs
Dr. Manoj Agarwal, Senior Staff Engineer in the Discovery Intelligence team at Uber AI.
Abstract:
A knowledge graph is the key to entity search, as it can store factual entity-related information in a structured manner without the rigidity of a fixed schema. Both Google and Bing have web-scale knowledge graphs, and the knowledge graph is invoked for a large fraction of user queries. E-commerce search is primarily an entity search. Therefore, building a knowledge graph is key to improving e-commerce search in many ways. However, building it at web scale is a highly challenging problem, and building a knowledge graph for products is equally or even more challenging. In this talk, we present our methodology for building a knowledge graph for products at web scale. With the recent success of LLMs, can we infuse such semantic understanding of the world, encoded in the form of a knowledge graph, into LLMs? There are some advances in this direction; however, it remains an open question whether knowledge graphs can be replaced by LLMs.
Biography:
Dr. Manoj Agarwal is a Senior Staff Engineer in the Discovery Intelligence team at Uber AI. Before Uber, he was a Principal Applied Scientist at Microsoft – AI and Research and a senior researcher at IBM Research. Manoj was the chief architect for building a web-scale product knowledge graph for Microsoft – Shopping, comprising a few hundred million products and a few billion facts with high accuracy. Currently, he is engaged in efforts to build a scalable knowledge graph and to discover the taxonomy needed to improve semantic search and recommendations for Uber Delivery. His research interests are in the areas of web mining, graph mining, pattern recognition, data mining, knowledge graphs, LLMs and information retrieval, with more than 30 patents and over 25 research papers in reputed journals and conferences.
Knowledge graphs can integrate diverse data sources and provide a holistic view to the downstream applications. By virtue of being structured, knowledge graphs offer transparency and interpretability to the search and recommendations applications. Combining Knowledge Graphs with current-day advances in LLMs can create several opportunities.
The EKG-LLM workshop, held as part of CIKM 2023, addresses how large language models can help with the construction and usage of these enterprise knowledge graphs (EKGs). This involves improving all aspects of the EKG workflow using large language models: entity extraction, entity enrichment, EKG construction, querying EKGs for search and recommendations, scenario-specific EKGs, etc. Through this workshop, we would like to highlight research issues specific to the integration of enterprise knowledge graphs with large language models and associated applications.
Topics of interest include, but are not limited to, the following:
Designing Enterprise Knowledge Graph (EKG)
EKG Implementation
Scalable extraction of enterprise entities using LLMs
Building EKGs for specific domains or applications
Natural Language Processing (NLP) algorithms to build EKGs
Relationship extraction using large language models
Federated graph learning with LLMs
Privacy in graph algorithms
Privacy preserving graph construction and mining
Semantic reasoning based on deep learning on graphs
Industrial applications of EKGs: banking, financing, retail, healthcare, medicine, etc.
Explainable AI based on EKG
Use of EKG and LLMs for search and recommendations
Submission
Manuscripts should be submitted in PDF format with 6 pages of content, plus references. Please follow the two-column CEUR style template (https://ceur-ws.org/Vol-XXX/) for paper submissions.
Authors of accepted papers should prepare a camera-ready (final) version of their paper and submit it using the EasyChair system no later than Sunday, October 1, 2023. Please also email the camera-ready version (PDF as well as editable versions (doc/latex)) to rajeev.gupta@microsoft.com, sri@iiitb.ac.in, aparna.m@iiitb.ac.in and bhoomika.ap@iiitb.ac.in.
For an accepted paper to be included and published in the workshop proceedings, at least one author must complete in-person registration using the link https://uobevents.eventsair.com/cikm2023/cikmauthpreandmain and present the paper in person at the workshop.
Preparation of Camera Ready Paper
Authors are advised to address the comments of the reviewers in the camera-ready version suitably.