CIKM 2023 Workshop on Enterprise Knowledge Graphs Using Large Language Models

22nd October 2023, University of Birmingham, UK

Invited Talks

Lessons from the age of user-generated content for the age of AI-generated content

Prof. Nishanth Sastry, Director of Research of the Department of Computer Science, University of Surrey.


The past decade and more has been defined by the rise and near-universal adoption of user-generated content (UGC) on social media. Initial excitement about the promise of UGC has since been tempered by concerns about misinformation, hate speech and other online harms. We are now witnessing a similar enthusiasm for content generated by Large Language Models. This talk will draw parallels between the two, and extract lessons about the perils, potentials and pitfalls awaiting us in the future age of AI-generated content.


Prof. Nishanth Sastry is the Director of Research of the Department of Computer Science, University of Surrey. His research spans a number of topics relating to social media, content delivery and networking, and online safety and privacy. He is joint Head of the Distributed and Networked Systems Group and co-leads the Pan University Surrey Security Network. He is also a Surrey AI Fellow and a Visiting Researcher at the Alan Turing Institute, where he is a co-lead of the Social Data Science Special Interest Group.

Can machines discover new knowledge?

Dr. Fabio Petroni, Co-Founder & CTO at Samaya AI


For many years, the quest to determine the most efficacious representations of knowledge for machines has been at the forefront of research. Historically, this focus has centered on knowledge retrieval, whether from unstructured text corpora, structured collections (e.g., knowledge graphs, key-value memories), or the parameters of a neural model. How can we evolve these representations to not just retrieve, but actively discover new knowledge?


Dr. Fabio Petroni is the Co-Founder & CTO at Samaya AI, building an AI-powered knowledge-discovery platform. Before that he was a Researcher at FAIR and Thomson Reuters, focusing on representing, gathering, extracting, using, reasoning on and creating world knowledge using AI.

Enhancing Enterprise Knowledge Base Construction with Fine-Tuned Generative Language Models

Ms. Liana Mikaelyan, Research Software Development Engineer in the Alexandria team, Microsoft Research Cambridge, UK.


In this talk, we will present our latest work on leveraging the power of generative language models for knowledge base construction. We have fine-tuned a generative LLM to extract entities and their relevant properties from text passages and represent them in a structured JSON format. This task was accomplished by creating a dataset of short passages and corresponding JSON outputs using GPT-4, which was then used to fine-tune the OpenLlama 3B model on a single A100 GPU. Our approach has demonstrated superior performance compared to the existing template-matching algorithm in Alexandria, both in terms of precision and coverage, and it extracts a richer set of properties from the text. Furthermore, the addition of new properties to the knowledge base has been significantly simplified. Future work involves exploring ways to improve the generation time as well as investigating other models to further enhance our system’s performance.


Ms. Liana Mikaelyan is a Research Software Development Engineer in the Alexandria team at Microsoft Research Cambridge, UK. Before joining Microsoft Research Cambridge, she worked on various machine learning projects, mainly in speech synthesis and recognition. She completed her MSc in Machine Learning at UCL with a background in mathematics.

LLMs for Social Networks: Applications, Challenges and Solutions

Bojan Babic, Nextdoor.


Over the last couple of years, we have witnessed an explosion of Generative AI research and applications that are simultaneously transforming how companies operate internally and how they communicate with their customers.

In this talk, we will present the work of the Nextdoor GenAI team and its LLM applications in social networks, in areas such as Knowledge tasks, Engagement tasks and Governance. We will cover what we have tried, what works and what does not work. We will also present the framework that helped us iterate fast and systematically improve each of the product areas.


Bojan Babic is currently working on various Generative AI problems at the social media platform Nextdoor. Before this position, he worked on search/information retrieval, ads and recommendations, with applications spanning from e-commerce to social media.

Building Knowledge Graph for Products at Scale and infusing it into LLMs

Dr. Manoj Agarwal, Senior Staff Engineer in the Discovery Intelligence team at Uber AI.


A knowledge graph is the key to entity search, as it can store factual, entity-related information in a structured manner without the rigidity of a fixed schema. Both Google and Bing have web-scale knowledge graphs, and the knowledge graph is invoked for a large fraction of user queries. E-commerce search is primarily an entity search. Therefore, building a knowledge graph is key to improving e-commerce search in many ways. However, building it at web scale is a highly challenging problem. It is an equally or even more challenging problem to build the knowledge graph for products. In this talk, we present our methodology to build the knowledge graph for products at web scale. With the recent success of LLMs, can we infuse such semantic understanding of the world, encoded in the form of a knowledge graph, into LLMs? There are some advances in this direction; however, it remains an open question whether knowledge graphs can be replaced by LLMs.


Dr. Manoj Agarwal is a Senior Staff Engineer in the Discovery Intelligence team at Uber AI. Before Uber, he was a Principal Applied Scientist at Microsoft – AI and Research and a senior researcher at IBM Research. Manoj was the chief architect for building a web-scale product knowledge graph for Microsoft – Shopping, comprising a few hundred million products and a few billion facts with high accuracy. Currently, he is engaged in efforts to build a scalable knowledge graph and to discover a taxonomy to improve semantic search and recommendations for Uber Delivery. His research interests are in the areas of web mining, graph mining, pattern recognition, data mining, knowledge graphs, LLMs and information retrieval. He has more than 30 patents and over 25 research papers in reputed journals and conferences.