Sandesh is a Semantic Data Mesh for publishing knowledge aggregated from Indian open data. Open structured data is published by many agencies, including the World Health Organization (WHO), the United Nations Organization (UNO), private firms, NGOs and governmental bodies. The Government of India publishes open data on its data portal, data.gov.in. To aggregate and integrate data from disparate datasets, a framework called Many Worlds on a Frame (MWF) is proposed. The framework is partially implemented in software called RootSet, on top of which the Sandesh module is built.
- Project Members
- Description: Our understanding of the web has evolved from that of a passive repository of hypertext documents to an active, participatory, socio-cognitive space. People are not users of the web, but “participants”: the web uses people as much as people use the web. One of the major objectives of web science research is to understand how this socio-cognitive space is affecting individuals and shaping society. As a part of this endeavor, we propose an abstract framework to model social interactions on the web, as part of the IIIT Bangalore Web Observatory. The model, called the STI model, divides the web into three distinct regions: the social region, the trigger region and the inert region. Different kinds of analysis techniques are proposed for each region.
- Current progress: Analysis of the social region, exploring three prominent social machines: Facebook, Twitter and Reddit.
- Analysis on Facebook conversations: Conversations, each consisting of a post and its replies, are collected from public pages and groups on Facebook, and the average number of replies per conversation is calculated. The average sentiment and the sentiment score are calculated using a sentiment analyzer (implemented using NLTK). Going further, emoticons will also be incorporated into the sentiment calculation.
- Analysis on Twitter: Based on a search topic, the top 15 tweets are retrieved using Twitter’s API and then merged with their replies. The combined data is then analyzed in R: the average number of replies, tweet sentiment, reply sentiment and average reply sentiment are computed, stored in MongoDB, and also returned to the user as a JSON response when the service is called.
- Analysis on Reddit conversations: The top post corresponding to a particular search is fetched using the Snoowrap Reddit API wrapper, and then all the conversations pertaining to that post are retrieved. The average sentiment and the sentiment score are calculated using a sentiment analyzer.
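
The per-platform analyses above share the same basic computation: the average number of replies per conversation and the average sentiment of posts and replies. The following is a minimal Python sketch of that computation, not the project's actual implementation. The conversation layout (`post` and `replies` keys), the function names `toy_sentiment` and `summarize`, and the tiny word lexicon are all assumptions made for illustration; the lexicon scorer is only a stand-in for a real sentiment analyzer such as the NLTK-based one mentioned above, and can be swapped out through the `score` parameter.

```python
import json
import re
from statistics import mean

# Toy sentiment lexicon: an illustrative stand-in, NOT the project's analyzer.
TOY_LEXICON = {
    "good": 1.0, "great": 1.0, "love": 1.0,
    "bad": -1.0, "awful": -1.0, "hate": -1.0,
}


def toy_sentiment(text):
    """Return the mean lexicon score of the words in text, or 0.0 if none match."""
    words = re.findall(r"[a-z']+", text.lower())
    scores = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    return mean(scores) if scores else 0.0


def summarize(conversations, score=toy_sentiment):
    """Compute average replies and average sentiment over conversations.

    Each conversation is assumed to be a dict of the form
    {"post": str, "replies": [str, ...]} (a hypothetical layout).
    """
    if not conversations:
        raise ValueError("no conversations to summarize")
    post_scores = [score(c["post"]) for c in conversations]
    reply_scores = [score(r) for c in conversations for r in c["replies"]]
    return {
        "average_replies": mean(len(c["replies"]) for c in conversations),
        "average_post_sentiment": mean(post_scores),
        # Conversations without any replies contribute nothing here.
        "average_reply_sentiment": mean(reply_scores) if reply_scores else 0.0,
    }


# Made-up sample data, for demonstration only.
sample = [
    {"post": "I love this", "replies": ["great idea", "good point", "bad take"]},
    {"post": "nice weather", "replies": []},
]

# Serialize the summary as JSON, as a service response might do.
print(json.dumps(summarize(sample)))
```

Passing the scorer as a parameter keeps the aggregation independent of the sentiment method, so the same summary could in principle be produced for Facebook, Twitter or Reddit conversations once they are brought into a common shape.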