Co-creation of a Centre of Excellence in Big Data Engineering

This project aims to establish a collaboration between the International Institute of Information Technology Bangalore (IIIT-B), City University London as the UK partner, and Siemens Research, India, as the industry partner, to set up a Centre of Excellence in Big Data Engineering. With emerging trends like Web Science and the Internet of Things, expertise in Big Data is expected to be in high demand. As part of its initiatives to create a talent pool of research and engineering expertise, IIIT-B has collaborated with several partners in this area on specific projects. This project consolidates those disparate activities into a single Centre of Excellence. The term “Big Data” is defined here to mean any kind of data management problem for which conventional RDBMS-based solutions are inadequate. The “Big” refers not just to the volume of data, but also to challenges concerning the variety, veracity and velocity of the data.

Collaborators:

  • Dr. Srinath Srinivasa, IIIT Bangalore
  • Prof. Muttukrishnan Rajarajan, City University, Northampton Square, London, United Kingdom
  • Dr. Amarnath Bose, Siemens Technology and Service, Bangalore
  • Praseeda, Research Scholar, IIIT Bangalore
  • Anish, MTech Thesis Student, IIIT Bangalore

Events

  • A two-day workshop on Big Data on April 18th and 19th at IIIT Bangalore

Associated Project

Reports

SANDESH

 

Sandesh is a Semantic Data Mesh for publishing knowledge aggregated from Indian open data. Open structured data is published by several agencies, such as the World Health Organization (WHO), the United Nations Organization (UNO), private firms, NGOs and governmental bodies. The Government of India publishes open data on its data portal, data.gov.in. To aggregate and integrate data from disparate datasets, a framework called Many Worlds on a Frame (MWF) is proposed. The framework is partially implemented in software called RootSet, on top of which the Sandesh module is implemented.

A Framework for Observing and Characterizing Social Cognition on the Web

  • Project Members
  • Description: Our understanding of the web has evolved from that of a passive repository of hypertext documents to an active, participatory, socio-cognitive space. People are not users of the web, but “participants” – the web uses people as much as people use the web. One of the major objectives of web science research is to understand how this socio-cognitive space is affecting individuals and shaping society. As part of this endeavor, we propose an abstract framework to model social interactions on the web, as part of the IIIT Bangalore Web Observatory. The model, called the STI model, divides the Web into three distinct regions: the social region, the trigger region and the inert region. Different kinds of analysis techniques are proposed for each of the regions.
  • Current progress: Analysis on the social region by exploring three prominent social machines: Facebook, Twitter and Reddit.
    • Analysis on Facebook conversations: Conversations, each comprising a post and its replies, are collected from public pages and groups on Facebook, and the average number of replies per conversation is calculated. The average sentiment and sentiment score are calculated using a sentiment analyzer (implemented using NLTK). Going forward, emoticons will also be incorporated into the sentiment calculation.
    • Analysis on Twitter: Based on a search topic, the top 15 tweets are retrieved using Twitter’s API and merged with their replies. The whole dataset is then analyzed on the R platform, and the average replies, tweet sentiment, reply sentiment and average reply sentiment are computed, stored in MongoDB, and returned to the user as a JSON response when the service is called.
    • Analysis on Reddit conversations: The top post corresponding to a particular search is fetched using the Snoowrap Reddit API, and then all the conversations pertaining to that post are retrieved. The average sentiment and sentiment score are calculated using a sentiment analyzer.
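
The per-conversation statistics described above (average replies, post sentiment and average reply sentiment) can be illustrated with a minimal Python sketch. This is not the project's implementation: the tiny LEXICON and the sentiment_score and summarize functions are hypothetical stand-ins for the NLTK- and R-based analyzers mentioned above, and the conversation layout (a "post" string with a list of "replies") is assumed for illustration only.

```python
from statistics import mean

# Hypothetical toy lexicon; a real analyzer (e.g. one built with NLTK) would replace this.
LEXICON = {"good": 1.0, "great": 1.0, "love": 1.0,
           "bad": -1.0, "awful": -1.0, "hate": -1.0}


def sentiment_score(text):
    """Mean polarity of the lexicon words found in the text (0.0 if none are found)."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    hits = [LEXICON[w] for w in words if w in LEXICON]
    return mean(hits) if hits else 0.0


def summarize(conversations):
    """Aggregate reply counts and sentiment over a non-empty list of conversations."""
    reply_counts = [len(c["replies"]) for c in conversations]
    post_scores = [sentiment_score(c["post"]) for c in conversations]
    reply_scores = [sentiment_score(r) for c in conversations for r in c["replies"]]
    return {
        "average_replies": mean(reply_counts),
        "average_post_sentiment": mean(post_scores),
        "average_reply_sentiment": mean(reply_scores) if reply_scores else 0.0,
    }


# Made-up sample data, assumed layout: one post plus its replies per conversation.
sample = [
    {"post": "I love this park",
     "replies": ["Great place!", "Love it", "Parking is bad"]},
    {"post": "Traffic update for today",
     "replies": ["Thanks for sharing"]},
]

if __name__ == "__main__":
    print(summarize(sample))
```

The returned dictionary can be serialized directly as a JSON response or stored as a document, which mirrors the storage and service steps described for the Twitter analysis.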

REACH

Sponsors and Collaborators: Horizon 2020, European Commission

Time Frame: Jan 2016 – Dec 2020

Status: Active

The REACH project aims to develop a solution for providing high-speed Internet access in rural India using unlicensed TV white space spectrum, and to design the geolocation database for it. With the rapid growth of India's population and Internet use, efficient utilization and management of spectrum is needed. TV white space spectrum is emerging as a strong alternative for meeting this need, since many channels in the TV spectrum are left unused by the migration from analog to digital transmission technology.

At IIIT-B, we are working on Distributed Algorithms for Spectrum Assignment for White Space Devices. Spectrum assignment for devices in white-space spectrum is challenging because the spectrum varies over time and space and is most often fragmented. We have created an autonomous agent model for spectrum assignment of white space devices at a given location. Each white space device (WSD) acts autonomously out of self-interest, choosing a strategy from its bag of strategies. It obtains a payoff based on its own choice and the choices made by all other agents. WSDs interact with each other using a central shared memory located at a “Master” device. Based on the payoffs received by different strategies, WSDs evolve their strategic profile over time. This has the effect of “demographic changes” in the population. The system is said to have reached a state of equilibrium (that is, a state of evolutionary best response) when the demographic profile stabilises. The system is trained on different load profiles to compute their respective evolutionary best responses.
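
The agent model described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the algorithm from the thesis or paper listed below: the strategies are taken to be channel choices, the payoff is assumed to be an equal share of the chosen channel, the learning rule is simple reinforcement, and the names Master, Device and simulate, along with all parameter values, are hypothetical.

```python
import random
from collections import Counter


class Master:
    """Central shared memory: stores the strategy each device posted this round."""

    def __init__(self):
        self.choices = {}

    def post(self, device_id, strategy):
        self.choices[device_id] = strategy

    def load(self):
        """Number of devices currently using each strategy."""
        return Counter(self.choices.values())


class Device:
    """A white space device holding a weighted bag of strategies."""

    def __init__(self, device_id, strategies, rng):
        self.device_id = device_id
        self.weights = {s: 1.0 for s in strategies}
        self.rng = rng

    def choose(self):
        options = list(self.weights)
        return self.rng.choices(options, weights=[self.weights[s] for s in options])[0]

    def learn(self, strategy, payoff, rate=0.5):
        # Assumed rule: reinforce the chosen strategy in proportion to its payoff.
        self.weights[strategy] += rate * payoff


def simulate(n_devices=12, channels=("ch1", "ch2", "ch3"),
             max_rounds=200, tol=0.05, window=10, seed=0):
    """Run rounds until the share of devices per channel stays within tol for `window` rounds."""
    rng = random.Random(seed)
    master = Master()
    devices = [Device(i, channels, rng) for i in range(n_devices)]
    previous, stable, rounds_used = None, 0, 0
    for rounds_used in range(1, max_rounds + 1):
        picks = {d.device_id: d.choose() for d in devices}
        for device_id, strategy in picks.items():
            master.post(device_id, strategy)
        load = master.load()
        for d in devices:
            chosen = picks[d.device_id]
            d.learn(chosen, 1.0 / load[chosen])  # assumed payoff: equal share of the channel
        current = [load[c] / n_devices for c in channels]
        if previous is not None and max(abs(a - b) for a, b in zip(current, previous)) <= tol:
            stable += 1
        else:
            stable = 0
        previous = current
        if stable >= window:
            break
    return current, rounds_used


if __name__ == "__main__":
    profile, rounds_used = simulate()
    print(profile, rounds_used)
```

Here the observed share of devices on each channel serves as a rough proxy for the “demographic profile”. Running simulate with different values of n_devices or channels corresponds loosely to training on different load profiles.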

Project Outcomes

  1. Chaitali Diwan. Autonomous Spectrum Assignment of White Space devices. MTech thesis. June 2016.
  2. Chaitali Diwan, Srinath Srinivasa, Bala Murali Krishna. Autonomous Spectrum Assignment of White Space Devices. Proceedings of the 12th EAI International Conference on Cognitive Radio Oriented Wireless Networks. Lisbon, Portugal. September 2017.
  3. Simulation dashboard for autonomous spectrum allocation algorithm

Other Relevant Links

 

An Open Architecture for Smart Cities

Sponsor and Collaborator: Siemens India

Time Frame: May 2015 – Apr 2016

Status: Active

Smart cities are an upcoming area of growth in the region and pose a wide range of technical challenges in wide-area distributed sensing and processing. A significant part of these challenges can be addressed by leveraging emerging fog computing technologies, in which edge devices not only collect data and provide control signals but also perform local optimizations based on global optimization requirements. However, the rapid growth of smart campuses and cities requires applications, and these applications may run on different devices, software, middleware and so on. This heterogeneity currently makes it very difficult to integrate with existing systems and to collect data. Hence, one needs to develop a platform that is agnostic to this complex, heterogeneous environment.

This provides an opportunity for both Siemens and IIIT Bangalore to create an open platform to share data and expose relevant APIs using which smart applications can be built by anyone.

Establishing a Web Science Research Centre and a Web Observatory in India

Time Frame: June 2015 – May 2016

Sponsor: UK Royal Academy of Engineering (Newton Research Collaboration Programme)

Collaborator: University of Southampton, UK

Status: Active

The Web Science Trust (WST) is a charitable body, originally founded as a collaboration between the University of Southampton and the Massachusetts Institute of Technology, Boston. Its aims include articulating a global research agenda for studying the growth and impact of the World Wide Web on humanity in general. An important element of this is the Web Observatory (WO) Project, which aims to create a global grid of datasets pertaining to the Web and its use, coming from different parts of the world. This proposal aims to extend the WO grid to India, beginning with IIIT Bangalore as the collaboration partner. This collaboration also aims to nurture research excellence in the Web Sciences in India. Given that India constitutes the third-largest, and one of the fastest-growing, bases of Web users in the world, the proposed collaboration is critical to meeting a pressing need that is likely to emerge over the years.

Publications

Aastha Madaan, Thanassis Tiropanis, Srinath Srinivasa, Wendy Hall. Observlets: Empowering Analytical Observations on Web Observatory. WWW’16 Companion Volume, April 11–15, 2016, Montréal, Québec, Canada.