Topic Expansion

(Please enter the term which you want to expand in the above text box.)

Execution Speed:
(Fast_0 is the fastest. Fast_2 is the slowest)
(Note: as the speed decreases, the quality of the results increases.)

Sense Distinction:
(The default sense distinction is "Similar & Unrelated Senses". To choose another value, refer to the overlap value given in brackets. The overlap value is the maximum overlap allowed between the senses of two clusters. Note that as the senses become more distinct, the execution time increases.)
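As a rough illustration of what a maximum overlap between sense clusters could mean, here is a minimal sketch. It is not the algorithm running behind this page. The overlap measure (shared terms divided by the size of the smaller cluster), the function names, and the greedy merging strategy are all assumptions made for the example. The toy clusters are loosely based on the Ganga example, with Haridwar placed in two clusters purely for illustration.

```python
def overlap(a, b):
    """Fraction of the smaller cluster that also appears in the other one.

    This particular measure is an assumption, not the tool's definition.
    """
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def merge_until_distinct(clusters, max_overlap):
    """Greedily merge clusters until no pair overlaps by more than max_overlap."""
    clusters = [set(c) for c in clusters]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if overlap(clusters[i], clusters[j]) > max_overlap:
                    # Fold cluster j into cluster i and rescan from the start.
                    clusters[i] |= clusters.pop(j)
                    merged = True
                    break
            if merged:
                break
    return clusters


# Hypothetical toy clusters for the term "Ganga".
river = {"Himalaya", "Haridwar", "floods", "irrigation"}
holy_river = {"Shiva", "Haridwar", "Bhagiratha", "Kailash"}
dynasty = {"Karnataka", "Talakadu", "Pallava", "Kaveri River"}

print(len(merge_until_distinct([river, holy_river, dynasty], 0.5)))
print(len(merge_until_distinct([river, holy_river, dynasty], 0.2)))
```

Under these assumptions, a lower maximum overlap forces the two river-related clusters to merge, while a higher one lets them stay separate as similar senses.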

Corpus to use:
(Please note that the results displayed will be generated on the corpus mentioned above.)

What is topic expansion?

Topic Expansion is the process of describing the topic of a concept using other concepts. A term can have multiple senses and hence can represent multiple concepts. These concepts may be related to each other to varying degrees. Depending on the sense in use, the other concepts used to describe it will change. For example, take the concept Ganga. We can talk of Ganga as a north Indian river. Then Gomukh, Uttarakhand, Uttar Pradesh, Bihar, river, Tehri, irrigation, Himalaya, dam, India, floods, Hoogli, and Bay of Bengal are appropriate topics to describe it. However, when we talk of Ganga as the holy Indian river and God, we may describe it in terms of Shiva, Kailash, Vishnu, Bhagiratha, Sagara, Kapila, eight Vasus, Bhishma, Mahabharat, and Haridwar. Finally, if Ganga is talked about in the sense of a south Indian dynasty, it can be described in terms of Karnataka, Western Ganga, Talakadu, Kaveri River, Pallava, and so on.

We observe here that the same term, Ganga, can represent three different concepts: Ganga, the river; Ganga, the holy river; and Ganga, the dynasty. Even though the first two look similar, they are two different, atomic concepts in their own right. The similarity between them is due to the aboutness that exists between the two concepts. Another example makes this clearer. Consider the term Java. It can be expanded as Java, Programming language, Object, Inheritance, Interface. The same term can also be expanded as Java, Sumatra, Indonesia, Javanese, Bali. The first expansion represents the concept Java, the programming language, and the second represents Java, the island in Indonesia. Here, the two concepts are less about each other than in the case of Ganga. We should not confuse two concepts that are related, because of their high aboutness values, with two concepts that are actually the same.
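The idea that each sense of a term comes with its own expansion can be sketched as a small lookup. This is only an illustration, not the method used by this tool. The sense labels and expansion terms are taken from the Java example above, while the `pick_sense` function and its rule (choose the sense whose expansion shares the most terms with the context) are assumptions.

```python
# Expansions for the term "Java", taken from the example in the text.
# The term itself is left out of each set.
EXPANSIONS = {
    "Java, the programming language": {
        "Programming language", "Object", "Inheritance", "Interface",
    },
    "Java, the island in Indonesia": {
        "Sumatra", "Indonesia", "Javanese", "Bali",
    },
}


def pick_sense(context_terms, expansions=EXPANSIONS):
    """Return the sense whose expansion shares the most terms with the context.

    Hypothetical helper: on a tie, the sense listed first wins.
    """
    context = set(context_terms)
    return max(expansions, key=lambda sense: len(expansions[sense] & context))


print(pick_sense(["Bali", "Indonesia", "travel"]))
print(pick_sense(["Interface", "Inheritance"]))
```

A context mentioning Bali and Indonesia selects the island sense, and a context mentioning Interface and Inheritance selects the programming language sense.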

The algorithm running in the background here is only an approximation. The actual implementation yields slightly better results but is much slower (a query takes around half an hour to 2 hours). We have reduced the execution time by approximating some of the calculations.

The following are related publications:

Rachakonda, Aditya Ramana, Srinath Srinivasa, Sumant Kulkarni, and M. S. Srinivasan. "A generic framework and methodology for extracting semantics from co-occurrences." Data & Knowledge Engineering 92 (2014): 39-59.

Kulkarni, Sumant, Srinath Srinivasa, and Rajeev Arora. "Cognitive Modeling for Topic Expansion." On the Move to Meaningful Internet Systems: OTM 2013 Conferences. Springer Berlin Heidelberg, 2013.

Please share your feedback at