One of the most prominent trends of our time is the emergence of an information society. The amount of available data is growing rapidly, while we are becoming increasingly dependent on constant access to up-to-date and relevant information. To derive high-quality information from large quantities of textual data, we need efficient methods for structuring and querying the data.
One approach is to use the Vector Space Model (VSM), which represents words and documents as vectors so that the relationships between them can be computed mathematically. Well-known examples of this approach are Latent Semantic Indexing (LSI) and Random Indexing (RI). This study describes the implementation of a VSM-based information retrieval system that uses fragments of parse trees to index data.
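To make the general idea of the VSM concrete, the following is a minimal sketch in Python and not the system described in this study. It assumes plain whitespace-separated words as index terms (the study itself indexes fragments of parse trees), raw term counts as weights, and cosine similarity for ranking. The function names `build_vectors`, `cosine`, and `rank` and the sample documents are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def build_vectors(docs):
    """Map each document to a term-count vector over a shared vocabulary.

    Assumption: terms are lowercase whitespace-separated words, not the
    parse-tree fragments used in the study.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (score, document index) pairs, best match first."""
    vocab, vectors = build_vectors(docs)
    q_counts = Counter(query.lower().split())
    # Query terms outside the vocabulary are ignored.
    q_vec = [q_counts[term] for term in vocab]
    scores = [(cosine(q_vec, vec), i) for i, vec in enumerate(vectors)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))


# Hypothetical sample collection for illustration only.
docs = [
    "parse trees index text",
    "random indexing of words",
    "vector space model for text retrieval",
]
ranking = rank("text retrieval", docs)
```

Here `ranking` lists the documents in descending order of similarity to the query. Methods such as LSI and RI start from the same kind of term-document representation but reduce its dimensionality, which this sketch does not attempt.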
Source: Umeå University
Author: Knutsson, Thomas