Degree Type

Thesis

Date of Award

2018

Degree Name

Master of Science

Department

Computer Science

Major

Computer Science

First Advisor

Pavan Aduri

Abstract

Inverted index has been extensively used in Information retrieval systems for document relatedqueries. We consider the generic case of graph storage using Inverted Index and compressing themusing this format. Graph compression by reordering has been done using traversal and clusteringbased techniques. In generic methods, the graph is reordered to arrive at new identifiers for thevertices. The reordered graph is then encoded using an encoding format. The reordering to achievemaximal compression is a well known NP complete problem, the Optimal Linear Arrangement.

Our work focuses on the inverted index format, where each node has its corresponding list ofneighbours. We propose a heuristic based graph reordering, using the property that the cost ofeach vertex is bound by its neighbour with largest vertex id. Consider, two vertices x and y withedges a and b respectively. If x>y and a>b, then cost of graph would come down, if the vertex idof x and y are interchanged. Further, experiments shows that using this heuristic helps in achievingcompression rates on par with distributed methods but with reduced utilization of computationresources

Copyright Owner

Pavithra Rajarathinam

Language

en

File Format

application/pdf

File Size

30 pages

Share

COinS