A two-stage strategy for solving the connection subgraph problem

Thumbnail Image
Date
2012-01-01
Authors
Wang, Heyong
Major Professor
Advisor
Carl Kochao Chang
Committee Member
Journal Title
Journal ISSN
Volume Title
Publisher
Altmetrics
Authors
Research Projects
Organizational Units
Organizational Unit
Journal Issue
Is Version Of
Versions
Series
Department
Computer Science
Abstract

A connection subgraph is a small subgraph of a large graph that best capture the relationship between two nodes. Formally, Connection Subgraph Problem is:

Given: An edge-weighted undirected graph G, two query vertices s and t from G, the size of the subgraph b, and a "goodness" function.

Find: A connected subgraph H containing s and t and at most b other vertices that maximizes the "goodness" function g( H ).

Two challenging problems for Connection Subgraph Problem are i) Finding an good "goodness" function ii) and designing algorithms to quickly identify the connection subgraph. The goal of this work is to provide a way to explore and discovery relationship between nodes on graphs, especially for social networks. We focus on connection subgraph problem on the graphs where the relationship between nodes is from all the contributions of their paths. In this research,

we propose a measure called the Path Betweenness of a path to measure the contribution of a path between two nodes to these two nodes' relationship. Based on path betweenness, We introduce graph between of a graph as the "goodness" function for the Connection Subgraph Problem. Graph betweenness of a graph takes contributions of all paths in the graph into consideration. In

order to quickly find the connection subgraph from a large graph, we propose a two-stage solution. The two-stage strategy first generates a small enough intermediate subgraph by eliminating the unimportant nodes in the graph. In the second stage, a candidate subgraph generation algorithm and the graph betweenness computational algorithm are designed to find the connection subgraph. We implemented a system to conduct experiments on real data. Our experimental results show the two-stage strategy can help us quickly find the connection subgraph in practice.

Comments
Description
Keywords
Citation
Source
Subject Categories
Copyright
Sun Jan 01 00:00:00 UTC 2012