Campus Units
Electrical and Computer Engineering
Document Type
Article
Publication Version
Accepted Manuscript
Publication Date
2017
Journal or Book Title
IEEE Transactions on Services Computing
Volume
10
Issue
5
First Page
771
Last Page
784
DOI
10.1109/TSC.2016.2523997
Abstract
We consider the enumeration of maximal bipartite cliques (bicliques) from a large graph, a task central to many data mining problems arising in social network analysis and bioinformatics. We present novel parallel algorithms for the MapReduce framework, and an experimental evaluation using Hadoop MapReduce. Our algorithm is based on clustering the input graph into smaller subgraphs, followed by processing different subgraphs in parallel. Our algorithm uses two ideas that enable it to scale to large graphs: (1) the redundancy in work between different subgraph explorations is minimized through a careful pruning of the search space, and (2) the load on different reducers is balanced through a task assignment that is based on an appropriate total order among the vertices. We show theoretically that our algorithm is work optimal, i.e., it performs the same total work as its sequential counterpart. We present a detailed evaluation which shows that the algorithm scales to large graphs with millions of edges and tens of millions of maximal bicliques. To our knowledge, this is the first work on maximal biclique enumeration for graphs of this scale.
Rights
© 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Copyright Owner
IEEE
Copyright Date
2017
Language
en
File Format
application/pdf
Recommended Citation
Mukherjee, Arko Provo and Tirthapura, Srikanta, "Enumerating Maximal Bicliques from a Large Graph Using MapReduce" (2017). Electrical and Computer Engineering Publications. 150.
https://lib.dr.iastate.edu/ece_pubs/150
Comments
This is a manuscript of an article published as Mukherjee, Arko, and Srikanta Tirthapura. "Enumerating maximal bicliques from a large graph using mapreduce." IEEE Transactions on Services Computing, Volume: 10, Issue: 5, Page(s): 771 - 784, Sept.-Oct. 1 2017. 10.1109/TSC.2016.2523997. Posted with permission.