LDM: Lineage-Aware Data Management in Multi-tier Storage Systems

Date
2019-02-02
Authors
Mishra, Pratik
Somani, Arun
Authors
Person
Somani, Arun
Senior Associate Dean
Organizational Units
Organizational Unit
Electrical and Computer Engineering

The Department of Electrical and Computer Engineering (ECpE) has two focus areas. The Electrical Engineering focus covers control systems, electromagnetics and non-destructive evaluation, microelectronics, and electric power and energy systems, among other fields. The Computer Engineering focus covers software systems, embedded systems, networking, information security, and computer architecture, among other fields.

History
The Department of Electrical Engineering was formed in 1909 when the Department of Physics and Electrical Engineering was divided. In 1985 it was renamed the Department of Electrical Engineering and Computer Engineering, and in 1995 it became the Department of Electrical and Computer Engineering.

Dates of Existence
1909-present

Historical Names

  • Department of Electrical Engineering (1909-1985)
  • Department of Electrical Engineering and Computer Engineering (1985-1995)

Department
Electrical and Computer Engineering
Abstract

We design and develop LDM, a novel data management solution that caters to the needs of applications exhibiting the lineage property, i.e., applications in which current writes are future reads. In this class of applications, slow writes significantly hurt the overall performance of jobs, because current writes determine the fate of subsequent reads. We believe that in a large-scale shared production cluster, the issues associated with data management can be mitigated at a much higher layer in the hierarchy of the I/O path, even before data access requests are made. In contrast to current data management solutions, which are mostly reactive and/or based on heuristics, LDM is both deterministic and proactive. We develop block-graphs, which enable LDM to capture the complete time-based data-task dependency associations and use them to perform life-cycle management through tiering of data blocks. LDM amalgamates information from the entire data center ecosystem, from the application code to the file system mappings and the topology of compute and storage devices, to make oracle-like deterministic data management decisions. In trace-driven experiments, LDM achieves a 29–52% reduction in overall data center workload execution time. Moreover, deploying LDM with extensive pre-processing creates efficient data consumption pipelines, which also significantly reduces write and read delays.
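
To make the block-graph idea concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes each task declares its start time and the blocks it reads and writes, records for every block its writer and its time-ordered readers, and then applies a hypothetical tiering rule. The names `Task`, `build_block_graph`, `choose_tier`, the tier labels, and the `hot_window` threshold are all assumptions introduced here for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    start: int                                   # scheduled start time (arbitrary units)
    reads: list = field(default_factory=list)    # block ids the task consumes
    writes: list = field(default_factory=list)   # block ids the task produces


def build_block_graph(tasks):
    """Map each block id to its writer and the time-ordered tasks that read it."""
    graph = defaultdict(lambda: {"writer": None, "readers": []})
    for task in sorted(tasks, key=lambda t: t.start):
        for block in task.writes:
            graph[block]["writer"] = (task.start, task.name)
        for block in task.reads:
            graph[block]["readers"].append((task.start, task.name))
    return dict(graph)


def choose_tier(graph, block, now, hot_window=10):
    """Hypothetical policy: use the fast tier when the next read is imminent."""
    future = [t for t, _ in graph[block]["readers"] if t >= now]
    if not future:
        return "archive"          # no remaining consumers of this block
    return "fast" if min(future) - now <= hot_window else "slow"


# A three-stage pipeline in which earlier writes become later reads.
tasks = [
    Task("extract", start=0, writes=["b1", "b2"]),
    Task("transform", start=5, reads=["b1"], writes=["b3"]),
    Task("report", start=40, reads=["b2", "b3"]),
]
graph = build_block_graph(tasks)
placement = {b: choose_tier(graph, b, now=0) for b in sorted(graph)}
print(placement)
```

In this toy pipeline, the block consumed shortly after it is written is placed in the fast tier, while blocks whose next read is far in the future start in a slower tier and can be promoted as that read approaches. Because the placement is derived from declared dependencies rather than observed access history, it is decided before any access request is issued, which is the proactive behaviour the abstract describes.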

Comments

This is a post-peer-review, pre-copyedit version of a conference proceeding published as Mishra, Pratik and Arun K. Somani. (2020) "LDM: Lineage-Aware Data Management in Multi-tier Storage Systems." In Arai K., Bhatia R. (eds) Advances in Information and Communication. FICC 2019. Lecture Notes in Networks and Systems, vol. 69. Springer, Cham. The final authenticated version is available online at DOI: 10.1007/978-3-030-12388-8_48. Posted with permission.

Copyright
2020