Degree Type

Thesis

Date of Award

2008

Degree Name

Master of Science

Department

Computer Science

First Advisor

Shashi K. Gadia

Second Advisor

Leslie Miller

Third Advisor

William J. Gutowski

Abstract

The NC-94 dataset, which contains climate, soil, and crop data for the 30 years from 1971 to 2000 for all counties in the north central United States, is an important resource for the agricultural community. Analyzing the dataset would yield invaluable understanding for farmers, scientists, the public, planners, and policy makers, helping them improve crop practices and yields, undertake scientific studies, and develop policy.

In the parametric model and its query language ParaSQL, the concept of a dimension is built at the level of primitive values. Canonical storage for XML (CanStoreX) is a technology for storing large XML documents, potentially in the terabyte range, in paginated form on disk so that they can be accessed easily and efficiently while requiring only a very small amount of main memory. CanStoreX is used as the back end for storing NC-94 data, hiding the heterogeneity in climate, crop, and soil data so that the user sees a simple view of counties as objects in which the geographical and time dimensions are implicit and taken for granted.

This work has focused on loading the NC-94 database onto the CanStoreX storage platform. The combination of existing parametric query constructs and an efficient storage structure will provide an important tool to researchers who wish to analyze the NC-94 dataset. The process of loading this database has also revealed important inconsistencies in the data, which we have tried to address in order to develop a more consistent view of the dataset. Previously, only climate data was available, and it was stored in an older version of CanStoreX in which XML was stored in text form. The newer binary version of CanStoreX allows readily available tree-like navigation of the paginated XML document. Adding crop and soil data requires a different internal representation in order to achieve a uniform view for users that is on par with the climate data. Further, the internals were adapted to use the version of CanStoreX in which pages are stored in binary rather than text form.

DOI

https://doi.org/10.31274/rtd-180813-16682

Publisher

Digital Repository @ Iowa State University, http://lib.dr.iastate.edu/

Copyright Owner

Niranjan Kumar

Language

en

ProQuest ID

AAI1454645

OCLC Number

268962196

ISBN

9780549685432

File Format

application/pdf

File Size

50 pages
