Matrix-Query: A distributed SQL-like query processing model for large database clusters

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Along with the development of distributed computation and the rapid growth of data, scientific research increasingly requires the support of a high-efficiency relational data processing framework. Based on the characteristics of scientific data, such as bulk inserts and infrequent changes, this paper proposes a streaming processing model called Matrix-Query, with a matching data storage architecture for relational queries. By transforming the original relational schema into entities and key-value indexing, the data storage solution enables more localized operations and data positioning. Compared to the traditional Map-Reduce model, Matrix-Query isolates subtasks from one another's influence to ensure execution in a streaming and parallel manner, and reduces the negative impact of writing intermediate files. We also optimize the data structure and subtask management to improve the performance of Matrix-Query. The experimental results demonstrate the performance advantage of Matrix-Query over two well-known data processing systems, Hive and HadoopDB, which are built on top of the Map-Reduce model.

Original language: English
Title of host publication: Proceedings - 2013 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, CyberC 2013
Publisher: IEEE Computer Society
Pages: 179-185
Number of pages: 7
ISBN (Print): 9780768551067
State: Published - 2013
Event: 2013 5th International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, CyberC 2013 - Beijing, China
Duration: 10 Oct 2013 to 12 Oct 2013

Publication series

Name: Proceedings - 2013 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, CyberC 2013

Conference

Conference: 2013 5th International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, CyberC 2013
Country/Territory: China
City: Beijing
Period: 10/10/13 to 12/10/13

Keywords

  • Distributed computation
  • Relational query processing model
  • SQL
