TY - GEN
T1 - An XML data placement strategy for distributed XML storage and parallel query
AU - Zhang, Jing
AU - Lang, Bo
AU - Duan, Yawei
PY - 2011
Y1 - 2011
N2 - As a large number of XML documents have been generated in various application domains, efficient XML management has become an important problem. Distributed XML storage and parallel query based on MapReduce can be an effective solution to this problem. Because the XML data placement strategy is a key factor in parallel system performance, in this paper we present the Query Workload Estimation based XML Placement strategy (QWEXP) for efficient distributed XML storage and parallel query. To balance query workload, QWEXP partitions XML based on a query workload estimate that is calculated from the XML structure without knowledge of user queries, since in common application scenarios user queries are unknown in advance. The partitioned XML segments are sized around an XML storage unit W0 to support the scalability of the parallel XML database. Finally, the segments are distributed evenly across the processing nodes to ensure workload balance during parallel query execution. Experimental results show that QWEXP greatly improves the speedup and scaleup properties of the parallel XML system.
AB - As a large number of XML documents have been generated in various application domains, efficient XML management has become an important problem. Distributed XML storage and parallel query based on MapReduce can be an effective solution to this problem. Because the XML data placement strategy is a key factor in parallel system performance, in this paper we present the Query Workload Estimation based XML Placement strategy (QWEXP) for efficient distributed XML storage and parallel query. To balance query workload, QWEXP partitions XML based on a query workload estimate that is calculated from the XML structure without knowledge of user queries, since in common application scenarios user queries are unknown in advance. The partitioned XML segments are sized around an XML storage unit W0 to support the scalability of the parallel XML database. Finally, the segments are distributed evenly across the processing nodes to ensure workload balance during parallel query execution. Experimental results show that QWEXP greatly improves the speedup and scaleup properties of the parallel XML system.
KW - Distributed XML storage
KW - Parallel XML query
KW - XML data placement
UR - https://www.scopus.com/pages/publications/84863069752
U2 - 10.1109/PDCAT.2011.19
DO - 10.1109/PDCAT.2011.19
M3 - Conference contribution
AN - SCOPUS:84863069752
SN - 9780769545646
T3 - Parallel and Distributed Computing, Applications and Technologies, PDCAT Proceedings
SP - 433
EP - 439
BT - Proceedings - 2011 12th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2011
T2 - 2011 12th International Conference on Parallel and Distributed Computing, Applications and Technologies, PDCAT 2011
Y2 - 20 October 2011 through 22 October 2011
ER -