TY - GEN
T1 - Extending dependencies with conditions
AU - Bravo, Loreto
AU - Fan, Wenfei
AU - Ma, Shuai
N1 - Publisher Copyright:
Copyright 2007 VLDB Endowment, ACM.
PY - 2007
Y1 - 2007
N2 - This paper introduces a class of conditional inclusion dependencies (CINDs), which extends traditional inclusion dependencies (INDs) by enforcing bindings of semantically related data values. We show that CINDs are useful not only in data cleaning, but are also in contextual schema matching [7]. To make effective use of CINDs in practice, it is often necessary to reason about them. The most important static analysis issue concerns consistency, to determine whether or not a given set of CINDs has conflicts. Another issue concerns implication, i.e., deciding whether a set of CINDs entails another CIND. We give a full treatment of the static analyses of CINDs, and show that CINDs retain most nice properties of traditional INDs: (a) CINDs are always consistent; (b) CINDs are finitely axiomatizable, i.e., there exists a sound and complete inference system for implication of CINDs; and (c) the implication problem for CINDs has the same complexity as its traditional counterpart, namely, PSPACE-complete, in the absence of attributes with a finite domain; but it is EXPTIME-complete in the general setting. In addition, we investigate the interaction between CINDs and conditional functional dependencies (CFDs), an extension of functional dependencies proposed in [9]. We show that the consistency problem for the combination of CINDs and CFDs becomes undecidable. In light of the undecidability, we provide heuristic algorithms for the consistency analysis of CFDs and CINDs, and experimentally verify the effectiveness and efficiency of our algorithms.
AB - This paper introduces a class of conditional inclusion dependencies (CINDs), which extends traditional inclusion dependencies (INDs) by enforcing bindings of semantically related data values. We show that CINDs are useful not only in data cleaning, but are also in contextual schema matching [7]. To make effective use of CINDs in practice, it is often necessary to reason about them. The most important static analysis issue concerns consistency, to determine whether or not a given set of CINDs has conflicts. Another issue concerns implication, i.e., deciding whether a set of CINDs entails another CIND. We give a full treatment of the static analyses of CINDs, and show that CINDs retain most nice properties of traditional INDs: (a) CINDs are always consistent; (b) CINDs are finitely axiomatizable, i.e., there exists a sound and complete inference system for implication of CINDs; and (c) the implication problem for CINDs has the same complexity as its traditional counterpart, namely, PSPACE-complete, in the absence of attributes with a finite domain; but it is EXPTIME-complete in the general setting. In addition, we investigate the interaction between CINDs and conditional functional dependencies (CFDs), an extension of functional dependencies proposed in [9]. We show that the consistency problem for the combination of CINDs and CFDs becomes undecidable. In light of the undecidability, we provide heuristic algorithms for the consistency analysis of CFDs and CINDs, and experimentally verify the effectiveness and efficiency of our algorithms.
UR - https://www.scopus.com/pages/publications/85011024333
M3 - 会议稿件
AN - SCOPUS:85011024333
T3 - 33rd International Conference on Very Large Data Bases, VLDB 2007 - Conference Proceedings
SP - 243
EP - 254
BT - 33rd International Conference on Very Large Data Bases, VLDB 2007 - Conference Proceedings
A2 - Gehrke, Johannes
A2 - Koch, Christoph
A2 - Garofalakis, Minos
A2 - Aberer, Karl
A2 - Kanne, Carl-Christian
A2 - Neuhold, Erich J.
A2 - Ganti, Venkatesh
A2 - Klas, Wolfgang
A2 - Chan, Chee-Yong
A2 - Srivastava, Divesh
A2 - Florescu, Dana
A2 - Deshpande, Anand
PB - Association for Computing Machinery, Inc
T2 - 33rd International Conference on Very Large Data Bases, VLDB 2007
Y2 - 23 September 2007 through 27 September 2007
ER -