KG quality control is important to the utility of KGs.
A knowledge graph (KG), a special form of semantic network, integrates fragmentary data into a graph to support knowledge processing and reasoning.
We study sequential dependencies that express the semantics of data with ordered domains and help identify quality problems with such data. Given an interval g, we write X →g Y to denote that the difference between the Y-attribute values of any two consecutive records, when sorted on X, must be in g. For example, time →(0,∞) sequence number indicates that sequence numbers are strictly increasing over time, whereas sequence number →(4,5) time means that the time "gaps" between consecutive sequence numbers are between 4 and 5. Sequential dependencies express relationships between ordered attributes, and identify missing (gaps too large), extraneous (gaps too small) and out-of-order data. To make sequential dependencies applicable to real-world data, we relax their requirements and allow them to hold approximately (with some exceptions) and conditionally (on various subsets of the data). This paper proposes the notion of conditional approximate sequential dependencies and provides an efficient framework for discovering pattern tableaux, which are compact representations of the subsets of the data (i.e., ranges of values of the ordered attributes) that satisfy the underlying dependency. We present analyses of our proposed algorithms, and experiments on real data demonstrating the efficiency and utility of our framework.
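As an illustration of the definition of X →g Y, the following minimal sketch checks an exact (not approximate or conditional) sequential dependency over a small list of records. The function name `gap_violations`, the record layout, and the `lo_open`/`hi_open` flags are hypothetical choices made for this example and do not come from the paper; the interval is treated as open by default, which is an assumption based on the parenthesized notation used in the text.

```python
from math import inf


def gap_violations(records, x, y, lo, hi, lo_open=True, hi_open=True):
    """Return (prev, curr, gap) for consecutive records, sorted on x,
    whose y-gap falls outside the interval between lo and hi.

    Hypothetical helper: open endpoints by default (an assumption).
    """
    # Sort on the X attribute; Python's sort is stable, so ties keep input order.
    ordered = sorted(records, key=lambda r: r[x])
    violations = []
    for prev, curr in zip(ordered, ordered[1:]):
        gap = curr[y] - prev[y]
        above_lo = gap > lo if lo_open else gap >= lo
        below_hi = gap < hi if hi_open else gap <= hi
        if not (above_lo and below_hi):
            violations.append((prev, curr, gap))
    return violations


# Made-up sample data: the third record repeats a sequence number.
records = [
    {"time": 1, "seq": 10},
    {"time": 2, "seq": 11},
    {"time": 3, "seq": 11},
    {"time": 4, "seq": 15},
]

# time ->(0, inf) seq: sequence numbers must strictly increase over time.
violations = gap_violations(records, "time", "seq", 0, inf)
for prev, curr, gap in violations:
    print(f"gap {gap} between {prev} and {curr}")
```

Under this reading, a gap at or below the lower bound indicates extraneous or out-of-order data, and a gap at or above the upper bound indicates missing data. The approximate and conditional variants described in the text would need additional machinery that this sketch does not attempt.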