Capturing Expert Knowledge to Guide Data Flow and Structure
Analysis of Large Corporate Databases
Gergő Balogh, Tamás Gergely, Árpád
Beszédes, Attila Szarka and Zoltán Fábián
Maintaining and improving existing large-scale systems based on
relational databases has proven to be a challenging task. Among
many other aspects, it is crucial to develop actionable methods
for estimating the cost and duration of implementing new feature
requirements, a frequent activity during the evolution of large
database systems and data warehouses. This requires the
simultaneous analysis of program code, data structures, and
business-level objectives, which is a daunting task when performed
manually by experts. Our industrial partner started to develop a
static database analysis software package that would automate and
ease this process in order to enable more accurate estimates. The
goal of this work was
to create a quality assessment model that can effectively help
developers to assess the data flow (lineage) quality and the
database structure quality of data warehouse (DWH) and online
transaction processing (OLTP) database systems. Based on the
relevant literature, we created different models for these two
interconnected topics, which were then evaluated by independent
developers. The evaluation showed that the models are suitable for
implementation; they are now included in a commercial product
developed by our industrial partner, Clarity.
Keywords: database
systems; data warehouses; cost estimation; software quality
models; data flow; database structure; data lineage.