COLLEY, Derek (2021) Development of a Dynamic Design Framework for Relational Database Performance Optimisation. Doctoral thesis, Staffordshire University.
Thesis_Final_with_Corrections_Submitted.pdf - Submitted Version
Available under License Type All Rights Reserved.
Abstract or description
Relational Database Management Systems (RDBMSs) are advanced software packages responsible for providing storage of, and access to, relational databases: data stores in which data is arranged in schemas of interlinked tables, each table consisting of columns and rows, with each intersection containing a data point.
This project considers the impact that the ever-increasing demand in data volume, velocity and variety, combined with changes in query methodology and uptake of object-relational mapping frameworks driven by modern object-oriented application programming practices, have had upon the effectiveness of the relational database query optimiser; in particular, this research examines the emergence of object-relational impedance mismatch and the corresponding effect on query processing efficiency within the database engine.
Firstly, this research reconsiders the query parsing and caching mechanisms within current RDBMSs and notes their deficiencies in query plan re-use. An alternative mechanism for query representation is presented, representing queries as multidimensional structures which are computable, comparable, and reducible to hashes. It is shown how this representation can be used to improve plan re-use and increase the efficiency of the query optimiser.
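As a rough illustration of the idea of a comparable, hashable query representation, the following Python sketch reduces a query to a few structural dimensions and derives a cache key from them. It is not the representation developed in the thesis: the `QueryShape` class, its three dimensions (tables, columns, predicate shapes), and the choice of SHA-256 and Jaccard similarity are all assumptions made for this example.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryShape:
    """Hypothetical structural view of a query; literal values are excluded."""
    tables: frozenset       # table names referenced
    columns: frozenset      # projected column names
    predicates: frozenset   # (column, operator) pairs, without literals

    def canonical(self) -> str:
        # Sort each dimension so that textual ordering in the SQL is irrelevant.
        parts = (
            ",".join(sorted(self.tables)),
            ",".join(sorted(self.columns)),
            ",".join(f"{col}{op}" for col, op in sorted(self.predicates)),
        )
        return "|".join(parts)

    def plan_key(self) -> str:
        # Reduce the canonical form to a fixed-length hash usable as a cache key.
        return hashlib.sha256(self.canonical().encode("utf-8")).hexdigest()

    def similarity(self, other: "QueryShape") -> float:
        # Jaccard similarity over all structural elements (an assumed metric).
        a = self.tables | self.columns | self.predicates
        b = other.tables | other.columns | other.predicates
        if not a and not b:
            return 1.0
        return len(a & b) / len(a | b)


# Two queries that differ only in clause ordering and literal values
# (e.g. total > 100 versus total > 250) share one shape, hence one key.
q1 = QueryShape(
    tables=frozenset({"orders", "customers"}),
    columns=frozenset({"orders.id", "customers.name"}),
    predicates=frozenset({("orders.total", ">")}),
)
q2 = QueryShape(
    tables=frozenset({"customers", "orders"}),
    columns=frozenset({"customers.name", "orders.id"}),
    predicates=frozenset({("orders.total", ">")}),
)
q3 = QueryShape(
    tables=frozenset({"orders"}),
    columns=frozenset({"orders.id"}),
    predicates=frozenset({("orders.total", ">")}),
)
```

Here `q1` and `q2` produce the same `plan_key()`, so a plan cached for one could in principle be looked up for the other, while `q3` hashes differently but can still be compared with `q1` through `similarity()`.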
Secondly, new multidimensional representations in real-time are demonstrated using weighted k-means clustering with self-adjusting weights and k to predict superior sub-schema selection, including application of queries to an alternative sub-schema of data, reducing resource consumption and improving query execution times. This is validated against a real data set and performance is tested at scale. It was found that use of KNN provided the relational database query optimiser with an increasing degree of accuracy and reliability in query classification, with an improvement in query execution time demonstrated at scale, against lifelike database queries, ranging from 6.2% to 20.6%.
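To make the clustering step more concrete, here is a minimal Python sketch of k-means with per-feature weights, in which a new query vector is assigned to its nearest centroid (standing in for a candidate sub-schema). It only illustrates the general technique: the fixed `weights` and fixed number of centroids, the function names, and the toy data are assumptions for this example, and the self-adjustment of weights and k described in the abstract is not implemented.

```python
import math


def weighted_distance(a, b, weights):
    # Euclidean distance with each feature scaled by its (assumed) weight.
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def assign(point, centroids, weights):
    # Index of the centroid closest to the point under the weighted distance.
    return min(
        range(len(centroids)),
        key=lambda i: weighted_distance(point, centroids[i], weights),
    )


def weighted_kmeans(points, centroids, weights, iterations=10):
    # Plain Lloyd iterations: assign points, then recompute centroid means.
    centroids = [list(c) for c in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[assign(p, centroids, weights)].append(p)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids


# Toy query feature vectors (hypothetical), forming two obvious groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
weights = (1.0, 1.0)
centroids = weighted_kmeans(points, [(0.0, 0.0), (10.0, 10.0)], weights)

# Classify a previously unseen query vector against the learned centroids.
new_query_cluster = assign((1.0, 1.0), centroids, weights)
```

With this toy data the centroids settle at the means of the two groups, and the new vector `(1.0, 1.0)` is assigned to the first cluster.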
Finally, a novel method of dynamic schema redefinition is presented. This process defines, creates and destroys sub-schemas, maps queries to their sub-schema variants, and keeps track of performance metrics, self-adjusting the current library of alternative schema representations available. This is defined theoretically against the backdrop of the relational algebra and ZFC axiomatic set theory.
Item Type: | Thesis (Doctoral) |
---|---|
Faculty: | School of Digital, Technologies and Arts > Computer Science, AI and Robotics |
Depositing User: | Library STORE team |
Date Deposited: | 13 Jul 2022 12:11 |
Last Modified: | 24 Feb 2023 14:04 |
URI: | https://eprints.staffs.ac.uk/id/eprint/7402 |