DBLModeller

This is the project web page for DBLModeller, a tool for extracting models from the database layer of multi-tier systems. As input it requires a SQL dump file (DDL only) and a set of workload measurements (provided as a CSV file). From these it produces a structure model (conforming to the Knowledge Discovery Metamodel) and a workload model (conforming to the Software Metrics Metamodel).

Getting Started

DBLModeller is packaged as a JAR file, which can be downloaded from here; the source code is also on GitHub. Java 8 or higher must be installed. DBLModeller.jar has three options: -logprocess, -wikipedia, and -extract.

The -logprocess and -wikipedia options are used to generate the set of workload measurements (measurements.csv). -logprocess takes an Oracle JDBC log file and an entity name as input. The log file can be obtained by replacing a system's Oracle JDBC driver with a logging-enabled alternative (see here). A sample JDBC log file can be downloaded from here (163MB); use it with the entity name 'MHE504.ORDER_ITEM'.
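Before running -logprocess on a large log, it can help to confirm that the entity name actually occurs in the file. The following is a minimal sketch and not part of DBLModeller: count_entity_mentions is a hypothetical helper, and it assumes only that the log is line-oriented text. It does not parse the JDBC log format.

```python
from typing import Iterable


def count_entity_mentions(lines: Iterable[str], entity: str) -> int:
    """Count the lines that mention `entity`, ignoring case.

    Hypothetical pre-check, not part of DBLModeller. The log is treated as
    plain text; no structure of the JDBC log format is assumed.
    """
    needle = entity.lower()
    return sum(1 for line in lines if needle in line.lower())


# Example use with a file on disk (the path is a placeholder):
#
#     with open("jdbc.log", "r", errors="replace") as log:
#         print(count_entity_mentions(log, "MHE504.ORDER_ITEM"))
```

A count of zero suggests a misspelled entity name or the wrong log file. Because the file object is iterated line by line, the whole log is never held in memory.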

The -wikipedia option takes a CSV file as input, with each line containing four URLs: a monthly page count file, an article title list file for the English language site, a complete Wikipedia database dump for the English language site, and the site statistics file for the English language site. These files will be downloaded and used to produce the workload measurements CSV file. This option also requires grep and bzip2/lbzip2, so a UNIX-like environment should be used (alternatively, install GNU Tools on Windows).
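The input format and tool requirements for -wikipedia can be checked before any downloads start. The following is a minimal sketch and not part of DBLModeller: validate_wikipedia_input and missing_tools are hypothetical helpers, and the sketch assumes that the fields are comma-separated, that there is no header row, and that every URL uses http or https.

```python
import csv
import shutil
from typing import Iterable, List
from urllib.parse import urlparse

# Page counts, title list, database dump, site statistics.
EXPECTED_FIELDS = 4


def validate_wikipedia_input(lines: Iterable[str]) -> List[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    reader = csv.reader(lines)
    for row in reader:
        line_number = reader.line_num
        if not row:
            continue  # skip blank lines
        if len(row) != EXPECTED_FIELDS:
            problems.append(
                f"line {line_number}: expected {EXPECTED_FIELDS} URLs, found {len(row)}"
            )
            continue
        for field in row:
            parsed = urlparse(field.strip())
            # Assumption: only http and https URLs are acceptable.
            if parsed.scheme not in ("http", "https") or not parsed.netloc:
                problems.append(f"line {line_number}: not a URL: {field.strip()!r}")
    return problems


def missing_tools() -> List[str]:
    """List the required command-line tools that are not on PATH."""
    missing = []
    if shutil.which("grep") is None:
        missing.append("grep")
    # Either bzip2 or lbzip2 is sufficient.
    if shutil.which("bzip2") is None and shutil.which("lbzip2") is None:
        missing.append("bzip2 or lbzip2")
    return missing
```

Both functions return lists, so a caller can print every problem at once instead of stopping at the first one.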

The -extract option requires two input files: a SQL schema and the measurements CSV file. The SQL schema should be obtained using Oracle SQL Developer, MySQL Workbench, or a similar tool. From these files two models will be extracted: a KDM-based structure model and an SMM-based workload model. See here and here for example schemas and here for an example CSV file.
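Because DBLModeller expects a DDL-only schema, a quick pre-check can flag dumps that still contain data. The following is a rough sketch and not part of DBLModeller: find_dml_statements is a hypothetical helper that splits statements naively on semicolons, so semicolons inside string literals or trigger and procedure bodies may produce false positives. It looks only for INSERT, UPDATE, DELETE, and MERGE.

```python
import re
from typing import List

# Keywords that indicate data manipulation rather than schema definition.
DML_PATTERN = re.compile(r"^\s*(INSERT|UPDATE|DELETE|MERGE)\b", re.IGNORECASE)
# Line comments (-- ...) and block comments (/* ... */).
COMMENT_PATTERN = re.compile(r"--[^\n]*|/\*.*?\*/", re.DOTALL)


def find_dml_statements(sql_text: str) -> List[str]:
    """Return the DML statements found in a SQL dump (naive semicolon split)."""
    without_comments = COMMENT_PATTERN.sub("", sql_text)
    statements = (s.strip() for s in without_comments.split(";"))
    return [s for s in statements if DML_PATTERN.match(s)]
```

An empty result suggests that the dump contains only schema definitions. Any statements returned can be removed, or the schema can be exported again without data.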

Research Data

The schema and workload measurements from Science Warehouse are considered commercially sensitive, and therefore are not included.

DBLModeller Code

Schemas

Workload Measurements

SharePoint Extension Code

Performance Results

Model Correctness Results