A Point of View on ORDBMS

by Sergey Savushkin, 2003, 2012

This article is dedicated primarily to logical representation of objects in database by means of two object-relational (OR) languages: SQL3 (or SQL:1999) (http://www.objs.com/x3h7/sql3.htm) and Zigzag (http://savtechno.com). Really, the SQL3 is not final and is not supported by all the known object-relational DBMSs completely. The matter in hands of this article is object-oriented SQL of Oracle and Informix. Zigzag differs from SQL not only in syntax but also in methods of data processing. Whereas the SQL is a calculus language, the Zigzag is an algebra object-relational language.

To define Object-Relational Database Management System (ORDBMS) it is enough to take simple equation: ORDBMS = ODBMS + RDBMS = (O + R) * DB * MS. On a logical level, an ORDBMS is the MS process methods applied for DB data structure, which is complied with O object and R relational concepts.

The R relational concept in the context of DBMS is based on the relations in the form of two-dimensional tables of rows and columns. Transformation of queries into relational algebra is the main confirmation that affiliates a database with the relational model. It is prejudice to think that SQL2 language is a single and necessary criterion of RDBMS just as to think that Java is a single language of OO programming. Noteworthy feature of RDBMS is the possibility to process a large mass of the uniform n-element tuples (rows or records) quickly.

All necessary for object representation is available in the object DBMS (ODBMS). Often the ODBMS is equated to the OODBMS, namely DBMS integrated with an Object-Oriented (OO) programming language like C++ and Java. The characteristic properties of ODBMS are 1) complex data, 2) type inheritance, and 3) object behavior. These features will be considered below regarding the ORDBMS.

Complex data

Complex data creation in most SQL ORDBMSs is based on preliminary schema definition via the user-defined type (UDT). The table remains a most clear form for representation of complex data in any ORDBMS.

students

id

name

course

first

last

st031

Jane

Hunter

Economy

Planning

st072

Richard

White

Computers in Engineering

The "name" attribute (or field or column) consists of the "first" and "last" attributes. The value of the "course" attribute is a set of "Economy" and "Planning" elements. Such structure is able, for example, for the Informix SQL (http://publib.boulder.ibm.com/infocenter/idshelp/v111/index.jsp?topic=/com.ibm.sqls.doc/sqls.htm).

CREATE ROW TYPE Student (
  id CHAR(5),
  name ROW (first VARCHAR(12), last VARCHAR(20)),
  course SET (VARCHAR(128) NOT NULL)
);
CREATE TABLE students OF TYPE Student;
INSERT INTO students
VALUES (
  'st031',
  ROW('Jane', 'Hunter'),
  SET('Economy', 'Planning')
);
INSERT INTO students
VALUES (
  'st072',
  ROW('Richard', 'White'),
  SET('Computers in Engineering')
);

Other SQL ORDBMS suggests other composite type constructors, for example VARRAY or ARRAY instead of the SET, and OBJECT instead of the ROW. Constructors of the simple built-in types, with limitations like CHAR(5), came from SQL2.

Is it necessary to define schema before filling a database? Historically, the answer was canonical "yes" for two reasons. Firstly, the definition of database schema essentially helped to control the type of input data. Secondly, the data type limitation made possible the early DBMSs to organize data with maximal memory and processor efficiency. Nevertheless, development of database schema is a most laborious process. There are a lot of situations related with complex data when precise schema description is not necessary or unreasonably slows project development. Furthermore, both reasons for schema predefinition are no longer valid. Control of input data should be executed by supporting an input form, (e.g., via JavaScript on the Web client side). With the software and hardware advances in modern database systems, the influence of data type limitation on performance is practically insignificant. For example, in the old databases, a field (or record) length limit does reduce memory use and time of linear search. Now, in modern database, the field (record) values are realized by hierarchical trees and links (in them, on them and from them). Data are indexed for searching. Fixing of length may require even more memory, than it is necessary.

Zigzag assumes the above table without using the separate description of database schema, as below through the readTable procedure (http://www.savtechno.com/docz/zigzag/z/zlanguage.html).

$readTable() <
student; name:first; name:last; course
st031  ; Jane      ; Hunter   ; Economy, Planning
st072  ; Richard   ; White    ; Computers in Engineering
>;

The first row of input data declares attribute names. The "id" name may be missed, and the "student" will denote both a primary key attribute and a table.

Type inheritance

Hierarchy within structured complex data supposes type hierarchy and type inheritance, when all the characteristics (attributes) of super type are passed to it's subtypes.

employees
Name Salary
Sylvia Karsen 30000.00

programmers
Name Salary Language Project
William Helprin 40000.00 C++ Seestorm

representatives
Name Salary Region
Akiko Yokomoto 50000.00 Asia

The data types make up the hierarchy illustrated here:

               Employee(Name,Salary)
              /                     \
Programmer(Language,Project)   Representative(Region)

SQL representation

To create database, let’s take advantage of the Oracle SQL (http://docs.oracle.com/).

CREATE TYPE Employee AS OBJECT (
  Name      VARCHAR2(20),
  Salary    NUMBER(8,2)
) NOT FINAL;
CREATE TYPE Programmer UNDER Employee (
  Language   VARCHAR2(12),
  Project    VARCHAR2(30)
);
CREATE TYPE Representative UNDER Employee (
  Region    VARCHAR2(30)
);
CREATE TABLE employees OF Employee;
CREATE TABLE programmers OF Programmer;
CREATE TABLE representatives OF Representative;
INSERT INTO employees
  VALUES (Employee('Sylvia Karsen', 30000.00));
INSERT INTO programmers
  VALUES (Programmer('William Helprin', 40000.00, 'C++', 'Seestorm'));
INSERT INTO representatives
  VALUES (Representative('Akiko Yokomoto', 50000.00, 'Asia'));

The "Programmer" and "Representative" subtypes inherit all the attributes from the "Employee" supertype. However, in Oracle, a request for "employees" objects of the "Employee" type does not mean also request for objects of subtypes, namely "programmers" and "representatives". For example, the following SQL statement:

SELECT e.Name
  FROM employees e;

gives the result:

Name
--------------------
Sylvia Karsen

Zigzag representation

The same data hierarchy may be expressed in Zigzag. A separate type definition is not needed.

$readTable() <
Employee; Name         ; Salary
#       ; Sylvia Karsen; 30000.00
>;
$readTable() <
Employee:Programmer; Name           ; Salary  ; Language; Project
#                  ; William Helprin; 40000.00; C++     ; Seestorm
>;
$readTable() <
Employee:Representative; Name          ; Salary  ; Region
#                      ; Akiko Yokomoto; 50000.00; Asia
>;

A Zigzag request for objects of the "Employee" type means also a request for objects of the "Programmer" and "Representative" types (subtypes of "Employee"):

= Name:(Employee:);

In SQL as a result we receive only one employee, Sylvia Karsen, which has been set directlly. In Zigzag we get all three values. Whether all of them really are employee so? :). Result for Zigzag:

Name: Sylvia Karsen, Akiko Yokomoto, William Helprin

The distinctive property of Zigzag is that a type really is an object that determines a class (in the sense of multitude) of other objects. In other words, types in Zigzag are also data. Moreover, a type inheritance means not only an inheritance of attribute names but also an inheritance of attribute values. For example, assume that all programmers are located in one "E" department. SQL3 demands to insert "E" value in all the rows of the "programmers" table. In respect of Zigzag, we can be satisfied if set up the "E" only for the "Programmer" data object:

$readTable() <
Employee  ; Department
Programmer; E
>;

To test an inheritance of "E" value, enter the following Zigzag request translated as "the names of employees of E department":

= Name:(Employee:(Department:E));

I'd like to repeat, here it is a question of inheritance of attribute value (not attribute name). Value is inherited by Programmer, that is called as type in SQL3 :). The result will be such:

Name:William Helprin

Object behavior

To maximize computer capabilities, DBMS should provide some connection (e.g., ODBC) from an external programming language to the internal DBMS language. Additional advantage transforming (R)DBMS in O(R)DBMS is an access from internal DBMS language to the program objects, namely objects of an external/internal object-oriented programming (OOP) language. Such program objects have to be storable and transportable for database processing, therefore they usually are named as persistent objects. Inside a database, all the relations with a persistent program object are relations with its object identifier (OID). Mapping the external persistent program objects to a database is not a problem (see, for example, JDO/JPA tools http://db.apache.org/jdo/why_jdo.html / http://www.service-architecture.com/object-oriented-databases/index.html).

It is possible to describe a restricted real object through traditional database element like a record or even a field. This element is named as data object (or database object). Previously described the complex data elements also are data objects. What distinguishes a program object? Each program object can be formally defined as an instance received from the type mapping:

          type     = { attributes, methods  }
           ↓            
OBJECT =  instance = { state,      behavior }

Significant in this definition more approximating to a real object is that object type includes not only attributes but also methods mapped into an object behavior. Some ORDBMSs like Oracle and DB2 make it possible to place methods inside of a data type definition. It enables to develop a SQL application similar to an application of OOP language (and ODBMS). Method properly is a function or procedure defined for a certain class of the objects. The invoking of a method can be represented as follows:

object.method(parameters) = function(object,parameters);
or
object.method(parameters) = procedure(object,parameters);

The left expressions help to perceive method as a constituent, which may be expressed via objects. Let's consider properties related with the method and realized in OOP languages (see, for example, http://home.cogeco.ca/~ve3ll/jatutor0.htm or http://java.sun.com/). It is known that object behavior (methods) denoted by one name may be distinguished by the type of object itself or situation (parameters). The OOP languages call this as polymorphism principle, which briefly is described as "one interface, many implementations". Other OOP principles, inheritance and encapsulation, are related both with methods, and with attributes. Inheritance was considered already. The principle encapsulation in ORDBMS has a shade. Even advanced Oracle 9i and IBM DB2 V8 do not take an encapsulation degree through PUBLIC, PRIVATE and PROTECTED like Java or C++ does. Encapsulation in ORDBMS is reduced to replacing an attribute with a method or, in other words, using virtual attributes. For example, instead of the price attribute the getPrice() method may be used. In the following discussion the OOP principles are demonstrated through Oracle SQL and Java/Zigzag.

SQL representation

The SQL example below creates an "equipments" table of an abstract type, "Equipment". Really, a row of the "equipments" table may be an object of either the "Platform" or "Engine" subtype. Each subtype defines a unique getPrice() method.

CREATE TYPE Equipment AS OBJECT (
  name VARCHAR2(24),
  NOT INSTANTIABLE MEMBER FUNCTION getPrice() RETURN NUMBER
) NOT INSTANTIABLE NOT FINAL;
CREATE TYPE Platform UNDER Equipment (
  size NUMBER,
  OVERRIDING MEMBER FUNCTION getPrice() RETURN NUMBER
);
CREATE TYPE BODY Platform AS
  MEMBER FUNCTION getPrice() IS
  BEGIN
    RETURN size * 6
  END getPrice;
END;
CREATE TYPE Engine UNDER Equipment (
  power NUMBER,
  OVERRIDING MEMBER FUNCTION getPrice() RETURN NUMBER
);
CREATE TYPE BODY Engine AS
  MEMBER FUNCTION getPrice() IS
  BEGIN
    RETURN 40 + power * 5
  END getPrice;
END;

CREATE TABLE equipments OF Equipment;
INSERT INTO equipments
  VALUES (Platform('Tower X04', 4));
INSERT INTO equipments
  VALUES (Engine('Ford U14', 14));
SELECT name, getPrice() price
  FROM equipments;

The result is:

name        price
-------------------
Tower X04    24
Ford U14    110

Java/Zigzag representation

Let's turn to Zigzag. The free realization of this language is based on Java and complements Java. It is easy to create a persistent Java object in a Zigzag database and get its attributes or call methods via Zigzag. The object types are described in Java via the following classes:

public abstract class Equipment implements java.io.Serializable {
  public String name;
  public abstract int getPrice();
}//Equipment

public class Platform extends Equipment {
  public int size;
  public Platform(String name, int size) {
    this.name = name;
    this.size = size;
  }//Platform
  public int getPrice() {
    return size * 6;
  }//getPrice
}//Platform

public class Engine extends Equipment {
  public int power;
  public Engine(String name, int power) {
    this.name = name;
    this.power = power;
  }//Engine
  public int getPrice() {
    return 40 + power * 5;
  }//getPrice
}//Engine

Then following Zigzag fragment creates program objects of the "Platform" and "Engine" type, maps their content in database via $mapState, and prints table with "name" and "price" attributes like the SQL example before:

equipment:[
  @po Platform.(Tower X04, 4),
  @po Engine.(Ford U14, 14)
];
$mapState(equipment:);
$printTable(equipment:, name, price);

In addition, Zigzag has the unique possibility to set not only program objects but also their attributes and methods as multitude. That is a method or attribute can be expressed indefinitely through the request expression:

[object expression].[method expression]() – method calling,
[object expression].[attribute expression] – getting of the attribute from the program object.

Summary

Three main evidences, 1.complex data, 2.type inheritance, and 3.object behavior, are enough to class RDBMS or DBMS as ORDBMS or ODBMS correspondently. An internal language of DBMS like SQL or Zigzag is not criterion but only material for classification on logical level. The Zigzag language correlates with the object SQL3 at least in object-oriented data representation. Zigzag is more expressive and helps to work with structurally more flexible data. Nevertheless, SQL3 with its type schematization helps to set more strict control of uniform data. Principal difference appears that Zigzag can see a type as a data object and a data object as a type for other objects. This allows constructing a data hierarchy, not only a type hierarchy, semantically more precise, for example:

                                production(equipment)
                                /                   \       
        printed(periodicity,equipment:printer)    video(equipment:camera,...)
                 /                    \                                 |  \
magazine(periodicity:regular,...)  book(purpose,periodicity:occasional)  
/  |                               /                 |               \
       fiction(purpose:entertainment)   tutorial(purpose:education)  ...

The book object inherits not only the fact that it is production characterized by equipment and periodicity but fact that equipment is printer. Moreover, book is a type for objects (fiction, tutorial and others), which inherit attribute purpose and attribute periodicity with value occasional. If assume that instances of book (fiction, tutorial, ...) are researched target objects, inheritance in SQL3 is represented by the following type hierarchy:

                           production(id,equipment)
                             /                \
            printed(periodicity)             video(...)
                /     \                       /     \
    magazine(...)     book(purpose)     

The book objects stored in a "books" table may be represented with corresponding attributes' order:

books = [
  book('fiction', 'printer', 'occasional', 'entertainment'),
  book('tutorial', 'printer', 'occasional', 'education')
]

As can be seen, attribute values are not inherited. So the "printer" and "occasional" are repeated in each object. In the future the SQL has to posses less expressive than Zigzag but reasonably simple constructions to consider types in role of objects. Flexibility based on consideration of objects as types rather remains a prerogative of languages like Zigzag. To create objects of the "tutorial" and "fiction" types (subtypes of "book"), a SQL3 developer creates new types, and new tables, or better yet transforms the existing table to the new. Zigzag developer uses already existent objects "tutorial" and "fiction" as types of new objects. For example, to add a new object "physics", he can use a statement like :tutorial:physics(...):