Thou shalt encapsulate thine objects
The idea of encapsulation comes from (i) the need to cleanly
distinguish between the specification and the implementation of
an operation and
(ii) the need for modularity. Modularity is necessary to structure
complex applications designed and implemented by a team
of programmers. It is also necessary as a tool for protection
and authorization.
There are two views of encapsulation: the programming language view (which is the original view since the concept originated there) and the database adaptation of that view.
The idea of encapsulation in programming languages comes from abstract data types. In this view, an object has an interface part and an implementation part. The interface part is the specification of the set of operations that can be performed on the object. It is the only visible part of the object. The implementation part has a data part and a procedural part. The data part is the representation or state of the object and the procedure part describes, in some programming language, the implementation of each operation.
The database translation of the principle is that an object encapsulates both program and data. In the database world, it is not clear whether the structural part of the type is or is not part of the interface (this depends on the system), while in the programming language world, the data structure is clearly part of the implementation and not of the interface.
Consider, for instance, an Employee. In a relational system, an employee is represented by some tuple. It is queried using a relational language and, later, an application programmer writes programs to update this record such as to raise an Employee's salary or to fire an Employee. These are generally either written in a imperative programming language with embedded DML statements or in a fourth generation language and are stored in a traditional file system and not in the database. Thus, in this approach, there is a sharp distinction between program and data, and between the query language (for ad hoc queries) and the programming language (for application programs).
In an object-oriented system, we define the Employee as an object that has a data part (probably very similar to the record that was defined for the relational system) and an operation part, which consists of the raise and fire operations and other operations to access the Employee data. When storing a set of Employees, both the data and the operations are stored in the database.
Thus, there is a single model for data and operations, and information can be hidden. No operations, outside those specified in the interface, can be performed. This restriction holds for both update and retrieval operations.
Encapsulation provides a form of ``logical data independence'': we can change the implementation of a type without changing any of the programs using that type. Thus, the application programs are protected from implementation changes in the lower layers of the system.
We believe that proper encapsulation is obtained when only the operations are visible and the data and the implementation of the operations are hidden in the objects.
However, there are cases where encapsulation is not needed, and the use of the system can be significantly simplified if the system allows encapsulation to be be violated under certain conditions. For example, with ad-hoc queries the need for encapsulation is reduced since issues such as maintainability are not important. Thus, an encapsulation mechanism must be provided by an OODBS, but there appear to be cases where its enforcement is not appropriate.