An OSM Data Design Assistant

Eric Carter

July 11, 1997

Thesis Proposal

Introduction

In this thesis, we will focus on one area of software design: data design for database applications. We seek to create a tool that helps a designer create a data design that is efficient, mainly in its use of time and space. The "deliverables" for our design process are either flat schemes suitable for use with a relational database or nested schemes suitable for use with an object-oriented database.

Many problems contribute to the difficulty of software design [Guindon 1988, Kant 1985]. The first set of problems relates to the misuse and misunderstanding of software design techniques. Many designers have an incomplete understanding of software design principles. Designers who do understand a technique or design strategy often lack the patience to follow it methodically.

The second set of problems relates to the limitations of human problem-solving ability. Guindon [1988] cites four examples. First, a designer will focus too early on an initial solution without considering alternative solutions or exploring the problem domain completely. Second, a designer may have difficulty keeping track of several proposed designs. Third, it is difficult for a designer to consider and integrate all the constraints for a solution. Fourth, a designer has difficulty remembering to return to a postponed subproblem.

Our focus on data design leads to a third set of problems we must address. Particularly problematic in data design is the universal-relation scheme assumption. Embley notes that "This ‘hidden’ assumption often fails in practice. Thus students who understand textbook dependency theory find that they cannot apply it as practitioners. They almost never realize that it is the assumption, not the theory that fails. Even if they happen to realize that the problem is with the assumption, they usually do not know what to do because they have not been taught" [Embley 1998].

Our goal is to create a design assistant that addresses these three sets of problems: the misuse and misunderstanding of software design techniques, the limitations of human problem-solving ability, and the existence of the universal-relation scheme assumption. We also seek to create an assistant that works with the designer, enhancing the designer’s ability to create a good design.

As a designer begins to model an application, the design assistant continuously monitors the state of the application model. The assistant has a set of evaluative modules, often called critics [Fischer 1994], that look at various features of the application model. When these critics find suboptimal features in the application model, they interact with the designer to resolve the problem.

The design assistant in this thesis will use a set of critics to guide the designer through all of the design steps and algorithms discussed in Chapters 9 and 10 of [Embley 1998]. The design assistant will automatically identify design transformations that preserve constraints and will ask the designer questions to determine whether those transformations also preserve information. The final output of the design assistant will be either flat or nested database schemes.

Thesis Statement

Based on Chapters 9 and 10 of [Embley 1998], the objective of this thesis is to create a design assistant that will work with a designer to create an efficient data design for a database application.

Methods

To develop a design assistant, we must first provide a robust environment in which the design assistant can work. That environment will consist of a drawing tool that supports all OSM modeling constructs. The drawing tool will be developed using Visual C++ 5.0 and the Microsoft Foundation Classes (MFC) 4.2. The architecture of the drawing tool will be based on a model discussed in the manuscript [Olsen 1995], which elaborates on a model developed by [McNeill 1988]. The entire application will follow the Model-View-Controller pattern discussed in [Gamma 1995].

In our system, we will provide thirteen design critics. These critics will be organized into three groups: "Hypergraph Conversion" critics convert an OSM application model to an OSM design hypergraph, "Data Reduction" critics reduce the hypergraph, and "Data Synthesis" critics generate database schemes. A prototype window that lists the critics is shown in Figure 1.

A critic in our system will have four components. First, a critic has a precondition. This precondition must be satisfied before the critic will even look at the application model. For example, a critic that works on a design hypergraph will not activate until the "Convert to Hypergraph" critic has completed the conversion of the application model to a design hypergraph. Figure 1 illustrates this idea. The hypergraph conversion critics are marked with check marks, indicating that the designer has already responded to them. The data reduction critics are currently available. The data synthesis critics are crossed out because they cannot be used until the designer has responded to at least some of the data reduction critics. One open research question is which data reduction critics the designer must respond to before data schemes can be synthesized.

Second, a critic has a predicate that checks whether some condition exists in the diagram. For example, a "Tail Reduction" critic checks for redundancies in a design hypergraph that a tail reduction can remove. The critic lists each redundancy it finds in a window, where the user can click on a redundancy and proceed to resolve it. In Figure 1, the tail reduction critic has found a reduction involving the object sets Student, Person, and State.

Third, a critic has an interaction routine that elicits any additional information needed from the designer to resolve the condition. Some critics may not need additional information or interaction. For critics that do need to ask questions, default answers must exist for each question.

Finally, a critic has an algorithm that can modify the application model to resolve the condition. A "Tail Reduction" critic has an algorithm that removes the redundant tail edges from the hypergraph.

Because each critic has default answers and an algorithm that can modify the application model to resolve any condition it finds, the entire process can proceed automatically. For best results, however, the designer should interact with the critics and answer their questions, since the default answers may not produce an optimal design. In this way, the best design comes through an interaction between the designer and the design critics.
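The four components and the automatic mode described above can be sketched as follows. This is only an illustration in Python, not the proposed Visual C++ implementation. The names `Critic` and `run_critic`, the dictionary-based model, and the toy "duplicate edge" condition are all assumptions made for the example; in particular, the toy condition is a stand-in and does not implement the tail reduction of [Embley 1998].

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Critic:
    """Hypothetical critic with the four components named in the text."""
    name: str
    precondition: Callable[[dict], bool]       # may the critic look at the model yet?
    predicate: Callable[[dict], List[tuple]]   # conditions found in the model
    question: Optional[str]                    # None if no interaction is needed
    default_answer: bool                       # used when the designer gives no answer
    algorithm: Callable[[dict, tuple], None]   # modifies the model to resolve one finding


def run_critic(critic, model, ask=None):
    """Apply one critic and return the number of findings it resolved.

    `ask` is an optional callback standing in for the designer. It receives
    the question text and returns True, False, or None (accept the default).
    With ask=None the critic runs fully automatically.
    """
    if not critic.precondition(model):
        return 0
    resolved = 0
    for finding in critic.predicate(model):
        answer = None
        if critic.question is not None and ask is not None:
            answer = ask(critic.question.format(finding))
        if answer is None:
            answer = critic.default_answer
        if answer:
            critic.algorithm(model, finding)
            resolved += 1
    return resolved


# Toy stand-in condition (an assumption): an edge listed more than once.
def find_duplicate_edges(model):
    edges = model["edges"]
    return sorted({edge for edge in edges if edges.count(edge) > 1})


def drop_extra_copies(model, edge):
    # list.remove deletes the first occurrence, so one copy always remains.
    while model["edges"].count(edge) > 1:
        model["edges"].remove(edge)


duplicate_edge_critic = Critic(
    name="Duplicate Edge (toy example)",
    precondition=lambda model: model["converted"],  # wait for hypergraph conversion
    predicate=find_duplicate_edges,
    question="Remove extra copies of edge {}?",
    default_answer=True,
    algorithm=drop_extra_copies,
)

model = {
    "converted": True,
    "edges": [("Student", "Person"), ("Person", "State"), ("Student", "Person")],
}
run_critic(duplicate_edge_critic, model)  # automatic pass using the default answer
```

Passing a callback as `ask` models the interactive mode: the designer's answer overrides the default, so a designer who declines a reduction leaves the model unchanged.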

Critics within the design assistant will use the drawing tool to highlight areas of interest on the diagram, and will ask the designer questions to determine whether to make transformations and reductions. The design assistant will then make appropriate transformations and reductions automatically by controlling the drawing tool.

Contribution to Computer Science

This thesis seeks to refine a critic-based approach to data design transformation. It also seeks to develop an effective user interface for data design. Finally, this thesis will provide a new OSM drawing tool that can be used by future researchers. To facilitate future research, the drawing tool must be extendable, robust, and controllable. Extendable means that the architecture chosen for the tool should make it simple to add new drawing constructs or to modify existing ones. Robust means that the tool should be well tested and should support features such as unlimited undo and redo. Controllable means that every operation available to a user with a mouse is also available to an external tool through an application programming interface.
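One common way to obtain both unlimited undo and redo and a programmatic interface is to route every operation through command objects. The sketch below illustrates that idea in Python rather than the proposed Visual C++. It is an assumption about one possible design, not the architecture of the tool; the names `DrawingTool`, `AddObjectSet`, `execute`, `undo`, and `redo` are hypothetical.

```python
class AddObjectSet:
    """Hypothetical command: add a named object set to the diagram."""

    def __init__(self, name):
        self.name = name

    def do(self, diagram):
        diagram.append(self.name)

    def undo(self, diagram):
        diagram.remove(self.name)


class DrawingTool:
    """Toy drawing tool whose diagram is just a list of object-set names."""

    def __init__(self):
        self.diagram = []
        self._undo_stack = []   # unbounded, so undo depth is unlimited
        self._redo_stack = []

    def execute(self, command):
        # Mouse handlers and external tools would both call this entry point.
        command.do(self.diagram)
        self._undo_stack.append(command)
        self._redo_stack.clear()   # a new action invalidates the redo history

    def undo(self):
        if not self._undo_stack:
            return False
        command = self._undo_stack.pop()
        command.undo(self.diagram)
        self._redo_stack.append(command)
        return True

    def redo(self):
        if not self._redo_stack:
            return False
        command = self._redo_stack.pop()
        command.do(self.diagram)
        self._undo_stack.append(command)
        return True


tool = DrawingTool()
tool.execute(AddObjectSet("Student"))
tool.execute(AddObjectSet("Person"))
tool.undo()   # removes "Person"
tool.redo()   # restores "Person"
```

Because a critic would use the same `execute` call as the user interface, any change a critic makes would be undoable in the same way as a change made with the mouse.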

Delimitation of the Thesis

The design assistant created in this thesis will focus only on data design. Behavior and interaction elements in an OSM diagram will be completely ignored by the design assistant. The design assistant will also ignore any general constraints placed on an OSM diagram.

Thesis Outline

  1. Introduction
    1. Database Applications
    2. OSM
    3. Problems to Overcome
      1. Helping to Follow a Method
      2. Helping to Address All Problems
      3. Helping to Address the "Hidden" Assumption
    4. Critics
      1. Preconditions
      2. Predicate
      3. Interaction
      4. Algorithm
  2. Architecture of Drawing Tool
    1. Design
    2. Implementation
  3. Scripting Model
  4. Design Assistant Critics
    1. Hypergraph Conversion Critics
      1. Template Expander
      2. Constraint Simplifier
      3. Convert to Hypergraph
    2. Data Reduction Critics
      1. Equivalence Class Reductions
      2. Tail Reductions
      3. Head Reductions
      4. Redundant Non-FD Reductions
      5. n-ary Relationship Set Reductions
      6. Embedded FDs Reductions
    3. Data Synthesis Critics
      1. Checking for Canonical Hypergraph
      2. Flat Schemes Synthesis with Key Constraints and Inter-scheme Dependencies
      3. Nested Schemes Synthesis with Key Constraints and Inter-scheme Dependencies
      4. Combined Scheme Cost Analysis
  5. Interrelationships among critics
    1. Which critics are necessary in creating effective data designs
    2. Which critics are optional in creating effective data designs
    3. Which critics can be entirely automated
  6. Conclusions
  7. Bibliography
  8. Appendices
    1. ORM Diagrams for Allegro
    2. ORM Diagrams for Scripting Model
    3. ORM Diagrams for Critics
    4. Code for Allegro and Design Assistant (Critics)

Thesis Schedule

Bibliography

Davis, Alan. 201 Principles of Software Development, McGraw Hill, 1995.

Embley, David W. Object Database Development: Concepts and Principles, Addison Wesley, 1998.

Fischer, G. Domain-Oriented Design Environments, Automated Software Engineering 1, 177-203, 1994.

Fischer, G. Supporting Software Designers with Integrated Domain-Oriented Design Environments, IEEE Transactions on Software Engineering, Vol. 18, No. 6, June 1992.

Gamma, E., Helm, R., Johnson, R., and Vlissides, J. Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley, 1995.

Guindon, R. and Curtis, B. Control of Cognitive Processes During Software Design: What Tools are Needed? CHI ’88, 1988.

Harandi, M. and Young, F. Software Design Using Reusable Algorithm Abstractions

Jayaputera, G. T. and Cheng, K. E. Extending MELBA+ CASE Tool: A Design Artifact Maintenance, IEEE Software, 1991.

Kant, E. Understanding and Automating Algorithm Design, IEEE Transactions on Software Engineering, Vol. SE-11, No. 11, November 1985.

Olsen, Dan R. Interactive Software Systems, Brigham Young University, 1995.

Peña-Mora, F. and Vadhavkar, S. A Software Methodology Combining Design Patterns, Design Rationale, and Case-based Reasoning Principles, Intelligent Engineering Systems Laboratory, Massachusetts Institute of Technology, 1996.

Taylor, E. S. An Interim Report on Engineering Design, Massachusetts Institute of Technology, 1959.

Artifacts

In addition to the written thesis, the following artifacts will be produced:

Signatures

Signature of Committee Chair ________________________________ Date ________

Signature of Member _______________________________________ Date ________

Signature of Member _______________________________________ Date ________

Signature of Graduate Coordinator ____________________________ Date _________