Storage Facility Help

Storage Facility Documentation

NOTE : This document is currently under construction.

Storage Facility Overview

The Storage Facility was developed to facilitate storage of OSA models. It is an in-memory database that offers garbage collection, indices, tuple and table storage, nested array storage structures, relational table operators, scheme and constraint definitions, and data-instance and seamless model structures.

The following is an index into the Storage Facility Documentation:

Reference Counting

Every object in the storage facility has reference counting functionality. The reference counting class, RefObject, keeps a count of the number of times that an object is inserted into a container object. Therefore, the initial reference count is zero for an object until it is inserted into an SF_ARRAY object or a subtype of the SF_ARRAY class. There are two methods used to perform reference counting:

RefObject's also have an update flag mechanism. This is used to indicate the current status of the object; namely:

The methods are setFlag() and getFlags().

In creating a new subclass of RefObject, the ref() method must be overwritten as follows :

	SF_ARRAY* ref(void) { return ((SF_ARRAY *) RefObject::ref()); }

Example: SF_ARRAY v; .... RefObject *o = v[2]; if (o->getFlags() && OBJECT_FLAG_NONE && o->unref()) delete o;

Garbage Collection

The reference counting mechanism can be used to garbage collect. This can be done as follows :
 insertion : 		v.putAt(o->ref());	// reference count = 2 : 1 for ref() and 1 for putAt()
 garbage collection :     if (o->unref() == TRUE)
			delete o;

Name Index

The SF_NAME_INDEX class uses an SF_ARRAY object to store character strings. A string is stored based on a hash function. Thus, this name index acts like a primary index for any character string.

	char *name = "Christine";
	int indexOf = name.indexOf("Christine");

Inverted Index

The SF_INVERTED_INDEX class is a sub-class of SF_ARRAY that has an inverted index entry in each element of the array. The index entry consists of a SF_OBJECT (or sub-type) object plus a list of indices (integers) that can be used to reference rows in an SF_TABLE where that object is stored.

	SF_TUPLE *t = new SF_TUPLE(2, "Fred, 2");
	SF_TABLE tab;
	tab.putAt(3, t);
        index.insert(t->at(i), 3);	

Mapping Names to Integers

An SF_INTEGER_MAPPING object maps one integer into another integer. This is used to map an index value from a name index to a column value for a table. This was created to all the original storage facility to have names for columns rather than using position numbers.

	SF_INTEGER_MAPPING nameToTableMapping;
	int index;
      	int nIdx = nameIndex.insert("Fred");, index);


The SF_OBJECT class is a base class for basic object types available in the storage facility system. Currently, there are two types, lexical and non-lexical.


Lexical Objects

An SF_LEXICAL object is an SF_OBJECT that is lexical. Lexical means that the value of the object also acts as its representation. In other database systems lexical objects are referred to as values. Thus "Fred", 1, 6948.234, etc. are considered lexical objects. In the storage facility, lexical values are distinguishable by enclosing double quotes.


Non-lexical Objects

An SF_NONLEXICAL object is an SF_OBJECT that is non-lexical. Lexical means that object has no external value. The only value it has is a unique identifier that is internal to the system.


Array Container

The SF_ARRAY class provides a generic, array container where each entry in the array can be any RefObject. An array has dynamic size reallocation.

	SF_ARRAY array;
	int found = 0;
	SF_LEXICAL l1("Mary"), l2("John"), l3("Christine");
	array.putAt(23, l3);
	array + l1;
	ITERATE(int i, array, !found) {
	  if (array[i] == l1) {
	    found = 1;
	FOR(i, array)>printOn(cout);
	int num = array.occurrencesOf(l3);
	int index = array.indexOf(l1);

Tuple Container

The SF_TUPLE class provides a container similar to SF_ARRAY except that the size of the tuple is restricted to be a certain size (number of elements) and the SF_TUPLE can be loaded by parsing an input string. The grammar for such a string is
    	 :=   |  , 
	 := : | 
	 := character string
	 := character string

where the tuple size is determined by the number of comma separated parts.

If the tuple is part of a named storage facility mechanism, then tuple entries can be referenced by name. Otherwise, they are integer referenced.

	char *input = "\"Fred\", 2, "\18 North Street\"";
	SF_TUPLE tuple1(3), tuple2(3, input);
	if (SF_TUPLE::equal(tuple1, tuple2))
	SF_LEXICAL lex("\"Fred\"");
	if (tuple2.contains(lex) == TRUE)

Table Container

A table is similar to a relational database table in it stores an array of tuples, each tuple having the same size. A SF_TABLE also has an inverted index on each row, making indexing more efficient.

The SF_Table class is very much similar to SF_ARRAY and SF_TUPLE. Every element stored is a tuple with the same tuple size (i.e., number of elements in the tuple). A SF_TABLE can also be loaded from a string or a stream. The grammar for a table is :

    	 :=   RETURN ~

where SF_TUPLE is parsed as described in the tuple description and RETURN means a carriage return.

Some operations on tables may or may not require that the tables are compatible. Table compatibility means that the column names are exactly the same. Compatibility can be retrieved and set with the setDefaultCompatibility() and getDefaultCompatibility() methods.

If the table is part of a named storage facility mechanism, then table operations on columns can be referenced by name. Otherwise, they are integer referenced.

	SF_TABLE tab1(2);
        tab1    + &(*new SF_TUPLE(2)
                + new SF_LEXICAL("Test 1")
                + new SF_NON_LEXICAL(1034)
    	SF_TUPLE *result = tab1.selectTupleOn(1, "A", new SF_NON_LEXICAL(1034));

Relational Algebra Table Operators

The relational table operators are subclasses of the SF_TABLE class that create a table after performing a relation algebra operation on an input table(s). This is done by using the constructor with parameters that specify the table to be operated on and additional information to the operator (e.g., row number).

Column names can be used for specifying the operation parameters if the operand tables have name mapping information. This usually comes from using a model instance as the source of the tables. If no name mapping information is available, then columns have to be specified with column numbers, 0 being the first column.

Relational Algebra Table Operators.
A table is produced which has been projected on the specified columns.
A table is produced with tuples that match selection parameters. Selection is based on equality with SF_OBJECT parameters. If a comparison other than equality is desired, a function can be passed in that specifies the selection criteria.
A table is produced based on the equi-join of two other tables. If no columns are specified as joining parameters then the cartesian product is performed. If a theta join is desired using an operator other than equality, then a function can be passed in which specifies the join criteria.
A table is produced which is the union of two other tables.
A table is produced which is the difference of two other tables.
A table is produced which is the intersection of two other tables.
A table is produced that uses the chi operator to perform some operation (a function) on the input table. The operation desired is passed in as a parameter to the SF_TABLE_CHI constructor. Column grouping is possible.

	SF_TABLE &t1 = model.getTable("Person has Name"),
		 &t2 = model.getTable("Person has SSN");
	SF_LEXICAL name("Fred Flinstone");
	SF_TABLE result(
				SF_TABLE_SELECTION(t1, 1, "Name", name),
			1, "Person"),
		t2, 1, "Person", "Person"));
	SF_TABLE_UNION t(anotherTable, fredsTable);
	SF_TABLE_JOIN lastone(t, t1, 2, 1, 3, 2, 4);

Defining Tables and Constraints

A scheme can be defined as a set of relational table definitions and a set of constraints. There are two types of table definitions and eight types of constraint definitions. The table definitions define the tuple size and name of each relational table in the SF_DATA_INSTANCE. The constraints are used to define varies constraints as well as to validate a data instance to see if it violates the constraint.


Future Work

Jeff Pinkston (