Introduction

This is the manual for the third version of the DATAdesign engine (the second version never actually existed). The DATAdesign engine was written to provide a powerful, multi-user database management system. The current version contains more than 80 extensions in DATAdesign.engine, which should do just that.

Please note that this manual presumes some programming expertise on the part of the reader.

DATAdesign is what some people would call a free-form database. This means that no restrictions (well, as few as possible) are imposed on the creator and user of database files which are manipulated with the DATAdesign engine.


Concepts

file

A file is the entity which holds a set of related data to be used by the DATAdesign engine. When we use the word file in this manual, we don't mean "a file on disk", but a file which is used by the DATAdesign engine. The conventional meaning is rarely needed in this manual; where it is, it is called a medium-file. A file may be on disk (disk-based), or it may be completely in memory (memory-based). When a medium-file is not used by any job in memory (so there are no buffers using the file, see later), it is not called a file. Files are referenced by (hopefully) unique filenames. These filenames are case-sensitive.

buffer

A buffer is an entry-point to a file. It contains a copy of the current record. If the buffer is not a read-only buffer, the record is locked, that is, unavailable to all other read-write buffers using this file. All operations which read or change records are done through a buffer. This means that you don't change the record in the file directly. To copy a record back into the actual file, you have to implement it. This makes sure that the record in the file is an exact copy of the (possibly changed) record in the buffer. Implementing changes nothing in the buffer, only in the file. The buffer can also be cleared, for example to start a new record. Last but not least, all operations need a bufferid, as this is the only way to let the DATAdesign engine know which file is affected or queried by a certain operation (but there are defaults).

Each buffer has a special property, a bufferid, which is unique and identifies the buffer. Buffers can only be accessed by the job that created them.

Please note that if you create a buffer (by using or creating a file), you also have to release it, as it will otherwise keep occupying memory. Even when the job which uses the buffer is released, the buffer keeps existing (and nobody can access it). This can be solved with garbage collection.
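
The copy-then-implement behaviour of a buffer can be pictured with a small toy model. This is only a sketch in Python, not the DATAdesign engine's interface: the class ToyBuffer and its methods read, implement and clear are hypothetical names chosen for illustration, and the "file" is just a dictionary from recordid to field values.

```python
import copy

class ToyBuffer:
    """Hypothetical stand-in for a buffer; not the DATAdesign API."""

    def __init__(self, records):
        self.records = records   # the "file": recordid -> field values
        self.recordid = None
        self.current = {}        # the buffer's private copy of the record

    def read(self, recordid):
        # The buffer receives a copy, so later edits do not touch the file.
        self.recordid = recordid
        self.current = copy.deepcopy(self.records[recordid])

    def implement(self):
        # Copy the buffer's record back into the file; the buffer is unchanged.
        self.records[self.recordid] = copy.deepcopy(self.current)

    def clear(self):
        # Empty the buffer, for example to start a new record.
        self.recordid = None
        self.current = {}

records = {7: {"title": "Dune"}}
buf = ToyBuffer(records)
buf.read(7)
buf.current["title"] = "Dune Messiah"   # changes only the buffer's copy
unchanged = records[7]["title"]         # the file still holds the old value
buf.implement()
updated = records[7]["title"]           # now the file matches the buffer
```

The point of the model is only the order of events: nothing reaches the file until the record is implemented, and implementing leaves the buffer's own copy as it was.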

index

An index is a special entry-point to a buffer, and thus to a file. Indexes are used only for file navigation and fast searching. Indexes are the only way to sort or filter a file, that is, to specify the order in which the records are available and/or to select which records are available and which are not. Indexes are, however, restricted by memory. This should not be a big problem, as each entry in an index uses a maximum of 94 bytes, and usually much less (that is, 14 bytes plus the bytes needed for each sort level: 2 bytes for char or word, 4 bytes for long, and 8 bytes for text or double).
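
To get a feeling for the memory an index needs, the figures above can be put into a small calculation. The sketch below is in Python and uses only the numbers from the text (14 bytes per entry, the per-level sizes, the 94 byte maximum). The function names are our own, and it is an assumption that an index holds exactly one entry per available record with no further overhead.

```python
BASE_BYTES = 14          # fixed part of every index entry
MAX_ENTRY_BYTES = 94     # maximum entry size given in the text
LEVEL_BYTES = {"char": 2, "word": 2, "long": 4, "text": 8, "double": 8}

def index_entry_bytes(sort_levels):
    """Size of one index entry for the given list of sort-level types."""
    size = BASE_BYTES + sum(LEVEL_BYTES[kind] for kind in sort_levels)
    if size > MAX_ENTRY_BYTES:
        raise ValueError("entry would exceed the 94 byte maximum")
    return size

def index_bytes(record_count, sort_levels):
    """Rough index size, assuming one entry per record and no other overhead."""
    return record_count * index_entry_bytes(sort_levels)

# An index sorted on a text level and then a long level, over 10000 records.
entry = index_entry_bytes(["text", "long"])
total = index_bytes(10000, ["text", "long"])
```

With these figures an entry sorted on one text level and one long level takes 26 bytes, so 10000 records need about 260000 bytes. The 94 byte maximum corresponds to ten levels of 8 bytes each.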

Each index has a special property, an indexid, which is unique and identifies the index. An index always belongs to the buffer which was passed when it was created, and can't be shared between buffers. Some commands may be passed an indexid instead of a bufferid.

record

Records are parts of a file which combine related data. If you go to a library and want to find a book, you search the register, which is a database. In libraries where you still have to search manually, there is a place where you can find a card for each book. Each of these cards is a record, and all the cards together are the file.

All records have a special property, a recordid, which is unique and identifies the record. No two records in a file can ever have the same recordid, and a recordid never changes, even if the record is changed. Furthermore, even after a record has been deleted, the recordid won't be assigned to another record for a long while. This ensures that a recordid is the safest way to make links between records of different files.

More information on the programming approach to records can be found under buffer and in the explanation of the specific commands.

A recordid is a long. All values are possible except -1, which is used to denote "not found" or other problems. Recordids are assigned cyclically, so a recordid is typically reused only after a full cycle of about 4*10^9 created records.
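
Cyclic assignment with one reserved value can be sketched as follows. This is a Python illustration, not the engine's actual algorithm: we assume that a long is a signed 32-bit value (the field type table gives 4 bytes for long), that the cycle simply counts upwards and wraps around, and that ids still in use are skipped. The function next_recordid is a hypothetical name.

```python
NOT_FOUND = -1                       # reserved: "not found" or other problems
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def next_recordid(last, in_use):
    """Return the next free recordid after `last`, wrapping around cyclically."""
    candidate = last
    for _ in range(2**32):           # at most one full cycle
        candidate = INT32_MIN if candidate == INT32_MAX else candidate + 1
        if candidate != NOT_FOUND and candidate not in in_use:
            return candidate
    raise RuntimeError("no free recordid left")

# Number of usable ids in one cycle: every 32-bit value except -1.
usable_ids = 2**32 - 1
```

Under these assumptions one cycle contains 4294967295 usable values, which agrees with the figure of roughly 4*10^9 given above.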

field

Fields are subdivisions of records, and the same subdivisions are available in all records. In the example of the library register, the fields are the subdivisions of the cards, such as author, title, publisher, ...

field type

In the DATAdesign engine, all fields are typed. Five basic types are provided and these types should allow you to put any kind of data you want in a field. It is up to the author of a program to determine what a certain value in a field is supposed to mean.

All field types can be sorted except one. It is impossible to sort raw fields, as these can represent just about anything.

type  code  element size    usage
raw      1     1 byte       graphics, fonts, ...
char     2     1 byte       text
short    3     2 bytes      small integer values
                            selections, statuses, ...
long     4     4 bytes      large integer values
                            dates, ...
ieee     5     8 bytes      IEEE double
                            any numerical value
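
A program which works with typed fields will often want the table above in a form it can look things up in. The Python sketch below mirrors the type codes and element sizes from the table and the rule that raw fields can't be sorted. The names FieldType, is_sortable and data_bytes are our own, and it is an assumption that a field's data is a plain run of elements with no header counted.

```python
from enum import IntEnum

class FieldType(IntEnum):
    """Type codes as listed in the field type table."""
    RAW = 1
    CHAR = 2
    SHORT = 3
    LONG = 4
    IEEE = 5

# Element size in bytes for each type, taken from the same table.
ELEMENT_SIZE = {
    FieldType.RAW: 1,
    FieldType.CHAR: 1,
    FieldType.SHORT: 2,
    FieldType.LONG: 4,
    FieldType.IEEE: 8,
}

def is_sortable(code):
    """All field types can be sorted except raw."""
    return FieldType(code) is not FieldType.RAW

def data_bytes(code, element_count):
    """Bytes used by the elements alone (assumption: no header is counted)."""
    return ELEMENT_SIZE[FieldType(code)] * element_count

title_bytes = data_bytes(FieldType.CHAR, 20)   # a 20 character text field
```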

file-status

This is a special property which each file has. It indicates whether a file is disk-based (1) or memory-based (0). By default, all files are memory-based when you create them, but that can be changed. There is no automatic switching between the two statuses.

disk-based

This file-status is included for two reasons. Firstly, it allows very long files to be used, even files which are much longer than memory permits. However, there is always an index with references to the position of each record, and this index has to fit in memory. (This shouldn't be a problem: every record takes only 18 bytes in this index.) Secondly, it is the safest way to work on a file. Even if a system crash occurs for some reason, at most one record will be lost. A few commands are not that safe; this is mentioned where they are discussed.

inter-record-space

When a file is disk-based and a change to a record makes that record grow a bit, the record would have to be moved to the end of the file, as it would no longer fit in its old place in the medium-file. To prevent this from happening too often, you can make sure that there is always a bit of empty space after each record. This space is called the inter-record-space. It is not relevant when a file is memory-based.

Even a large inter-record-space may not always be enough. When it is not, the record is moved and a large empty gap is left in the file. These gaps can be removed with garbage collection. A large inter-record-space is not advisable, as it can waste a lot of disk space. We advise small values, e.g. 10.
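
The trade-off can be made concrete with a little arithmetic. The Python sketch below decides whether a grown record still fits in its old place or has to move to the end of the medium-file. It is an illustration only: we assume that the inter-record-space is counted in bytes, that a moved record is again followed by the same amount of space, and that the whole old slot becomes a gap. The function place_after_growth is a hypothetical name.

```python
def place_after_growth(offset, old_len, new_len, spacing, file_end):
    """Return (new offset, new end of file, gap bytes left behind)."""
    slot = old_len + spacing            # room available at the old place
    if new_len <= slot:
        return offset, file_end, 0      # the record stays where it is
    # Too big: append at the end of the file, the old slot becomes a gap.
    return file_end, file_end + new_len + spacing, slot

# A 100 byte record at offset 0, spacing 10, file currently 1000 bytes long.
stays = place_after_growth(0, 100, 108, 10, 1000)   # grows by 8: still fits
moves = place_after_growth(0, 100, 120, 10, 1000)   # grows by 20: must move
```

In the second case the record ends up at offset 1000 and a gap of 110 bytes stays behind until garbage collection removes it.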

lock

As DATAdesign is a fully multi-user database, it is necessary that records which are edited through a read-write buffer can't be accessed by other read-write buffers. So each record which is accessed by a buffer is locked, unless that buffer is view-only. View-only buffers can always access all existing records.
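
The locking rule can be sketched as a small table which remembers which read-write buffer holds which record. This is a Python illustration under our own assumptions, not the engine's implementation: the class LockTable and its methods are hypothetical, and a refused access is simply reported as False, whereas the engine may well report it differently.

```python
class LockTable:
    """Hypothetical record-lock bookkeeping for one file."""

    def __init__(self):
        self._owner = {}   # recordid -> bufferid of the read-write holder

    def acquire(self, recordid, bufferid, view_only=False):
        """Return True if the buffer may access the record."""
        if view_only:
            return True    # view-only buffers take no lock and are never blocked
        # Take the lock if it is free; otherwise keep the current holder.
        holder = self._owner.setdefault(recordid, bufferid)
        return holder == bufferid

    def release(self, recordid, bufferid):
        """Drop the lock, but only if this buffer actually holds it."""
        if self._owner.get(recordid) == bufferid:
            del self._owner[recordid]

locks = LockTable()
first = locks.acquire(1, bufferid=10)                    # takes the lock
second = locks.acquire(1, bufferid=11)                   # refused: locked
viewer = locks.acquire(1, bufferid=12, view_only=True)   # always allowed
```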


PROGS, Professional & Graphical Software
last edited September 10, 1996