Forth 2012: BEGIN-STRUCTURE

( "<spaces>name" -- struct-sys 0 )

Skip leading space delimiters. Parse name delimited by a space. Create a definition for name with the execution semantics defined below. Return a struct-sys (zero or more implementation dependent items) that will be used by END-STRUCTURE and an initial offset of 0.

name Execution:

( -- +n )

+n is the size in memory expressed in address units of the data structure. An ambiguous condition exists if name is executed prior to the associated END-STRUCTURE being executed.

See:

10.6.2.0135 +FIELD, 10.6.2.1336 END-STRUCTURE,
A.10.6.2.0763 BEGIN-STRUCTURE.

Rationale:

There are two schools of thought regarding named data structures: name first and name last. The name last school can define a named data structure as follows:

0                         \ initial total byte count
   1 CELLS +FIELD p.x    \ A single cell field named p.x
   1 CELLS +FIELD p.y    \ A single cell field named p.y
CONSTANT point          \ save structure size

While the name first school would define the same data structure as:

BEGIN-STRUCTURE point \ create the named structure
   1 CELLS +FIELD p.x \ A single cell field named p.x
   1 CELLS +FIELD p.y \ A single cell field named p.y
END-STRUCTURE
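
Whichever school is used, the resulting words behave the same way once the definition is complete. The following is a minimal usage sketch, not part of the standard text; origin is an illustrative name.

```forth
BEGIN-STRUCTURE point          \ name first definition, as above
   1 CELLS +FIELD p.x
   1 CELLS +FIELD p.y
END-STRUCTURE

CREATE origin  point ALLOT     \ reserve data space for one point

3 origin p.x !                 \ a field word adds its offset to the base address
4 origin p.y !
origin p.x @  origin p.y @  + .   \ displays 7
```

Executing point leaves the structure size, so it can be passed directly to ALLOT.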

Although many systems provide a name first structure, there is no common practice in the words used. The words BEGIN-STRUCTURE and END-STRUCTURE have been defined as a means of providing a portable notation that does not conflict with existing systems.

The field defining words (xFIELD: and +FIELD) are defined so they can be used by both schools. Compatibility between the two schools comes from defining a new stack item struct-sys, which is implementation dependent and can be 0 or more cells. The name first school would provide an address (addr) as the struct-sys parameter, while the name last school would define struct-sys as being empty.

Executing the name of the data structure returns the size of the data structure. This allows the data structure to be used within another data structure:

BEGIN-STRUCTURE point \ -- a-addr 0 ; -- lenp
   FIELD: p.x             \ -- a-addr cell
   FIELD: p.y             \ -- a-addr cell*2
END-STRUCTURE
BEGIN-STRUCTURE rect    \ -- a-addr 0 ; -- lenr
   point +FIELD r.tlhc    \ -- a-addr cell*2
   point +FIELD r.brhc    \ -- a-addr cell*4
END-STRUCTURE
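
Nested fields are reached by applying the field words in sequence, outermost first. The sketch below repeats the two definitions so that it stands alone; box and rect-width are illustrative names that do not appear in the standard.

```forth
BEGIN-STRUCTURE point
   FIELD: p.x
   FIELD: p.y
END-STRUCTURE
BEGIN-STRUCTURE rect
   point +FIELD r.tlhc
   point +FIELD r.brhc
END-STRUCTURE

CREATE box  rect ALLOT            \ one rect in data space

0  box r.tlhc p.x !    0  box r.tlhc p.y !
30 box r.brhc p.x !    20 box r.brhc p.y !

: rect-width ( rect-addr -- n )   \ hypothetical helper
   DUP r.brhc p.x @  SWAP r.tlhc p.x @  - ;

box rect-width .                  \ displays 30
rect point 2* = .                 \ displays -1 (true): two points per rect
```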

Alignment:

In practice, structures are used for two different purposes with incompatible requirements:

1. For collecting related internal-use data into a convenient "package" that can be referred to by a single "handle". For this use, alignment is important, so that efficient native fetch and store instructions can be used.
2. For mapping external data structures like hardware register maps and protocol packets. For this use, automatic alignment is inappropriate, because the alignment of the external data structure often doesn't match the rules for a given processor.

Many languages cater for the first use, but ignore the second. This leads to various customized solutions, usage requirements, portability problems, bugs, etc. +FIELD is defined to be non-aligning, while the named field defining words (xFIELD:) are aligning. This is intentional and allows for both uses.
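
The difference can be seen by laying out the same two fields both ways. This is an illustrative sketch only; the structure and field names are hypothetical, and the sizes noted in the comments assume a system in which a character is smaller than a cell and aligned addresses are multiples of the cell size.

```forth
BEGIN-STRUCTURE packet          \ external layout: no padding wanted
   1 CHARS +FIELD pk.tag        \ offset 0
   1 CELLS +FIELD pk.len        \ offset 1 CHARS, possibly unaligned
END-STRUCTURE

BEGIN-STRUCTURE record          \ internal layout: aligned fields
   CFIELD: rc.tag               \ offset 0
   FIELD:  rc.len               \ offset rounded up to an aligned address
END-STRUCTURE

packet .    \ 1 CHARS + 1 CELLS
record .    \ typically 2 CELLS, because of the padding after rc.tag
```

Because pk.len may sit at an unaligned address, a portable program would read it with character-wise accesses rather than with @.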

The standard currently defines an aligned field defining word for each of the standard data types:

CFIELD:	a character
FIELD:	a native integer (single cell)
FFIELD:	a native float
SFFIELD:	a 32 bit float
DFFIELD:	a 64 bit float

Although this is a sufficient set, most systems provide facilities to define field defining words for standard data types.
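
As an illustration of such a facility, an aligning field defining word can be built from ALIGNED and +FIELD. The sketch below defines a double-cell field; 2FIELD: is a hypothetical name (a system may already supply its own equivalent), the account structure is invented for the example, and D. assumes the Double-Number word set is present.

```forth
\ Hypothetical aligned field definer for a double-cell value.
: 2FIELD: ( n1 "name" -- n2 ; addr1 -- addr2 )
   ALIGNED  2 CELLS +FIELD ;

BEGIN-STRUCTURE account
   FIELD:  ac.id
   2FIELD: ac.balance          \ holds a double-cell number
END-STRUCTURE

CREATE acct  account ALLOT
42 acct ac.id !
1000000. acct ac.balance 2!    \ trailing dot marks a double-cell literal
acct ac.balance 2@ D.          \ displays 1000000
```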

Future:

The following cannot be defined until the required addressing has been defined. The names should be considered reserved until then.

BFIELD:	1 byte (8 bit) field
WFIELD:	16 bit field
LFIELD:	32 bit field
XFIELD:	64 bit field

Implementation:

Begin definition of a new structure. Use in the form BEGIN-STRUCTURE <name>. At run time <name> returns the size of the structure.

: BEGIN-STRUCTURE  \ -- addr 0 ; -- size
   CREATE
     HERE 0 0 ,      \ mark stack, lay dummy
   DOES> @             \ -- rec-len
;
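
The dummy cell laid down above only becomes useful once END-STRUCTURE fills it in. The following sketch shows how the three words could fit together in a name first system, modelled on the reference implementations given for +FIELD and END-STRUCTURE; it assumes struct-sys is the single address left by BEGIN-STRUCTURE, and pair, pr.a and pr.b are illustrative names.

```forth
: BEGIN-STRUCTURE ( "name" -- addr 0 ; -- size )
   CREATE  HERE 0 0 ,        \ leave body address and offset 0, lay dummy size
   DOES> @ ;                 \ at run time fetch the stored size

: +FIELD ( n1 n2 "name" -- n3 ; addr1 -- addr2 )
   CREATE  OVER , +          \ store current offset n1, leave n1+n2
   DOES> @ + ;               \ at run time add the stored offset

: END-STRUCTURE ( addr n -- )
   SWAP ! ;                  \ patch the dummy cell with the final size

BEGIN-STRUCTURE pair
   1 CELLS +FIELD pr.a
   1 CELLS +FIELD pr.b
END-STRUCTURE

pair 2 CELLS = .             \ displays -1 (true)
```

In a name last system the same +FIELD works unchanged, since it only touches the running offset on top of the stack.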