Normalization is the process of compiling libs to a namespace. Remember that every def is modeled as a dict (just a normal collection of name/value pairs). The defs packaged inside libs are called declared defs. Normalization will modify the tags of each def yielding a new dict which we call the effective def. Effective defs are used to compute documentation, reflection, def aware filters, etc.
Note: normalization is only required for software which wishes to build a namespace from source libs. Pre-normalized defs can be downloaded from Project-Haystack. These downloads may also be used as test cases to verify your own normalization software.
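The process to compile libs to a namespace is composed of the following ordered steps:
1. Scan: find the Trio files in the input lib zip files
2. Parse: parse the Trio files into declared def and defx dicts
3. Resolve: resolve symbolic references within each lib namespace
4. Taxonify: derive the taxonomy tree from the is tag
5. Defx: merge defx extensions into their defs
6. Normalize tags: normalize specific tags such as lib and list tags
7. Inherit: apply tag inheritance from supertypes
8. Validate: flag errors not caught in previous steps
9. Generate: build the namespace of effective defs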
Each of these steps is discussed in detail in the following sections.
The input to normalization is a list of lib files. These are the zip files that contain the declared defs. Each zip file is opened and searched for applicable Trio files in the "lib/" directory. There must be exactly one dict defined in "lib/lib.trio" with the lib's meta.
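The scan step can be sketched in Python as follows. This is only an illustration: the function name scan_lib and the choice to accept any ".trio" path under "lib/" (including subdirectories) are assumptions, and checking that "lib/lib.trio" holds exactly one dict is left to the parse step.

```python
import io
import zipfile


def scan_lib(zip_source):
    """Return {path: text} for every Trio file under lib/ in one lib zip.

    zip_source may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as zf:
        trio_paths = [
            name for name in zf.namelist()
            if name.startswith("lib/") and name.endswith(".trio")
        ]
        if "lib/lib.trio" not in trio_paths:
            raise ValueError("lib zip is missing lib/lib.trio")
        return {name: zf.read(name).decode("utf-8") for name in trio_paths}


# Build a tiny in-memory lib zip to exercise the function.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("lib/lib.trio", "def: ^lib:alpha\n")
    zf.writestr("lib/tags.trio", "def: ^alphaTag\nis: ^marker\n")
    zf.writestr("doc/readme.txt", "not a Trio file, so it is skipped")
buf.seek(0)

files = scan_lib(buf)
print(sorted(files))
```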
Every input Trio file found during scan is parsed to discover the declared def dicts identified with the def tag. This phase is also used to discover our def extensions which are identified by the defx tag. Every dict parsed must have either the def or defx tag and the value must be a symbol. During this phase we detect and report illegal duplicate symbols - a given symbol must be mapped to only one def in all the input libs.
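The checks described for this phase could look roughly like the Python sketch below. It assumes the Trio text has already been parsed into plain Python dicts and that a symbol is represented as a string beginning with "^"; both are conventions invented for this example. Treating a dict with both def and defx as an error is also an assumption.

```python
def index_parsed(dicts):
    """Split parsed dicts into defs (keyed by symbol) and defx dicts.

    Returns (defs, defxs, errors). Symbols are modeled as "^name" strings,
    which is a stand-in for a real symbol type.
    """
    defs, defxs, errors = {}, [], []
    for d in dicts:
        has_def, has_defx = "def" in d, "defx" in d
        if has_def == has_defx:  # neither tag, or both (assumed illegal)
            errors.append(f"dict must have exactly one of def or defx: {d}")
            continue
        symbol = d["def"] if has_def else d["defx"]
        if not (isinstance(symbol, str) and symbol.startswith("^")):
            errors.append(f"def/defx value must be a symbol: {symbol!r}")
            continue
        if has_defx:
            defxs.append(d)
        elif symbol in defs:
            errors.append(f"duplicate symbol: {symbol}")
        else:
            defs[symbol] = d
    return defs, defxs, errors


parsed = [
    {"def": "^site", "is": "^entity"},
    {"def": "^site", "is": "^marker"},   # duplicate symbol
    {"defx": "^site", "acmeTerm": "Location"},
    {"doc": "no def or defx tag"},       # missing both tags
    {"def": "site"},                     # value is not a symbol
]
defs, defxs, errors = index_parsed(parsed)
for message in errors:
    print(message)
```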
During this phase we walk every tag in every def and defx to resolve symbolic references. For each def, the following must resolve in its lib namespace:
We compute each lib namespace by collecting only the symbols in the lib's scope. These are the symbols defined by the lib itself and those imported via its includes. Includes are not transitive - including a lib does not also bring in the libs that it includes.
Example:
  // lib alpha
  def: ^lib:alpha
  includes: [^lib:ph]
  --
  def: ^alphaTag
  is: ^marker

  // lib beta
  def: ^lib:beta
  includes: [^lib:ph, ^lib:alpha]
  --
  def: ^betaTag
  is: ^marker

  // lib gamma
  def: ^lib:gamma
  includes: [^lib:ph, ^lib:beta]
The lib namespace to use for resolution would be as follows:
  alpha: all symbols in ph, alphaTag (in its own lib)
  beta: all symbols in ph, betaTag (in its own lib), alphaTag (from include)
  gamma: all symbols in ph, betaTag (from include)
Note that gamma does not have visibility to alphaTag because alpha is not explicitly included.
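The non-transitive scoping rule can be illustrated with a short Python sketch. The libs table below is a hand-built stand-in for the alpha, beta, and gamma example (the ph lib is reduced to two symbols), and the function name lib_namespace is hypothetical.

```python
# Each lib lists its direct includes and the symbols it defines itself.
libs = {
    "ph":    {"includes": [],              "symbols": {"marker", "str"}},
    "alpha": {"includes": ["ph"],          "symbols": {"alphaTag"}},
    "beta":  {"includes": ["ph", "alpha"], "symbols": {"betaTag"}},
    "gamma": {"includes": ["ph", "beta"],  "symbols": set()},
}


def lib_namespace(libs, name):
    """Symbols visible to a lib: its own plus those of its direct includes."""
    lib = libs[name]
    scope = set(lib["symbols"])
    for included in lib["includes"]:
        # One level only: the includes of an included lib are not followed.
        scope |= libs[included]["symbols"]
    return scope


for name in ("alpha", "beta", "gamma"):
    print(name, sorted(lib_namespace(libs, name)))
```

Because the loop never recurses into an included lib's own includes, gamma sees betaTag but not alphaTag.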
Resolution is performed on def and defx dicts separately. This allows defx dicts to reference symbols which aren't in scope of the source def.
All of the following steps require knowing the taxonomy tree to determine each tag's supertypes. This phase should walk through each def and recursively derive its supertypes based on each def's is tag. Specifically:
- Infer the is tag on feature key defs
- Verify that every def has an is tag, with the exception of the following root tags: marker, val, and feature
Example:
  // before
  def: ^filetype:json
  mime: "application/json"

  // after
  def: ^filetype:json
  mime: "application/json"
  is: [^filetype]
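A rough Python sketch of this phase is shown below. It assumes defs are held in a dict keyed by "^symbol" strings, that a feature key is recognizable by the ":" in its symbol, and that the is tag is only inferred when it is missing. The function names taxonify and supertypes are made up for the example.

```python
ROOTS = {"^marker", "^val", "^feature"}  # root tags that need no is tag


def taxonify(defs):
    """Infer is on feature key defs and report defs that lack an is tag."""
    errors = []
    for symbol, d in defs.items():
        if "is" in d:
            continue
        if ":" in symbol:
            # Feature key such as ^filetype:json: supertype is the feature.
            d["is"] = [symbol.split(":", 1)[0]]
        elif symbol not in ROOTS:
            errors.append(f"{symbol} has no is tag")
    return errors


def supertypes(defs, symbol, seen=None):
    """Recursively collect the supertypes of a symbol in depth-first order."""
    seen = [] if seen is None else seen
    is_value = defs.get(symbol, {}).get("is", [])
    parents = [is_value] if isinstance(is_value, str) else is_value
    for parent in parents:
        if parent not in seen:
            seen.append(parent)
            supertypes(defs, parent, seen)
    return seen


defs = {
    "^feature": {"def": "^feature"},
    "^filetype": {"def": "^filetype", "is": "^feature"},
    "^filetype:json": {"def": "^filetype:json", "mime": "application/json"},
    "^orphan": {"def": "^orphan"},  # deliberately missing its is tag
}
errors = taxonify(defs)
print(defs["^filetype:json"]["is"])
print(supertypes(defs, "^filetype:json"))
print(errors)
```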
After the resolve and taxonify steps complete, we can apply our defx extensions to their respective defs. Each defx is used to add one or more tags to a source def identified by the defx tag. The defx mechanism provides for late binding of def metadata. Just like RDF allows anyone to add a triple to a given subject, anyone can add tags to a def using a defx.
Every defx must reference a def within its scope and must only add new tags. It is illegal for a defx to specify a tag declared by the def itself or by another defx. The exception to this rule is tags annotated as accumulate which should be aggregated into a list.
Here is an example to illustrate:
  // source def
  def: ^date
  is: ^scalar

  // two defx dicts declared in separate libs
  defx: ^date
  acmeTerm: "ISODate"
  --
  defx: ^date
  wombatFormatter: "DateFormatter"

  // after defx are merged, effective def is
  def: ^date
  is: ^scalar
  acmeTerm: "ISODate"
  wombatFormatter: "DateFormatter"
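The merge rules might be implemented along the lines of this Python sketch. It assumes defs and defx dicts are plain Python dicts, and that the caller passes in the names of tags annotated as accumulate (in a real implementation these would be looked up from the tag defs). The scope check on the referenced def is omitted, and apply_defx is a hypothetical name.

```python
def apply_defx(defs, defxs, accumulate=frozenset()):
    """Merge defx tags into copies of their defs; return (effective, errors)."""
    effective = {symbol: dict(d) for symbol, d in defs.items()}
    errors = []
    for x in defxs:
        target = effective.get(x["defx"])
        if target is None:
            errors.append(f"defx references unknown def {x['defx']}")
            continue
        for name, value in x.items():
            if name == "defx":
                continue
            if name in accumulate:
                # Accumulated tags are aggregated into a list.
                old = target.get(name, [])
                old = old if isinstance(old, list) else [old]
                new = value if isinstance(value, list) else [value]
                target[name] = old + new
            elif name in target:
                # Already declared by the def itself or by an earlier defx.
                errors.append(f"defx for {x['defx']} redeclares {name}")
            else:
                target[name] = value
    return effective, errors


defs = {"^date": {"def": "^date", "is": "^scalar"}}
defxs = [
    {"defx": "^date", "acmeTerm": "ISODate"},
    {"defx": "^date", "wombatFormatter": "DateFormatter"},
    {"defx": "^date", "acmeTerm": "Conflicting"},  # illegal redeclaration
]
effective, errors = apply_defx(defs, defxs)
print(effective["^date"])
print(errors)
```

Working on copies keeps the declared defs intact, so the declared and effective forms can still be compared afterwards.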
This step normalizes specific tags according to the following rules:
Every def has an inferred lib tag which is a reference to its declaring lib. It is invalid for any def to declare its own lib tag. The lib meta itself also receives a lib tag referencing itself. Example:
  // before (within phIoT lib)
  def: ^equip
  is: ^entity

  // after
  def: ^equip
  is: ^entity
  lib: ^lib:phIoT
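As a small Python sketch of this rule, assuming each lib's declared defs are available as a list of plain dicts with "^symbol" strings (the function name add_lib_tags is hypothetical):

```python
def add_lib_tags(lib_symbol, declared):
    """Return copies of a lib's defs with the inferred lib tag added."""
    tagged = []
    for d in declared:
        if "lib" in d:
            raise ValueError(f"{d['def']} must not declare its own lib tag")
        tagged.append({**d, "lib": lib_symbol})
    return tagged


declared = [
    {"def": "^lib:phIoT"},               # the lib meta def itself
    {"def": "^equip", "is": "^entity"},
]
tagged = add_lib_tags("^lib:phIoT", declared)
for d in tagged:
    print(d)
```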
As a convenience, tags such as is and tagOn can use a single symbol instead of a list of symbols. However, their normalized representation must always be a list. This phase should iterate every def and normalize any tag which subtypes from list. Example:
  // before
  def: ^geoCity
  is: ^str
  tagOn: ^geoPlace

  // after
  def: ^geoCity
  is: [^str]
  tagOn: [^geoPlace]
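List normalization might look like the Python sketch below. It assumes the set of tag names that subtype from list has already been computed from the taxonomy and is passed in; the set used here and the function name normalize_lists are illustrative only.

```python
def normalize_lists(d, list_tags):
    """Return a copy of a def where every list-typed tag holds a list."""
    return {
        name: [value] if name in list_tags and not isinstance(value, list) else value
        for name, value in d.items()
    }


# Assumed to come from the taxonomy: tags which subtype from list.
list_tags = {"is", "tagOn"}

before = {"def": "^geoCity", "is": "^str", "tagOn": "^geoPlace"}
after = normalize_lists(before, list_tags)
print(after)
```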
The inherit phase applies tag inheritance from supertypes, which are processed in the order they are listed in the is tag.
In the case of ambiguity via multiple inheritance, a subtype should explicitly declare the tag value.
Here is a fictitious example for an El Camino which is a hybrid between a car and a pickup truck:
  // declarations
  def: ^car
  numDoors: 4
  color: "red"
  engine: "V8"
  ----
  def: ^pickup
  numDoors: 2
  color: "blue"
  bedLength: 80in
  ----
  def: ^elCamino
  is: [^pickup, ^car]
  color: "purple"
The normalized definition with inheritance would be:
  def: ^elCamino          // declared
  is: [^pickup, ^car]     // declared
  color: "purple"         // declared
  numDoors: 2             // inherited from pickup first
  engine: "V8"            // inherited from car
  bedLength: 80in         // inherited from pickup
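The El Camino example can be reproduced with a short Python sketch. It assumes is has already been normalized to a list, that declared tags always win over inherited ones, and that supertypes are visited in the order given by is so the first supertype providing a tag wins. The function name inherit is hypothetical and the number 80in is stored as a plain string.

```python
def inherit(defs, symbol, _active=frozenset()):
    """Return the effective dict for symbol with inherited tags filled in."""
    if symbol in _active:
        raise ValueError(f"cyclic is reference at {symbol}")
    declared = defs[symbol]
    effective = dict(declared)  # declared tags are never overridden
    for parent in declared.get("is", []):
        inherited = inherit(defs, parent, _active | {symbol})
        for name, value in inherited.items():
            # The def tag identifies the supertype, so it is not inherited.
            if name != "def" and name not in effective:
                effective[name] = value
    return effective


defs = {
    "^car": {"def": "^car", "numDoors": 4, "color": "red", "engine": "V8"},
    "^pickup": {"def": "^pickup", "numDoors": 2, "color": "blue",
                "bedLength": "80in"},
    "^elCamino": {"def": "^elCamino", "is": ["^pickup", "^car"],
                  "color": "purple"},
}
el_camino = inherit(defs, "^elCamino")
print(el_camino)
```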
Before generation, normalization software should validate tag rules to flag errors not caught in previous steps:
- Each lib meta def has its required tags
- No def uses index, which is reserved for documentation purposes
- of is a subtype of marker
- tagOn is only used on tag defs (not conjuncts or feature keys)
- relationship tags are only used on defs which subtype from ref

Once all previous steps have completed successfully, we can generate our namespace. From a logical perspective a namespace is a map of symbols to effective def dicts. Actual implementations will typically represent it with more specific data structures, which can then be used to perform additional computations such as reflection and def aware filters.
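One possible shape for the generated namespace, sketched in Python: a read-only map of symbols to read-only effective defs, plus one small reflection helper. The names generate_namespace and direct_subtypes are invented for this example, and the sample defs are reduced to the tags needed here.

```python
from types import MappingProxyType


def generate_namespace(effective_defs):
    """Freeze effective defs into a read-only symbol -> def mapping."""
    return MappingProxyType({
        symbol: MappingProxyType(dict(d))
        for symbol, d in effective_defs.items()
    })


def direct_subtypes(ns, symbol):
    """Reflection helper: symbols whose is list names the given symbol."""
    return sorted(s for s, d in ns.items() if symbol in d.get("is", []))


ns = generate_namespace({
    "^marker": {"def": "^marker"},
    "^entity": {"def": "^entity", "is": ["^marker"]},
    "^equip": {"def": "^equip", "is": ["^entity"]},
    "^site": {"def": "^site", "is": ["^entity"]},
})
print(direct_subtypes(ns, "^entity"))
```

A read-only view is used so that later computations cannot accidentally modify the effective defs.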