Commit | Line | Data |
---|---|---|
15c0b25d AP |
1 | # HCL Syntax-Agnostic Information Model |
2 | ||
3 | This is the specification for the general information model (abstract types and | |
4 | semantics) for hcl. HCL is a system for defining configuration languages for | |
5 | applications. The HCL information model is designed to support multiple | |
6 | concrete syntaxes for configuration, each with a mapping to the model defined | |
7 | in this specification. | |
8 | ||
9 | The two primary syntaxes intended for use in conjunction with this model are | |
10 | [the HCL native syntax](./hclsyntax/spec.md) and [the JSON syntax](./json/spec.md). | |
11 | In principle other syntaxes are possible as long as either their language model | |
12 | is sufficiently rich to express the concepts described in this specification | |
13 | or the language targets a well-defined subset of the specification. | |
14 | ||
15 | ## Structural Elements | |
16 | ||
17 | The primary structural element is the _body_, which is a container representing | |
18 | a set of zero or more _attributes_ and a set of zero or more _blocks_. | |
19 | ||
20 | A _configuration file_ is the top-level object, and will usually be produced | |
21 | by reading a file from disk and parsing it as a particular syntax. A | |
22 | configuration file has its own _body_, representing the top-level attributes | |
23 | and blocks. | |
24 | ||
25 | An _attribute_ is a name and value pair associated with a body. Attribute names | |
26 | are unique within a given body. Attribute values are provided as _expressions_, | |
27 | which are discussed in detail in a later section. | |
28 | ||
29 | A _block_ is a nested structure that has a _type name_, zero or more string | |
30 | _labels_ (e.g. identifiers), and a nested body. | |
31 | ||
107c1cdb | 32 | Together the structural elements create a hierarchical data structure, with |
15c0b25d AP |
33 | attributes intended to represent the direct properties of a particular object |
34 | in the calling application, and blocks intended to represent child objects | |
35 | of a particular object. | |
36 | ||
37 | ## Body Content | |
38 | ||
39 | To support the expression of the HCL concepts in languages whose information | |
40 | model is a subset of HCL's, such as JSON, a _body_ is an opaque container | |
41 | whose content can only be accessed by providing information on the expected | |
42 | structure of the content. | |
43 | ||
44 | The specification for each syntax must describe how its physical constructs | |
45 | are mapped on to body content given a schema. For syntaxes that have | |
46 | first-class syntax distinguishing attributes and blocks, this can be relatively | |
47 | straightforward, while more detailed mapping rules may be required in syntaxes | |
48 | where the representation of attributes vs. blocks is ambiguous. | |
49 | ||
50 | ### Schema-driven Processing | |
51 | ||
52 | Schema-driven processing is the primary way to access body content. | |
53 | A _body schema_ is a description of what is expected within a particular body, | |
54 | which can then be used to extract the _body content_, which then provides | |
55 | access to the specific attributes and blocks requested. | |
56 | ||
57 | A _body schema_ consists of a list of _attribute schemata_ and | |
58 | _block header schemata_: | |
59 | ||
107c1cdb | 60 | - An _attribute schema_ provides the name of an attribute and whether its |
15c0b25d AP |
61 | presence is required. |
62 | ||
107c1cdb | 63 | - A _block header schema_ provides a block type name and the semantic names |
15c0b25d AP |
64 | assigned to each of the labels of that block type, if any. |
65 | ||
66 | Within a schema, it is an error to request the same attribute name twice or | |
67 | to request a block type whose name is also an attribute name. While this can | |
68 | in principle be supported in some syntaxes, in other syntaxes the attribute | |
69 | and block namespaces are combined and so an attribute cannot coexist with | |
70 | a block whose type name is identical to the attribute name. | |
71 | ||
72 | The result of applying a body schema to a body is _body content_, which | |
73 | consists of an _attribute map_ and a _block sequence_: | |
74 | ||
107c1cdb | 75 | - The _attribute map_ is a map data structure whose keys are attribute names |
15c0b25d AP |
76 | and whose values are _expressions_ that represent the corresponding attribute |
77 | values. | |
78 | ||
107c1cdb | 79 | - The _block sequence_ is an ordered sequence of blocks, with each specifying |
15c0b25d AP |
80 | a block _type name_, the sequence of _labels_ specified for the block, |
81 | and the body object (not body _content_) representing the block's own body. | |
82 | ||
83 | After obtaining _body content_, the calling application may continue processing | |
84 | by evaluating attribute expressions and/or recursively applying further | |
85 | schema-driven processing to the child block bodies. | |
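As a rough illustration of schema-driven processing, the following Python sketch applies a body schema to an in-memory body. The names `Body`, `Block`, `BodySchema` and `apply_schema` are hypothetical and not part of this specification, expressions are treated as opaque values, and treating a label-count mismatch as an error is an assumption of the sketch. Schema attributes are held in a dict, so the same attribute name cannot be requested twice.

```python
from dataclasses import dataclass, field


@dataclass
class Body:
    # Hypothetical in-memory body: attribute name -> opaque expression.
    attributes: dict = field(default_factory=dict)
    blocks: list = field(default_factory=list)


@dataclass
class Block:
    type_name: str
    labels: list
    body: Body


@dataclass
class BodySchema:
    attributes: dict  # attribute name -> True if the attribute is required
    blocks: dict      # block type name -> list of semantic label names


def apply_schema(body, schema):
    """Return (attribute_map, block_sequence), or raise ValueError."""
    clash = set(schema.attributes) & set(schema.blocks)
    if clash:
        raise ValueError(f"name used for attribute and block type: {sorted(clash)}")
    for name, required in schema.attributes.items():
        if required and name not in body.attributes:
            raise ValueError(f"missing required attribute {name!r}")
    # The schema is exhaustive by default: anything unmatched is an error.
    for name in body.attributes:
        if name not in schema.attributes:
            raise ValueError(f"unsupported attribute {name!r}")
    for block in body.blocks:
        if block.type_name not in schema.blocks:
            raise ValueError(f"unsupported block type {block.type_name!r}")
        # Assumption of this sketch: label count must match the schema.
        if len(block.labels) != len(schema.blocks[block.type_name]):
            raise ValueError(f"wrong number of labels for {block.type_name!r}")
    # Child bodies are returned as bodies, not as body content.
    return dict(body.attributes), list(body.blocks)
```

The caller would then evaluate the returned expressions or call `apply_schema` again on each child block's body with a schema suited to that block type.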
86 | ||
87 | **Note:** The _body schema_ is intentionally minimal, to reduce the set of | |
88 | mapping rules that must be defined for each syntax. Higher-level utility | |
89 | libraries may be provided to assist in the construction of a schema and | |
90 | perform additional processing, such as automatically evaluating attribute | |
91 | expressions and assigning their result values into a data structure, or | |
92 | recursively applying a schema to child blocks. Such utilities are not part of | |
93 | this core specification and will vary depending on the capabilities and idiom | |
94 | of the implementation language. | |
95 | ||
96 | ### _Dynamic Attributes_ Processing | |
97 | ||
98 | The _schema-driven_ processing model is useful when the expected structure | |
99 | of a body is known a priori by the calling application. Some blocks are | |
100 | instead more free-form, such as a user-provided set of arbitrary key/value | |
101 | pairs. | |
102 | ||
103 | The alternative _dynamic attributes_ processing mode allows for this more | |
104 | ad-hoc approach. Processing in this mode behaves as if a schema had been | |
105 | constructed without any _block header schemata_ and with an attribute | |
106 | schema for each distinct key provided within the physical representation | |
107 | of the body. | |
108 | ||
109 | The means by which _distinct keys_ are identified is dependent on the | |
110 | physical syntax; this processing mode assumes that the syntax has a way | |
111 | to enumerate keys provided by the author and identify expressions that | |
112 | correspond with those keys, but does not define the means by which this is | |
113 | done. | |
114 | ||
115 | The result of _dynamic attributes_ processing is an _attribute map_ as | |
116 | defined in the previous section. No _block sequence_ is produced in this | |
117 | processing mode. | |
118 | ||
119 | ### Partial Processing of Body Content | |
120 | ||
121 | Under _schema-driven processing_, by default the given schema is assumed | |
122 | to be exhaustive, such that any attribute or block not matched by schema | |
123 | elements is considered an error. This allows feedback about unsupported | |
124 | attributes and blocks (such as typos) to be provided. | |
125 | ||
126 | An alternative is _partial processing_, where any additional elements within | |
127 | the body are not considered an error. | |
128 | ||
129 | Under partial processing, the result is both body content as described | |
130 | above _and_ a new body that represents any body elements that remain after | |
131 | the schema has been processed. | |
132 | ||
133 | Specifically: | |
134 | ||
107c1cdb | 135 | - Any attribute whose name is specified in the schema is returned in body |
15c0b25d AP |
136 | content and elided from the new body. |
137 | ||
107c1cdb | 138 | - Any block whose type is specified in the schema is returned in body content |
15c0b25d AP |
139 | and elided from the new body. |
140 | ||
107c1cdb | 141 | - Any attribute or block _not_ meeting the above conditions is placed into |
15c0b25d AP |
142 | the new body, unmodified. |
143 | ||
144 | The new body can then be recursively processed using any of the body | |
145 | processing models. This facility allows different subsets of body content | |
146 | to be processed by different parts of the calling application. | |
147 | ||
148 | Processing a body in two steps — first partial processing of a source body, | |
149 | then exhaustive processing of the returned body — is equivalent to single-step | |
150 | processing with a schema that is the union of the schemata used | |
151 | across the two steps. | |
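The partial processing rules above can be sketched in Python as follows. This is a minimal illustration under assumed representations: a body is a hypothetical pair of an attribute dict and a list of `(type_name, labels, body)` tuples, a schema is reduced to a set of attribute names and a set of block type names, and required-attribute checks are left out.

```python
def partial_content(body, attr_names, block_types):
    """Split a body into (body content, remaining body)."""
    attrs, blocks = body
    content = (
        {k: v for k, v in attrs.items() if k in attr_names},
        [b for b in blocks if b[0] in block_types],
    )
    # Elements not named in the schema move to the new body, unmodified.
    remaining = (
        {k: v for k, v in attrs.items() if k not in attr_names},
        [b for b in blocks if b[0] not in block_types],
    )
    return content, remaining


def exhaustive_content(body, attr_names, block_types):
    """Exhaustive processing: any leftover element is an error."""
    content, remaining = partial_content(body, attr_names, block_types)
    if remaining[0] or remaining[1]:
        raise ValueError("body contains unsupported attributes or blocks")
    return content
```

Calling `partial_content` with one schema and then `exhaustive_content` on the remaining body with a second schema accepts the same bodies as a single `exhaustive_content` call with the union of both schemata.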
152 | ||
153 | ## Expressions | |
154 | ||
155 | Attribute values are represented by _expressions_. Depending on the concrete | |
156 | syntax in use, an expression may just be a literal value or it may describe | |
157 | a computation in terms of literal values, variables, and functions. | |
158 | ||
159 | Each syntax defines its own representation of expressions. For syntaxes based | |
160 | in languages that do not have any non-literal expression syntax, it is | |
161 | recommended to embed the template language from | |
162 | [the native syntax](./hclsyntax/spec.md) e.g. as a post-processing step on | |
163 | string literals. | |
164 | ||
165 | ### Expression Evaluation | |
166 | ||
167 | In order to obtain a concrete value, each expression must be _evaluated_. | |
168 | Evaluation is performed in terms of an evaluation context, which | |
169 | consists of the following: | |
170 | ||
107c1cdb ND |
171 | - An _evaluation mode_, which is defined below. |
172 | - A _variable scope_, which provides a set of named variables for use in | |
15c0b25d | 173 | expressions. |
107c1cdb | 174 | - A _function table_, which provides a set of named functions for use in |
15c0b25d AP |
175 | expressions. |
176 | ||
177 | The _evaluation mode_ allows for two different interpretations of an | |
178 | expression: | |
179 | ||
107c1cdb | 180 | - In _literal-only mode_, variables and functions are not available and it |
15c0b25d AP |
181 | is assumed that the calling application's intent is to treat the attribute |
182 | value as a literal. | |
183 | ||
107c1cdb | 184 | - In _full expression mode_, variables and functions are defined and it is |
15c0b25d AP |
185 | assumed that the calling application wishes to provide a full expression |
186 | language for definition of the attribute value. | |
187 | ||
188 | The actual behavior of these two modes depends on the syntax in use. For | |
189 | languages with first-class expression syntax, these two modes may be considered | |
190 | equivalent, with _literal-only mode_ simply not defining any variables or | |
191 | functions. For languages that embed arbitrary expressions via string templates, | |
192 | _literal-only mode_ may disable such processing, allowing literal strings to | |
193 | pass through without interpretation as templates. | |
194 | ||
195 | Since literal-only mode does not support variables and functions, it is an | |
196 | error for the calling application to enable this mode and yet provide a | |
197 | variable scope and/or function table. | |
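One possible shape for an evaluation context is sketched below in Python. The class and field names are hypothetical. The sketch interprets "provide" as passing anything other than `None`, which is an assumption, since the text does not say whether an empty scope or table counts as provided.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EvalMode(Enum):
    LITERAL_ONLY = "literal-only"
    FULL_EXPRESSION = "full expression"


@dataclass
class EvalContext:
    mode: EvalMode
    variables: Optional[dict] = None  # variable name -> value
    functions: Optional[dict] = None  # function name -> function

    def __post_init__(self):
        # Literal-only mode must not be combined with a scope or table.
        provided = self.variables is not None or self.functions is not None
        if self.mode is EvalMode.LITERAL_ONLY and provided:
            raise ValueError(
                "literal-only mode does not accept variables or functions"
            )
```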
198 | ||
199 | ## Values and Value Types | |
200 | ||
201 | The result of expression evaluation is a _value_. Each value has a _type_, | |
202 | which is dynamically determined during evaluation. The _variable scope_ in | |
203 | the evaluation context is a map from variable name to value, using the same | |
204 | definition of value. | |
205 | ||
206 | The type system for HCL values is intended to be of a level of abstraction | |
207 | suitable for configuration of various applications. A well-defined, | |
208 | implementation-language-agnostic type system is defined to allow for | |
209 | consistent processing of configuration across many implementation languages. | |
210 | Concrete implementations may provide additional functionality to lower | |
211 | HCL values and types to corresponding native language types, which may then | |
212 | impose additional constraints on the values outside of the scope of this | |
213 | specification. | |
214 | ||
215 | Two values are _equal_ if and only if they have identical types and their | |
216 | values are equal according to the rules of their shared type. | |
217 | ||
218 | ### Primitive Types | |
219 | ||
220 | The primitive types are _string_, _bool_, and _number_. | |
221 | ||
222 | A _string_ is a sequence of Unicode characters. Two strings are equal if | |
223 | NFC normalization ([UAX#15](http://unicode.org/reports/tr15/)) | |
224 | of each string produces two identical sequences of characters. | |
225 | NFC normalization ensures that, for example, a precomposed combination of a | |
226 | Latin letter and a diacritic compares equal with the letter followed by | |
227 | a combining diacritic. | |
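The string equality rule can be illustrated with Python's standard `unicodedata` module. The helper name `strings_equal` is hypothetical; only the normalization call comes from the standard library.

```python
import unicodedata


def strings_equal(a: str, b: str) -> bool:
    """Compare two strings after NFC normalization of each."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# U+00E9 is a precomposed "e with acute"; the second operand is "e"
# followed by U+0301, the combining acute accent.
precomposed = "\u00e9"
combining = "e\u0301"
print(strings_equal(precomposed, combining))
```

A plain `==` comparison of the two operands would report them as different, because their code point sequences differ before normalization.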
228 | ||
229 | The _bool_ type has only two non-null values: _true_ and _false_. Two bool | |
230 | values are equal if and only if they are either both true or both false. | |
231 | ||
232 | A _number_ is an arbitrary-precision floating point value. An implementation | |
233 | _must_ make the full-precision values available to the calling application | |
234 | for interpretation into any suitable number representation. An implementation | |
235 | may in practice implement numbers with limited precision so long as the | |
236 | following constraints are met: | |
237 | ||
107c1cdb ND |
238 | - Integers are represented with at least 256 bits. |
239 | - Non-integer numbers are represented as floating point values with a | |
15c0b25d AP |
240 | mantissa of at least 256 bits and a signed binary exponent of at least |
241 | 16 bits. | |
107c1cdb | 242 | - An error is produced if an integer value given in source cannot be |
15c0b25d | 243 | represented precisely. |
107c1cdb | 244 | - An error is produced if a non-integer value cannot be represented due to |
15c0b25d | 245 | overflow. |
107c1cdb | 246 | - A non-integer number is rounded to the nearest possible value when a |
15c0b25d AP |
247 | value is of too high a precision to be represented. |
248 | ||
249 | The _number_ type also requires representation of both positive and negative | |
250 | infinity. A "not a number" (NaN) value is _not_ provided nor used. | |
251 | ||
252 | Two number values are equal if they are numerically equal to the precision | |
253 | associated with the number. Positive infinity and negative infinity are | |
254 | equal to themselves but not to each other. Positive infinity is greater than | |
255 | any other number value, and negative infinity is less than any other number | |
256 | value. | |
257 | ||
258 | Some syntaxes may be unable to represent numeric literals of arbitrary | |
259 | precision. This must be defined in the syntax specification as part of its | |
260 | description of mapping numeric literals to HCL values. | |
261 | ||
262 | ### Structural Types | |
263 | ||
264 | _Structural types_ are types that are constructed by combining other types. | |
265 | Each distinct combination of other types is itself a distinct type. There | |
266 | are two structural type _kinds_: | |
267 | ||
107c1cdb | 268 | - _Object types_ are constructed of a set of named attributes, each of which |
15c0b25d AP |
269 | has a type. Attribute names are always strings. (_Object_ attributes are a |
270 | distinct idea from _body_ attributes, though calling applications | |
271 | may choose to blur the distinction by use of common naming schemes.) | |
107c1cdb | 272 | - _Tuple types_ are constructed of a sequence of elements, each of which |
15c0b25d AP |
273 | has a type. |
274 | ||
275 | Values of structural types are compared for equality in terms of their | |
276 | attributes or elements. A structural type value is equal to another if and | |
277 | only if all of the corresponding attributes or elements are equal. | |
278 | ||
279 | Two structural types are identical if they are of the same kind and | |
280 | have attributes or elements with identical types. | |
281 | ||
282 | ### Collection Types | |
283 | ||
284 | _Collection types_ are types that combine together an arbitrary number of | |
285 | values of some other single type. There are three collection type _kinds_: | |
286 | ||
107c1cdb ND |
287 | - _List types_ represent ordered sequences of values of their element type. |
288 | - _Map types_ represent values of their element type accessed via string keys. | |
289 | - _Set types_ represent unordered sets of distinct values of their element type. | |
15c0b25d AP |
290 | |
291 | For each of these kinds and each distinct element type there is a distinct | |
292 | collection type. For example, "list of string" is a distinct type from | |
293 | "set of string", and "list of number" is a distinct type from "list of string". | |
294 | ||
295 | Values of collection types are compared for equality in terms of their | |
296 | elements. A collection type value is equal to another if and only if both | |
297 | have the same number of elements and their corresponding elements are equal. | |
298 | ||
299 | Two collection types are identical if they are of the same kind and have | |
300 | the same element type. | |
301 | ||
302 | ### Null values | |
303 | ||
107c1cdb | 304 | Each type has a null value. The null value of a type represents the absence |
15c0b25d AP |
305 | of a value, but with type information retained to allow for type checking. |
306 | ||
107c1cdb | 307 | Null values are used primarily to represent the conditional absence of a |
15c0b25d AP |
308 | body attribute. In a syntax with a conditional operator, one of the result |
309 | values of that conditional may be null to indicate that the attribute should be | |
310 | considered not present in that case. | |
311 | ||
312 | Calling applications _should_ consider an attribute with a null value as | |
313 | equivalent to the value not being present at all. | |
314 | ||
315 | A null value of a particular type is equal to itself. | |
316 | ||
317 | ### Unknown Values and the Dynamic Pseudo-type | |
318 | ||
319 | An _unknown value_ is a placeholder for a value that is not yet known. | |
320 | Operations on unknown values themselves return unknown values that have a | |
321 | type appropriate to the operation. For example, adding together two unknown | |
322 | numbers yields an unknown number, while comparing two unknown values of any | |
323 | type for equality yields an unknown bool. | |
324 | ||
325 | Each type has a distinct unknown value. For example, an unknown _number_ is | |
326 | a distinct value from an unknown _string_. | |
327 | ||
328 | _The dynamic pseudo-type_ is a placeholder for a type that is not yet known. | |
329 | The only values of this type are its null value and its unknown value. It is | |
330 | referred to as a _pseudo-type_ because it should not be considered a type in | |
331 | its own right, but rather as a placeholder for a type yet to be established. | |
332 | The unknown value of the dynamic pseudo-type is referred to as _the dynamic | |
333 | value_. | |
334 | ||
335 | Operations on values of the dynamic pseudo-type behave as if it is a value | |
336 | of the expected type, optimistically assuming that once the value and type | |
337 | are known they will be valid for the operation. For example, adding together | |
338 | a number and the dynamic value produces an unknown number. | |
339 | ||
340 | Unknown values and the dynamic pseudo-type can be used as a mechanism for | |
341 | partial type checking and semantic checking: by evaluating an expression with | |
342 | all variables set to an unknown value, the expression can be evaluated to | |
343 | produce an unknown value of a given type, or produce an error if any operation | |
344 | is provably invalid with only type information. | |
345 | ||
346 | Unknown values and the dynamic pseudo-type must never be returned from | |
347 | operations unless at least one operand is unknown or dynamic. Calling | |
348 | applications are guaranteed that unless the global scope includes unknown | |
349 | values, or the function table includes functions that return unknown values, | |
350 | no expression will evaluate to an unknown value. The calling application is | |
351 | thus in total control over the use and meaning of unknown values. | |
352 | ||
353 | The dynamic pseudo-type is identical only to itself. | |
354 | ||
355 | ### Capsule Types | |
356 | ||
357 | A _capsule type_ is a custom type defined by the calling application. A value | |
358 | of a capsule type is considered opaque to HCL, but may be accepted | |
359 | by functions provided by the calling application. | |
360 | ||
361 | A particular capsule type is identical only to itself. The equality of two | |
362 | values of the same capsule type is defined by the calling application. No | |
363 | other operations are supported for values of capsule types. | |
364 | ||
365 | Support for capsule types in an HCL implementation is optional. Capsule types | |
366 | are intended to allow calling applications to pass through values that are | |
367 | not part of the standard type system. For example, an application that | |
368 | deals with raw binary data may define a capsule type representing a byte | |
369 | array, and provide functions that produce or operate on byte arrays. | |
370 | ||
371 | ### Type Specifications | |
372 | ||
373 | In certain situations it is necessary to define expectations about the | |
373 | type of a value. Whereas two _types_ have a commutative _identity_ relationship, | |
374 | type of a value. Whereas two _types_ have a commutative _identity_ relationship, | |
375 | a type has a non-commutative _matches_ relationship with a _type specification_. | |
376 | A type specification is, in practice, just a different interpretation of a | |
377 | type such that: | |
378 | ||
107c1cdb | 379 | - Any type _matches_ any type that it is identical to. |
15c0b25d | 380 | |
107c1cdb | 381 | - Any type _matches_ the dynamic pseudo-type. |
15c0b25d AP |
382 | |
383 | For example, given a type specification "list of dynamic pseudo-type", the | |
384 | concrete types "list of string" and "list of map" match, but the | |
385 | type "set of string" does not. | |
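The _matches_ relationship can be sketched as a small recursive check in Python. The encoding is an assumption made for illustration: primitive types are strings, the dynamic pseudo-type is the string `"dynamic"`, and a collection type is a `(kind, element_type)` pair. Structural and capsule types are left out. Since a map type needs an element type, "list of map" from the example is written here as a list of map of number.

```python
DYNAMIC = "dynamic"  # stand-in for the dynamic pseudo-type


def matches(ty, spec) -> bool:
    """Report whether type `ty` matches type specification `spec`."""
    if spec == DYNAMIC:
        # Any type matches the dynamic pseudo-type.
        return True
    if isinstance(ty, tuple) and isinstance(spec, tuple):
        # Collection types: same kind, and the element type must match.
        return ty[0] == spec[0] and matches(ty[1], spec[1])
    # Otherwise the type must be identical to the specification.
    return ty == spec


spec = ("list", DYNAMIC)
print(matches(("list", "string"), spec))
print(matches(("list", ("map", "number")), spec))
print(matches(("set", "string"), spec))
```

The relationship is not commutative: `matches(("list", "string"), ("list", DYNAMIC))` holds, while `matches(("list", DYNAMIC), ("list", "string"))` does not.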
386 | ||
387 | ## Functions and Function Calls | |
388 | ||
389 | The evaluation context used to evaluate an expression includes a function | |
390 | table, which represents an application-defined set of named functions | |
391 | available for use in expressions. | |
392 | ||
393 | Each syntax defines whether function calls are supported and how they are | |
394 | physically represented in source code, but the semantics of function calls are | |
395 | defined here to ensure consistent results across syntaxes and to allow | |
396 | applications to provide functions that are interoperable with all syntaxes. | |
397 | ||
398 | A _function_ is defined from the following elements: | |
399 | ||
107c1cdb | 400 | - Zero or more _positional parameters_, each with a name used for documentation, |
15c0b25d AP |
401 | a type specification for expected argument values, and a flag for whether |
402 | each of null values, unknown values, and values of the dynamic pseudo-type | |
403 | are accepted. | |
404 | ||
107c1cdb | 405 | - Zero or one _variadic parameters_, with the same structure as the _positional_ |
15c0b25d AP |
406 | parameters, which if present collects any additional arguments provided at |
407 | the function call site. | |
408 | ||
107c1cdb | 409 | - A _result type definition_, which specifies the value type returned for each |
15c0b25d AP |
410 | valid sequence of argument values. |
411 | ||
107c1cdb | 412 | - A _result value definition_, which specifies the value returned for each |
15c0b25d AP |
413 | valid sequence of argument values. |
414 | ||
415 | A _function call_, regardless of source syntax, consists of a sequence of | |
416 | argument values. The argument values are each mapped to a corresponding | |
417 | parameter as follows: | |
418 | ||
107c1cdb | 419 | - For each of the function's positional parameters in sequence, take the next |
15c0b25d AP |
420 | argument. If there are no more arguments, the call is erroneous. |
421 | ||
107c1cdb | 422 | - If the function has a variadic parameter, take all remaining arguments that |
15c0b25d AP |
423 | were not yet assigned to a positional parameter and collect them into | |
424 | a sequence of variadic arguments that each correspond to the variadic | |
425 | parameter. | |
426 | ||
107c1cdb | 427 | - If the function has _no_ variadic parameter, it is an error if any arguments |
15c0b25d AP |
428 | remain after taking one argument for each positional parameter. |
429 | ||
430 | After mapping each argument to a parameter, semantic checking proceeds | |
431 | for each argument: | |
432 | ||
107c1cdb | 433 | - If the argument value corresponding to a parameter does not match the |
15c0b25d AP |
434 | parameter's type specification, the call is erroneous. |
435 | ||
107c1cdb | 436 | - If the argument value corresponding to a parameter is null and the parameter |
15c0b25d AP |
437 | is not specified as accepting nulls, the call is erroneous. |
438 | ||
107c1cdb | 439 | - If the argument value corresponding to a parameter is the dynamic value |
15c0b25d AP |
440 | and the parameter is not specified as accepting values of the dynamic |
441 | pseudo-type, the call is valid but its _result type_ is forced to be the | |
442 | dynamic pseudo-type. | |
443 | ||
107c1cdb | 444 | - If none of the above conditions holds for any argument, the call is |
15c0b25d AP |
445 | valid and the function's result type definition is used to determine the | |
446 | call's _result type_. A function _may_ vary its result type depending on | |
447 | the argument _values_ as well as the argument _types_; for example, a | |
448 | function that decodes a JSON value will return a different result type | |
449 | depending on the data structure described by the given JSON source code. | |
450 | ||
451 | If semantic checking succeeds without error, the call is _executed_: | |
452 | ||
107c1cdb | 453 | - For each argument, if its value is unknown and its corresponding parameter |
15c0b25d AP |
454 | is not specified as accepting unknowns, the _result value_ is forced to be an |
455 | unknown value of the result type. | |
456 | ||
107c1cdb | 457 | - If the previous condition does not apply, the function's result value |
15c0b25d AP |
458 | definition is used to determine the call's _result value_. |
459 | ||
460 | The result of a function call expression is either an error, if one of the | |
107c1cdb | 461 | erroneous conditions above applies, or the _result value_. |
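The argument mapping and execution rules above can be sketched in Python as follows. This is a simplified illustration, not a complete implementation: `Param`, `map_arguments`, `call` and the `UNKNOWN` sentinel are hypothetical names, type specification checks and the dynamic value are omitted, Python's `None` stands in for a null value, and the unknown result carries no type.

```python
from dataclasses import dataclass

UNKNOWN = object()  # hypothetical stand-in for an (untyped) unknown value


@dataclass
class Param:
    name: str
    allow_null: bool = False
    allow_unknown: bool = False


def map_arguments(positional, variadic, args):
    """Pair each argument with the parameter it corresponds to."""
    if len(args) < len(positional):
        raise TypeError("too few arguments")
    pairs = list(zip(positional, args))
    rest = args[len(positional):]
    if rest and variadic is None:
        raise TypeError("too many arguments")
    # Every remaining argument corresponds to the variadic parameter.
    pairs.extend((variadic, arg) for arg in rest)
    return pairs


def call(positional, variadic, impl, args):
    """Check and execute a call; `impl` is the result value definition."""
    pairs = map_arguments(positional, variadic, args)
    for param, arg in pairs:
        if arg is None and not param.allow_null:
            raise TypeError(f"parameter {param.name!r} does not accept null")
    for param, arg in pairs:
        if arg is UNKNOWN and not param.allow_unknown:
            # The result is forced to be unknown; impl is never invoked.
            return UNKNOWN
    return impl(*args)
```

For example, `call([Param("a"), Param("b")], None, lambda a, b: a + b, [1, UNKNOWN])` returns `UNKNOWN` without running the lambda.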
15c0b25d AP |
462 | |
463 | ## Type Conversions and Unification | |
464 | ||
465 | Values given in configuration may not always match the expectations of the | |
466 | operations applied to them or to the calling application. In such situations, | |
467 | automatic type conversion is attempted as a convenience to the user. | |
468 | ||
469 | Along with conversions to a _specified_ type, it is sometimes necessary to | |
470 | ensure that a selection of values are all of the _same_ type, without any | |
471 | constraint on which type that is. This is the process of _type unification_, | |
472 | which attempts to find the most general type that all of the given types can | |
473 | be converted to. | |
474 | ||
475 | Both type conversions and unification are defined in the syntax-agnostic | |
476 | model to ensure consistency of behavior between syntaxes. | |
477 | ||
478 | Type conversions are broadly characterized into two categories: _safe_ and | |
479 | _unsafe_. A conversion is "safe" if any distinct value of the source type | |
480 | has a corresponding distinct value in the target type. A conversion is | |
481 | "unsafe" if either the target type values are _not_ distinct (information | |
482 | may be lost in conversion) or if some values of the source type do not have | |
483 | any corresponding value in the target type. An unsafe conversion may result | |
484 | in an error. | |
485 | ||
486 | A given type can always be converted to itself, which is a no-op. | |
487 | ||
488 | ### Conversion of Null Values | |
489 | ||
490 | All null values are safely convertible to a null value of any other type, | |
491 | regardless of other type-specific rules specified in the sections below. | |
492 | ||
493 | ### Conversion to and from the Dynamic Pseudo-type | |
494 | ||
495 | Conversion _from_ the dynamic pseudo-type _to_ any other type always succeeds, | |
496 | producing an unknown value of the target type. | |
497 | ||
498 | Conversion of any value _to_ the dynamic pseudo-type is a no-op. The result | |
499 | is the input value, verbatim. This is the only situation where the conversion | |
500 | result value is not of the given target type. | |
501 | ||
502 | ### Primitive Type Conversions | |
503 | ||
504 | Bidirectional conversions are available between the string and number types, | |
505 | and between the string and bool types. | |
506 | ||
507 | The bool value true corresponds to the string containing the characters "true", | |
107c1cdb | 508 | while the bool value false corresponds to the string containing the characters |
15c0b25d AP |
509 | "false". Conversion from bool to string is safe, while the converse is |
510 | unsafe. The strings "1" and "0" are alternative string representations | |
511 | of true and false respectively. It is an error to convert a string other than | |
512 | the four in this paragraph to type bool. | |
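The bool and string conversions can be sketched in Python as a pair of helpers. The function names are hypothetical, and null values are not handled.

```python
_STRING_TO_BOOL = {"true": True, "false": False, "1": True, "0": False}


def bool_to_string(value: bool) -> str:
    """Safe conversion: every bool has a distinct string form."""
    return "true" if value else "false"


def string_to_bool(text: str) -> bool:
    """Unsafe conversion: only four strings are accepted."""
    try:
        return _STRING_TO_BOOL[text]
    except KeyError:
        raise ValueError(f"cannot convert {text!r} to bool") from None
```

Because `"1"` and `"true"` both map to true, converting a string to bool and back does not always reproduce the original string, which is why this direction is unsafe.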
513 | ||
514 | A number value is converted to string by translating its integer portion | |
515 | into a sequence of decimal digits (`0` through `9`), and then if it has a | |
516 | non-zero fractional part, a period `.` followed by a sequence of decimal | |
517 | digits representing its fractional part. No exponent portion is included. | |
518 | The number is converted at its full precision. Conversion from number to | |
519 | string is safe. | |
520 | ||
521 | A string is converted to a number value by reversing the above mapping. | |
522 | No exponent portion is allowed. Conversion from string to number is unsafe. | |
523 | It is an error to convert a string that does not comply with the expected | |
524 | syntax to type number. | |
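The number and string conversions can be sketched in Python with the standard `decimal` module standing in for an arbitrary-precision number. The function names are hypothetical, infinities are not handled, and accepting a leading minus sign is an assumption, since the text above does not describe how negative numbers are written.

```python
import re
from decimal import Decimal

# Assumed syntax: optional sign, integer digits, optional fractional digits.
_NUMBER_RE = re.compile(r"-?[0-9]+(\.[0-9]+)?")


def number_to_string(value: Decimal) -> str:
    """Render a finite number in plain decimal notation, no exponent."""
    text = format(value, "f")
    if "." in text:
        # Drop trailing zeros so a zero fractional part is omitted.
        text = text.rstrip("0").rstrip(".")
    return text


def string_to_number(text: str) -> Decimal:
    """Parse a plain decimal string; exponents are rejected."""
    if _NUMBER_RE.fullmatch(text) is None:
        raise ValueError(f"cannot convert {text!r} to number")
    return Decimal(text)
```

The `Decimal` constructor keeps every digit of its input, so `string_to_number` does not round, whatever the current decimal context precision is.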
525 | ||
526 | No direct conversion is available between the bool and number types. | |
527 | ||
528 | ### Collection and Structural Type Conversions | |
529 | ||
530 | Conversion from set types to list types is _safe_, as long as their | |
531 | element types are safely convertible. If the element types are _unsafely_ | |
532 | convertible, then the collection conversion is also unsafe. Each set element | |
533 | becomes a corresponding list element, in an undefined order. Although no | |
534 | particular ordering is required, implementations _should_ produce list | |
535 | elements in a consistent order for a given input set, as a convenience | |
536 | to calling applications. | |
537 | ||
538 | Conversion from list types to set types is _unsafe_, as long as their element | |
539 | types are convertible. Each distinct list item becomes a distinct set item. | |
540 | If two list items are equal, one of the two is lost in the conversion. | |
541 | ||
542 | Conversion from tuple types to list types is permitted if all of the | |
543 | tuple element types are convertible to the target list element type. | |
544 | The safety of the conversion depends on the safety of each of the element | |
545 | conversions. Each element in turn is converted to the list element type, | |
546 | producing a list of identical length. | |
547 | ||
548 | Conversion from tuple types to set types is permitted, behaving as if the | |
549 | tuple type was first converted to a list of the same element type and then | |
550 | that list converted to the target set type. | |
551 | ||
552 | Conversion from object types to map types is permitted if all of the object | |
553 | attribute types are convertible to the target map element type. The safety | |
554 | of the conversion depends on the safety of each of the attribute conversions. | |
555 | Each attribute in turn is converted to the map element type, and map element | |
556 | keys are set to the name of each corresponding object attribute. | |
557 | ||
558 | Conversion from list and set types to tuple types is permitted, following | |
559 | the reverse of the steps for the converse conversions. Such conversions are _unsafe_. | |
560 | It is an error to convert a list or set to a tuple type whose number of | |
561 | elements does not match the list or set length. | |
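
A sketch of the length check for list-to-tuple conversion, assuming the target tuple type is described by one converter callable per element position (a hypothetical representation chosen for brevity):

```python
from typing import Callable, Sequence


def list_to_tuple(values: Sequence, elem_converters: Sequence[Callable]) -> tuple:
    # The target tuple type fixes the number of elements, so a length
    # mismatch is an error rather than a truncation or padding.
    if len(values) != len(elem_converters):
        raise ValueError(
            f"tuple requires {len(elem_converters)} elements, got {len(values)}"
        )
    return tuple(conv(v) for conv, v in zip(elem_converters, values))
```

The conversion is unsafe because whether it succeeds depends on the length of the particular list or set value, not only on its type.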
562 | ||
563 | Conversion from map types to object types is permitted if each map key | |
564 | corresponds to an attribute in the target object type. It is an error to | |
565 | convert from a map value whose set of keys does not exactly match the target | |
566 | type's attributes. The conversion otherwise reverses the steps of the converse | |
567 | conversion. | |
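
The exact-match requirement on keys can be sketched as follows, with the target object type modeled as a hypothetical mapping from attribute name to a converter callable:

```python
from typing import Callable, Dict


def map_to_object(value: dict, target_attrs: Dict[str, Callable]) -> dict:
    # The map's keys must match the target attributes exactly: neither
    # missing nor extra keys are tolerated.
    if set(value) != set(target_attrs):
        raise ValueError("map keys do not match the target object attributes")
    return {name: target_attrs[name](v) for name, v in value.items()}
```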
568 | ||
569 | Conversion from one object type to another is permitted as long as the | |
570 | common attribute names have convertible types. Any attribute present in the | |
571 | target type but not in the source type is populated with a null value of | |
572 | the appropriate type. | |
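
A value-level sketch of object-to-object conversion, using `None` to stand in for a null value. Dropping attributes that exist only in the source is an assumption of this sketch; the text above specifies only the handling of common attributes and of attributes missing from the source.

```python
from typing import Callable, Dict


def convert_object(value: dict, target_attrs: Dict[str, Callable]) -> dict:
    result = {}
    for name, conv in target_attrs.items():
        if name in value:
            # Common attribute: convert to the target attribute type.
            result[name] = conv(value[name])
        else:
            # Present only in the target type: populate with null.
            result[name] = None
    return result
```

For example, converting `{"a": 1, "c": True}` to a target with string attributes `a` and `b` yields `{"a": "1", "b": None}`.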
573 | ||
574 | Conversion from one tuple type to another is permitted as long as the | |
575 | tuples have the same length and the elements have convertible types. | |
576 | ||
577 | ### Type Unification | |
578 | ||
579 | Type unification is an operation that takes a list of types and attempts | |
580 | to find a single type to which they can all be converted. Since some | |
581 | type pairs have bidirectional conversions, preference is given to _safe_ | |
582 | conversions. In technical terms, all possible types are arranged into | |
583 | a lattice, from which a most general supertype is selected where possible. | |
584 | ||
585 | The type resulting from type unification may be one of the input types, or | |
586 | it may be an entirely new type produced by combination of two or more | |
587 | input types. | |
588 | ||
589 | The following rules do not guarantee a valid result. In addition to these | |
590 | rules, unification fails if any of the given types are not convertible | |
591 | (per the above rules) to the selected result type. | |
592 | ||
593 | The following unification rules apply transitively. That is, if a rule is | |
594 | defined from A to B, and one from B to C, then A can unify to C. | |
595 | ||
596 | Number and bool types both unify with string by preferring string. | |
597 | ||
598 | Two collection types of the same kind unify according to the unification | |
599 | of their element types. | |
600 | ||
601 | List and set types unify by preferring the list type. | |
602 | ||
603 | Map and object types unify by preferring the object type. | |
604 | ||
605 | List, set and tuple types unify by preferring the tuple type. | |
606 | ||
607 | The dynamic pseudo-type unifies with any other type by selecting that other | |
608 | type. The dynamic pseudo-type is the result type only if _all_ input types | |
609 | are the dynamic pseudo-type. | |
610 | ||
611 | Two object types unify by constructing a new type whose attributes are | |
612 | the union of those of the two input types. Any common attributes themselves | |
613 | have their types unified. | |
614 | ||
615 | Two tuple types of the same length unify by constructing a new type of the | |
616 | same length whose elements are the unification of the corresponding elements | |
617 | in the two input types. | |
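
The following is a rough sketch of a subset of the unification rules above, not a complete implementation. It assumes a hypothetical encoding in which primitive types and the dynamic pseudo-type are strings, collection types are `(kind, element_type)` pairs, tuple types are `("tuple", (t1, t2, ...))` and object types are `("object", {name: type})`. Mixed map/object and list/set/tuple unification is left out, and treating number with bool as unifying to string is this sketch's reading of the transitivity rule.

```python
def unify2(a, b):
    """Unify two types; return None when no rule in this sketch applies."""
    if a == "dynamic":
        return b
    if b == "dynamic":
        return a
    if a == b:
        return a
    if isinstance(a, str) and isinstance(b, str):
        return "string"  # distinct primitives both convert to string
    if isinstance(a, str) or isinstance(b, str):
        return None  # primitive versus collection or structural type
    (kind_a, arg_a), (kind_b, arg_b) = a, b
    if kind_a == kind_b and kind_a in ("list", "set", "map"):
        elem = unify2(arg_a, arg_b)
        return None if elem is None else (kind_a, elem)
    if {kind_a, kind_b} == {"list", "set"}:
        elem = unify2(arg_a, arg_b)
        return None if elem is None else ("list", elem)  # prefer list
    if kind_a == kind_b == "tuple" and len(arg_a) == len(arg_b):
        elems = tuple(unify2(x, y) for x, y in zip(arg_a, arg_b))
        return None if None in elems else ("tuple", elems)
    if kind_a == kind_b == "object":
        attrs = dict(arg_a)  # start from the first type's attributes
        for name, t in arg_b.items():
            attrs[name] = unify2(attrs[name], t) if name in attrs else t
            if attrs[name] is None:
                return None
        return ("object", attrs)
    return None


def unify(types):
    """Fold unify2 over a list of types, starting from the dynamic pseudo-type."""
    result = "dynamic"
    for t in types:
        result = unify2(result, t)
        if result is None:
            return None
    return result
```

Starting the fold from `"dynamic"` means the dynamic pseudo-type is the result only when every input is dynamic, matching the rule above. A full implementation would additionally verify that each input type is convertible to the selected result.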
618 | ||
619 | ## Static Analysis | |
620 | ||
621 | In most applications, full expression evaluation is sufficient for understanding | |
622 | the provided configuration. However, some specialized applications require more | |
623 | direct access to the physical structures in the expressions, which can for | |
624 | example allow the construction of new language constructs in terms of the | |
625 | existing syntax elements. | |
626 | ||
627 | Since static analysis examines the physical structure of configuration, the | |
628 | details will vary depending on syntax. Each syntax must decide which of its | |
629 | physical structures corresponds to the following analyses, producing error | |
630 | diagnostics if they are applied to inappropriate expressions. | |
631 | ||
632 | The following are the required static analysis functions: | |
633 | ||
107c1cdb | 634 | - **Static List**: Require list/tuple construction syntax to be used and |
15c0b25d AP |
635 | return a list of expressions for each of the elements given. |
636 | ||
107c1cdb | 637 | - **Static Map**: Require map/object construction syntax to be used and |
15c0b25d AP |
638 | return a list of key/value pairs -- both expressions -- for each of |
639 | the elements given. The usual constraint that a map key must be a string | |
640 | must not apply to this analysis, thus allowing applications to interpret | |
641 | arbitrary keys as they see fit. | |
642 | ||
107c1cdb | 643 | - **Static Call**: Require function call syntax to be used and return an |
15c0b25d AP |
644 | object describing the called function name and a list of expressions |
645 | representing each of the call arguments. | |
646 | ||
107c1cdb | 647 | - **Static Traversal**: Require a reference to a symbol in the variable |
15c0b25d AP |
648 | scope and return a description of the path from the root scope to the |
649 | accessed attribute or index. | |
650 | ||
651 | The intent of a calling application using these features is to require a more | |
652 | rigid interpretation of the configuration than in expression evaluation. | |
653 | Syntax implementations should make use of the extra contextual information | |
654 | provided in order to make an intuitive mapping onto the constructs of the | |
655 | underlying syntax, possibly interpreting the expression slightly differently | |
656 | than it would be interpreted in normal evaluation. | |
657 | ||
658 | Each syntax must define which of its expression elements each of the analyses | |
659 | above applies to, and how those analyses behave given those expression elements. | |
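
To illustrate the idea of requiring a particular physical construct and returning unevaluated expressions, the sketch below uses Python's own expression syntax and its standard `ast` module as a stand-in for a concrete configuration syntax. The function names are hypothetical, and a real syntax would report diagnostics rather than raise exceptions.

```python
import ast


def static_list(src: str) -> list:
    """Require list/tuple construction syntax; return the element expressions."""
    node = ast.parse(src, mode="eval").body
    if not isinstance(node, (ast.List, ast.Tuple)):
        raise ValueError("list or tuple construction syntax is required")
    return list(node.elts)  # unevaluated expression nodes


def static_call(src: str):
    """Require function call syntax; return the function name and argument expressions."""
    node = ast.parse(src, mode="eval").body
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        raise ValueError("function call syntax is required")
    return node.func.id, list(node.args)
```

Note that nothing is evaluated: `static_call("max(a + 1, b)")` succeeds even though `a` and `b` are undefined, because only the shape of the expression is inspected.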
660 | ||
661 | ## Implementation Considerations | |
662 | ||
663 | Implementations of this specification are free to adopt any strategy that | |
664 | produces behavior consistent with the specification. This non-normative | |
665 | section describes some possible implementation strategies that are consistent | |
666 | with the goals of this specification. | |
667 | ||
668 | ### Language-agnosticism | |
669 | ||
670 | The language-agnosticism of this specification assumes that certain behaviors | |
671 | are implemented separately for each syntax: | |
672 | ||
107c1cdb ND |
673 | - Matching of a body schema with the physical elements of a body in the |
674 | source language, to determine correspondence between physical constructs | |
15c0b25d AP |
675 | and schema elements. |
676 | ||
107c1cdb | 677 | - Implementing the _dynamic attributes_ body processing mode by either |
15c0b25d AP |
678 | interpreting all physical constructs as attributes or producing an error |
679 | if non-attribute constructs are present. | |
680 | ||
107c1cdb | 681 | - Providing an evaluation function for all possible expressions that produces |
15c0b25d AP |
682 | a value given an evaluation context. |
683 | ||
107c1cdb | 684 | - Providing the static analysis functionality described above in a manner that |
15c0b25d AP |
685 | makes sense within the convention of the syntax. |
686 | ||
687 | The suggested implementation strategy is to use an implementation language's | |
688 | closest concept to an _abstract type_, _virtual type_ or _interface type_ | |
689 | to represent both Body and Expression. Each language-specific implementation | |
690 | can then provide an implementation of each of these types wrapping AST nodes | |
691 | or other physical constructs from the language parser. |
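
A minimal sketch of this strategy using abstract base classes. The method names `value` and `just_attributes`, and the dict-backed classes standing in for wrapped AST nodes, are illustrative assumptions rather than names defined by this specification.

```python
from abc import ABC, abstractmethod


class Expression(ABC):
    @abstractmethod
    def value(self, ctx: dict):
        """Evaluate the expression in the given evaluation context."""


class Body(ABC):
    @abstractmethod
    def just_attributes(self) -> dict:
        """Interpret all physical constructs as attributes (name -> Expression)."""


class LiteralExpr(Expression):
    def __init__(self, v):
        self.v = v

    def value(self, ctx: dict):
        return self.v


class VariableExpr(Expression):
    def __init__(self, name: str):
        self.name = name

    def value(self, ctx: dict):
        return ctx[self.name]  # look the variable up in the context


class DictBody(Body):
    """A body backed by a plain dict, standing in for parser output."""

    def __init__(self, attrs: dict):
        self.attrs = attrs

    def just_attributes(self) -> dict:
        return dict(self.attrs)
```

Calling code depends only on `Body` and `Expression`, so a second syntax can be supported by adding another pair of implementations without changing the callers.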