Expression Language (EL)
Issuer: openEHR Specification Program | |
---|---|
Release: LANG latest |
Status: DEVELOPMENT |
Revision: [latest_issue] |
Date: [latest_issue_date] |
Keywords: openehr, expressions, rules |
© 2020 - 2022 The openEHR Foundation | |
---|---|
The openEHR Foundation is an independent, non-profit foundation, facilitating the sharing of health records by consumers and clinicians via open specifications, clinical models and open platform implementations. |
|
Licence |
Creative Commons Attribution-NoDerivs 3.0 Unported. https://creativecommons.org/licenses/by-nd/3.0/ |
Support |
Issues: Problem Reports |
Amendment Record
Issue | Details | Raiser, Implementer | Completed |
---|---|---|---|
LANG Release 1.0.0 |
|||
SPECLANG-1. Add Expression Language specification. |
T Beale |
||
Add Container selectors. |
T Beale |
10 May 2020 |
|
Update to be based on BMM |
T Beale |
03 Mar 2020 |
|
Initial writing. Added external model use; |
T Beale |
19 Sep 2018 |
Acknowledgements
Contributors
This specification benefited from formal and informal input from the openEHR and wider health informatics community. The openEHR Foundation would like to recognise the following people for their contributions.
-
Pieter Bos, Senior Engineer, Nedap, Netherlands
-
Borut Fabjan, Program Manager, Better, Slovenia
-
Matija Kejžar, Senior Engineer, Better, Slovenia
-
Bostjan Lah, Architect, Better, Slovenia
1. Preface
1.1. Purpose
This document specifies an abstract openEHR Expression Language (openEHR EL) that provides a syntax counterpart to the expression package in the openEHR Basic Meta-Model (BMM). This may be used within BMM models, to specify archetype rules, Task Planning expressions, in newer versions of GDL, and in decision language expressions.
The intended audience includes:
-
Standards bodies producing health informatics standards;
-
Academic groups using openEHR;
-
Solution vendors.
1.2. Related Documents
Prerequisite documents for reading this document include:
Related documents include:
-
The Archetype Object Model 2 (AOM2), Assertions section;
1.3. Status
This specification is in the DEVELOPMENT state. The development version of this document can be found at https://specifications.openehr.org/releases/LANG/latest/EL.html.
The expression language described in this specification is a more powerful language than the original Basic Expression Language (BEL), and is based on the openEHR BMM expression model. It is a major evolution on BEL syntax, and does not use the BEL meta-model.
Known omissions or questions are indicated in the text with a 'to be determined' paragraph, as follows:
TBD: (example To Be Determined paragraph)
1.4. Feedback
Feedback may be provided on the openEHR languages specifications forum.
Issues may be raised on the specifications Problem Report tracker.
To see changes made due to previously reported issues, see the LANG component Change Request tracker.
1.5. Conformance
Conformance of a data or software artifact to an openEHR specification is determined by a formal test of that artifact against the relevant openEHR Implementation Technology Specification(s) (ITSs), such as an IDL interface or an XML-schema. Since ITSs are formal derivations from underlying models, ITS conformance indicates model conformance.
2. Overview
The openEHR Expression Language (EL) defines a syntax and grammar for the expressions whose meta-model is defined in the expression package in the openEHR Basic Meta-Model (BMM). As such it may be considered the default syntax. Other syntaxes are certainly possible, as are other forms of expression serialisation, such as object graph serialisation into XML, JSON, YAML etc. Consequently, the BMM expression package should be considered the normative definition of openEHR EL. Not all openEHR implementations using BMM expressions need support the EL syntax: they might for example only serialise in JSON or use purely graphical visualisation.
Within openEHR, the uses of EL include expressing the following:
-
pre-, post-conditions and class invariants in BMM model definition files;
-
rules in archetypes;
-
rules in openEHR Guideline Definition Language (GDL);
-
expressions within decision logic models (DLMs) designed for use with openEHR Task Planning.
It may also be used in any other suitable context.
2.1. Requirements
The semantic requirements include the usual arithmetic, boolean, and relational operators, functions, logical quantifiers, operator precedence, constant values, and variables. In addition, there is a need to support multi-lingual translations for symbolic variables, in a similar way to the openEHR Archetype Definition Language (ADL2).
2.2. Design Background
The openEHR Expression Language is based on a combination of first-order predicate logic, object-oriented structural concepts and functional computing. It has some similarities with OMG’s OCL (Object Constraint Language). It also draws on the semantics and some of the syntax (particularly agent-related) of the Eiffel Language (ECMA-367). See Sowa (2000), Hein (2002), Kilov & Ross (1994) for an explanation of predicate logic in information modelling.
It is not exactly the same as any of these languages because:
-
it has a different meta-model, namely the BMM expression meta-model;
-
the syntax is designed to be comprehensible to developers familiar with modern mainstream object-oriented and functional languages such as Java, C#, Python, TypeScript etc.
Following the BMM meta-model, EL treats all classical operators as surface syntax for underlying functions available on types. Thus, the '+' operator in the expression total + 1 is resolved to a function call on the type Integer: total.add(1). In a similar way, higher-order operators ranging over collections of items (e.g. for_all) are resolved to calls to functions assumed to be defined on container types (e.g. my_list.for_all(agent (v:T))).
Key features of EL include:
-
strong typing;
-
void-safety;
-
standard operators including:
-
logical operators including universal and existential quantification;
-
arithmetic and relational comparison operators, including for date/time types;
-
parentheses for overriding operator precedence;
-
-
object-oriented qualification (dot notation);
-
decision structures, including:
-
binary choice operator (the so-called 'ternary operator' ?: in C);
-
condition chains (if/then/else equivalent);
-
case tables;
-
-
functions and agents (lambdas).
2.3. Execution Model
The assumed execution model of the Expression language is that EL statements are evaluated by an evaluator against a data context, which determines the truth values of the expression(s). The data context is the origin for some or all of the variables mentioned in the expressions, which may be read from and written to. It may concretely be a retrieved data structure, or data via an API call to the EHR, demographics, laboratory system etc.
The data context may be specified in two ways. It may be inferrable from the artefact or computing context in which the EL statements appear, or it may be specified explicitly. In the former (implicit) case, the EL instance is minimally a value-returning logical proposition such as systolic_pressure > 0, where the declaration of variables or properties such as systolic_pressure is inferred from e.g. a data binding, and any manifest values obey this EL specification. The implicit case is shown below.
In the explicit form, EL expressions appear within a BMM model definition, or within a context that explicitly imports a BMM model.
In both cases, parsing into computable form for evaluation must result in instances of the BMM EL meta-types.
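The implicit form can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the DataContext class and its bind and resolve methods are hypothetical names that do not come from this specification, and the expression is modelled as a plain Python callable rather than as instances of the BMM EL meta-types.

```python
from typing import Any, Callable, Dict


class DataContext:
    """Hypothetical data context: the origin of variables named in an expression."""

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}

    def bind(self, name: str, value: Any) -> None:
        # In a real system the value might come from a retrieved data
        # structure or an API call; here it is supplied directly.
        self._values[name] = value

    def resolve(self, name: str) -> Any:
        if name not in self._values:
            raise KeyError(f"no data binding for variable '{name}'")
        return self._values[name]


def evaluate(expression: Callable[[DataContext], Any], context: DataContext) -> Any:
    """Evaluate an expression against a data context."""
    return expression(context)


def systolic_is_positive(ctx: DataContext) -> bool:
    # Stand-in for the EL proposition: systolic_pressure > 0
    return ctx.resolve("systolic_pressure") > 0


context = DataContext()
context.bind("systolic_pressure", 128)
result = evaluate(systolic_is_positive, context)
```

The point of the sketch is only that the proposition itself declares nothing: the variable is resolved through whatever binding the surrounding context supplies.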
3. EL Basics
3.1. Syntax style
The syntax style used in EL is inspired by elements of common languages available today, including TypeScript, Kotlin, Java, etc, with divergences to provide a syntax that is more easily readable to non-IT professionals as well as IT professionals.
The lexical style used in EL is a form of so-called 'snake_case' rather than so-called 'CamelCase', in common with other openEHR specifications, but either may be used in real applications. One reason for using snake-case is to render EL Modules more readable to the non-IT professional. Upper- and lower-case are not formally distinguished, and the use of upper case is a matter of style only, as follows:
-
class names: upper-case first letter followed by alphanumerics with underscores where spaces would occur in natural language, e.g. Iso8601_date_time, Arrayed_list<T>;
-
property, routine and variable names: lower-case first letter, followed by alphanumerics with underscores, e.g. employee_group, average_pressure();
-
constants and class (static) functions: upper-case first letter, followed by alphanumerics with underscores, e.g. Maximum_speed.
TBD: specify equivalence between snake-case and CamelCase, or a tool-level switch?
3.2. Commenting
Comments are of two styles. For end-of-line commenting, and for creating visual dividing lines, the leader pattern '--' is used. Dividing lines are a longer line (more than three characters), e.g. '---------', or a line of (four or more) '=' symbols, i.e. '========'. The latter is useful for multi-level decision tables.
Comment-only lines start with the bar character ('|'). The example below shows both forms.
|
| patient fit to undertake regime
|
patient_fit:
    Result := not
        (platelets.in_range ([very_low]) or     -- platelets can't be too low
         neutrophils.in_range ([very_low]))
3.3. Typing
EL is fully typed, with type definitions being supplied by one or more models, represented in the form of openEHR BMM specification. All operators are assumed to be implemented by and to map to functions defined on types, including operators such as '+' mapping to the function add() defined on primitive types such as Integer. Accordingly, such operators are defined within the BMM as being operator aliases of their implementing function, which is made possible by the BMM meta-types BMM_ROUTINE and descendants.
An important implementation consequence of this approach is that an expression that is parsed to a classic operator-based AST may be evaluated by progressively searching for operator-aliased functions within the model definition, invoking with the arguments found in the AST structure, and returning Results for the next such computation. Of course, use of built-in native types and functions (rather than always dispatching via a BMM) to handle primitive type operators is likely to be used in the interests of efficiency. Function-matching can be implemented by matching the inferred signature of the operator and its argument(s) with functions having both a conformant (not necessarily identical) signature, and the same operator.
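The evaluation strategy described above can be sketched in Python as follows. This is a minimal illustration, not the normative algorithm: the OPERATOR_ALIASES table is a hypothetical stand-in for the operator aliases that a BMM model definition would supply, the Literal and BinaryOp classes are invented AST node types, and signature matching is reduced to a lookup on the operand type name with Integer promoted to Real for mixed operands.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple, Union

# Hypothetical stand-in for a model definition:
# (type name, operator) -> implementing function.
OPERATOR_ALIASES: Dict[Tuple[str, str], Callable] = {
    ("Integer", "+"): lambda a, b: a + b,   # alias of add()
    ("Integer", "-"): lambda a, b: a - b,   # alias of subtract()
    ("Real", "+"): lambda a, b: a + b,
    ("Real", "-"): lambda a, b: a - b,
}


@dataclass
class Literal:
    value: Union[int, float]


@dataclass
class BinaryOp:
    operator: str
    left: "Node"
    right: "Node"


Node = Union[Literal, BinaryOp]


def type_name(value: Union[int, float]) -> str:
    return "Integer" if isinstance(value, int) else "Real"


def evaluate(node: Node) -> Union[int, float]:
    """Walk the AST, resolving each operator to its aliased function."""
    if isinstance(node, Literal):
        return node.value
    left = evaluate(node.left)
    right = evaluate(node.right)
    # Mixed Integer / Real operands: promote both to Real before matching.
    if type_name(left) != type_name(right):
        left, right = float(left), float(right)
    function = OPERATOR_ALIASES[(type_name(left), node.operator)]
    # The returned value feeds the next computation up the tree.
    return function(left, right)


# AST for the expression (100 - 5) + 1
tree = BinaryOp("+", BinaryOp("-", Literal(100), Literal(5)), Literal(1))
total = evaluate(tree)
```

A production evaluator would more likely use native arithmetic for primitive types, as noted above, and reserve model lookup for operators defined on non-primitive types.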
Such model definitions will therefore include primitive type definitions, either the openEHR Foundation Types, i.e. primitive types, container types and interval types, or ones that correspond very closely. In the interests of completeness, EL assumes the openEHR Foundation Types, so as to have a minimal basis.
3.4. EL Foundation Types
The EL syntax for these is described below.
3.4.1. Primitive Types
The EL primitive types are shown below.
Name | Description |
---|---|
|
Boolean value |
|
Integer value |
|
Large integer value |
|
Real value |
|
Large real value |
|
ISO 8601-format date |
|
ISO 8601-format date/time |
|
ISO 8601-format time |
|
ISO 8601-format duration |
|
String |
|
Uri in IETF RFC 3986 format |
|
Terminology code reference |
Automatic type promotion from Integer to Real applies to mixed integer / real values and expressions, in the same fashion as most programming languages.
3.4.2. Container Types
The same container types as defined in the Foundation Types, structure package are assumed in EL, under the following names.
Name | Description |
---|---|
|
Abstract parent of |
|
Linear list of items of any primitive type, allowing order and repeated membership |
|
Set of items of any primitive type; no order, unique membership |
|
Indexed linear container |
3.4.3. Interval Type
The same Interval type as defined in the Foundation Types, interval package is assumed in EL, under the following names.
Name | Description |
---|---|
|
Interval of any ordered type |
|
Sub-type used to efficiently represent closed intervals whose boundaries are the same |
|
Sub-type used to efficiently represent intervals whose boundaries are different |
Automatic type promotion from Interval<Integer> to Interval<Real> applies to all integer and real values and expressions, in the same fashion as most programming languages.
3.4.4. Complex Types
Complex types are imported from a formal model definition, expressed in openEHR BMM format, or any formal equivalent. The types in a model definition included in this way become available within the formalism in the same way as the foundation types, and may be used in expressions in the same way.
4. Terminal Entities
Terminal entities in EL correspond to the EL_TERMINAL meta-type in BMM, and its descendants. These come in three categories:
-
instance references: references to instances generated by direct references to literals, constants, variables, or else function calls;
-
predicates: logical conditions on instance references;
-
agents: delayed routine call objects.
The following sub-sections describe these types.
4.1. Literals
Literal values are mostly instances of the types declared in the imported models. Assumed literals correspond to the openEHR Foundation and Base types, and are expressed in the openEHR ODIN syntax, with the exception of List<T>, Set<T> and Map<K,V>, which are distinguished in EL with specific types of brackets. The corresponding classes are described in the openEHR Foundation Types specification.
Type | Literal values | Description |
---|---|---|
|
|
|
|
|
Signed integer values from −2^31 to 2^31−1, including E-notation |
|
|
Signed real values from 3.4028235 × 10^38, including percentages and E-notation |
|
|
Double precision real values, including percentages and E-notation |
|
|
ISO 8601-format date |
|
|
ISO 8601-format date/time |
|
|
ISO 8601-format time |
|
|
ISO 8601-format duration |
|
|
|
|
|
Uri in IETF RFC 3986 format |
|
|
Local terminology code |
|
|
|
|
|
|
|
|
|
|
{ prim_val : val, prim_val : val, ... prim_val : val } |
A table of values of any type V, |
Object |
{ identifier : val, identifier : val, ... identifier : val } |
An object of any type T, |
|
||
|
the two-sided interval N ≤ x ≤ M |
|
|
the two-sided interval N < x ≤ M |
|
|
the two-sided interval N ≤ x < M |
|
|
the one-sided interval x < N |
|
|
the one-sided interval x > N |
|
|
the one-sided interval x ≤ N |
|
|
the one-sided interval x ≥ N |
|
|
the two-sided interval of N ±M |
|
|
the two-sided interval of N ±M |
One exception to the above is tuples, which are direct instances of the BMM meta-type BMM_TUPLE. They take the literal form [a, b, c], where a, b, and c are generally of different types.
TBD: consider to not bother with Array and reserve [] for tuples.
4.2. Variables
Symbolic variables are valid within the scope of the routine in which they are declared, and are classified as read-only or writable. Read-only variables include routine parameters and the automatically declared variable Self, and are represented by the meta-type EL_READONLY_VARIABLE.
Writable variables include locally declared variables and the automatically declared variable Result, and are represented by the BMM meta-type EL_WRITABLE_VARIABLE.
4.3. Type References
A type may be directly referenced using the syntax {TypeName}. This has the effect of creating an anonymous variable whose value is a read-only instance of the type. This may be equivalently understood as the 'static view'. A type reference can be the scoper of any feature call that is read-only and is:
-
a static feature, i.e. a constant or class singleton;
-
any function that depends recursively only on constants and static features in addition to any arguments.
This provides a mechanism, common to many programming languages, for access to constants and helper functions without creating instances.
4.4. Feature References
4.4.1. Qualified Referencing
Any feature reference may appear as itself (in the relevant syntactic form described below) or in a form qualified by scoping entities, using standard 'dot' referencing. The qualifier provides the reference context, and is represented by the EL_FEATURE_REF property scoper. Multiple qualifiers may be used in a single reference, as long as class feature visibility is satisfied, allowing the following:
person1.name
employees[1].name.first_name
blood_pressure.history.events[3].data.data.systolic
agent obstetric_risks.basic_risk
4.4.2. Constants
Constants are syntactically represented using labels of which the first letter is capitalised, and may be of any type, including complex types. The following are EL expressions containing constants.
Mph_to_kmh_factor = 1.6
Safe_glucose_limits.has (3.5)
4.4.3. Property References
Property references are valid within the scope of the class in which they are declared, and may be used in any routine definition or assertion in the class. They are represented by plain names such as diabetic_status.
4.4.4. Function Calls
In EL expressions, computational functions may be called in the same way as for typical programming languages. An EL function call corresponds to the BMM meta-type EL_FUNCTION_CALL, which contains an instance of the BMM meta-type EL_FUNCTION_AGENT, which in turn has as its closed_args a tuple containing a set of items, each of which is in turn an expression of any kind.
Consequently, EL function calls (similarly to most programming languages) may be of any level of complexity. The simplest type of function call is to a function whose signature is <[],T>, i.e. one taking no arguments and returning a value of type T. In EL, this may be called with or without parentheses, e.g. age or age().
The following example assumes a function tnm_major_number (tnm_val: String): Integer that extracts various elements of Tumour/Node/Metastasis ('TNM') cancer staging values, such as 'Tis', 'G3' and so on, and shows two forms of call to this function.
tnm_major_number (tnm_t)
tnm_major_number ("Tis")
More complex function calls may include arguments of other function calls, agents, tuples, operator expressions and normal instance references.
To be evaluated, function calls must be mappable to class methods in external libraries that are available at expression evaluation time.
4.4.5. Built-in Functions
Some commonly used functions such as current_date() or similar are often thought of as 'built-in' to a language environment. In the openEHR EL context, there are no built-in functions as such; useful utility functions must be supplied by classes or interfaces included as part of an imported model. In the openEHR environment, many utility calls are available in the openEHR Base Types. They will resolve correctly as long as this model is imported, which it normally will be as part of a larger model, such as the openEHR RM.
As a consequence, the total set of available utility functions for use in an EL expression is just what is available from the sum of all imported models. Assuming the openEHR Foundation and Base Types, the following kinds of functions are available for use in EL expressions:
{Env}.current_date -- obtain today's date as an Iso8601_date
blood_glucose_list: List<Real>
{Statistical_evaluator}.max (blood_glucose_list) -- compute a maximum of Numerics
{Locale}.language -- the primary language in the locale as a Coded_term
4.4.6. Container Item Access
Access to members of instances of a container type may be achieved by normal functional means (typically functions like Array<T>.get() or List<T>.item()), and also via the [] operator, which is an alias for such functions defined on the relevant types, as follows.
Operator | Function | Meaning |
---|---|---|
|
|
i-th element of an array; 1-based |
|
|
i-th element of a list; 1-based |
|
|
element at key k of a Map |
TBD: to achieve this generically, the above map of operators to member functions of appropriate types needs to be supplied in the model supplying the types themselves.
Container element access may be used on any expression whose effective type is a container, including function calls.
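The aliasing of [] to an access function, and the 1-based indexing, can be sketched in Python as below. The ElList and ElMap classes are hypothetical wrappers invented for this illustration. The function name item() follows the List<T>.item() example mentioned above; using the same name for the map accessor is an assumption made here.

```python
from typing import Dict, Generic, List, TypeVar

T = TypeVar("T")
K = TypeVar("K")
V = TypeVar("V")


class ElList(Generic[T]):
    """Hypothetical list wrapper with 1-based element access."""

    def __init__(self, items: List[T]) -> None:
        self._items = list(items)

    def item(self, i: int) -> T:
        # EL indexes from 1; Python lists index from 0.
        if not 1 <= i <= len(self._items):
            raise IndexError(f"index {i} outside 1..{len(self._items)}")
        return self._items[i - 1]

    # The [] operator is simply an alias of item().
    __getitem__ = item


class ElMap(Generic[K, V]):
    """Hypothetical map wrapper; [] is an alias of key lookup."""

    def __init__(self, entries: Dict[K, V]) -> None:
        self._entries = dict(entries)

    def item(self, key: K) -> V:
        return self._entries[key]

    __getitem__ = item


events = ElList(["baseline", "5 min", "10 min"])
first_event = events[1]          # same as events.item(1)
limits = ElMap({"low": 3.5, "high": 7.8})
low_limit = limits["low"]        # same as limits.item("low")
```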
4.4.7. Matching Objects
Matching of objects is possible via use of predicates using the [] syntax used after any variable or feature reference. This is achieved by supplying an agent argument whose signature is <[T], Boolean>, or in functional form, (v:T): Boolean. For non-container objects, the type T is the statically declared type of the object. If the object is of a container type (list, array etc) then the type T is the type of the container items.
The [] syntax is shorthand for the following assumed functions:
Type | Function |
---|---|
|
|
|
|
TBD: For Any, need type anchoring… or else generic functions.
Here, 'matching' is understood to mean obtaining all matching items.
This enables a reference of the following form to be constructed (final line).
class Book {
title: String;
pub_date: Date;
country: Terminology_code;
}
book_list, old_spanish_books: List<Book>
old_spanish_books := book_list [(b:Book) {b.title.contains("Quixote")}]
The part in {} is any Boolean-valued expression, and may therefore be an operator expression, e.g.:
old_spanish_books := book_list [(b:Book) {b.title.contains("Quixote") OR b.pub_date < P1650Y AND b.country = #iso639::es}]
Since the function signature is invariant with respect to the container item type (here, Book), a shorter form can be used in which the b is assumed:
old_spanish_books := book_list [title.contains("Quixote") OR pub_date < P1650Y AND country = #iso639::es]
In the above, the variable old_spanish_books is of type List<Book>, and in general may contain more than one item (as well as be empty). To obtain the first book in the list, the standard array reference syntax may be used, i.e. old_spanish_books[1]. By extension, the following is also legal:
old_spanish_book: Book
old_spanish_book := book_list [title.contains("Quixote") OR pub_date < P1650Y AND country = #iso639::es][1] -- safe if it is known that there is at least one
Operator expressions based on the types of the items in the container may be used. The following predicate uses the short form of the expression b.pub_date >= PY2003.
book_list [pub_date >= PY2003]
Qualified referencing can be combined with selector agents to obtain an effect similar to the use of Xpath on XML data, as follows.
book_list [title.contains("Quixote")][1].pub_date.year
For matching to work, there must be an appropriate function available on all container types. In the case of the openEHR Foundation types, this is match (<[T], Boolean>): List<T> defined on Container<T>; any equivalent function in a different model will do. The return type is nullable.
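The behaviour of such a match function can be sketched in Python. The sketch rests on several assumptions: the Container class here is a hypothetical stand-in for the Foundation Types class of the same name, the Book fields are simplified (an integer pub_year replaces the date, and a plain string replaces the terminology code), and an empty list is returned when nothing matches, which is one possible reading of the nullable return type.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Iterable, List, TypeVar

T = TypeVar("T")


class Container(Generic[T]):
    """Hypothetical container offering a match() function."""

    def __init__(self, items: Iterable[T]) -> None:
        self._items = list(items)

    def match(self, predicate: Callable[[T], bool]) -> List[T]:
        # Obtain all items for which the Boolean-valued agent holds.
        return [item for item in self._items if predicate(item)]


@dataclass
class Book:
    title: str
    pub_year: int   # simplification of pub_date
    country: str    # simplification of a terminology code


book_list = Container([
    Book("Don Quixote", 1605, "es"),
    Book("La Celestina", 1499, "es"),
    Book("Hamlet", 1603, "gb"),
])

# book_list [(b:Book) {b.title.contains("Quixote")}]
quixote_books = book_list.match(lambda b: "Quixote" in b.title)

# book_list [title.contains("Quixote") OR pub_date < ... AND country = ...]
old_spanish_books = book_list.match(
    lambda b: "Quixote" in b.title or (b.pub_year < 1650 and b.country == "es")
)

# EL's [1] selects the first match; Python lists are 0-based.
first_year = quixote_books[0].pub_year
```

The second predicate is parenthesised explicitly to make visible that AND binds more tightly than OR, in line with the precedence order given for logical operators.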
TBD: to achieve this generically, the map of operators to member functions of appropriate types needs to be supplied in the model supplying the types themselves.
Other short forms are available, making a predicate syntax reminiscent of Xpath possible, as follows.
Lambda expression | Short form |
---|---|
|
|
|
|
|
|
4.5. Predicates
EL predicates are special meta-operators that are used to express tests on runtime object structures.
4.5.1. Attached() Predicate
The attached() predicate is the EL counterpart of null tests such as someVar != null (C, C++, C#, Java), some_var is not None (Python) and similar. In EL, a reference is understood as being attached (or not) to a value. Attached status is therefore tested using attached (ref), and may be applied to any target of a BMM EL_INSTANCE_REF, which includes references to variables, properties, constants, functions and tuples.
Attached() returns a Boolean value, and thus may be negated, to form expressions such as:
not attached (test_result) or else test_result.data.events[1].data.value > 6.5
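The guarding effect of the expression above can be sketched in Python, where the short-circuiting or plays the role of or else. The function names are hypothetical, and the nested path test_result.data.events[1].data.value is reduced to a single optional number for brevity.

```python
from typing import Optional


def attached(ref: Optional[object]) -> bool:
    """True when the reference is attached to a value."""
    return ref is not None


def result_absent_or_high(test_result: Optional[float]) -> bool:
    # not attached (test_result) or else test_result > 6.5
    # The right-hand side is evaluated only when test_result is attached,
    # so the comparison never sees a missing value.
    return not attached(test_result) or test_result > 6.5
```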
4.6. Agents
Delayed routine calls for both functions and procedures may occur as terminals in an EL expression. The evaluation type (eval_type) of an agent is its signature. Syntactically, these take various forms. An agent can be created using a function or procedure visible in the current scope, using the keyword agent. The arguments list may range from empty to full. For a completely empty list, the routine name on its own may be used.
|
| define a naive obstetric risk function
|
obstetric_risk (age: Duration[1]; previous_pregnancies: Integer[1]): Coded_term[1]
|
| generate an agent with signature <[Duration, Integer], Coded_term>
|
agent obstetric_risk
For a partial argument list, ? symbols are used for the non-filled arguments. This generates an agent whose signature corresponds to the remaining open arguments. In the following example, an agent of the signature <[Integer], Coded_term> is generated, which, since the age of 38 years is supplied, may be thought of as a new function called obstetric_risk_38_years().
agent obstetric_risk ('P38Y', ?)
Theoretically, an agent could be created with all arguments supplied, without the intention of immediate execution, e.g. agent obstetric_risk ('P38Y', 2), which would generate an agent of signature <[],Coded_term>. This could be later executed by simply using the receiver variable or parameter reference in the normal way, in a later expression.
Agents for procedure calls can be created in the same way as described above. In each case, the evaluation type is a signature of the form <[args]>, i.e. having no return type.
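Readers familiar with Python may find it helpful to compare agents with partial function application. The sketch below uses functools.partial as an analogy only: the body of obstetric_risk and its thresholds are invented for illustration, ages are plain integers rather than Duration values, and partial can only close leading positional arguments, whereas the ? notation can leave any argument open.

```python
from functools import partial


def obstetric_risk(age_years: int, previous_pregnancies: int) -> str:
    """Naive illustrative risk function; the thresholds are invented."""
    if age_years >= 35 or previous_pregnancies >= 4:
        return "elevated"
    return "basic"


# agent obstetric_risk: no arguments closed, both remain open.
open_agent = obstetric_risk

# agent obstetric_risk ('P38Y', ?): age closed, one argument open.
risk_at_38 = partial(obstetric_risk, 38)

# agent obstetric_risk ('P38Y', 2): all arguments closed, call delayed.
closed_agent = partial(obstetric_risk, 38, 2)

# Each agent is executed later by supplying whatever arguments remain open.
later_result = risk_at_38(2)
delayed_result = closed_agent()
```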
5. Complex Expressions
Complex expressions in EL consist of non-atomic value-returning expressions, in a typed, operator-based syntax common to many programming languages and logics. In EL, the syntactic use of operators is understood as a shorthand for specific functions assumed to be available on types inferred from the context of the operator use. An EL implementation would therefore map such operators to the appropriate methods in a class library.
5.1. Equality Operator
The equality operator = in EL is understood as the function equal() defined on the type Any, of which every other class is a descendant. For all primitive value types (types for which use in expressions directly generates values rather than instance references), the semantics are value comparison, while for all other types, the semantics are reference comparison. For non-openEHR models, '=' will normally map to a similarly-named method, e.g. equals().
To obtain value comparison for non-value types, the function Any.is_equal(), which may be redefined in any sub-type, is used.
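The distinction between reference comparison and value comparison can be sketched in Python, where the is operator compares references. The Quantity class is a hypothetical non-value type invented for this illustration, and its is_equal() stands in for a redefinition of Any.is_equal().

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Quantity:
    """Hypothetical non-value type."""

    magnitude: float
    units: str

    def is_equal(self, other: "Quantity") -> bool:
        # Redefined value comparison: compare the contents field by field.
        return (self.magnitude, self.units) == (other.magnitude, other.units)


a = Quantity(6.5, "mmol/L")
b = Quantity(6.5, "mmol/L")

same_reference = a is b          # what '=' means for a non-value type
same_value = a.is_equal(b)       # explicit value comparison
primitive_equal = (5 == 5)       # primitive value types compare by value
```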
5.2. Primitive Operators
Primitive operators in EL are the infix or prefix syntax forms of functions available on primitive types. For example, the operator - (minus) is defined on the class Numeric (an inheritance ancestor of the classes Integer, Real etc) as the following:
|
| in class Numeric
|
subtract (other: Numeric): Numeric
alias infix '-'
|
| redefined in class Integer as
|
subtract (other: Integer): Integer
This means that where the expression 100 - 5 is encountered in EL, what is really invoked is {Integer}.subtract(), specifically 100.subtract(5).
For convenience, the operators for the Numeric and Boolean types from the openEHR Foundation Types are reproduced below.
Operators | Function | Meaning |
---|---|---|
Arithmetic Operators - Numeric operands and result; descending precedence order |
||
|
|
Exponentiation |
|
|
Multiplication |
|
|
Division |
|
|
Modulo (whole number) division |
|
|
Addition |
|
|
Subtraction |
Relational Operators - Numeric, Date/time operands and Boolean result; equal precedence |
||
|
|
Value equality |
|
|
Inequality relation |
|
|
Less than relation |
|
|
Less than or equal relation |
|
|
Greater than relation |
|
|
Greater than or equal relation |
Logical Operators - Boolean operands and result; descending precedence order |
||
|
|
Negation, "not p" |
|
|
Logical conjunction, "p and q" |
|
|
Logical disjunction, "p or q" |
|
|
Exclusive or, "only one of p or q" |
|
|
Material implication, "p implies q", or "if p then q" |
Expressions using logical operators may thus be written using standard English names or symbols, as in the following.
systolic_bp > 140 AND (is_smoker OR is_hypertensive)
systolic_bp > 140 ∧ (is_smoker ∨ is_hypertensive)
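For readers less familiar with the last two logical operators in the table, their truth-functional meaning can be sketched in Python. The function names implies, xor and at_risk are chosen for this illustration and are not taken from the Foundation Types.

```python
def implies(p: bool, q: bool) -> bool:
    """Material implication: false only when p holds and q does not."""
    return (not p) or q


def xor(p: bool, q: bool) -> bool:
    """Exclusive or: exactly one of p and q holds."""
    return p != q


def at_risk(systolic_bp: float, is_smoker: bool, is_hypertensive: bool) -> bool:
    # systolic_bp > 140 AND (is_smoker OR is_hypertensive)
    return systolic_bp > 140 and (is_smoker or is_hypertensive)
```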
In addition, some operators are defined on the other primitive types, including the following.
Operator | Function | Meaning |
---|---|---|
|
||
|
|
String concatenation, appending |
|
||
|
|
Add a precise duration to a date |
|
|
Add a nominal duration to a date |
|
|
Subtract a precise duration from a date |
|
|
Subtract a nominal duration from a date |
|
|
Difference of two dates |
|
||
|
|
Add a precise duration to a date/time |
|
|
Add a nominal duration to a date/time |
|
|
Subtract a precise duration from a date/time |
|
|
Subtract a nominal duration from a date/time |
|
|
Difference of two date/times |
|
||
|
|
Add a duration to a time |
|
|
Subtract a duration from a time |
|
|
Difference of two times |
|
||
|
|
Add a duration to a duration |
|
|
Subtract a duration from a duration |
Operator semantics that require further explanation are described below.
5.3. Higher-order Operators
5.3.1. Quantification Operators
The two standard quantification operators from predicate logic, there exists (∃ operator) and for all (∀ operator), are defined in EL for the container types found in the openEHR Foundation Types.
The textual syntax of there exists is as follows:
there_exists v in container_var | <Boolean expression mentioning v>
Here, the | symbol is usually read in English as 'such that'. The symbolic equivalent may also be used:
∃ v : container_var | <Boolean expression mentioning v>
The above may also be expressed in EL as its functional equivalent:
list_of_reals: List<Real>
|
| an expression that will return true if list_of_reals
| contains a value greater than 140.0
|
list_of_reals.there_exists (
agent (v: Real): Boolean {
v > 140.0
}
)
The for_all operator has similar textual syntax:
for_all v in container_var | <Boolean expression mentioning v>
Here, the | symbol is normally read in English as 'it holds that'. The symbolic equivalent may also be used:
∀ v : container_var | <Boolean expression mentioning v>
The above may also be expressed in EL as its functional equivalent:
list_of_reals: List<Real>
|
| an expression that will return true if list_of_reals
| consists of values all greater than 140.0
|
list_of_reals.for_all (
agent (v: Real): Boolean {
v > 140.0
}
)
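The functional forms of the two quantifiers correspond closely to Python's built-in any() and all(). The sketch below defines free functions named there_exists and for_all as stand-ins for the container functions; note that this sketch returns true for for_all on an empty container, a behaviour of Python's all() that is assumed here rather than taken from this specification.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def there_exists(container: Iterable[T], test: Callable[[T], bool]) -> bool:
    """True if at least one item satisfies the test."""
    return any(test(v) for v in container)


def for_all(container: Iterable[T], test: Callable[[T], bool]) -> bool:
    """True if every item satisfies the test."""
    return all(test(v) for v in container)


list_of_reals = [120.0, 135.5, 142.0]

# there_exists v in list_of_reals | v > 140.0
some_above_140 = there_exists(list_of_reals, lambda v: v > 140.0)

# for_all v in list_of_reals | v > 140.0
all_above_140 = for_all(list_of_reals, lambda v: v > 140.0)
```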
5.4. Decision Tables
In EL, a decision table is a construct that expresses the equivalent logic of a multi-branch construct that returns a single expression as a result. There are two flavours, both familiar to programmers in mainstream languages: the condition chain (i.e. an if/then/else construct) and the case table (i.e. a case statement). The evaluation of both constructs determines which of a number of possible expressions to return as the result, based on the prior evaluation of branch conditions, whose particular form depends on which flavour of construct is used. Both constructs are thus purely functional, i.e. their branches cannot contain statements (i.e. assignments, procedure calls etc), only expressions.
5.4.1. Condition Chain (if/then)
The syntax for a condition chain (the if/then equivalent) takes a standard form and a compact form. The standard form is as follows.
choice in
<condition_1>: <expression_1>,
<condition_2>: <expression_2>,
...
<condition_N>: <expression_N>,
*: <else expression>
;
In the above, the '*' character is understood as a wildcard, meaning 'all other cases'. A final row containing '*' is thus equivalent to a catch-all 'else' branch in the if/then/else chain of a procedural language.
A realistic example is illustrated below, making use of line comments to visually aid the author.
molecular_subtype: Terminology_term
Result := choice in
=========================================================
er_positive and
her2_negative and
not ki67.in_range ([high]): #luminal_A,
---------------------------------------------------------
er_positive and
her2_negative and
ki67.in_range ([high]): #luminal_B_HER2_negative,
---------------------------------------------------------
er_positive and
her2_positive: #luminal_B_HER2_positive,
---------------------------------------------------------
er_negative and
pr_negative and
her2_positive and
ki67.in_range ([high]): #HER2,
---------------------------------------------------------
er_negative and
pr_negative and
her2_negative and
ki67.in_range ([high]): #triple_negative,
---------------------------------------------------------
*: #none
=========================================================
;
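The evaluation of such a condition chain can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: it assumes that branches are tested top to bottom and that the first true condition wins, as in an if/then/else chain. The helper `choice`, the Boolean parameters, and the derivation of each `*_negative` value as the negation of its `*_positive` counterpart are all assumptions made for the example.

```python
from typing import Callable, Sequence, Tuple

# One branch of the chain: a deferred condition and the value it selects.
Branch = Tuple[Callable[[], bool], str]


def choice(branches: Sequence[Branch], otherwise: str) -> str:
    """Return the value of the first branch whose condition is true,
    or `otherwise` (the '*' row) if no condition holds."""
    for condition, result in branches:
        if condition():
            return result
    return otherwise


def molecular_subtype(er_positive: bool, pr_positive: bool,
                      her2_positive: bool, ki67_high: bool) -> str:
    # Assumption: each "negative" flag is simply the negation of the positive one.
    er_negative = not er_positive
    pr_negative = not pr_positive
    her2_negative = not her2_positive
    return choice(
        [
            (lambda: er_positive and her2_negative and not ki67_high,
             "luminal_A"),
            (lambda: er_positive and her2_negative and ki67_high,
             "luminal_B_HER2_negative"),
            (lambda: er_positive and her2_positive,
             "luminal_B_HER2_positive"),
            (lambda: er_negative and pr_negative and her2_positive and ki67_high,
             "HER2"),
            (lambda: er_negative and pr_negative and her2_negative and ki67_high,
             "triple_negative"),
        ],
        otherwise="none",
    )
```

The conditions are wrapped in lambdas so that evaluation stops at the first matching branch, which mirrors the short-circuit behaviour of a procedural if/then/else chain.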
For the common degenerate case where there is a single condition, the standard form looks as follows:
calculate_score: Integer
Result := choice in
============
expr1: 2,
------------
*: 0
============
;
While the standard form is perfectly understandable (and legal syntax), the following compact form may be used instead:
calculate_score: Integer
Result := expr1 ? 2 : 0
The above syntax is adopted from the C language family. It may be used to construct intelligible conditional arithmetic operations such as summing, e.g.:
ipi_raw_score: Integer
Result := Result.add (
=============================================
age > 60 ? 1 : 0,
staging ∈ {#stage_III, #stage_IV} ? 1 : 0,
ldh.in_range (#normal) ? 1 : 0,
ecog > 1 ? 1 : 0,
extranodal_sites > 1 ? 1 : 0
=============================================
)
;
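The summing pattern can be sketched in Python as a sum of 0/1 contributions, one per conditional term. The sketch mirrors the five terms exactly as written in the EL example; the function signature and parameter names (for instance `ldh_in_normal_range` standing in for `ldh.in_range (#normal)`) are hypothetical.

```python
def ipi_raw_score(age: int, stage: str, ldh_in_normal_range: bool,
                  ecog: int, extranodal_sites: int) -> int:
    """Add up one 0/1 contribution per condition, as in the EL example."""
    return sum([
        1 if age > 60 else 0,
        1 if stage in {"stage_III", "stage_IV"} else 0,
        1 if ldh_in_normal_range else 0,
        1 if ecog > 1 else 0,
        1 if extranodal_sites > 1 else 0,
    ])
```

Each `1 if ... else 0` term plays the role of one `cond ? 1 : 0` expression in the compact EL form.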
5.4.2. Case Table
The Case Table syntax form (case statement equivalent) is logically no different from the more general condition chain, except that every branch condition expression takes the form Expr ∈ Constr_i, where Expr is the same left-hand-side expression for all branches, and Constr_i is the right-hand side of the i-th branch, in the form of a value range constraint. Here the ∈ operator is read as 'is in', i.e. set-membership. The case table construct is designed to enable the value of a single determining expression to be tested against any number of value ranges. This is illustrated in the following example:
gfr_range: Real
risk_assessment: Real
Result := case gfr_range in
=================
|>20|: 1,
|10..20|: 0.75,
|<10|: 0.5
=================
;
This expression returns one of the values 1, 0.75 or 0.5, depending on the evaluated value of gfr_range, but it could equally return the value of a more complex expression, including further instances of case tables, condition chains, operator expressions etc.
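A case table of this kind can be sketched in Python as a list of (interval, result) pairs tested against one value. This is an illustration under stated assumptions, not EL: the `Interval` class and `case` helper are hypothetical, the two-sided interval is treated as including both bounds, and raising an error when no branch matches is a choice made for the sketch rather than specified behaviour.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Interval:
    """A value range; a bound of None means unbounded on that side."""
    lower: Optional[float] = None
    upper: Optional[float] = None
    lower_included: bool = True
    upper_included: bool = True

    def has(self, value: float) -> bool:
        if self.lower is not None:
            if value < self.lower or (value == self.lower and not self.lower_included):
                return False
        if self.upper is not None:
            if value > self.upper or (value == self.upper and not self.upper_included):
                return False
        return True


def case(value: float, branches: Sequence[Tuple[Interval, float]]) -> float:
    """Return the result of the first branch whose interval contains `value`."""
    for interval, result in branches:
        if interval.has(value):
            return result
    raise ValueError(f"no branch matches {value!r}")  # assumption, see lead-in


def risk_assessment(gfr_range: float) -> float:
    return case(
        gfr_range,
        [
            (Interval(lower=20, lower_included=False), 1.0),   # greater than 20
            (Interval(lower=10, upper=20), 0.75),              # 10 to 20, bounds included
            (Interval(upper=10, upper_included=False), 0.5),   # less than 10
        ],
    )
```

With these three ranges every real value falls into exactly one branch, so the error path is never reached in this example.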
5.4.3. Nested Case Table
The following shows the use of nested case tables to achieve the effect of a credit application test, from an example in the DMN specification.
post_bureau_risk_category: Terminology_term
Result := case existing_customer in
========================================
True: case
appl_risk_score
in
--------------------------------
|≤120|: case
credit_score
in
--------------------
|<590|: #HIGH,
|590..610|: #MEDIUM,
|>610|: #LOW
--------------------
;,
|>120|: case
credit_score
in
--------------------
|<600|: #HIGH,
|600..625|: #MEDIUM,
|>625|: #LOW
--------------------
;
--------------------------------
;,
False: case
appl_risk_score
in
--------------------------------
|≤100|: case
credit_score
in
--------------------
|<580|: #HIGH,
|580..600|: #MEDIUM,
|>600|: #LOW
--------------------
;,
|>100|: case
credit_score
in
--------------------
|<590|: #HIGH,
|590..615|: #MEDIUM,
|>615|: #LOW
--------------------
;
--------------------------------
;
========================================
;
;
5.4.4. Multi-dimensional Case Table (experimental)
The credit assessment example above can be recoded as a sparse table.
post_bureau_risk_category := multicase
=======================================================================================
{existing_customer, appl_risk_score, credit_score} in
---------------------------------------------------------------------------------------
True: |≤120|: |<590|: #HIGH,
|590..610|: #MEDIUM,
|>610|: #LOW;
-------------------------------------------------------------------
|>120|: |<600|: #HIGH,
|600..625|: #MEDIUM,
|>625|: #LOW;
,
----------------------------------------------------------------------------------------
False: |≤100|: |<580|: #HIGH,
|580..600|: #MEDIUM,
|>600|: #LOW;
-------------------------------------------------------------------
|>100|: |<590|: #HIGH,
|590..615|: #MEDIUM,
|>615|: #LOW;
;
=======================================================================================
;
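The lookup that such a sparse table describes can be sketched in Python as nested rows of tests, one level per input. This is only an illustration of the decision logic, not EL and not a definitive implementation; the `bands` helper, the `TABLE` layout and the error raised when nothing matches are assumptions made for the example, and the two-sided ranges are treated as including both bounds.

```python
from typing import Callable, Dict, List, Tuple

Test = Callable[[float], bool]
CreditRows = List[Tuple[Test, str]]


def bands(medium_from: float, medium_to: float) -> CreditRows:
    """Build the three credit_score rows: below, within, and above the medium band."""
    return [
        (lambda c: c < medium_from, "HIGH"),
        (lambda c: medium_from <= c <= medium_to, "MEDIUM"),
        (lambda c: c > medium_to, "LOW"),
    ]


# existing_customer -> rows of (appl_risk_score test, credit_score rows)
TABLE: Dict[bool, List[Tuple[Test, CreditRows]]] = {
    True: [
        (lambda s: s <= 120, bands(590, 610)),
        (lambda s: s > 120, bands(600, 625)),
    ],
    False: [
        (lambda s: s <= 100, bands(580, 600)),
        (lambda s: s > 100, bands(590, 615)),
    ],
}


def post_bureau_risk_category(existing_customer: bool,
                              appl_risk_score: float,
                              credit_score: float) -> str:
    """Walk the table one dimension at a time and return the matching category."""
    for score_test, credit_rows in TABLE[existing_customer]:
        if score_test(appl_risk_score):
            for credit_test, category in credit_rows:
                if credit_test(credit_score):
                    return category
    raise ValueError("no row matches the inputs")  # assumption, see lead-in
```

The same function also describes the result of the nested case table form, since both forms encode the same thresholds.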
5.4.5. Two-dimensional Tables (experimental)
Two-dimensional decision tables are common in all sectors. Although they can be reduced to a condition chain, EL provides a more direct syntax that enables them to be expressed in a form visually very close to their logical form.
item in
==========================================================================
{ isEconomy(p), isBusiness(p), isFirstClass(p) },
--------------------------------------------------------------------------
isChild(p): { 50, 250, 1000 },
--------------------------------------------------------------------------
isAdult(p): { 250 + trip.d, 450 + trip.d, 750 + trip.d },
--------------------------------------------------------------------------
isMilitary(p): { 90, 250, 750 - 2 * p.age }
==========================================================================
;
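One way to read such a table is: find the column whose header condition holds, find the row whose condition holds, and evaluate only the cell at their intersection. The Python sketch below illustrates that reading; it is not EL, and the `Passenger` and `Trip` classes, their fields, and the way the row and column predicates are defined are all hypothetical stand-ins for `p`, `trip` and the `is...` functions of the example.

```python
from dataclasses import dataclass


@dataclass
class Passenger:          # hypothetical stand-in for `p`
    age: int
    category: str         # "child", "adult" or "military"
    travel_class: str     # "economy", "business" or "first"


@dataclass
class Trip:               # hypothetical stand-in for `trip`
    d: float


def price(p: Passenger, trip: Trip) -> float:
    # Column header conditions, in table order.
    columns = [
        lambda: p.travel_class == "economy",
        lambda: p.travel_class == "business",
        lambda: p.travel_class == "first",
    ]
    # Row conditions, each with one deferred cell expression per column.
    rows = [
        (lambda: p.category == "child",
         [lambda: 50, lambda: 250, lambda: 1000]),
        (lambda: p.category == "adult",
         [lambda: 250 + trip.d, lambda: 450 + trip.d, lambda: 750 + trip.d]),
        (lambda: p.category == "military",
         [lambda: 90, lambda: 250, lambda: 750 - 2 * p.age]),
    ]
    col = next((i for i, holds in enumerate(columns) if holds()), None)
    if col is None:
        raise ValueError("no column condition holds")
    for holds, cells in rows:
        if holds():
            return cells[col]()   # evaluate only the selected cell
    raise ValueError("no row condition holds")
```

The cells are deferred so that an expression such as `750 - 2 * p.age` is evaluated only when its row and column are the ones selected.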
Appendix A: Syntax Specification
Antlr4 files for EL may be found in the openEHR Antlr4 Git repository.
The Antlr4 grammar for the EL syntax is shown below.
//
// description: Antlr4 grammar for openEHR Expression Language based on BMM meta-model.
// author: Thomas Beale <thomas.beale@openehr.org>
// contributors: Pieter Bos <pieter.bos@nedap.com>
// support: openEHR Specifications PR tracker <https://openehr.atlassian.net/projects/SPECPR/issues>
// copyright: Copyright (c) 2016- openEHR Foundation <http://www.openEHR.org>
// license: Apache 2.0 License <http://www.apache.org/licenses/LICENSE-2.0.html>
//
parser grammar ElParser;
options { tokenVocab=ElLexer; }
import Cadl2Parser;
// ========================== EL Statements ==========================
statementBlock: statement+ EOF ;
statement: declaration | assignment | assertion ;
declaration:
variableDeclaration
| constantDeclaration
;
variableDeclaration: instantiableRef ':' typeId ( SYM_ASSIGNMENT expression )? ;
constantDeclaration: constantId ':' typeId ( SYM_EQ expression )? ;
assignment: valueGenerator SYM_ASSIGNMENT expression ;
assertion: ( ( LC_ID | UC_ID ) ':' )? SYM_ASSERT booleanExpr ;
// ========================== EL Expressions ==========================
//
// Expressions are either value-generators, or operator expressions (containing value-generators)
//
expression:
terminal
| operatorExpression
| tuple
;
operatorExpression:
booleanExpr
| arithmeticExpr
;
// ------------------- Boolean-returning operator expressions --------------------
//
// Expressions evaluating to boolean values, using standard precedence;
// These map to ordinary 1- and 2-argument function calls on Boolean instances
//
booleanExpr:
SYM_NOT booleanExpr
| booleanExpr SYM_AND booleanExpr
| booleanExpr SYM_XOR booleanExpr
| booleanExpr SYM_OR booleanExpr
| booleanExpr SYM_IMPLIES booleanExpr
| booleanExpr ( SYM_IFF | SYM_EQ ) booleanExpr
| booleanLeaf
;
//
// Atomic Boolean-valued expression elements
//
booleanLeaf:
booleanValue
| forAllExpr
| thereExistsExpr
| arithmeticConstraintExpr
| generalConstraintExpr
| '(' booleanExpr ')'
| SYM_DEFINED '(' valueGenerator ')'
| arithmeticComparisonExpr
| objectComparisonExpr
| valueGenerator
;
//
// Universal and existential quantifier
//
forAllExpr: SYM_FOR_ALL localVariableId ':' valueGenerator '|' booleanExpr ;
thereExistsExpr: SYM_THERE_EXISTS localVariableId ':' valueGenerator '|' booleanExpr ;
// Constraint expressions
// This provides a way of using one operator (matches) to compare a
// value (LHS) with a value range (RHS). As per ADL, the value range
// for ordered types like Integer, Date etc may be a single value,
// a list of values, or a list of intervals, and in future, potentially
// other comparators, including functions (e.g. divisible_by_N).
//
// For non-ordered types like String and Terminology_code, the RHS
// is in other forms, e.g. regex for Strings.
//
// The matches operator can be used to generate a Boolean value that
// may be used within an expression like any other Boolean (hence it
// is a booleanLeaf).
// TODO: non-primitive objects might be supported on the RHS in future.
arithmeticConstraintExpr: arithmeticLeaf SYM_MATCHES '{' cInlineOrderedObject '}' ;
generalConstraintExpr: simpleTerminal SYM_MATCHES '{' cObjectMatcher '}' ;
// --------------------------- Arithmetic operator expressions --------------------------
//
// Comparison expressions of arithmetic operands generating Boolean results
//
arithmeticComparisonExpr: arithmeticExpr comparisonBinop arithmeticExpr ;
comparisonBinop:
SYM_EQ
| SYM_NE
| SYM_GT
| SYM_LT
| SYM_LE
| SYM_GE
;
//
// Expressions evaluating to values of arithmetic types, using standard precedence
//
arithmeticExpr:
<assoc=right> arithmeticExpr '^' arithmeticExpr
| arithmeticExpr ( '/' | SYM_ASTERISK | '%' ) arithmeticExpr
| arithmeticExpr ( '+' | '-' ) arithmeticExpr
| arithmeticLeaf
;
// TODO: need to be able to plug in terminal to allow decision tables in expressions
arithmeticLeaf:
arithmeticValue
| '(' arithmeticExpr ')'
| valueGenerator
| simpleCaseTable
;
arithmeticValue:
integerValue
| realValue
| dateValue
| dateTimeValue
| timeValue
| durationValue
;
// -------------------- Equality operator expressions for other types ------------------------
//
// Compare any kind of objects
//
objectComparisonExpr: simpleTerminal equalityBinop simpleTerminal ;
equalityBinop:
SYM_EQ
| SYM_NE
;
//
// -------------------------- tuples -----------------------------
//
tuple: '[' expression ( ',' expression )+ ']';
//
// -------------------------- value-generating expressions -----------------------------
//
terminal:
simpleTerminal
| decisionTable
;
simpleTerminal:
primitiveObject
| valueGenerator
;
//
// TODO: Can't syntactically distinguish between a local or other variable id
// and a property or constant reference.
//
valueGenerator:
bareRef
| scopedFeatureRef
| typeRef
;
bareRef:
boundVariableId
| staticRef
| localRef
| functionCall
;
//
// Static and constant feature refs, distinguished by the use of
// initial capital in the id.
// Will map to EL_READABLE_VARIABLE or EL_STATIC_REF (unscoped)
//
staticRef:
SYM_SELF
| constantId
;
//
// Local writable reference, distinguished by use of initial lowercase id
// Will map to EL_WRITABLE_VARIABLE or EL_PROPERTY_REF (unscoped)
//
localRef:
SYM_RESULT
| localVariableId
;
//
// scoped feature references.
// Will map to any EL_FEATURE_REF (scoped)
//
scopedFeatureRef: scoper featureRef ;
scoper: ( typeRef '.' )? ( bareRef '.' )* ;
typeRef: '{' typeId '}' ;
typeId: UC_ID ( '<' typeId ( ',' typeId )* '>' )? ;
featureRef:
functionCall
| instantiableRef
;
//
// Instantiable feature refs
//
instantiableRef:
boundVariableId
| localVariableId
| constantId
;
//
// TODO: analyse how a boundVariableId can be created as a built-in feature
//
boundVariableId: BOUND_VARIABLE_ID ;
localVariableId: LC_ID ;
constantId: UC_ID ;
//
// Function calls
//
functionCall: LC_ID '(' exprList? ')' ';'? ;
exprList: expression ( ',' expression )* ;
//
// -------------------------- decision tables -----------------------------
//
decisionTable:
binaryChoice
| caseTable
| conditionTable
;
caseTable:
simpleCaseTable
| generalCaseTable
;
//
// condition chains (if/then statement equivalent)
// choice in
// =========================================================
// er_positive and
// her2_negative and
// not ki67.in_range (#high): #luminal_A,
// ---------------------------------------------------------
// er_positive and
// her2_negative and
// ki67.in_range (#high): #luminal_B_HER2_negative,
// ---------------------------------------------------------
// *: #none
// =========================================================
// ;
//
conditionTable: SYM_CHOICE SYM_IN ( conditionBranch ',' )+ ( conditionBranch | conditionDefaultBranch ) ';' ;
conditionBranch: booleanExpr ':' expression ;
conditionDefaultBranch: SYM_ASTERISK ':' expression ;
//
// Binary-choice version of condition table, using old-school
// C/Java syntax:
// booleanExpr ? x : y ;
//
binaryChoice: booleanExpr '?' simpleTerminal ':' simpleTerminal ;
//
// Case tables, e.g.:
// Result := case qCSI_score in
// ============================
// 0: expr0,
// ----------------------------
// |1..2|: expr1,
// ----------------------------
// |3..5|: expr2,
// ----------------------------
// |6..8|: expr3,
// ----------------------------
// |≥ 9|: expr4
// ============================
// ;
//
generalCaseTable: SYM_CASE expression SYM_IN ( generalCaseBranch ',' )+ ( generalCaseBranch | generalCaseDefaultBranch ) ';' ;
generalCaseBranch: primitiveObject ':' expression ;
generalCaseDefaultBranch: SYM_ASTERISK ':' expression ;
//
// Simple value-based (typed) Case tables, e.g.:
// case gfr_range in
// =================
// |>20|: 1,
// |10..20|: 0.75,
// |<10|: 0.5
// =================
// ;
//
simpleCaseTable: SYM_CASE simpleTerminal SYM_IN ( simpleCaseBranch ',' )+ ( simpleCaseBranch | simpleCaseDefaultBranch ) ';' ;
simpleCaseBranch: primitiveObject ':' simpleTerminal ;
simpleCaseDefaultBranch: SYM_ASTERISK ':' simpleTerminal ;
The Antlr4 lexer for the EL syntax is shown below.
//
// description: Antlr4 grammar for openEHR Expression Language based on BMM meta-model.
// author: Thomas Beale <thomas.beale@openehr.org>
// contributors: Pieter Bos <pieter.bos@nedap.com>
// support: openEHR Specifications PR tracker <https://openehr.atlassian.net/projects/SPECPR/issues>
// copyright: Copyright (c) 2016- openEHR Foundation <http://www.openEHR.org>
// license: Apache 2.0 License <http://www.apache.org/licenses/LICENSE-2.0.html>
//
lexer grammar ElLexer;
import AdlPathLexer, Cadl2Lexer, GeneralLexer;
channels {
COMMENT
}
// ------------------ lines and comments ------------------
CMT_LINE : '--' .*? EOL -> channel(COMMENT) ;
TABLE_CMT_LINE : '===' '='* EOL -> channel(COMMENT) ;
EOL : '\r'? '\n' -> channel(HIDDEN) ;
WS : [ \t\r]+ -> channel(HIDDEN) ;
// --------- keywords ----------
SYM_DEFINED : 'defined' ;
SYM_SELF : 'Self' ;
SYM_IN : 'in' ;
SYM_CHOICE : 'choice' ;
SYM_CASE : 'case' ;
SYM_RESULT : 'Result' ;
// --------- symbols ----------
SYM_ASSIGNMENT: ':=' ;
SYM_COLON : ':' ;
SYM_INTERROGATION: '?' ;
SYM_NE : '/=' | '!=' | '≠' ;
SYM_EQ : '=' ;
SYM_GT : '>' ;
SYM_LT : '<' ;
SYM_LE : '<=' | '≤' ;
SYM_GE : '>=' | '≥' ;
SYM_PLUS : '+' ;
SYM_MINUS : '-' ;
SYM_SLASH : '/' ;
SYM_PERCENT : '%' ;
SYM_CARET : '^' ;
SYM_DOT : '.' ;
SYM_DOUBLE_MINUS: '--' ;
SYM_DOUBLE_PLUS: '++' ;
SYM_THEN : 'then' | 'THEN' ;
SYM_AND : 'and' | 'AND' | '∧' ;
SYM_OR : 'or' | 'OR' | '∨' ;
SYM_XOR : 'xor' | 'XOR' ;
SYM_NOT : 'not' | 'NOT' | '!' | '~' | '¬' ;
SYM_IMPLIES : 'implies' | '⇒' | '→' ;
SYM_IFF : '⇔' | '↔' ;
SYM_FOR_ALL : 'for_all' | '∀' ;
SYM_THERE_EXISTS: 'there_exists' | '∃' ;
SYM_MATCHES : 'matches' | 'is_in' | '∈' ;
SYM_ASSERT : 'assert' ;
// TODO: replace with defined() and attached() predicates
SYM_EXISTS : 'exists' ;
BOUND_VARIABLE_ID: '$' LC_ID ;
// ---------- local codes that are not ADL codes -------
// e.g. [heart_rate]
LOCAL_TERM_CODE_REF: '[' ALPHANUM_US_CHAR+ ']' ;