Simplified Data Template (SDT)

Issuer: openEHR Specification Program
Release: ITS-REST latest	Status: DEVELOPMENT
Revision: [latest_issue]	Date: [latest_issue_date]
Keywords: JSON, template, REST, commit

© 2019 - 2019 The openEHR Foundation
The openEHR Foundation is an independent, non-profit foundation, facilitating the sharing of health records by consumers and clinicians via open specifications, clinical models and open platform implementations.
Licence	Creative Commons Attribution-NoDerivs 3.0 Unported. https://creativecommons.org/licenses/by-nd/3.0/
Support	Issues: Problem Reports Web: specifications.openEHR.org

Amendment Record

Issue	Details	Raiser	Completed
ITS_REST Release 1.?.?
0.7.0	SPECITS-33. Add JSON Data Template format specification; Initial writing, adapted from the Marand Better Platform 'Web Templates' specification and EtherCIS documentation.	openEHR SEC	17 Jul 2019

Acknowledgements

Primary Author

Thomas Beale, Ars Semantica (UK); openEHR Foundation Management Board.

Contributors

This specification has benefited from formal and informal input from the openEHR and wider health informatics community. The openEHR Foundation would like to recognise the following for their contributions.

A significant part of the design ideas and content of this specification was adapted from

the Marand 'Web Templates' specification, kindly provided by Better by Marand d.o.o. (Slovenia), and
the EtherCIS ECISFLAT format by the EtherCIS community, see https://github.com/ethercis/ethercis/blob/master/doc/flat%20json.md

Trademarks

'openEHR' is a registered trademark of the openEHR Foundation
'OMG' and 'UML' are registered trademarks of the Object Management Group

1. Preface

1.1. Purpose

This document describes a 'simplified template' (ST) format, along with various concrete data formats, intended to be easier to use than the canonical formats for developers with minimal exposure to openEHR. These formats are designed to simplify the creation of openEHR content (e.g. Compositions).

1.2. Status

This specification is in the DEVELOPMENT state. The development version of this document can be found at {json_data_template}[{json_data_template}^].

Known omissions or questions are indicated in the text with a 'to be determined' paragraph, as follows:

TBD: (example To Be Determined paragraph)

1.3. Feedback

Feedback may be provided on the technical mailing list.

Issues may be raised on the specifications Problem Report tracker.

To see changes made due to previously reported issues, see the ITS-REST component Change Request tracker.

1.4. Conformance

When used with the openEHR REST API, the calls to be conformance tested are the same as for other openEHR serialisation formats (canonical JSON etc.), but with a different representation format, indicated by setting the appropriate Content-Type HTTP header.

TBD: Conformance testing and Content-Type will be further described later.

2. Overview

2.1. Requirements

The default serialised data representations for openEHR content are canonical XML, based on the openEHR RM XSDs, canonical JSON, described by the openEHR JSON schemas, and potentially any other canonical serial format based on the underlying Reference Model (e.g. YAML etc). Here, 'canonical' means any fully-expressed instance data in which all RM mandatory fields are present, all attributes are named as per the RM, and all cardinalities respect the RM.

The canonical formats are routinely used by all openEHR implementations implementing the openEHR REST API specification, and in many other ways, e.g. for DB dump/load implementation, ETL and so on. However, creating data instances according to these formats is not always straightforward, particularly for developers with minimal exposure to openEHR, and various alternatives have been used in the past to simplify the job of content creation and committal for application developers.

This specification responds to the requirement for a minimal but formally complete serial form of content creation for openEHR applications.

2.2. Conceptual Approach

The information models of openEHR are structured in multiple layers, with the primary distinction being between an information model layer (the 'Reference Model' or RM), and domain-level models expressed in archetypes and templates, the latter of which express particular data sets. Each such data set is defined in terms of an openEHR Operational Template (OPT), derived from a source template, and ultimately from particular archetypes, which are themselves constraint models based on the RM, i.e. the 'canonical' model.

The openEHR reference model (RM) and supporting models (BASE component) are designed with two computational goals in mind:

data instances (healthcare data) are fully defined and self-standing, for example if shared with a data partner that does not use openEHR;
software that implements the model works in regular, expected ways; for example, the structure of the openEHR OBSERVATION, HISTORY and EVENT classes will generically represent any observation, from a single weight measurement to 100,000 samples of complex vital signs data.

The model is accordingly rigorous. However, for some developers who only need to instantiate, commit and/or read relatively limited data sets, the canonical format can be demanding. This is particularly so for situations in which only a few kinds of data are involved, i.e. a relatively small number of templates, e.g. vital signs data, lab results etc.

The starting point for defining a developer-friendly commit format is therefore to assume that the great majority of applications are typically targeted to one or a few specific data sets, e.g. vital signs, diabetic monitoring, pregnancy plan etc.

Template-specificity provides a route for defining and generating a serial format such that each openEHR template can be used to define one or more reasonably simple commit formats. Two such formats are discussed here:

near-canonical RM Simplified Data Template (ncSDT), originally devised for the EtherCIS project, see https://github.com/ethercis/ethercis/blob/master/doc/flat%20json.md
simplified IM Simplified Data Template (simSDT), based on the 'web template' format originally created by Marand for the Better platform, see https://www.ehrscape.com/examples.html

The first of these (ncSDT) is an extract from an Operational Template (OPT) that uses AQL-style paths (based on natural language independent codes like at0001), and apart from simplification of the DV_XXX and PARTY_PROXY types, retains the openEHR RM structure. An example is shown below.

{
    "/context/health_care_facility|name":"Northumbria Community NHS",
    "/context/health_care_facility|identifier":"999999-345",
    "/context/start_time":"2015-09-28T10:18:17.352+07:00",
    "/context/end_time":"2015-09-28T11:18:17.352+07:00",
    "/context/participation|function":"Oncologist",
    "/context/participation|name":"Dr. Marcus Johnson",
    "/context/participation|identifier":"1345678",
    "/context/participation|mode":"face-to-face communication::openehr::216",
    "/context/location":"local",
    "/context/setting":"openehr::227|emergency care|",
    "/composer|identifier":"1345678",
    "/composer|name":"Dr. Marcus Johnson",
    "/category":"openehr::433|event|",
    "/territory":"FR",
    "/language":"fr",
    "/content[openEHR-EHR-SECTION.medications.v1]/items[openEHR-EHR-INSTRUCTION.medication.v1]/participation:0":"Nurse|1345678::Jessica|face-to-face communication::openehr::216",
    "/content[openEHR-EHR-SECTION.medications.v1]/items[openEHR-EHR-INSTRUCTION.medication.v1]/participation:1":"Assistant|1345678::2.16.840.1.113883.2.1.4.3::NHS-UK::ANY::D. Mabuse|face-to-face communication::openehr::216",
    "/content[openEHR-EHR-SECTION.medications.v1]/items[openEHR-EHR-INSTRUCTION.medication.v1]/activities[at0001]/timing":"before sleep",
    "/content[openEHR-EHR-SECTION.medications.v1]/items[openEHR-EHR-INSTRUCTION.medication.v1]/activities[at0001]/description[openEHR-EHR-ITEM_TREE.medication_mod.v1]/items[at0001]":"aspirin",
    "/content[openEHR-EHR-SECTION.medications.v1]/items[openEHR-EHR-INSTRUCTION.medication.v1]/activities[at0002]/timing":"lunch",
    "/content[openEHR-EHR-SECTION.medications.v1]/items[openEHR-EHR-INSTRUCTION.medication.v1]/activities[at0002]/description[openEHR-EHR-ITEM_TREE.medication_mod.v1]/items[at0001]":"Atorvastatin"
}
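As an illustration only, the coded values in the example above (e.g. "openehr::433|event|") can be split into their parts with a few lines of Python. The pattern below is an assumption inferred solely from the two values shown for /context/setting and /category; the function name parse_coded_value and the resulting key names are hypothetical, and other value shapes in the example (such as the participation strings) are deliberately not handled.

```python
import re

# Assumed shape "terminology::code|value|", inferred only from the example
# values "openehr::227|emergency care|" and "openehr::433|event|".
CODED_VALUE = re.compile(
    r"^(?P<terminology>[^:|]+)::(?P<code>[^:|]+)\|(?P<value>[^|]*)\|$"
)


def parse_coded_value(text):
    """Return the parts of a stringified coded value, or None if the
    text does not match the assumed shape."""
    match = CODED_VALUE.match(text)
    if match is None:
        return None
    return match.groupdict()


setting = parse_coded_value("openehr::227|emergency care|")
plain = parse_coded_value("FR")  # not a coded value in the assumed shape
```

Returning None for non-matching strings leaves the caller free to treat such values (e.g. "FR" for /territory) as plain strings.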

The latter (simSDT) is based on a more radical simplification of the openEHR RM and BASE models, as well as a more programmer-friendly (natural language based) rendering of paths. Under this approach, a lab result data structure could be represented using non-canonical attribute or tag names such as 'serum_sodium', 'serum_potassium', etc. (in English or in any other natural language, including script-based languages), instead of each node appearing under the canonical CLUSTER.items attribute with codes like at0001. An example is shown below.

{
  "laboratory_order/_uid": "23d69330-7790-4394-8abc-1455681f6ffa::ydh.code4health.com::1",
  "laboratory_order/language|code": "en",
  "laboratory_order/language|terminology": "ISO_639-1",
  "laboratory_order/territory|code": "GB",
  "laboratory_order/territory|terminology": "ISO_3166-1",
  "laboratory_order/context/_health_care_facility|id": "999999-345",
  "laboratory_order/context/_health_care_facility|id_scheme": "2.16.840.1.113883.2.1.4.3",
  "laboratory_order/context/_health_care_facility|id_namespace": "NHS-UK",
  "laboratory_order/context/_health_care_facility|name": "Northumbria Community NHS",
  "laboratory_order/context/setting|terminology": "openehr",
  "laboratory_order/laboratory_test_request/_uid": "b8c17799-457d-4583-8d85-c369dffacc21",
  "laboratory_order/laboratory_test_request/lab_request/service_requested|code": "444164000",
  "laboratory_order/laboratory_test_request/lab_request/service_requested|value": "Urea, electrolytes and creatinine measurement",
  "laboratory_order/laboratory_test_request/lab_request/service_requested|terminology": "SNOMED-CT",
  "laboratory_order/laboratory_test_request/lab_request/timing": "R5/2015-04-10T00:19:00+02:00/P2M",
  "laboratory_order/laboratory_test_request/lab_request/timing|formalism": "timing",
  "laboratory_order/laboratory_test_request/narrative": "Urea, electrolytes and creatinine measurement",
  "laboratory_order/laboratory_test_request/language|code": "en",
  "laboratory_order/laboratory_test_tracker/time": "2015-04-10T00:19:02.518+02:00",
  "laboratory_order/laboratory_test_tracker/language|code": "en",
  "laboratory_order/laboratory_test_tracker/language|terminology": "ISO_639-1",
  "laboratory_order/laboratory_test_tracker/encoding|code": "UTF-8",
  "laboratory_order/laboratory_test_tracker/encoding|terminology": "IANA_character-sets",
  "laboratory_order/composer|name": "Dr Joyce Smith",
  "ctx/language": "en",
  "ctx/territory": "GB"
}
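In the example above, several keys share a path and differ only in the suffix after the '|' character (e.g. '|code', '|terminology'). A minimal Python sketch of how a client might regroup such keys per node is given below. It assumes, based only on this example, that '|' separates a node path from an attribute name and that a key without '|' carries the node's main value; the function name group_by_path and the use of an empty string for the main value are hypothetical choices.

```python
def group_by_path(flat):
    """Group flat simSDT-style keys by node path.

    Assumption: the text after '|' names an attribute of the node at the
    path before it; a key with no '|' is stored under the empty string.
    """
    grouped = {}
    for key, value in flat.items():
        path, _, attribute = key.partition("|")
        grouped.setdefault(path, {})[attribute] = value
    return grouped


sample = {
    "laboratory_order/language|code": "en",
    "laboratory_order/language|terminology": "ISO_639-1",
    "laboratory_order/laboratory_test_request/lab_request/timing": "R5/2015-04-10T00:19:00+02:00/P2M",
    "laboratory_order/laboratory_test_request/lab_request/timing|formalism": "timing",
    "ctx/language": "en",
}
nodes = group_by_path(sample)
```

After grouping, nodes["laboratory_order/language"] holds both the code and the terminology, which is closer to the shape of a single coded value.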

A developer just using the simSDT or ncSDT as illustrated above in a specific example-based use case does not necessarily need to understand the detailed steps of conversions described below. Platforms based on openEHR can have services that generate example instances based on openEHR templates to make work easier for such developers. The detailed descriptions below are primarily intended for developers creating and maintaining underlying openEHR platforms or dealing with complex use cases.

To make any form of 'simplified format' work, the following requirements must be met:

the format makes it possible to abstract away rigorous structural complexity of the canonical model where possible, mainly by making the data less self-standing, and relying more on a schema;
the format definition for any given commit data can be completely and routinely machine generated from its canonical definition, i.e. from an openEHR OPT, or other upstream canonical definition;
data instances of the simplified format definition can be routinely machine converted to canonical format at execution time.

A generic high-level algorithm for creating both kinds of data template definition from an Operational Template (OPT) is illustrated below.

Figure 1. Scheme for generation of JSON Template definitions

In the above, both the near-canonical data and simplified data template definitions are created via a series of transformations starting with an OPT, followed by RM flattening, and then two stages of JSON format generation. The more heavily simplified form is created via an extra step, in which an original OPT is converted by the sOPT transformer to a simplified OPT (sOPT), which is a regular-structured OPT, but whose underlying reference model is a Simplified Information Model (SIM), based on the canonical Reference Model (RM) and related openEHR Information Models (Base, etc).

TODO: in fact, even the near-canonical data template has to be generated via a minimal sOPT step.

The SIM is approximately a logical sub-set of classes relevant to the definition of EHR committable content, with each class being a potentially simplified form of one or more classes in the RM. The simplifications may consist of:

merging of Composition relationships (de-normalisation), which has the effect of reducing data path depth; i.e. in some cases, 2 RM classes are replaced by a single SIM class, which is relatively easy in the case of 0..1 and 1..1 relationships;
stringification of specific attributes, i.e. replacement of (usually low-level) types with String, so that the attribute may contain a string form of a complex object.

These rules are formalised in the model-to-model Transformation rules shown above. Using the SIM and the rules, a Simplified OPT (sOPT) can be generated from any Operational Template (OPT), and from there, various concrete form JSON Data Templates (JDTs) may be generated, including regular JSON and 'flat form' JSON. Regular JSON is the usual sparse hierarchical structure in which the hierarchy follows the data model. Flat form JSON is legal JSON, extracted from regular JSON by converting it to the logical model of the tuple [path, leaf_data_item:Any], i.e. a logical 2-column table of path/value. In the generation of the regular JSON, paths can be expressed in AQL (standard openEHR) format, or be converted to simplified format according to a small algorithm. The option to do this is shown in the JDT formatter in the diagram above.
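The relationship between regular and flat form JSON described above can be sketched in Python as follows. This is not the specified JDT formatter: the function name flatten is hypothetical, '/' is assumed as the path separator, and the ':index' suffix for list members is an assumption borrowed from the 'participation:0' keys in the ncSDT example.

```python
def flatten(node, prefix=""):
    """Convert nested JSON-like data into a list of (path, leaf) tuples,
    i.e. a logical 2-column table of path/value."""
    rows = []
    if isinstance(node, dict):
        for name, child in node.items():
            child_path = f"{prefix}/{name}" if prefix else name
            rows.extend(flatten(child, child_path))
    elif isinstance(node, list):
        # Assumed convention: list members are addressed as 'path:index'.
        for index, child in enumerate(node):
            rows.extend(flatten(child, f"{prefix}:{index}"))
    else:
        rows.append((prefix, node))
    return rows


regular = {
    "laboratory_order": {
        "composer": {"name": "Dr Joyce Smith"},
        "participation": ["Nurse", "Assistant"],
    }
}
table = flatten(regular)
flat_json = dict(table)  # the flat form as a single JSON object
```

Because every row is a (path, value) pair with a unique path, the table converts directly into a one-level JSON object.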

Instances of both JSON regular and flat JDT formats can be created by developers to represent openEHR data to be committed to a system. These will be converted to canonical RM format (also obeying their original OPTs) by the simSDT → RM converter on the server side at data commit time, as shown in the following diagram.

Figure 2. Scheme for conversion of Simplified Template instance to canonical form

Following this scheme, this specification describes the Simplified Information Model (SIM), the Simplified OPT Transformer (sOPT Transformer), the downstream JSON concrete formats, and the ST → canonical instance converter.

3. sOPT Generation

This section describes how a Simplified OPT may be machine-derived from its corresponding canonical OPT definition and the Simplified Information Model (SIM).

3.1. Visitor algorithm

3.1.1. C_COMPLEX_OBJECT

SOptVisitor:: OptVisitor {

    //
    // Visitor function for C_COMPLEX_OBJECT node in OPT
    //
    public enterCComplexObject (CComplexObject cObj) {
        // obtain or synthesise SIM class name for cObj.rmTypeName
        simClassName = xxx;

        // create SOpt CComplexObject
        CComplexObject sCObj = new CComplexObject (simClassName);

        // process attributes
        for (CAttribute cAttr in cObj.attributes) {

            // find any rule for this attribute
            if (rules.hasPathRule (cObj.rmTypeName, cAttr.rmAttributeName)) {
                attrRule = rules.pathRule (cObj.rmTypeName, cAttr.rmAttributeName);

                // create a new output C_ATTRIBUTE with the SIM attribute name
                CAttribute sCAttr = new CAttribute (baseName (attrRule.simPath));

                // deal with collapse case - go through all matching paths
                if (attrRule.collapse) {
                    // attach new C_OBJECTs to sCAttr
                    for (collapseRule in rules.matchingPathRules (cObj.rmTypeName, cAttr.rmAttributeName))
                        sCAttr.appendChildren (makeCCObjects (cAttr, collapseRule));
                }
                else
                    sCAttr.appendChildren (makeCCObjects (cAttr, attrRule));
            }
        }
    }

    private List<CObject> makeCCObjects (CAttribute cAttr, SOptRule aRule) {
        List<CObject> sChildCObjList = new List<CObject> ();
        CObject sChildCObj;

        // if SIM target type is primitive, apply constraint conversion rule
        // e.g. convert C_TERMINOLOGY_CODE to a C_STRING
        if (PrimitiveTypes.has (aRule.sim_type)) {
            for (cChildObj in cAttr.children) {
                sChildCObj = rules.execute (aRule.constraintConvRule (cChildObj));
                // append the converted constraint, not the original child
                sChildCObjList.append (sChildCObj);
            }
        }
        // otherwise, execute visitor on sub-tree at path to create new C_COMPLEX_OBJECT
        else
            sChildCObjList.extend (visit (aRule.rmPath));

        return sChildCObjList;
    }
}

4. Simple Template Concrete Formats

Note: under development; currently just notes.

4.1. Sparse JSON

xxxx

5. Instance Conversion

This section describes how an instance of a given Simplified Template can be converted to its canonical form at execution time, so as to enable committing to the persistence layer.
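As a purely illustrative first step towards such a conversion, the Python sketch below rebuilds a nested structure from flat path/value pairs. It does not produce canonical RM data and does not consult an OPT. The function name unflatten is hypothetical, and the sketch assumes simSDT-style paths separated by '/' in which no path is both a leaf and a parent of another path; any '|attribute' suffix is simply kept as part of the leaf name.

```python
def unflatten(flat):
    """Rebuild a nested dict from flat 'a/b/c' style paths.

    Assumptions: '/' separates path segments, and no path is both a
    leaf and a prefix of another path.
    """
    root = {}
    for path, value in flat.items():
        *parents, leaf = path.split("/")
        node = root
        for name in parents:
            # create intermediate levels on demand
            node = node.setdefault(name, {})
        node[leaf] = value
    return root


instance = {
    "laboratory_order/language|code": "en",
    "laboratory_order/context/setting|terminology": "openehr",
    "laboratory_order/composer|name": "Dr Joyce Smith",
}
tree = unflatten(instance)
```

A real converter would additionally have to map each simplified node back to its RM type and archetype node identifier using the original OPT, which this sketch leaves out.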