Simplified Data Template (SDT)

Issuer: openEHR Specification Program

Release: ITS-REST latest

Status: DEVELOPMENT

Revision: [latest_issue]

Date: [latest_issue_date]

Keywords: JSON, REST, web template, commit, simplified data, TDS, SDT, ncSDT, simSDT, structSDT

© 2019 - 2023 The openEHR Foundation

The openEHR Foundation is an independent, non-profit foundation, facilitating the sharing of health records by consumers and clinicians via open specifications, clinical models and open platform implementations.

Licence

Creative Commons Attribution-NoDerivs 3.0 Unported. https://creativecommons.org/licenses/by-nd/3.0/

Support

Issues: Problem Reports
Web: specifications.openEHR.org

Amendment Record

Issue | Details | Raiser | Completed

ITS_REST Release 1.0.3

ITS_REST Release 1.0.2

1.0.2 | SPECITS-57. Updating information and links of simplified JSON formats | E Sundvall, S Iancu | 13 Mar 2021

ITS_REST Release 1.0.1

0.7.0 | SPECITS-33. Add Simplified Data Template format specification; Initial writing, adapted from the Marand Better Platform 'Web Templates' specification, EtherCIS documentation and the Ocean Health Systems Template Data Schema. | openEHR SEC | 17 Jul 2019

Acknowledgements

Primary Author

  • Thomas Beale, Ars Semantica (UK); openEHR Foundation Management Board.

Contributors

This specification has benefited from formal and informal input from the openEHR and wider health informatics community. The openEHR Foundation would like to recognise the following for their contributions.

  • Christian Chevalley, Architect, EtherCIS, Thailand

  • Borut Fabjan, Program Manager, Better, Slovenia

  • Bostjan Lah, Senior Architect, Better, Slovenia

  • Ian McNicoll MD, FreshEHR, UK

  • Bjørn Næss, DIPS, Norway

  • Matija Polajnar, Architect, Marand, Slovenia

A significant part of the design ideas and content of this specification was adapted from:

  • the Marand 'Web Templates' specification, kindly provided by Better by Marand d.o.o.;

  • the EtherCIS ECISFLAT format by the EtherCIS community;

  • the XSD-based Template Data Schema (TDS) format developed by Ocean Health Systems.

Trademarks

  • 'openEHR' is a registered trademark of the openEHR Foundation

  • 'OMG' and 'UML' are registered trademarks of the Object Management Group

1. Preface

1.1. Purpose

This document describes a 'simplified template' (ST) format, along with various concrete data formats, intended to be easier to use than the canonical formats for typical developers with minimal exposure to openEHR. These formats are designed to enable easier creation of openEHR content (e.g. Compositions).

1.2. Status

This specification is in the DEVELOPMENT state. The development version of this document can be found at https://specifications.openehr.org/releases/ITS-REST/latest/simplified_data_template.html.

Known omissions or questions are indicated in the text with a 'to be determined' paragraph, as follows:

TBD: (example To Be Determined paragraph)

1.3. Feedback

Feedback may be provided on the openEHR ITS forum.

Issues may be raised on the specifications Problem Report tracker.

To see changes made due to previously reported issues, see the ITS-REST component Change Request tracker.

1.4. Conformance

When used with the openEHR REST API, the calls to be conformance tested are the same as for other openEHR serialisation formats (canonical JSON etc), but with a different representation format (indicated by setting the appropriate Content-Type HTTP header).

TBD: Conformance testing and Content-Type will be further described later.

2. Overview

2.1. Requirements

The default serialised data representations for openEHR content are canonical XML, based on the openEHR RM XSDs, canonical JSON, described by the openEHR JSON schemas, and potentially any other canonical serial format based on the underlying Reference Model (e.g. YAML etc). Here, 'canonical' means any fully-expressed instance data in which:

  • the containment structure follows that of the RM;

  • all RM mandatory fields are present;

  • all attributes are named as per the RM, and

  • all cardinalities respect the RM.

The canonical formats are routinely used by all openEHR implementations implementing the openEHR REST API specification, and in other ways, e.g. for DB dump/load implementation, ETL and so on. However, creating data instances according to these formats is not always straightforward, particularly for developers with minimal exposure to openEHR, and various alternatives have been used in the past to simplify the job of content creation and committal for application developers. These include TDS/TDD (XSD-based Template Data Schema & Document), and so-called EtherCIS Flat JSON, Marand Flat JSON and Structured JSON.

This specification responds to the requirement for a complete representation of all of these serial formats within a common formal framework that will permit further variants in the future as and when needed.

3. Conceptual Approach

3.1. Background

The information models of openEHR are structured in multiple layers, with the primary distinction being between an information model layer (the 'Reference Model' or RM), and domain-level models expressed in archetypes and templates, the latter of which express particular data sets. Each such data set is defined in terms of an openEHR Operational Template (OPT), derived from a source template, and ultimately from particular archetypes, which are themselves constraint models based on the RM, i.e. the 'canonical' model.

The openEHR reference model (RM) and supporting models (BASE component) are designed with two computational goals in mind:

  • data instances (healthcare data) are fully defined and self-standing when shared with a data partner that does not use openEHR;

  • software that implements the model works in regular, expected ways; for example, the structure of the openEHR OBSERVATION, HISTORY and EVENT classes will generically represent any observation, from a single weight measurement to 100,000 samples of complex vital signs data.

The model is accordingly rigorous. However, for some developers who only need to instantiate, commit and/or read relatively limited data sets, the canonical format can be demanding. This is particularly so for situations in which only a few kinds of data are implicated, i.e. a relatively small number of templates, e.g. vital signs data, lab results etc.

The starting point for defining a developer-friendly commit format is therefore to assume that the great majority of applications are typically targeted to one or a few specific data sets, e.g. vital signs, diabetic monitoring, pregnancy plan etc.

3.2. Historical Formats

Template-specificity provides a route to simplification such that each openEHR template can be used to define one or more reasonably simple commit formats. Such formats in historical use are:

  • TDS, or Template Data Schema, an XSD format originally devised by Ocean Health Systems;

  • near-canonical RM Simplified Data Template (ncSDT), based on the ECISFLAT format, originally devised for the EtherCIS project;

  • simplified IM Simplified Data Template (simSDT), based on the FLAT version of the 'web template' format, originally created by Marand for the Better platform (see also their examples page, as well as their tests on GitHub);

  • structured IM Simplified Data Template (structSDT), based on the STRUCTURED version of the 'web template' format, originally created by Marand for the Better platform.

The web template format was originally based on the TDS, with a concrete expression in JSON and using paths, rather than sparse XML.

The simSDT format expressed in JSON is also colloquially known as the "Simplified JSON Format" or "Flat Format". It is also supported and documented by EHRbase.

3.2.1. XML Schema Formats

The TDS format in historical use is defined by an openly available XSLT script that transformed .oet template source files and archetypes to a single XSD for any given template. The transformation flattened various RM structures to make them simpler to understand, and also converted archetype node codes (at-codes of Object nodes) to XSD tag names, e.g. 'serum_sodium'. This enabled developers to easily identify the XML Element for each data item they needed to populate to create a TDS instance document, known as a Template Data Document (TDD). The following is an extract of a TDD, which illustrates developer-friendly tags, in this case in Portuguese language.

<Problema_Diagnóstico>
  <name>
    <value>Problema Diagnóstico</value>
  </name>
  <language>
    <terminology_id>
      <value>ISO_639-1</value>
    </terminology_id>
    <code_string>pt</code_string>
  </language>
  <encoding>
    <terminology_id>
      <value>IANA_character-sets</value>
    </terminology_id>
    <code_string>UTF-8</code_string>
  </encoding>
  <subject xsi:type="oe:PARTY_SELF"/>
  <data>
    <Diagnóstico>
      <name>
        <value>Diagnóstico</value>
      </name>
      <value>
        <oe:value>Hipertensão secundária</oe:value>
        <oe:defining_code>
          <oe:terminology_id>
            <oe:value>CID-10_1998.v1.0.0</oe:value>
          </oe:terminology_id>
          <oe:code_string>I15</oe:code_string>
        </oe:defining_code>
      </value>
    </Diagnóstico>
  </data>
</Problema_Diagnóstico>

3.2.2. JSON Formats

The first of the JSON formats (ncSDT) is an extract from an Operational Template (OPT) that uses AQL-style paths (based on natural language independent codes like at0001), and apart from simplification of the DV_XXX and PARTY_PROXY types, retains the openEHR RM structure. An example is shown below.

{
    "/context/health_care_facility|name":"Northumbria Community NHS",
    "/context/health_care_facility|identifier":"999999-345",
    "/context/start_time":"2015-09-28T10:18:17.352+07:00",
    "/context/end_time":"2015-09-28T11:18:17.352+07:00",
    "/context/participation|function":"Oncologist",
    "/context/participation|name":"Dr. Marcus Johnson",
    "/context/participation|identifier":"1345678",
    "/context/participation|mode":"face-to-face communication::openehr::216",
    "/context/location":"local",
    "/context/setting":"openehr::227|emergency care|",
    "/composer|identifier":"1345678",
    "/composer|name":"Dr. Marcus Johnson",
    "/category":"openehr::433|event|",
    "/territory":"FR",
    "/language":"fr",
    "/content[openEHR-EHR-SECTION.medications.v1]/items[openEHR-EHR-INSTRUCTION.medication.v1]/participation:0":"Nurse|1345678::Jessica|face-to-face communication::openehr::216",
    "/content[openEHR-EHR-SECTION.medications.v1]/items[openEHR-EHR-INSTRUCTION.medication.v1]/participation:1":"Assistant|1345678::2.16.840.1.113883.2.1.4.3::NHS-UK::ANY::D. Mabuse|face-to-face communication::openehr::216",
    "/content[openEHR-EHR-SECTION.medications.v1]/items[openEHR-EHR-INSTRUCTION.medication.v1]/activities[at0001]/timing":"before sleep",
    "/content[openEHR-EHR-SECTION.medications.v1]/items[openEHR-EHR-INSTRUCTION.medication.v1]/activities[at0001]/description[openEHR-EHR-ITEM_TREE.medication_mod.v1]/items[at0001]":"aspirin",
    "/content[openEHR-EHR-SECTION.medications.v1]/items[openEHR-EHR-INSTRUCTION.medication.v1]/activities[at0002]/timing":"lunch",
    "/content[openEHR-EHR-SECTION.medications.v1]/items[openEHR-EHR-INSTRUCTION.medication.v1]/activities[at0002]/description[openEHR-EHR-ITEM_TREE.medication_mod.v1]/items[at0001]":"Atorvastatin"
}
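
The keys and values in the example above appear to follow two conventions: a key is an AQL-style path with an optional `|attribute` suffix, and a coded value is written as `terminology::code|value|`. The following is a minimal sketch of how a client might take such strings apart. Both conventions are inferred only from the example, and the names `split_key`, `parse_coded` and `CodedValue` are hypothetical, not part of any openEHR library.

```python
import re
from typing import NamedTuple, Optional, Tuple


class CodedValue(NamedTuple):
    terminology: str
    code: str
    value: str


def split_key(key: str) -> Tuple[str, Optional[str]]:
    """Split 'path|attribute' into (path, attribute); attribute is None if absent."""
    path, sep, attribute = key.partition("|")
    return path, (attribute if sep else None)


# Pattern inferred from values such as "openehr::433|event|" (an assumption).
_CODED = re.compile(
    r"^(?P<terminology>[^:|]+)::(?P<code>[^:|]+)\|(?P<value>[^|]*)\|$"
)


def parse_coded(text: str) -> Optional[CodedValue]:
    """Return a CodedValue if text looks like 'terminology::code|value|', else None."""
    match = _CODED.match(text)
    return CodedValue(**match.groupdict()) if match else None


path, attribute = split_key("/context/health_care_facility|name")
category = parse_coded("openehr::433|event|")
```

Values that do not match the coded pattern, such as `"FR"`, are left for the caller to treat as plain strings.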

The simSDT JSON format is based on a more radical simplification of the openEHR RM and BASE models, as well as a more programmer-friendly (natural language based) rendering of paths. Under this approach, a lab result data structure could be represented using non-canonical attribute or tag names such as 'serum_sodium', 'serum_potassium', etc (in English or in any other natural language, including script-based languages), instead of each node appearing under the canonical CLUSTER.items attribute with codes like at0001. An example is shown below.

{
  "laboratory_order/_uid": "23d69330-7790-4394-8abc-1455681f6ffa::ydh.code4health.com::1",
  "laboratory_order/language|code": "en",
  "laboratory_order/language|terminology": "ISO_639-1",
  "laboratory_order/territory|code": "GB",
  "laboratory_order/territory|terminology": "ISO_3166-1",
  "laboratory_order/context/_health_care_facility|id": "999999-345",
  "laboratory_order/context/_health_care_facility|id_scheme": "2.16.840.1.113883.2.1.4.3",
  "laboratory_order/context/_health_care_facility|id_namespace": "NHS-UK",
  "laboratory_order/context/_health_care_facility|name": "Northumbria Community NHS",
  "laboratory_order/context/setting|terminology": "openehr",
  "laboratory_order/laboratory_test_request/_uid": "b8c17799-457d-4583-8d85-c369dffacc21",
  "laboratory_order/laboratory_test_request/lab_request/service_requested|code": "444164000",
  "laboratory_order/laboratory_test_request/lab_request/service_requested|value": "Urea, electrolytes and creatinine measurement",
  "laboratory_order/laboratory_test_request/lab_request/service_requested|terminology": "SNOMED-CT",
  "laboratory_order/laboratory_test_request/lab_request/timing": "R5/2015-04-10T00:19:00+02:00/P2M",
  "laboratory_order/laboratory_test_request/lab_request/timing|formalism": "timing",
  "laboratory_order/laboratory_test_request/narrative": "Urea, electrolytes and creatinine measurement",
  "laboratory_order/laboratory_test_request/language|code": "en",
  "laboratory_order/laboratory_test_tracker/time": "2015-04-10T00:19:02.518+02:00",
  "laboratory_order/laboratory_test_tracker/language|code": "en",
  "laboratory_order/laboratory_test_tracker/language|terminology": "ISO_639-1",
  "laboratory_order/laboratory_test_tracker/encoding|code": "UTF-8",
  "laboratory_order/laboratory_test_tracker/encoding|terminology": "IANA_character-sets",
  "laboratory_order/composer|name": "Dr Joyce Smith",
  "ctx/language": "en",
  "ctx/territory": "GB"
}
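
In the flat example above, several keys share a path and differ only in their `|attribute` suffix (for instance `language|code` and `language|terminology`). A consumer will often want those attributes collected per node. The sketch below shows one way to do that; the function name `group_by_node` is hypothetical, and storing an unsuffixed value under the empty attribute name `""` is an arbitrary choice made for this illustration.

```python
from collections import defaultdict
from typing import Any, Dict


def group_by_node(flat: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Group 'path|attribute' entries by path.

    An entry without a '|' suffix is stored under the attribute name ''.
    """
    nodes: Dict[str, Dict[str, Any]] = defaultdict(dict)
    for key, value in flat.items():
        path, _, attribute = key.partition("|")
        nodes[path][attribute] = value
    return dict(nodes)


# A few entries copied from the example above.
flat = {
    "laboratory_order/language|code": "en",
    "laboratory_order/language|terminology": "ISO_639-1",
    "laboratory_order/laboratory_test_request/lab_request/timing": "R5/2015-04-10T00:19:00+02:00/P2M",
    "laboratory_order/laboratory_test_request/lab_request/timing|formalism": "timing",
}
nodes = group_by_node(flat)
```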

Another variant of this simplification is the structSDT JSON format, in which data is represented as nested JSON structures based on paths from the associated Web Template, rather than being flattened into a key-value list. An example is shown below.

{
    "ctx": {
      "language": "en",
      "territory": "SI",
      "composer_name": "matijak_test"
    },
    "vitals": {
      "vitals": [
        {
          "body_temperature": [
            {
              "any_event": [
                {
                  "description_of_thermal_stress": [
                    "Test description of symptoms"
                  ],
                  "temperature": [
                    {
                      "|magnitude": 37.2,
                      "|unit": "°C"
                    }
                  ],
                  "symptoms": [
                    {
                      "|code": "at0.64",
                      "|value": "Chills / rigor / shivering",
                      "|terminology": "local"
                    }
                  ],
                  "time": [
                    "2014-01-22T15:18:07.339+01:00"
                  ]
                }
              ]
            }
          ]
        }
      ],
      "context": [
        {
          "setting": [
            {
              "|code": "238",
              "|value": "other care",
              "|terminology": "openehr"
            }
          ],
          "start_time": [
            "2014-01-22T15:18:07.339+01:00"
          ]
        }
      ]
    }
}
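
Because every repeatable node in the structured example above is wrapped in a JSON array, reading a value means stepping through alternating objects and arrays. The sketch below follows a slash-separated path and returns every value it reaches. The helper `get_values` and its path syntax are hypothetical and serve only to illustrate the shape of the data.

```python
from typing import Any, List


def get_values(document: Any, path: str) -> List[Any]:
    """Follow slash-separated names through nested dicts, fanning out over lists."""
    current = [document]
    for name in path.split("/"):
        reached = []
        for item in current:
            if isinstance(item, dict) and name in item:
                child = item[name]
                # A list holds repetitions of the node; anything else is a single child.
                reached.extend(child if isinstance(child, list) else [child])
        current = reached
    return current


# Reduced copy of the example above.
document = {
    "ctx": {"language": "en", "territory": "SI"},
    "vitals": {
        "vitals": [
            {"body_temperature": [
                {"any_event": [
                    {"temperature": [{"|magnitude": 37.2, "|unit": "°C"}]}
                ]}
            ]}
        ]
    },
}
temperatures = get_values(document, "vitals/vitals/body_temperature/any_event/temperature")
```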

3.3. General Form of an Algorithm

Note
A developer just using the simplified formats as illustrated above in a specific example-based use case does not need to understand the detailed steps of conversions described below.
Platforms based on openEHR can have services that generate example instances based on openEHR templates to make work easier for such developers. The detailed descriptions below are primarily intended for developers creating and maintaining underlying openEHR platforms or dealing with complex use cases.

To make any form of 'simplified format' work, the following requirements must be met:

  • the format makes it possible to abstract away rigorous structural complexity of the canonical model where possible, mainly by making the data less self-standing, and relying more on a schema;

  • the format definition for any given commit data can be completely and routinely machine generated from its canonical definition, i.e. from an openEHR OPT, or other upstream canonical definition;

  • data instances of the simplified format definition can be routinely machine converted to canonical format at execution time.

A generic high-level algorithm for creating both kinds of data template definition from an Operational Template (OPT) is illustrated below.

Figure 1. Scheme for generation of JSON Template definitions

In the above, both the near-canonical data and simplified data template definitions are created via a series of transformations starting with an OPT, followed by RM flattening, and then two stages of JSON format generation. The more heavily simplified form is created via an extra step, in which an original OPT is converted by the sOPT transformer to a simplified OPT (sOPT), which is a regular-structured OPT, but whose underlying reference model is a Simplified Information Model (SIM), based on the canonical Reference Model (RM) and related openEHR Information Models (Base, etc).

TODO: in fact, even the near-canonical data template has to be generated via a minimal sOPT step.

The SIM is approximately a logical sub-set of classes relevant to the definition of EHR committable content, with each class being a potentially simplified form of one or more classes in the RM. The simplifications may consist of:

  • merging of composition (containment) relationships (de-normalisation), which has the effect of reducing data path depth; i.e. in some cases, two RM classes are replaced by a single SIM class, which is relatively easy in the case of 0..1 and 1..1 relationships;

  • stringification of specific attributes, i.e. replacement of (usually low-level) types with String, so that the attribute may contain a string form of a complex object.
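
The two kinds of simplification listed above can be illustrated with a coded text value. In the sketch below, `collapse` merges a 1..1 containment into a single record, and `stringify` replaces the whole object with a string. The classes are reduced stand-ins rather than the RM definitions, the function names are hypothetical, and the `terminology::code|value|` syntax is assumed from the example data earlier in this document, not taken from a normative SIM rule.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class CodePhrase:
    """Reduced stand-in for a code phrase (terminology id shown as a plain string)."""
    terminology_id: str
    code_string: str


@dataclass(frozen=True)
class CodedText:
    """Reduced stand-in for a coded text: a value plus its 1..1 defining code."""
    value: str
    defining_code: CodePhrase


def collapse(coded: CodedText) -> Dict[str, str]:
    """Merge the 1..1 relationship: two objects become one flat record."""
    return {
        "value": coded.value,
        "terminology": coded.defining_code.terminology_id,
        "code": coded.defining_code.code_string,
    }


def stringify(coded: CodedText) -> str:
    """Replace the object with a single string of the assumed form."""
    code = coded.defining_code
    return f"{code.terminology_id}::{code.code_string}|{coded.value}|"


category = CodedText("event", CodePhrase("openehr", "433"))
```

With this input, `collapse(category)` removes one level of path depth, while `stringify(category)` removes the structure entirely, leaving the receiver to parse the string.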

These rules are formalised in the model-to-model Transformation rules shown above. Using the SIM and the rules, a Simplified OPT (sOPT) can be generated from any Operational Template (OPT), and from there, various concrete form JSON Data Templates (JDTs) may be generated, including regular JSON and 'flat form' JSON. Regular JSON is the usual sparse hierarchical structure where hierarchy follows data model. Flat form JSON is legal JSON, extracted from regular JSON by converting it to the logical model of the tuple [path, leaf_data_item:Any], i.e. a logical 2-column table of path/value. In the generation of the regular JSON, paths can be expressed in AQL (standard openEHR) format, or be converted to simplified format according to a small algorithm. The option to do this is shown in the JDT formatter in the diagram above.
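
The relationship between regular and flat form JSON described above can be sketched as a recursive walk that emits one (path, value) row per leaf. This is an illustration only: the function name `flatten` is hypothetical, the `:n` suffix for the n-th repetition is an assumed convention, and keys beginning with `|` are assumed to be attribute suffixes of the enclosing node rather than path segments.

```python
from typing import Any, Iterator, Tuple


def flatten(node: Any, path: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (path, leaf_value) rows for a nested JSON-like structure."""
    if isinstance(node, dict):
        for name, child in node.items():
            if name.startswith("|"):
                # Assumed: '|attr' keys attach to the current path.
                child_path = path + name
            else:
                child_path = f"{path}/{name}" if path else name
            yield from flatten(child, child_path)
    elif isinstance(node, list):
        for index, child in enumerate(node):
            # Assumed convention: ':n' marks the n-th repetition of a node.
            yield from flatten(child, f"{path}:{index}")
    else:
        yield path, node


regular = {
    "vitals": {
        "body_temperature": [
            {"temperature": [{"|magnitude": 37.2, "|unit": "°C"}]}
        ]
    }
}
rows = list(flatten(regular))
flat_json = dict(rows)
```

Each row corresponds to one line of the logical 2-column table, and `dict(rows)` gives the flat JSON object.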

Instances of both JSON regular and flat JDT formats can be created by developers to represent openEHR data to be committed to a system. These will be converted to canonical RM format (also obeying their original OPTs) by the simSDT → RM converter on the server side at data commit time, as shown in the following diagram.

Figure 2. Scheme for conversion of Simplified Template instance to canonical form

Following this scheme, this specification describes the Simplified Information Model (SIM), the Simplified OPT Transformer (sOPT Transformer), the downstream JSON concrete formats, and the ST → canonical instance converter.

4. sOPT Generation

This section describes how a Simplified OPT may be machine-derived from its corresponding canonical OPT definition and the Simplified Information Model (SIM).

4.1. Visitor algorithm

4.1.1. C_COMPLEX_OBJECT

import java.util.ArrayList;
import java.util.List;

class SOptVisitor extends OptVisitor {

    //
    // Visitor function for C_COMPLEX_OBJECT node in OPT
    //
    public void enterCComplexObject (CComplexObject cObj) {
        // obtain or synthesise SIM class name for cObj.rmTypeName
        String simClassName = xxx;

        // create SOpt CComplexObject
        CComplexObject sCObj = new CComplexObject (simClassName);

        // process attributes
        for (CAttribute cAttr : cObj.attributes) {

            // find any rule for this attribute
            if (rules.hasPathRule (cObj.rmTypeName, cAttr.rmAttributeName)) {
                SOptRule attrRule = rules.pathRule (cObj.rmTypeName, cAttr.rmAttributeName);

                // create a new output C_ATTRIBUTE with the SIM attribute name
                // (the original created it only in the collapse case; it is needed in both)
                CAttribute sCAttr = new CAttribute (baseName (attrRule.simPath));

                // deal with collapse case - go through all matching paths
                if (attrRule.collapse) {
                    // attach new C_OBJECTs to sCAttr
                    for (SOptRule collapseRule : rules.matchingPathRules (cObj.rmTypeName, cAttr.rmAttributeName))
                        sCAttr.appendChildren (makeCCObjects (cAttr, collapseRule));
                }
                else
                    sCAttr.appendChildren (makeCCObjects (cAttr, attrRule));

                // assumed step, not in the original: attach the new C_ATTRIBUTE to the output object
                sCObj.attributes.add (sCAttr);
            }
        }
    }

    private List<CObject> makeCCObjects (CAttribute cAttr, SOptRule aRule) {
        List<CObject> sChildCObjList = new ArrayList<CObject> ();

        // if SIM target type is primitive, apply constraint conversion rule
        // e.g. convert C_TERMINOLOGY_CODE to a C_STRING
        if (PrimitiveTypes.has (aRule.sim_type)) {
            for (CObject cChildObj : cAttr.children) {
                CObject sChildCObj = rules.execute (aRule.constraintConvRule (cChildObj));
                // append the converted object, not the source object
                sChildCObjList.add (sChildCObj);
            }
        }
        // otherwise, execute visitor on sub-tree at path to create new C_COMPLEX_OBJECT
        else
            sChildCObjList.addAll (visit (aRule.rmPath));

        return sChildCObjList;
    }
}

5. Simple Template Concrete Formats

Note
under development; currently just notes.

5.1. Sparse JSON

xxxx

6. Instance Conversion

This section describes how an instance of a given Simplified Template can be converted to its canonical form at execution time, so as to enable committing to the persistence layer.
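
As a small illustration of the leaf-level part of such a conversion, the sketch below maps the `|`-prefixed attribute groups seen in the structSDT example to canonical JSON data values. It is not the conversion algorithm of this specification: the function `to_canonical_leaf` is hypothetical, and it guesses the RM type from the attribute names present, whereas a real converter would be expected to take the type from the OPT. The output attribute names (`_type`, `magnitude`, `units`, `defining_code`, `terminology_id`, `code_string`) are intended to follow the canonical JSON rendering of DV_QUANTITY and DV_CODED_TEXT.

```python
from typing import Any, Dict


def to_canonical_leaf(attrs: Dict[str, Any]) -> Dict[str, Any]:
    """Map simplified '|'-prefixed attributes to a canonical JSON data value.

    The RM type is guessed from the attribute names, which is a simplification.
    """
    if "|magnitude" in attrs:
        return {
            "_type": "DV_QUANTITY",
            "magnitude": attrs["|magnitude"],
            "units": attrs["|unit"],
        }
    if "|code" in attrs:
        return {
            "_type": "DV_CODED_TEXT",
            "value": attrs["|value"],
            "defining_code": {
                "_type": "CODE_PHRASE",
                "terminology_id": {
                    "_type": "TERMINOLOGY_ID",
                    "value": attrs["|terminology"],
                },
                "code_string": attrs["|code"],
            },
        }
    raise ValueError(f"unsupported attribute set: {sorted(attrs)}")


quantity = to_canonical_leaf({"|magnitude": 37.2, "|unit": "°C"})
setting = to_canonical_leaf(
    {"|code": "238", "|value": "other care", "|terminology": "openehr"}
)
```

The example also shows why the simplified formats are shorter: three flat attributes expand into three nested canonical objects once type markers and the terminology identifier wrapper are restored.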