Health information model: Difference between revisions

From Endeavour Knowledge Base
 
This article describes the approach taken to producing information models, including what they are, what their purpose is, and what their technical components are.


Information modelling is the set of processes by which representations of data relationships are created, maintained and queried.
The article does not include the content of any particular model.  


The Discovery models are designed both for human visualisation and for computers to use.  
== What is the health information model (IM) and what is its purpose? ==
The IM is a representation of the meaning and structure of data held in the electronic records of the health and social care sector, together with libraries of query, value sets, concept sets, data set definitions and mappings.


Systems that use the models can use any or all of three approaches:
The main purpose is to bridge the chasm that exists between highly technical digital representations and plain language so that when questions are asked of data, a lay person could use plain language without prior knowledge of the underlying models.


# Direct use of the model data content as a database (or as a set of files that can populate a database via a script)
It is a computable abstract logical model, not a physical structure or schema. "Computable" means that operational software operates directly from the model artefacts, as opposed to using the model for illustration purposes. As a logical model it models data that may be physically held in a variety of different types of data stores, including relational or graph data stores. Because the model is independent of the physical schemas, the model itself has to be interoperable and free of proprietary lock-in.
# Use via a set of APIs (both local and remote) designed to provide access to the data within the model, or to trigger outputs of the model for approach 1
# Use of the information model technologies themselves via the published open source code


__TOC__
The IM is a broad model that integrates a set of different approaches to modelling using a common ontology. The components of the model are:


== Information model functions ==
# A set of ontologies, which provide a vocabulary and definitions of the concepts used in healthcare, or more simply put, a vocabulary of health. The ontologies are made up of the world's leading ontology Snomed-CT, with a London extension, and various code based taxonomies (e.g. ICD10, Read, supplier codes and local codes).
Information models have 4 core functional requirements internal to the model: '''description of the model, validation of model content, population of the model, and query of the model.''' In support of query there is also the need to support '''inference''', which generates new insights that were not necessarily authored.
# A common data model, which is a set of classes and properties, using the vocabulary, that represent the data and relationships published by live systems. Note that this data model is NOT a standard model but a collated set of entities and relationships, bound to the concepts and based on real data, that are mapped to a common model.
# A library of business specific concept value sets (also known as reference sets), which are expression constraints on the ontology for the purpose of query.
# A catalogue of reference data such as geographical areas, organisations and people derived and updated from public resources.
# A library of data set (query) definitions for querying and extracting instance data from the information model, reference data, or health records.
# A set of maps providing mappings between published concepts and the core ontology, as well as structural mappings between submitted data and the data model.
# An open source set of utilities that can be used to browse, search, or maintain the model.


In addition the information model must support the same 4 core functional requirements on actual health data that is modelled.
<br />


* '''Description of the model.''' There is little point in having a model unless it can be described and understood. Knowing what is in a model is a pre-requisite to using it. For example, there is no point in trying to find out if a patient record indicates whether or not they have diabetes if the model doesn't include the ability to record it. In order to understand a model, two techniques are required: diagrammatic representation and human readable text representation. A model must support both.  
== Model building blocks and visualisation ==
The model consists of classes, sets and objects that are instances of classes. 
[[File:Ethnicity.jpg|thumb|Ethnicity]]
Objects can act as objects in their own right (e.g. an instance of chest pain) or may also act as classes (e.g. the class of objects that are chest pain). Likewise sets have members that are objects, and the objects may also act as classes or sets. For example, a set for the 2011 Ethnicity census will contain a member object of "British", which is also a set with members such as English and so on.


* '''Data Validation''' is essential for consistent business operations. Data models, user input forms, and data set specifications are designed to enable data collections to be validated. Maintaining a standard for data collection is essential. For example, if you have a patient record in front of you, you will likely need to know the patient's approximate age. To work this out, the date of birth must be recorded. Validating that the date of birth can be and has been recorded is important. However, if ''more than one'' date of birth was recorded for the same patient, the record would be less valuable. Thus a modelling language must include the ability to '''constrain''' data models to suit particular business needs.
The model itself is stored as an RDF based  knowledge graph, which means it is implementable in any mainstream Graph database technology. There are no vendor specific extensions to RDF.  


* '''Population of the model.''' It is impractical to build model content from scratch and likewise virtually impossible to populate instances with existing data without some manipulation. An information model must contain the ability to model mappings between currently held data and model conformant data.
In line with the RDF standard, all persistent types, classes, property identifiers and object value identifiers are uniquely named using Internationalized Resource Identifiers (IRIs). In most cases the identifiers are externally provided (e.g. Snomed-CT identifiers), whilst in others they have been created for a particular model. Organisations that author elements of the models use their own identifiers.


* '''Enquiry (or query)''' is necessary to generate information from data. There is little point in recording data unless it can be interrogated and the results of the interrogation acted upon. Thus a modelling language must include the ability to query the data as defined or described, including the use of inference rules to find data that was recorded in one context for use in another.
From a data modelling perspective the arrangements of types may be referred to as archetypes, which are conceptually similar to FHIR profiles. In the semantic web world they would be considered "shapes". There are an unlimited number of these which frees the model from any particular conventional relational database schema. Inheritance of types is supported which enables broad classifications of types and re-usability.  


* '''Inference''' is pivotal to decision making. For example, if you are about to prescribe a drug containing methicillin to a patient, and the patient has previously stated that they are allergic to penicillin, it is reasonable to infer that if they take the drug an allergic reaction might ensue, and thus another drug is prescribed. Thus a modelling language must include the ability to infer things and classify things so that safe decisions can be made.
The parts of the model that model terminology concepts and those that model data use slightly different grammars, in keeping with their different purposes. The information model language describes the differences.


== Model structure ==
The models can be viewed in their raw technical form (in JSON or Turtle) or through the information model viewer, an online tool at [https://im.endeavourhealth.net/#/ Information model directory] 
A model must be built from some structure, using some tools or processes to build it. This section describes the nature of the structure that makes up the information model. The tools used to build the model include an information modelling language, which is described separately.
[[File:IM classes.png|thumb|IM main structural types as classes]]
A model must itself have a model, i.e. a meta model that describes the types of things a model is made up of.


The Discovery model can be described as an "'''Object Role Model (ORM) that includes an Ontology as one of the roles'''". It can also be described and implemented as a small number of main classes with each main class covering a role type.
== Information model language ==


Both perspectives
''Main article'' [[Health Information modelling language - overview|information modelling language]] describes the language in more detail.


The roles themselves can be categorised into types. As one of the types of roles includes ontological axioms, this means that the model can operate both with the open world assumption (as required by the semantic web) and a closed world assumption (as required by the business of healthcare).
The semantic web approach is adopted for the purposes of identifiers and grammar. In this approach, data can be described via a plain language grammar consisting of a subject, a predicate, and an object: a triple, with an additional context referred to as a graph or RDF data set. The theory is that all health data can be described in this way (with predicates being extended to include functions).
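
As a minimal sketch of the triple-plus-graph idea, the following Python fragment holds a few statements as (subject, predicate, object, graph) tuples and matches them against a pattern. All identifiers with the <code>ex:</code> prefix, and the <code>Quad</code> and <code>match</code> names, are hypothetical and are not taken from the information model itself.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    """A triple (subject, predicate, object) plus its graph context."""
    subject: str
    predicate: str
    object: str
    graph: str


# Hypothetical statements, for illustration only.
quads = [
    Quad("ex:patient1", "ex:hasCondition", "ex:ChestPain", "ex:gpRecords"),
    Quad("ex:patient1", "ex:dateOfBirth", "1970-01-01", "ex:gpRecords"),
    Quad("ex:ChestPain", "rdfs:subClassOf", "ex:Pain", "ex:ontology"),
]


def match(quads, subject=None, predicate=None, obj=None, graph=None):
    """Return the quads that fit the pattern; None acts as a wildcard."""
    pattern = (subject, predicate, obj, graph)
    return [
        q for q in quads
        if all(p is None or p == v for p, v in zip(pattern, q))
    ]


conditions = match(quads, subject="ex:patient1", predicate="ex:hasCondition")
```

Restricting the <code>graph</code> argument limits a query to one context, for example only the statements that came from a particular record source.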


The main types are illustrated in the right hand image.
However, the semantic web languages are highly complex, and a set of more pragmatic approaches is taken for the more specialised structures.


Interaction between the model and the external world is undertaken via the [[Health Information modelling language - overview|Discovery information model language]] (or alternatively a set of W3C recommended languages). These are described separately, but the language is built from RDF triples, applying the W3C language grammars and vocabularies of OWL2, SHACL and SPARQL, with support for GraphQL.
The consequence of this approach is that W3C web standards can be used, such as the [[wikipedia:Resource_Description_Framework|Resource Description Framework]] or RDF. This sees the world as a set of triples (subject/predicate/object) with some things named and some things anonymous. Systems that adopt this approach can exchange data in a way that preserves the semantics. Whilst RDF is an incredibly arcane language at a machine level, the things it can describe can be very intuitive when represented visually. In other words the information modelling approach involves an RDF graph.


The following sections briefly describe the various model classes illustrated above. The IM language specification provides more insight into the details of the classes.
In addition to semantic web languages, other commonly used languages are supported to make the model accessible to more people. For example, the Snomed-CT expression constraint language (ECL) is a common way of defining concept sets. ECL is logically equivalent to a closed world query on an open world OWL ontology. The IM language uses the semantic language of SPARQL together with entailment to model ECL, but ECL can be exported or used as input as an alternative.
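
To illustrate what a closed world query over subclass statements looks like, here is a small Python sketch of a "descendants or self" constraint, the kind of set that ECL expresses with its <code>&lt;&lt;</code> operator. The concept names and the <code>descendants_or_self</code> function are assumptions made for this example and do not reflect real Snomed-CT content or the IM implementation.

```python
from collections import defaultdict

# Hypothetical subclass statements as (child, parent) pairs.
SUBCLASS_OF = [
    ("ex:Type1Diabetes", "ex:Diabetes"),
    ("ex:Type2Diabetes", "ex:Diabetes"),
    ("ex:Diabetes", "ex:EndocrineDisorder"),
    ("ex:Asthma", "ex:RespiratoryDisorder"),
]


def descendants_or_self(concept, subclass_of):
    """Return the concept plus everything stated to be below it."""
    children = defaultdict(set)
    for child, parent in subclass_of:
        children[parent].add(child)

    result, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current not in result:
            result.add(current)
            stack.extend(children[current])
    return result


diabetes_set = descendants_or_self("ex:Diabetes", SUBCLASS_OF)
```

The query is "closed world" in the sense that only the subclass statements actually present are followed; anything not stated is treated as not being a member.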


<br />
<br />
=== Concept ===
All things that can be referenced via an identifier can be thought of as a concept. Even the classes and structures of the information model themselves are concepts.
A concept is defined as an ‘abstract idea’ or ‘general understanding of something’ and this meaning is preserved in the modelling language. It is one of the few abstract classes in the information model. This means that there is no actual object of 'type concept' unless it is also a type of some subtype of a concept.
Types of concepts include: Class, Property, Shape, Value Set, Data type, Query, Collection, term, and annotation. Each of these is specialised in its function and properties, inheriting the core properties of a concept and specialising by extension.
'''Use of aliases.''' Aliases enable properties and classes to be used in their alias form. See language specification for how context is used to provide aliases to enable key terms to be used in business processes without the inconvenience of using IRIs. Thus these sections use aliases for convention, the aliases themselves defined as aliases to concepts.
In this section aliases that are instantiated as a number of alternatives are enclosed in { } e.g. {axiom} in a class refers to subclass, equivalent, or disjoint
A concept  also comes with a fixed set of annotation properties that can be relied on to be present or have null values
{| class="wikitable"
|+
!Property alias
!Cardinality
!Type
!Description
|-
|'''iri'''
|1
|IRI
|an international identifier, the format as described within the language specification
|-
|'''status'''
|1
|Status type
|A status concept representing the activity status of the concept, e.g. active or inactive
|-
|'''name     '''
|1
|String
|This is the full name of the concept (or preferred term in Snomed-CT.) In OWL2 this is a label annotation
|-
|'''description'''
|1
|String
|A plain language meaning of the concept, and how it may be used
|-
|'''version'''
|1
|integer
|The version in which this concept was first created
|-
|'''code'''
|0..1
|String
|If the concept has a code, the code assigned to this concept by the original creator, e.g. a Snomed-CT, READ2, ICD10, OPCS, local or auto generated code
|-
|'''scheme'''
|0..1
|IRI
|If the concept has a code, the code scheme assigned to this code, the scheme itself being an IRI
|-
|'''termKey'''
|0..*
|String
|A number of keys used to link to the concept.  Should not be confused with a term concept which is an alternative term linked to a concept
|-
|'''annotation'''
|0..*
|Annotation
|Concepts may have additional informative simple string properties used for a variety of business purposes
|-
|'''alias'''
|0..*
|String
|Aliases for this concept i.e. reserved terms within the context of a particular application that is implementing the information model and wishes to use aliases rather than the IRIs
|}
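
The property table above can be sketched as a Python data class, with the cardinalities mapped onto required fields (1), optional fields (0..1) and lists (0..*). This is an illustrative assumption about how the core concept properties might be held in code, not the published object model; the rule that a code must come with a scheme is also an assumption, and the example IRIs and code are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Concept:
    # Cardinality 1: required.
    iri: str
    status: str
    name: str
    description: str
    version: int
    # Cardinality 0..1: optional.
    code: Optional[str] = None
    scheme: Optional[str] = None
    # Cardinality 0..*: any number of values.
    termKey: List[str] = field(default_factory=list)
    annotation: List[str] = field(default_factory=list)
    alias: List[str] = field(default_factory=list)

    def __post_init__(self):
        # Assumption for this sketch: a code is only meaningful with its scheme.
        if self.code is not None and self.scheme is None:
            raise ValueError("a coded concept needs a scheme IRI")


# Hypothetical concept, for illustration only.
chest_pain = Concept(
    iri="http://example.org/im#ChestPain",
    status="Active",
    name="Chest pain",
    description="Pain felt in the chest",
    version=1,
    code="X123",
    scheme="http://example.org/scheme#local",
)
```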
=== Ontological Class ===
An ontological class is an extension of the concept class and is used as the main means for defining semantic concepts that are classes of objects for use in healthcare records.
The difference between an ontological class (often referred to as an OWL class) and a simple concept is that it can be semantically defined by the use of class axioms. Class axioms such as subclass or equivalent classes are used for reasoning (inferencing and classification) and enable the information model to be queried using subsumption queries.
{| class="wikitable"
|+
!Property alias
!Cardinality
!Type
!Description
|-
|'''type'''
|1
| Class
| A type of concept that is a class for the purposes of ontological definition
|-
|'''{axiom}'''
|0..*
|Class axiom
|An axiom normally used to define a class e.g. Subclass, equivalent class, disjoint classes
|-
|}
=== Ontological property ===
An ontological property is an extension of the concept class and is used as the main means for defining semantic concepts that are used as properties or predicates. The difference between this and a class is that properties themselves cannot have properties. Nevertheless the use of property axioms to define properties makes them very powerful. Sub properties are included in subsumption tests on classes as well as linking properties that operate in reverse directions in a graph.
<br/>
{| class="wikitable"
|+
!Property alias
!Cardinality
!Type
!Description
|-
|'''type'''
|1
| Property
| A type of concept that is a property or predicate, and used throughout the model. This includes most of the reserved tokens used in the IM classes and the IM language itself
|-
|'''{axiom}'''
|1..*
|Property axiom
|An axiom normally used to define a property including domain, range, sub property, and whether transitive, or inverse of etc
|-
|}
=== Value set ===
This is a specialised class that defines and holds a collection of concepts, those concepts not necessarily being related by subclass relationships.
A value set member is a definition of a concept which is defined by a simple form of query called an expression constraint, which is a definition of a collection of classes as described below. A value set without members can be used as a means of inferencing subclass value sets.
A value set like any other class can be ontologically defined e.g. as a subclass of another value set and thus if a value set has members defined then the subclass members would be subsets of the superclass members. Conversely, when selecting a value set that has no members in a query, and that value set has subclasses, then the inference engine would include all the members of all of the subclasses.
{| class="wikitable"
|+
!Property alias
!Cardinality
!Type
!Description
|-
|'''type'''
|1
| "ValueSet"
| A type of concept that is used as a value set, a specialised class for defining concepts in a query
|-
|'''subClassOf'''
|0..*
|Class Expression
|A value set may be a subclass of another value set
|-
|'''member'''
|0..1
| Expression constraint
| A specialised form of query that defines a collection of classes that would be subsumed when the query is run
|-
|'''expansion'''
|0..*
| IRI
| A list of concept identifiers produced by inference from the member definition, or in the absence of a member definition, simply a list of concepts
|}
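
The inference rule described above (a value set without members takes the members of its subclass value sets) can be sketched as follows. For brevity the members are given as explicit sets of concept identifiers rather than as expression constraints, and every name in the example is hypothetical.

```python
# Hypothetical value sets: name -> (parent value set or None, defined members).
VALUE_SETS = {
    "ex:EthnicityCensus2011": (None, set()),  # no members of its own
    "ex:British": (
        "ex:EthnicityCensus2011",
        {"ex:English", "ex:Scottish", "ex:Welsh"},
    ),
    "ex:Irish": ("ex:EthnicityCensus2011", {"ex:IrishEthnicity"}),
}


def expand(value_set, value_sets):
    """Return the expansion of a value set.

    A value set with defined members expands to those members. A value set
    without members expands to the union of its subclass value sets.
    """
    _, members = value_sets[value_set]
    if members:
        return set(members)

    expansion = set()
    for name, (parent, _) in value_sets.items():
        if parent == value_set:
            expansion |= expand(name, value_sets)
    return expansion


census_expansion = expand("ex:EthnicityCensus2011", VALUE_SETS)
```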
=== Expression constraint ===
An expression constraint is a specialised query describing a set of concepts using class expressions and Boolean logic, i.e. it describes the attributes that a concept must have to be included in or excluded from the set. Because ontological classes are defined as being things with certain properties and values or value types (attribute value pairs), the definition can include simple constraints such as something being a subclass of another concept, using inference.
An expression constraint is a String of one of the IM supported language grammars e.g. Discovery expression constraint, SPARQL fragment, Snomed-CT Expression constraint language
=== Collection===
A collection is a constraint of a concept in that the concept type is one of the collection subtypes.
Collection subtypes are either lists or sets, and lists may be ordered or unordered. Lists such as folders are used to initiate user navigation of the model. Collection contents have no inherent relationship with a collection concept. N.B. Collections in this context should not be confused with the collection construct used in the language.
A collection is defined by its type e.g. a folder
=== Shape (data model)===
A shape extends a concept and is the mainstay of the data modelling section of the information model.
A shape dictates the properties and values used in a set of business oriented data stores, i.e. it defines and constrains the properties for particular purposes. On the surface a shape seems similar to a semantic class, i.e. the properties described in a shape are all properties that one would expect to be properties of a class (e.g. date of birth as a property of a person). However, a shape is designed to be more prescriptive and "closed world". Consequently a shape can be used to define a database schema, to define a message schema, and to validate data content.
The shape constraint language is the major part of the modelling language and based on the W3C SHACL language.
A shape is a shape of something, i.e. it has a target class or a target property, and thus a shape is by default a class shape or a property shape. The connection between a shape and a corresponding class (e.g. shape of a person) brings together the semantic ontology and the data modelling. Class shapes will contain property shapes, which may be embedded in line rather than distinct.
{| class="wikitable"
|+
!Property alias
!Cardinality
!Type
!Description
|-
|'''type'''
|1
| Shape
| This is a shape class
|-
|'''{targetClass}'''
|0..1
| Class
| What the shape is a shape of. In health information model the alias 'record of' is likely to be used
|-
|'''targetSubjectOf'''
|0..1
| Property
| the predicate that this shape is the subject of
|-
|'''targetObjectOf'''
|0..1
| Property
| the predicate that this shape is the object of.
|}
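
A rough sketch of how a closed world shape could validate data content is given below, using the date of birth example from earlier in the article. The <code>Shape</code>, <code>PropertyShape</code> and <code>validate</code> names are invented for this illustration; they borrow the minimum and maximum count idea from SHACL but are not the IM shape language or a SHACL implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PropertyShape:
    path: str
    min_count: int = 0
    max_count: Optional[int] = None  # None means unbounded


@dataclass
class Shape:
    target_class: str
    properties: List[PropertyShape]


def validate(record: Dict[str, list], shape: Shape) -> List[str]:
    """Return a list of violations; an empty list means the record conforms."""
    errors = []
    for prop in shape.properties:
        count = len(record.get(prop.path, []))
        if count < prop.min_count:
            errors.append(
                f"{prop.path}: expected at least {prop.min_count}, found {count}"
            )
        if prop.max_count is not None and count > prop.max_count:
            errors.append(
                f"{prop.path}: expected at most {prop.max_count}, found {count}"
            )
    # Closed world: properties the shape does not declare are rejected.
    declared = {p.path for p in shape.properties}
    for key in record:
        if key not in declared:
            errors.append(f"{key}: not permitted by this shape")
    return errors


# Hypothetical shape of a person: exactly one date of birth, at least one name.
person_shape = Shape(
    target_class="ex:Person",
    properties=[
        PropertyShape("ex:dateOfBirth", min_count=1, max_count=1),
        PropertyShape("ex:name", min_count=1),
    ],
)

two_birth_dates = {
    "ex:dateOfBirth": ["1970-01-01", "1971-01-01"],
    "ex:name": ["A. Patient"],
}
problems = validate(two_birth_dates, person_shape)
```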
=== Classification ===
The models include modular classifications of concepts. The classification modules are either generated from the ontology via classifiers (which are functions of reasoners) or have been incorporated as handcrafted classifications. Examples of ontology generated classifications are the Snomed-CT "ISA" hierarchy and the Discovery health classification. Examples of handcrafted classification modules are ICD10 and Read. The main difference between the two is that concepts in classifications generated from an ontology subsume their descendants as proper subtypes, whereas handcrafted classifications may include subcategories that are inconsistent.
To illustrate the difference between an ontology and a classification, let us say that we state in an ontology that "ALL THINGS THAT BARK ARE DOGS".
Let us then say we go to the beach at Ravenscar on the North East coast of England and hear a bark. We see the animal at a distance. We ask the computer what it is. The computer, using the generated classification, would automatically classify that animal as a DOG (because it barks).
However, as we get closer we see that it is something else. The ontology is clearly incorrect. Consequently we amend the ontology to state that there is such a thing as "AN ANIMAL THAT BARKS" and that "A DOG IS A SUBCLASS OF AN ANIMAL THAT BARKS". We also state that such a thing exists as an "ANIMAL" and that "AN ANIMAL THAT BARKS IS A SUBCLASS OF AN ANIMAL". Now, when asking the computer what the animal is, the computer knows only that it is an animal that barks but ''does not know what it is''. 
We then amend the ontology to state that such a thing exists called a SEAL and that a "SEAL BARKS". We also author the ontology to say that "AN ANIMAL THAT BARKS IS EQUIVALENT TO A THING THAT BARKS", i.e. by definition if it barks it is an animal that barks. Now the computer automatically classifies the seal to state that "A SEAL IS AN ANIMAL THAT BARKS" (because it is a thing that barks and must therefore be an animal that barks). It will then be found when searching for types of animals that bark, things that bark, and things that are animals, i.e. the seal MUST be an animal because all things that bark are animals that bark and an animal that barks is a subclass of an animal. The human has not needed to find the category; the reasoner has done it, automatically creating the classification from the properties of the thing on the beach.
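
The final state of the barking example can be approximated in a few lines of Python. This toy classifier is only a sketch of the idea and is far simpler than a real OWL reasoner: the equivalence axiom is reduced to "having the property is sufficient for membership", and all of the dictionary and function names are made up for the illustration.

```python
# Asserted subclass axioms: class -> direct superclasses.
SUBCLASS_OF = {
    "Dog": {"AnimalThatBarks"},
    "AnimalThatBarks": {"Animal"},
}

# Equivalence axiom, simplified: anything with the property belongs to the class.
DEFINED_BY_PROPERTY = {"AnimalThatBarks": "barks"}

# Asserted properties of things.
PROPERTIES = {"Seal": {"barks"}}


def classify(concept):
    """Return every superclass of the concept, asserted or inferred."""
    inferred = set(SUBCLASS_OF.get(concept, set()))
    for cls, prop in DEFINED_BY_PROPERTY.items():
        if prop in PROPERTIES.get(concept, set()):
            inferred.add(cls)

    # Follow subclass links upwards (transitive closure).
    stack = list(inferred)
    while stack:
        current = stack.pop()
        for parent in SUBCLASS_OF.get(current, set()):
            if parent not in inferred:
                inferred.add(parent)
                stack.append(parent)
    return inferred


seal_classes = classify("Seal")
```

Nobody asserted that a seal is an animal; the result follows from the property "barks" together with the axioms, which is the point the example makes about generated classifications.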
== Information model storage architecture ==
[[File:IM logical object model.png|thumb]]
An information model is an abstract representation of data, but an information model must have content and that content must be stored.
Data cannot be stored conceptually, only physically, and thus there must be a relationship between the abstract model and a physical store.
In the information model services, the abstract model is instantiated as a set of objects of classes, the data elements of those classes holding the subject, predicate and object structures. In reality those objects, together with translation and data access methods, are implemented in a programming language, e.g. Java.
The physical store is currently held in a triple-like relational database accessed by a relational database engine, but could easily be stored as a native graph.
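
As a hedged illustration of what a "triple-like" relational store might look like, the sketch below keeps statements in a single table using Python's built-in <code>sqlite3</code> module. The table name, column names and data are assumptions chosen for the example and do not describe the actual Discovery schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE im_triple (
        subject_iri   TEXT NOT NULL,
        predicate_iri TEXT NOT NULL,
        object_value  TEXT NOT NULL,
        graph_iri     TEXT NOT NULL
    )
    """
)

# Hypothetical statements, for illustration only.
conn.executemany(
    "INSERT INTO im_triple VALUES (?, ?, ?, ?)",
    [
        ("ex:ChestPain", "rdfs:subClassOf", "ex:Pain", "ex:ontology"),
        ("ex:ChestPain", "rdfs:label", "Chest pain", "ex:ontology"),
        ("ex:Pain", "rdfs:label", "Pain", "ex:ontology"),
    ],
)


def objects_of(subject, predicate):
    """Return the objects of all statements with this subject and predicate."""
    rows = conn.execute(
        "SELECT object_value FROM im_triple "
        "WHERE subject_iri = ? AND predicate_iri = ? "
        "ORDER BY object_value",
        (subject, predicate),
    )
    return [row[0] for row in rows]


parents = objects_of("ex:ChestPain", "rdfs:subClassOf")
```

Because every statement has the same four columns, the same rows could be loaded into a native graph store without changing their meaning.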
The model can then be used as the source and target of the exchange of data, the latter using a language interoperating via a set of APIs
This can be visualised as in the diagram on the right. It can be seen that the inner physical store, is accessed by an object model layer, which is itself accessed by APIs using modelling language grammar and syntax. The diagram shows the main grammars supported by the Discovery information model, including the Discovery information modelling language grammar itself.
Support for the main languages means that a Discovery information model instance has 2 levels of  separation of concerns from the languages used to exchange data, and the underlying model store. There is thus no reason to buy into Discovery language to use the information model.
Likewise, an implementation of objects that hold data in a form that is compatible with a particular data model and ontology module, can be accessed using the same language.
This makes the language just as useful for exchanging query definitions, value sets as well as useful for actual query of health record stores via interpreters.
The remainder of this article describes the language itself, starting with some high level sections on the components, and eventually providing a specification of the language and links to technical implementations, all of which are open source.
=== Data models and value sets ===
Business domain data models are modules that define relationships in the context of a particular business or set of businesses and include the health data models.
A model is only relevant for a particular set of business purposes and there is no single model that can accommodate all business purposes, although common information models can accommodate quite broad purposes. A reasonably well understood set of business purposes is referred to in these topics as a "business domain" or "domain of interest", and a particular information model is designed to cover a business domain.
Examples of data model modules are the core Discovery health data model, and the PRSB core record model. Related message models such as FHIR profiles or openEHR archetypes are examples of business domain specific data models. Specialist data models may exist for particular business purposes such as cancer data set definitions or be more general, such as the Discovery common data model.  The thing to note about the Discovery data models is that all concepts are defined in the semantic ontology.
The Discovery common data  model (a broad model for each domain) will generally include data relationships needed by many domains, arranged in a way that inconsistency or unreliability is avoided.
Data models also define the expected values of properties. Sometimes these values are class statements (e.g. has colour -> Colour, meaning that the colour of something is a colour) and more often they are sets of concepts brought together for business purposes, i.e. value sets.
=== Query Library ===
A variation on a conventional ontology is that concept properties can also be defined according to functional definitions, as expressed in query language. The model contains a library of query definitions that like Data models, are usually business specific.
For example, one would expect a record of a person's religion to be the concept of "Person's religion". In a data model this might be defined as "Person -> has religion -> Religion", i.e. the value of the religion property of a person is a religion, e.g. Hindu.
However, the same person may have many religions recorded about them during their lifetime. Thus the definition of a person's religion is more likely to be "the latest religion for this person" and that is a functional property.
A query is essentially a definition of a concept using both standard and functional properties, together with the related value sets with the addition of instructions as to what properties to return.
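
The "latest religion for this person" example can be sketched as a small function over recorded entries. The record layout, identifiers and the <code>latest_religion</code> function are hypothetical and only show what is meant by a functional property.

```python
from datetime import date

# Hypothetical recorded entries, for illustration only.
religion_entries = [
    {"person": "ex:patient1", "religion": "ex:Christian", "recorded": date(1990, 5, 1)},
    {"person": "ex:patient1", "religion": "ex:Hindu", "recorded": date(2015, 3, 12)},
    {"person": "ex:patient2", "religion": "ex:Buddhist", "recorded": date(2001, 7, 9)},
]


def latest_religion(person, entries):
    """Functional property: the most recently recorded religion, or None."""
    own = [e for e in entries if e["person"] == person]
    if not own:
        return None
    return max(own, key=lambda e: e["recorded"])["religion"]


current = latest_religion("ex:patient1", religion_entries)
```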
== Ontologies and modules ==
The Discovery common information model can be thought of as an ontology of ontologies. More precisely though it should be considered as an ontology consisting of a set of [[wikipedia:Ontology_modularization|ontology module]]<nowiki/>s with each module defined according to business needs. The principle of concept sharing, whereby one concept is identified once across the entire set of domains,  suggests that there is a single ontology. However a data model that is specified for a particular business purpose may have different class structures from another business purpose even though they share the same semantic definitions.
For example, take the idea of recording information about a blood pressure. This is an example of a component in a data model. In general practice, it would be common to record a systolic and a diastolic blood pressure and thus the component would consist of 3 classes. However, in a specialist research study involving different interpretations of blood pressures, including perhaps the size or nature of the cuff, or the exact position of the patient, this component may be more complex.
This is addressed by modularisation where the axioms that define the classes belong to a particular model, even though the property domains and their ranges are shared across the ontology. This is analogous to the idea of templates derived from subsets of archetypes. The difference is that there is no "super-archetype" requiring international agreement on the items in the archetype, but instead there is a demand that the same identifier of the diastolic blood pressure record class is used throughout, even though the class definition is business specific.

Latest revision as of 10:28, 21 August 2022

This article describes the approach taken to producing information models, including what they are, what their purpose is, and what their technical components are.

The article does not include the content of any particular model.

What is the health information model (IM) and what is its purpose?

The IM is a representation of the meaning and structure of data held in the electronic records of the health and social care sector, together with libraries of query, value sets, concept sets, data set definitions and mappings.

The main purpose is to bridge the chasm that exists between highly technical digital representations and plain language so that when questions are asked of data, a lay person could use plain language without prior knowledge of the underlying models.

It is a computable abstract logical model, not a physical structure or schema. "Computable" means that operational software operates directly from the model artefacts, as opposed to using the model for illustration purposes. As a logical model it models data that may be physically held in a variety of different types of data stores, including relational or graph data stores. Because the model is independent of the physical schemas, the model itself has to be interoperable and free of proprietary lock-in.

The IM is a broad model that integrates a set of different approaches to modelling using a common ontology. The components of the model are:

  1. A set of ontologies, which provide a vocabulary and definitions of the concepts used in healthcare, or more simply put, a vocabulary of health. The ontologies are made up of the world's leading ontology Snomed-CT, with a London extension, and various code based taxonomies (e.g. ICD10, Read, supplier codes and local codes).
  2. A common data model, which is a set of classes and properties, using the vocabulary, that represent the data and relationships published by live systems. Note that this data model is NOT a standard model but a collated set of entities and relationships, bound to the concepts and based on real data, that are mapped to a common model.
  3. A library of business specific concept value sets (also known as reference sets), which are expression constraints on the ontology for the purpose of query.
  4. A catalogue of reference data such as geographical areas, organisations and people derived and updated from public resources.
  5. A library of data set (query) definitions for querying and extracting instance data from the information model, reference data, or health records.
  6. A set of maps providing mappings between published concepts and the core ontology, as well as structural mappings between submitted data and the data model.
  7. An open source set of utilities that can be used to browse, search, or maintain the model.


Model building blocks and visualisation

The model consists of classes, sets and objects that are instances of classes.

Ethnicity

Objects can act as objects in their own right (e.g. an instance of chest pain) or may also act as classes (e.g. the class of objects that are chest pain). Likewise, sets have members that are objects, and those objects may also act as classes or sets. For example, a set for the 2011 Ethnicity census will contain a member object "British", which is also a set with members such as "English", and so on.
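As an illustration of a member that is itself a set, the following is a minimal sketch in Python. The `Node` class and its methods are hypothetical names chosen for this example and are not part of the IM; only the census, "British" and "English" labels come from the text above.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical model element that may act as an object, a class, or a set."""
    name: str
    members: list[Node] = field(default_factory=list)

    def is_set(self) -> bool:
        # A node acts as a set as soon as it has members of its own.
        return bool(self.members)

    def all_members(self) -> list[str]:
        # Walk nested sets depth-first and collect every member name.
        names: list[str] = []
        for member in self.members:
            names.append(member.name)
            names.extend(member.all_members())
        return names


english = Node("English")
british = Node("British", members=[english])          # a member that is also a set
census_2011 = Node("2011 Ethnicity census", members=[british])

print(census_2011.all_members())  # ['British', 'English']
```

Here "British" is returned as a member of the census set while also reporting `is_set()` as true, which mirrors the dual role described above.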

The model itself is stored as an RDF-based knowledge graph, which means it is implementable in any mainstream graph database technology. There are no vendor-specific extensions to RDF.

In line with the RDF standard, all persistent types, classes, property identifiers and object value identifiers are uniquely named using Internationalized Resource Identifiers (IRIs). In most cases the identifiers are externally provided (e.g. Snomed-CT identifiers), whilst others have been created for a particular model. Organisations that author elements of the models use their own identifiers.
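To make the identifier scheme concrete, the sketch below expands and compacts prefixed names. The `http://snomed.info/id/` namespace follows the published Snomed-CT URI standard; the `ex` prefix, its namespace, and the function names are assumptions made purely for illustration and are not the IM's actual namespaces.

```python
# Hypothetical prefix map. Only the Snomed-CT namespace is taken from a
# published standard; "ex" stands in for an authoring organisation's namespace.
PREFIXES = {
    "sn": "http://snomed.info/id/",
    "ex": "http://example.org/im#",
}


def expand(prefixed_name: str) -> str:
    """Turn a prefixed name such as 'sn:29857009' into a full identifier."""
    prefix, _, local = prefixed_name.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix}")
    return PREFIXES[prefix] + local


def compact(iri: str) -> str:
    """Shorten a full identifier when its namespace is known, else return it unchanged."""
    for prefix, namespace in PREFIXES.items():
        if iri.startswith(namespace):
            return f"{prefix}:{iri[len(namespace):]}"
    return iri


# 29857009 is used here as the Snomed-CT code for chest pain.
print(expand("sn:29857009"))  # http://snomed.info/id/29857009
```

Because each authoring organisation uses its own namespace, two organisations can reuse the same local name without any clash in the full identifier.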

From a data modelling perspective the arrangements of types may be referred to as archetypes, which are conceptually similar to FHIR profiles. In the semantic web world they would be considered "shapes". There is no limit on the number of these, which frees the model from any particular conventional relational database schema. Inheritance of types is supported, which enables broad classifications of types and re-usability.
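Inheritance of types can be sketched as follows. The shape names (`Entry`, `Observation`) and their properties are invented for this example and do not describe the actual IM data model; the point is only that a sub-type reuses the properties of its parent.

```python
# Hypothetical shapes: each lists its own properties and an optional parent shape.
SHAPES = {
    "Entry": {"parent": None, "properties": ["patient", "effectiveDate"]},
    "Observation": {"parent": "Entry", "properties": ["concept", "value"]},
}


def properties_of(shape_name: str) -> list[str]:
    """Return inherited properties first, followed by the shape's own properties."""
    shape = SHAPES[shape_name]
    inherited = properties_of(shape["parent"]) if shape["parent"] else []
    return inherited + shape["properties"]


print(properties_of("Observation"))
# ['patient', 'effectiveDate', 'concept', 'value']
```

Adding a further shape is a matter of adding an entry to the dictionary, with no change to any physical schema.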

The parts of the model that model terminology concepts and the parts that model data use slightly different grammars, in keeping with their different purposes. The information model language describes the differences.

The models can be viewed in their raw technical form (in JSON or Turtle) or through the information model viewer at the online tool, the Information model directory.

Information model language

The main article, information modelling language, describes the language in more detail.

The semantic web approach is adopted for the purposes of identifiers and grammar. In this approach, data can be described using a plain language grammar consisting of a subject, a predicate, and an object (a triple), with an additional context referred to as a graph or RDF data set. The theory is that all health data can be described in this way (with predicates being extended to include functions).

However, the semantic web languages are highly complex, so a set of more pragmatic approaches is taken for the more specialised structures.

The consequence of this approach is that W3C web standards can be used, such as the Resource Description Framework (RDF). RDF sees the world as a set of triples (subject/predicate/object), with some things named and some things anonymous. Systems that adopt this approach can exchange data in a way that preserves the semantics. Whilst RDF is an arcane language at a machine level, the things it describes can be very intuitive when represented visually. In other words, the information modelling approach involves an RDF graph.
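The subject, predicate, object and graph structure can be sketched in plain Python without any RDF library. All identifiers below (`ex:patient1`, `ex:gpRecord` and so on) are hypothetical, and the `match` function is a simplified stand-in for a real graph query rather than the IM's own API.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    """A triple plus the graph (context) it belongs to."""
    subject: str
    predicate: str
    obj: str
    graph: str


# Hypothetical statements; the identifiers are illustrative only.
QUADS = [
    Quad("ex:patient1", "ex:hasObservation", "ex:obs1", "ex:gpRecord"),
    Quad("ex:obs1", "ex:concept", "sn:29857009", "ex:gpRecord"),
    Quad("sn:29857009", "rdfs:label", "Chest pain", "ex:snomedGraph"),
]


def match(s: Optional[str] = None, p: Optional[str] = None,
          o: Optional[str] = None, g: Optional[str] = None) -> list[Quad]:
    """Return every quad that agrees with the pattern; None acts as a wildcard."""
    pattern = (s, p, o, g)
    return [
        quad for quad in QUADS
        if all(want is None or want == got for want, got in zip(pattern, quad))
    ]


# Everything stated in the (hypothetical) GP record graph:
for quad in match(g="ex:gpRecord"):
    print(quad.subject, quad.predicate, quad.obj)
# ex:patient1 ex:hasObservation ex:obs1
# ex:obs1 ex:concept sn:29857009
```

Restricting the pattern by graph shows how the additional context lets the same store hold statements from several sources side by side.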

In addition to semantic web languages, other commonly used languages are supported to make the model accessible to more people. For example, the Snomed-CT expression constraint language (ECL) is a common way of defining concept sets. ECL is logically equivalent to a closed world query on an open world OWL ontology. The IM language uses the semantic language of SPARQL together with entailment to model ECL, but ECL can be exported or used as input as an alternative.
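The idea of a closed world query with entailment can be sketched as a "descendants or self" lookup, which corresponds to the `<<` operator in ECL. The hierarchy below is a small invented example, not the actual Snomed-CT hierarchy, and the function is a plain Python stand-in for what a SPARQL query with subclass entailment would return.

```python
# Hypothetical "is a" statements: child concept -> list of direct parents.
IS_A = {
    "Pain": ["Clinical finding"],
    "Chest pain": ["Pain"],
    "Headache": ["Pain"],
    "Precordial pain": ["Chest pain"],
}


def descendants_or_self(concept: str) -> set[str]:
    """Return the concept plus everything that is transitively a kind of it.

    Only stated relationships are followed, so the result is a closed world
    answer: a concept with no stated path to `concept` is simply excluded.
    """
    result = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for child, parents in IS_A.items():
            if current in parents and child not in result:
                result.add(child)
                frontier.append(child)
    return result


print(sorted(descendants_or_self("Chest pain")))
# ['Chest pain', 'Precordial pain']
```

A concept set defined this way picks up new child concepts automatically whenever the ontology gains a new "is a" statement beneath the chosen concept.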