Ontology services

From Endeavour Knowledge Base

DRAFT - this article is in a very draft state and must not be relied upon as being up to date or an actual reflection of the world's true state.

The Information model service is a set of services that enable the Discovery common information model to be created, updated, distributed and accessed.

The service includes a set of web applications and a set of APIs to an information model server and an information model distribution service.

All components of the service are open source and available from the Discovery Endeavour GitHub organisations: https://github.com/endeavourhealth and https://github.com/endeavourhealth-discovery

The information model itself is defined internally via the Discovery_information_model_language, which is a group of standards-based languages with some pragmatic additions that enable the service to operate. Externally, the information model can be accessed via standards-based APIs or via Discovery language-based APIs that are supported by the information model server.

The following lists the functional groupings of the IM Service. Some of these use web applications, some use the information model server, and some use the information model distribution service components that maintain locally owned copies of the information model.

 

Common Information model

The Discovery common information model is a group of components that contribute to achieving the following objectives:

  • Enable people who are not technical experts to visualise and understand the structure and content of health records.
  • Enable people who are technical experts to design systems based on the logical structure and content of the model.
  • Enable people to define the data they need in order to perform advanced analytics or decision support, in particular where the definition involves subsumption testing.
  • Enable query authors to have a library of value sets and query definitions for re-use across the health sector.

The model is technology and system independent, so it can be implemented in the technology of choice.

The model language is standards-based, so the model content can be exchanged using standards-based messages. However, Discovery also uses a simpler JSON-based syntax that is easier to comprehend and parse using object-oriented programming languages. In addition, the outputs of the model can be relational, so they can easily be used by implementations based on relational databases (RDBs).

At the core of the model is an ontology. The greater part of the ontology is based on the world-leading health ontology SNOMED CT, which is itself now defined using the same ontology language, OWL 2.

Support for Value sets

The service supports the creation, maintenance and distribution of, and access to, value set definitions and value sets (sometimes referred to as reference sets or concept sets).

A value set definition is a definition of a collection of concept expressions that have been brought together for a particular business purpose. A value set definition is different from a standard concept definition because the meaning of some members of a value set may not be subsumed by the implied meaning of the value set. For example, a value set for gender that consists of male, female, and other is different from the concept of gender, which may include many more specialised variations.

Value set definitions are described in more detail in the Discovery_Information_model_language specification. In summary:

A value set definition is a class that has member properties, and the value of each member is a class expression, i.e. a value set has members that are concept expressions.

A class expression may be a simple pre-coordinated concept, such as SN_1240751000000100 |Coronavirus disease 19 caused by severe acute respiratory syndrome coronavirus 2 (disorder)|, or may be a complex class such as:

Covid 19 {EquivalentTo : Disease
                and (causative_agent some coronavirus-2)
                and (has_pathological_process some infectious_process)}.
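To make the shape of such a complex class concrete, the following is a minimal sketch of how the expression above could be held as data and checked against a concept. It assumes a toy in-memory hierarchy; the `Some`, `And` and `satisfies` names and the concept identifiers `covid_19`, `sars_cov_2` and `viral_infection` are hypothetical and are not part of the Discovery language. It is a naive structural check over asserted attributes, not an OWL reasoner.

```python
from dataclasses import dataclass

# Toy is-a hierarchy: concept -> direct parents (hypothetical identifiers).
PARENTS = {
    "covid_19": {"Disease"},
    "sars_cov_2": {"coronavirus-2"},
    "viral_infection": {"infectious_process"},
}

# Toy attribute assertions: concept -> {property: value concept} (hypothetical).
ATTRIBUTES = {
    "covid_19": {
        "causative_agent": "sars_cov_2",
        "has_pathological_process": "viral_infection",
    },
}


def is_a(concept, ancestor):
    """Return True if concept equals ancestor or is a transitive subtype of it."""
    if concept == ancestor:
        return True
    return any(is_a(parent, ancestor) for parent in PARENTS.get(concept, ()))


@dataclass(frozen=True)
class Some:
    """Existential restriction: `prop` has a value that is a kind of `filler`."""
    prop: str
    filler: str


@dataclass(frozen=True)
class And:
    """Intersection of a named superclass and a tuple of restrictions."""
    superclass: str
    restrictions: tuple


def satisfies(concept, expr):
    """Check the asserted hierarchy and attributes of a concept against expr."""
    if not is_a(concept, expr.superclass):
        return False
    attrs = ATTRIBUTES.get(concept, {})
    return all(
        r.prop in attrs and is_a(attrs[r.prop], r.filler)
        for r in expr.restrictions
    )


# The complex class from the text, written with the hypothetical types above.
COVID_19_EXPR = And(
    "Disease",
    (
        Some("causative_agent", "coronavirus-2"),
        Some("has_pathological_process", "infectious_process"),
    ),
)
```

With this toy data, `satisfies("covid_19", COVID_19_EXPR)` holds, while a concept such as `sars_cov_2` fails because it is not a kind of Disease.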

Value sets themselves are the collections of concepts that are defined by the definition, i.e. a list of concepts.

Value set editor

The value set editor enables people to create and maintain value set definitions which can then be downloaded, accessed via an API or distributed via the information model distribution service. 

Value set run time generator 

The value set generator returns a list of concepts that are defined by a value set definition for use in queries, thus supporting advanced subsumption testing against health care records.
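As an illustration of why a generated list supports subsumption testing, the sketch below expands a set of member concepts to include all of their subtypes and then filters records by simple membership. It is a sketch under stated assumptions: the is-a links, the record layout and the function names `generate_value_set` and `records_in_value_set` are hypothetical, and member expressions are simplified to "a concept and all of its subtypes".

```python
from collections import defaultdict

# Hypothetical is-a links, written as (child, parent).
IS_A_LINKS = [
    ("covid_19", "viral_disease"),
    ("covid_19_pneumonia", "covid_19"),
    ("covid_19_pneumonia", "viral_pneumonia"),
    ("viral_pneumonia", "viral_disease"),
    ("viral_disease", "disease"),
]

# Index the links by parent so subtypes can be walked downwards.
CHILDREN = defaultdict(set)
for child, parent in IS_A_LINKS:
    CHILDREN[parent].add(child)


def generate_value_set(member_concepts):
    """Return the member concepts plus all of their transitive subtypes."""
    result = set()
    stack = list(member_concepts)
    while stack:
        concept = stack.pop()
        if concept not in result:
            result.add(concept)
            stack.extend(CHILDREN.get(concept, ()))
    return result


def records_in_value_set(records, value_set):
    """Keep the records whose concept is a member of the generated value set."""
    return [record for record in records if record["concept"] in value_set]
```

Because the subtypes are resolved once when the value set is generated, the query itself only needs a membership test and does not have to traverse the ontology.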

The value set generator API accepts the IRI of a value set either in full or relative to a baseline IRI, e.g. http://DiscoveryDataService/InformationModel#VSET_Covid1 or simply VSET_Covid1, and returns a list of concepts to be used in the query. The API supports both core concepts and original codes that have been mapped to the core concepts, depending on whether the database uses Discovery concepts or the original codes.
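The two behaviours described above, accepting a full or relative IRI and returning either core concepts or mapped original codes, might look something like the following. This is only a sketch: the function names, the `use_original_codes` flag, the in-memory tables and the codes `LOCAL_A1` and `LOCAL_B7` are made-up placeholders and do not describe the actual API. The baseline IRI is the one used in the example in the text.

```python
# Baseline IRI taken from the example in the text.
BASE_IRI = "http://DiscoveryDataService/InformationModel#"

# Hypothetical generated value sets, keyed by full IRI, holding core concepts.
CORE_MEMBERS = {
    BASE_IRI + "VSET_Covid1": ["SN_1240751000000100"],
}

# Hypothetical map from a core concept to the original codes mapped to it.
ORIGINAL_CODES = {
    "SN_1240751000000100": ["LOCAL_A1", "LOCAL_B7"],
}


def resolve_iri(value_set_id, base_iri=BASE_IRI):
    """Return a full IRI, given either a full IRI or a name relative to base_iri."""
    if "://" in value_set_id:
        return value_set_id
    return base_iri + value_set_id


def get_value_set(value_set_id, use_original_codes=False):
    """Return core concepts, or the original codes mapped to them if requested."""
    core = CORE_MEMBERS.get(resolve_iri(value_set_id), [])
    if not use_original_codes:
        return list(core)
    return [
        code
        for concept in core
        for code in ORIGINAL_CODES.get(concept, [])
    ]
```

A database that stores Discovery concepts would call `get_value_set("VSET_Covid1")`, whereas one that stores original codes would pass `use_original_codes=True`.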

 

Value set distributor

The value set distributor maintains tables of value sets for databases that use local instances of the Discovery information model. 

This is part of the information model distribution service that runs on an application server, and is designed to detect changes to the content of the information model and regenerate the value sets from the value set definitions. The value sets are regenerated whenever a value set definition changes or whenever there is an update to the concepts within the information model.
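The article does not say how changes are detected. One possible approach, shown here purely as an assumption, is to fingerprint each value set definition together with the version of the information model it was last generated against, and to regenerate only when that fingerprint changes. The `ValueSetDistributor` class, its `refresh` method and the `expand` callable are hypothetical names used for illustration.

```python
import hashlib
import json


def fingerprint(definition, model_version):
    """Hash a definition together with the model version it is expanded against."""
    payload = json.dumps(
        {"definition": definition, "model": model_version}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ValueSetDistributor:
    """Keeps generated value set tables in step with definitions and the model."""

    def __init__(self, expand):
        self._expand = expand      # callable: (definition, model_version) -> concepts
        self._fingerprints = {}    # value set IRI -> fingerprint of last generation
        self.tables = {}           # value set IRI -> generated list of concepts

    def refresh(self, iri, definition, model_version):
        """Regenerate if the definition or model changed; return True if regenerated."""
        current = fingerprint(definition, model_version)
        if self._fingerprints.get(iri) == current:
            return False
        self.tables[iri] = self._expand(definition, model_version)
        self._fingerprints[iri] = current
        return True
```

Calling `refresh` repeatedly with an unchanged definition and model version leaves the stored table untouched, so it could be run on a schedule or triggered by an update to the information model.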