Ontology services: Difference between revisions

From Endeavour Knowledge Base
The information model service is a set of services that enables the Discovery common information model to be created, updated, distributed and accessed.
The service includes a set of web applications and a set of APIs to an [[IM_Server|information model server]] and an [[IM_Distribution_service|information model distribution service]].
All components of the service are open source and available in the Discovery and Endeavour GitHub organisations at [https://github.com/endeavourhealth https://github.com/endeavourhealth] and [https://github.com/endeavourhealth-discovery https://github.com/endeavourhealth-discovery].
The following lists the functional groupings of the IM service. Some of these use web applications, some use the information model server, and some use the information model distribution service components that maintain locally owned copies of the information model.
== Information model applications ==
__TOC__


The information model service supports a number of applications for people who need to maintain, or simply access, the information model content. The applications are a suite of modules that can operate separately or together. For example, the IM manager brings together a number of modules.
Each module contains components for use in other applications.
 
=== Information model Manager ===
 
The information model manager is the main application for maintaining the information model content. It provides viewing and authoring capabilities for the content of the [[Common_information_model|Discovery common information model]], which is the model of data that the information model services use, underpinned by a standards-based [[Information modelling language|modelling language]].
 
The manager consists of a number of modules varying from a viewer through to the authoring of advanced ontological concepts. Modules include:
 
* IM viewer, which enables a view of the ontology and/or the data model and all artefacts created as part of the model, including a view of all the data maintained by the editors
* Concept expression editor, which enables the creation of new concepts and expressions and the definition of their meaning
* Data model editor, which maintains one or more data models
* Value set editor, which maintains any number of sets of concepts or value sets
* Data set editor, often used by the data sharing manager or data project manager to create data set definitions
* Map maker, used to maintain maps between database schemas and the common model or maps to message types
* Workflow manager, used to manage tasks associated with the above
 
=== Value set editor ===
The editor supports the creation of, maintenance of, distribution of, and access to, value set definitions and value sets, sometimes referred to as reference sets or concept sets.
 
A value set definition is a definition of a collection of concept expressions that have been brought together for a particular business purpose. A value set definition is different from a standard concept definition because the meaning of some members of a value set may not be subsumed by the implied meaning of the value set. For example, a value set for gender that consists of male, female and other is different from the concept of gender, which may include many more specialised variations.
 
Value set definitions are described in more detail in the [[Discovery_Information_model_language.|Discovery information model language]] specification. In summary:
 
A value set definition is a class that has member properties, and the value of each member is a class expression, i.e. a value set has members that are concept expressions.
 
A class expression may be a simple pre-coordinated concept, such as SN_1240751000000100 |Coronavirus disease 19 caused by severe acute respiratory syndrome coronavirus 2 (disorder)|, or may be a complex class such as:
<pre> Covid 19:  EquivalentTo: Disease
                              and (causative_agent some coronavirus-2)
                              and (has_pathological_process some infections_process)
</pre>
 
Value sets themselves are the collections of concepts that are defined by the definition i.e. a list of concepts.
 
The value set editor enables people to create and maintain value set definitions, which can then be downloaded, accessed via an API or distributed via the information model distribution service.
 
===Ontology editor ===
 
==== Reasoners and classifiers ====
The information model services include the use of reasoners that operate on the semantic ontology subset of the information model. A classifier is a specialised form of reasoner that uses [[Subsumption test|subsumption testing]] on the ontological entities to generate inferred relationships, which can then be used in run time query or to generate transitive closure tables.


The purposes of the reasoners are:


* To help make sure that the ontologies are logically consistent. Whilst most problems with an ontology result from faulty axioms authored by humans, reasoners help to check that the axioms within an ontology are logically consistent.


* To generate inferred relationships from stated axioms, in particular the generation of the "is a" relationships from subclass axioms and equivalent class axioms, as well as the mapped to, replaced by and replaced relationships between active and inactive concepts or legacy concepts.


Reasoners are accessed via the Java [https://github.com/owlcs/owlapi/wiki OWL API], which itself supports a number of reasoners such as HermiT and ELK.


In addition, a [[Ontology classifier|simple ontology classifier]] is used to generate inferred relationships from the stated axioms, so that subsumption testing of the kind used in standard query can operate easily.
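
As a rough illustration of what a transitive closure table derived from inferred "is a" relationships might contain, the following Python sketch computes every (descendant, ancestor) pair from a set of direct relationships. The function name and the concept names are hypothetical and are not taken from the information model services; a real classifier would operate on the full ontology rather than on a hand-written edge list.

```python
from collections import defaultdict


def transitive_closure(is_a):
    """Return every (descendant, ancestor) pair implied by direct "is a" edges.

    `is_a` is an iterable of (child, parent) pairs.
    """
    parents = defaultdict(set)
    for child, parent in is_a:
        parents[child].add(parent)

    closure = set()
    for concept in list(parents):
        # Walk upwards from each concept, collecting all reachable ancestors.
        stack = list(parents[concept])
        seen = set()
        while stack:
            ancestor = stack.pop()
            if ancestor in seen:
                continue
            seen.add(ancestor)
            closure.add((concept, ancestor))
            stack.extend(parents.get(ancestor, ()))
    return closure


# Hypothetical concept names, for illustration only.
edges = [
    ("covid_19", "coronavirus_infection"),
    ("coronavirus_infection", "viral_disease"),
    ("viral_disease", "disease"),
]
closure = transitive_closure(edges)
```

With such a table, a subsumption test of the form "is covid_19 a kind of disease?" reduces to a membership check on the pair ("covid_19", "disease").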


<br />
=== Information model libraries ===
IM repositories hold the content of the information model. There are various categories of repositories that align with the model manager modules and the modelling language. The types of repositories include:


* The Ontology library, which holds all of the concepts and their definitions from a multiplicity of taxonomies and classifications
* Expression library, which holds a set of re-usable expressions that have been created from the concepts.
* Value set library, which holds collections of concept definitions for use in query and reporting
* Data model library, which holds data models.
* Data set library, which holds data sets
* Data map library, which holds collections of maps between data models, object models and related artefacts
* Query library, which holds collections of queries designed to query data models in order to produce data sets for reports, or provide knowledge to aid decisions.


=== Information service APIs ===
As well as the information model manager and various modules, the service provides a suite of APIs to support the use of data held within the information model libraries.
 
==== Get run time value set ====
 
The value set generator returns the list of concepts defined by a value set definition for use in queries, thus supporting advanced [[Subsumption_test|subsumption testing]] against health care records. A run time value set is effectively the same as the output of descendants from a transitive closure table, but includes indicators as to the nature of the leaf concepts (e.g. whether they derive from mapped, replaced or replaced by relationships).
 
The [[Value_set_generator_API|value set generator API]] accepts the IRI of a value set either in full or relative to a baseline IRI, e.g. [http://DiscoveryDataService/InformationModel#VSET_Covid1 http://DiscoveryDataService/InformationModel#VSET_Covid1] or simply VSET_Covid1, and returns a list of concepts to be used in the query. The API supports both core concepts and original codes that have been mapped to the core concepts, depending on whether the database uses Discovery concepts or actual original codes.
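
The handling of full and relative IRIs can be sketched as follows. This is only an illustration: the function name is hypothetical, and the baseline IRI is assumed from the example above rather than taken from the API itself.

```python
# Baseline IRI assumed from the example in the text.
BASE_IRI = "http://DiscoveryDataService/InformationModel#"


def resolve_value_set_iri(identifier, base_iri=BASE_IRI):
    """Expand a relative value set identifier to a full IRI.

    Identifiers that already contain a scheme are returned unchanged.
    """
    if "://" in identifier:
        return identifier
    return base_iri + identifier


full = resolve_value_set_iri("VSET_Covid1")
```

Both forms of the identifier then resolve to the same IRI before the value set is looked up.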
 
==== Get Map APIs ====
The information model server provides a number of APIs and utilities that support the mapping of original fields and values into the common information model.
 
The [[Data map API|data mapping APIs]] article describes the use of the mapping server's APIs to support inbound and outbound data transformation processes that involve a map between two data models. The [[map maker manager]] article describes the way that the map maker manager operates when authoring maps.
 
For example, there is a set of [[Mapping hint algorithms]] that are machine-assisted approaches to improving the speed and accuracy of mapping.
=== Distribution services ===
As well as being accessible via APIs and applications, the information model services provide distribution facilities for content of the IM for use in subscriber databases or subscriber applications. All content of the information model can be distributed in both bulk and delta form.


==== Value set distributor ====
 
The value set distributor maintains tables of value sets for databases that use local instances of the Discovery information model.
 
This is part of the [[IM_Distribution_service|information model distribution service]] that runs on an application server, and is designed to detect changes to the content of the information model and regenerate the value sets from the value set definitions. The value sets are regenerated whenever a value set definition changes or whenever there is an update to the concepts within the information model.
 
=== Map manager ===
This is an application that creates and maintains maps between objects for use in transformation and data access.
 
The maps thus generated are made available through the map APIs.
<br />
==UPRN and address matching==
''main article'' [https://wiki.discoverydataservice.org/UPRN_-_address_matching_service UPRN  allocation and address matching service]
 
Unique property reference numbers are special identifiers of properties.


The Discovery information model supports the mapping of health-related addresses to addresses provided by an authoritative organisation, those addresses being a gold standard for pointing to a UPRN.

Latest revision as of 17:24, 7 December 2023

The ontology services can be categorised into two main forms:

  1. An application that enables people to view and maintain ontologies, value sets, information models and query definitions
  2. A set of (REST) APIs so that the resources can be extracted and used by, i.e. integrated into, other systems

All are built using open source code available in a set of repositories at https://github.com/endeavourhealth-discovery

Ontology and information model application

The live manager is accessible at https://im.endeavourhealth.net/#/

The development version is accessible at https://dev.endhealth.co.uk/#/ and is generally a month or two ahead of the live version.

Default contents include:

  • The major health ontologies such as SNOMED CT (UK version and London extensions), considered as the "core ontology"
  • The standard code-based taxonomies such as OPCS-4 and ICD-10, and the maps between them and the core ontology
  • The local code schemes such as EMIS, TPP and some hospital trusts, and legacy code taxonomies such as Read 2 and CTV3, and the maps between them and the core ontology
  • A proven real-world common data model based on FHIR-like types (extended to real data), modelled as an RDF graph, with maps to the model from source data formats used in primary, community, acute and urgent care providers that used the London One London Level 2 Discovery data service
  • Libraries of live, in-use value sets bound to the fields within the model
  • A library of value sets used in queries, including the core SNOMED CT UK reference sets and QOF concept sets
  • Example libraries of feature and query definitions used as actual operational queries for data on health records

The application also provides functionality to create and maintain libraries of organisation-specific ontologies, value sets, data models and queries.

Information service APIs

As well as the information model manager and various modules, the service provides a suite of APIs to support the use of data held within the information model libraries.

Web APIs

The web APIs provide the ability to obtain resources, such as value sets, using FHIR APIs.
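
As a minimal sketch of what a FHIR-style request for a value set could look like, the Python below builds the URL for the standard FHIR ValueSet `$expand` operation. The page does not state which FHIR operations the service supports, so the use of `$expand`, the base URL and the value set IRI are all assumptions made for illustration.

```python
from urllib.parse import urlencode

# Hypothetical FHIR base URL; replace with the address of a real server.
FHIR_BASE = "https://example.org/fhir"


def value_set_expand_url(value_set_iri, fhir_base=FHIR_BASE):
    """Build the URL for a FHIR ValueSet $expand request by canonical URL."""
    query = urlencode({"url": value_set_iri})
    return f"{fhir_base}/ValueSet/$expand?{query}"


# Illustrative value set IRI, not a documented identifier.
iri = "http://DiscoveryDataService/InformationModel#VSET_Covid1"
url = value_set_expand_url(iri)
```

The IRI is percent-encoded so that its `#` fragment is sent as part of the `url` parameter rather than being interpreted by the client.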

Distribution services

As well as being accessible via APIs and applications, the information model services provide distribution facilities for content of the IM for use in subscriber databases or subscriber applications. All content of the information model can be distributed in both bulk and delta form.
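
The difference between bulk and delta distribution can be illustrated with a short Python sketch that compares two snapshots of content and returns only what has changed. The snapshot structure (a mapping from entity IRI to definition), the function name and the sample data are hypothetical, not the distribution service's actual format.

```python
def compute_delta(previous, current):
    """Compare two snapshots mapping an entity IRI to its definition.

    Returns the entries a subscriber would need to add, update or delete
    to move from `previous` to `current`.
    """
    added = {k: v for k, v in current.items() if k not in previous}
    updated = {
        k: v for k, v in current.items() if k in previous and previous[k] != v
    }
    deleted = sorted(k for k in previous if k not in current)
    return {"added": added, "updated": updated, "deleted": deleted}


# Hypothetical snapshots, for illustration only.
old = {"im:A": "v1", "im:B": "v1", "im:C": "v1"}
new = {"im:A": "v1", "im:B": "v2", "im:D": "v1"}
delta = compute_delta(old, new)
```

A bulk distribution would send the whole of `new`, whereas a delta distribution would send only the three changed entries.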

Set distributor

The concept and value set distributor maintains tables of value sets for databases that use local instances of the Discovery information model. 

This is part of the information model distribution service that runs on an application server, and is designed to detect changes to the content of the information model and regenerate the value sets from the value set definitions. The value sets are regenerated whenever a value set definition changes or whenever there is an update to the concepts within the information model.
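
One way such change detection could work is sketched below in Python. The page does not describe the actual mechanism, so fingerprinting each definition together with a concept release label is an assumption, and all names and data are hypothetical.

```python
import hashlib
import json


def fingerprint(definition, concept_version):
    """Hash a value set definition together with the concept release it uses."""
    payload = json.dumps(
        {"definition": definition, "concepts": concept_version}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def sets_to_regenerate(definitions, concept_version, last_fingerprints):
    """Return the IRIs of value sets whose definition or concept release changed."""
    return sorted(
        iri
        for iri, definition in definitions.items()
        if last_fingerprints.get(iri) != fingerprint(definition, concept_version)
    )


# Hypothetical definitions and release labels, for illustration only.
definitions = {"VSET_A": {"members": ["c1"]}, "VSET_B": {"members": ["c2"]}}
seen = {iri: fingerprint(d, "2023-11") for iri, d in definitions.items()}

definitions["VSET_B"]["members"].append("c3")  # one definition is edited
after_edit = sets_to_regenerate(definitions, "2023-11", seen)
after_release = sets_to_regenerate(definitions, "2023-12", seen)
```

Editing one definition flags only that value set, while a new concept release flags every value set, which matches the two regeneration triggers described above.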

Underpinning modelling architectures

All of the services are based on RDF graphs, with classes defined in SHACL shapes from which code classes are generated and made available as Java or TypeScript.

Details of the underlying models can be viewed on the extensive set of Information model pages.