Information model meta model

From Endeavour Knowledge Base
===SHACL shapes - data model===
As with the semantic ontology, the data model language borrows its constructs from a W3C standard, in this case the [https://www.w3.org/TR/shacl/ Shapes Constraint Language] (SHACL). SHACL shapes can be represented in any of the RDF supporting languages, such as Turtle or JSON-LD.
The SHACL property, class, node and datatype constructs are the mainstay, as described in the [https://wiki.endeavourhealth.org/Language_grammar_and_syntax language grammar and syntax] article.
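As a minimal sketch of how the property, class, node and datatype constructs fit together, the following Python fragment writes a node shape as a JSON-LD style dictionary and checks cardinality only. The <code>ex:</code> names, the instance layout (a dictionary from property path to a list of values) and the <code>cardinality_violations</code> helper are assumptions made for illustration; they are not taken from the IM and this is not a full SHACL validator.

```python
# Hypothetical node shape written as a JSON-LD style dictionary.
# The "sh:" and "xsd:" terms are W3C vocabulary; every "ex:" name is illustrative.
PATIENT_SHAPE = {
    "@id": "ex:PatientShape",
    "@type": "sh:NodeShape",
    "sh:targetClass": {"@id": "ex:Patient"},
    "sh:property": [
        {   # a literal value constrained by sh:datatype
            "sh:path": {"@id": "ex:dateOfBirth"},
            "sh:datatype": {"@id": "xsd:date"},
            "sh:minCount": 1,
            "sh:maxCount": 1,
        },
        {   # a reference to an instance of another class, via sh:class
            "sh:path": {"@id": "ex:registeredWith"},
            "sh:class": {"@id": "ex:Organisation"},
        },
        {   # a nested structure described by another shape, via sh:node
            "sh:path": {"@id": "ex:homeAddress"},
            "sh:node": {"@id": "ex:AddressShape"},
        },
    ],
}


def cardinality_violations(shape, instance):
    """Return messages for properties that break sh:minCount or sh:maxCount."""
    problems = []
    for prop in shape["sh:property"]:
        path = prop["sh:path"]["@id"]
        values = instance.get(path, [])
        if len(values) < prop.get("sh:minCount", 0):
            problems.append(f"{path}: fewer than {prop['sh:minCount']} value(s)")
        if "sh:maxCount" in prop and len(values) > prop["sh:maxCount"]:
            problems.append(f"{path}: more than {prop['sh:maxCount']} value(s)")
    return problems
```

An instance with exactly one <code>ex:dateOfBirth</code> value passes this check, while an empty instance is reported as missing that property.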
===Query language===
Because the IM itself is held as RDF quads (triples plus a graph), it can be queried using SPARQL for graph queries and Lucene for text queries. The IM manager also supports a full Elastic (AWS OpenSearch) index for advanced text queries.
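To make the idea of a quad concrete, here is a small in-memory sketch in Python. The <code>Quad</code> type, the sample data and the <code>match</code> helper are hypothetical and only imitate the way a SPARQL pattern with a <code>GRAPH</code> clause selects statements; a real deployment would use a triple store rather than a Python list.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    """A triple (subject, predicate, object) plus the graph it belongs to."""
    subject: str
    predicate: str
    obj: str
    graph: str


# Illustrative data only; the identifiers are not real IM entities.
QUADS = [
    Quad("ex:Asthma", "rdfs:label", "Asthma", "ex:coreGraph"),
    Quad("ex:Asthma", "rdfs:subClassOf", "ex:RespiratoryDisorder", "ex:coreGraph"),
    Quad("ex:LocalCode1", "rdfs:label", "Asthma review", "ex:localGraph"),
]


def match(quads, subject=None, predicate=None, obj=None, graph=None):
    """Return the quads that fit the pattern; None is a wildcard, like a SPARQL variable."""
    pattern = (subject, predicate, obj, graph)
    return [q for q in quads if all(p is None or p == v for p, v in zip(pattern, q))]


# Roughly: SELECT ?s ?o WHERE { GRAPH ex:coreGraph { ?s rdfs:label ?o } }
core_labels = match(QUADS, predicate="rdfs:label", graph="ex:coreGraph")
```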
However, because the IM is designed to support queries on actual health records (usually held in relational format), it also has to enable SQL queries.
Both SPARQL and SQL are complex, specialised languages. To program in them, a user must not only be a technical expert but must also have intimate knowledge of the RDF schema and/or the specific target health record schema.
Most people express query concepts in plain language, and most queries can be expressed as logical statements in plain language.
The IM employs a JSON format domain specific language (DSL) that operates as an intermediary between plain language logical query statements and the underlying query languages such as SQL or SPARQL.
Even a DSL is highly technical, so the IM also provides a user friendly application that enables a lay person to construct highly complex health queries without needing to understand the query languages or the technical storage formats of the IM.
The design approach of IM query is to take the various logical plain language patterns and map them directly to a DSL query format, and then to provide direct maps between the JSON query DSL objects and the relevant SPARQL or SQL.
IM query is specified more fully in the article on [https://wiki.endeavourhealth.org/Information_model_query information model query].
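The intermediary role of the DSL can be illustrated with a short Python sketch that takes one JSON style query object and renders it as both SQL and SPARQL. The query layout, the <code>SQL_MAP</code> structural map, the <code>ex:</code> prefix and both translator functions are assumptions invented for this example; the actual IM query DSL is the one defined in the linked article.

```python
import json

# Hypothetical query object for "female patients aged 65 or over".
QUERY = {
    "select": "Patient",
    "where": [
        {"property": "age", "operator": ">=", "value": 65},
        {"property": "gender", "operator": "=", "value": "female"},
    ],
}

# Assumed structural map from model names to one relational schema.
SQL_MAP = {"Patient": "patient", "age": "age_years", "gender": "gender_code"}

ALLOWED_OPERATORS = {"=", ">=", "<=", ">", "<"}


def conditions(query):
    """Yield the query conditions, rejecting operators the sketch does not know."""
    for cond in query["where"]:
        if cond["operator"] not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {cond['operator']}")
        yield cond


def to_sql(query, schema_map):
    """Render the query as a parameterised SQL string plus its parameter list."""
    clauses, params = [], []
    for cond in conditions(query):
        clauses.append(f"{schema_map[cond['property']]} {cond['operator']} ?")
        params.append(cond["value"])
    sql = f"SELECT * FROM {schema_map[query['select']]}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


def to_sparql(query):
    """Render the same query as SPARQL text (the ex: prefix is assumed to be declared)."""
    lines = ["SELECT ?s WHERE {", f"  ?s a ex:{query['select']} ."]
    for cond in conditions(query):
        var = f"?{cond['property']}"
        lines.append(f"  ?s ex:{cond['property']} {var} .")
        # json.dumps quotes strings and leaves numbers bare
        lines.append(f"  FILTER({var} {cond['operator']} {json.dumps(cond['value'])})")
    lines.append("}")
    return "\n".join(lines)
```

The point of the design is that the query author works only with the model names (<code>Patient</code>, <code>age</code>); the schema specific names appear only in the map handed to the translator.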
===Form generator===
To maintain and edit the content of the information model, forms are needed through which its entities can be edited. Examples of things to edit are concepts, value sets, concept sets, queries (of the IM), data models and maps.
The information model language uses an extension to SHACL shapes to enable form generation. Put another way, SHACL shapes define the structure and content of data, whereas a form generator provides instructions on how a particular shape could be hand authored.
The language does not dictate the style or technology used in forms, only the things that a form based application would need when generating components on the screen.
The form generator language vocabulary and how it may be used are documented in the article on the information model [https://wiki.endeavourhealth.org/Form_generator_language form generator language].
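The relationship between a shape and a form can be sketched in Python as follows. This example uses only the standard SHACL annotations <code>sh:name</code> and <code>sh:order</code>; the shape, the <code>form_fields</code> function and the rule for choosing a component are assumptions for illustration and do not reproduce the IM form generator vocabulary, which is described in the linked article.

```python
# Hypothetical shape; paths and values are shortened to plain strings for brevity.
CONCEPT_SHAPE = {
    "@id": "ex:ConceptShape",
    "sh:property": [
        {"sh:path": "ex:status", "sh:name": "Status",
         "sh:class": "ex:Status", "sh:order": 2},
        {"sh:path": "rdfs:label", "sh:name": "Name",
         "sh:datatype": "xsd:string", "sh:minCount": 1, "sh:order": 1},
    ],
}


def form_fields(shape):
    """Derive technology neutral field descriptions from a shape's properties."""
    fields = []
    for prop in sorted(shape["sh:property"], key=lambda p: p.get("sh:order", 0)):
        fields.append({
            "label": prop.get("sh:name", prop["sh:path"]),
            "path": prop["sh:path"],
            # Assumed rule: literals get a text input, references get a picker.
            "component": "text input" if "sh:datatype" in prop else "entity picker",
            "required": prop.get("sh:minCount", 0) > 0,
        })
    return fields
```

The output is a list of plain descriptions rather than rendered widgets, which reflects the point that the language says what a form needs without dictating the technology used to draw it.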

Revision as of 16:02, 20 September 2022

Scope of the meta model

The information model meta model consists of a small set of specialised classes, or 'shapes', made interoperable through the use of semantic web languages that use RDF grammar and syntax.

The classes cover the following areas:

  1. An ontology of terminology concepts, which is a vocabulary and set of definitions of the concepts used in healthcare or, put more simply, a vocabulary of health. The ontology is made up of the world's leading ontology, Snomed-CT, with a London extension, supplemented with additional concepts for data modelling. Whether a concept is a Snomed-CT concept, a London extension concept, or a legacy code based concept (e.g. ICD10, EMIS local codes or Read codes), the class structure is the same.
  2. A data model, which is a set of classes and properties, using the vocabulary, that represent the data and relationships published by live systems to a data service that uses these models. The data model is part of the overall ontology and there is a seamless boundary between the data model shapes and the terminology concepts, as both use RDF. The data model meta model uses SHACL shapes and thus conforms to the W3C SHACL recommendation.
  3. A library of business specific concept sets and value sets, which are expression constraints on the ontology for the purpose of query. This uses a specialised "query" or "set definition" class and encompasses the Snomed-CT expression constraint language, with which it is compatible via a simple translation API.
  4. A catalogue of reference data such as geographical areas, organisations and people derived and updated from public resources.
  5. A library of queries for querying and extracting instance data from reference data or health records. This uses a more extended class model than item 3, but fundamentally it is a set definition that is mapped to mainstream query languages to get actual data.
  6. A set of maps, comprising mappings between published concepts and the core ontology as well as structural mappings between submitted data and the data model. This uses a context class.
  7. A set of form generators, used by the IM application to create forms for creating and editing IM entities, which are instances of a meta model class.
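
The notion in item 3 of a concept set as an expression constraint on the ontology can be sketched in Python. The hierarchy below is hypothetical and the <code>descendants_or_self</code> function is an illustrative evaluation of the kind of constraint that the Snomed-CT expression constraint language writes as <code><< concept</code>; it is not the IM set definition class or its translation API.

```python
# Hypothetical "is a" hierarchy: each concept maps to its direct parents.
IS_A = {
    "ex:AllergicAsthma": ["ex:Asthma"],
    "ex:Asthma": ["ex:RespiratoryDisorder"],
    "ex:Bronchitis": ["ex:RespiratoryDisorder"],
    "ex:RespiratoryDisorder": [],
}


def descendants_or_self(concept, is_a):
    """Return the concept together with every concept that is, transitively, a kind of it."""
    members = {concept}
    changed = True
    while changed:  # repeat until a pass adds nothing new
        changed = False
        for child, parents in is_a.items():
            if child not in members and any(p in members for p in parents):
                members.add(child)
                changed = True
    return members


asthma_set = descendants_or_self("ex:Asthma", IS_A)
```

Because the set is defined by a constraint rather than an explicit list, its membership follows the ontology: a new child concept added under <code>ex:Asthma</code> would be picked up on the next evaluation.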