Mapping and matching concepts: Difference between revisions

Latest revision as of 10:53, 26 October 2022

Mapping approach

From an ontological perspective it could be said that 'A' in the context of 'C', when manipulated by some function, is probably equivalent to 'B'. In OWL this would be denoted as an equivalent class axiom.

From a slightly more process based perspective it could be said that 'A' with context 'C' -> maps to (via a function) -> 'B'

A and B may be entities, properties, objects or literals. A and B may also be collections, in which case there is no one-to-one map.

When transforming data from one form (the source) to another form (the target), a process of transformation is undertaken. That transformation uses software.

There are generally two broad approaches to transformation:

a) Each transformation from A to B for each context C is written, or at least generated, in source code, which is executed by some form of transformation engine.

b) A transformation engine includes software that uses a 'Transform Map', which is a separate resource describing the way in which A in context C maps to B.

As one would expect, in an information model the latter approach is used, with the transform maps having a visualisation capability so that humans can debate, decide, and assure that the equivalence exists to the extent that is safe.
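As a rough illustration of approach (b), the sketch below keeps the map as data, separate from the engine that applies it. The property names, the context label and the `TRANSFORM_MAP` structure are all hypothetical and are not taken from the information model itself.

```python
from typing import Any, Callable

# Hypothetical transform map: (source property, context) -> (target property, function).
# In approach (b) this would live in a separate resource rather than in the engine.
TRANSFORM_MAP: dict[tuple[str, str], tuple[str, Callable[[Any], Any]]] = {
    ("dob", "patient"): ("dateOfBirth", str),
    ("sex", "patient"): (
        "gender",
        lambda v: {"M": "male", "F": "female"}.get(v, "unknown"),
    ),
}


def transform(source: dict[str, Any], context: str) -> dict[str, Any]:
    """Apply every map entry that matches a source property in the given context."""
    target: dict[str, Any] = {}
    for prop, value in source.items():
        entry = TRANSFORM_MAP.get((prop, context))
        if entry is None:
            continue  # no map for this property in this context, so it is skipped
        target_prop, fn = entry
        target[target_prop] = fn(value)
    return target


result = transform({"dob": "1970-01-01", "sex": "F", "shoe_size": 9}, "patient")
```

Because the engine only reads the map, a new source format could in principle be supported by reviewing and adding map entries rather than by changing code.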

Thus, the role of the information model is to provide a library of maps, using a meta model of classes that define the maps.

In line with the rest of the information model, these classes are defined using the same information model language, with a set of meta classes that are conceptually equivalent to transformation languages.

More specifically, the mapping meta classes have been informed by the FHIR Mapping Language, the RML semantic web based mapping language, and the simpler concept maps provided by many sources.

Types of mapping can be broadly grouped into three categories:

1. Things which map entities and properties from one type to another.
2. Things which map codes or concepts or text.
3. Things which map property values, using the source properties and values as the key for a lookup against a reference entity.
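The third category can be pictured as a keyed lookup. The following is a minimal sketch only; the `REFERENCE_ENTITIES` table, the property name and the code value are invented for illustration.

```python
from typing import Optional

# Hypothetical reference entities, keyed by (source property, source value).
REFERENCE_ENTITIES: dict[tuple[str, str], dict[str, str]] = {
    ("org_code", "A123"): {"type": "Organisation", "name": "Example Practice"},
}


def resolve_reference(prop: str, value: str) -> Optional[dict[str, str]]:
    """Use the source property and its value as the key to find a reference entity."""
    return REFERENCE_ENTITIES.get((prop, value))


found = resolve_reference("org_code", "A123")
missing = resolve_reference("org_code", "ZZZ")
```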

Mapping entities and properties

This includes:

a) The provision of definitions that determine how one set of source entities (or resources, messages or tables) maps to a target entity.

b) The provision of a reference transform engine that illustrates how these maps can be used in the real world.

Mapping codes and taxonomies

The new generation of health record management systems tends towards the recording of concepts, with the objective being for the record entry to closely match the idea behind the entry. These types of concepts can be called term based concepts, as the term is the thing that describes the idea.

A modern term based concept is defined in relation to other concepts by a set of assertions indicating whether the concept is equivalent to, or a subtype of, a set of other concepts. A collection of concepts defined in this way is normally referred to as an ontology.

Snomed-CT is the world's largest ontology of healthcare term based concepts and is authored using a form of description logic, which enables a reasoner to automatically classify a concept according to its properties.

The idea of codes originated from a different starting point. The intention of a coded entry is to pre-classify an entry before it is recorded. The code is designed for a particular set of business processes, e.g. analytics or payment, and it is important to understand the context in which a code has been used. A coded concept, being pre-classified, relies on categorisation of the codes, and that classification may or may not imply that one code is a subtype of another. Nothing can be inferred from a code other than its relation to another code as authored. Consequently, as the philosophy is different, code based concepts have to be dealt with differently from term based concepts, even if they seem to be saying the same thing.

Because of their history, it is not always possible to assert the exact meaning of a code. However, it is often the case that meaning can be inferred or approximated from a coded entry. Given the preference for moving to an ontology, this inference can be achieved via a mapping process that matches coded concepts to the term based concepts that can be identified from a code.

These types of concepts are referred to as "legacy concepts" or "non core concepts".

There are two strategies to link non core code concepts to core concepts.

1. A coded term may be stated confidently to be the same as, or a variation on, a concept. Typically code systems like Read2 or CTV3 can be dealt with in this way because they are designed to try to capture the idea in the clinician's mind, and they have been incorporated as concepts anyway. Likewise, many system supplier codes have been created in this way. In this case the term code can be said to be a term code of the concept. Read2 G33 - Angina pectoris is a term code for the concept of angina pectoris.

2. A coded term might be the same term as a concept but may have been entered without the assertion that it is a true representation of a state. Typically code systems such as ICD10 and OPCS fall into this category. E11 - Diabetes type 2 seems to be the same as the concept of diabetes type 2, but was entered without clinician attestation and may have been approximated for payment purposes. In this case a legacy concept is produced and a map between this concept and the similar clinical concept is generated.

A map is just another form of relationship, but unlike an ontological equivalent or subclass axiom it implies that the relationship is an approximation. It is a sort of statement that something is possibly or probably similar to something else and thus has much less weight than an asserted relationship.

Legacy code based concepts can be mapped to core concepts, and this enables the use of the vast volumes of data already recorded in systems. Maps must be used with care, as the use of a mapped code in a query almost always depends on the purpose of the query. This means that mappings are more of a guide to the things to include than a confident statement of meaning. When querying records, the query author may need to determine which codes to include or exclude on a case by case basis.
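The case by case selection described above might look something like the following sketch. The `MATCHES` table and the `codes_for_query` helper are hypothetical, and the pairing of codes with the angina concept is illustrative rather than an authoritative map.

```python
from typing import Iterable

# Hypothetical maps from a core concept to the legacy codes matched to it.
MATCHES: dict[str, list[str]] = {
    "sn:194828000": ["emis:G33", "icd10:I209"],
}


def codes_for_query(concept: str, exclude: Iterable[str] = ()) -> list[str]:
    """Return mapped legacy codes as candidates, minus those the query author rejects."""
    excluded = set(exclude)
    return [code for code in MATCHES.get(concept, []) if code not in excluded]


# A query author who does not trust payment-oriented codes for this purpose
# can drop them explicitly.
all_candidates = codes_for_query("sn:194828000")
clinical_only = codes_for_query("sn:194828000", exclude={"icd10:I209"})
```

The map supplies candidates only; the decision about what to include stays with the query author.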

Maps between core concepts and legacy concepts

As mentioned above the relationships are managed as mappings which state the type or degree of match.

Maps generally fall into 2 patterns. These are illustrated in the context of code based concepts as follows:

Simple match

A core concept may be matched to many code based concepts. In a simple match the legacy concept is deemed to be probably equivalent to, or a subclass of, the core concept.

sn:194828000 |Angina (disorder)|
    :matchedTo emis:G33 |Angina Pectoris|.

Complex optional match

A concept may be matched to a number of alternative concepts and it is expected that a query author may wish to select these.

In this example, the concept "Ketoacidotic coma due to diabetes mellitus (disorder)" has a complex map which is a combination of both

a) Coma unspecified

and

b) one of Diabetes mellitus in pregnancy: Pre-existing diabetes mellitus, unspecified, or Diabetes mellitus in pregnancy, unspecified, or Diabetes mellitus arising in pregnancy

In effect, this means that the compound entry in the record would need to have two ICD10 codes to fulfil the criteria.

sn:26298008
  :hasMap [
       :combinationOf  [
                           :oneOf  icd10:R402 ]
                       [
                           :oneOf  icd10:O24.3 icd10:O24.9 icd10:O24.4 ]
          ].
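One way to read the complex map above is "every group in the combination must be satisfied by at least one of its codes". The sketch below tests a record's codes against that reading; the `COMPLEX_MAP` structure and the `satisfies` function are assumptions made for illustration, not part of the model.

```python
from typing import Iterable

# Hypothetical encoding of the complex map: a list of groups (combinationOf),
# each group being a set of alternative codes (oneOf).
COMPLEX_MAP: dict[str, list[set[str]]] = {
    "sn:26298008": [
        {"icd10:R402"},
        {"icd10:O24.3", "icd10:O24.9", "icd10:O24.4"},
    ],
}


def satisfies(concept: str, record_codes: Iterable[str]) -> bool:
    """True when the record holds at least one code from every group of the map."""
    groups = COMPLEX_MAP.get(concept)
    if groups is None:
        return False  # no complex map is defined for this concept
    codes = set(record_codes)
    return all(codes & group for group in groups)


both_present = satisfies("sn:26298008", ["icd10:R402", "icd10:O24.9"])
coma_only = satisfies("sn:26298008", ["icd10:R402"])
```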

@@ Line 1: / Line 1: @@
-Information consists of ideas. Another word for an idea is a 'concept' . A concept may be named,( in which case the meaning of the concept can usually be understood), or they may be an unnamed expression, which is made up of a set of interrelated named or unnamed concepts.
+== Mapping approach ==
+From an ontological perspective it could be said that 'A' in the context of 'C', when manipulated by some function, is probably equivalent to 'B'. In OWL this would be denoted as an equivalent class axiom.
-For example the term "chest pain" implies the idea of a pain in the chest. In Snomed-CT it is a named concept. "Chest pain, worsened by exercise" may be an example of an expression style concept made up from the concept of "chest pain", and the statement that it is "made worse by -> exercise".
+From a slightly more process based perspective it could be said that 'A' with context 'C' -> Maps to (via a function) > 'B'
-The new generation of health record management systems tend towards the recording of concepts, with the objective being for the record entry to closely match the idea behind the entry. These types of concepts can be called term based concepts as the term is the thing that describes the idea.
+A and B may be entities, properties, objects or literals. A and B may be collections such that there is no one to one map.
-A modern term based concept is defined in relation to other concepts by a set of assertions indicating whether the concept is equivalent to, or a subtype of, a set of other concepts. The standard approach to this is via the use of Description Logic (DL). By using DL, a computer can automatically classify a concept which can result in a computer deducing additional knowledge over and above the human who created the concept. Snomed-CT is the worlds largest ontology of healthcare term based concepts. A collection of concepts defined in this way constitute an "Ontology".
+When transforming data from one form (the source) to another form (the target), a process of transformation is undertaken. That transformation uses software.
-Coded concepts originate from a different starting point. The intention of a coded entry is to ''pre-classify'' an entry before it is recorded. The code is designed for a particular set of business processes e.g. analytics or payment. A coded concept, being pre-classified, relies on categorisation of the codes, which may or may not imply subtypes. Nothing can be inferred from a code other than its relation to another code as authored.  Consequently, as the philosophy is different, code based concepts have to be dealt with differently from term based concepts, even if they seem to saying the same thing.
+There are general two broad approaches to transformation:
-Because of their history, it is not always possible to assert the exact meaning of a code based concept. However, it is often the case that meaning can be inferred or approximated from a coded entry. This inference is achieved via the use of maps.
+a) Each transformation from A to B from each context C is written, or at least generated, in source code, which is executed by some form of transformation engine.
-A map is just another form of relationship, but it implies that the relationship is an approximation. It is a sort of statement that something is possibly or probably similar to something else. It has much less weight than an asserted relationship.
+b) A transformation engine includes software that uses a 'Transform Map' which is a separate resource, describing the way in which A in context C, maps to B.
-Code based concepts can be mapped to term based concepts, and this enables the use of the vast volumes of data already recorded in systems. Maps must be used with care as it is almost always the case that the use of a mapped code in a query is dependent on the purpose of the query. This means that mappings are more of a guide to the things to include rather than a confident statement of meaning.
+As one would expect, in an information model, the latter approach is used, with the transform maps having a visualisation capability for humans to debate, decide, and assure, that the equivalence exists to the extent that is safe.
-Maps generally fall into 4 patterns. These are illustrated in the context of code based concepts as follows:
+Thus, the role of the information model is to provide a library of maps using, a meta model of classes that define the maps.
-* A coded concept may have one map which is mapped to one term based concept, the mapping having a certain weighting or category. For example the icd10 code for Angina may have a map which maps to the single term based Snomed-CT concept of angina, with a category indicating that the source concept is properly classified. Note that many coded concepts may be mapped to one single term based concept. The map is viewed from the perspective of the coded concept.
+In line with all of the information model, these classes are defined using the same information model language that the rest of the model uses, with a set of meta classes, conceptually equivalent to transformation languages.
- icd10:I209 |Angina Pectoris (ICD10 I20.9)|
+More specifically, the mapping meta classes have been informed by FHIR Mapping language, the RML semantic web based mapping language, and the simpler concept maps provided by many sources.
-             :hasMap [:mappedTo sn:194828000 |Angina (disorder);
-                     :mapCategory sn:447637006 |Map source concept is properly classified]
-* A coded concept may have more than one map and each map may map to more than one potential term based concept i.e. a union of concepts
+Types of mapping can be broadly categorised into 3 categories
-<pre>
-icd10:E140| Unspecified diabetes mellitus with coma
+# Things which map entities and properties from one type to another
-           //This maps to a number of target concepts
+# Things which map codes or concepts or text.
-  :hasMap  [:mappedTo :owl:UnionOf [
+# Things which map property values using the source properties and values as key to a look up to a reference entity.
-                          sn:26298008|Ketoacidotic coma due to.....,
-                          sn:421725003|Hypoglycemic coma due to diabetes mellitus];
+<br />
+== Mapping entities and properties ==
+This includes:
+a) The provision of definitions that determine how one set of source entities (or resources, messages or tables) map to a target entity.
+b) the provision of a reference transform engine that illustrates how these maps can be used in the real world.
+<br />
+== Mapping codes and taxonomies ==
+The new generation of health record management systems tend towards the recording of concepts, with the objective being for the record entry to closely match the idea behind the entry. These types of concepts can be called term based concepts as the term is the thing that describes the idea.
+A modern term based concept is defined in relation to other concepts by a set of assertions indicating whether the concept is equivalent to, or a subtype of, a set of other concepts. This is normally referred to as an ontology.
+Snomed-CT is the worlds largest ontology of healthcare term based concepts and is authored using a form of Description logic, which enables a reasoner to automatically classify a concept according to its properties.
+The idea of codes originated from a different starting point. The intention of a coded entry is to ''pre-classify'' an entry before it is recorded. The code is designed for a particular set of business processes e.g. analytics or payment and it is important to understand the context in which a code has been used.  A coded concept, being pre-classified, relies on categorisation of the codes, and that classification may or may not imply that one code is a subtype of another. Nothing can be inferred from a code other than its relation to another code as authored.  Consequently, as the philosophy is different, code based concepts have to be dealt with differently from term based concepts, even if they seem to saying the same thing.
+Because of their history, it is not always possible to assert the exact meaning of a code. However, it is often the case that meaning can be inferred or approximated from a coded entry. With  preference to move to an ontology, this inference can be achieved via the use of a mapping process that matches  coded concepts to term based concepts that are identified from a code.
+These types of concepts are referred to as "legacy concepts , or non core concepts"
+There are two strategies to link none core code concepts to core concepts.
+. A coded term may be stated confidently to be the same as, or a variation on, a concept. Typically code systems like Read2 or CTV3 can be dealt with in this way because they are designed to try and capture the idea in the clinicians mind, and they have been incorporated as concepts anyway. Likewise many system supplier codes have been created in this way. In this case the term code can be said to be a term code of the concept. Read2 G33 - Angina pectoris is a term code for the concept of angina pectoris.
+. A coded term might be the same term as a concept but may have been entered without the assertion that is a true representation of a state. Typically code systems such as ICD10 and OPCS fall into this category. E11 - Diabetes type 2, seems to be the same as the concept of diabetes type 2, but was entered without clinician attestation and may have been approximated for payment purposes. In this case a legacy concept is produced and a map between this concept and the similar clinical concept is generated.
+A  map is just another form of relationship, but unlike an ontological equivalent or subclass axiom it implies that the relationship is an approximation. It is a sort of statement that something is p''ossibly or probably similar to s''omething else and thus has much less weight than an asserted relationship.
+Legacy Code based concepts can be mapped to Core concepts , and this enables the use of the vast volumes of data already recorded in systems. Maps must be used with care as it is almost always the case that the use of a mapped code in a query is dependent on the purpose of the query. This means that mappings are more of a guide to the things to include rather than a confident statement of meaning. When querying records the query author may need to determine which codes to include or exclude on a case by case basis.
+== Maps between core concepts and legacy concepts ==
+As mentioned above the relationships are managed as mappings which state the type or degree of match.
+Maps generally fall into 2 patterns. These are illustrated in the context of code based concepts as follows:
+=== Simple match ===
+A core concept may be matched to many code based concepts. In a simple match the legacy concept is deemed to be probably equivalent to, or a subclass of. the code concept<pre>
-             :mapCategory  sn:447637006 |Map source concept is properly classified ],
+sn:194828000 |Angina (disorder)
+    :matchedTo emis:G33 |Angina Pectoris|.
-             //This map is dependent on the context this map is used in
-            [:mappedTo sn:267384006 |Coma due to hypoglycemia|;
-             :mapCategory sn:447639009 |Map of source concept is context dependent]
 </pre>
+=== Complex optional match ===
+A concept may be matched to a number of alternative concepts and it is expected that a query author may wish to select these.
+In this example, the concept ''':''' "Ketoacidotic coma due to diabetes mellitus (disorder)" has a complex map which is selection of either
+a) Coma unspecified
+and
-* An unnamed concept consisting of a combination of coded concepts (e.g. A and B) has a map which maps to a term based concept. This means that the target concept is dependent on the combination of source coded concepts.
+b)  one of either Diabetes mellitus in pregnancy: Pre-existing diabetes mellitus, unspecified, or Diabetes mellitus in pregnancy, unspecified, or Diabetes mellitus arising in pregnancy
+In effect meaning that the compound entry in the record would need to have 2 icd 10 codes to fulfill the criteria.
+<pre>
+sn:26298008
+  :hasMap [
+       :combinationOf  [
+                           :oneOf  icd10:R402 ]
+                       [
+                           :oneOf  icd10:O24.3 icd10:O24.9 O24.4]
+</pre><br /><br />