Baseline data set

From Endeavour Knowledge Base

Discovery Data Service contains health and care-related data.

These articles describe the nature of the types of data processed by the data service. 

In addition to reading these articles, the reader can visit the information model viewer at [x] which enables the entire information model to be explored, including the ontology, the data model, and the various code sets and data sets that have been developed.

Overview of the data model

The Discovery common information model includes a data model. The data model covers the structural arrangements of care record entries and related entries such as information about people and organisations.

These articles describe only the broad categories of data types. The information model viewer defines precisely which types of data are supported at a particular point in time.

A number of terms are used in describing structures, namely entities, properties, relationships, attributes, types and subtypes. Properties and relationships are often collectively referred to as attributes. The sometimes subtle semantic difference between properties and relationships is described in the article 'Class Modelling in the common model', and understanding that difference makes it easier to follow the descriptions of the health data types included in these articles.

The entities described here are derived from actual health records. They are not a "standard" or a set of statements of what a data model should be, but instead reflect the type of data that actually exists in the Discovery data service.

A health record consists of a set of "entries", each entry describing either an event that has occurred, an event that might occur, a state, or a statement by the person making the record. In addition, entries contain references to external things, often organised into Directories. Thus an entity is either a type of entry or a type of thing that the entry refers to.

The approach to categorising the data types has been heavily influenced by HL7 FHIR. In this approach, entities are roughly categorised according to the type of business process that the entry describes. In that sense, the categorisation is entirely pragmatic. At a deeper level, the entities are also defined according to their meaning in such a way that a machine can infer things from the definition. This further level of definition occurs in the Discovery ontology, and the two main parts of the information model work hand in hand.

The result of integration between the data model and the ontology is that the data model supports unlimited but ontologically controlled extensibility. This provides the potential for holding data at any granular level of specialisation without the need to continually change an implementation schema. How this is managed is a subject of specialist articles elsewhere in the Wiki.
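
One way to picture ontologically controlled extensibility is a generic entry whose type is a concept checked against the ontology, rather than a dedicated column or table in a fixed schema. The sketch below is illustrative only: the concept names, the `IS_A` table and the `Entry` class are assumptions made for this example and do not describe the actual Discovery implementation.

```python
from dataclasses import dataclass, field

# Hypothetical miniature ontology: each concept maps to its parent concept.
IS_A = {
    "Observation": "HealthEvent",
    "BloodPressure": "Observation",
    "HomeBloodPressure": "BloodPressure",
}


def is_subtype_of(concept, ancestor):
    """Walk up the is-a chain to see whether concept specialises ancestor."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False


@dataclass
class Entry:
    """Generic entry: the structure stays fixed, the concept carries the specialisation."""
    concept: str
    attributes: dict = field(default_factory=dict)

    def __post_init__(self):
        # Extensibility is controlled: only concepts the ontology knows are accepted.
        if not is_subtype_of(self.concept, "HealthEvent"):
            raise ValueError(f"{self.concept!r} is not a known kind of HealthEvent")


# A more specialised concept needs no schema change, only an ontology entry.
entry = Entry("HomeBloodPressure", {"systolic": 128, "diastolic": 82})
```

In this sketch, adding a new level of specialisation means adding one line to the ontology table, while the `Entry` structure itself is unchanged.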

Each entity has its own article, and thus the data model is described by categorising articles along the same lines as the ontological classification of the entities themselves. However, there will always be some mismatch between the articles and the model itself, and it is the model itself that should be relied on at all times.

 

Generic entities

This section deals with entities that are relevant to all parts of the model.

Provenance

Discovery tracks data throughout the pipeline, including the receipt of the data and the transformation of the data.
Discovery also retains any provenance-related information relating to the item as it was originally recorded in the publisher system, including any provenance information that the publisher may have.
Discovery itself, broadly speaking, follows the W3C PROV standard in that it records entities, activities and agents, and any number of relationships between them based on sub-properties of the main W3C provenance relationships.

 

[Figure: Provenance.jpg]

 

The main entity types and main properties are listed here:

Provenance entity

This is a reference to a stored item of data which is of sufficient importance to require a record of provenance. The data may be a record entry, or in the case of a deleted record, the previous entry. In addition, it may point to messages or files that were stored or created as part of the processing of health data.

Property | Description
Entity identifier | The identifier of the entity in question, providing sufficient information to determine its type and location
Attributes | A set of attribute-value pairs that provide metadata about the entity, usually specific to the type of entity. These are modelled in the information model
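
As a rough illustration of the two properties above, a provenance entity could be represented as a small record. This is a minimal sketch, not the Discovery schema: the class name, the identifier format and the attribute keys are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProvenanceEntity:
    # Identifier carrying enough information to determine type and location.
    entity_identifier: str
    # Attribute-value metadata, usually specific to the type of entity.
    attributes: dict = field(default_factory=dict)


# Hypothetical identifier format and attribute names, for illustration only.
previous_entry = ProvenanceEntity(
    entity_identifier="urn:example:record-entry:12345:v1",
    attributes={"status": "deleted"},
)
```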

 

Provenance activity

Whenever data is generated, changed or deleted, some form of activity has taken place. This entity holds the nature of the activity and the date and time at which it took place. Provenance can be illustrated by providing a timeline of all linked activities, operating as a chain going back in time.

Property | Description
Activity type | The type of activity
Start time | The date and time the activity started
End time | The date and time the activity ended
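
The idea of a timeline of linked activities can be sketched as a chain that is followed backwards from the most recent activity. The `previous` link, the function name and the example dates are assumptions for illustration; in the model itself the links between activities are expressed through provenance relationships.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ProvenanceActivity:
    activity_type: str
    start_time: datetime
    end_time: datetime
    # Hypothetical direct link to the activity that came before this one.
    previous: Optional["ProvenanceActivity"] = None


def timeline(latest):
    """Follow the chain back in time, newest activity first."""
    chain = []
    current = latest
    while current is not None:
        chain.append(current)
        current = current.previous
    return chain


# Illustrative data: receipt of the data, followed by its transformation.
receipt = ProvenanceActivity(
    "receipt", datetime(2021, 3, 1, 9, 0), datetime(2021, 3, 1, 9, 1)
)
transformation = ProvenanceActivity(
    "transformation",
    datetime(2021, 3, 1, 9, 5),
    datetime(2021, 3, 1, 9, 6),
    previous=receipt,
)

print([a.activity_type for a in timeline(transformation)])
# prints ['transformation', 'receipt']
```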

 

Provenance agent

This is a person or thing that performed an activity, or is responsible for an entity. Agents operate in the context of roles, which are represented as properties of the relationship between the agent and the activity.

Property | Description
Agent type | The type of agent involved
Agent identifier | The identifier of the agent, which might be a DBID or a URL
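
Because the role belongs to the relationship rather than to the agent, the same agent can act in different roles for different activities. The following sketch shows one possible shape for this; the class names, activity identifiers, role names and example URL are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenanceAgent:
    agent_type: str
    agent_identifier: str  # might be a DBID or a URL


@dataclass(frozen=True)
class Association:
    """Link between an activity and an agent; the role is a property of the link."""
    activity_id: str
    agent: ProvenanceAgent
    role: str


clinician = ProvenanceAgent("person", "https://example.org/agents/42")

# One agent, two activities, two different roles.
links = [
    Association("activity-1", clinician, role="author"),
    Association("activity-2", clinician, role="reviewer"),
]
roles_by_activity = {link.activity_id: link.role for link in links}
```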

 

Main Provenance relationships

This is a high-level listing of the types of relationships between the provenance objects. An ontology of relationships can be viewed in the information model viewer.

Relationship | Description
Derived from | Links an entity to another entity from which it was derived
Generated by | Links an entity to the activity that generated it
Attributed to | Links an entity to an agent that the entity is attributed to, e.g. the author or owner
Was associated with | Links an activity to the agent that it was associated with, e.g. who performed it, including the role the agent was performing
Acted on behalf of | Links an agent to the organisation (or other agent) on whose behalf the agent acted
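
Each relationship in the listing connects a particular kind of provenance object to another, which can be captured in a small lookup table. The pairing of each Discovery relationship with a W3C PROV term below is an assumption based on the similarity of the names, and the `Kind` enum and `is_valid_link` helper are hypothetical; the information model viewer remains the authoritative source.

```python
from enum import Enum


class Kind(Enum):
    ENTITY = "entity"
    ACTIVITY = "activity"
    AGENT = "agent"


# Relationship name -> (assumed W3C PROV term, subject kind, object kind)
RELATIONSHIPS = {
    "Derived from": ("prov:wasDerivedFrom", Kind.ENTITY, Kind.ENTITY),
    "Generated by": ("prov:wasGeneratedBy", Kind.ENTITY, Kind.ACTIVITY),
    "Attributed to": ("prov:wasAttributedTo", Kind.ENTITY, Kind.AGENT),
    "Was associated with": ("prov:wasAssociatedWith", Kind.ACTIVITY, Kind.AGENT),
    "Acted on behalf of": ("prov:actedOnBehalfOf", Kind.AGENT, Kind.AGENT),
}


def is_valid_link(name, subject_kind, object_kind):
    """Check that a relationship connects the kinds of object it is defined for."""
    _, expected_subject, expected_object = RELATIONSHIPS[name]
    return (subject_kind, object_kind) == (expected_subject, expected_object)
```

For example, `is_valid_link("Generated by", Kind.ENTITY, Kind.ACTIVITY)` holds, whereas linking an agent to an activity with "Generated by" does not.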

 

Health Event

A health event is an abstract class referring to an entry that represents something that has happened, or may happen, at a point in time or over a period of time, related to the health or care of a person.