Ingest Service Framework

Samantha Chan edited this page Apr 25, 2017 · 8 revisions

Overview

Ingest Service Framework

The goal of ingest services is to help customers ingest healthcare-related data from common data sources, such as bedside monitors and ADT sources. An ingest service is responsible for acquiring data from a data source, transforming the data into the standardized data format, and then publishing the data for downstream applications to analyze. To help with this task, we have created a Foundation Ingest Toolkit.

Foundation Ingest Toolkit

At the base of the framework is a foundation toolkit for building healthcare ingest services. This toolkit contains two types of artifacts:

  • Common Types - All services should ingest and publish data using these common types. The common types are defined in SPL, Java and Python. Implementing the types in all three languages makes it easier for developers to build additional services in the language they are most comfortable with.
  • Connectors - Connectors are glue code that lets ingest services publish data and downstream services consume it. The connectors standardize the output data schema from ingest services. They also standardize how services communicate with each other.

Common Types

Here is a list of the common type definitions in SPL:

type Observation = Device device, rstring patientId,
    ReadingSource readingSource, Reading reading;

type Reading = int64 ts, rstring readingType, float64 value, rstring uom;

type Device = rstring id, rstring locationId;

type ReadingSource = rstring id, rstring sourceType, rstring deviceId;

type Patient = rstring id, rstring name, rstring gender, rstring DOB,
    rstring status;

type Location = rstring id, rstring name, rstring locationType;

  • Observation - An observation represents a vital / waveform reading from a specific data source on a device. It contains information about the device, the patient connected to the device, the data source from which the reading is collected, and the actual reading.

  • Reading - A reading from a data source on a device. It provides the timestamp, the reading type, the reading value, and the unit of measure.

  • Device - Represents the device that is generating the data. It provides information on the device id and the location of the device.

  • ReadingSource - A reading source represents a data source from a device. A device can have one or more data sources (e.g. ECG devices can have 12 channels). A reading source provides the source id, the source type, and a device id (to help us associate the data source back to its parent device).

  • Patient - Represents a patient connected to the device.

  • Location - Represents the location (e.g. the bed) of the device.
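
As a rough illustration of how the nested SPL types above might look in Python, here is a minimal sketch using dataclasses. It is not the toolkit's actual Python implementation; the field names are copied from the SPL definitions, and the sample values (device id, patient id, timestamp) are made up.

```python
from dataclasses import dataclass


@dataclass
class Device:
    id: str            # device identifier
    locationId: str    # id of the location (e.g. the bed) of the device


@dataclass
class ReadingSource:
    id: str
    sourceType: str
    deviceId: str      # associates the source back to its parent device


@dataclass
class Reading:
    ts: int            # timestamp (int64 in SPL)
    readingType: str
    value: float
    uom: str           # unit of measure


@dataclass
class Observation:
    device: Device
    patientId: str
    readingSource: ReadingSource
    reading: Reading


# Sample values are hypothetical and only show how the types nest.
obs = Observation(
    device=Device(id="monitor-1", locationId="bed-12"),
    patientId="patient-42",
    readingSource=ReadingSource(id="ecg-lead-I", sourceType="ECG", deviceId="monitor-1"),
    reading=Reading(ts=1493078400000, readingType="HR", value=72.0, uom="bpm"),
)
```

Note that the observation carries only a patient id, while the device id appears both on the device and on the reading source, which is what lets a consumer tie a reading back to its parent device.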

Connectors

Connectors are implemented in SPL, Java and Python. The publish connectors translate data (Observation_T) into JSON. The data is published using the "Publish" operator from the topology toolkit. The subscribe connectors ingest the JSON data and translate it back to the standardized native types (Observation_T) before sending it downstream.
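
The publish/subscribe translation described above amounts to a JSON round trip. The sketch below shows the idea in plain Python. The function names `observation_to_json` and `observation_from_json` are hypothetical, and the nested JSON layout is an assumption that mirrors the SPL type definitions; the real connectors may use a different wire format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Device:
    id: str
    locationId: str


@dataclass
class ReadingSource:
    id: str
    sourceType: str
    deviceId: str


@dataclass
class Reading:
    ts: int
    readingType: str
    value: float
    uom: str


@dataclass
class Observation:
    device: Device
    patientId: str
    readingSource: ReadingSource
    reading: Reading


def observation_to_json(obs: Observation) -> str:
    """Publish side: flatten the nested dataclasses into a JSON string."""
    return json.dumps(asdict(obs))


def observation_from_json(text: str) -> Observation:
    """Subscribe side: rebuild the native types from the JSON string."""
    data = json.loads(text)
    return Observation(
        device=Device(**data["device"]),
        patientId=data["patientId"],
        readingSource=ReadingSource(**data["readingSource"]),
        reading=Reading(**data["reading"]),
    )


# Hypothetical sample observation.
original = Observation(
    device=Device("monitor-1", "bed-12"),
    patientId="patient-42",
    readingSource=ReadingSource("ecg-lead-I", "ECG", "monitor-1"),
    reading=Reading(1493078400000, "HR", 72.0, "bpm"),
)
payload = observation_to_json(original)
restored = observation_from_json(payload)
```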

While the current strategy is to use import/export with JSON as the communication protocol between services, this framework can be extended to use other strategies for inter-service communication. For example, we can implement connectors that use a Kafka server for communication. The benefit of having standardized connectors is that these kinds of changes / enhancements do not affect the core logic of the services. To take advantage of the new features, customers can simply recompile their applications.
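
One way to picture why a transport change does not touch service logic: the connector functions depend only on a small interface, and the transport behind it can be swapped. The sketch below is an assumption-laden illustration, not the toolkit's design. `Transport`, `InMemoryTransport`, `publish_observation`, `subscribe_observations`, and the topic name are all hypothetical, and the in-memory class merely stands in for import/export or a Kafka-backed implementation.

```python
import json
from typing import Callable, Dict, List, Protocol


class Transport(Protocol):
    """Hypothetical interface the connectors depend on."""

    def send(self, topic: str, payload: str) -> None: ...

    def on_message(self, topic: str, handler: Callable[[str], None]) -> None: ...


class InMemoryTransport:
    """Stand-in transport; another class with the same two methods could replace it."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[str], None]]] = {}

    def send(self, topic: str, payload: str) -> None:
        # Deliver the payload to every handler registered for this topic.
        for handler in self._handlers.get(topic, []):
            handler(payload)

    def on_message(self, topic: str, handler: Callable[[str], None]) -> None:
        self._handlers.setdefault(topic, []).append(handler)


def publish_observation(transport: Transport, topic: str, observation: dict) -> None:
    """Publish connector: serialize to JSON, then hand off to the transport."""
    transport.send(topic, json.dumps(observation))


def subscribe_observations(
    transport: Transport, topic: str, callback: Callable[[dict], None]
) -> None:
    """Subscribe connector: decode JSON before calling the service's callback."""
    transport.on_message(topic, lambda payload: callback(json.loads(payload)))


received: List[dict] = []
transport = InMemoryTransport()
subscribe_observations(transport, "ingest-demo", received.append)
publish_observation(transport, "ingest-demo", {"patientId": "patient-42"})
```

The service code only ever calls `publish_observation` and `subscribe_observations`, so replacing `InMemoryTransport` with a different transport would leave that code unchanged.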

Why JSON?

We picked JSON for inter-service communication to make it easier to interoperate with Java or Python services.

Ingest Service

Ingest services are built on top of the streamsx.health.ingest toolkit. An ingest service is responsible for:

  • acquiring data from an external system
  • translating the data into the standardized format (in this case: Observation_T)
  • publishing the data using the Publish Connector from the foundation toolkit
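
The three responsibilities above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the comma-separated raw record format is invented, the functions `acquire`, `translate`, and `publish` are hypothetical names, and a plain list stands in for the Publish Connector.

```python
import json
from typing import Dict, Iterable, Iterator, List

# Made-up raw records standing in for the output of an external system.
RAW_RECORDS = [
    "monitor-1,bed-12,patient-42,ecg-lead-I,ECG,1493078400000,HR,72.0,bpm",
    "monitor-1,bed-12,patient-42,spo2-probe,SpO2,1493078401000,SpO2,98.0,%",
]


def acquire() -> Iterator[str]:
    """Step 1: acquire raw data (here, from a hard-coded list)."""
    yield from RAW_RECORDS


def translate(raw: str) -> Dict:
    """Step 2: map one raw record onto the Observation layout."""
    (device_id, location_id, patient_id, source_id, source_type,
     ts, reading_type, value, uom) = raw.split(",")
    return {
        "device": {"id": device_id, "locationId": location_id},
        "patientId": patient_id,
        "readingSource": {
            "id": source_id,
            "sourceType": source_type,
            "deviceId": device_id,
        },
        "reading": {
            "ts": int(ts),
            "readingType": reading_type,
            "value": float(value),
            "uom": uom,
        },
    }


def publish(observations: Iterable[Dict], sink: List[str]) -> None:
    """Step 3: serialize each observation to JSON and hand it off."""
    for obs in observations:
        sink.append(json.dumps(obs))


published: List[str] = []
publish((translate(record) for record in acquire()), published)
```

Only `acquire` and `translate` are specific to a data source; the publish step stays the same across ingest services.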

ViNES Ingest Service

ViNES® from True Process is a scalable, vendor-neutral platform for acquiring, sharing, and storing high-resolution data from biomedical devices.

The following architecture diagram outlines how the ViNES ingest service works. At a high level, the service ingests ViNES messages from a ViNES server, parses those messages into the common type definitions described above, and publishes the data using the Publish Connector.

[Figure: ViNES Ingest Service architecture diagram]

Consuming Data

To consume data from one of the ingest services, a downstream service can use the Subscribe Connector from the health.ingest foundation toolkit. The subscribe connector receives data for the topics that the downstream service is interested in. The connector converts the JSON data to the native types, e.g. Observation_T for SPL, and sends the data out for further processing.
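
From the downstream side, consuming data comes down to selecting topics of interest and decoding each JSON payload. The sketch below imitates that behaviour in plain Python. The `subscribe` function, the topic names, and the message contents are hypothetical and do not reflect the Subscribe Connector's actual API.

```python
import json
from typing import Iterable, Iterator, Set, Tuple


def subscribe(messages: Iterable[Tuple[str, str]], topics: Set[str]) -> Iterator[dict]:
    """Yield decoded observations for the topics of interest only."""
    for topic, payload in messages:
        if topic in topics:
            yield json.loads(payload)


# Made-up messages as (topic, JSON payload) pairs.
messages = [
    ("ingest-demo/vitals", json.dumps(
        {"patientId": "patient-42", "reading": {"readingType": "HR", "value": 72.0}})),
    ("ingest-demo/other", json.dumps({"note": "not subscribed"})),
    ("ingest-demo/vitals", json.dumps(
        {"patientId": "patient-42", "reading": {"readingType": "HR", "value": 76.0}})),
]

# Downstream processing works on decoded observations, never on raw JSON.
heart_rates = [
    obs["reading"]["value"]
    for obs in subscribe(messages, {"ingest-demo/vitals"})
]
```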