Sunday, May 31, 2009

Semantic Interoperability - Part VI - As it Relates to Principles and Processes of Information Interchange

I am a member of the Information Interchange Subcommittee of the HITSP Foundations Committee and for the past several days have been reviewing two documents that will be discussed during our meeting on June 1, 2009.

The documents to be reviewed were developed by Norman Daoust [normandaoust@EARTHLINK.NET], who described them as follows:

(1) The Information Interchange Subcommittee Principles and Processes document. - adapted from a document produced by the Foundations Harmonization committee.

(2) The Interactions Patterns document. - summarizes the information interchange patterns identified in reviews of the information interchange standards published by Standard Development Organization (SDO) members of the U.S. Health Information Technology Standards Panel (HITSP) and identified in the Office of the National Coordinator for Health Information Technology's (ONC) Common Data Transport (CDT) Draft AHIC Extension/Gap document.

The comments I plan to share are based on my experience with "semantic interoperability", which I've described on this blog in "Semantic Interoperability - Parts I - V". They primarily relate to the "Value set harmonization principles". To provide context, I've copied particular portions of the "...Principles and Processes" document and added my comments in red following the copied sections.

Value set harmonization principles:

  • Codes for a value set should be drawn from a standard reference terminology.

  • HITSP Foundations reserves the right to add values into value sets prior to the code being added into the source code system. These would be contained in an interim US national code system (with the expectation that the interim codes are replaced once adopted by the source code system).

  • Representation of NULL and OTHER will not be addressed on a per value set basis, but may be addressed as harmonization topics themselves.

  • Extensibility (e.g. whether or not local codes can be included in an instance in addition to harmonized codes) is outside the scope of HITSP Foundations, and is a decision left to the SDO and the HITSP TCs.

  • Dynamic vs. Static considerations (e.g., whether or not an instance can include codes from a value set that were added after the relevant specification was balloted) is outside the scope of HITSP Foundations, and is a decision left to the SDO and the HITSP TCs.[1]

  • There will be a uniform policy for versioning harmonized artifacts, based on a predictable frequency. The frequency may vary depending on the artifact type.


Value set: A vocabulary domain that has been constrained to a particular realm and coding system.

  • An ENUMERATED value set (aka an EXTENSIONAL value set) is one that is comprised of an explicit listing of the set of codes. Versioning occurs if values are added or deleted. SDOs typically have STATIC bindings to ENUMERATED value sets.

  • A CRITERIA-BASED value set (aka an INTENSIONAL value set) is one that is defined by a computable expression that can be resolved to an exact list of codes (e.g. “all SNOMED CT concepts that are descendants of the SNOMED CT concept Diabetes Mellitus”). Versioning occurs if the criteria change. SDOs typically have DYNAMIC bindings to CRITERIA-BASED value sets.
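
As a rough illustration of the two definitions above, here is a minimal Python sketch. It is not taken from the HITSP document. The codes and the parent/child hierarchy are made-up placeholders rather than real SNOMED CT identifiers, and the helper names (`descendants`, `criteria_based`) are hypothetical. It shows an enumerated set as a fixed list and a criteria-based set as an expression that is resolved against whichever release of the code system is current.

```python
# Hypothetical toy hierarchy: child code -> parent code.
# These are placeholder strings, NOT real SNOMED CT identifiers.
PARENT = {
    "DM-TYPE1": "DM",
    "DM-TYPE2": "DM",
    "DM-GESTATIONAL": "DM",
    "DM": "ENDOCRINE",
    "HYPOTHYROID": "ENDOCRINE",
}


def descendants(root, parent_of):
    """Return every code whose chain of ancestors passes through root."""
    found = set()
    for code in parent_of:
        node = code
        while node in parent_of:
            node = parent_of[node]
            if node == root:
                found.add(code)
                break
    return found


# ENUMERATED (extensional): an explicit listing of codes.
ENUMERATED = frozenset({"DM-TYPE1", "DM-TYPE2"})


def criteria_based(parent_of):
    """CRITERIA-BASED (intensional): 'all descendants of DM', resolved on demand."""
    return descendants("DM", parent_of)


# A later (hypothetical) release of the code system adds one concept.
PARENT_V2 = {**PARENT, "DM-MODY": "DM"}

resolved_v1 = criteria_based(PARENT)
resolved_v2 = criteria_based(PARENT_V2)

print(sorted(ENUMERATED))
print(sorted(resolved_v1))
print(sorted(resolved_v2))
```

In this sketch the enumerated set stays the same until someone edits the list, which is the behavior a static binding relies on. The criteria-based set picks up the new code as soon as the hierarchy changes, even though the criteria themselves did not, which is the behavior a dynamic binding relies on.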

Comments: Twelve or so years ago, I worked in the Center for Health Statistics (CHS) in Wisconsin's Department of Health and Family Services (DHFS). My position, titled "Linked Database Analyst", was funded through a grant from the Robert Wood Johnson Foundation. The objective of the position was to link data sets and then develop anonymized data sets for public use.

At the time, data sets were primarily located on a mainframe computer and were in a variety of formats, some in relational databases while others were in flat files. Because there was no unique person identifier used in the agency, it was necessary to use a variety of matching techniques. While I was familiar with various matching techniques, I gained expertise in the subject by participating in tutorials conducted by the world's leading experts in record linkage.

In 1997, I attended the International Record Linkage Workshop in Arlington, VA, and participated in several workshops, most notably those conducted by Martha Fair of Statistics Canada and William Winkler of the U.S. Census Bureau. The proceedings were sponsored, and then published in 1999, by the Committee on Applied and Theoretical Statistics, National Research Council, and the Federal Committee on Statistical Methodology, Office of Management and Budget. The tutorials on probabilistic linking gave me not only insight into how to use various bits of information from disparate sources to link the records of specific persons, but also an appreciation of how difficult (probably impossible) it is to truly anonymize records.

The Center for Health Statistics (CHS) was a "SAS (Statistical Analysis System)" shop. SAS is a system that allows one to bring together data from nearly any imaginable source. Data residing on a mainframe or PC, in flat files or in relational databases, can be brought together in SAS datasets and analyzed in almost unimaginable ways. SAS has a "data step language" and an SQL implementation, Proc SQL, which allows linking of up to 16 tables in one query. SAS also has an XML engine.

In 1996, Bertrand Russell's book, "The Principles of Mathematics" (originally published in 1903), was reissued in paperback. Part I of the book, which is the first 106 pages, sets forth Russell's landmark thesis that mathematics and logic are identical and that symbolic logic "investigates the general rules by which inferences are made". Russell describes a calculus of propositions, a calculus of classes, a calculus of relations, and an algebra of symbolic logic, which can be visualized by utilizing Venn diagrams.

For discussion purposes, a HITSP Information Interchange Calculus might be described as follows:

5.9 HITSP Information Interchange Calculus

  • Element set: A collection of numbers, objects, or anything that can be conceptualized. Elements can include both Extensional and Intensional value sets. An empty set is called a Null Set and is a subset of every Other set.
  • Information Interchange sets: Combining Information Interchange sets is called a Union of the sets. If the elements that are common to two or more sets are placed into a new set, that set is called an Intersection of the Information Interchange sets.
  • Venn diagrams can be used to represent Information Interchange sets and their Intersection with other sets.
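
The set operations named in the list above can be sketched with Python's built-in `set` type. This is only an illustration under assumptions: the two sets and their contents are hypothetical placeholders, not value sets defined by HITSP.

```python
# Two hypothetical Information Interchange sets (placeholder codes only).
lab_result_codes = {"GLUCOSE", "HBA1C", "CHOLESTEROL"}
problem_list_codes = {"HBA1C", "GLUCOSE", "BLOOD-PRESSURE"}

# Union: everything that appears in either set.
union = lab_result_codes | problem_list_codes

# Intersection: only the elements common to both sets
# (the overlapping region of a two-circle Venn diagram).
intersection = lab_result_codes & problem_list_codes

# The empty (Null) set is a subset of every other set.
null_set = set()
null_is_subset = (
    null_set <= lab_result_codes
    and null_set <= problem_list_codes
    and null_set <= intersection
)

print(sorted(union))
print(sorted(intersection))
print(null_is_subset)
```

The regions of a two-circle Venn diagram correspond to `lab_result_codes - problem_list_codes`, `intersection`, and `problem_list_codes - lab_result_codes`, and together they make up `union`.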

(to be continued)



