Wednesday, May 20, 2009

Semantic Interoperability - Part III









Following are several paragraphs taken from a paper I wrote in 1999 relating to my study of names and birth dates as unique identifiers. The idea of the tool depicted above is to let users develop data sets by "visually" creating Venn diagrams. The SQL programming would be hidden from users.

Structured Query Language (SQL) and Relational Database Theory
Relational theory was originally developed by E. F. Codd, an International Business Machines (IBM) researcher, and was first implemented at IBM in a prototype product called System R. The Structured Query Language (SQL) was developed in the early 1970s as a way to provide computer users with a standardized method for selecting data from various database formats. The intent was to build a language that was not based on any existing programming language, but which could be used within any programming language to update and query information in databases. SQL is now an open standard (ANSI and ISO) and is part of many vendors’ products.

Definition of Algorithm
While this study was carried out using the latest computer technology, the principles being followed have been refined over many centuries. For example, the word algorithm, meaning a formula for solving a problem, derives from the name of al-Khwarizmi, a Persian mathematician of the ninth century. By the early 1700s, the Latinized form algorithmus was being used by Leibniz (1646-1716), one of the inventors of calculus, to mean “ways of calculation”. The mathematical principles of set theory that underlie SQL were conceptualized in the 1800s.

Symbolic Logic
George Boole (1815-1864) developed the branch of mathematics known as symbolic logic. Boole’s “algebra of logic” uses formulas to symbolize logical relations. The formulas, written in algebraic symbols, can describe the general relationships among groups of things that have certain properties. Given a question about how one group relates to another, Boolean logic allows us to quickly manipulate the equations and produce an answer. First, Boole’s algebra classifies things, and then the algebraic symbols express any relationship among the things that have been classified. The Boolean datatype is named after this mathematician. Heim, in his essay, says, “Boolean logic functions as a metaphor for the computer age, since it shows how we typically interrogate the world of information.”
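
As a rough illustration (not taken from the 1999 paper), Boole's two steps, classifying things and then expressing relationships among the classes, can be sketched with Python sets. The client identifiers and group names below are hypothetical and chosen only for this example.

```python
# Hypothetical client identifiers, classified by property (illustrative data only).
clients = {"c1", "c2", "c3", "c4", "c5"}   # the universe of discourse
has_name = {"c1", "c2", "c3", "c4"}        # a name is on file
has_birth_date = {"c2", "c3", "c5"}        # a birth date is on file

both = has_name & has_birth_date           # AND: intersection
either = has_name | has_birth_date         # OR: union
name_only = has_name - has_birth_date      # AND NOT: difference
neither = clients - either                 # NOT: complement within the universe

# De Morgan's law: NOT (A OR B) equals (NOT A) AND (NOT B).
assert neither == (clients - has_name) & (clients - has_birth_date)

print(sorted(both), sorted(name_only), sorted(neither))
```

With this sample data the "neither" class is empty, yet the algebra handles it like any other class.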

Venn Diagrams
The term symbolic logic first appeared in 1881 in a book by that title. The book’s author, John Venn, introduced the first graphic display of Boole’s formulas. The visual display that John Venn drafted begins with empty circles. Venn noted how Boolean logic treats terms strictly as algebraic variables and not as universal terms referring to actually existing things. That concept is somewhat analogous to the way names coupled with birth dates are treated as abstract templates for clients to be instantiated. Boole’s logic can use terms that apply to empty sets, with no actually existing members. Our modern logical point of view begins with the system rather than with the concrete, existential and unique individual thing (or person).
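
To make the link between Venn diagrams and SQL concrete, here is a minimal sketch, assuming each circle is a table of names and birth dates. The table names (clinic_a, clinic_b, clinic_c), the sample rows, and the helper function region are all hypothetical; the sketch uses Python's built-in sqlite3 module and SQL's standard set operators, and is not the tool described in this post.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clinic_a (name TEXT, birth_date TEXT);
    CREATE TABLE clinic_b (name TEXT, birth_date TEXT);
    CREATE TABLE clinic_c (name TEXT, birth_date TEXT);  -- an empty circle
    INSERT INTO clinic_a VALUES ('Ann Lee', '1970-01-15'), ('Bob Ray', '1982-06-30');
    INSERT INTO clinic_b VALUES ('Bob Ray', '1982-06-30'), ('Cy Dunn', '1965-11-02');
""")

def region(left, operator, right):
    """Return the rows of one Venn region, selected by a SQL set operator.

    The table names are fixed constants in this sketch, so building the
    statement with an f-string is acceptable here.
    """
    sql = (f"SELECT name, birth_date FROM {left} {operator} "
           f"SELECT name, birth_date FROM {right} ORDER BY name")
    return conn.execute(sql).fetchall()

overlap = region("clinic_a", "INTERSECT", "clinic_b")   # in both circles
only_a = region("clinic_a", "EXCEPT", "clinic_b")       # in A but not in B
everyone = region("clinic_a", "UNION", "clinic_b")      # in either circle
empty = region("clinic_c", "INTERSECT", "clinic_a")     # no members at all

print(overlap, only_a, everyone, empty)
```

A visual tool would let the user shade a region of the diagram and would generate the corresponding statement behind the scenes. The last query returns no rows, which matches the point that a circle may be empty.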

Referent Dataset
The referent file should be constructed so as to facilitate the use of an iterative matching algorithm, programmed in SQL (Structured Query Language). Use of the referent file will always be limited to returning rows of information. A complete row of information that exactly matches the name and birth date being entered may be returned from the referent file. If there is no exact match, near matches of names and birth dates (and, at the user’s discretion, associated rows of information) may be returned to the calling program. The referent file will be constructed so as to return appropriate and correct answers at maximum speed.
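
The lookup described above might be sketched as follows. This is only an illustration under stated assumptions: the table layout, the sample rows, the lookup function, and the 0.85 cutoff are hypothetical, and the near-match step uses the string similarity ratio from Python's difflib as a stand-in, since the actual iterative matching algorithm is not spelled out here.

```python
import sqlite3
from difflib import SequenceMatcher

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE referent (name TEXT, birth_date TEXT, info TEXT)")
conn.executemany("INSERT INTO referent VALUES (?, ?, ?)", [
    ("JOHN SMITH", "1950-03-12", "row 1"),
    ("JON SMITH", "1950-03-12", "row 2"),
    ("MARY JONES", "1971-08-04", "row 3"),
])
# An index on the lookup columns keeps the exact-match step fast.
conn.execute("CREATE INDEX idx_referent ON referent (name, birth_date)")

def lookup(name, birth_date, cutoff=0.85):
    """Return ('exact', rows) or ('near', rows); rows may be empty."""
    exact = conn.execute(
        "SELECT name, birth_date, info FROM referent "
        "WHERE name = ? AND birth_date = ?",
        (name, birth_date),
    ).fetchall()
    if exact:
        return "exact", exact

    # No exact match: score every row against the combined name and birth date.
    key = f"{name}|{birth_date}"
    scored = []
    for row in conn.execute("SELECT name, birth_date, info FROM referent"):
        score = SequenceMatcher(None, key, f"{row[0]}|{row[1]}").ratio()
        if score >= cutoff:
            scored.append((score, row))
    scored.sort(key=lambda pair: pair[0], reverse=True)  # best match first
    return "near", [row for _, row in scored]

print(lookup("JOHN SMITH", "1950-03-12"))
print(lookup("JOHN SMYTH", "1950-03-12"))
```

The function only reads from the table and returns rows, in keeping with the rule that use of the referent file is limited to returning information. Scanning every row for near matches is the simplest approach but not the fastest; a production design would need a narrower candidate search.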

(to be continued)
