You are on page 1of 23

Medical Informatics: Searching for Underlying Components

Mark A. Musen Stanford Medical Informatics Stanford University School of Medicine Stanford, California 94305-5479 USA (650) 725-3390 Fax: (650) 725-7944 musen@Stanford.EDU

Yes, well, you may well ask, what is my theory? Anne Elk, in a skit by Monty Python, 1972

Objective: To discuss unifying principles that can provide a theory for the diverse aspects of work in medical informatics. If medical informatics is to have academic credibility, it must articulate a clear theory that is distinct from that of computer science or of other related areas of study. Results: The notions of reusable domain ontologies and problem-solving methods provide the foundation for current work on second-generation knowledge-based systems. These abstractions also are attractive for defining the core contributions of basic research in informatics. We can understand many central activities within informatics in terms defining, refining, applying, and evaluating domain ontologies and problem-solving methods. Conclusion: Construing work in medical informatics in terms of actions involving ontologies and problem-solving methods may move us closer to a theoretical basis for our field.

Keywords: Medical informatics, computing methodologies, artificial intelligence, academic training, professional training

How do we define our field? There is wide recognition that the management of medical information is central to the future of both health care and the worlds health. Medical professional societies everywhere point to medical informatics as an essential element of medical practice. The word informatics increasingly is used as an adjective to sell the latest health-care technology. Vendors of clinical information systems hope to impress customers by claiming that their products are based on informatics techniques and informatics methods. The recent report of the Institute of Medicine on the problem of medical errors specifically identified information technology as a large element of the needed solution [1]. For the first time in its history, the American Society for Clinical Investigationthe major honor society for academic physician-scientists in the United Statesnow recognizes medical informatics as a specific area of investigation. With the concept of informatics finally entering the mainstream, one would think that these would be heady days for our research community. My problem, however, is that I am still sure not what medical informatics is.

The term medical informatics has become a universally applied term that, sadly, remains difficult to define. The latest edition of Shortliffes textbook defines the discipline as, a field of study concerned with the broad issues in the management and use of biomedical information, including medical computing and the study of the nature of medical information [2]. Jan van Bemmel and I defined informatics as the science that studies the use and processing of data, information, and knowledge, and medical informatics as informatics applied to medicine, health care, and public health [3]. Shortliffes definition, by couching medical informatics in terms of unspecified broad issues, leaves the dimensions of the discipline intentionally vague. This lack of specificity has the advantage of defining the field to be maximally inclusive. Van Bemmel and I, in using the word science to describe informatics, implicitly make strong claims about the existence of an underlying theory and of an empirical methodology that can explore potentially defeasible hypotheses. As an academic, I very much want to believe that medical informatics is a science, or at least that it has scientific elements. It remains a challenge for our community, however, both to articulate the underlying theory of informatics and to embrace an empirical approach to validating hypotheses.

When most people talk about medical informatics, they speak not of theories or of methodologies, but simply of computer applications in medicine. Terms such as clinical computing permeate our culture. Descriptions that emphasize the computational aspects of our work are indeed much more understandable to most people than is the word informaticswhich even my spell-checker insists is not a real word. The problem for our fieldand the problem for more entrenched disciplines such as computer scienceis that the emphasis on the role of the computer in defining our discipline is that we highlight the technology that facilitates our work, rather than putting the focus squarely on the work itself. We dont call probability theory dice science; we dont call the study of literature pencil science. We easily seem to be able to call medical informatics medical computer science, however.

Medical Informatics and Computer Science Our colleagues in traditional departments of computer science remain confused about the relationship between work in our community and that in their own. My own fellow faculty members at Stanford University have struggled to understand what motivates researchers in my group that might be different from what inspires workers in the Computer Science Department. My colleagues also wonder why the rather close relationship shared by medical informatics and computer science at Stanford does not seem to be replicated at many other universities.

Traditionally, workers in medical informatics have been somewhat estranged from their colleagues in computer science. They have worked in different faculties, often quite separated geographically. The two disciplines also have been quite separated intellectually, despite the fact that both are dependent, ultimately, on computational principles. The result is that the medicalinformatics community as a whole has been extremely slow to build on the advances that have been achieved in computer science. In the 1970s, for example, the computer-science community was discovering structured programming and abstract data types; at the same time, the medicalinformatics community, struggling to build time-shared applications that would execute within a minuscule memory space, was still discovering MUMPS. In the 1980s, when the computerscience community was turning to relational database technology and distributed architectures, our community still was promoting hierarchical databases and mainframe systems. In the 1990s, when the computer science community had come to grips with the significant limitations of rule-

based architectures for building knowledge-based systems [4], the medical- informatics community was advocating the use of medical logic modules as the standard means for computer-based decision support. It is not that MUMPS and mainframe systems and medical logic modules did not make important contributions. Rather, the point is that, when the traditional computer-science community has made significant intellectual advances, the medicalinformatics community often has trailed in its incorporation of these results. Our colleagues outside our community thus have not tended to view our work as advancing the scienceor at least as advancing their science. The problem, of course, is that we in medical informatics have not done a very good job of articulating what our own science is.

It is significant, however, that there is substantial overlap in the goals of workers in medical informatics and those of researchers in the subset of the computer-science community concerned with the development of knowledge-based systems. My goal in this paper is to show how principles for building knowledge-based systems that began to be articulated by a large community of researchers in the 1990s are particularly germane to developing a theory for medical informatics. We are a long way from defining an overarching theory for our field, but these advances from the world of knowledge-based systems can offer our own research community a more solid theoretical footingfor purposes of communicating ideas both among ourselves and among our computer-science colleagues.

Although there are notable exceptions, workers in medical informatics tend not to present their work in a manner that would allow their computer-science colleagues to identify the generalizable results. It is often difficult for us to present our work detached of the contributions that we make specifically to biomedicine. Work in medical informatics, however, does lead to significant theoretical resultsparticularly when members of the medical- informatics community can elucidate the general principles and can stimulate their colleagues to build on them. A key question for our field, however, is whether medical informatics should be viewed as some admixture of generic computer science plus biomedicine. Current curricula in medical informatics suggest that much of what is traditionally taught as part of computer science is largely irrelevant to our own work. (Compiler design and computer architecture, for examples, are not at the core of what most of us define as informatics.) I believe that our goal as a

discipline should be to communicate more effectively with the academic computer-science community, not to merge with it. I believe it would be wrong to construe current work in medical informatics simply as computer science with a biomedical bent. It also would be wrong to construe medical informatics as a form of software engineering that concentrates on the development of clinical computer systems.

The Role of Building Artifacts Many laboratories for medical informatics were created during the past four decades in response to pressing needs to assist clinical information management. Early information systems for health-care organizations frequently addressed administrative and financial concerns. Rarely did such systems provide assistance in the management of clinical data. As a result, academicians stepped in to fill a niche that industry at the time was refusing to address. The need to build information systems that could acquire, store, and communicate clinical data sparked the emergence of many of the great academic groups in medical informatics. Satisfying institutional requirements for clinical data management also provided the substrate for much of the seminal research in medical informatics.

Many academic groups still justify much of their existence on the basis of the support that they provide to their institutions in the area of clinical computing. Most of those groups argue that their service role leads them to generate more practical and more meaningful results than would be possible if the groups were to perform their research and development in vitro. There is no doubt that information systems such as HELP [5], COSTAR [6], and many others have provided tremendous substrates for research at the institutions that have developed them.

I have argued elsewhere that, despite the obvious advantages of having direct access to the information infrastructure of a health-care organization, a set of service commitments no longer necessarily enhances a laboratorys work in medical informatics [7]. Fundamentally, I believe that modern software systems have become much too large and much too complex to be designed, implemented, tested, and maintained by small groups of academicians who are supported on academic budgets. It simply is inconceivable to me that an academic unit without the discipline, procedures, and personnel resources of a commercial software-engineering

organization could create and deploy a large-scale information system on the scale of HELP using todays programming conventions. It therefore seems dangerous to define medical informaticsat least in the academic sensespecifically in terms of the software artifacts that we can build for the clinical enterprise.

I recognize that many academic medical informatics groups consider their servic e role to their institutions as part of their raison dtre. I worry that this position ultimately will undermine the status of academic medical informatics in modern medical centers. Whereas once informationsystem vendors ignored the requirements of clinical data management, there now is no shortage of commercial systems that begin to address the clinical needs that sparked the growth of academic medical informatics a generation ago. There is considerable room for improvement in the capabilities of many commercial systems, but the market place is ensuring that the capabilities of clinical information systems are only getting better. Development of comprehensive electronic patient-record systems, for example, is a problem that industry is tackling with more resources (and, I believe, with more success) than is the case in any academic laboratory. Whereas once academic groups were needed to define basic structures such as HL-7 messages to assist in communication of clinical data among distributed systems and controlled medical terminologies to facilitate data entry and presentation, the expanding vendor community is now taking the lead in developing appropriate standards for data communication and management throughout the clinical enterprise. Historically, homegrown information systems, such as those at LDS Hospital [5] and Bostons Beth Israel Hospital [8], have been the bedrock of academic work in medical informatics. Inexorably, however, those homegrown systems are being unplugged and replaced by commercial installations. Even when commercial systems lack many of the features available in academic software, health-care organizations are seeking the security of installing well tested and well maintained information systems that are backed by the technical support of large corporations that have large user communities. Even if their academic mission might benefit from a service commitment, opportunities for academic groups to characterize themselves in terms of responsibilities to the information infrastructure at their own institutions are simply disappearing. Just as many health-care organizations have had to redefine their missions and strategies in the wake of a changing economic climate, academic units in medical informatics are having to redefine themselves as their affiliated health-care

organizations change. The time may soon be past when academic units in medical informatics can use the information systems of their associated clinical organizations as test beds for the development and evaluation of major software artifacts without careful negotiation with their associated clinical institutions.

The changing landscape of academic medical centers, at least in the United States, makes some of the broad issues of medical informatics to which Shortliffe [2] alludes no longer within the purview of most academic groups. As development and deployment of clinical information systems becomes squarely the province of industry, much of what used to be considered part of academic medical informatics suddenly has been taken away from university faculty. I will suggest, however, that our research contributions actually will be strengthened as we focus on the part of medical informatics that still remains solidly within the academic realm.

The Beginnings of a Science Investigators in any area of science require theories that can frame the interpretation of observations and that can provide the basis for making advances in understanding. If medical informatics is to have credibility as an academic discipline, it is essential to identify its scientific foundations. We must articulate the basic theories, and explain how our individual contributions extend those theories. It is my belief that workers in medical informatics have difficulty explaining their academic enterprise to other faculty colleaguesand to one anotherprimarily because our community has not done a good job of articulating a set of fundamental assumptions. We are excellent at describing the surface behavior of our artifacts and quite good at measuring their success. We have a long way to go, however, in being able to explain why our artifacts are successful in terms of some underlying theory or set of basic principles.

We certainly can turn to principles of computer science to justify much of our systems behavior. Theories of software engineering, of humancomputer interaction, and of computational complexity certainly are essential to understanding and evaluating computer systems built for the clinical arena. When we define the theory of medical informatics only in terms of the theory of computer science, however, medical informatics loses any claim to being a scientific discipline in its own right. If we believe that the study of medical informatics itself contributes to a basic

understanding of some aspect of the world that is intrinsically valuable, then it is imperative for us to articulate the theoretical underpinnings of medical informatics that are distinct from those of computer scienceand from those of other fields, such as information theory, biostatistics, and health-services research.

Unfortunately, our discipline has had few philosophers with a penchant for suggesting underlying theories. Blois [9] made a landmark attempt at defining a theory of medical informatics in his monograph on the nature of medical information. Blois made the claim that medicine, as an area of human endeavor, is epistemologically unique. He argued that this uniqueness arises because clinical knowledge is dependent on understanding other knowledge that can be defined only at lower levels of abstraction (e.g., that of organismal biology), which in turn can be understood only in terms of knowledge that needs to be defined at still lower levels of abstraction (e.g., that of biochemistry), and so on. Blois suggested that a vast hierarchy of informational levels creates a kind of complexity that is unknown outside of medicine. He justified medical informatics as an academic discipline on the basis of the singular need to model these different hierarchical levels when building information systems for use in clinical care.

Blois hierarchy of information levels has compelling face validity. As a theory, however, it does little to explain our successes and failures in medical informatics. It is hard to think of a specific example of where a clinical information system may have failed expressly because it neglected to consider knowledge that exists at different hierarchical levels, or could not use knowledge at one level to reason about knowledge at another level. There are many compelling examples where systems have taken good advantage of reasoning at multiple levels of abstraction [10], but demonstration of this principle does mean that the hierarchical levels of knowledge described by Blois contribute to a fundamental theory of medical informatics.

Blois clearly was on to something, however. The principal contribution in Blois work was the identification of epistemology as the core element of what makes medical informatics a cogent area for academic study. It was not the building of artifacts or the deployment of information technology in clinical settings per se that made medical informatics fitting for scientific inquiry. According to Blois, it was the elucidation of the underlying medical kno wledge required to build

such systems in the first place that demanded a theoretical foundationa foundation that clearly remains the subject of active scholarly investigation.

Most work to develop epistemological models is no longer done by philosophers, but by workers in artificial intelligence who wish to build knowledge-based systems. The very essence of creating a knowledge-based system requires the construction of a model of human problem solving and the representation of human knowledge in a computational form [11]. It is ironic that Blois himself had such doubts about the viability of much research in artificial intelligence [12]. Much of that doubt was in fact quite justified, given the overstated claims that were commonly made by many developers of expert systems at the time. In the years since Blois death, however, there has been a revolution of thought among workers in the knowledge-based systems community regarding the modeling and representation of human knowledge. Many practitioners in traditional areas of computer science are not even aware that this transformation has taken place. For workers in medical informatics, it is particularly important to understand the foundational primitives for modeling human knowledge that now are commonly used to design and implement a wide range of intelligent computer systems.

Ontologies and Problem Solving Methods It is no coincidence that, from the beginnings of both disciplines, researchers in the area of artificial intelligence have contributed much to medical informatics. The earliest knowledgebased systems, such as INTERNIST-1 and MYCIN, stimulated workers such as Blois to think critically about medical epistemology and contributed substantially to our understanding of medical knowledge representation and automated reasoning [13]. Meanwhile, as the limitations of systems such as INTERNIST-1 and MYCIN became better understood, computer scientists were inspired to develop and evaluate new ways of representing and processing knowledge within computer systems. In the past decade, the emergence of second-generation knowledgebased systems has provided more explicit and more maintainable frameworks for encoding and applying clinical knowledge [4, 14].

Most workers in the artificial intelligence community now view intelligent computer systems as comprising the following four essential conceptual components: (1) a domain ontology, which

defines the primary concepts in the application area, and the relationships among those concepts [15]; (2) a knowledge base of detailed content knowledge, which consists of a set of propositions about the world cast in terms of the domain ontology; (3) a problem-solving method, which encodes an abstract, possibly domain- independent algorithm that can automate the task for which the intelligent system has been built [16]; and (4) a set of mappings, which defines how the concepts represented in the domain ontology and corresponding knowledge base satisfy the inputoutput requirements of the particular problem-solving method [17]. Although many knowledge-based systems still are implemented using traditional rule-based shells, design methodologies such as CommonKADS [11] encourage developers to view the knowledge that such system models in terms of these four kinds of conceptual building blocks. In Europe, CommonKADS has become the de facto standard for constructing intelligent systems in industry. Many software-engineering tools for building intelligent systems, such as Protg-2000 [18], enforce the same perspective.

My own research during the past decade has followed that of the knowledge-based systems community, emphasizing how medical knowledge-based systems can be developed using these different components. For example, to build knowledge bases for the EON system for automation of guideline-based care [19], we begin with a domain ontology that characterizes the kinds of concepts that are found in typical clinical guidelines [20]. Developers instantiate that ontology to define the knowledge of particular clinical guidelines (e.g., a protocol for management of patients who have hypertension). EON includes discrete problem-solving methods that automate tasks such as (1) determining the correct therapy for a patient who is in a particular clinical situation and who is being treated according to a particular guideline or (2) reasoning about whether a particular patient might be eligible for treatment according to a given guideline. Like subroutines in a programming language, the problem-solving methods in EON have formal parameters that define the kinds of data on which the problem-solving methods operate. Each such parameter is mapped to a corresponding element of the domain ontology (e.g., the plan on which EONs therapy-determination problem-solving method operates maps to the concept of guideline in the domain ontology).

10

Ontologies and problem-solving methods potentially are highly reusable components [14]. They can be viewed as building blocks from which a variety of intelligent systems can be constructed by combining (and, when necessary, augmenting) appropriate components. In a sense, the particular ontology and problem-solving method that are used to automate a given task together define a theory for the knowledge required to solve that task. That theory is one that enumerates the domain concepts required for problem solving (the ontology) and the algorithm that must be applied to those domain concepts to achieve a solution (the problem-solving method).

The notion of viewing software as data structures and the algorithms that operate on those data structures is nothing new. The distinction between descriptions of data and specifications of procedures has been around since the early days of computer science [21]. What is new here is the view of the data descriptionsthe ontologyas having central significance and an existence independent of particular algorithms. Ontologies thus become like database schemas in that they are separable and distinct from the set of algorithms that may operate on them. Unlike traditional database schemas, however, ontologies may express extremely complex relationships among the represented conceptsemphasizing a desire to capture a rich, reusable model of the domain, rather than to provide an efficient framework for storing data instances.

It is not an overstatement to claim that the theory for constructing any clinical informationprocessing system can be understood in terms of an ontology of the information being processed and the problem-solving methods that provide the procedures by which information processing takes place. In second- generation knowledge-based systems, the ontologies and the problemsolving methods are encoded as discrete pieces of software. When building conventional software systems, the ontolo gies and problem-solving methods often exist only as conceptual entities at design time, rather than as working pieces of code (although there is typically a direct relationship between a conceptual ontology and the hierarchy of classes used to build programs in object-oriented languages such as C++ or Java). If we can shift our focus so that we view the end product of our enterprise not as the construction of software, but instead as the development of ontologies and problem-solving methods, I believe tha t we will be closer to describing a foundation for the science of informatics. We will not have defined a complete theory by any means, but we will be able to suggest what the content of such a theory might need to address.

11

When Blois claimed that the essence of medical informatics lay in understanding the hierarchical nature of clinical knowledge, he was stating that elucidating the ontology of medicine was at the heart of the informatics enterprise. Although one can quibble as to whether the essential problem is in distinguishing the various layers of knowledge described in Blois book, it is clear that a major contribution of medical informatics rests in the elucidation of the ontology needed to automate clinical tasks, as well as in the characterization of appropriate problem-solving methods that can drive the computation. If we can construe our academic discipline in terms of defining, using, and evaluating ontologies and problem-solving methods, we can move closer to articulating a theory that can allow us to present our work in terms of basic principles. Suddenly, the construction of a clinical information system is more than building a software artifactit becomes the identification and validation of an appropriate ontology and problem-solving method for the task at hand. The notion of construing the development of a clinical information system in these terms entails more than simply applying new buzz words to a familiar softwareengineering problem; by elucidating the ontology and the problem solving method, we can create a formal model of the clinical knowledge required to automate a given task, and have the potential to apply elements of that model systematically to the construction of future clinical systems.

There is a risk, however, in concentrating on these fundamental building blocks of informatics: Considerable, important work builds directly on the core elements of our discipline but itself has little to do with conceptual modeling using ontologies or problem-solving methods. Indeed, a strength of medical informatics is that it so readily can incorporate a wealth of perspectives within its interdisciplinary context. As we attempt to justify informatics as an academic enterprise in terms of its fundamental elements, it is important not to forget that basic research in our field contributes to a wide range of tools and practical applications that are worthy of study in their own right. In emphasizing basic principles, we should not diminish in any way the value of studying deployed systems. Rather, the goal simply is to clarify the elements of informatics that contribute to the basic science of our discipline and that may distinguish informatics from related areas of study.

12

Toward a Scientific Basis for Medical Informatics Casting work in medical informatics in terms of reusable ontologies and problem-solving methods gives us a helpful vocabulary for talking about the science of our discipline. Suddenly, research to create controlled clinical terminologies can be seen as work to construct new ontologies of medical descriptions that meet specific requirements [22]. Research on electronic patient record systems generally can be understood in terms of the ontologies needed to capture clinical information and to communicate that information effectively [23]. Development of improved algorithms for information retrieval, for image processing, or for signal analysis can be viewed as research to devise new problem-solving methods with enhanced performance characteristics. It may be simplistic to couch such diverse work in medical informatics in terms of these rather basic conceptual building blocks. The emphasis on underlying ontologies and problem-solving methods, however, provides a way of unifying the broad issues addressed by our discipline within a coherent framework.

It is clear that there are many elements of work in medical informatics that are not well captured by ontologies or problem-solving methods. Work to embed information systems within organizational structures and workflows, for example, seems outside of the computational elements that ontologies and problem-solving methods represent. There is no doubt that such efforts are essential to the successful deployment of clinical systems. At the same time, it is not clear what distinguishes workflow integration within clinical settings from similar activities that might be done in countless other industrial environments. The underlying theory of workflow integration is the same, whether one is deploying an information system in a hospital or a process-control system in a manufacturing plant. What is different in each case, of course, is the model of the relevant organization. Construction of that organizational model can be viewed as a problem in ontology development, and thus fits well within our framework.

One way to validate this emphasis on conceptual building blocks as fundamental to medical informatics is to identify the manner in which seminal work in our field contributes to our understanding of ontologies or problem-solving methods. There certainly is no consensus concerning what are the most significant accomplishments in medical informatics. There was considerable discussion of this topic, however, during September 2000 on the e- mail distribution

13

list of the Americ an College of Medical Informatics. Sittig [24] summarized the discussion, listing ten major achievements. Although there has been no attempt to ratify Sittigs synthesis, his list of top ten accomplishments in biomedical informatics is a convenient starting-off point for our own discussion. Let us consider each item on the list in turn.

1.

The entire MEDLARS and, more recently, MEDLINE database: The primary scientific contribution of MEDLARS is one of ontology. U.S. governmental agencies such as the Social Security Administration had begun to create and manage enormous databases before the advent of MEDLARS and MEDLINE. Work on information retrieval had been ongoing for several decades before the National Library of Medicine provided online access to the biomedical literature. What had not been done before, however, was the construction of a rich, detailed ontology (the Medical Subject Headings; MeSH) for indexing the biomedical literature; of procedures for updating and maintaining the ontology (and the associated indexes) over time; and of problem-solving methods that could use the MeSH ontology to aid information retrieval.

2.

The Unified Medical Language System: Construction of the UMLS and of each its incorporated controlled terminologies is a problem in ontology.

3.

The clinical decision support systems that work within large hospital information systems: Building a decision support system is a problem in creating an appropriate domain ontology and of linking that ontology to an appropriate problem-solving method [14]. There has been considerable, important work to evaluate the effects of such decision-support systems on the behavior of health-care workers and on patient outcomesbut the theory that underlies such investigation seems to come more from the area of health-services research than it does from informatics.

4.

MUMPS: As a programming language, MUMPS facilitated much seminal work in informatics. Construction of a programming language, however, fits more within the purview of computer science than it does informatics.

14

5.

Systems including QMR, DxPLAIN, MYCIN; software for individualizing dosage regimens of drugs: Once again, ontologies and problem-solving methods are at the core of decision-support systems [14].

6.

Fully developed electronic medical record systems: As emphasized by Kuhn and Giuse [23], the major contribution of patient record systems lies in the ontologies that allow practitioners to record clinical descriptions and to communicate those descriptions to their colleagues. There certainly are major software-engineering issues that developers of electronic medical record systems must address. The semantics of the underlying databases and of the user interfaces that capture clinical information in the first place, however, rely on extensive ontologies, development of which fits squarely within the province of informatics.

7.

The Visible Human project; the Digital Anatomist; the Slice of Life: The construction of the data set of Visible Human images was not a problem in informatics; it was a problem in gross anatomy, specimen preparation, and photography. The development of the problem-solving methods to render three-dimensional reconstructions and other visualizations of the Visible Human images clearly constitutes important work in informatics, however. The Digital Anatomist project has explored significant problem-solving methods for rendering and presenting anatomical content, and now is moving into the development of the most comprehensive, machine-processable ontology of human structure in existence [25]. The Slice of Life project, while curating and distributing an enormous library of valuable anatomical images, does not seem to be addressing fundamental research questions in informatics.

8.

The HL-7 messaging standard: The transmission of data packets on local-area networks is the stuff of electrical engineering. The description of what generic medical data look like and the development of a structure for their communication from one application to another is a problem in creating an appropriate ontology.

15

9.

Understanding the fundamental processes of medical diagnosis, therapy planning, reminder systems, uncertain reasoning (Bayesian belief nets, rule-based systems, neural networks), patients information needs, problem-oriented and time -oriented records: This item from Sittigs list lumps together a variety of major research results from different academic communities. Work in cognitive science to elucidate humanproblem solving behavior clearly informs work in medical informatics, and leads to the development of computational problem- solving methods. Research in uncertain reasoning has at its core the elaboration of problem-solving methods. Understanding patients information needs informs our ontologies. Construction of time-oriented and problem-oriented records is a problem in ontology development and evaluation.

10.

The human genome projects data: The rapid sequencing of the human genome would not have been possible without the development of new problem-solving methods to assemble long stretches of nucleotide sequences by piecing together those of relatively short fragments. Problem-solving methods to determine homologies among base sequences are central to much current work in bioinformatics. At the same time, now that most of the genome is known, the bioinformatics community recognizes the importance of building ontologies that can help to structure this information and to relate specific genes to their physiological function [26].

Informatics as an Academic Discipline At Stanford, we now teach the course that introduces first-year graduate students to the principles of biomedical informatics in terms of the development and application of domain ontologies and problem-solving methods [27]. The course begins with a discussion of controlled terminologies, introduces basic principles of knowledge representation, and then segues into several lectures on ontology development and useboth for development of clinical applications and for work in bioinformatics. We then introduce the notion of abstract problem-solving methods. We discuss computational approaches to topics such as clinical diagnosis, therapy planning, sequence searching and alignment, and molecular structure determination in terms of problem-solving methods that can operate on ontologies. We find that this approach allows us to unify many otherwise diverse ideas in medical informatics. More important, our curriculum

16

makes it clear that there is considerable methodological overlap between clinical informatics and bioinformatics. In bioinformatics, the particular ontologies and problem-solving methods may be different from those in clinical informatics, but the fundamental problems of designing and using ontologies, and of selecting and refining problem-solving methods, clearly are the same.

In recent years, computational biology and bioinformatics have captured the imagination of the scientific community and of funding agencies. Many workers in clinical informatics have expressed concern that the application of informatics to problems related to genomics and biological-structure determination will soon eclipse long-established research paradigms to develop information technology for health-care settings and for medical education. It is my conviction that the boundary between biology and medicine will become more and more blurred as genetic information becomes increasingly relevant in the treatment of patients, and as the biological functions of more and more genes become known. Regardless, basic research on ontologies and problem-solving methods is required to advance both clinical informatics and bioinformatics. As we become better able to articulate a theory of informatics (undifferentiated with respect to application area), we should see profound cross-fertilization between our work in the clinical and basic-science arenas.

When we emphasize the role of domain-specific ontologies and generic problem-solving methods in building clinical systems, we highlight a way of viewing information technology that seems surprisingly untethered to health care. We begin to beg the question of what makes medical informatics different from informatics in general. Indeed, some European universities have departments of business informatics and social-science informatics, and one can imagine the study of information technology applied to a host of professional enterprises. My perspective is that informatics (without a modifier) is a basic science that concerns the computational modeling and application of human knowledge. Informatics involves the construction of ontologies that define the concepts relevant to different aspects of human experience and the elucidation of problem-solving methods that can solve specific computational tasks. When the relevant ontologies are related to health and health care, we call the discipline medical informatics; when they are related to basic biology, we call the discipline bioinformatics; and so on. Because all subdisciplines within informatics rely on the same kinds of fundamental

17

building blocks, basic advances in one branch of informatics will enhance all the others. At the same time, the historically fierce debates concerning whether our field should be called medical informatics or health informatics seem less important when our focus is on the unifying methodology. Basic research in clinical informatics, nursing informatics, and bioinformatics is pretty much the same thing.

There are important distinctions to be made, however. Each subdiscipline of informatics is unique because it needs to model its own set of professional activities and to define its own set of ontologies. Blois was correct: Medicine is complex and the ontology of medical knowledge is multilayered and multifaceted. Medical informa tics deserves recognition as a specific discipline (distinct from other forms of informatics) because of the unusual intricacy of the ontologies that drive our systems. The practitioners of our craft need to understand not only the basic principles of informatics in general, but also the details of clinical practice that can make modeling the knowledge of health care such a thorny problem.

If there is a slogan that characterizes why informatics is different from computer science, it is ours is the discipline that cares about the content. Although software engineers of all kinds certainly need to incorporate relevant domain knowledge into their program code, the conceptual modeling and computational representation of domain knowledge and data is the centerpiece of our discipline. Computer scientists always can work in tandem with application specialists to build useful software. Workers in informatics, however, play a role that is more than that of software engineer and more than that of domain informant: Informaticians are domain modelers who are primarily driven by a desire to get the content knowledge right.

Computer scientists, particularly those who work in the area of knowledge-based systems, surely understand the importance of carefully modeling and applying content knowledge; they simply do not work at these tasks as a fulltime job. Although many academic computer scientists study questions such as the representation and management of large-scale ontologies, the use of ontologies by particular problem solvers, and the computational performance of alternative algorithms for solving specific problems, such research questions are core to the science of informatics. Research in informatics certainly includes the study of more applied questions, such

18

as those of system development, deployment, and evaluation [28]. Indeed, it often is only via systems-level evaluations that we can test the success of our basic models. If we are to claim that informatics is an academic discipline somehow distinct from computer science and information science, however, I believe that we need to highlight the conceptual modeling of content knowledge as the basic, fundamentally special element of our research agenda.

There is no doubt that the construction of the rich domain ontologies required for robust clinical systems is difficult and that many of the problem-solving methods required for many biomedical application systems represent significant computational challenges. It is precisely because the modeling work in biomedical informatics is so hard and requires so much innovation that our research has such high potential to be transferable to other areas outside of biomedicine that also require rich conceptual models. For our work to be generalizable, however, we must learn to couch our contributions in terms of primitives that transcend our particular application domain. The notions of domain-specific ontologies and of problem-solving methods are excellent initial candidates both to frame our research hypotheses and to communicate our results to other investigators.

Acknowledgments Russ Altman, Patti Brennan, Milton Corn, Charles Friedman, and Michael Kahn provided extremely helpful comments on a previous draft of this manuscript. I only wish that I could have adequately reflected on all their good ideas in this paper.

19

References

1.

Kohn, L.T., Corrigan, J.M., and Donaldson, M.S. (eds.). To Err Is Human: Building a Safer Health System. Washington: National Academy Press, 2000.

2.

Shortliffe, E.H., Perreault, L.E., Wiederhold, G., and Fagan, L.M. (eds.). Medical Informatics: Computer Applications in Health Care and Biomedicine, 2nd Edition. New York: SpringerVerlag, 2001.

3.

Van Bemmel, J.H. and Musen, M.A. (eds.) Handbook of Medical Informatics. Heidelberg: SpringerVerlag, 1997.

4.

David, J.-M., Krivine, J.-P., and Simmons, R. (eds.). Second Generation Expert Systems. Berlin: SpringerVerlag, 1993.

5.

Kuperman, G.J., Gardner, R.M., and Pryor, T.A. HELP: A Dynamic Hospital Information System. New York: SpringerVerlag, 1991.

6.

Barnett, G.O., Justice, N.S., Somand, M.E., Barclay Adams, J., Waxman, B.D., Beaman, P.D., Parent, M.S., Van Deusen, F.R., and Greelie, J.K. COSTAR system. Proceedings of the IEEE 67(9):12261237, 1979.

7.

Friedman, C.P. (organizer and moderator), Frisse, M.E., Musen, M.A., Slack, W.V., and Stead, W.W. (participants). How should we organize to do informatics? Report of the ACMI debate at the 1997 AMIA Fall Symposium. Journal of the American Medical Informatics Association 5:293304, 1998.

8.

Safran, C., Slack, W.V., and Bleich, H.L. Role of computing in patient care in two hospitals. M.D. Computing 6(3):141148, 1989.

20

9.

Blois, M.S. Information and Medicine. Berkeley: The University of California Press, 1984.

10.

Patil, R.S., Szolovits, P., Schwartz, W.B. Causal understanding of patient illness in medical diagnosis. In: Proceedings of the Seventh International Joint Conference on Artificial Intelligence. Vancouver, British Columbia, August 2428, pp. 893899, 1981.

11.

Schreiber, A.T., Akkermans, J.M., Anjewierden, A.A., De Hoog, R., Shadbolt, N.R., Van de Velde, W., and Wielinga, B.J. Knowledge Engineering and Management: The CommonKADS Methodology. Cambridge, MA: The MIT Press, 2000.

12.

Blois, M.S. Clinical judgment and computers. New England Journal of Medicine 303(4):192197, 1980.

13.

Clancey, W.R. and Shortliffe, E.H. Readings in Medical Artificial Intelligence: The First Decade. Reading, Massachusetts: Addison Wesley, 1984.

14.

Musen, M.A. Scalable software architectures for decision support. Methods of Information in Medicine 38:229238, 1999.

15.

Chandrasekaran, B., Josephson, J.R., Benjamins, V.R. What are ontologies, and why do we need them? IEEE Intelligent Systems 14(1):2026, 1999.

16.

Chadrasekaran, B., Johnson, T.R., and Smith, J.W. Task-structure analysis for knowledge modeling. Communications of the ACM. 35(9):124137, 1992.

17.

Gennari, J.H., Tu, S.W., Rothenfluh, T.E., and Musen, M.A. Mapping domains to methods in support of reuse. International Journal of HumanComputer Studies 41:399 424, 1994.

21

18.

Noy, F.N., Sintek, M., Decker, S., Crubzy, M., Fergerson, R.W., and Musen, M.A. Creating Semantic Web contents with Protg-2000. IEEE Intelligent Systems 16(2):60 71, 2001.

19.

Musen, M.A., Tu, S.W., Das, A.K., and Shahar, Y. EON: A component -based approach to automation of protocol-directed therapy. Journal of the American Medical Informatics Association 3:367388, 1996.

20.

Musen, M.A. Domain ontologies in software engineering: Use of Protg with the EON architecture. Methods of Information in Medicine 37:540550, 1998.

21.

Wirth, N. Algorithms + Data Structures = Programs. Englewood Cliffs, New Jersey: PrenticeHall, 1976.

22.

Musen, M.A., Wieckert, K.E., Miller, E.T., Campbell, K.E., and Fagan, L.M.. Development of a controlled medical terminology: Knowledge acquisition and knowledge representation. Methods of Information in Medicine 34:8595, 1995.

23.

Kuhn, K.A., and Giuse, D.A. From hospital information systems to health information systemsproblems, challenges, perspectives. Methods of Information in Medicine 40:275287, 2001.

24.

Sittig, D. Top 10 accomplishments in biomedical informatics. Electronic mail posting to acmi-discussion@mail.amia.org, September 24, 2000.

25.

Rosse, C.R., Mejino, J.L., Modayur, B.R., Jakobovits, R., Hinshaw, K.P., and Brinkley, J.F. Motivation and organizational principles for anatomical knowledge representation: The Digital Anatomist symbolic knowledge base. Journal of the American Medical Informatics Association. 5:1740, 1998.

22

26.

The gene ontology consortium. Gene ontology: tool for the unification of biology. Nature Genetics. 1:2529, 2000.

27.

Musen, M.A. Design and use of clinical ontologies: Curricular goals for the education of health-telematics professionals. In: Iakovidis, I, Maglavera, S., and Trakatellis, A. (eds.) User Acceptance of Health Telematics Applications: Education and Training in Health Telematics. Amsterdam: IOS Press, pp. 4047, 2000.

28.

Friedman, C.P. Wheres the science in medical informatics? Journal of the American Medical Informatics Association. 2: 6567, 1995.

23

You might also like