
Medical terminologies

Medical terminologies, nomenclatures, coding and classification systems: an introduction


Clinical vocabularies, terminologies or coding systems are structured lists of terms which, together with their definitions, are designed to describe unambiguously the care and treatment of patients. Terms cover diseases, diagnoses, findings, operations, treatments, drugs, administrative items etc., and can be used to support recording and reporting a patient's care at varying levels of detail, whether on paper or, increasingly, via an electronic medical record.

A nomenclature is a relatively simple system of names; a vocabulary is a system of names with explanations of their meanings; a classification is a systematic organisation of things into classes; and a thesaurus (such as MeSH) is designed to index medical literature and support search over bibliographic databases. But many of the terms used in this field can prove difficult to define accurately, and their use in practice can be inconsistent. For more detailed introductory information and discussion on medical terminologies, we refer readers to the tutorial by Jeremy Rogers of Manchester University (see link below), and to Part 5 (chapters 16-18) of Enrico Coiera's Guide to Health Informatics, entitled 'Language, coding and classification'.

Medical coding and classification systems form part of current moves towards implementing a standardised "language for health": a common (computerized) medical language for global use. The US Institute of Medicine 2003 report, Patient Safety: Achieving a New Standard for Care, highlights the importance of terminologies to healthcare and provides the following summary of their purpose and a likely outcome of current efforts in the field:
"If health professionals are to be able to send and receive data in an understandable and usable manner, both the sender and the receiver must have common clinical terminologies for describing, classifying, and coding medical terms and concepts. Use of standardized clinical terminologies facilitates electronic data collection at the point of care; retrieval of relevant data, information, and knowledge; and reuse of data for multiple purposes (e.g., disease surveillance, clinical decision support, patient safety reporting).

"No single terminology has the depth and breadth to represent the broad spectrum of medical knowledge; thus a core group of well-integrated, non-redundant clinical terminologies will be needed to serve as the backbone of clinical information and patient safety systems." [Patient Safety: Achieving a New Standard for Care P14-15 (OC)]
Some discussion points:

  • A very large number of coding and classification systems have been developed for healthcare.
  • Many standards have been proposed but widespread adoption has been slow.
  • Current standards tend to compete.
  • Existing medical vocabularies vary in their coverage and completeness.
  • Many classifications overlap.
  • Historically, vocabulary and classification systems have been designed to meet different and specific goals. Many codes have been designed mainly to support administration (e.g. billing) and so have typically included, for example, only a limited number of diagnosis codes for each encounter. Widely used but essentially administration-oriented systems, such as ICD, have been mandated by government agencies and/or payor organizations but capture clinical data at an insufficient level of detail to support clinical needs that lie outside the limited range of activities they were designed to support.
  • Systems designed to cover clinical information have tended to cover a relatively narrow subset of healthcare, such as nursing procedures or problem lists.
  • Some systems that concentrate on coding fine-grained primary clinical data have been proprietary, custom-built, limited, difficult for clinicians to use and have resulted, in some cases, in low user acceptance.
  • Coding systems can lose clinical information.
  • It can be difficult to compare clinical coding systems.
  • One stated ideal would be a system that allowed clinicians to record primary clinical data using natural language which could be automatically turned into standardized code.
  • Interoperability is a significant problem: content, structure, completeness, level of detail, cross-mapping, taxonomy, definitions and clarity all vary between existing vocabularies.
  • A single, comprehensive standard medical terminology which would improve the automated flow of clinical information does not exist - though remains a goal for many.
  • Many established medical coding systems lack a precise semantic underpinning. (The recent emergence of description logic encoded medical terminologies - particularly SNOMED CT - aims to address this problem.)
  • Comprehensive clinical terminology systems are needed to help integrate patient data with health information technologies such as electronic medical records. SNOMED CT aims to help structure and computerize the medical record but needs to be used correctly and consistently to preserve data quality and maximise shareability.
  • Integration of electronic patient records and medical terminologies with decision support systems is being researched, for example by the SAGE project in the USA and as part of the DeGeL digital guideline library infrastructure (Israel/USA).
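
The points above about information loss and cross-mapping can be illustrated with a minimal Python sketch. All identifiers, terms and class codes below are invented for illustration and are not taken from ICD, SNOMED CT or any other real system; the sketch simply assumes a many-to-one map from fine-grained clinical concepts onto a coarse, administration-oriented class.

```python
from collections import defaultdict

# Hypothetical fine-grained clinical concepts (identifiers and terms are invented).
CLINICAL_CONCEPTS = {
    "C0001": "fracture of left distal radius, displaced",
    "C0002": "fracture of right distal radius, undisplaced",
    "C0003": "fracture of left ulnar shaft",
}

# Hypothetical many-to-one map onto a coarse administrative class.
# "A10" is an invented class standing for something like "forearm fracture".
TO_ADMIN_CLASS = {
    "C0001": "A10",
    "C0002": "A10",
    "C0003": "A10",
}


def to_admin(concept_id: str) -> str:
    """Map a fine-grained concept to its coarse administrative class."""
    return TO_ADMIN_CLASS[concept_id]


def candidates_for(admin_class: str) -> list[str]:
    """Return every fine-grained concept that collapses into the given class."""
    reverse = defaultdict(list)
    for concept_id, cls in TO_ADMIN_CLASS.items():
        reverse[cls].append(concept_id)
    return sorted(reverse[admin_class])


recorded = "C0001"                      # what the clinician actually meant
stored = to_admin(recorded)             # what an administrative record keeps
recoverable = candidates_for(stored)    # what can be reconstructed afterwards

# The coarse class alone cannot say which candidate was originally recorded.
print(stored, recoverable)
```

Because several distinct concepts share one administrative class, laterality and displacement cannot be recovered from the stored code. This is one way in which a system designed for billing or statistics can fall short of clinical needs.
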
Recent work

  • Medical terminologies are evolving from relatively "simple code-name-hierarchy structures, into rich, knowledge-based ontologies of medical concepts" [Cimino, 2001]
  • Recent work has aimed to build shareable and reusable computerised vocabularies (such as GALEN)
  • Semantic tools are under development to help end users manage vocabularies.
  • The more recent emergence of Description Logic encoded medical terminologies has the potential to facilitate transition to the Semantic Web and pave the way for integration of medical computer systems into E-Grid networks.
  • Cimino [Cimino, 1998], continuing work published in [Cimino, 1989], has listed a set of desirable features for computerised medical vocabularies in the 21st Century offering the following propositions:
    • Specify multiple hierarchies (rather than lists)
    • Maintain formal definitions (replacing informal descriptions)
    • Stress structural and knowledge representation issues (not just expanding content)
    • Maintain systematic approaches to updating vocabularies
    • Ensure domain completeness
    • Eliminate "not elsewhere classified" terms (which may be introduced to deal with incompleteness in a vocabulary)
    • Allow for graceful evolution in vocabulary design to support inclusion of new developments in healthcare and error correction.
    • Be able to recognise redundancy where the same information is expressed in different ways.
    • Ensure concept permanence: don't delete a concept or change its meaning
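    • Use nonsemantic concept identifiers: an identifier should carry no meaning of its own (for example, it should not encode the concept's position in a hierarchy)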
    • Support multiple granularities to meet the differing needs of different users
    • Maintain multiple consistent views: ensure consistency over different views of a hierarchy
    • Represent context-specific information to maintain the relationship between a concept and the context in which it is used.
Introductory and general references

E. Coiera. The Guide to Health Informatics (2nd Edition). Arnold, London, October 2003.

[Chapter 17 sample chapter - freely available: Healthcare Terminologies and Classification Systems]

Other relevant chapters in this book include:
  • Chapter 16: Terms, codes and classification
  • Chapter 18: The trouble with coding
This sample chapter (17), from Part 5 of Coiera's book (Language, Coding and Classification), covers the history of medical classification systems and nomenclatures; discusses the characteristics, benefits and limitations of a number of specific systems (ICD, SNOMED ...); and looks at valid (and invalid) ways of comparing them.

J.H. van Bemmel, M.A. Musen (Editors). The Handbook of Medical Informatics. Springer-Verlag, New York, 1998.

[V3.3 on MIEUR website]

Wyatt JC, Liu JL. Basic concepts in medical informatics. J Epidemiol Community Health. 2002 Nov;56(11):808-12.

[PubMed]   [JECH Online]


" This glossary defines terms used in the comparatively young science of medical informatics. It is hoped that it will be of interest to both novices and professionals in the field. "

de Lusignan S. Codes, classifications, terminologies and nomenclatures: definition, development and application in practice. Inform Prim Care. 2005;13(1):65-70.

[PubMed]

" The Primary Care Informatics Working Group of EFMI is working to help develop the core theory of primary care informatics (PCI). Codes, classifications, terminologies and nomenclatures form an important part of the science of PCI, as they allow clinical information to be readily stored and processed in information systems. This article provides definitions and a history of the International Classification for Primary Care (ICPC), and of the Read code and the Systematized Nomenclature for Medicine (SNOMED). The Working Group wishes to encourage shared definitions and an understanding of the practical application of structured data to improve quality in clinical practice. "

Reviews, comparisons and issues
Rector AL. Clinical Terminology: Why is it so hard? Methods of Information in Medicine 1999;38:239-252.

[PubMed]   [Schattauer]

Abstract " Despite years of work, no re-usable clinical terminology has yet been demonstrated in widespread use. This paper puts forward ten reasons why developing such terminologies is hard. All stem from underestimating the change entailed in using terminology in software for 'patient centred' systems rather than for its traditional functions of statistical and financial reporting. Firstly, the increase in scale and complexity are enormous. Secondly, the resulting scale exceeds what can be managed manually with the rigour required by software, but building appropriate rigorous representations on the necessary scale is, in itself, a hard problem. Thirdly, 'clinical pragmatics'--practical data entry, presentation and retrieval for clinical tasks--must be taken into account, so that the intrinsic differences between the needs of users and the needs of software are addressed. This implies that validation of clinical terminologies must include validation in use as implemented in software. "

Rector A. Terminology, codes and classifications in perspective: the challenge of re-use. Br J Healthcare Comput Info Manage 2000; 17(3): 20–3.

Despite years of effort at designing clinical coding systems, results have been unsatisfactory. Alan Rector explains the challenges.

[BJHC]

" Many healthcare organisations expend major resources on terminology-related problems, and the NHS is embarking on a major collaborative project. Despite efforts extending over two decades, however, there is still little consensus. Part of the explanation is that we have underestimated the magnitude of the changes implied by the strategy of deriving most information from support for patient care — from use by people to use by software, from single purpose use to multipurpose re-use, and from entry by coding staff to entry by healthcare professionals. This paper explores some of the implications of these changes and some responses to them. "
Cimino JJ. Review paper: coding systems in health care. In: van Bemmel JH, McCray AT, eds. IMIA Yearbook of Medical Informatics 1995. Stuttgart, New York: Schattauer, 1995. Reprinted in Methods of Information in Medicine; 1996;35(4-5):273-284.

[Find Paper]

"Computer-based patient data which are represented in a coded form have a variety of uses, including direct patient care, statistical reporting, automated decision support, and clinical research. No standard exists which supports all of these functions. Abstracting coding systems, such as ICD, CPT, DRGs and MeSH fail to provide adequate detail, forcing application developers to create their own coding schemes for systems. Some of these schemes have been put forward as possible standards, but they have not been widely accepted. This paper reviews existing schemes used for abstracting, electronic record systems, and comprehensive coding. It also discusses the remaining impediments to acceptance of standards and the current efforts to overcome them, including SNOMED, the Gabrieli Medical Nomenclature, the Read Clinical Codes, GALEN, and the Unified Medical Language System (UMLS). "
Cimino JJ. Desiderata for Controlled Medical Vocabularies in the Twenty-First Century. Methods Inf Med. 1998 Nov;37(4-5):394-403.


" Builders of medical informatics applications need controlled medical vocabularies to support their applications and it is to their advantage to use available standards. In order to do so, however, these standards need to address the requirements of their intended users. Over the past decade, medical informatics researchers have begun to articulate some of these requirements. This paper brings together some of the common themes which have been described, including: vocabulary content, concept orientation, concept permanence, nonsemantic concept identifiers, polyhierarchy, formal definitions, rejection of "not elsewhere classified" terms, multiple granularities, multiple consistent views, context representation, graceful evolution, and recognized redundancy. Standards developers are beginning to recognize and address these desiderata and adapt their offerings to meet them. "

Cimino JJ, Hripcsak G, Johnson SB, et al. Designing an introspective, multipurpose, controlled medical vocabulary. Proc 13th Annu Symp Comput Appl Med Care. 1989:513-8.

Zielstorff RD. Characteristics of a good nursing nomenclature from an informatics perspective. Online Journal of Issues in Nursing. Sept. 30, 1998.

[Online Journal of Issues in Nursing]

" The purpose for which a nomenclature is designed dictates its characteristics. Very few clinical nomenclatures have been designed for use in automated record systems. For this reason, system designers have had to adapt existing nomenclatures and classification systems for use in the automated systems they develop. Researchers have delineated the characteristics of a "good" nomenclature for purposes of structured data capture, storage, analysis, and reporting. Some of these characteristics are: domain completeness, granularity, parsimony, synonymy, non-ambiguity, non-redundancy, clinical utility, multiple axes, and combinatorial. In addition, the terms should have unique and context-free term identifiers, each term should have a definition, terms should be arranged hierarchically with the ability to have multiple parents, and it must be possible to map terms to other standard classifications. These concepts are defined and rationalized in the context of the functions expected of an automated record system. "
Campbell JR, Carpenter P, Sneiderman C et al. Phase II evaluation of clinical coding schemes: completeness, taxonomy, mapping, definitions, and clarity. CPRI Work Group on Codes and Structures. J Am Med Inform Assoc. 1997 May-Jun;4(3):238-51.

[PubMed Central]

" OBJECTIVE: To compare three potential sources of controlled clinical terminology (READ codes version 3.1, SNOMED International, and Unified Medical Language System (UMLS) version 1.6) relative to attributes of completeness, clinical taxonomy, administrative mapping, term definitions and clarity (duplicate coding rate). METHODS: The authors assembled 1929 source concept records from a variety of clinical information taken from four medical centers across the United States. The source data included medical as well as ample nursing terminology. The source records were coded in each scheme by an investigator and checked by the coding scheme owner. The codings were then scored by an independent panel of clinicians for acceptability. Codes were checked for definitions provided with the scheme. Codes for a random sample of source records were analyzed by an investigator for "parent" and "child" codes within the scheme. Parent and child pairs were scored by an independent panel of medical informatics specialists for clinical acceptability. Administrative and billing code mapping from the published scheme were reviewed for all coded records and analyzed by independent reviewers for accuracy. The investigator for each scheme exhaustively searched a sample of coded records for duplications. RESULTS: SNOMED was judged to be significantly more complete in coding the source material than the other schemes ... SNOMED also had a richer clinical taxonomy judged by the number of acceptable first-degree relatives per coded concept ... Only the UMLS provided any definitions; these were found for 49% of records which had a coding assignment. READ and UMLS had better administrative mappings ... and SNOMED had substantially more duplications of coding assignments ... associated with a loss of clarity. CONCLUSION: No major terminology source can lay claim to being the ideal resource for a computer-based patient record. 
However, based upon this analysis of releases for April 1995, SNOMED International is considerably more complete, has a compositional nature and a richer taxonomy. It suffers from less clarity, resulting from a lack of syntax and evolutionary changes in its coding scheme. READ has greater clarity and better mapping to administrative schemes (ICD-10 and OPCS-4), is rapidly changing and is less complete. UMLS is a rich lexical resource, with mappings to many source vocabularies. It provides definitions for many of its terms. However, due to the varying granularities and purposes of its source schemes, it has limitations for representation of clinical concepts within a computer-based patient record. "
de Keizer NF, Abu-Hanna A, Zwetsloot-Schonk JH. Understanding terminological systems. I: Terminology and typology. Methods Inf Med 2000 Mar;39(1):16-21.

[Methods Inf Med]

" Terminological systems are an important research issue within the field of medical informatics. For precise understanding of existing terminological systems a referential framework is needed that provides a uniform terminology and typology of terminological systems themselves. In this article a uniform terminology is described by putting relevant fundamental notions and definitions used by standard organizations such as CEN and ISO into perspective, and interrelating them to arrive at a useful typology of terminological systems. This typology is illustrated by applying it to five well-known existing terminological systems. "
de Keizer NF, Abu-Hanna A. Understanding terminological systems. II: Experience with conceptual and formal representation of structure. Methods Inf Med. 2000 Mar;39(1):22-9.

[Methods Inf Med]

" This article describes the application of two popular conceptual and formal representation formalisms, as part of a framework for understanding terminological systems. A precise understanding of the structure of a terminological system is essential to assess existing terminological systems, to recognize patterns in various systems and to build new terminological systems. Our experience with the application of this framework to five well-known terminological systems is described. "
Strang N, Cucherat M, Boissel JP. Which coding system for therapeutic information in evidence-based medicine. Comput Methods Programs Biomed 2002 Apr;68(1):73-85.

[Comput Methods Programs Biomed]

" The coding of information in the computer representation of clinical trials is essential both for the rationalisation of the activities involved in the production of therapeutic information for evidence-based decision support and for the integration of the messages produced by these activities with clinical information and electronic patient record systems. There is no standard coding system available, however, so building on existing evaluations, we performed a simple semi-quantitative evaluation of ICD-10, CDAM, MEDDRA, MESH, READ, SNOMED and UMLS to provide objective criteria for the choice of a coding system. Inclusion and exclusion criteria for four clinical trials recorded in TriSum constituted the corpus of evaluation texts. Criteria included coding coverage, size, integration and language coverage. The results of the comparison lead us to choose SNOMED as the most appropriate coding system for our needs. The absence of a European Medical Language System project is observed, as is the need for combinatorial as opposed to enumerative systems. "

Coonan KM. Medical informatics standards applicable to emergency department information systems: making sense of the jumble. Acad Emerg Med. 2004 Nov;11(11):1198-205.

[PubMed]   [Acad Emerg Med]

" The adoption of medical informatics standards by emergency department information systems (EDISs) is not universal, despite obvious benefits. Clinicians and administrators looking to obtain an EDIS need to know exactly what the various standards can do for them and how the systems they depend on can be integrated and extended. In addition to the standard methods for systems to communicate (chiefly Health Level 7 [HL7]) and those required for submission of claims (Current Procedural Terminology [CPT]-4, International Classification of Diseases, Ninth Revision, Clinical Modification [ICD-9-CM], and X12N), there are several other available standards that are clinically useful and can greatly improve the ability to access and exchange patient information. Major advances in the Unified Medical Language System of the National Library of Medicine have made the patient medical record information standards (Systematized Nomenclature of Medicine [SNOMED], Logical Observation Identifiers, Names, and Codes [LOINC], RxNorm) easily accessible. Detailed knowledge of the arcana associated with the technical aspects of the standards is not needed (or desired) by clinicians to use standards-based systems. However, some knowledge about the commonly used standards is helpful in choosing an EDIS, interfacing the EDIS with the other hospital information systems, extending or upgrading systems, and adopting decision support technologies. "

Methods & tools
Rector A, Rossi A, Consorti MF, Zanstra P. Practical development of re-usable terminologies: GALEN-IN-USE and the GALEN Organisation. Int J Med Inf. 1998 Feb;48(1-3):71-84.


" Medical terminology is now playing a key role in medical software. This requires new techniques with which many clinical users, classification experts and applications developers are unfamiliar. There is a conflict in that the more re-usable techniques for terminology needed to support sharing of information among many different applications are more difficult to use for any one application. A layered approach to re-use is described which combines techniques from first generation systems and relatively easily understood second generation systems with the formal rigour of third generation systems to resolve this conflict. The methodology also provides a potentially rigorous approach to defining the relationship between terminology and structure in the electronic healthcare record architecture. It provides a natural migration pathway from existing systems to powerful re-usable multilingual terminologies. "
Cimino JJ. Terminology tools: state of the art and practical lessons. Methods Inf Med 2001;40(4):298-306.


" OBJECTIVES: As controlled medical terminologies evolve from simple code-name-hierarchy arrangements, into rich, knowledge-based ontologies of medical concepts, increased demands are placed on both the developers and users of the terminologies. In response, researchers have begun developing tools to address their needs. The aims of this article are to review previous work done to develop these tools and then to describe work done at Columbia University and New York Presbyterian Hospital (NYPH). METHODS: Researchers working with the Systematized Nomenclature of Medicine (SNOMED), the Unified Medical Language System (UMLS), and NYPH's Medical Entities Dictionary (MED) have created a wide variety of terminology browsers, editors and servers to facilitate creation, maintenance and use of these terminologies. RESULTS: Although much work has been done, no generally available tools have yet emerged. Consensus on requirement for tool functions, especially terminology servers is emerging. Tools at NYPH have been used successfully to support the integration of clinical applications and the merger of health care institutions. CONCLUSIONS: Significant advancement has occurred over the past fifteen years in the development of sophisticated controlled terminologies and the tools to support them. The tool set at NYPH provides a case study to demonstrate one feasible architecture. "
Cimino JJ. From data to knowledge through concept-oriented terminologies: experience with the Medical Entities Dictionary. J Am Med Inform Assoc 2000 May-Jun;7(3):288-97.

[PubMed Central]

" Knowledge representation involves enumeration of conceptual symbols and arrangement of these symbols into some meaningful structure. Medical knowledge representation has traditionally focused more on the structure than the symbols. Several significant efforts are under way, at local, national, and international levels, to address the representation of the symbols though the creation of high-quality terminologies that are themselves knowledge based. This paper reviews these efforts, including the Medical Entities Dictionary (MED) in use at Columbia University and the New York Presbyterian Hospital. A decade's experience with the MED is summarized to serve as a proof-of-concept that knowledge-based terminologies can support the use of coded patient data for a variety of knowledge-based activities, including the improved understanding of patient data, the access of information sources relevant to specific patient care problems, the application of expert systems directly to the care of patients, and the discovery of new medical knowledge. The terminological knowledge in the MED has also been used successfully to support clinical application development and maintenance, including that of the MED itself. On the basis of this experience, current efforts to create standard knowledge-based terminologies appear to be justified. "
Links

  • Introduction to Medical Terminologies and Medical Terminology Projects - a tutorial by Jeremy Rogers, Manchester University Medical Informatics Group
  • Introductory material on SNOMED Clinical Terms and medical terminologies (NHS Connecting for Health)
  • ICD: International Classification of Diseases [OC]
  • LOINC: Logical Observation Identifiers, Names and Codes [OC]
  • MeSH - Medical Subject Headings [OC]
  • UMLS - Unified Medical Language System [OC]
  • SNOMED CT: Systematized Nomenclature of Medicine - Clinical Terms [OC]
  • GALEN/OpenGALEN [OC]
  • Medical terminology-related software available for download [OC]
  • Open Source medical terminology-related software available for download [OC]
  • Public reports: medical terminologies [OC]
  • SAGE project [OC]
  • DeGeL digital guideline library [OC]
  • In German: Klassifikationen im Gesundheitswesen - Vocabularies in Germany (DIMDI) (includes downloads)
Jeremy Rogers, Medical Informatics Group, University of Manchester. NHS Connecting for Health website.
page history
Entry on OpenClinical: 10 July 2005
Last main update: 24 July 2005

© Copyright OpenClinical 2002-2011