31 No. 5
Consistency and Clarity in Chemical Concepts: How to Achieve a Codified Chemical Terminology—A Pilot Study
by Ture Damhus, Peder Olesen Larsen, Bodil Nistrup Madsen, and Sine Zambach
Consistency and clarity are essential when defining chemical concepts. However, currently available glossaries in chemistry fall short in both respects to some extent. General terminology methods involving the construction of concept systems and the specification of relations between concepts can be of use here. A pilot study indicates that the term definitions in enzyme and protein chemistry can be improved. Our aim is to open a discussion of these issues, which we hope will lead to further projects.
IUPAC has made enormous progress through the years in providing systematic chemical nomenclature and terminology. The efforts in terminology are collected in the Gold Book: The IUPAC Compendium of Chemical Terminology.1 However, the development of new terms via IUPAC projects concerned with delimited areas of chemistry has resulted in glossaries typically with a narrow focus, which means that the Gold Book has become, in principle, an uncritical compilation of glossaries established by various specialist groups for their specific needs. This entails a risk of creating inconsistencies in the overall terminology. We address this problem using fundamental principles of terminology2 to create concept systems, also referred to as terminological ontologies, as well as to attain clearer and mutually consistent definitions.
Two Types of Debatable Entries in the Gold Book
Users may find definitions in the Gold Book unsatisfactory for at least two reasons. First, a definition may be unambiguous and consistent with related definitions in the Gold Book, yet differ from what the user expects or wants. This type of entry is illustrated in Table 1, which compares definitions from the Gold Book and the Oxford Dictionary of Biochemistry and Molecular Biology,3 or ODBMB for short. Somewhat discomfortingly, textbooks and dictionaries do not agree on the definitions of a number of fundamental chemistry concepts. In those cases, Gold Book definitions, however phrased, will disagree with the opinion of large numbers of practitioners. Systematic terminology work will not solve this problem. Somebody has to make a decision, often a more or less arbitrary one, such as whether a molecule may be monoatomic or not, as illustrated in Table 1. Important as it is, this type of situation will not be dealt with in this article.
The second type comprises definitions that contain inconsistencies, even though the definitions themselves may be widely accepted in the scientific community. Such inconsistencies make it difficult to extend a set of definitions with new terms, and they may confuse nonexperts, such as students, who are trying to learn what the concepts mean. Examples of this second category are given below.
Term || Definition from Gold Book || Definition from ODBMB
molecule || An electrically neutral entity consisting of more than one atom (n > 1). Rigorously, a molecule, in which n > 1 must correspond to a depression on the potential energy surface that is deep enough to confine at least one vibrational state. || A structural unit of matter consisting of one or more atoms; the smallest discrete part of a specified element or compound that retains its chemical identity and exhibits all its chemical properties.
atom || Smallest particle still characterizing a chemical element. It consists of a nucleus of a positive charge Ze (Z is the proton number and e the elementary charge) carrying almost all its mass (more than 99.9%) and Z electrons determining the size. || A unit of matter consisting of a single nucleus surrounded by one or more orbital electrons. The number of electrons is normally sufficient to make the atom electrically neutral; adding or removing electrons converts the atom into a negative or positive ion, but this is regarded as a state of the same atom since the atom is characterized by its nucleus.
Table 1. Comparison of two basic definitions presented in the Gold Book and the ODBMB.
The Pilot Study
Our purpose in the pilot study is to create an ontology or ontologies and a term database for enzyme chemistry, based on available recommendations, in particular from the Gold Book and ODBMB.1,3 For a start, we have chosen to work in two narrow subfields: enzyme inhibition and protein structure. By using principles of terminology we intend to avoid creating inconsistencies when expanding the ontologies. We are using the concept modelling tool i-Model from i-Term;4 the system i-Term is a terminology and knowledge management system that combines facilities of a traditional term base with a concept modelling tool.
How to Define Terms
Instead of focusing on each definition in isolation, we follow the terminological method and treat the concepts as parts of a concept system (an ontology). Therefore, we need to formalize the relations between the concepts and to introduce characteristics delimiting related concepts (feature specifications, consisting of attribute-value pairs). On the basis of these feature specifications, subdivision criteria are introduced, which group concepts and thereby give a good overview. These methods are described in conference proceedings referenced below.5,6
In our work, we use an iterative process: analyzing the concepts as well as placing them in draft concept systems in the form of hierarchies or networks on the basis of their characteristics, then drafting definitions, and, finally, refining concept systems as well as definitions. In this way, we arrive at consistent definitions referring to the superordinate concept (i.e., genus proximum or nearest kind) and followed by the delimiting characteristic.
In the example shown below in Figure 1, the genus proximum is inhibition, one subdivision criterion or attribute is MECHANISM, and one of the attribute values is “a product of the reaction is the inhibitor.”
The superordinate concept and the attribute of the feature specification must be the same in definitions of subordinate concepts falling under one subdivision criterion.
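The definition pattern described above can be sketched in a few lines of Python. This is only an illustration, not the data model of i-Term: the class name `Concept`, the helper `consistent_siblings`, and the phrase "in which" used to join genus and characteristic are assumptions made for this example. The attribute values are the MECHANISM features listed in Table 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    term: str
    genus: str       # superordinate concept (genus proximum)
    attribute: str   # subdivision criterion, e.g. "MECHANISM"
    value: str       # delimiting characteristic (attribute value)

    def definition(self) -> str:
        # Genus proximum followed by the delimiting characteristic.
        # The connective "in which" is an illustrative choice.
        return f"{self.genus} in which {self.value}"


def consistent_siblings(concepts) -> bool:
    """True if all concepts share the same genus and the same attribute."""
    return len({(c.genus, c.attribute) for c in concepts}) == 1


mechanism_group = [
    Concept("allosteric inhibition", "inhibition", "MECHANISM",
            "the inhibitor binds at a place different from the active site"),
    Concept("substrate inhibition", "inhibition", "MECHANISM",
            "the substrate itself is the inhibitor"),
    Concept("product inhibition", "inhibition", "MECHANISM",
            "a product of the reaction is the inhibitor"),
]

for c in mechanism_group:
    print(f"{c.term}: {c.definition()}")
print("consistent:", consistent_siblings(mechanism_group))
```

A group that mixed attributes, for instance one MECHANISM concept and one KINETICS concept under the same genus, would make `consistent_siblings` return `False`, which is the situation the rule above is meant to catch.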
Inhibition as Kinetics and Mechanism
Figure 1 clarifies the differences between various subtypes of inhibition. Seven of these concepts fall within two groups according to the two subdivision criteria: KINETICS and MECHANISM.
The three concepts allosteric inhibition, substrate inhibition, and product inhibition differ with respect to MECHANISM, and, therefore, the definitions of these concepts should focus on mechanism. However, as may be seen from Table 2, the definitions from ODBMB do not clearly reflect this.
Term || Definition from ODBMB || Characteristic feature (Figure 1)
allosteric inhibition || Any inhibition of an enzyme by a negative allosteric effector. || MECHANISM: the inhibitor binds at a place different from the active site
substrate inhibition || The inhibition of an enzyme’s activity by its substrate by an allosteric mechanism. || MECHANISM: the substrate itself is the inhibitor
product inhibition || The inhibition of an enzymatic reaction caused by increased concentration of one or more products of that reaction. || MECHANISM: a product of the reaction is the inhibitor
Table 2. Comparison of the definitions from ODBMB and characteristic features (as shown in Figure 1) for three inhibition concepts falling under the attribute MECHANISM.
According to the terminological principles described in the conference proceedings referenced below,5,6 two concepts with the same superordinate concept must not differ with respect to more than one characteristic, except if they belong to a “polyhierarchy,” where the concepts in question have two or more superordinate concepts belonging to different subdivision criteria. In our area of study, however, concepts are often delimited by a combination of characteristics. In Figure 1, the four subordinate concepts to the concept reversible inhibition differ with respect to KINETICS, which is a composite characteristic having two feature specifications with the attributes MICHAELIS CONSTANT and MAXIMUM RATE.
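The "at most one differing characteristic" rule, and the way a composite characteristic satisfies it, can be illustrated with a small check. This is a sketch under stated assumptions: the two sibling concepts and their attribute values are common textbook descriptions chosen for illustration, not values read from Figure 1, and the function names are hypothetical.

```python
# Composite characteristic KINETICS groups two attributes (see text above).
COMPOSITES = {"KINETICS": ("MICHAELIS CONSTANT", "MAXIMUM RATE")}

# Assumed, textbook-style feature specifications (not taken from Figure 1).
siblings = {
    "competitive inhibition": {
        "MICHAELIS CONSTANT": "increased",
        "MAXIMUM RATE": "unchanged",
    },
    "noncompetitive inhibition": {
        "MICHAELIS CONSTANT": "unchanged",
        "MAXIMUM RATE": "decreased",
    },
}


def differing_attributes(a, b):
    """Attributes whose values differ between two feature specifications."""
    return {k for k in a.keys() | b.keys() if a.get(k) != b.get(k)}


def differing_characteristics(a, b, composites):
    """Same as above, but attributes belonging to a composite
    characteristic are folded into that single characteristic."""
    folded = set()
    for attr in differing_attributes(a, b):
        owner = next(
            (name for name, parts in composites.items() if attr in parts),
            attr,
        )
        folded.add(owner)
    return folded


a = siblings["competitive inhibition"]
b = siblings["noncompetitive inhibition"]
print(sorted(differing_attributes(a, b)))
print(sorted(differing_characteristics(a, b, COMPOSITES)))
```

Counted attribute by attribute, the two siblings differ twice; counted by characteristic, they differ only with respect to KINETICS, so the rule is respected.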
The diagram in Figure 1 also illustrates that a concept system may comprise hierarchical relations (type relations) as well as associative relations. Associative relations have a relation name and an arrow indicating the direction of the relation (e.g., the relation causes between inhibitor and inhibition).
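A concept system with both kinds of relations can be represented as a list of named, directed edges. The sketch below is a minimal illustration: the `Relation` tuple, the label "type" for hierarchical relations, and the helper names are assumptions, and placing reversible inhibition directly under inhibition is assumed here rather than read from Figure 1.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    source: str
    name: str    # "type" marks a hierarchical relation; anything else is associative
    target: str


relations = [
    # Hierarchical (type) relations: source is a type of target.
    Relation("reversible inhibition", "type", "inhibition"),
    Relation("product inhibition", "type", "inhibition"),
    # Associative relation with a name and a direction.
    Relation("inhibitor", "causes", "inhibition"),
]


def subtypes_of(concept, rels):
    """Concepts directly subordinate to the given concept."""
    return [r.source for r in rels if r.name == "type" and r.target == concept]


def associative(rels):
    """All relations that are not hierarchical."""
    return [r for r in rels if r.name != "type"]


print(subtypes_of("inhibition", relations))
for r in associative(relations):
    print(f"{r.source} --{r.name}--> {r.target}")
```

Because every edge carries a name, a reader can tell a subtype from a causal link, which is not possible when concepts are merely listed as "related."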
What Is Protein Structure?
Everybody dealing with protein chemistry uses the terms primary, secondary, tertiary, and quaternary structure. Nevertheless, writing a set of definitions for them that everyone can agree on is not easy. Table 3 shows the ODBMB and Gold Book definitions for the first three concepts. We also add the definitions of the structural elements α-helix and β-pleated sheet, mentioned in both sources in their definitions of secondary structure, but only defined in ODBMB.
There are several observations to be made. First of all, the two definitions of primary structure clearly do not agree on whether one is allowed to regard cross-linking as part of primary structure. This difference is of the same kind as we noted above with the definitions of atom and molecule (see Table 1). Another observation is that in each of the two sources, the style of wording varies considerably from primary over secondary to tertiary structure. ODBMB speaks of “first order of complexity in structural organization,” then of “arrangement of . . . structure” and then “level of structure.” Gold Book has “constitutional formula . . . abbreviated to sequence,” then “conformational arrangement,” and, for tertiary structure, “spatial organization.” A third comment is that the ODBMB explanations in particular contain more than is needed for defining the term; there is commentary and additional information. However, ODBMB is a dictionary and readers will expect this kind of material to be included.
The terms α-helix and β-pleated sheet appear in both sources as examples of secondary structure elements. The definitions in ODBMB look very different in that some geometrical facts about α-helices are included that are actually a consequence of the definition of the concept and the way amino acids bind to each other. The definition of β-pleated sheet looks much briefer, but it turns out that if one looks up the term β-strand appearing in the definition, this is defined again via β-conformation, the explanation for which is just as lengthy as the one for α-helix. We note in particular that a definition may rely on a term that itself needs to be defined before the first definition is understandable. Another example of this in Table 3 is conformation and conformational, which appear in several places.
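The chain of dependent definitions noted here can be made explicit by recording, for each term, the terms its definition relies on and then following those links. The sketch below is illustrative only: the dictionary `uses` encodes just the chain mentioned in the text, the empty list for β-conformation is a placeholder because its full definition is not reproduced here, and the function name is hypothetical.

```python
# Term -> terms appearing in its definition that themselves need defining.
uses = {
    "β-pleated sheet": ["β-strand"],
    "β-strand": ["β-conformation"],
    "β-conformation": [],  # placeholder: contents of this definition not modelled
}


def prerequisite_terms(term, uses):
    """Terms that must be understood before `term`, in order of discovery."""
    found = []
    queue = list(uses.get(term, []))
    while queue:
        current = queue.pop(0)
        if current == term or current in found:
            continue  # guards against circular definitions
        found.append(current)
        queue.extend(uses.get(current, []))
    return found


print(prerequisite_terms("β-pleated sheet", uses))
```

A brief definition such as the one for β-pleated sheet thus turns out to be only as short as the longest chain of definitions behind it.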
We have attempted to put the definitions on a more equal footing and to explore whether they can be satisfactorily arranged in a hierarchy. The column to the far right in Table 3 shows the definitions we suggest at the moment and the diagram proposed in Figure 2 shows the hierarchy. The concepts secondary and tertiary structure (and quaternary structure, which we have tentatively included in the diagram) have the superordinate concept conformation, while primary structure is seen as a subordinate concept to constitution; conformation and constitution are subordinate concepts to molecular structure. The characteristic feature LEVEL (inspired by one of the wordings in the sources) distinguishes the three structure terms. The job is not finished, however; we still need to resolve the issue about the disulfide bonds, and we have to deal with the definitions of conformation and of β-strand, both of which may turn out to necessitate even further definitions until we end at basic terms that may be assumed to have unique interpretations by all practitioners.
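The hierarchy just described can be written down as a simple mapping from each concept to its superordinate concept. This is a minimal sketch of the structure stated in the text, not a reproduction of Figure 2; the names `superordinate`, `genus_chain`, and `subordinates` are chosen for this example, and values of the LEVEL characteristic are deliberately left out because the text does not give them.

```python
# Concept -> superordinate concept, as described in the paragraph above.
superordinate = {
    "conformation": "molecular structure",
    "constitution": "molecular structure",
    "primary structure": "constitution",
    "secondary structure": "conformation",
    "tertiary structure": "conformation",
    "quaternary structure": "conformation",  # tentatively included
}


def genus_chain(term):
    """Superordinate concepts of `term`, from nearest kind up to the top."""
    chain = []
    while term in superordinate:
        term = superordinate[term]
        chain.append(term)
    return chain


def subordinates(concept):
    """Concepts directly below `concept`, in alphabetical order."""
    return sorted(t for t, parent in superordinate.items() if parent == concept)


print(genus_chain("primary structure"))
print(subordinates("conformation"))
```

Listing the chain for each term shows at a glance that primary structure sits under constitution while the other structure terms sit under conformation.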
All in all, the establishment of usefully worded and consistent definitions of concepts in a hierarchy is an iterative process, in which all steps are subjected to the principles explained above.
Future Steps towards a Consistent Terminology
Among the advantages of working with concept systems is that they can clarify concepts by pruning the descriptions. This is achieved by identifying the characteristics delimiting terms derived from the same superordinate concepts, which makes the differences between the concepts clear (e.g., competitive inhibition versus noncompetitive inhibition and secondary structure versus tertiary structure). Using this methodology, the definitions of the concepts themselves can be clarified as well as the relations between concepts (such as causes, inhibits, or activates), which provides a better understanding of the concepts and their use.
The online edition of the Gold Book provides the possibility of seeing the relations between concepts. Figure 3 shows an example of the relations existing in Gold Book for a number of the concepts included in our work. This facility is useful for finding additional related concepts. However, it is not possible to figure out the types of relations.
Presenting concept systems with specified concept relations and concept characteristics can be useful not only for the general chemist to understand the enzyme chemist, but also for students at all levels to obtain a more conceptual understanding of the words used in chemical textbooks and articles. Just as the periodic table can help students get an overview of electronegativity, number of electrons in each orbital shell, and atomic radii, a systematic display of the concepts used in chemistry and biochemistry may help the student toward a better understanding of the meaning of the terms involved.
Our work aims to identify and rectify inconsistencies in the two selected subject areas. We feel it is important to extend this work to larger fields. This will require cooperation among chemists and terminologists. As a first step, we are proposing a joint IUPAC-IUBMB project to extend our first results and set up guidelines for future work.
This study is still at an early stage. Therefore, we encourage readers to contact us if they want to comment, contribute, or take part in forthcoming projects.
Annemette Wenzel and Lone Bo Sisseck, both then at the DANTERM terminology centre, contributed in the early phases of this study.
- IUPAC. Compendium of Chemical Terminology, 2nd ed. (the Gold Book). Compiled by A. D. McNaught and A. Wilkinson. Blackwell Scientific Publications, Oxford (1997). XML on-line corrected version: http://goldbook.iupac.org (2006-) created by M. Nic, J. Jirat, B. Kosata; updates compiled by A. Jenkins.
- ISO 704:2000. Terminology Work–Principles and Methods. International Organization for Standardization. This standard is published by ISO TC 37, Terminology and other language resources, and is currently under revision.
- Cammack, R. et al. (2006). The Oxford Dictionary of Biochemistry and Molecular Biology, 2nd Ed. Oxford University Press. Here referred to as ODBMB.
- www.i-Term.dk, Terminology Management System developed by DANTERMcentret, a Danish Centre for Terminology at Copenhagen Business School. A demo version is available on the website.
- Madsen, Bodil Nistrup: “Terminological Ontologies—Applications and Principles.” In: Sebastian Schaffert, York Sure: Semantic Systems From Visions to Applications, Proceedings of the conference Semantics 2006, Österreichische Computer Gesellschaft, Wien, 2006, pp. 271–282.
- Madsen, Bodil Nistrup, and Hanne Erdman Thomsen: “Terminological Ontologies and Normative Terminology Work.” Proceedings of TSTT 2006—Third International Conference on Terminology Standardization and Technology Transfer.
Ture Damhus <firstname.lastname@example.org> is a researcher at Novozymes in Bagsværd, Denmark; he is secretary of the IUPAC Chemical Nomenclature and Structure Representation Division, and also a member of the Committee for Nomenclature of the Danish Chemical Society. Peder Olesen Larsen <email@example.com> is a retired chemistry professor and also a member of the Committee for Nomenclature of the Danish Chemical Society. Bodil Nistrup Madsen <firstname.lastname@example.org> is a professor at Copenhagen Business School, Denmark, Dept. of International Language Studies and Computational Linguistics, and head of the terminology centre, DANTERMcentret; she is also chair of ISO TC 37, SC 3 (systems to manage terminology, knowledge, and content). Sine Zambach <email@example.com> is a Ph.D. fellow in Computer Science at Roskilde University. She holds a master’s degree in biochemistry and bioinformatics.
last modified 14 September 2009.
Copyright © 2003-2009 International Union of Pure and Applied Chemistry.