A termbase: What you should know
Under terminology extraction tools you have probably seen some of the programs that you can buy or use for free. You can also build your own termbase in Access, Excel, or even Word. Whichever route you take, you need at least some basic knowledge of termbases.
A termbase is a database containing terminology and related information (Source: Multiterm). Terms and their related information are recorded in a terminological entry. The process of collecting, defining, translating, and storing those terms is known as terminology management. By creating a termbase and managing terminology effectively you will be able to reduce duplicate work, reduce the amount of time spent on research, and reduce errors and inconsistencies, among other benefits.
You start by reviewing documents related to a specific subject field (such as medicine) to identify term candidates that could be entered in a terminological entry. Once you have a reasonable number of terms, you create a manual or electronic file where you store all the information related to every term. That information (metadata) is ordered and recorded as data categories (properties that describe the term, also called terminological fields) including, among others, the entry term, subject field, part of speech, preferred term, deprecated (prohibited) term, admitted term, examples, definition, and usage comments (notes). Check out this comprehensive list of data categories: “CLS Framework: Listing of ISO 12620 Data Categories”.
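To make the idea of data categories concrete, here is a minimal sketch of one terminological entry in Python. The class name, the field names, and the sample values are assumptions made for illustration only; they are not taken from ISO 12620 or from any particular termbase tool.

```python
from dataclasses import dataclass, field


@dataclass
class TermEntry:
    """One terminological entry; each attribute is one data category."""
    entry_term: str
    subject_field: str
    part_of_speech: str
    definition: str
    preferred_term: str
    admitted_terms: list[str] = field(default_factory=list)
    deprecated_terms: list[str] = field(default_factory=list)
    examples: list[str] = field(default_factory=list)
    usage_note: str = ""


def filled_categories(entry: TermEntry) -> list[str]:
    """Return the names of the data categories that actually hold a value."""
    return [name for name, value in vars(entry).items() if value]


# Sample values are placeholders, not vetted terminology data.
entry = TermEntry(
    entry_term="myocardial infarction",
    subject_field="medicine",
    part_of_speech="noun",
    definition="Illustrative definition text goes here.",
    preferred_term="myocardial infarction",
    admitted_terms=["heart attack"],
)

print(filled_categories(entry))
```

Starting with a handful of categories like these and adding more only when you need them keeps the entry manageable.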
Uwe Muegge, in his article “Terminology Management: Neglect it at your own peril”, says: “ISO 12620 catalogs almost 200 possible data categories for a terminological entry. At the same time, ISO 12616 lists only three of these data categories (term, source, and date) as mandatory. For many if not most organizations, the most practical solution is a data model made up of fewer than two dozen data categories.” It is important to point out that people sometimes feel paralyzed by the overwhelming amount of information, but experts recommend that you start small and slow: little by little, but do it right. (Note: This presentation by Kelly Washburn illustrates some examples of data categories.)
Some of the golden rules or principles (also called best practices or requirements) for terminology management that should be followed are briefly described below.
Concept orientation (or single-concept principle): The first rule says that a terminological entry can describe one concept only. So, one entry or record is one concept, and it should include the information that makes the term stand out from other terms. What makes it “different” is determined by the information recorded in its fields (e.g. data categories, synonyms, abbreviations, spelling variants, translation of the term). (Source: TerminOrgs Terminology Starter Guide, page 27).
There should be one entry per concept and one designator per concept. (Designator, or designation, is defined by Pavel as “The sign denoting a concept, such as a term, phrase, abbreviation, formula or symbol. Example: water = H2O.”) Synonyms are also designators.
Concept orientation also requires that all designators for all languages are documented in one entry (Source: TermNet). Homographs (words that are spelt the same but have two or more meanings) are treated as separate entries, while synonyms are all kept together in the same entry, with the concept they share. In a termbase, concept orientation also means that all languages have equal status (Source: Terminology Starter Guide).
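The following sketch shows what concept orientation can look like in a simple data structure: one entry per concept, all designators for all languages inside that entry, and homographs split into separate entries. The concept IDs, the dictionary layout, and the function name are hypothetical, chosen only for this example.

```python
# Hypothetical layout: concept id -> {language code -> list of designators}.
termbase = {
    "C001": {  # concept: opening through which an animal takes in food
        "en": ["mouth"],
        "es": ["boca"],
    },
    "C002": {  # concept: place where a river flows into a larger body of water
        "en": ["mouth"],
        "es": ["desembocadura"],
    },
    "C003": {  # concept: water; the formula is a designator too
        "en": ["water", "H2O"],
        "es": ["agua", "H2O"],
    },
}


def concepts_for(designator: str, language: str) -> list[str]:
    """Return the ids of every concept that the designator denotes in a language."""
    return sorted(
        concept_id
        for concept_id, languages in termbase.items()
        if designator in languages.get(language, [])
    )


print(concepts_for("mouth", "en"))  # the homograph maps to two separate entries
print(concepts_for("agua", "es"))
```

Because no language is singled out as the "source" in this layout, every language has the same status inside the entry.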
Univocity principle: According to the General (or classical) Theory of Terminology, univocity means that a concept should always be designated by one term only (that is, a term should not have synonyms). Rita Temmerman has criticized this principle in her article “Questioning the Univocity Ideal”, in which she also compares the traditional and the socio-cognitive theories with respect to this principle (page 67).
Term autonomy: A termbase must contain the same group of fields for each term. In other words, if an entry for the legal term “acquittal” contains 5 specific data categories, then the other terms in that same termbase must have the same 5 data categories as “acquittal”. “Term autonomy guarantees that all terms including synonyms, abbreviated forms and spelling variants can be documented with all necessary term-related data categories.” (Source: Shaping Translation: A View from Terminological Research, by several authors). Note: You have to sign up to academia.eu to download this document.
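As a rough sketch of term autonomy, the check below verifies that every term record carries the same set of fields. The five field names and the two sample records are assumptions invented for this example.

```python
# Hypothetical set of five data categories that every term record must carry.
REQUIRED_FIELDS = {"term", "part_of_speech", "term_type", "status", "usage_note"}

term_records = [
    {
        "term": "acquittal",
        "part_of_speech": "noun",
        "term_type": "full form",
        "status": "preferred",
        "usage_note": "",
    },
    # This record is deliberately incomplete to show what the check reports.
    {"term": "verdict", "part_of_speech": "noun", "status": "preferred"},
]


def missing_fields(record: dict) -> list[str]:
    """Return the required data categories that a term record lacks."""
    return sorted(REQUIRED_FIELDS - set(record))


for record in term_records:
    gaps = missing_fields(record)
    if gaps:
        print(f"{record['term']}: missing {', '.join(gaps)}")
```

A field may be empty, as with the usage note for “acquittal”, but it has to exist so that any term can be documented in the same way.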
Data elementarity: Every information field must contain only one type of information. For example, if it is a definition field it must only include a definition (and only one); if it’s an acronym field, it must only include the acronym; if it is a preferred term field, it should not include deprecated or admitted terms; if it is the term field it should only have the term (not its abbreviation, nor synonyms, nor explanatory data, etc.). If you have two definitions for the same term (homonyms), you should create a separate entry for each one, each with its corresponding translation. For example, mouth of a river and mouth of an animal are two different things. (Source: TerminOrgs Terminology Starter Guide, page 28).
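Data elementarity is hard to verify automatically, but a simple heuristic can flag the most common slip, which is packing an abbreviation or a synonym into the term field. The sketch below is only that, a heuristic under stated assumptions: the field names, the sample entries, and the choice of "suspicious" characters (parentheses, semicolons, slashes) are all made up for illustration.

```python
import re

# Characters that often signal more than one piece of information in a field.
SUSPICIOUS = re.compile(r"[();/]")


def elementarity_warnings(entry: dict[str, str]) -> list[str]:
    """Return the names of fields that look like they hold more than one item."""
    return [name for name, value in entry.items() if SUSPICIOUS.search(value)]


# The term field also carries the acronym: not elementary.
bad_entry = {
    "term": "World Health Organization (WHO)",
    "definition": "Illustrative definition.",
}

# The same information, split into one field per type of information.
good_entry = {
    "term": "World Health Organization",
    "acronym": "WHO",
    "definition": "Illustrative definition.",
}

print(elementarity_warnings(bad_entry))
print(elementarity_warnings(good_entry))
```

A check like this produces false positives (some real terms contain slashes or parentheses), so treat its output as a list of fields to review, not as errors.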
Other principles related to database creation:
Data granularity: The level of detail contained in a unit of data. The higher the level of detail, the lower the granularity level; conversely, reduced detail results in higher granularity. (Source: Database Glossary by nwdatabase.com). Sue Ellen Wright’s presentation gives some more info here.
Data integrity: The property of the database that ensures that the data contained in the database is as accurate and consistent as possible. (Source: Database Glossary by nwdatabase.com).
Data modeling: The formalization and documentation of existing processes and events that occur during application software design and development. Data modeling techniques and tools capture and translate complex system designs into easily understood representations of the data flows and processes, creating a blueprint for construction and/or re-engineering. (Source: SearchDataManagement).
Single repository: All the information must be recorded in one database only. However, if multiple databases are involved, then a feature must be created so that all of them can be searched simultaneously. (Source: TerminOrgs Terminology Starter Guide, page 27). This term refers to the general database concept used in database management.
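When a single database is not possible, the simultaneous search mentioned above can be sketched as one function that queries every termbase and reports where a term was found. The termbase names and their contents are placeholders invented for this example.

```python
# Hypothetical termbases, each mapping a term to its definition.
termbases = {
    "legal": {"acquittal": "Illustrative legal definition."},
    "it": {"big data": "Illustrative IT definition."},
    "marketing": {"big data": "Illustrative marketing definition."},
}


def search_all(term: str) -> dict[str, str]:
    """Look the term up in every termbase and return the hits by termbase name."""
    return {
        name: entries[term]
        for name, entries in termbases.items()
        if term in entries
    }


for name, definition in search_all("big data").items():
    print(f"{name}: {definition}")
```

Seeing the same term turn up in two termbases is also a useful prompt to check whether the entries duplicate each other or describe different concepts.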
When doing terminology work, one of the tasks is deciding whether to take a descriptive or a prescriptive approach. During terminology work we are likely to use mostly the descriptive approach. The difference lies in the fact that the descriptive approach doesn’t make recommendations, it just describes the terms, while in the prescriptive approach (also known as the normative approach) we have to go through an approval stage in which the stakeholders (see section on Game Players) decide which is the preferred term or terms. That could even lead to a further stage: deciding whether the term should be standardized. Read more here.
When you work as a terminologist you will probably encounter two situations. In the first, you receive a request to look for a specific term or terms, that is, on a needs basis. That’s what we call an ad-hoc activity or approach. In the second, you carry out the terminological work systematically when you get involved in translation projects, or read a corpus to perform terminological analysis (systematic activity or approach). You will probably be faced with both scenarios. For example, a translator could ask you to look for the term ‘big data’ (ad-hoc), or you could get a project on ‘information architecture’ where you will need to look for all the terminology that appears in the documentation provided (systematic). The latter is also known as the thematic approach because you collect and describe the terminology of a given field.
There are two terms from the lexicology field that are commonly mentioned when doing terminology work: semasiology and onomasiology. In terminology, the semasiological principle refers to the activity in which you have a term and apply all the terminology steps to it: research, look for metadata, and (if working with a multilingual termbase) translate. So you go from term to concept, so to speak. This approach is used when creating dictionaries. In the onomasiological approach you have a concept and then you have to find a term for it. In the words of Wikipedia, you start with a concept and then ask for its names. It might be worth mentioning here that the General Theory of Terminology speaks in favor of an onomasiological approach, but in practice the semasiological approach is more commonly used. The thesis Towards terminology research as a practical philosophy of information: the terminology of radical constructivism as a case in point, by Philipp B. Neubauer, provides more information. [I published a separate post on semasiology and onomasiology after writing this. You can read it here.]
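The two directions can be sketched as two lookups over the same data: from a term to the concepts it may denote (semasiological), and from a concept to its names (onomasiological). The concept IDs, definitions, and function names below are hypothetical and only illustrate the direction of each lookup.

```python
from collections import defaultdict

# Hypothetical concept-oriented data: concept id -> definition and terms.
concepts = {
    "C1": {
        "definition": "Opening through which an animal takes in food.",
        "terms": ["mouth"],
    },
    "C2": {
        "definition": "Place where a river flows into a larger body of water.",
        "terms": ["mouth", "river mouth"],
    },
}


def build_term_index(concept_table: dict) -> dict[str, list[str]]:
    """Invert the concept table so that each term points to its concept ids."""
    index = defaultdict(list)
    for concept_id, data in concept_table.items():
        for term in data["terms"]:
            index[term].append(concept_id)
    return index


term_index = build_term_index(concepts)


def semasiological_lookup(term: str) -> list[str]:
    """Term to concept: return the definitions a term may refer to."""
    return [concepts[cid]["definition"] for cid in term_index.get(term, [])]


def onomasiological_lookup(concept_id: str) -> list[str]:
    """Concept to term: return the names recorded for a concept."""
    return concepts[concept_id]["terms"]


print(semasiological_lookup("mouth"))
print(onomasiological_lookup("C2"))
```

In this sketch the concept table is the primary structure and the term index is derived from it, so the semasiological lookup is simply a second way into the same concept-oriented data.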
1. Are they worthy? What terms belong in a termbase by Hanne Smaadahl. Read here
2. Source: Notes to Temmerman (1997,2000)