I am sharing with you this multilingual (five languages) terminological resource. I was captivated by this quote on their page: “Language is not an abstract construction of the learned, or of dictionary makers, but is something arising out of the work, needs, ties, joys, affections, tastes, of long generations of humanity, and has its bases broad and low, close to the ground”. Noah Webster (1758-1843).
• Compiling documents for term identification and extraction
• Uploading these terms into a collaborative platform that allows for public access and collaboration.
Once again, I couldn’t pass up the opportunity to give an overview of Prof. Schmitz’s webinar organized by SDL. As in past summaries (read here), I think the initiative by SDL Trados is worth praising, not only because it’s something that we need to see more often, but also because the instructors are top-class terminologists who are sharing with us their knowledge and passion for terminology.
Watch the 50-minute webinar by clicking on this link: http://www.sdl.com/video/guidelines-for-termbase-design/116603/. For some of the terms used in this post, I have added a link to related posts in this blog.
Prof. Schmitz started by explaining the importance of terminology for technical writers, linguists, companies, and organizations in terms of improving communication and consistency and reducing costs. A termbase also has to be designed carefully, because correcting it later is a very arduous process, and a well-designed termbase allows for data exchange and interoperability. He made an analogy with a closet: if it is well organized you can find things easily, whereas a disorganized closet makes you waste a lot of time searching for them.
Before designing a termbase you need to (1) analyze the needs and objectives, (2) specify the user groups, tasks, and workflow, (3) define the terminological data categories, (4) take into account the basic modeling principles, (5) model the terminology entry, and (6) select, adapt, or develop the software.
His presentation focused on 12 aspects of termbase design.
Although the part of speech (POS) is not a mandatory category when working with termbases (the ISO mandatory categories are term, source, and date), it is a highly recommended one, as Kara Warburton pointed out recently during her webinar “Getting value of your Excel glossaries”. In her own words, “as soon as there are different parts of speech there are different concepts”.
According to TerminOrgs, “the most important non-mandatory data category is the part of speech” and it is required for the following purposes:
- “To differentiate homonyms. For instance, port is actually two terms in English: a noun, and a verb, each of which should be recorded in its own entry. Without a part of speech value in the entry, it can be difficult to determine which term the entry represents, and therefore, how to translate it.
- To permit automated processing. The part of speech is required for automated tasks such as importing a set of entries into an existing termbase, applying grammatical filters to facilitate search and export of data, and providing the terminology as a resource to other applications such as spell checking applications.
- To enable interchange. When there is no part of speech value, it becomes necessary to discuss many of the entries with the originator in order to disambiguate their content.”
As you can see, having a clear understanding of how the POS works is key to having a coherent and efficient database. Make sure that you use it appropriately to increase the quality of your termbase, for example, to avoid writing a definition for a noun and setting the POS as a verb.
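To make the homonym point concrete, here is a minimal sketch in Python of how a POS field lets you tell the two “port” entries apart. The `TermEntry` class, the `ALLOWED_POS` picklist, and the sample definitions are my own illustrative assumptions; they do not come from TerminOrgs or from any particular termbase tool.

```python
from dataclasses import dataclass

# Hypothetical picklist of part-of-speech values; a real termbase defines its own.
ALLOWED_POS = {"noun", "verb", "adjective", "adverb"}


@dataclass(frozen=True)
class TermEntry:
    """A simplified term entry; the field names are illustrative only."""
    term: str
    pos: str
    definition: str

    def __post_init__(self):
        # Reject values outside the picklist so the POS field stays consistent.
        if self.pos not in ALLOWED_POS:
            raise ValueError(f"unknown part of speech: {self.pos!r}")


def find_homonyms(entries):
    """Return terms recorded with more than one part of speech."""
    pos_by_term = {}
    for entry in entries:
        pos_by_term.setdefault(entry.term.lower(), set()).add(entry.pos)
    return {term: sorted(pos) for term, pos in pos_by_term.items() if len(pos) > 1}


# Sample entries with made-up definitions.
entries = [
    TermEntry("port", "noun", "connection point on a device"),
    TermEntry("port", "verb", "to adapt software to another platform"),
    TermEntry("termbase", "noun", "database of terminological entries"),
]
homonyms = find_homonyms(entries)
print(homonyms)
```

Without the `pos` field, the two “port” entries would look like duplicates; with it, they are clearly two separate terms.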
The fourth webinar summary is here! Robert Muirhead was the moderator for the last webinar by SDL Trados, and the speaker was Tom Imhof from localix.biz. After three amazing webinars, it was hard to measure up to the others, but he made a great presentation that turned out to be one of my favorites. The information presented by Tom was extremely useful. A good refresher course for all of us, whether you use Trados or not!
Here is the link to the full video: https://sdl.webex.com/sdl/lsr.php?RCID=5c8a1d42845b4e49ab3c0f318b8f3006
I am aware that my recent posts summarizing these webinars have been way too long, but I truly believe that they all contain valuable information, particularly for beginners, as a great introduction to terminology work. In this webinar, Tom covered basic theory up until minute 18, and then he moved on to a practical example by creating a termbase with MultiTerm 2015. So this post will only cover the theory, and you can watch the rest starting at minute 18.
While I was writing this blog post, my friend and subscriber Simona Tigris, who holds a PhD in Philology, reminded me about homophones in a comment on my post on synonymy and polysemy. I couldn’t agree more with her. I had thought about writing a separate blog post, but when we talk about doublettes, one of the main causes seems to be the mishandling of homophones. So, after reading this blog post, I also recommend you take some time to read the links below.
Doublettes (the technical term for duplicate entries) are quite common unless you have an elephant’s memory. During a recent webinar, terminologist Barbara Inge Karsch mentioned that between 5% and 10% of entries need adjusting during maintenance. In order to keep that percentage to a minimum, we should try to do an efficient job in the early stages of term entry creation by following terminological principles.
When we deal with concepts, we also deal with terms in different forms. Dictionaries put all the concepts associated with a word in one entry, while in a termbase we register each concept in its own term entry, a key difference between lexicography and terminology.
The terminologist carries out a ranking exercise, so to speak, in which s/he has to classify synonyms as “preferred”, “admitted”, or “deprecated” terms, making sure that they are all kept in a single entry to avoid doublettes. For example, which term should you use: shortcut key, hotkey, keyboard shortcut, access key, accelerator key, keyboard accelerator?
And if a termbase is well maintained, s/he might have to replace some of them with updated forms and register the previous form as “obsolete”. For example, at one point “periodontosis” was dropped in dentistry in favor of “periodontal disease”.
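As a rough illustration of the ideas above, here is a small Python sketch that keeps all synonyms of a concept in one entry with a status, and then looks for doublettes across entries. The entry IDs, the data layout, and the choice of which term is “preferred” are assumptions made up for this example, not a recommendation and not the format of any specific tool.

```python
from collections import defaultdict

# One concept-oriented entry holds all synonyms, each with a usage status.
# The statuses assigned here are arbitrary and for illustration only.
entries = {
    "C001": {
        "keyboard shortcut": "preferred",
        "shortcut key": "admitted",
        "hotkey": "admitted",
        "accelerator key": "deprecated",
    },
    # A second entry that accidentally repeats a term already recorded in C001.
    "C002": {"Hotkey": "preferred"},
}


def find_doublettes(entries):
    """Return terms (compared case-insensitively) found in more than one entry."""
    seen = defaultdict(list)
    for entry_id, terms in entries.items():
        for term in terms:
            seen[term.casefold()].append(entry_id)
    return {term: ids for term, ids in seen.items() if len(ids) > 1}


doublettes = find_doublettes(entries)
print(doublettes)
```

A check like this only catches identical strings. Doublettes caused by spelling variants or homographs still need a terminologist’s eye.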
I was recently approached by the developer of this new multilingual database (the company is based in Ireland), and since I work at a bank I thought I’d give it a try. So far so good. You have a 3-month free subscription, so you can sign in and try it before buying. Check it out: http://www.linguafin.com. According to their website, it includes:
I was recently invited by Memsource to write this post for their blog. I hope you find it useful. Feel free to comment and share.
Read the original post here: http://blog.memsource.com/tackling-terminology-in-a-new-field-a-practical-case/
I was recently asked to do terminology work and translation on a topic that was new to me: SAP (Systems, Applications & Products in Data Processing), the enterprise software that manages business operations and customer relations. I want to share with you the steps I followed to make sure I got my terminology right.
- Sign up for expert online forums and groups: To start on the right foot, search for online groups in which experts come together and talk about what they do. To get help on SAP, I signed up for every professional forum I could find and I was surprised to see that people actually were willing to help me. One expert sent me a 16,000-word multilingual glossary!
- Follow expert blogs: I found many blogs on SAP in both English and Spanish. In one case the expert was also a trainer, and she was helpful in answering a few questions. She also had a great glossary online that I saved in my Favorites, along with the other online glossaries that I found.
- Use social media: We are lucky to live in these times when we can make new contacts and friends by actively using social media. The most valuable offer of help came from an expert in Spain whom I contacted via LinkedIn. He helped me revise a 250+ term list that I had made in Excel based on the definitions for each term. The best part was that he only made a few suggestions for changes, which meant I was indeed doing my homework right.
- Use Google Custom Search: I put all of my SAP links in a folder that allowed me to do simultaneous searches in my favorite SAP pages (help.sap, supportsap, etc.) using Google’s Custom Search.
- Gather your resources: Use Google’s advanced features (such as filetype:pdf or site:mundosap.com) to find reliable information. I found PhD and master’s theses, manuals, articles, etc. As always when dealing with the Internet, make sure your documents are written by subject-matter experts.
- Confirm your terms through corpus analysis: Convert your collected documents to .txt so that you can analyze them in a corpus analysis tool. I cannot emphasize this enough. Doing corpus analysis is critical in your terminology work. It is a great way to look at concordances and to initially confirm your terms before they get validated by the expert. In some cases, when you have the same reference document available in your working languages, you can align them and create a translation memory to use as a corpus.
- Use your CAT tool to reuse your terms for future translations: The purpose of managing your terminology effectively is being able to reuse it. Regardless of the CAT tool you use, it is key to create a termbase for your terminology. I’m sure you don’t want to see those long hours researching your terminology going to waste.
Although I can’t say I’m an expert in SAP terminology, I can assure you that following these steps made me feel confident about the final product delivered to the client.
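If you have never seen a concordance, the corpus-analysis step above can be sketched in a few lines of Python. This is only a bare-bones keyword-in-context (KWIC) view, assuming your documents are already plain text; the `concordance` function and the sample sentences are made up for illustration, and a dedicated corpus analysis tool will do far more.

```python
import re


def concordance(text, keyword, width=30):
    """Return keyword-in-context lines for each whole-word match of keyword."""
    lines = []
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        # Take up to `width` characters of context on each side of the match.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


# Made-up sample text standing in for a converted .txt document.
sample = (
    "The purchase order is created in the system. "
    "Each purchase order references a vendor. "
    "A vendor may have several open orders."
)
hits = concordance(sample, "purchase order")
for line in hits:
    print(line)
```

Seeing a candidate term lined up with its surrounding words makes it easier to judge whether it is really used the way you think before you send it to the expert for validation.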
To better understand granularity, think of coarse grains and fine grains: the finer the grain, the less matter each one contains. The same applies to data granules. Wikipedia provides a very simple example of granularity: recording your home address in one category would be coarse granularity; recording it in several categories (street address, city, postal code, country) would be finer granularity; and recording it under even more categories (apartment number, state, postal code add-on) would be finer still.
So, if we transfer that to our term entries, we could decide to add more or fewer data fields to document our terminological information. Sue Ellen Wright gives the following example applied to term entries:
- /grammar/ m,n,s (masculine noun singular) has low (coarse) granularity,
but if we divide information units into finer categories such as
- /part of speech/ = noun
- /grammatical gender/ = masculine
- /number/ = singular
then, we have high (or fine) granularity.
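The example above can be sketched in Python as a small conversion from the coarse /grammar/ value to the three finer categories. The abbreviation tables and the order of the abbreviations (gender, part of speech, number) are assumptions based only on the “m,n,s” example; a real termbase would define its own picklists.

```python
# Hypothetical abbreviation tables, inferred from the "m,n,s" example only.
GENDER = {"m": "masculine", "f": "feminine"}
PART_OF_SPEECH = {"n": "noun", "v": "verb", "adj": "adjective"}
NUMBER = {"s": "singular", "pl": "plural"}


def split_grammar(coarse):
    """Split a coarse /grammar/ value such as 'm,n,s' into fine-grained fields."""
    # Assumed order of the abbreviations: gender, part of speech, number.
    gender, pos, number = (part.strip() for part in coarse.split(","))
    return {
        "part of speech": PART_OF_SPEECH[pos],
        "grammatical gender": GENDER[gender],
        "number": NUMBER[number],
    }


fine = split_grammar("m,n,s")
print(fine)
```

Once the information sits in separate fields, you can filter or export by any one of them (for example, all nouns), which is not possible while everything is packed into a single string.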
Writing your first terminological definition might be a bit overwhelming. So how do you start? Well, many authors seem to agree that the most widely used type of definition is the intensional definition. This is just a brief introduction to the subject, so I recommend you consult the sources below for more information, especially if you are going to be writing a lot of definitions for your termbase.
First, let’s review the ideas of superordinate, subordinate, and coordinate concepts. Let’s say we have three levels: The top level is superordinate and refers to the general topic (e.g., energy), the second level is subordinate and refers to those specific concepts under the general topic (e.g., (1) renewable energy or (2) nonrenewable energy), and the third level is coordinate and refers to same-level concepts (e.g., (1) wind, solar, bio, geothermal energies, etc. or (2) fossil fuels, coal, petroleum, etc.).
When working with termbases it might be confusing at first to remember which code to use. Don’t get confused! ISO has two lists of codes (well, actually more than two, but let’s keep it simple): the language codes in ISO 639-1, “Codes for the representation of names of languages, Part 1: Alpha-2 code”, and the country codes in ISO 3166, “Codes for the representation of names of countries”.
Both consist of two letters. The language code is written in lowercase, while the country code is written in uppercase. The two can be combined to distinguish regional variants of a language.
US: United States
GB: Great Britain
en-US: American English
en-GB: British English
fr-FR: French (France)
fr-CA: French (Canada)
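Here is a minimal Python sketch of combining the two code lists. The lookup tables are tiny illustrative subsets, not the full ISO lists, and joining the codes with a hyphen is an assumption on my part (some tools use an underscore instead), so the `describe` function accepts a hyphen, an underscore, or a space.

```python
import re

# Small illustrative subsets of the ISO code lists, not the full standards.
LANGUAGES = {"en": "English", "fr": "French"}
COUNTRIES = {"US": "United States", "GB": "Great Britain", "FR": "France", "CA": "Canada"}


def make_locale(language, country):
    """Combine a language code and a country code, normalizing the case."""
    lang = language.lower()   # language codes are written in lowercase
    ctry = country.upper()    # country codes are written in uppercase
    if lang not in LANGUAGES:
        raise ValueError(f"unknown language code: {language!r}")
    if ctry not in COUNTRIES:
        raise ValueError(f"unknown country code: {country!r}")
    return f"{lang}-{ctry}"   # hyphen separator is an assumption


def describe(locale):
    """Turn a combined code such as 'fr-CA' into a readable label."""
    lang, ctry = re.split(r"[-_ ]", locale)
    return f"{LANGUAGES[lang.lower()]} ({COUNTRIES[ctry.upper()]})"


print(make_locale("EN", "us"))
print(describe("fr-CA"))
```

Normalizing the case in one place helps keep the language fields of a termbase consistent, whichever separator your tool expects.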
TermTerm is a freely accessible multilingual terminology database containing about 1,600 terms: central concepts of terminology work and definitions taken from relevant terminology standards, in German (1,350 terms), English (1,900 terms), French (950 terms), and Greek (1,100 terms). It is available in SDL MultiTerm Online and quickTerm.
The original data is the result of a collaboration between students of the MA program “Terminology and Language Engineering” of the Cologne University of Applied Sciences in Germany, the Hellenic Society for Terminology (ELETO), and elcat, an innovative e-learning system for terminology launched by the Cologne University of Applied Sciences in collaboration with selected industry partners and the International Network for Terminology (TermNet).
Over the course of several projects, these institutions found it necessary to clarify the terminology of terminology and to prepare this data in a terminological database.
Also added to my page “Terminology Terms”.