Conversion tools and difference checkers for language lovers

Here is a collection of useful tools. I have not included the usual PDF to Word or Word to PDF converters because you can easily find them online.

I have added them to my blog’s Delicious list so that you have immediate access to them. Enjoy!

Conversion tools:

TBX convert: On this page, you can convert between several glossary file types: UTX-Simple, GlossML, TBX-Glossary, and OLIF. TBX (TermBase eXchange) is a family of XML-based languages for the interchange of terminological information (called TMLs, for Terminological Markup Languages; also informally called “dialects” of TBX). All TBX dialects share a core structure in which information is represented on one of three structural levels: concept, language, and term.
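To make the three levels concrete, here is a minimal Python sketch that walks a tiny TBX-style fragment. The sample entry, its id, and the terms are made up for illustration, and the element names (termEntry, langSet, tig, term) follow the older TBX layout as I understand it; newer versions of the standard use different names, so treat them as assumptions and check your own files.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified TBX-style fragment (not a complete, valid TBX file).
SAMPLE = """
<body>
  <termEntry id="c1">
    <langSet xml:lang="en">
      <tig><term>termbase</term></tig>
    </langSet>
    <langSet xml:lang="es">
      <tig><term>base terminológica</term></tig>
    </langSet>
  </termEntry>
</body>
"""

# ElementTree exposes the xml:lang attribute under the standard XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_terms(xml_text):
    """Return (concept id, language, term) rows, one per term found."""
    rows = []
    root = ET.fromstring(xml_text.strip())
    for entry in root.iter("termEntry"):          # concept level
        for lang_set in entry.iter("langSet"):    # language level
            lang = lang_set.get(XML_LANG)
            for term in lang_set.iter("term"):    # term level
                rows.append((entry.get("id"), lang, term.text))
    return rows


rows = read_terms(SAMPLE)
for concept_id, lang, term in rows:
    print(concept_id, lang, term)
```

Each row keeps the concept id next to the language and the term, which is the same nesting the paragraph above describes.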

UTF-16 to UTF-8 Converter
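If you prefer to script this kind of conversion, here is a minimal Python sketch that uses only the standard library. It does not reflect how the online converter works internally (I do not know that); the function name and the sample word are my own.

```python
def utf16_to_utf8(data: bytes) -> bytes:
    """Re-encode UTF-16 bytes as UTF-8.

    The "utf-16" codec uses the byte-order mark (BOM), when present,
    to pick the byte order, and leaves the BOM out of the decoded text.
    """
    return data.decode("utf-16").encode("utf-8")


# Build a small UTF-16 sample in memory; Python adds a BOM when encoding with "utf-16".
sample = "término".encode("utf-16")
converted = utf16_to_utf8(sample)
print(converted.decode("utf-8"))
```

For a real file you would read the bytes with open(path, "rb") and write the result back the same way; the paths are up to you.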

From the terminologist’s toolbox: BootCaT Frontend

This corpus builder takes your links and extracts text that can be analyzed in a corpus analysis tool such as AntConc.

There is a very easy tutorial that you can follow (see link below), with screenshots explaining every step of the process. After you name your project and pick the language, you choose from three different ways to capture your text. The first option is the Simple Mode: you give the tool a list of terms (called seeds), BootCaT generates “tuples” (different combinations of those terms), and it automatically collects the URLs related to the topic. To avoid thousands of hits that might not be interesting, you can limit the search to one Internet domain. The final step is to build the corpus, which the tool saves on your computer.
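To give an idea of what “tuples” look like, here is a minimal Python sketch that combines seeds into search queries. This is only my illustration of the idea, not BootCaT’s actual code: the real tool may pick or randomize its tuples differently, and the seed list, the function name, and the site: restriction (a common search-engine convention) are my own assumptions.

```python
from itertools import combinations


def build_tuples(seeds, size=3, domain=None):
    """Combine seed terms into query strings, one per combination.

    If a domain is given, a "site:" restriction is appended to each query
    (assumed search-engine syntax, not something taken from BootCaT).
    """
    queries = []
    for combo in combinations(seeds, size):
        query = " ".join(combo)
        if domain:
            query += f" site:{domain}"
        queries.append(query)
    return queries


seeds = ["terminology", "termbase", "glossary", "corpus"]  # hypothetical seeds
queries = build_tuples(seeds, size=3)
limited = build_tuples(seeds, size=3, domain="example.org")

for query in queries:
    print(query)
```

With four seeds and tuples of three, you get four different queries, which shows why even a short seed list can lead to many URLs.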

You can see the corpus built by BootCaT in Notepad++. Apparently, the regular Windows Notepad cannot open the full corpus, but the tutorial provides a link to download Notepad++.

From the Terminologist’s toolbox: Datamundi’s term extractor tool for PDFs

This tool was recently shared by Gert Van Assche on Twitter and on The Open Mic.
You can read more about this tool on Datamundi’s page and download the executable file.
This is the description provided when you download it: “How does this tool work? This tool is a front-end to a term extraction engine running on our server. You’re using this front-end tool to instruct the engine what to collect for you and to extract the text from the PDF. This happens only when the PDF file is not encrypted or protected against text extraction. This extracted text is uploaded via FTP to our server. On this server the term extraction happens according to your wishes (with or without frequency, only multi-word terms or not, with or without generating an interactive term cloud). When done, the terms (up to 500!) are mailed to you.” See Datamundi’s page to learn more.

Please note that this tool only works for English.
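If you are curious about what options such as “with or without frequency” or “only multi-word terms” can mean in practice, here is a very naive Python sketch that counts word sequences in a text. It is not Datamundi’s engine, whose method I do not know; the function, its parameters, and the sample sentence are all hypothetical, and a real extractor would at least filter out stop words.

```python
import re
from collections import Counter


def extract_terms(text, max_words=2, multiword_only=False, top=500):
    """Return up to `top` (candidate term, frequency) pairs, most frequent first.

    Candidates are simply sequences of 1..max_words consecutive words.
    Sequences may cross sentence boundaries in this simplified version.
    """
    words = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    counts = Counter()
    for n in range(1, max_words + 1):
        if multiword_only and n == 1:
            continue  # skip single words when only multi-word terms are wanted
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts.most_common(top)


sample = "Term extraction tool. Term extraction engine."
terms = extract_terms(sample, max_words=2, multiword_only=True)
print(terms[0])
```

The top parameter mirrors the “up to 500” limit mentioned in the description, but the default here is only my choice.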

So give it a try and let me know how it worked for you.

Happy extracting!

(Please note that my posts do not endorse any company or person. On my educational blog, I share information that I think my readers will find useful.)

Terminology in the Microsoft Manual of Style for technical communicators


This is a great reference document, freely available online in PDF format (you can find it by searching for it with filetype:pdf).

The fourth edition (2012) is “your everyday guide to usage, terminology, and style for professional technical communications” and includes several sections such as “Terminology and word choice” and “Acceptable Terminology”.

Besides being an excellent guide, it gives you a general idea of how terminology is handled to keep the “Microsoft voice,” as they call it when referring to the importance of consistency. This is particularly useful if you work in localization.

I have extracted a few examples of the terminology-related principles of Microsoft style from the manual, which may serve as a guide for your own terminology management activities.

From the Terminologist’s toolbox: Okapi’s Rainbow

I came across this tool recently, downloaded it to my laptop, and played with it a little for term extraction. It seems to be working fine, so I thought I’d share it with you. You can download it for Mac, Windows, and Linux. If you have used it yourself, I would love to hear how it works for you.

You can download it here:

Here’s a features list from their wiki:

InMyOwnTerms goes Delicious. 180 links to add to your Favorites!

I finally did it! I organized all of my links for resources in English, French, Portuguese, and Spanish, as well as dictionaries, glossaries, corpora, blogs, and much more.
Check it out here:
I have deleted the contents of my section “TermFinder” and added the link to IMOT’s Delicious in case you need it in the future.
The links were getting really hard to organize, so I spent the holidays sorting them out. I have included most of the links that used to be under TermFinder and also added the links from my sections in French, Portuguese, and Spanish.
Delicious is a great tool for handling your links, and you can add IMOT’s Delicious page to your Favorites to have immediate access to them. Obviously, I would want you to keep visiting my blog and not forget about it!
Delicious is very easy to use, and you can click on the TAGS or the TAG BUNDLES to see the topics that I have used. Please let me know if you ever find a broken link, or send me a note if you think there’s a mistake or if you want me to add a particular link. If you have a Delicious page, let me know so that I can add it to the section “Network” (I’m currently following two translators).
If you decide to use Delicious, you may use the Delicious App on your computer to add new links automatically from your favorite pages.
Let me know if you find it useful. Happy searching!


TermSciences: Multidisciplinary Terminological Portal


In case you missed it, TermCoord recently published a post on TermSciences, a comprehensive termbase in English, French, Spanish, and German. It is the result of a joint effort among the Institut de l’information scientifique et technique (INIST), the Laboratoire lorrain de recherche en informatique et ses applications (LORIA), and the Analyse et Traitement Informatique de la Langue Française (ATILF).

TermSciences aims to offer “an integrated vision of scientific terminology, federate the work of skilled partners in the constitution and management of terminological resources, and offer a platform with tools and services for different communities such as researchers, the Natural Language Processing (NLP) community, etc.”

From the terminologist’s toolbox: Lupas Rename

Lupas Rename, developed by Ivan Anton, is an easy-to-use tool to organize your technical files, especially if you are preparing them to create a corpus on a specific technical subject.

Just download the .exe file and load a selection of files. Then choose whether you want to change the file names completely or just add a label before or after each file name. You can also number them!
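If you are not on Windows, or simply like scripting, the same idea (a label before or after the name, plus numbering) can be sketched in a few lines of Python. This is my own illustration and has nothing to do with how Lupas Rename is built; the function name, the numbering format, and the file names are hypothetical. The sketch only plans the new names and does not touch any file.

```python
from pathlib import Path


def plan_renames(paths, prefix="", suffix="", number=False):
    """Return (old path, new path) pairs without renaming anything.

    prefix goes before the file name, suffix goes after it (before the
    extension), and number=True adds a three-digit counter at the front.
    """
    plan = []
    for index, path in enumerate(sorted(Path(p) for p in paths), start=1):
        numbering = f"{index:03d}_" if number else ""
        new_name = f"{numbering}{prefix}{path.stem}{suffix}{path.suffix}"
        plan.append((path, path.with_name(new_name)))
    return plan


plan = plan_renames(["report.pdf", "manual.pdf"], prefix="corpus_", number=True)
new_names = [str(new) for _, new in plan]
print(new_names)
```

Once you are happy with the plan, calling old.rename(new) for each pair would apply it; I would try it on a copy of the folder first.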

You can download it for Windows here:

For Mac alternatives visit this page:

Watch this 3-minute tutorial in English:

Or this 2-minute tutorial in Spanish:

Also, if you are a power user and want more options, here is a great post by Gizmo’s Freeware that provides other free file renaming tools.


Readings, tools, and useful links for corpus analysis

The following list is the result of a collaboration among participants in Lancaster’s recent MOOC on Corpus Linguistics. This is a selection of the links that I considered most relevant for those who might want to start exploring this field. If you want to share other links, feel free to add a comment or send me a message and I will add them here. I will keep you posted on the next CL course by Lancaster University. This post complements previous posts on corpora lists, GraphColl, and AntConc.


An Introduction of Corpus Linguistics – G. Bennet

Corpus Linguistics: What It Is and How It Can Be Applied to Teaching – D. Krieger

Corpus Linguistics 2015. Abstract book – F. Formato and A. Hardie (Lancaster:UCREL)

Corpus annotation – R. Garside, G. Leech, T. McEnery

A critical look at software tools in corpus linguistics – L. Anthony

Corpora and Language Teaching: Just a fling or wedding bells? – C. Gabrielatos

Create your first corpus and analyze it with AntConc (and related links to explore!)


I think many of us might feel a bit intimidated when we first approach a new tool, but Laurence Anthony (Professor in the Faculty of Science and Engineering at Waseda University, Japan) developed AntConc so skillfully that once you start using it you’ll be hooked for life. It’s so easy to use that it’s almost child’s play, and Professor Anthony created short but detailed videos so you can start using it right away.

I really don’t want to go into much detail because I believe Professor Anthony’s videos are very clear and there are guides to get you started on the right foot, but here is a 7-step guide to get you going.

Twitter handles and hashtags relevant to Terminology

Before I give you my list, note that both handle and hashtag, as used on Twitter, are good examples of terminologization, as you may remember from my post on that topic. So first things first, here is a short history of their origin.

The term handle comes from CB (Citizens Band) radio, which originated in the US in 1945 as a personal radio service giving citizens a radio band for personal communication. Handle was the slang word for a user’s radio name (alias). CB radio was the social network of its time.


The hash symbol (#) was used as early as 1988 on Internet Relay Chat (IRC) to categorize items such as images, messages, videos, and other content into groups so that users could find them more easily. Chris Messina first proposed using it for groups on Twitter, but Twitter rejected his idea, saying that it was for nerds. Stowe Boyd (as he himself claims in a blog post) was the first to use the term hashtag to denote those “channels” of communication.

OK, enough of that. Here are the hashtags that I found terminologists prefer for sharing content.

Hashtags:

Collection of Electronic Resources in Translation Technologies (CERTT)

Although some resources on this site are restricted to students and professors at the University of Ottawa (Canada), there are still quite a few resources you can explore. The website is available in English and French.

What kinds of tools are included in CERTT?

“Computer tools can help translators in analyzing texts for terminological description, specialized translation, discourse analysis, and the analysis of translation choices, among many other applications. Tools currently covered in CERTT include term banks, terminology managers, term extractors, mono-/bilingual concordancers and corpus analyzers, translation memories, machine translation systems, localization tools and even general office tools.” They also invite you to suggest other tools.