KWIC, KWAC, and KWOC (not a knock-knock joke!)

You might have heard of at least one of these. They are abbreviations for index formats produced by concordancers and used in indexing. So you will also hear about a KWIC index, KWAC index or KWOC index, each of which uses keywords as “access” terms.

KWIC stands for “key word in context”. It is the most common format in concordancing, and the term was coined by Hans Peter Luhn. It dates back to the sixties, when scholars started using computer programs to search for key words and generate lists of those words in alphabetical order, each surrounded by the context in which it occurred. These lists were known as KWIC indexes and were used not only for information retrieval but also for content analysis. There are a few KWIC programs that you might want to check out (besides the one featured in AntConc and other programs):

  1. KWIC CONCORDANCE PROGRAM, “The KWIC Concordance is a corpus analytical tool for making word frequency lists, concordances, and collocation tables from electronic text files. This program offers the capability of handling markup schemes, such as COCOA, SGML, the Helsinki corpus, the Penn-Helsinki Parsed Corpus of Middle English (Phase 1) (Phase 2) etc.”
  2. KWIC CONCORDANCER by Jeremy Whistle: “This is available as a downloadable ZIP file from something called the Online English Network. It seems to be accompanied by good documentation.”
  3. CONC: Conc produces concordances of texts. A concordance consists of a list of the words in the text with a short section of the context that precedes and follows each word. Conc also produces an index, consisting of a list of the distinct words in the text, each with the number of times it occurs and a list of the places where it occurs. Conc displays the original text, the concordance, and the index each in its own window. Clicking on a word in any one of the three windows causes the other two windows to display the entries for the same word.
  4. py, developed by New Mexico Tech, is a Python module to generate a Key Word In Context (KWIC) index.
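
To make the format concrete, here is a minimal Python sketch of a KWIC index built from a list of titles. It is not the method used by any of the programs listed above. The function name kwic_index, the small stopword list and the width parameter are all assumptions made for illustration.

```python
# Illustrative stopword list; real indexes use much longer ones.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}


def kwic_index(titles, width=30):
    """Return KWIC lines: left context, key word, right context.

    Every non-stopword becomes a key word. Entries are sorted
    alphabetically by key word, and the key word always starts in
    the same column so that the eye can scan down the index.
    """
    entries = []
    for title in titles:
        words = title.split()
        for i, word in enumerate(words):
            key = word.strip(".,;:!?").lower()
            if not key or key in STOPWORDS:
                continue
            left = " ".join(words[:i])
            right = " ".join(words[i + 1:])
            entries.append((key, left, word, right))
    entries.sort(key=lambda entry: entry[0])
    # Truncate the context to `width` characters on each side and
    # right-align the left context so the key words line up.
    return [
        f"{left[-width:]:>{width}}  {word}  {right[:width]}".rstrip()
        for _, left, word, right in entries
    ]


for line in kwic_index(["Standard guidelines for indexes in information retrieval."]):
    print(line)
```

Each key word is printed where it occurs in the title, with the words before it on the left and the words after it on the right. That is what “in context” refers to.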

KWAC stands for “keyword alongside context” and KWOC stands for “keyword out of context”. They are modifications of KWIC. As defined by Birger Hjørland of the Lifeboat for Knowledge Organization of the University of Copenhagen, just like KWIC they are “simple, mechanical term extraction indexes for text (usually titles) which retain some of the context (i.e. adjacent words).”

KWAC is also known as “keyword and context” and “key-word augmented-in-context”. It takes additional keywords from either the abstract or the original text of the document and inserts them into the title to give further index entries, for example:

indexes        in information retrieval. Standard guidelines for
information    retrieval. Standard guidelines for indexes in
retrieval      Standard guidelines for indexes in information
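
The three entries above can be reproduced by rotating the title around each access term: the words that follow the term come first, then the words that precede it. The sketch below does only that. The function name rotated_entries and the explicit keyword set are assumptions, and the "augmented" step of pulling extra keywords from an abstract is left out.

```python
def rotated_entries(title, keywords):
    """Return (access term, rotated context) pairs for one title.

    The context is the title with the access term removed and the
    remaining words wrapped around: words after the term first,
    then the words before it.
    """
    words = title.split()
    entries = []
    for i, word in enumerate(words):
        key = word.strip(".,;:!?").lower()
        if key in keywords:
            context = " ".join(words[i + 1:] + words[:i])
            entries.append((key, context))
    # Sort alphabetically by access term.
    return sorted(entries)


title = "Standard guidelines for indexes in information retrieval."
for term, context in rotated_entries(title, {"indexes", "information", "retrieval"}):
    print(f"{term:<15}{context}")
```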

KWOC also displays the access term on the left, but the word pairs are not preserved in the alphanumeric sequence of keywords, for example:

information
guidelines for indexes in information retrieval
indexes in information retrieval. Standard
Standard guidelines for indexes in information retrieval.
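
As a rough sketch, and assuming the common description of KWOC in which each keyword is pulled out as a heading and the complete title is listed beneath it, such an index could be built as follows. The function name kwoc_index, the stopword list and the second title are hypothetical and only serve the illustration.

```python
from collections import defaultdict

# Illustrative stopword list; real indexes use much longer ones.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}


def kwoc_index(titles):
    """Map each keyword to the full titles in which it occurs."""
    index = defaultdict(list)
    for title in titles:
        seen = set()  # list a title only once under each keyword
        for word in title.split():
            key = word.strip(".,;:!?").lower()
            if key and key not in STOPWORDS and key not in seen:
                seen.add(key)
                index[key].append(title)
    # Sort the headings alphabetically.
    return dict(sorted(index.items()))


titles = [
    "Standard guidelines for indexes in information retrieval.",
    "Information retrieval in practice",  # hypothetical second title
]
for keyword, matches in kwoc_index(titles).items():
    print(keyword)
    for match in matches:
        print("    " + match)
```

Unlike the rotated entries, the title is printed whole and unrotated under every heading, so the keyword is shown "out of" its context rather than inside it.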

Sources:

KWIC Programs and Concordances by Kenneth Janda, Professor Emeritus of Political Science at Northwestern University (Illinois, USA)

Lifeboat for Knowledge Organization by Birger Hjørland of the University of Copenhagen

Key Word in Context on Wikipedia: https://en.wikipedia.org/wiki/Key_Word_in_Context

Dictionary for Library and Information Science by Joan Reitz
