by Marco Tomatis, Università degli Studi di Torino.

A Clitic Recognizer

"ClitRec" is a tool for recognizing the enclitic part of an Italian word in a tokenized, but still untagged, corpus. Because the system is based on linguistic rules only, it needs access to an Italian lexicon file. This file should be as large as possible, should not contain words with enclitics, and should have only one word per line.
This software requires "GAWK" (GNU AWK) to work properly. You can download it directly from the Free Software Foundation's web site. For Microsoft systems, the latest version of GAWK can be downloaded from the SourceForge web site.
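
To make the role of the lexicon more concrete, here is a minimal sketch of lexicon-backed enclitic splitting. It is written in Python purely for illustration and is not ClitRec's actual code or rule set: the names ENCLITICS, load_lexicon and split_enclitic are hypothetical, the list of enclitics is deliberately incomplete, and the only rule applied is the well-known dropping of the final -e of an infinitive before an enclitic (mangiare + lo = mangiarlo).

```python
# Illustrative sketch only: NOT ClitRec's actual rules or implementation.

# Hypothetical, incomplete list of Italian enclitics (assumption for this
# example), sorted longest first so that "glielo" is tried before "lo".
ENCLITICS = sorted(
    ["mi", "ti", "lo", "la", "li", "le", "ci", "vi", "si", "ne", "gli",
     "melo", "telo", "celo", "velo", "selo", "glielo", "gliela", "gliene"],
    key=len,
    reverse=True,
)


def load_lexicon(lines):
    """Build a set of words from an iterable of lines, one word per line."""
    return {line.strip().lower() for line in lines if line.strip()}


def split_enclitic(token, lexicon):
    """Return (host, enclitic) if the lexicon supports a split, else None."""
    word = token.lower()
    # A word that is already in the lexicon is assumed to carry no enclitic.
    if word in lexicon:
        return None
    for clitic in ENCLITICS:
        if len(word) <= len(clitic) or not word.endswith(clitic):
            continue
        stem = word[: -len(clitic)]
        # Try the bare stem first (imperatives, gerunds), then the stem with
        # the final -e restored (infinitives such as "mangiar" -> "mangiare").
        for host in (stem, stem + "e"):
            if host in lexicon:
                return host, clitic
    return None


if __name__ == "__main__":
    # Tiny in-memory lexicon standing in for a real one-word-per-line file.
    lexicon = load_lexicon(["mangiare", "mangiando", "mangia", "tavolo"])
    for token in ["mangiarlo", "mangiandolo", "mangiarglielo", "tavolo"]:
        print(token, split_enclitic(token, lexicon))
```

The sketch shows why the lexicon should not contain words that already carry enclitics: any token found in the lexicon is treated as clitic-free, so an entry such as "mangiarlo" would prevent the split from being recognized. It also shows why a larger lexicon helps, since a split is accepted only when the remaining host word is listed.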

License terms

