2. Related Work
Gensim
Annif is a subject-indexing tool designed to help librarians index books. Because it relies on several libraries that fit our objective well (NLTK and Voikko for tokenization, stemming, and lemmatization; Gensim for computing TF-IDF), we investigated it. Two observations follow from our investigation: (i) the library is fast, but (ii) its approach to similarity is limited to retrieving documents in which a number of terms exactly match terms in the query.
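The exact-match limitation can be illustrated with a minimal TF-IDF sketch in plain Python. This is not the Annif or Gensim implementation: the toy corpus, the whitespace tokenizer, the smoothed IDF formula, and all function names (`tokenize`, `build_idf`, `tfidf_vector`, `cosine`, `rank`) are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    # No stemming or lemmatization is applied on purpose.
    return text.lower().split()


def build_idf(docs_tokens):
    # Smoothed inverse document frequency (an assumed variant),
    # so that every known term gets a strictly positive weight.
    n_docs = len(docs_tokens)
    doc_freq = Counter()
    for tokens in docs_tokens:
        doc_freq.update(set(tokens))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0
            for t, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    # Sparse TF-IDF vector; terms unknown to the corpus are dropped.
    tf = Counter(t for t in tokens if t in idf)
    return {t: count * idf[t] for t, count in tf.items()}


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy corpus (made up for this example).
docs = [
    "library catalogue of indexed books",
    "indexing a book by subject",
    "weather forecast for tomorrow",
]
docs_tokens = [tokenize(d) for d in docs]
idf = build_idf(docs_tokens)
doc_vectors = [tfidf_vector(tokens, idf) for tokens in docs_tokens]


def rank(query):
    # Score every document against the query.
    query_vector = tfidf_vector(tokenize(query), idf)
    return [cosine(query_vector, v) for v in doc_vectors]


scores = rank("books")
```

In this sketch only the first document, which contains the literal token "books", receives a non-zero score. The second document is about the same topic but uses "book" and "indexing", so without stemming or lemmatization it scores zero, which is the behaviour described in observation (ii).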