Linguistics Archives - Page 5 of 9

Fresh Memory is an educational application for studying languages using flashcards and the spaced repetition method. Its primary purpose is learning and reviewing foreign-language vocabulary, but other disciplines can be studied as well: history, geography, medicine, mathematics. The study material is stored as collections of flashcards. The flashcards may have several fields, and […]

This is the Nigerian component of the International Corpus of English, a one million word corpus of written and spoken Nigerian English for linguistic research. It can be used as a stand-alone corpus or in conjunction with other components of the International Corpus of English (such as ICE-GB, ICE-India, etc.) to compare international varieties of […]

FREJ stands for “Fuzzy Regular Expressions for Java”. It is a command-line tool and library that lets you easily compare strings against patterns, disregarding nasty typos and allowing for several variants (like “Barack Obama”, “B.H.Obama”, etc.). The project sources have moved to GitHub: https://github.com/RodionGork/FREJ […]

THIS PROJECT HAS MIGRATED TO https://gitlab.com/mwetoolkit/mwetoolkit/ The Multiword Expressions toolkit aids in the automatic identification and extraction of multiword units in running text. These include idioms (kick the bucket), noun compounds (cable car), phrasal verbs (take off, give up), etc. Even though it focuses on multiword expressions, the framework is quite complete and can also be […]

The AFEWC corpus is a multilingual comparable corpus of articles in Arabic, French, and English. Each article triple covers the same topic (aligned at the article level). The corpus was collected from Wikipedia and is available free of charge for research purposes only. It is composed of 40K aligned articles, 91.3M English words, 57.8M French […]

ATTENTION! Morfologik is now at GitHub: https://github.com/morfologik/

The Scheme Natural Language Toolkit (S-NLTK) is a Scheme R6RS library for language and text processing, and various tasks related to symbolic and statistical analysis of language data.

Virastyar is a free and open-source (FOSS) spell checker for Persian. It stands upon the shoulders of many free/libre/open-source (FLOSS) libraries developed for Persian text processing. Contributors: Omid Kashefi, Azadeh Zamanifar, Masoumeh Mashaiekhi, Meisam Pourafzal, Reza Refaei (former member), Mohammad Hedayati (former member), Kamiar Kanani (former member), Mehrdad Senobari (former member), Sina Iravanin (former member), […]

The JINSECT toolkit is a Java-based toolkit and library that supports and demonstrates the use of n-gram graphs within Natural Language Processing applications, ranging from summarization and summary evaluation to text classification and indexing.

Grammar-multi is most useful for languages whose words have many forms («more» inflected languages), and in which grammatical agreement (and other syntactic connections) within a sentence is «more» important and «obvious». Help from linguists is needed. The program is not meant for everyday use, but to show that Grammar works. If you want your language Grammar version […]