giellalt / lang-sma

Finite state and Constraint Grammar based analysers and proofing tools, and language resources for the Southern Sami language

Home Page: https://giellalt.uit.no

License: GNU General Public License v3.0



The South Sámi morphology and tools


Download nightly / CI/CD installation packages for testing (contains the core zhfst file(s)):

Windows MacOS Mobile

Note: the nightly / CI/CD installation packages are not tested for language quality and may contain regressions and errors.

This repository contains finite state source files for the South Sámi language, for building morphological analysers, proofing tools and dictionaries. The data and implementation are licensed under the GNU GPL v3, as detailed in the LICENSE file. The authors named in the AUTHORS file can grant other licensing terms.

Install proofing tools and keyboards for the South Sámi language by using the Divvun Installer (some languages are only available via the nightly channel).

Spell-checker accuracy:

[Figure: spell-checking accuracy development graph]

Download and test speller files

The speller files downloadable at the top of this page (the *.bhfst files) can be used with divvunspell to test their performance. These are exactly the files installed on users' computers and mobile phones. Desktop and mobile speller files differ in their error models and should be tested separately, which is why there are two downloads.

Documentation

Documentation can be found at:

Core dependencies

In order to compile and use South Sámi language morphology and dictionaries, you need:

To install VislCG3 and HFST, just copy/paste this into your Terminal on macOS:

curl https://apertium.projectjj.com/osx/install-nightly.sh | sudo bash

or terminal on Ubuntu, Debian or Windows Subsystem for Linux:

wget https://apertium.projectjj.com/apt/install-nightly.sh -O - | sudo bash
sudo apt-get install cg3 hfst

or terminal on RedHat, Fedora, CentOS or Windows Subsystem for Linux:

wget https://apertium.projectjj.com/rpm/install-nightly.sh -O - | sudo bash
sudo dnf install cg3 hfst

Alternatively, the Apertium wiki has good instructions on how to install the dependencies on macOS and on Linux.

Further details and dependencies are described on the GiellaLT Getting Started pages.
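
After installing, it can help to confirm that the tools are actually on your PATH. The following is a minimal sketch and not part of the GiellaLT documentation: the helper name check_tools is hypothetical, and it assumes that the hfst and cg3 packages provide executables named hfst-lookup and vislcg3.

```shell
#!/bin/sh
# check_tools (hypothetical helper): print every command that is not on PATH.
# Returns 0 if all commands were found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed executable names from the hfst and cg3 packages.
if check_tools hfst-lookup vislcg3; then
    echo "all tools found"
fi
```

If a tool is reported as missing, rerun the install commands for your platform before building.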

Downloading

Using Git:

git clone https://github.com/giellalt/lang-sma

Using Subversion (GitHub ended Subversion support in January 2024, so this command may no longer work):

svn checkout https://github.com/giellalt/lang-sma.git/trunk lang-sma

Building and installation

INSTALL describes the GNU build system in detail, but for most users it is the usual:

./autogen.sh # This will automatically clone or check out other GiellaLT dependencies
./configure
make
sudo make install # or run make install as root
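
Once the build has finished, the compiled analyser can be tried out from the command line. This is a sketch under stated assumptions, not the project's documented workflow: the helper name analyse is hypothetical, and the analyser path src/analyser-gt-desc.hfstol is an assumption that may differ in your build tree. The example words are taken from the issue titles on this page.

```shell
#!/bin/sh
# analyse (hypothetical helper): send words to a compiled HFST analyser.
# $1 is the path to the analyser file; the remaining arguments are words.
analyse() {
    fst=$1
    shift
    if [ ! -f "$fst" ]; then
        echo "analyser not found: $fst" >&2
        return 1
    fi
    # hfst-lookup reads one word per line on standard input.
    printf '%s\n' "$@" | hfst-lookup "$fst"
}

# Assumed output path; check your build tree for the actual location.
analyse src/analyser-gt-desc.hfstol båetedh vuejnedh || true
```

Checking for the file first gives a clear message when the build did not produce the analyser at the assumed path.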

Citing

If you use language data from more than one GiellaLT language, consider citing our LREC 2022 article on the whole infrastructure:

Linda Wiechetek, Katri Hiovain-Asikainen, Inga Lill Sigga Mikkelsen, Sjur Moshagen, Flammie Pirinen, Trond Trosterud, and Børre Gaup. 2022. Unmasking the Myth of Effortless Big Data - Making an Open Source Multi-lingual Infrastructure and Building Language Resources from Scratch. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 1167–1177, Marseille, France. European Language Resources Association.

If you use BibTeX, the following entry is as given in the ACL Anthology:

@inproceedings{wiechetek-etal-2022-unmasking,
    title = "Unmasking the Myth of Effortless Big Data - Making an Open Source
    Multi-lingual Infrastructure and Building Language Resources from Scratch",
    author = "Wiechetek, Linda  and
      Hiovain-Asikainen, Katri  and
      Mikkelsen, Inga Lill Sigga  and
      Moshagen, Sjur  and
      Pirinen, Flammie  and
      Trosterud, Trond  and
      Gaup, B{\o}rre",
    booktitle = "Proceedings of the Thirteenth Language Resources and Evaluation
    Conference",
    month = jun,
    year = "2022",
    address = "Marseille, France",
    publisher = "European Language Resources Association",
    url = "https://aclanthology.org/2022.lrec-1.125",
    pages = "1167--1177"
}


lang-sma's Issues

nouns that give no analysis (Bugzilla Bug 961)

This issue was created automatically with bugzilla2github

Bugzilla Bug 961

Date: 2011-03-08T23:33:56+01:00
From: Lene Antonsen <<lene.antonsen>>
To: Sjur Nørstebø Moshagen <<sjur.n.moshagen>>
CC: lene.antonsen, maja.l.kappfjell, sjur.n.moshagen, thomas.omma, trond.trosterud

Last updated: 2012-11-06T09:51:03+01:00

error in morphology båetedh (Bugzilla Bug 725)

This issue was created automatically with bugzilla2github

Bugzilla Bug 725

Date: 2008-09-18T16:02:14+02:00
From: Joseph Fjellgren <<joseph.fjellgren>>
To: Sjur Nørstebø Moshagen <<sjur.n.moshagen>>
CC: trond.trosterud

Last updated: 2008-09-19T07:54:11+02:00

NYSTØ-plc gives -e in Sg Nom (Bugzilla Bug 1198)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1198

Date: 2011-10-25T16:05:50+02:00
From: Lene Antonsen <<lene.antonsen>>
To: Thomas Omma <<thomas.omma>>
CC: lene.antonsen, sjur.n.moshagen, trond.trosterud

Last updated: 2011-10-25T17:14:39+02:00

væjkeles -> væjkales (Bugzilla Bug 947)

This issue was created automatically with bugzilla2github

Bugzilla Bug 947

Date: 2011-02-20T11:18:04+01:00
From: Lene Antonsen <<lene.antonsen>>
To: Maja Lisa Kappfjell <<maja.l.kappfjell>>
CC: sjur.n.moshagen, thomas.omma, trond.trosterud

Last updated: 2011-02-22T11:18:42+01:00

Plural of placenames used a little too eagerly: Bathh (Bugzilla Bug 1366)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1366

Date: 2012-05-29T17:26:11+02:00
From: Sjur Nørstebø Moshagen <<sjur.n.moshagen>>
To: Thomas Omma <<thomas.omma>>
CC: berit.a.baal, lene.antonsen, sjur.n.moshagen, trond.trosterud

Last updated: 2012-09-25T14:05:11+02:00

473 propernouns in lexicon without an analysis (Bugzilla Bug 986)

This issue was created automatically with bugzilla2github

Bugzilla Bug 986

Date: 2011-04-23T11:52:19+02:00
From: Trond Trosterud <<trond.trosterud>>
To: Thomas Omma <<thomas.omma>>
CC: lene.antonsen, sjur.n.moshagen, trond.trosterud

Last updated: 2011-06-22T10:25:35+02:00

Sublexicon is mentioned but not defined. (ø_Ø_EVEN) (Bugzilla Bug 1667)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1667

Date: 2013-05-15T10:04:29+02:00
From: Maja Lisa Kappfjell <<maja.l.kappfjell>>
To: Maja Lisa Kappfjell <<maja.l.kappfjell>>
CC: lene.antonsen, sjur.n.moshagen, thomas.omma, trond.trosterud

Last updated: 2014-05-02T12:51:14+02:00

The superlatives ammes/-ommes (Bugzilla Bug 1472)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1472

Date: 2012-10-17T11:25:32+02:00
From: Maja Lisa Kappfjell <<maja.l.kappfjell>>
To: Maja Lisa Kappfjell <<maja.l.kappfjell>>
CC: lene.antonsen, sjur.n.moshagen, thomas.omma, trond.trosterud

Duplicates: #1467
Last updated: 2012-10-17T12:44:04+02:00

stoere + orre (Bugzilla Bug 1463)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1463

Date: 2012-10-09T10:27:39+02:00
From: Maja Lisa Kappfjell <<maja.l.kappfjell>>
To: Maja Lisa Kappfjell <<maja.l.kappfjell>>
CC: berit.a.baal, berit.nystad.eskonsipo, lene.antonsen, ritva.nystad, sjur.n.moshagen, thomas.omma, trond.trosterud

Last updated: 2013-03-12T14:44:10+01:00

error in the noun lexicon (Bugzilla Bug 1382)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1382

Date: 2012-07-01T15:12:52+02:00
From: Lene Antonsen <<lene.antonsen>>
To: Maja Lisa Kappfjell <<maja.l.kappfjell>>
CC: lene.antonsen, maja.l.kappfjell, sjur.n.moshagen, trond.trosterud

Last updated: 2012-09-20T15:21:04+02:00

dåankah and skïelhkes (Bugzilla Bug 1433)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1433

Date: 2012-09-19T14:13:56+02:00
From: Maja Lisa Kappfjell <<maja.l.kappfjell>>
To: Maja Lisa Kappfjell <<maja.l.kappfjell>>
CC: lene.antonsen, sjur.n.moshagen, thomas.omma, trond.trosterud

Last updated: 2013-03-12T14:41:09+01:00

Overview of adjectives (Bugzilla Bug 1521)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1521

Date: 2012-11-20T14:44:52+01:00
From: Maja Lisa Kappfjell <<maja.l.kappfjell>>
To: Maja Lisa Kappfjell <<maja.l.kappfjell>>
CC: lene.antonsen, sjur.n.moshagen, thomas.omma, trond.trosterud

Last updated: 2012-11-26T10:31:54+01:00

oeh_AN_ODD + ah_AN_ODD = TRUE? (Bugzilla Bug 1515)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1515

Date: 2012-11-15T14:38:11+01:00
From: Maja Lisa Kappfjell <<maja.l.kappfjell>>
To: Maja Lisa Kappfjell <<maja.l.kappfjell>>
CC: lene.antonsen, sjur.n.moshagen, thomas.omma, trond.trosterud

Last updated: 2012-11-20T13:47:46+01:00

The superlative: ammes (Bugzilla Bug 1467)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1467

Date: 2012-10-15T14:30:03+02:00
From: Maja Lisa Kappfjell <<maja.l.kappfjell>>
To: Maja Lisa Kappfjell <<maja.l.kappfjell>>
CC: lene.antonsen, sjur.n.moshagen, thomas.omma, trond.trosterud

Last updated: 2012-11-28T09:52:21+01:00

Prt of BÅETEDH verbs - long form vs. short form (Bugzilla Bug 965)

This issue was created automatically with bugzilla2github

Bugzilla Bug 965

Date: 2011-03-09T22:40:08+01:00
From: Lene Antonsen <<lene.antonsen>>
To: Thomas Omma <<thomas.omma>>
CC: maja.l.kappfjell, sjur.n.moshagen, thomas.omma, trond.trosterud

Last updated: 2011-04-27T09:31:22+02:00

Problems with rïektes (Bugzilla Bug 1492)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1492

Date: 2012-10-31T18:57:50+01:00
From: Lene Antonsen <<lene.antonsen>>
To: Maja Lisa Kappfjell <<maja.l.kappfjell>>
CC: lene.antonsen, sjur.n.moshagen, thomas.omma, trond.trosterud

Last updated: 2012-11-15T13:48:28+01:00

implement standardisation decisions in the lexicon (Bugzilla Bug 1030)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1030

Date: 2011-05-20T18:10:50+02:00
From: Lene Antonsen <<lene.antonsen>>
To: Thomas Omma <<thomas.omma>>
CC: lene.antonsen, sjur.n.moshagen, trond.trosterud

Last updated: 2011-06-22T14:02:49+02:00

sma: analysis of numerals gives a plus sign before the number (Bugzilla Bug 950)

This issue was created automatically with bugzilla2github

Bugzilla Bug 950

Date: 2011-02-20T21:50:00+01:00
From: Lene Antonsen <<lene.antonsen>>
To: Thomas Omma <<thomas.omma>>
CC: maja.l.kappfjell, tomi.k.pieski, trond.trosterud

Last updated: 2011-02-21T10:44:31+01:00

verb group V present tense: wrong vowel (Bugzilla Bug 974)

This issue was created automatically with bugzilla2github

Bugzilla Bug 974

Date: 2011-04-02T16:16:17+02:00
From: Lene Antonsen <<lene.antonsen>>
To: Thomas Omma <<thomas.omma>>
CC: lene.antonsen, maja.l.kappfjell, sjur.n.moshagen, trond.trosterud

Last updated: 2011-04-06T11:05:58+02:00

svn-mail (Bugzilla Bug 1526)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1526

Date: 2012-11-28T10:39:45+01:00
From: Maja Lisa Kappfjell <<maja.l.kappfjell>>
To: Børre Gaup <<borre.gaup>>
CC: borre.gaup, lene.antonsen, sjur.n.moshagen, thomas.omma, trond.trosterud

Last updated: 2013-01-14T03:07:58+01:00

transitivity (Bugzilla Bug 955)

This issue was created automatically with bugzilla2github

Bugzilla Bug 955

Date: 2011-02-20T23:02:56+01:00
From: Lene Antonsen <<lene.antonsen>>
To: Maja Lisa Kappfjell <<maja.l.kappfjell>>
CC: sjur.n.moshagen, thomas.omma, trond.trosterud

Last updated: 2011-03-08T00:25:40+01:00

Proper nouns have disappeared from the fst (Bugzilla Bug 1516)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1516

Date: 2012-11-16T20:17:32+01:00
From: Lene Antonsen <<lene.antonsen>>
To: Maja Lisa Kappfjell <<maja.l.kappfjell>>
CC: lene.antonsen, sjur.n.moshagen, thomas.omma, trond.trosterud

Last updated: 2012-11-19T10:42:36+01:00

Attribute forms missing from oahpa adjectives (Bugzilla Bug 956)

This issue was created automatically with bugzilla2github

Bugzilla Bug 956

Date: 2011-02-22T10:13:49+01:00
From: Trond Trosterud <<trond.trosterud>>
To: Thomas Omma <<thomas.omma>>
CC: lene.antonsen, maja.l.kappfjell, sissel.jama, sjur.n.moshagen, trond.trosterud

Last updated: 2011-04-27T09:32:08+02:00

adverbs do not get Hyph (Bugzilla Bug 951)

This issue was created automatically with bugzilla2github

Bugzilla Bug 951

Date: 2011-02-20T22:02:45+01:00
From: Lene Antonsen <<lene.antonsen>>
To: Thomas Omma <<thomas.omma>>
CC: sjur.n.moshagen, thomas.omma, trond.trosterud

Duplicates: #961
Last updated: 2011-06-23T10:15:33+02:00

Adjectives have disappeared from the South Sámi fst (Bugzilla Bug 1404)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1404

Date: 2012-08-24T11:45:39+02:00
From: Lene Antonsen <<lene.antonsen>>
To: Maja Lisa Kappfjell <<maja.l.kappfjell>>
CC: lene.antonsen, sjur.n.moshagen, thomas.omma, trond.trosterud

Last updated: 2013-03-12T14:38:13+01:00

govledh with no analysis (Bugzilla Bug 1009)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1009

Date: 2011-05-06T12:40:49+02:00
From: Trond Trosterud <<trond.trosterud>>
To: Thomas Omma <<thomas.omma>>
CC: lene.antonsen, sjur.n.moshagen, trond.trosterud

Last updated: 2011-08-02T13:32:28+02:00

abpa and ammes (Bugzilla Bug 1553)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1553

Date: 2013-01-08T09:48:20+01:00
From: Maja Lisa Kappfjell <<maja.l.kappfjell>>
To: Maja Lisa Kappfjell <<maja.l.kappfjell>>
CC: lene.antonsen, sjur.n.moshagen, thomas.omma, trond.trosterud

Last updated: 2015-06-08T15:25:16+02:00

ijve (Bugzilla Bug 1437)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1437

Date: 2012-09-20T12:27:34+02:00
From: Maja Lisa Kappfjell <<maja.l.kappfjell>>
To: Maja Lisa Kappfjell <<maja.l.kappfjell>>
CC: sjur.n.moshagen, thomas.omma, trond.trosterud

Last updated: 2012-09-20T15:12:58+02:00

+Hom1 and +Hom2? (Bugzilla Bug 1633)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1633

Date: 2013-03-12T15:10:48+01:00
From: Maja Lisa Kappfjell <<maja.l.kappfjell>>
To: Maja Lisa Kappfjell <<maja.l.kappfjell>>
CC: lene.antonsen, sjur.n.moshagen, thomas.omma, trond.trosterud

Last updated: 2013-05-13T11:29:04+02:00

error in morphology vuejnedh (Bugzilla Bug 726)

This issue was created automatically with bugzilla2github

Bugzilla Bug 726

Date: 2008-09-18T16:10:49+02:00
From: Joseph Fjellgren <<joseph.fjellgren>>
To: Sjur Nørstebø Moshagen <<sjur.n.moshagen>>
CC: trond.trosterud

Duplicates: #725
Last updated: 2008-09-19T07:50:36+02:00

These oahpa adjectives do not have +Pred forms (Bugzilla Bug 957)

This issue was created automatically with bugzilla2github

Bugzilla Bug 957

Date: 2011-02-22T10:16:31+01:00
From: Trond Trosterud <<trond.trosterud>>
To: Maja Lisa Kappfjell <<maja.l.kappfjell>>
CC: lene.antonsen, maja.l.kappfjell, sissel.jama, sjur.n.moshagen, thomas.omma, trond.trosterud

Last updated: 2011-04-27T09:30:15+02:00

Subbe comparison of stoere and orre (Bugzilla Bug 1471)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1471

Date: 2012-10-17T11:18:37+02:00
From: Maja Lisa Kappfjell <<maja.l.kappfjell>>
To: Maja Lisa Kappfjell <<maja.l.kappfjell>>
CC: lene.antonsen, sjur.n.moshagen, thomas.omma, trond.trosterud

Duplicates: #1463
Last updated: 2012-10-18T00:56:27+02:00

The lemma form in the lexicon should be the norm (Bugzilla Bug 968)

This issue was created automatically with bugzilla2github

Bugzilla Bug 968

Date: 2011-03-12T15:47:29+01:00
From: Lene Antonsen <<lene.antonsen>>
To: Thomas Omma <<thomas.omma>>
CC: lene.antonsen, sjur.n.moshagen, thomas.omma, trond.trosterud

Last updated: 2011-08-01T20:53:00+02:00

Please fix this one! gïenghke (Bugzilla Bug 1478)

This issue was created automatically with bugzilla2github

Bugzilla Bug 1478

Date: 2012-10-22T12:56:54+02:00
From: Maja Lisa Kappfjell <<maja.l.kappfjell>>
To: Maja Lisa Kappfjell <<maja.l.kappfjell>>
CC: lene.antonsen, sjur.n.moshagen, thomas.omma, trond.trosterud

Last updated: 2012-11-15T13:50:05+01:00
