lexibank / bowernpny

CLDF dataset derived from Bowern and Atkinson's "Internal Structure of Pama-Nyungan" from 2012

License: Creative Commons Attribution 4.0 International

Python 63.61% TeX 36.39%
clics3 clts lexibank1

bowernpny's People

Forkers

bonniemclean

bowernpny's Issues

Error in profile

We found an error in the orthography profile (git blame says otherwise, but I'm 99% sure it is my fault): μ is mapped to m, but it should be mapped to ɳ. The author confirmed this on Twitter: https://twitter.com/anggarrgoon/status/1496151913284489217

I can make a PR, but I am not sure whether that is still the right workflow, especially when it comes to regenerating the CLDF.
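For illustration, the correction described above could be applied to a profile along these lines. This is only a minimal sketch: the column names `Grapheme` and `IPA`, the sample rows, and the helper `fix_mapping` are assumptions made for the example, not taken from the actual profile in this repository.

```python
import csv
import io

# Hypothetical excerpt of a tab-separated orthography profile.
# The column names "Grapheme" and "IPA" are assumptions.
PROFILE_TSV = "Grapheme\tIPA\nm\tm\nμ\tm\nn\tn\n"


def fix_mapping(profile_text, grapheme, new_ipa):
    """Return the profile rows with `grapheme` remapped to `new_ipa`,
    plus the number of rows that were actually changed."""
    rows = list(csv.DictReader(io.StringIO(profile_text), delimiter="\t"))
    changed = 0
    for row in rows:
        if row["Grapheme"] == grapheme and row["IPA"] != new_ipa:
            row["IPA"] = new_ipa
            changed += 1
    return rows, changed


rows, changed = fix_mapping(PROFILE_TSV, "μ", "ɳ")
mapping = {row["Grapheme"]: row["IPA"] for row in rows}
print(changed, mapping)
```

The CLDF would still have to be regenerated from the corrected profile; that step is not shown here.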

Orthography profile: non-meaningful contrasts?

Hi,

I have the impression that there might be spurious contrasts in the orthography profile, in particular voicing contrasts in occlusives (p/b, k/g, t/d). I also suspect that the ɹ/r contrast may not be meaningful.

I suggest that we figure out precisely which contrasts are due to variation in descriptive practice and which are true contrasts attributable to sound change etc., and then neutralize the meaningless ones.

As for how to normalize, we have three other datasets with languages from these families, and we should make sure we are using the same notation:

Lexibank dataset | Sounds found
--- | ---
bowernpny | + _ a aː ã b bː c cʷ cː d dʒ dʱ dː d̪ e eː f g gʷ gː h i iː j k kʷ kː l lʷ lː l̪ m mː n nʲ nː n̪ n̪ː o oː p pː q qː r rː s t tʃ tʲ tː t̪ t̪ʷ t̪ː u uː ũ v w x yː z æ ð ø ŋ œ ɐ ɑː ɔ ɖ ə ɛ ɛː ɜ ɣ ɤ ɨ ɪ ɭ ɲ ɳ ɹ ɽ ɾ ʀ ʈ ʊ ʒ ʔ ʔʲ ˀb ˀd ˀdʒ ˀk ˀm ˀn ˀr ˀt ˀt̪ ˀw ˀɭ β θ
johanssonsoundsymbolic | + a aː c i j k l l̪ m n n̪ p r rː t t̪ u w ŋ ɭ ɲ ɳ ɽ ʎ
joophonosemantic | a aː i j k l l̻ m n n̻ p r t t̻ u uː w ŋ ɭ ɳ ɽ ʈ
wold | + _ a g i iː j k l m n p r t u w y ŋ ɲ

As you can see, the other datasets use p/t/k, not b/d/g, and have a single /r/ sound. https://github.com/lexibank/wold does have a k/g contrast (if it is also meaningless, we should change it there too).

@erichround, @chirila, could you chime in on whether these contrasts should be kept here? Are there other contrasts in the list above that should be neutralized? @tresoldi, it looks like the orthography profile came from you; do you remember whether there were specific motivations for these contrasts?
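The comparison above can also be checked mechanically. The following is a rough sketch that uses the inventories listed in this issue; the choice of pairs to test and the helper name `shared_contrasts` are assumptions made for the example.

```python
# Sound inventories as listed in this issue, split on whitespace.
INVENTORIES = {
    "bowernpny": (
        "+ _ a aː ã b bː c cʷ cː d dʒ dʱ dː d̪ e eː f g gʷ gː h i iː j k kʷ kː "
        "l lʷ lː l̪ m mː n nʲ nː n̪ n̪ː o oː p pː q qː r rː s t tʃ tʲ tː t̪ t̪ʷ t̪ː "
        "u uː ũ v w x yː z æ ð ø ŋ œ ɐ ɑː ɔ ɖ ə ɛ ɛː ɜ ɣ ɤ ɨ ɪ ɭ ɲ ɳ ɹ ɽ ɾ ʀ ʈ ʊ "
        "ʒ ʔ ʔʲ ˀb ˀd ˀdʒ ˀk ˀm ˀn ˀr ˀt ˀt̪ ˀw ˀɭ β θ"
    ),
    "johanssonsoundsymbolic": "+ a aː c i j k l l̪ m n n̪ p r rː t t̪ u w ŋ ɭ ɲ ɳ ɽ ʎ",
    "joophonosemantic": "a aː i j k l l̻ m n n̻ p r t t̻ u uː w ŋ ɭ ɳ ɽ ʈ",
    "wold": "+ _ a g i iː j k l m n p r t u w y ŋ ɲ",
}

# Candidate contrasts to inspect (an assumption: only the pairs named in this issue).
PAIRS = [("p", "b"), ("t", "d"), ("k", "g"), ("r", "ɹ")]


def shared_contrasts(inventories, pairs):
    """For each dataset, list the pairs whose two members both occur."""
    result = {}
    for name, sounds in inventories.items():
        inventory = set(sounds.split())
        result[name] = [pair for pair in pairs if set(pair) <= inventory]
    return result


found = shared_contrasts(INVENTORIES, PAIRS)
for name, pairs in found.items():
    print(name, pairs)
```

This only shows which symbols co-occur in a dataset's inventory; whether a contrast is meaningful in a given language still has to be decided by someone who knows the sources.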

For a closer look, the list of sounds with counts can be found in the TRANSCRIPTION file: https://github.com/lexibank/bowernpny/blob/master/TRANSCRIPTION.md

Non-meaningful contrasts cause problems for downstream analyses of the data, especially the sound correspondence study.

Use of thou for you_pl

I noticed that you seem to be using 'thou' for the second person plural pronoun, but in Concepticon 'thou' is meant to be you_sg.

Fix tokenization

For some reason, none of the segments are recognized. Maybe this will go away when moving from CLPA to CLTS anyway.

Zenodo release

Hiya,

looks like the code/repo is ready for a release but it hasn't been released/pushed to Zenodo yet. Should I go ahead and do that @xrotwang or will you continue working on this repository?

Remap Wirangu-Nauo

Wirangu-Nauo is currently mapped to a book-keeping code; this must be fixed according to the instructions provided by @xrotwang.

Check cross-concept cognates

Opt-in for cross-concept cognates after checking? The cognate sets here seem quite big/inclusive.

  • bad/evil-belly-big-child-intestines-night-...
  • night-old-rain-red-...
  • big-child-small-sand-spear-thunder-...
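A check like the one proposed here could start from a simple grouping of concepts by cognate set. The sketch below uses invented rows of (form ID, concept, cognate set ID) purely for illustration; neither the data nor the function name `cross_concept_sets` comes from this repository.

```python
from collections import defaultdict

# Hypothetical rows: (form ID, concept, cognate set ID).
ROWS = [
    ("form1", "big", "c1"),
    ("form2", "child", "c1"),
    ("form3", "small", "c1"),
    ("form4", "night", "c2"),
    ("form5", "night", "c2"),
    ("form6", "rain", "c3"),
]


def cross_concept_sets(rows, min_concepts=2):
    """Return the cognate sets that span at least `min_concepts` concepts,
    mapped to the sorted list of concepts they cover."""
    concepts = defaultdict(set)
    for _form_id, concept, cogid in rows:
        concepts[cogid].add(concept)
    return {
        cogid: sorted(members)
        for cogid, members in concepts.items()
        if len(members) >= min_concepts
    }


flagged = cross_concept_sets(ROWS)
print(flagged)
```

Sets flagged this way are candidates for manual review, not automatically wrong: some cross-concept cognates are legitimate.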
