
Comments (8)

sdruskat commented on June 18, 2024

This has to do with the "history" of the entity type. I originally had a person type with fields for last names, first names, and middle names, but sacrificed that higher degree of specificity for the general-purpose "entity" type. I didn't want entity to have both "name" and "first name", "last name", etc. for fear of misuse of the "name" field for person names (which would make the parse more difficult). So I stuck with having just "name" and introduced a very simple syntax. I didn't use the comma as a separator because a lot of non-person names will have a comma included, same for colon and some other obvious choices.

Always open for suggestions though ;).

from citation-file-format.

rgaiacs commented on June 18, 2024

I didn't use the comma as a separator because a lot of non-person names will have a comma included, same for colon and some other obvious choices.

I imagine this was the case.

I didn't want entity to have both "name" and "first name", "last name", etc. for fear of misuse of the "name" field for person names (which would make the parse more difficult).

What if we use name and add apa-name as a helper? We would primarily use name when generating a reference, but apa-name would be used when the reference style asks for "Author's Surname, Name Initial. Other Initial.", which is the big problem because (1) some surnames include a preposition, e.g., German surnames with von and Portuguese surnames with de/do/da are common, (2) in some languages the surname comes first, and (3) in some cultures people have more than two names.

Example 1

name: John Doe
apa-name: Doe, J.

Example 2

name: Ben von Berger
apa-name: von Berger, B.

Example 3 (Chinese)

name: Liú Zhìfēng
apa-name: Liú, Z.

Example 4

name: Pedro Pereira Porto
apa-name: Porto, P. P.

Example 5 (Bonus)

This is your nightmare. It would not be impossible to find someone named "Ana Maria Francisca Carvalho da Conceição Sales". The mother's maiden name is "Carvalho" and the father's surname is "da Conceição". The mother wanted "Ana Maria" as the first name and the father wanted "Francisca", so the child ended up with "Ana Maria Francisca Carvalho da Conceição". She adopted her husband's surname, so now her name is "Ana Maria Francisca Carvalho da Conceição Sales". Because there are so many ways she could split her name in "APA" style, I will not try to list them.
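To make the problem concrete, here is a minimal sketch of what happens if a tool tries to derive the APA form automatically by treating the last whitespace-separated token as the surname. The function name naive_apa_name is hypothetical and is not part of any CFF tooling; it only illustrates why a naive split breaks on Examples 2 and 3.

```python
def naive_apa_name(name: str) -> str:
    """Naively treat the last whitespace-separated token as the surname.

    Hypothetical helper for illustration only; not part of any CFF tool.
    """
    parts = name.split()
    if len(parts) < 2:
        # A single token cannot be split into surname and initials.
        return name
    surname = parts[-1]
    # Reduce every remaining token to an initial, in the original order.
    initials = " ".join(f"{p[0]}." for p in parts[:-1])
    return f"{surname}, {initials}"


for example in ["John Doe", "Ben von Berger", "Liú Zhìfēng", "Pedro Pereira Porto"]:
    print(f"{example} -> {naive_apa_name(example)}")
```

Under this assumption, Examples 1 and 4 happen to come out as intended, but "Ben von Berger" loses its preposition from the surname and "Liú Zhìfēng" has surname and given name swapped, which is why the split has to be supplied by the person writing the file.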

bast commented on June 18, 2024

I agree with not having a field for first name, last name, middle name - too difficult to be consistent. I like Raniere's suggestion. I think there will always be tricky cases (Example 5).

sdruskat commented on June 18, 2024

I think that having two fields for names is unnecessary duplication. Also, the apa-name field basically has the same syntax as my original proposal, the only difference being the delimiter: "," in apa-name versus "::" in my proposal.

I think that the issue with my proposal is perhaps that it treats the most common usage of the name field as somewhat of a special case by using a custom delimiter that most people wouldn't be able to (linguistically) parse easily.

So how about, instead of using two name fields or an unfamiliar delimiter, we use just name and specify that if a comma is present (as in APA style, so all cases, including no. 5, are covered), the field will be parsed as in APA, and if not, it will be parsed as a named entity string? Commas in named entity strings would then have to be escaped with a backslash (\,).

As for example 5 (and all others), it will always be the task of the CFF file author to split person names correctly; we just have to supply a decent model that can represent first and last names (with middle names being a suffix to the first names). So:

name: Carvalho da Conceição Sales, Ana Maria Francisca

vs.

name: Ana Maria Francisca Carvalho da Conceição Sales\, Inc.
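A minimal sketch of how a parser might apply this rule, assuming the first unescaped comma separates last names from first names and that "\," stands for a literal comma. The names ParsedName and parse_name are hypothetical, not taken from any existing CFF implementation, and the sketch does not handle an escaped backslash directly before a comma.

```python
import re
from typing import NamedTuple, Optional


class ParsedName(NamedTuple):
    kind: str               # "person" if split at a comma, else "entity"
    last: Optional[str]     # last names (person only)
    first: Optional[str]    # first and middle names (person only)
    entity: Optional[str]   # unsplit name string (entity only)


# A comma that is not immediately preceded by a backslash.
_UNESCAPED_COMMA = re.compile(r"(?<!\\),")


def parse_name(value: str) -> ParsedName:
    """Split at the first unescaped comma; otherwise keep the string whole."""
    parts = _UNESCAPED_COMMA.split(value, maxsplit=1)
    if len(parts) == 2:
        # Unescape any remaining "\," and trim whitespace around each half.
        last, first = (p.strip().replace("\\,", ",") for p in parts)
        return ParsedName("person", last, first, None)
    return ParsedName("entity", None, None, value.strip().replace("\\,", ","))


print(parse_name("Carvalho da Conceição Sales, Ana Maria Francisca"))
print(parse_name(r"Ana Maria Francisca Carvalho da Conceição Sales\, Inc."))
```

With this sketch, a name written without any comma is returned unsplit, so anything further (such as guessing a surname) is left to whoever consumes the value.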

bast commented on June 18, 2024

You convinced me - I'm also not a fan of duplication. I like your suggestion: if an unescaped comma is present, use it, but give people the flexibility to leave the comma out when the split is difficult to make.

sdruskat commented on June 18, 2024

Yes, so the third case for example 5 would simply be name: Ana Maria Francisca Carvalho da Conceição Sales, leaving downstream actors to deal with parsing the name, as I think is also the case in BibTeX.

The reference implementation of the parser should be able to represent APA'd info, so should have a first name and a last name type in the data model. Otherwise exporting to BibTeX, etc. will be difficult.

rgaiacs commented on June 18, 2024

A colleague at work sent me https://www.w3.org/International/questions/qa-personal-names. If you look at "Strategies for splitting up names", they say:

It may be better to ask separately, when setting up a profile for example, how that person would like you to address them.

I think that we need to decide whether we care about the full name of the author or not. If we do, I think we should ask two questions. If we don't, we should stay with one name field and ask "how would you like to be addressed in citations that follow a given style?", where the given style should be the one with a comma and names instead of initials.

sdruskat commented on June 18, 2024

Very interesting read, thanks for the link.

I think I now understand better where the issue with names could be.
Previously I had thought to just follow the general "splittable" requirement as put forward by a number of citation styles and indeed ORCiD (which, against best practices as it seems, provides "First Name" and "Last Name" fields for accounts).

Additionally, there's the complication that what is perhaps the standard for citation info (BibTeX) defines four fields for names (cf. this SO answer, this BibTeX help page on nwalsh.com):

  • First name (including middle names)
  • "von part" ("de la", "van der", and similar components)
  • Last name
  • "Jr. part", for name suffixes like "Jr.", "III", etc.

And I think this categorization is still pretty anglo/euro-centric.

I do think we should definitely ask for names to be recorded so that they not only represent how a person would like to be addressed in citations, but also so that they support a failsafe conversion to BibTeX (and other formats), and support downstream creation of metadata, e.g., indices, which is where the "von part" becomes important (Éamon de Valera: list as "de Valera, Éamon" or as "Valera, Éamon de"?).

This means we have two or three options:

  1. Ask four questions for entity objects, and provide a way to ask for, e.g., company names as well, so five:
    • given name(s), incl. middle/other names
    • nobiliary particle/preposition
    • family name(s)
    • suffixes ("Jr.", "III")
    • "entity name"
  2. Specify a syntax for providing that information.
  3. Specify a separate person type in addition to the entity type (which is used for affiliation entities, companies, conferences, etc.)

1.: Concerns are not separated (mixing persons/entities, possible misuse of the name field, cf. https://github.com/sdruskat/citation-file-format/issues/10#issuecomment-332865963).
2.: This can be a nightmare for us to specify, and for users to learn.
3.: Authorship by a group, or mixed authorship, would be enabled by allowing authors to contain objects of both types.

So I suggest dividing the entity type once again by separating person-related fields into a new person type. This will restore Separation of Concerns but will go against DRY, as a lot of the fields are duplicated across person and entity. But I guess that's not a high price to pay for being able to correctly record person names?

The person type should then have the following four fields for names, following the W3 suggestions, and mixing in what BibTeX can do:

  1. family-names for family names (including: one-word given+patronymic forms such as Guðmundsdóttir, bin Osman (although in this case it would be up to the author to define "bin" as preposition which would go in field 2); double names with or without hyphen (Leutheusser-Schnarrenberger, Arantxa Sánchez Vicario); etc.)*,**
  2. name-particle for nobiliary particles and prepositions (Ludwig van Beethoven, Rafael van der Vaart)
  3. given-names for given names including middle names, etc.**
  4. name-suffix for suffixes such as Jr. or III

* Family names can always include prepositions (especially if they occur in between two family names, as in some Spanish- or Portuguese-origin names, e.g., Firstname Márquez de Vila), if the person chooses it to be so.
** It's up to the creator of the file where to put, e.g., the Chinese generation name, although I guess that it would probably go into family-names in most cases.
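As a rough sketch of how a data model with these four fields could support conversion to BibTeX, the following assumes BibTeX's "von Last, Jr, First" author form. The class name Person, its attribute names (the field names above with underscores), and the method to_bibtex are hypothetical illustrations, not an existing implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    """Hypothetical model of the four proposed name fields."""
    family_names: str
    given_names: Optional[str] = None
    name_particle: Optional[str] = None
    name_suffix: Optional[str] = None

    def to_bibtex(self) -> str:
        """Render as "von Last, Jr, First", skipping fields that are absent.

        Limitation of this sketch: a suffix without given names yields
        "Last, Jr", which BibTeX would read as "Last, First".
        """
        last = " ".join(p for p in (self.name_particle, self.family_names) if p)
        fields = [last]
        if self.name_suffix:
            fields.append(self.name_suffix)
        if self.given_names:
            fields.append(self.given_names)
        return ", ".join(fields)


print(Person("Beethoven", "Ludwig", name_particle="van").to_bibtex())
print(Person("Vaart", "Rafael", name_particle="van der").to_bibtex())
print(Person("Doe", "John", name_suffix="Jr.").to_bibtex())
```

Keeping the particle in its own field means a downstream tool can still decide whether to sort an index entry under the particle or under the family name.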
