Comments (8)
This has to do with the "history" of the entity type. I originally had a person type with fields for last names, first names, and middle names, but sacrificed that higher degree of specificity for the general-use "entity" type. I didn't want entity to have both "name" and "first name", "last name", etc. for fear of misuse of the "name" field for person names (which would make parsing more difficult). So I stuck with having just "name" and introduced a very simple syntax. I didn't use the comma as a separator because a lot of non-person names will have a comma included, same for colon and some other obvious choices.
Always open for suggestions though ;).
from citation-file-format.
> I didn't use the comma as a separator because a lot of non-person names will have a comma included, same for colon and some other obvious choices.
I imagine this was the case.
> I didn't want entity to have both "name" and "first name", "last name", etc. for fear of misuse of the "name" field for person names (which would make the parse more difficult).
What if we use `name` and have `apa-name` as a helper? We would primarily use `name` when generating a reference, but `apa-name` would be used when the reference style asks for "Author's Surname, Name Initial. Other Initial.", which is the big problem because (1) some surnames have a preposition, e.g. German surnames with von and Portuguese surnames with de/do/da are common, (2) in some languages the surname comes first, and (3) in some cultures you have more than two names.
Example 1
name: John Doe
apa-name: Doe, J.
Example 2
name: Ben von Berger
apa-name: von Berger, B.
Example 3 (Chinese)
name: Liú Zhìfēng
apa-name: Liú, Z.
Example 4
name: Pedro Pereira Porto
apa-name: Porto, P. P.
Example 5 (Bonus)
This is your nightmare. It would not be impossible to find someone named "Ana Maria Francisca Carvalho da Conceição Sales". The mother's maiden name is "Carvalho" and the father's surname is "da Conceição". The mother wanted "Ana Maria" as the first name and the father wanted "Francisca", so the child ended up with "Ana Maria Francisca Carvalho da Conceição". She adopted her husband's surname, so now her name is "Ana Maria Francisca Carvalho da Conceição Sales". Because there are so many ways she could split her name into "APA" style, I will not try to list them.
from citation-file-format.
I agree with not having a field for first name, last name, middle name - too difficult to be consistent. I like Raniere's suggestion. I think there will always be tricky cases (Example 5).
from citation-file-format.
I think that having two fields for names is unnecessary duplication. Also, the `apa-name` field basically has the same syntax as my original proposal, the only difference being the delimiter, `::` instead of just `,`.
I think that the issue with my proposal is perhaps that it treats the most common usage of the `name` field as somewhat of a special case by using a custom delimiter that most people wouldn't be able to (linguistically) parse easily.
So how about instead of using two name fields, or an unfamiliar identifier, we use just `name` and specify that if a comma is present (as in APA style, so all cases, including no. 5, are covered), the field will be parsed as in APA, and if not it will be parsed as a named entity string? Commas in named entity strings would then have to be escaped with `\`.
As for example 5 (and all others), it will always be the task of the CFF file author to split person names correctly; we just have to supply a decent model that can represent first & last names (with middle names being a suffix to the first names). So: `name: Carvalho da Conceição Sales, Ana Maria Francisca` vs `name: Ana Maria Francisca Carvalho da Conceição Sales\, Inc.`
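To make the proposed rule concrete, here is a minimal sketch of how a reader of the `name` field might apply it. The names `ParsedName` and `parse_name` are hypothetical and not part of any CFF tooling, and the sketch assumes that only the first unescaped comma splits the value and that a backslash itself is never escaped.

```python
import re
from dataclasses import dataclass
from typing import Optional

# A comma that is NOT preceded by a backslash (assumption: "\," is the only escape).
_UNESCAPED_COMMA = re.compile(r"(?<!\\),")


@dataclass
class ParsedName:
    """Hypothetical result type: either an entity name or a last/first split."""
    entity_name: Optional[str] = None  # set when no unescaped comma is present
    last_names: Optional[str] = None   # text before the first unescaped comma
    first_names: Optional[str] = None  # text after it (middle names included)


def _unescape(text: str) -> str:
    # Turn the escaped comma back into a plain comma and trim whitespace.
    return text.strip().replace("\\,", ",")


def parse_name(value: str) -> ParsedName:
    # Split at most once, on the first unescaped comma.
    parts = _UNESCAPED_COMMA.split(value, maxsplit=1)
    if len(parts) == 2:
        return ParsedName(last_names=_unescape(parts[0]),
                          first_names=_unescape(parts[1]))
    # No unescaped comma: treat the whole value as a named entity string.
    return ParsedName(entity_name=_unescape(value))


if __name__ == "__main__":
    print(parse_name("Carvalho da Conceição Sales, Ana Maria Francisca"))
    print(parse_name(r"Ana Maria Francisca Carvalho da Conceição Sales\, Inc."))
```

A value without any comma, such as a full person name the author does not want to split, also ends up in `entity_name` here; whether that is the right place for it is exactly the open question in this thread.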
from citation-file-format.
You convinced me - also not a fan of code duplication. I like your suggestion: if an unescaped comma is present, use it, but give people the flexibility to not set a comma if the name is difficult to split.
from citation-file-format.
Yes, so the third case of example 5 would simply be `name: Ana Maria Francisca Carvalho da Conceição Sales`, leaving downstream actors to deal with parsing the name, as I think is also the case in BibTeX.
The reference implementation of the parser should be able to represent APA'd info, so should have a first name and a last name type in the data model. Otherwise exporting to BibTeX, etc. will be difficult.
from citation-file-format.
A colleague at work sent me https://www.w3.org/International/questions/qa-personal-names. If you look at "Strategies for splitting up names", they say:
> It may be better to ask separately, when setting up a profile for example, how that person would like you to address them.
I think that we need to decide whether we care about the full name of the author or not. If we care, I think we should ask two questions. If we don't, we should stay with one `name` field and add "how you would like to be addressed in citations that follow a given style", where the given style should be the one with a comma and names instead of initials.
from citation-file-format.
Very interesting read, thanks for the link.
I think I now understand better where the issue with names could be.
Previously I had thought to just follow the general splittable requirement as put forward by a number of citation styles and indeed ORCiD (which, against best practices as it seems, provides "First Name" and "Last Name" fields for accounts).
Additionally, there's the complication that BibTeX, perhaps the standard for citation info, defines four fields for names (cf. this SO answer, this BibTeX help page on nwalsh.com):
- First name (including middle names)
- "von part" ("de la", "van der", and similar components)
- Last name
- "Jr. part", for name suffixes like "Jr.", "III", etc.
And I think this categorization is still pretty anglo/euro-centric.
I do think we should definitely ask for names to be recorded so that they not only represent how a person would like to be addressed in citations, but also so that they support a failsafe conversion to BibTeX (and other formats), and support downstream creation of metadata, e.g., indices, which is where the "von part" becomes important (Éamon de Valera: list as "de Valera, Éamon" or as "Valera, Éamon de"?).
This means we have two or three options:
1. Ask four questions for `entity` objects, and provide a way to ask for, e.g., company names as well, so five:
   - given name(s), incl. middle/other names
   - nobiliary particle/preposition
   - family name(s)
   - suffixes ("Jr.", "III")
   - "entity name"
2. Specify a syntax for providing that information.
3. Specify a separate `person` type in addition to the `entity` type (which is used for affiliation entities, companies, conferences, etc.)
1.: Concerns are not separated (mixing persons/entities, possible misuse of the `name` field, cf. https://github.com/sdruskat/citation-file-format/issues/10#issuecomment-332865963).
2.: This can be a nightmare for us to specify, and users to learn.
3.: Authorship by a group, or mixed authorship would be enabled by allowing `authors` to contain objects of both types.
So I suggest dividing the `entity` type once again by separating person-related fields into a new `person` type. This will restore Separation of Concerns but will go against DRY, as a lot of the fields are duplicated across `person` and `entity`. But I guess that's not a high price to pay for being able to correctly record person names?
The `person` type should then have the following four fields for names, following the W3C suggestions, and mixing in what BibTeX can do:
- `family-names` for family names (including: one-word given+patronymic forms such as Guðmundsdóttir, bin Osman (although in this case it would be up to the author to define "bin" as preposition which would go in field 2); double names with or without hyphen (Leutheusser-Schnarrenberger, Arantxa Sánchez Vicario); etc.)*,**
- `name-particle` for nobiliary particles and prepositions (Ludwig van Beethoven, Rafael van der Vaart)
- `given-names` for given names including middle names, etc.**
- `name-suffix` for suffixes such as Jr. or III

* Family names can always include prepositions (especially if they occur in between two family names such as in some Spanish- or Portuguese-origin names, e.g., Firstname Márquez de Vila, etc.), if the person chooses it to be so.
** It's up to the creator of the file where to put, e.g., the Chinese generation name, although I guess that it would probably go into `family-names` in most cases.
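As a rough illustration of why four separate fields help with export, the sketch below models the proposed `person` name fields and renders them in BibTeX's "von Last, Jr, First" author form. The `Person` class and `to_bibtex_author` function are hypothetical names chosen for this example, and the sketch assumes that given names are present whenever a suffix is.

```python
from dataclasses import dataclass


@dataclass
class Person:
    """Hypothetical model; attributes mirror the proposed keys
    family-names, given-names, name-particle and name-suffix."""
    family_names: str
    given_names: str = ""
    name_particle: str = ""
    name_suffix: str = ""


def to_bibtex_author(p: Person) -> str:
    """Render as 'von Last, First' or, with a suffix, 'von Last, Jr, First'."""
    # The particle (BibTeX's "von part") goes in front of the family names.
    last = " ".join(part for part in (p.name_particle, p.family_names) if part)
    pieces = [last]
    if p.name_suffix:
        pieces.append(p.name_suffix)
    if p.given_names:
        pieces.append(p.given_names)
    return ", ".join(pieces)


if __name__ == "__main__":
    print(to_bibtex_author(Person("Beethoven", "Ludwig", name_particle="van")))
    # The file author may instead keep the preposition inside family-names.
    print(to_bibtex_author(Person("de Valera", "Éamon")))
    print(to_bibtex_author(Person("Doe", "John", name_suffix="Jr.")))
```

Because the particle is stored separately, a downstream index could sort Ludwig van Beethoven under "Beethoven" or under "van Beethoven" without re-parsing the name, while a person who puts the preposition into `family-names` fixes that choice themselves.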
from citation-file-format.