Language tags and 880 fields #70

Open
kiegel opened this issue Dec 19, 2017 · 3 comments

@kiegel

kiegel commented Dec 19, 2017

In regard to internationalization, the logic for applying language tags needs work for parallel-script fields (880), e.g. with translations or parallel titles.

Incorrect Language Tags and Script Subtags
For example, problems crop up with OCLC #271414, an English translation of a Russian work.

<http://lib.washington.edu/ld/test/99114652250001452#Work880-45> a bf:Work ;
    rdfs:label "Евгений Онегин."@en-cyrl ;

The label is Cyrillic but in Russian, not English.

Work [ a bflc:Relationship ;
            bflc:relation [ a bflc:Relation ;
                    rdfs:label "Container of (expression)"@en-cyrl ] ;
            bf:relatedTo <http://lib.washington.edu/ld/test/99114652250001452#Work880-44> ].

The label is English but not Cyrillic. In general, it is vanishingly rare for a string to be both in the English language and in the Cyrillic script.
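A mismatch like the second one (a Latin-script string tagged with -cyrl) can be caught mechanically. The following is a minimal sketch, not the converter's logic: `SCRIPT_NAME_PREFIX`, `script_subtag` and `script_matches` are hypothetical names, only Cyrillic and Latin are covered, and the subtag parsing is simplified (it ignores private-use tags). It checks script only, so it would not flag the first example, where the string really is Cyrillic but the language is wrong.

```python
import unicodedata

# Hypothetical mapping from lowercased ISO 15924 script subtags to the prefix
# that Unicode character names use for that script. Only two scripts covered.
SCRIPT_NAME_PREFIX = {
    "cyrl": "CYRILLIC",
    "latn": "LATIN",
}


def script_subtag(tag):
    """Return the lowercased 4-letter script subtag of a tag, or None."""
    for part in tag.lower().split("-")[1:]:
        if len(part) == 4 and part.isalpha():
            return part
    return None


def script_matches(text, tag):
    """True if every letter in `text` belongs to the script named in `tag`.

    Tags without a script subtag, or with a script this sketch does not
    know, are accepted because there is nothing to check them against.
    """
    prefix = SCRIPT_NAME_PREFIX.get(script_subtag(tag) or "")
    if prefix is None:
        return True
    letters = [ch for ch in text if ch.isalpha()]
    return all(unicodedata.name(ch, "").startswith(prefix) for ch in letters)


if __name__ == "__main__":
    print(script_matches("Container of (expression)", "en-cyrl"))
    print(script_matches("Евгений Онегин.", "en-cyrl"))
```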

OCLC # 793950140, a Chinese translation of a Japanese work.

<http://lib.washington.edu/ld/test/99131426860001452#Work> a bf:Text,
        bf:Work ;
    rdfs:label "Inō Kanori no Taiwan tōsa nikki. Chinese",
        "伊能嘉矩の臺湾踏柤日記. Chinese"@zh-hani .

The title in the label is Japanese, not Chinese.

OCLC # 893875561, a Latvian book with a parallel title in Russian.

[ a bf:ParallelTitle,
                bf:Title,
                bf:VariantTitle ;
            rdfs:label "Заяц и его друзья : латышские народные сказки о животных"@lv-cyrl ;
            bf:mainTitle "Заяц и его друзья"@lv-cyrl ;
            bf:subtitle "латышские народные сказки о животных"@lv-cyrl ]

The title in the label, mainTitle and subtitle is Russian, not Latvian.

Compliance with IETF RFC 5646
Use of language tags should follow the practices given in IETF RFC 5646 [1]. Concerning the script subtag, on page 12 it states “[it] SHOULD be omitted when it adds no distinguishing value to the tag or when the primary or extended language subtag's record in the subtag registry includes a 'Suppress-Script' field listing the applicable script subtag”.

For example, for OCLC # 1779370:

<http://lib.washington.edu/ld/test/99129152590001452#Agent880-32> a bf:Agent,
        bf:Jurisdiction ;
    rdfs:label "Russia. Министерство народнаго просвѣщенія."@ru-cyrl .

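The registry record for Russian has a Suppress-Script field listing Cyrl, so the Cyrillic script subtag should be omitted (a SHOULD in RFC 5646 rather than a strict prohibition); the tag should simply be @ru.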
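As a rough illustration of the RFC 5646 recommendation, a post-processing step could drop a script subtag that merely repeats the language's Suppress-Script value. This is a sketch under stated assumptions: `SUPPRESS_SCRIPT` is a small hand-copied subset of the IANA Language Subtag Registry (a real tool would load the full registry), `drop_suppressed_script` is a hypothetical helper, and extended language subtags are not handled.

```python
# Hand-copied subset of Suppress-Script values from the IANA Language
# Subtag Registry; this is an assumption-laden sample, not the full registry.
SUPPRESS_SCRIPT = {
    "en": "latn",
    "lv": "latn",
    "ru": "cyrl",
}


def drop_suppressed_script(tag):
    """Remove the script subtag when it equals the language's Suppress-Script."""
    parts = tag.split("-")
    if len(parts) >= 2 and SUPPRESS_SCRIPT.get(parts[0].lower()) == parts[1].lower():
        del parts[1]
    return "-".join(parts)


if __name__ == "__main__":
    for tag in ("ru-cyrl", "en-cyrl", "zh-hani"):
        print(tag, "->", drop_suppressed_script(tag))
```

Note that this only removes redundant subtags. A tag such as en-cyrl passes through unchanged, because the script is not the suppressed one; whether the language itself is right is a separate question.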

Not Good Practice
Attaching a language tag to numeric data in bf:part is not wrong, but it is probably not good practice.

<http://lib.washington.edu/ld/test/99129152590001452#Instance880-38> a bf:Instance ;
    bf:part "1825-29"@ru-cyrl ;
    bf:title [ a bf:Title ;
            rdfs:label "Записки"@ru-cyrl ] .
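One possible way to avoid tagging numeric values is to test the string before a tag is attached. The sketch below assumes a hypothetical serializer helper, `turtle_literal`, and a hypothetical `NUMERIC_ONLY` pattern; it is not how the converter works, and the escaping covers only backslashes and double quotes.

```python
import re

# Digits plus a few separators common in date and volume ranges.
NUMERIC_ONLY = re.compile(r"[\d\s\-/.,]+")


def turtle_literal(value, lang=None):
    """Format a Turtle literal, omitting the language tag for numeric-only values."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    if lang and not NUMERIC_ONLY.fullmatch(value):
        return f'"{escaped}"@{lang}'
    return f'"{escaped}"'


if __name__ == "__main__":
    print(turtle_literal("1825-29", "ru"))
    print(turtle_literal("Записки", "ru"))
```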

[1] https://tools.ietf.org/html/bcp47

@kirkhess
Contributor

This is complicated, since I think some of this is bad data rather than bad conversion. We'll investigate and report back.

@osma
Contributor

osma commented Dec 22, 2017

I've also seen the converter create @ru-cyrl language tags where the -cyrl is redundant and, per BCP 47, should be omitted. I've chosen to ignore them for now.

@kirkhess
Contributor

kirkhess commented Jan 5, 2018

The specs are going to be updated. I'm pretty sure the best solution is to stop adding tags based on 008+$6.

If the MARC record included the language along with the script, it would be different. That is technically possible, and we were going to look into it as well.

@kirkhess kirkhess assigned jodiw01 and unassigned wafschneider Apr 25, 2018