Document fallback order #28

Open
brawer opened this issue Jun 1, 2016 · 18 comments

@brawer

brawer commented Jun 1, 2016

For Noto to work as a pan-Unicode font, systems must implement a specific fallback ordering. (Look for glyphs first in Noto-Foo, if not found try Noto-Bar, if not found try Noto-Baz, etc.) A partial version of this list must already exist somewhere in source.android.com, but we should document this ordering as part of the Noto repository and make sure it’s kept current. Then, clients other than Android will have an easier way to package Noto into their systems.

One way to document the fallback ordering could be to publish it in CFR format, which afaik is a very simple and already ISO-standardized XML format that would allow for expressing this ordering.
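The first-match lookup described above (try Noto-Foo, then Noto-Bar, then Noto-Baz) can be sketched roughly as follows. This is only an illustration: the coverage sets are hypothetical stand-ins for what a real implementation would read from each font's cmap table, and `pick_font` is a made-up helper name.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical coverage data: (font name, set of supported code points),
# listed in fallback order. Real coverage would come from each font's cmap.
FALLBACK_CHAIN: List[Tuple[str, Set[int]]] = [
    ("Noto-Foo", set(range(0x0020, 0x0080))),  # Basic Latin
    ("Noto-Bar", set(range(0x0900, 0x0980))),  # Devanagari
    ("Noto-Baz", set(range(0x0E00, 0x0E80))),  # Thai
]


def pick_font(codepoint: int,
              chain: List[Tuple[str, Set[int]]] = FALLBACK_CHAIN) -> Optional[str]:
    """Return the first font in the chain that covers the code point."""
    for name, coverage in chain:
        if codepoint in coverage:
            return name
    return None  # no font in the chain has a glyph for this code point
```

Because the first match wins, the order of the list is the whole contract, which is why it needs to be documented and kept current.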

@jungshik

jungshik commented Jun 6, 2016

I've just read sections 5.5 (Components), 5.6 (LanguagePreferredList), and 5.7 (LanguagePreferredComponentDef).

I expected LanguagePreferredList to have 'lang' instead of (or in addition to) LanguagePreferredComponentDef (as shown below). That way, a language can be a parameter passed to a single CFR to get a lang-dependent fallback order. IIUC, it looks like we need to have separate CFR definitions for different locales. Am I missing anything?

<LanguagePreferredList lang="ja">
  <LanguagePreferredComponentDef><ComponentDef name="Tokyo"/></LanguagePreferredComponentDef>
  <LanguagePreferredComponentDef><ComponentDef name="Taipei"/></LanguagePreferredComponentDef>
  <LanguagePreferredComponentDef><ComponentDef name="Beijing"/></LanguagePreferredComponentDef>
</LanguagePreferredList>

<LanguagePreferredList lang="zh-Hans">
  <LanguagePreferredComponentDef><ComponentDef name="Beijing"/></LanguagePreferredComponentDef>
  <LanguagePreferredComponentDef><ComponentDef name="Taipei"/></LanguagePreferredComponentDef>
  <LanguagePreferredComponentDef><ComponentDef name="Tokyo"/></LanguagePreferredComponentDef>
</LanguagePreferredList>

<LanguagePreferredList lang="zh-Hant">
  <LanguagePreferredComponentDef><ComponentDef name="Taipei"/></LanguagePreferredComponentDef>
  <LanguagePreferredComponentDef><ComponentDef name="Beijing"/></LanguagePreferredComponentDef>
  <LanguagePreferredComponentDef><ComponentDef name="Tokyo"/></LanguagePreferredComponentDef>
</LanguagePreferredList>

In case of Indic,

<LPL lang="hi">
  <LPCD><CD name="PanIndic" lang="hi"/></LPCD>
</LPL>
<LPL lang="bn">
  <!-- Bengali glyphs/locl etc. will be used for common codepoints -->
  <LPCD><CD name="PanIndic" lang="bn"/></LPCD>
</LPL>

@brawer
Author

brawer commented Jun 7, 2016

@kenlunde and @behdad

We’re wondering if we could publish a CFR file so that systems can expose a single “Noto Sans” font to users, instead of all those various script-specific shards. (Obviously, this would only work on systems that actually implement CFR; but adding CFR support to fontconfig seems doable). However, we’re not sure how to structure this CFR file. Would the following work?

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE PosingFont SYSTEM "http://blogs.adobe.com/CCJKType/files/2012/04/iso14496-28-dtd.txt">
<PosingFont name="NotoSans-Regular" version="1.004">
  <Name type="16" string="Noto Sans" language="en"/>
  <Name type="17" string="Regular" language="en"/>
  <Components>
    <ComponentDef name="NotoSansLatinGreekCyrillic-Regular"/>
    <ComponentDef name="NotoSansDevanagari-Regular"/>
    <ComponentDef name="NotoSansBengali-Regular"/>
    <ComponentDef name="NotoSansMarathi-Regular"/>
    <ComponentDef name="NotoSansThai-Regular"/>
    <ComponentDef name="NotoSansMyanmar-Regular"/>
    <ComponentDef name="NotoSansGlagolitic-Regular"/>
    <!-- many, many others -->

    <LanguagePreferredList>
      <LanguagePreferredComponentDef>
        <Language string="ja"/>
        <ComponentDef name="NotoSansCJKjp-Regular" locationHint="NotoSansCJK.ttc"/>
      </LanguagePreferredComponentDef>
    </LanguagePreferredList>

    <LanguagePreferredList>
      <LanguagePreferredComponentDef>
        <Language string="zh-Hans"/>
        <ComponentDef name="NotoSansCJKsc-Regular" locationHint="NotoSansCJK.ttc"/>
        <ComponentDef name="NotoSansCJKtc-Regular" locationHint="NotoSansCJK.ttc"/>
      </LanguagePreferredComponentDef>
    </LanguagePreferredList>

    <LanguagePreferredList>
      <LanguagePreferredComponentDef>
        <Language string="zh-Hant"/>
        <ComponentDef name="NotoSansCJKtc-Regular" locationHint="NotoSansCJK.ttc"/>
        <ComponentDef name="NotoSansCJKsc-Regular" locationHint="NotoSansCJK.ttc"/>
      </LanguagePreferredComponentDef>
    </LanguagePreferredList>

    <LanguagePreferredList>
      <LanguagePreferredComponentDef>
        <Language string="ko"/>
        <ComponentDef name="NotoSansCJKkr-Regular" locationHint="NotoSansCJK.ttc"/>
      </LanguagePreferredComponentDef>
    </LanguagePreferredList>

    <!-- If we still have not found a suitable font, try the following fonts. -->
    <!-- For example, the text could be marked as French but still contain 水, -->
    <!-- and we should be able to render something. -->
    <ComponentDef name="NotoSansCJKjp-Regular" locationHint="NotoSansCJK.ttc"/>
    <ComponentDef name="NotoSansCJKsc-Regular" locationHint="NotoSansCJK.ttc"/>
    <ComponentDef name="NotoSansCJKtc-Regular" locationHint="NotoSansCJK.ttc"/>
    <ComponentDef name="NotoSansCJKkr-Regular" locationHint="NotoSansCJK.ttc"/>

  </Components>
</PosingFont>

By the way, from reading the spec, I’m not sure what kind of language matching can be expected when writing CFR files. For example, Chinese zh is a macrolanguage according to the IANA language subtag registry. Hakka hak is a concrete language within the Chinese macrolanguage family. With a CFR file like the above, which component font files will be searched in what order when the text is marked with (say) lang="hak-TW"?

@kenlunde

kenlunde commented Jun 7, 2016

@brawer: Unless things have changed recently, the only environment of which I am aware that can consume CFR objects is OS X Version 10.8 or later.

Anyway, given that the Unicode coverage of NotoSansCJKsc-Regular and NotoSansCJKtc-Regular is identical, I am not sure why both would be listed under both declarations for Chinese, zh-Hans and zh-Hant. For the example from the ISO standard, the specified fonts have different Unicode coverage, so it makes sense to include multiple fonts in a preferred order for the declared language.

Language matching is really up to the consumer of the CFR object. CFR objects should simply specify BCP 47 language tags.
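Since matching is left to the consumer, one plausible strategy is the "lookup" scheme from RFC 4647, which progressively truncates the requested tag until something matches. The sketch below is an assumption about what a consumer might do, not behavior required by CFR; the table reuses the component names from the draft CFR file earlier in this thread, and `lookup` is a hypothetical helper.

```python
from typing import Dict, Optional

# Language-preferred components, keyed by lowercase BCP 47 tag.
# Names are taken from the draft CFR file earlier in this thread.
PREFERRED: Dict[str, str] = {
    "ja": "NotoSansCJKjp-Regular",
    "zh-hans": "NotoSansCJKsc-Regular",
    "zh-hant": "NotoSansCJKtc-Regular",
    "ko": "NotoSansCJKkr-Regular",
}


def lookup(tag: str, table: Dict[str, str] = PREFERRED) -> Optional[str]:
    """RFC 4647-style lookup: drop trailing subtags until a tag matches."""
    subtags = tag.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in table:
            return table[candidate]
        subtags.pop()
        # A single-character subtag (e.g. an extension singleton) left at
        # the end is removed together with the subtag that followed it.
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return None  # caller falls back to the language-independent list
```

Under this scheme `zh-Hant-TW` resolves to the `zh-Hant` entry, while `hak-TW` matches nothing, because plain truncation knows nothing about macrolanguages. A consumer that wanted `hak` to behave like `zh` would need extra mapping data from the IANA registry.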

@behdad

behdad commented Jun 7, 2016

I don't know much about CFR. Ken does.

@brawer
Author

brawer commented Jun 8, 2016

@kenlunde: In a CFR file for Noto, would you recommend putting CJK into LanguagePreferredLists? Or would the following be enough?

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE PosingFont SYSTEM "http://blogs.adobe.com/CCJKType/files/2012/04/iso14496-28-dtd.txt">
<PosingFont name="NotoSans-Regular" version="1.004">
  <Name type="16" string="Noto Sans" language="en"/>
  <Name type="17" string="Regular" language="en"/>
  <Components>
    <ComponentDef name="NotoSansLatinGreekCyrillic-Regular"/>
    <ComponentDef name="NotoSansCJKjp-Regular" locationHint="NotoSansCJK.ttc"/>
    <ComponentDef name="NotoSansCJKsc-Regular" locationHint="NotoSansCJK.ttc"/>
    <!-- Not listing NotoSansCJKtc because it has the same glyphs as NotoSansCJKsc -->
    <ComponentDef name="NotoSansCJKkr-Regular" locationHint="NotoSansCJK.ttc"/>
    <ComponentDef name="NotoSansDevanagari-Regular"/>
    <ComponentDef name="NotoSansBengali-Regular"/>
    <ComponentDef name="NotoSansMarathi-Regular"/>
    <ComponentDef name="NotoSansThai-Regular"/>
    <ComponentDef name="NotoSansMyanmar-Regular"/>
    <ComponentDef name="NotoSansGlagolitic-Regular"/>
    <!-- many, many others -->
  </Components>
</PosingFont>

Probably I’m missing something? Just wondering, because:

  • to use LanguagePreferredList, clients need to pass the language to the text rendering stack;
  • but if clients pass the language, the shaper will substitute localized letterforms in NotoSansCJK;
  • so what would we gain from using LanguagePreferredList in a prospective CFR file for Noto?

@kenlunde

@brawer: Correct, if language is being passed by the client, and if the 'locl' GSUB feature is being invoked, the result will be the same as using LanguagePreferredList but cleaner. This means that only one of the four NotoSansCJK{jp,kr,sc,tc}-Regular fonts needs to be specified in the CFR object, because all four have the same Unicode coverage.

@brawer
Author

brawer commented Jun 24, 2016

@roozbehp, is Android’s fallback order for Noto in the AOSP tree? The Android.mk file in noto-fonts splits the fonts by footprint. But isn’t there a file somewhere that defines an ordering when looking for a specific glyph?

Context: We’re thinking about publishing the fallback order as part of the Noto package, possibly in CFR format like this, so that other platforms can more easily package Noto in its entirety, exposing it to users as “Noto Sans”, “Noto Serif”, etc. Today, CFR adoption is very limited, but perhaps that can be changed. Also, why not use a standard format for documenting this (when one already exists) instead of coming up with a new ad-hoc format?

@kenlunde

@brawer: Isn't the fallback order listed in the fallback_fonts.xml file? If memory serves, it is in the /system/etc/ directory of the file system, or similar location.

Anyway, I completely agree about supporting CFR objects for this purpose, because it has a lot of functionality to offer, such as the ability to specify a transformation matrix that can be used to adjust component fonts to better harmonize with one another. You can also specify Unicode ranges to limit which glyphs are used from a particular component font. Only macOS (formerly OS X) supports CFR objects, from Version 10.8, and getting Adobe apps to support them would require another platform, such as Android, to support them. At that point, it is no longer an enhancement, but rather a bug in that our products cannot consume a font resource that two other platforms support.

I can actually give you a real-world use case for CFR objects. When Samsung released Android Version 6.0.1 to its devices earlier this year, one of the new fonts was called SECHans-Regular.otf, which is actually NotoSansSC-Regular.otf, but the entire font was scaled to 95%. This is what I refer to as wagging a very large dog with its tiny tail: it would have been far easier to scale Roboto, and no one would be able to notice because these devices are effectively closed environments. This scaling could have been implemented via a transformation matrix that is associated with the component font in a CFR object, meaning that the actual font resource is unchanged. (As a side note, Samsung didn't bother to scale NotoSansJP-Regular.otf and NotoSansTC-Regular.otf, which means mixed-language content has the potential to look odd.)
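To make the 95% scaling concrete, here is a minimal sketch of applying an affine transformation matrix to glyph outline coordinates at render time, leaving the font data itself untouched. It does not use CFR syntax; the six-value matrix layout and the `transform` helper are assumptions chosen for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
# Affine matrix (a, b, c, d, e, f), applied as:
#   x' = a*x + c*y + e
#   y' = b*x + d*y + f
Matrix = Tuple[float, float, float, float, float, float]

# Uniform scale to 95%, with no shear and no translation.
SCALE_95: Matrix = (0.95, 0.0, 0.0, 0.95, 0.0, 0.0)


def transform(points: List[Point], m: Matrix) -> List[Point]:
    """Apply the affine matrix to each outline point."""
    a, b, c, d, e, f = m
    return [(a * x + c * y + e, b * x + d * y + f) for x, y in points]


# A hypothetical square outline in font units (1000 units per em).
square = [(0.0, 0.0), (1000.0, 0.0), (1000.0, 1000.0), (0.0, 1000.0)]
scaled = transform(square, SCALE_95)
```

The same matrix would also have to be applied to advance widths and other metrics so that spacing stays consistent with the scaled outlines.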

@roozbehp

Android's fallback chain is in the AOSP tree at frameworks/base/data/fonts/fonts.xml. It has changed slightly for N, but not by much. Check the file from an N developer preview device (or the Android internal tree) for the latest. /cc @raphlinus @nona-google

The fallback_fonts.xml file that @kenlunde refers to is deprecated since Android L, IIRC, and should not be used. It's been removed from Android N.

But that file is not all there is to it. There's also code and logic in Minikin that overrides the logic for various characters, combinations, and contexts. This has been spec-ed over time by heuristics we came up with and has been heavily re-spec-ed and rewritten for Android N. There's still a lot to do there, especially regarding i18n boundaries. Most of that code is in frameworks/minikin/libs/minikin/FontCollection.cpp (again, you can look at the AOSP version, or the Android internal tree). For example, we have a concept of sticky characters, which we try to keep with the previous font run, and we have some complex code for handling emoji, variation selectors, combining marks, etc.
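The idea of sticky characters can be illustrated with a much-simplified sketch. This is not Minikin's actual algorithm or its actual character list: the sticky set, the coverage data, and the `itemize` helper are all hypothetical, and real code also has to deal with emoji, variation selectors, and combining marks.

```python
from typing import List, Optional, Sequence, Set, Tuple

Chain = Sequence[Tuple[str, Set[int]]]

# Hypothetical sticky set, for illustration only (not Minikin's list).
STICKY: Set[str] = {",", ".", "!"}


def first_covering(cp: int, chain: Chain) -> Optional[str]:
    """Plain first-match fallback over the ordered chain."""
    for name, coverage in chain:
        if cp in coverage:
            return name
    return None


def itemize(text: str, chain: Chain,
            sticky: Set[str] = STICKY) -> List[Tuple[Optional[str], str]]:
    """Split text into (font, substring) runs, keeping sticky characters
    in the previous run's font when that font has a glyph for them."""
    coverage = dict(chain)
    runs: List[Tuple[Optional[str], str]] = []
    for ch in text:
        cp = ord(ch)
        prev = runs[-1][0] if runs else None
        if ch in sticky and prev is not None and cp in coverage[prev]:
            font = prev  # stay with the font of the preceding run
        else:
            font = first_covering(cp, chain)
        if runs and font == prev:
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs
```

With a chain that lists a Latin font first, a full stop after Devanagari text would normally jump to the Latin font. With the sticky rule it stays in the Devanagari font whenever that font has the glyph, which avoids a font switch in the middle of a sentence.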

@roozbehp

@kenlunde Re scaling or replacing Roboto: Android devices that want to participate in certain parts of the ecosystem are not allowed to do that after a certain version of Android, IIRC, because doing so would break the layout in several apps.

@kenlunde

@roozbehp: Makes sense. This is then yet another reason to use CFR objects.

BTW, it seems that Samsung did modify Roboto. I see SECRobotoLight-Regular.ttf and SECRobotoLight-Bold.ttf on my device's file system.

@roozbehp

@kenlunde Interesting! I would like to know the differences.

@kenlunde

@roozbehp Check your email.

@brawer
Author

brawer commented Jun 27, 2016

Filed https://bugs.freedesktop.org/show_bug.cgi?id=96693 for adding support for CFR to fontconfig.

@twardoch
Collaborator

@brawer, @behdad, @roozbehp

Microsoft Windows (Vista, 7, 8, 10) implements a different custom mechanism (which predates .cfr) called .CompositeFont. The files with the extension .CompositeFont reside in C:\Windows\Fonts like any other font.

Windows 10 ships with four such fonts, GlobalMonospace.CompositeFont, GlobalSansSerif.CompositeFont, GlobalSerif.CompositeFont, GlobalUserInterface.CompositeFont.

Perhaps it might be useful to build both .cfr and .CompositeFont files for the Noto project, with logically equivalent contents, so users on Windows, Mac OS X and Linux/Android (via potential fontconfig extension) could get the same functional results?

The family names Noto Sans Global, Noto Serif Global, Noto Mono Global and Noto UI Global might be appropriate, similar to how Microsoft structured these.
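Keeping a .cfr and a .CompositeFont logically equivalent would probably require tooling that can read both. As a starting point, here is a minimal sketch that extracts the FontFamilyMap entries from a .CompositeFont document with Python's standard `xml.etree.ElementTree`. The sample is a small excerpt in the shape of GlobalSansSerif.CompositeFont as quoted in this comment; `parse_ranges` and `read_family_maps` are hypothetical helper names, and defaulting a missing Scale to 1.0 is an assumption.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

# Default namespace used by the .CompositeFont file quoted in this comment.
NS = "{http://schemas.microsoft.com/winfx/2006/xaml/composite-font}"

SAMPLE = """<FontFamily xmlns="http://schemas.microsoft.com/winfx/2006/xaml/composite-font">
  <FontFamily.FamilyMaps>
    <FontFamilyMap Unicode="0E00-0E7F" Target="Cordia New" Scale="1.4" />
    <FontFamilyMap Unicode="3100-312F, 31A0-31BF" Language="zh-Hans"
                   Target="Microsoft YaHei, SimSun" Scale="1.0" />
    <FontFamilyMap Unicode="202F" Target="Mongolian Baiti" Scale="1.0" />
  </FontFamily.FamilyMaps>
</FontFamily>"""

FamilyMap = Tuple[List[Tuple[int, int]], Optional[str], List[str], float]


def parse_ranges(spec: str) -> List[Tuple[int, int]]:
    """Turn '3100-312F, 31A0-31BF' into inclusive (start, end) pairs."""
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        start, _, end = part.partition("-")
        # A single code point such as '202F' becomes a one-element range.
        ranges.append((int(start, 16), int(end or start, 16)))
    return ranges


def read_family_maps(xml_text: str) -> List[FamilyMap]:
    """Return (ranges, language, targets, scale) per map, in lookup order."""
    root = ET.fromstring(xml_text)
    maps = []
    for elem in root.iter(NS + "FontFamilyMap"):
        maps.append((
            parse_ranges(elem.get("Unicode", "")),
            elem.get("Language"),  # None means "any language"
            [t.strip() for t in elem.get("Target", "").split(",")],
            float(elem.get("Scale", "1.0")),
        ))
    return maps


family_maps = read_family_maps(SAMPLE)
```

Each tuple preserves document order, which matters because the maps are consulted in lookup order.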

Below are the contents of GlobalSansSerif.CompositeFont. It’s fairly easy to follow.

<FontFamily
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/composite-font"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    xmlns:System="clr-namespace:System;assembly=mscorlib"
    Baseline="0.9"
    LineSpacing="1.2">

    <!-- Name mapping -->
    <FontFamily.FamilyNames>
        <System:String x:Key="en-US">Global Sans Serif</System:String>
    </FontFamily.FamilyNames>

    <!-- Faces to report in font chooser UI -->
    <FontFamily.FamilyTypefaces>
        <FamilyTypeface
            Weight="Normal" Stretch="Normal" Style="Normal"
            UnderlinePosition="-0.1" UnderlineThickness="0.05"
            StrikethroughPosition="0.3" StrikethroughThickness="0.05"
            CapsHeight="0.5" XHeight="0.3" />

        <FamilyTypeface
            Weight="Bold" Stretch="Normal" Style="Normal"
            UnderlinePosition="-0.1" UnderlineThickness="0.05"
            StrikethroughPosition="0.3" StrikethroughThickness="0.05"
            CapsHeight="0.5" XHeight="0.3" />
    </FontFamily.FamilyTypefaces>

    <!-- Character to family lookups (in lookup order) -->
    <FontFamily.FamilyMaps>

        <!--
            Basic Latin                 0000-007F
            Latin-1 Supplement          0080-00FF
            Latin Extended-A            0100-017F
            Latin Extended-B            0180-024F
            IPA Extensions              0250-02AF
            Spacing Modifier Letters    02B0-02FF 
            Combining Diacritics Marks  0300-036F 
            Greek and Coptic            0370-03FF
            Cyrillic                    0400-04FF 
            Cyrillic Supplement         0500-052F 
            Hebrew                      0590-05FF 
            Arabic                      0600-06FF
            Arabic Supplement           0750-077F
            Phonetic Extensions         1D00-1D7F
            Phonetic Extensions Sup.    1D80-1DBF - Unicode 4.1, supported in Vista fonts
            Combining Diacritical M. S. 1DC0-1DFF - Unicode 4.1, supported in Vista fonts
            Latin Extended Additional   1E00-1EFF
            Latin Extended Additional   1E00-1EFF
            Greek Extended              1F00-1FFF
            Alpha Pres Forms Latin      FB00-FB0F
            Alpha Pres Forms Hebrew     FB1D-FB4F
            Arabic Pres Forms-A         FB50-FDCF
            Arabic Pres Forms-A         FDF0-FDFF
            Combining Half Marks        FE20-FE2F
            Arabic Pres Forms-B         FE70-FEFE -->
        <!-- CHS -->    
        <FontFamilyMap
            Unicode="0590-06FF, 0750-077F, FB1D-FDCF, FDF0-FDFF, FE70-FEFE"
            Language="zh-Hans"
            Target="Microsoft Uighur, Arial"
            Scale="1.0" />
        <!-- Other -->
        <FontFamilyMap
            Unicode="0000-052F, 0590-06FF, 0750-077F, 1D00-1FFF, FB00-FB0F, FB1D-FBFF"
            Target="Arial, Microsoft Sans Serif, Lucida Sans Unicode"
            Scale="1.0" />
        <FontFamilyMap
            Unicode="FC00-FDCF, FDF0-FDFF, FE20-FE2F, FE70-FEFE"
            Target="Arial, Simplified Arabic, Traditional Arabic"
            Scale="1.0" />

        <!--
            Armenian                    0530-058F    
            Georgian (Mkhedruli)        10D0-10FF
            Alpha Pres Forms (Armenian) FB10-FB1C -->
        <FontFamilyMap
            Unicode="0530-058F, 10D0-10FF, FB10-FB1C"
            Target="Sylfaen"
            Scale="1.0" />

        <!-- Syriac                      0700-074F -->
        <FontFamilyMap
            Unicode="0700-074F"
            Target="Estrangelo Edessa"
            Scale="1.0" />

        <!-- Thaana                      0780-07BF -->
        <FontFamilyMap
            Unicode="0780-07BF"
            Target="MV Boli"
            Scale="1.0" />

        <!-- Devanagari                  0900-097F -->
        <FontFamilyMap
            Unicode="0900-097F"
            Target="Mangal"
            Scale="1.0" />

        <!-- Bengali                     0980-09FF -->
        <FontFamilyMap
            Unicode="0980-09FF"
            Target="Vrinda"
            Scale="1.0" />

        <!-- Gurmukhi                    0A00-0A7F -->
        <FontFamilyMap
            Unicode="0A00-0A7F"
            Target="Raavi"
            Scale="1.0" />

        <!-- Gujarati                    0A80-0AFF -->
        <FontFamilyMap
            Unicode="0A80-0AFF"
            Target="Shruti"
            Scale="1.0" />

        <!-- Oriya                       0B00-0B7F -->
        <FontFamilyMap
            Unicode="0B00-0B7F"
            Target="Kalinga"
            Scale="1.0" />

        <!-- Tamil                       0B80-0BFF -->
        <FontFamilyMap
            Unicode="0B80-0BFF"
            Target="Latha"
            Scale="1.0" />

        <!-- Telugu                      0C00-0C7F -->
        <FontFamilyMap
            Unicode="0C00-0C7F"
            Target="Gautami"
            Scale="1.0" />

        <!-- Kannada                     0C80-0CFF -->
        <FontFamilyMap
            Unicode="0C80-0CFF"
            Target="Tunga"
            Scale="1.0" />

        <!-- Malayalam                   0D00-0D7F -->
        <FontFamilyMap
            Unicode="0D00-0D7F"
            Target="Kartika"
            Scale="1.0" />

        <!-- Sinhala                     0D80-0DFF -->
        <FontFamilyMap
            Unicode="0D80-0DFF"
            Target="Iskoola Pota"
            Scale="1.0" />

        <!-- Thai                        0E00-0E7F -->
        <FontFamilyMap
            Unicode="0E00-0E7F"
            Target="Cordia New"
            Scale="1.4" />

        <!-- Lao                         0E80-0EFF -->
         <FontFamilyMap  
            Unicode="0E80-0EFF"
            Target="DokChampa"
            Scale="1.0"/>

        <!-- Tibetan                     0F00-0FFF -->
        <FontFamilyMap
            Unicode="0F00-0FFF"
            Target="Microsoft Himalaya"
            Scale="1.0" />

        <!-- 
            Myanmar                     1000-109F
            Georgian (Khutsuri)         10A0-10CF
            Georgian Supplement         2D00-2D2F -->
        <!-- No font -->

        <!--
            Hangul Jamo                 1100-11FF  
            Hangul Compatibility Jamo   3130-318F
            Enc. CJK Paren Hangul       3200-321F
            Enc. CJK Circled Hangul     3260-327F
            Hangul Syllables            AC00-D7AF -->
        <FontFamilyMap
            Unicode="1100-11FF, 3130-318F, 3200-321F, 3260-327F, AC00-D7AF"
            Target="Malgun Gothic, Gulim"
            Scale="1.0" />

        <!--
            Ethiopic                   1200-137F
            Ethiopic Supplement        1380-139F - Unicode 4.1, NOT supported in Vista fonts!
            Ethiopic Extended          2D80-2DDF - Unicode 4.1, NOT supported in Vista fonts! -->
        <FontFamilyMap
            Unicode="1200-137F"
            Target="Nyala"
            Scale="1.0" />

        <!-- Cherokee                    13A0-13FF -->
        <FontFamilyMap
            Unicode="13A0-13FF"
            Target="Plantagenet Cherokee"
            Scale="1.0" />

        <!-- Canadian Aboriginals        1400-167F -->
        <FontFamilyMap
            Unicode="1400-167F"
            Target="Euphemia"
            Scale="1.0" />

        <!--
            Ogham                       1680-169F
            Runic                       16A0-16FF
            Tagalog                     1700-171F
            Hanunoo                     1720-173F
            Buhid                       1740-175F
            Tagbanwa                    1760-177f -->
        <!-- No font -->

        <!--
            Khmer                       1780-17FF
            Khmer Symbols               19E0-19FF -->
        <FontFamilyMap
            Unicode="1780-17FF, 19E0-19FF"
            Target="DaunPenh"
            Scale="1.0" />

        <!-- Mongolian                   1800-18AF -->
        <FontFamilyMap
            Unicode="1800-18AF"
            Target="Mongolian Baiti"
            Scale="1.0" />

        <!--
            Limbu                       1900-194F
            Tai Le                      1950-197F
            New Tai Lue                 1980-19DF
            Buginese                    1A00-1A1F -->
        <!-- No font -->

        <!-- NNBSP                       202F -->
        <!-- Always use Mongolian font to preserve Mongolian shaping -->
        <FontFamilyMap  
            Unicode          = "202F" 
            Target           = "Mongolian Baiti"
            Scale            = "1.0"/>

        <!--
            General Punctuation         2000-202E, 2030-206F
            Superscripts and Subscripts 2070-209F
            Currency Symbols            20A0-20CF
            Letterlike Symbols          2100-214F
            Number Forms                2150-218F
            Arrows                      2190-21FF
            Mathematical Operators      2200-22FF
            Miscelllaneous Technical    2300-23FF
            Enclosed Alphanumerics      2460-24FF
            Box Drawing                 2500-257F
            Block Elements              2580-259F
            Geometric Shapes            25A0-25FF
            Miscellaneous Symbols       2600-26FF
            Dingbats                    2700-27BF
            Misc Mathematical Symbols-B 2980-29FF -->
        <!-- CHS -->
        <FontFamilyMap
            Unicode="2000-202E, 2030-20CF, 2100-23FF, 2460-27BF, 2980-29FF"
            Language="zh-Hans"
            Target="Microsoft YaHei, Meiryo, SimSun, MS Gothic, Arial"
            Scale="1.0" />
        <!-- CHT -->
        <FontFamilyMap
            Unicode="2000-202E, 2030-20CF, 2100-23FF, 2460-27BF, 2980-29FF"
            Language="zh-Hant"
            Target="Microsoft JhengHei, Meiryo, MingLiU, MS Gothic, Arial"
            Scale="1.0" />
        <!-- JA -->
        <FontFamilyMap
            Unicode="2000-202E, 2030-20CF, 2100-23FF, 2460-27BF, 2980-29FF"
            Language="ja"
            Target="Meiryo, MS Gothic, Arial"
            Scale="1.0" />
        <!-- KO -->
        <FontFamilyMap
            Unicode="2000-202E, 2030-20CF, 2100-23FF, 2460-27BF, 2980-29FF"
            Language="ko"
            Target="Malgun Gothic, Gulim, MS Gothic, Arial"
            Scale="1.0" />
        <!-- Other -->
        <FontFamilyMap
            Unicode="2000-202E, 2030-20CF, 2100-23FF, 2460-27BF, 2980-29FF"
            Target="Arial, Meiryo, MS Gothic"
            Scale="1.0" />

        <!--
            Combining Diacritical Marks 20D0-20FF
            Control Pictures            2400-243F
            OCR                         2440-245F
            Misc Mathematical Symbols-A 27C0-27EF
            Supplemental Arrows-A       27F0-27FF
            Braille Patterns            2800-28FF
            Supplemental Arrows-B       2900-297F
            Supplemental Math Operators 2A00-2AFF
            Misc Symbols and Arrows     2B00-2BFF
            Glagolitic                  2C00-2C5F
            Coptic                      2C80-2CFF
            Tifinagh                    2D30-2D7F
            Supplemental Punctuation    2E00-2E7F -->
        <!-- No font -->

        <!-- CJK Radicals Supplement    2E80-2EFF -->
        <!-- CHS -->
        <FontFamilyMap  
            Unicode="2E80-2EFF"
            Language="zh-Hans"
            Target="Microsoft YaHei, SimSun, Meiryo, MingLiu"
            Scale="1.0"/>
        <!-- CHT -->
        <FontFamilyMap  
            Unicode="2E80-2EFF"
            Language="zh-Hant"
            Target="Microsoft YaHei, MingLiU, Meiryo, SimSun"
            Scale="1.0"/>
        <!-- Other (include JA and KO) -->    
        <FontFamilyMap  
            Unicode="2E80-2EFF"
            Target="Meiryo, Microsoft YaHei, MingLiU, SimSun"
            Scale="1.0"/>

        <!-- Kangxi Radicals             2F00-2FDF -->
        <FontFamilyMap  
            Unicode="2F00-2FDF"
            Target="Meiryo"
            Scale="1.0"/>

        <!-- Ideogr Description Char     2FF0-2FFF -->
        <FontFamilyMap
            Unicode="2FF0-2FFF"
            Target="SimSun"
            Scale="1.0" />

        <!--
            Symbols and Punctuation     3000-303F
            Hiragana                    3040-309F
            Katakana                    30A0-30FF
            Katakana Phonetic Ext.      31F0-31FF -->
        <!-- CHS -->
        <FontFamilyMap
            Unicode="3000-30FF, 31F0-31FF"
            Language="zh-Hans"
            Target="Microsoft YaHei, MS Gothic, SimSun"
            Scale="1.0" />
        <!-- CHT -->
        <FontFamilyMap
            Unicode="3000-30FF, 31F0-31FF"
            Language="zh-Hant"
            Target="Microsoft JhengHei, MS Gothic, MingLiu"
            Scale="1.0" />
        <!-- KO -->
        <FontFamilyMap
            Unicode="3000-30FF, 31F0-31FF"
            Language="ko"
            Target="Malgun Gothic, Meiryo, Microsoft YaHei, Gulim, MS Gothic, MingLiu"
            Scale="1.0" />
        <!-- Other (include JA) -->
        <FontFamilyMap
            Unicode="3000-30FF, 31F0-31FF"
            Target="Meiryo, Microsoft YaHei, MS Gothic, MingLiu"
            Scale="1.0" />

        <!--
            Bopomofo                    3100-312F
            Bopomofo Extended           31A0-31BF -->
        <!-- CHS -->
        <FontFamilyMap
            Unicode="3100-312F, 31A0-31BF"
            Language="zh-Hans"
            Target="Microsoft YaHei, SimSun"
            Scale="1.0" />
        <!-- Other (include CHT, JA and KO) -->
        <FontFamilyMap
            Unicode="3100-312F, 31A0-31BF"
            Target="Microsoft JhengHei, MingLiu"
            Scale="1.0" />

        <!-- Kanbun                      3190-319F -->
        <FontFamilyMap
            Unicode="3190-319F"
            Target="Microsoft YaHei, MingLiU"
            Scale="1.0" />

        <!-- CJK Strokes                 31C0-31EF -->
        <FontFamilyMap  
            Unicode          = "31C0-31EF"
            Target           = "MingLiU"
            Scale            = "1.0"/>

        <!-- 
            Enclosed CJK Han            3220-324F
            Enclosed CJK Numbers 21-35  3251-325F
            Enclosed CJK (Circled Ideog)3280-32B0
            Enclosed CJK Numbers 36-50  32B1-32BF
            Enclosed CJK Month          32C0-32CB
            Enclosed CJK Katakana       32D0-32FF
            CJK Comp Square Katakana    3300-3357
            CJK Comp Hours              3358-3370
            CJK Comp Latin Abr (hPa-PC) 3371-3376
            CJK Comp Ja era and corp    337B-337F
            CJK Comp Days               33E0-33FF -->
        <FontFamilyMap
            Unicode="3220-324F,3251-325F, 3280-32CB, 32D0-3376, 337B-337F, 33E0-33FF"
            Target="Meiryo, MS Gothic"
            Scale="1.0" />

        <!--
            Enclosed CJK PTE                  3250
            Enclosed CJK (Sq Lat Ab Hg to LTD)32CC-32CF
            CJK Comp (Latin Abr DM-IU)        3377-337A -->
        <!-- No font -->

        <!-- CJK Comp Square Latin Abr   3380-33DF -->
        <!-- CHS -->
        <FontFamilyMap
            Unicode="3380-33DF"
            Language="zh-Hans"
            Target="Microsoft YaHei, MS Gothic"
            Scale="1.0" />
        <!-- CHT -->
        <FontFamilyMap
            Unicode="3380-33DF"
            Language="zh-Hant"
            Target="Microsoft JhengHei, MS Gothic"
            Scale="1.0" />
        <!-- KO -->
        <FontFamilyMap
            Unicode="3380-33DF"
            Language="ko"
            Target="Malgun Gothic, Meiryo, Gulim, MS Gothic"
            Scale="1.0" />
        <!-- Other (include JA) -->
        <FontFamilyMap
            Unicode="3380-33DF"
            Target="Meiryo, MS Gothic"
            Scale="1.0" />

        <!--
            CJK Unified Ext A           3400-4DBF
            CJK Unified                 4E00-9FBB -->
        <!-- CHS -->
        <FontFamilyMap
            Unicode="3400-4DBF, 4E00-9FBB"
            Language="zh-Hans"
            Target="Microsoft YaHei, SimSun, SimSun-18030, SimSun-ExtB"
            Scale="1.0" />
        <!-- Hong Kong -->
        <FontFamilyMap
            Unicode="3400-4DBF, 4E00-9FBB"
            Language="zh-HK"
            Target="Microsoft JhengHei, Microsoft YaHei, MingLiU_HKSCS, MingLiU, SimSun-18030, SimSun-ExtB, Simsun"
            Scale="1.0" />
        <!-- CHT -->
        <FontFamilyMap
            Unicode="3400-4DBF, 4E00-9FBB"
            Language="zh-Hant"
            Target="Microsoft JhengHei, Microsoft YaHei, MingLiU, SimSun-18030, SimSun-ExtB, Simsun"
            Scale="1.0" />
        <!-- JA -->
        <FontFamilyMap
            Unicode="3400-4DBF, 4E00-9FBB"
            Language="ja"
            Target="Meiryo, MS Gothic, Microsoft YaHei, SimSun"
            Scale="1.0" />
        <!-- KO -->
        <FontFamilyMap
            Unicode="3400-4DBF, 4E00-9FBB"
            Language="ko"
            Target="Gulim, Microsoft YaHei, MS Gothic, Simsun"
            Scale="1.0" />
        <!-- Other -->
        <FontFamilyMap
            Unicode="3400-4DBF, 4E00-9FBB"
            Target="Meiryo, Microsoft YaHei, MS Gothic, Simsun"
            Scale="1.0" />

        <!-- Yijing Hexagram Symbols     4DC0-4DFF -->
        <!-- No font -->

        <!--
            Yi Syllables                   A000-A48F
            Yi Radicals                    A490-A4CF  -->
        <FontFamilyMap
            Unicode="A000-A4CF"
            Target="Microsoft Yi Baiti, SimSun-18030, SimSun-ExtB"
            Scale="1.0" />

        <!-- 
            Modifier Tone Letters       A700-A71F
            Syloti Nagri                A800-A82F -->
        <!-- No font -->

        <!-- CHS CJK Compatibility Ideographs F900-FAFF -->
        <!-- CHS -->
        <FontFamilyMap
            Unicode="F900-FAFF"
            Language="zh-Hans"
            Target="Microsoft YaHei, Microsoft JhengHei, Meiryo, MS Gothic, Gulim"
            Scale="1.0" />
        <!-- CHT -->
        <FontFamilyMap
            Unicode="F900-FAFF"
            Language="zh-Hant"
            Target="Microsoft JhengHei, Meiryo, MS Gothic, Gulim"
            Scale="1.0" />
        <!-- KO -->
        <FontFamilyMap
            Unicode="F900-FAFF"
            Language="ko"
            Target="Meiryo, Microsoft JhengHei, Gulim, MS Gothic"
            Scale="1.0" />
        <!-- Other (include JA) -->
        <FontFamilyMap
            Unicode="F900-FAFF"
            Target="Meiryo, Microsoft JhengHei, MS Gothic, Gulim"
            Scale="1.0" />

        <!-- Variation Selectors         FE00-FE0F -->
        <!-- No font -->

        <!-- Vertical Forms              FE10-FE1F -->
        <!-- CHS -->
        <FontFamilyMap  
            Unicode="FE10-FE1F"
            Language="zh-Hans"
            Target="Microsoft YaHei"
            Scale="1.0"/>
        <!-- KO -->
        <FontFamilyMap  
            Unicode="FE10-FE1F"
            Language="ko"
            Target="Malgun Gothic, Microsoft JhengHei"
            Scale="1.0"/>
        <!-- Other (include CHT and JA) -->
        <FontFamilyMap  
            Unicode="FE10-FE1F"
            Target="Microsoft JhengHei"
            Scale="1.0"/>

        <!--
            CJK Compatibility Forms     FE30-FE4F
            Small Form Variants         FE50-FE6F -->
        <!-- CHS -->
        <FontFamilyMap
            Unicode="FE30-FE6F"
            Language="zh-Hans"
            Target="Microsoft YaHei, SimSun"
            Scale="1.0" />
        <!-- CHT -->
        <FontFamilyMap
            Unicode="FE30-FE6F"
            Language="zh-Hant"
            Target="Microsoft JhengHei, MingLiU"
            Scale="1.0" />
        <!-- JA -->
        <FontFamilyMap
            Unicode="FE30-FE6F"
            Language="ja"
            Target="Meiryo, Microsoft JhengHei, MingLiU"
            Scale="1.0" />
        <!-- KO -->
        <FontFamilyMap
            Unicode="FE30-FE6F"
            Language="ko"
            Target="Malgun Gothic, Microsoft JhengHei, MingLiU"
            Scale="1.0" />
        <!-- Other -->
        <FontFamilyMap
            Unicode="FE30-FE6F"
            Target="Microsoft JhengHei, MingLiu"
            Scale="1.0" />

        <!-- Character FEFF              FEFF -->
        <!-- No font -->

        <!-- Halfw and Fullw Forms Latin FF00-FF60 -->
        <!-- CHS -->
        <FontFamilyMap
            Unicode="FF00-FF60"
            Language="zh-Hans"
            Target="Microsoft YaHei, MS Gothic"
            Scale="1.0" />
        <!-- CHT -->
        <FontFamilyMap
            Unicode="FF00-FF60"
            Language="zh-Hant"
            Target="Microsoft JhengHei, MS Gothic"
            Scale="1.0" />
        <!-- KO -->
        <FontFamilyMap
            Unicode="FF00-FF60"
            Language="ko"
            Target="Malgun Gothic, Meiryo, Gulim, MS Gothic"
            Scale="1.0" />
        <!-- Other (include JA) -->
        <FontFamilyMap
            Unicode="FF00-FF60"
            Target="Meiryo, MS Gothic"
            Scale="1.0" />

        <!--
            Half Full Forms CJK Punct   FF61-FF64
            Half Full Forms Katakana    FF65-FF9F -->
        <!-- CHS -->
        <FontFamilyMap
            Unicode="FF61-FF9F"
            Language="zh-Hans"
            Target="Microsoft YaHei, MingLiU"
            Scale="1.0" />
        <!-- CHT -->
        <FontFamilyMap
            Unicode="FF61-FF9F"
            Language="zh-Hant"
            Target="Microsoft JhengHei, MingLiU"
            Scale="1.0" />
        <!-- Other (include JA and KO) -->
        <FontFamilyMap
            Unicode="FF61-FF9F"
            Target="Meiryo, MS Gothic"
            Scale="1.0" />

        <!-- Half Full Forms Hangul      FFA0-FFDC -->
        <!-- No font -->

        <!-- Half Full (Symbol Variants) FFE0-FFEE -->
        <!-- CHS -->
        <FontFamilyMap
            Unicode="FFE0-FFEE"
            Language="zh-Hans"
            Target="Microsoft YaHei, MS Gothic"
            Scale="1.0" />
        <!-- CHT -->
        <FontFamilyMap
            Unicode="FFE0-FFEE"
            Language="zh-Hant"
            Target="Microsoft JhengHei, MS Gothic"
            Scale="1.0" />
        <!-- JA -->
        <FontFamilyMap  
            Unicode="FFE0-FFEE"
            Language="ja"
            Target="Meiryo, Microsoft JhengHei, MS Gothic"
            Scale="1.0"/>
        <!-- KO -->
        <FontFamilyMap
            Unicode="FFE0-FFEE"
            Language="ko"
            Target="Malgun Gothic, Meiryo, Microsoft JhengHei, Gulim, MS Gothic"
            Scale="1.0" />
        <!-- Other -->
        <FontFamilyMap
            Unicode="FFE0-FFEE"
            Target="Microsoft JhengHei, MS Gothic"
            Scale="1.0" />

        <!-- Specials                    FFF0-FFFD -->
        <FontFamilyMap
            Unicode="FFF0-FFFD"
            Target="Arial"
            Scale="1.0" />

        <!--
            Linear B Syllabary          10000-1007F
            Linear B Ideograms          10080-100FF
            Aegean Numbers              10100-1013F
            Old Italic                  10300-1032F
            Gothic                      10330-1034F
            Ugaritic                    10380-1039F
            Old Persian                 103A0-103DF
            Deseret                     10400-1044F
            Shavian                     10450-1047F
            Osmanya                     10480-104AF
            Cypriot Syllabary           10800-1083F
            Kharoshthi                  10A00-10A5F
            Byzantine Musical Symbols   1D000-1D0FF
            Musical Symbols             1D100-1D1FF
            Ancient Greek Musical Not.  1D200-1D24F
            Tai Xuan Jing Symbols       1D300-1D35F
            Math Alphanumeric Symbols   1D400-1D7FF -->
        <!-- No font -->

        <!-- CJK Unified Ideographs ExtB 20000-2A6DF -->
        <!-- CHS -->
        <FontFamilyMap
            Unicode="20000-2A6DF"
            Language="zh-Hans"
            Target="SimSun-ExtB"
            Scale="1.0" />
        <!-- Hong Kong -->
        <FontFamilyMap
            Unicode="20000-2A6DF"
            Language="zh-HK"
            Target="MingLiU_HKSCS-ExtB, MingLiU-ExtB"
            Scale="1.0" />
        <!-- JA -->
        <FontFamilyMap
            Unicode="20000-2A6DF"
            Language="ja"
            Target="Meiryo, MS Gothic, MingLiU-ExtB"
            Scale="1.0" />
        <!-- Other (include CHT and KO) -->
        <FontFamilyMap
            Unicode="20000-2A6DF"
            Target="MingLiU-ExtB"
            Scale="1.0" />

        <!-- CJK Comp Ideog Supplement      2F800-2FA1F -->
        <!-- CHS -->
        <FontFamilyMap  
            Unicode          = "2F800-2FA1F"
            Language         = "zh-Hans"
            Target           = "SimSun-ExtB"
            Scale            = "1.0"/>
        <!-- Hong Kong -->
        <FontFamilyMap  
            Unicode          = "2F800-2FA1F"
            Language         = "zh-HK"
            Target           = "MingLiU_HKSCS-ExtB, MingLiU-ExtB"
            Scale            = "1.0"/>
        <!-- CHT -->
        <FontFamilyMap  
            Unicode          = "2F800-2FA1F"
            Language         = "zh-Hant"
            Target           = "MingLiU-ExtB"
            Scale            = "1.0"/>
        <!-- Other (include JA and KO) -->
        <FontFamilyMap  
            Unicode          = "2F800-2FA1F"
            Target           = "Meiryo, MingLiU-ExtB"
            Scale            = "1.0"/>

    </FontFamily.FamilyMaps>

</FontFamily>
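
To make the ordering in the file above concrete, here is a minimal Python sketch of how a client might resolve a fallback chain from such FontFamilyMap entries. It assumes the first entry whose Unicode range and Language both match wins, and that an entry without a Language attribute matches any language; those rules, the helper names, and the un-namespaced `FamilyMaps` wrapper in the sample are illustrative assumptions, not a description of how any particular renderer behaves.

```python
import xml.etree.ElementTree as ET

# Two entries copied from the file above. The wrapper element and the missing
# XML namespace are simplifications for this sketch (hypothetical).
SAMPLE = """
<FamilyMaps>
  <FontFamilyMap Unicode="3400-4DBF, 4E00-9FBB" Language="ja"
                 Target="Meiryo, MS Gothic, Microsoft YaHei, SimSun" Scale="1.0" />
  <FontFamilyMap Unicode="3400-4DBF, 4E00-9FBB"
                 Target="Meiryo, Microsoft YaHei, MS Gothic, Simsun" Scale="1.0" />
</FamilyMaps>
"""


def parse_ranges(spec):
    """Turn "3400-4DBF, 4E00-9FBB" into [(0x3400, 0x4DBF), (0x4E00, 0x9FBB)]."""
    ranges = []
    for part in spec.split(","):
        lo, _, hi = part.strip().partition("-")
        ranges.append((int(lo, 16), int(hi or lo, 16)))
    return ranges


def load_maps(xml_text):
    """Read every FontFamilyMap element, preserving document order."""
    maps = []
    for el in ET.fromstring(xml_text).iter("FontFamilyMap"):
        maps.append({
            "ranges": parse_ranges(el.get("Unicode")),
            "language": el.get("Language"),  # None is treated as "any language"
            "targets": [t.strip() for t in el.get("Target").split(",")],
        })
    return maps


def language_matches(map_lang, text_lang):
    """Assumed rule: same tag, or a prefix ending at a subtag boundary."""
    if map_lang is None:
        return True
    a, b = map_lang.lower(), text_lang.lower()
    return b == a or b.startswith(a + "-")


def fallback_chain(maps, codepoint, text_lang):
    """Return the Target list of the first map covering codepoint and language."""
    for m in maps:
        in_range = any(lo <= codepoint <= hi for lo, hi in m["ranges"])
        if in_range and language_matches(m["language"], text_lang):
            return m["targets"]
    return []


if __name__ == "__main__":
    maps = load_maps(SAMPLE)
    print(fallback_chain(maps, 0x4E2D, "ja-JP"))  # Japanese-tagged text
    print(fallback_chain(maps, 0x4E2D, "de"))     # falls through to the untagged entry
```

Under these assumptions the order of entries matters: the language-specific maps have to come before the untagged "Other" map, which is how the file above is laid out.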

@twardoch

Relatedly, there could be a template CSS that uses locally sourced Noto fonts with unicode-range descriptors set, so that together they would produce HTML+CSS composite font families, akin to:

@font-face {
  font-family: 'Noto Sans Global';
  src: local('Noto Sans');
  unicode-range: U+0025-00FF;
}
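
A template like this could be generated rather than written by hand. The following Python sketch builds one @font-face rule per (local font name, unicode-range) pair. Only the first pair comes from the snippet above; the other two pairs and the function name are illustrative assumptions, and a real template would need ranges derived from each font's actual coverage.

```python
def font_face_rules(family, sources):
    """Build one @font-face rule per (local font name, unicode-range) pair."""
    rules = []
    for local_name, unicode_range in sources:
        rules.append(
            "@font-face {\n"
            f"  font-family: '{family}';\n"
            f"  src: local('{local_name}');\n"
            f"  unicode-range: {unicode_range};\n"
            "}"
        )
    return "\n\n".join(rules)


# The first pair is taken from the CSS above. The other two are hypothetical
# examples that simply use the Unicode block ranges for Hebrew and Thai.
SOURCES = [
    ("Noto Sans", "U+0025-00FF"),
    ("Noto Sans Hebrew", "U+0590-05FF"),
    ("Noto Sans Thai", "U+0E00-0E7F"),
]

if __name__ == "__main__":
    print(font_face_rules("Noto Sans Global", SOURCES))
```

Because every rule shares the same font-family name, a page would only need to reference 'Noto Sans Global' and the browser would pick the matching face per character.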

@hrhatada

Have we documented this issue yet?

@kenlunde

@hrhatada: CFR objects are documented in ISO/IEC 14496-28, which is available at no charge. As far as I am aware, all other font fallback mechanisms are undocumented, and documenting them is likely to be a non-trivial effort.

@simoncozens transferred this issue from notofonts/noto-fonts on Jun 21, 2022