public final class UScript
extends Object
java.lang.Object | |
↳ | android.icu.lang.UScript |
Constants for ISO 15924 script codes, and related functions.
The current set of script code constants supports at least all scripts that are encoded in the version of Unicode which ICU currently supports. The names of the constants are usually derived from the Unicode script property value aliases. See UAX #24 Unicode Script Property (http://www.unicode.org/reports/tr24/) and http://www.unicode.org/Public/UCD/latest/ucd/PropertyValueAliases.txt .
Starting with ICU 3.6, constants for most ISO 15924 script codes are included, for use with language tags, CLDR data, and similar. Some of those codes are not used in the Unicode Character Database (UCD). For example, there are no characters that have a UCD script property value of Hans or Hant. All Han ideographs have the Hani script property value in Unicode.
Private-use codes Qaaa..Qabx are not included.
Starting with ICU 55, script codes are only added when their scripts have been or will certainly be encoded in Unicode, and have been assigned Unicode script property value aliases, to ensure that their script names are stable and match the names of the constants. Script codes like Latf and Aran that are not subject to separate encoding may be added at any time.
Nested classes |
|
---|---|
enum |
UScript.ScriptUsage Script usage constants. |
Constants |
|
---|---|
int |
AFAKA ISO 15924 script code |
int |
AHOM ISO 15924 script code |
int |
ANATOLIAN_HIEROGLYPHS ISO 15924 script code |
int |
ARABIC Arabic |
int |
ARMENIAN Armenian |
int |
AVESTAN ISO 15924 script code |
int |
BALINESE ISO 15924 script code |
int |
BAMUM ISO 15924 script code |
int |
BASSA_VAH ISO 15924 script code |
int |
BATAK ISO 15924 script code |
int |
BENGALI Bengali |
int |
BLISSYMBOLS ISO 15924 script code |
int |
BOOK_PAHLAVI ISO 15924 script code |
int |
BOPOMOFO Bopomofo |
int |
BRAHMI ISO 15924 script code |
int |
BRAILLE Braille Script in Unicode 4 |
int |
BUGINESE Script in Unicode 4.1 |
int |
BUHID Buhid |
int |
CANADIAN_ABORIGINAL Unified Canadian Aboriginal Symbols |
int |
CARIAN ISO 15924 script code |
int |
CAUCASIAN_ALBANIAN ISO 15924 script code |
int |
CHAKMA ISO 15924 script code |
int |
CHAM ISO 15924 script code |
int |
CHEROKEE Cherokee |
int |
CIRTH ISO 15924 script code |
int |
COMMON Common |
int |
COPTIC Coptic |
int |
CUNEIFORM ISO 15924 script code |
int |
CYPRIOT Cypriot Script in Unicode 4 |
int |
CYRILLIC Cyrillic |
int |
DEMOTIC_EGYPTIAN ISO 15924 script code |
int |
DESERET Deseret |
int |
DEVANAGARI Devanagari |
int |
DUPLOYAN ISO 15924 script code |
int |
EASTERN_SYRIAC ISO 15924 script code |
int |
EGYPTIAN_HIEROGLYPHS ISO 15924 script code |
int |
ELBASAN ISO 15924 script code |
int |
ESTRANGELO_SYRIAC ISO 15924 script code |
int |
ETHIOPIC Ethiopic |
int |
GEORGIAN Georgian |
int |
GLAGOLITIC Script in Unicode 4.1 |
int |
GOTHIC Gothic |
int |
GRANTHA ISO 15924 script code |
int |
GREEK Greek |
int |
GUJARATI Gujarati |
int |
GURMUKHI Gurmukhi |
int |
HAN Han |
int |
HANGUL Hangul |
int |
HANUNOO Hanunoo |
int |
HARAPPAN_INDUS ISO 15924 script code |
int |
HATRAN ISO 15924 script code |
int |
HEBREW Hebrew |
int |
HIERATIC_EGYPTIAN ISO 15924 script code |
int |
HIRAGANA Hiragana |
int |
IMPERIAL_ARAMAIC ISO 15924 script code |
int |
INHERITED Inherited |
int |
INSCRIPTIONAL_PAHLAVI ISO 15924 script code |
int |
INSCRIPTIONAL_PARTHIAN ISO 15924 script code |
int |
INVALID_CODE Invalid code |
int |
JAPANESE ISO 15924 script code |
int |
JAVANESE ISO 15924 script code |
int |
JURCHEN ISO 15924 script code |
int |
KAITHI ISO 15924 script code |
int |
KANNADA Kannada |
int |
KATAKANA Katakana |
int |
KATAKANA_OR_HIRAGANA Script in Unicode 4.0.1 |
int |
KAYAH_LI ISO 15924 script code |
int |
KHAROSHTHI Script in Unicode 4.1 |
int |
KHMER Khmer |
int |
KHOJKI ISO 15924 script code |
int |
KHUDAWADI ISO 15924 script code |
int |
KHUTSURI ISO 15924 script code |
int |
KOREAN ISO 15924 script code |
int |
KPELLE ISO 15924 script code |
int |
LANNA ISO 15924 script code |
int |
LAO Lao |
int |
LATIN Latin |
int |
LATIN_FRAKTUR ISO 15924 script code |
int |
LATIN_GAELIC ISO 15924 script code |
int |
LEPCHA ISO 15924 script code |
int |
LIMBU Limbu Script in Unicode 4 |
int |
LINEAR_A ISO 15924 script code |
int |
LINEAR_B Linear B Script in Unicode 4 |
int |
LISU ISO 15924 script code |
int |
LOMA ISO 15924 script code |
int |
LYCIAN ISO 15924 script code |
int |
LYDIAN ISO 15924 script code |
int |
MAHAJANI ISO 15924 script code |
int |
MALAYALAM Malayalam |
int |
MANDAEAN ISO 15924 script code |
int |
MANDAIC ISO 15924 script code |
int |
MANICHAEAN ISO 15924 script code |
int |
MATHEMATICAL_NOTATION ISO 15924 script code |
int |
MAYAN_HIEROGLYPHS ISO 15924 script code |
int |
MEITEI_MAYEK ISO 15924 script code |
int |
MENDE Mende Kikakui ISO 15924 script code |
int |
MEROITIC ISO 15924 script code |
int |
MEROITIC_CURSIVE ISO 15924 script code |
int |
MEROITIC_HIEROGLYPHS ISO 15924 script code |
int |
MIAO ISO 15924 script code |
int |
MODI ISO 15924 script code |
int |
MONGOLIAN Mongolian |
int |
MOON ISO 15924 script code |
int |
MRO ISO 15924 script code |
int |
MULTANI ISO 15924 script code |
int |
MYANMAR Myanmar |
int |
NABATAEAN ISO 15924 script code |
int |
NAKHI_GEBA ISO 15924 script code |
int |
NEW_TAI_LUE Script in Unicode 4.1 |
int |
NKO ISO 15924 script code |
int |
NUSHU ISO 15924 script code |
int |
OGHAM Ogham |
int |
OLD_CHURCH_SLAVONIC_CYRILLIC ISO 15924 script code |
int |
OLD_HUNGARIAN ISO 15924 script code |
int |
OLD_ITALIC Old Italic |
int |
OLD_NORTH_ARABIAN ISO 15924 script code |
int |
OLD_PERMIC ISO 15924 script code |
int |
OLD_PERSIAN Script in Unicode 4.1 |
int |
OLD_SOUTH_ARABIAN ISO 15924 script code |
int |
OL_CHIKI ISO 15924 script code |
int |
ORIYA Oriya |
int |
ORKHON ISO 15924 script code |
int |
OSMANYA Osmanya Script in Unicode 4 |
int |
PAHAWH_HMONG ISO 15924 script code |
int |
PALMYRENE ISO 15924 script code |
int |
PAU_CIN_HAU ISO 15924 script code |
int |
PHAGS_PA ISO 15924 script code |
int |
PHOENICIAN ISO 15924 script code |
int |
PHONETIC_POLLARD ISO 15924 script code |
int |
PSALTER_PAHLAVI ISO 15924 script code |
int |
REJANG ISO 15924 script code |
int |
RONGORONGO ISO 15924 script code |
int |
RUNIC Runic |
int |
SAMARITAN ISO 15924 script code |
int |
SARATI ISO 15924 script code |
int |
SAURASHTRA ISO 15924 script code |
int |
SHARADA ISO 15924 script code |
int |
SHAVIAN Shavian Script in Unicode 4 |
int |
SIDDHAM ISO 15924 script code |
int |
SIGN_WRITING ISO 15924 script code for Sutton SignWriting |
int |
SIMPLIFIED_HAN ISO 15924 script code |
int |
SINDHI ISO 15924 script code |
int |
SINHALA Sinhala |
int |
SORA_SOMPENG ISO 15924 script code |
int |
SUNDANESE ISO 15924 script code |
int |
SYLOTI_NAGRI Script in Unicode 4.1 |
int |
SYMBOLS ISO 15924 script code |
int |
SYRIAC Syriac |
int |
TAGALOG Tagalog |
int |
TAGBANWA Tagbanwa |
int |
TAI_LE Tai Le Script in Unicode 4 |
int |
TAI_VIET ISO 15924 script code |
int |
TAKRI ISO 15924 script code |
int |
TAMIL Tamil |
int |
TANGUT ISO 15924 script code |
int |
TELUGU Telugu |
int |
TENGWAR ISO 15924 script code |
int |
THAANA Thaana |
int |
THAI Thai |
int |
TIBETAN Tibetan |
int |
TIFINAGH Script in Unicode 4.1 |
int |
TIRHUTA ISO 15924 script code |
int |
TRADITIONAL_HAN ISO 15924 script code |
int |
UCAS Unified Canadian Aboriginal Symbols (alias) |
int |
UGARITIC Ugaritic Script in Unicode 4 |
int |
UNKNOWN ISO 15924 script code |
int |
UNWRITTEN_LANGUAGES ISO 15924 script code |
int |
VAI ISO 15924 script code |
int |
VISIBLE_SPEECH ISO 15924 script code |
int |
WARANG_CITI ISO 15924 script code |
int |
WESTERN_SYRIAC ISO 15924 script code |
int |
WOLEAI ISO 15924 script code |
int |
YI Yi syllables |
Public methods |
|
---|---|
static final boolean |
breaksBetweenLetters(int script) Returns true if the script allows line breaks between letters (excluding hyphenation). |
static final int[] |
getCode(ULocale locale) Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. |
static final int[] |
getCode(String nameOrAbbrOrLocale) Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. |
static final int[] |
getCode(Locale locale) Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. |
static final int |
getCodeFromName(String nameOrAbbr) Returns the script code associated with the given Unicode script property alias (name or abbreviation). |
static final String |
getName(int scriptCode) Returns the long Unicode script name, if there is one. |
static final String |
getSampleString(int script) Returns the script sample character string. |
static final int |
getScript(int codepoint) Gets the script code associated with the given codepoint. |
static final int |
getScriptExtensions(int c, BitSet set) Sets code point c's Script_Extensions as script code integers into the output BitSet. |
static final String |
getShortName(int scriptCode) Returns the 4-letter ISO 15924 script code, which is the same as the short Unicode script name if Unicode has names for the script. |
static final UScript.ScriptUsage |
getUsage(int script) Returns the script usage according to UAX #31 Unicode Identifier and Pattern Syntax. |
static final boolean |
hasScript(int c, int sc) Do the Script_Extensions of code point c contain script sc? If c does not have explicit Script_Extensions, then this tests whether c has the Script property value sc. |
static final boolean |
isCased(int script) Returns true if, in modern (or most recent) usage of the script, case distinctions are customary. |
static final boolean |
isRightToLeft(int script) Returns true if the script is written right-to-left. |
Inherited methods |
|
---|---|
java.lang.Object
|
int ANATOLIAN_HIEROGLYPHS
ISO 15924 script code
Constant Value: 156 (0x0000009c)
int BASSA_VAH
ISO 15924 script code
Constant Value: 134 (0x00000086)
int BLISSYMBOLS
ISO 15924 script code
Constant Value: 64 (0x00000040)
int BOOK_PAHLAVI
ISO 15924 script code
Constant Value: 124 (0x0000007c)
int BRAILLE
Braille Script in Unicode 4
Constant Value: 46 (0x0000002e)
int CANADIAN_ABORIGINAL
Unified Canadian Aboriginal Symbols
Constant Value: 40 (0x00000028)
int CAUCASIAN_ALBANIAN
ISO 15924 script code
Constant Value: 159 (0x0000009f)
int CUNEIFORM
ISO 15924 script code
Constant Value: 101 (0x00000065)
int CYPRIOT
Cypriot Script in Unicode 4
Constant Value: 47 (0x0000002f)
int DEMOTIC_EGYPTIAN
ISO 15924 script code
Constant Value: 69 (0x00000045)
int EASTERN_SYRIAC
ISO 15924 script code
Constant Value: 97 (0x00000061)
int EGYPTIAN_HIEROGLYPHS
ISO 15924 script code
Constant Value: 71 (0x00000047)
int ESTRANGELO_SYRIAC
ISO 15924 script code
Constant Value: 95 (0x0000005f)
int GLAGOLITIC
Script in Unicode 4.1
Constant Value: 56 (0x00000038)
int HARAPPAN_INDUS
ISO 15924 script code
Constant Value: 77 (0x0000004d)
int HIERATIC_EGYPTIAN
ISO 15924 script code
Constant Value: 70 (0x00000046)
int IMPERIAL_ARAMAIC
ISO 15924 script code
Constant Value: 116 (0x00000074)
int INSCRIPTIONAL_PAHLAVI
ISO 15924 script code
Constant Value: 122 (0x0000007a)
int INSCRIPTIONAL_PARTHIAN
ISO 15924 script code
Constant Value: 125 (0x0000007d)
int KATAKANA_OR_HIRAGANA
Script in Unicode 4.0.1
Constant Value: 54 (0x00000036)
int KHAROSHTHI
Script in Unicode 4.1
Constant Value: 57 (0x00000039)
int KHUDAWADI
ISO 15924 script code
Constant Value: 145 (0x00000091)
int LATIN_FRAKTUR
ISO 15924 script code
Constant Value: 80 (0x00000050)
int LATIN_GAELIC
ISO 15924 script code
Constant Value: 81 (0x00000051)
int LINEAR_B
Linear B Script in Unicode 4
Constant Value: 49 (0x00000031)
int MANICHAEAN
ISO 15924 script code
Constant Value: 121 (0x00000079)
int MATHEMATICAL_NOTATION
ISO 15924 script code
Constant Value: 128 (0x00000080)
int MAYAN_HIEROGLYPHS
ISO 15924 script code
Constant Value: 85 (0x00000055)
int MEITEI_MAYEK
ISO 15924 script code
Constant Value: 115 (0x00000073)
int MENDE
Mende Kikakui ISO 15924 script code
Constant Value: 140 (0x0000008c)
int MEROITIC_CURSIVE
ISO 15924 script code
Constant Value: 141 (0x0000008d)
int MEROITIC_HIEROGLYPHS
ISO 15924 script code
Constant Value: 86 (0x00000056)
int NABATAEAN
ISO 15924 script code
Constant Value: 143 (0x0000008f)
int NAKHI_GEBA
ISO 15924 script code
Constant Value: 132 (0x00000084)
int NEW_TAI_LUE
Script in Unicode 4.1
Constant Value: 59 (0x0000003b)
int OLD_CHURCH_SLAVONIC_CYRILLIC
ISO 15924 script code
Constant Value: 68 (0x00000044)
int OLD_HUNGARIAN
ISO 15924 script code
Constant Value: 76 (0x0000004c)
int OLD_NORTH_ARABIAN
ISO 15924 script code
Constant Value: 142 (0x0000008e)
int OLD_PERMIC
ISO 15924 script code
Constant Value: 89 (0x00000059)
int OLD_PERSIAN
Script in Unicode 4.1
Constant Value: 61 (0x0000003d)
int OLD_SOUTH_ARABIAN
ISO 15924 script code
Constant Value: 133 (0x00000085)
int OSMANYA
Osmanya Script in Unicode 4
Constant Value: 50 (0x00000032)
int PAHAWH_HMONG
ISO 15924 script code
Constant Value: 75 (0x0000004b)
int PALMYRENE
ISO 15924 script code
Constant Value: 144 (0x00000090)
int PAU_CIN_HAU
ISO 15924 script code
Constant Value: 165 (0x000000a5)
int PHOENICIAN
ISO 15924 script code
Constant Value: 91 (0x0000005b)
int PHONETIC_POLLARD
ISO 15924 script code
Constant Value: 92 (0x0000005c)
int PSALTER_PAHLAVI
ISO 15924 script code
Constant Value: 123 (0x0000007b)
int RONGORONGO
ISO 15924 script code
Constant Value: 93 (0x0000005d)
int SAMARITAN
ISO 15924 script code
Constant Value: 126 (0x0000007e)
int SAURASHTRA
ISO 15924 script code
Constant Value: 111 (0x0000006f)
int SHAVIAN
Shavian Script in Unicode 4
Constant Value: 51 (0x00000033)
int SIGN_WRITING
ISO 15924 script code for Sutton SignWriting
Constant Value: 112 (0x00000070)
int SIMPLIFIED_HAN
ISO 15924 script code
Constant Value: 73 (0x00000049)
int SORA_SOMPENG
ISO 15924 script code
Constant Value: 152 (0x00000098)
int SUNDANESE
ISO 15924 script code
Constant Value: 113 (0x00000071)
int SYLOTI_NAGRI
Script in Unicode 4.1
Constant Value: 58 (0x0000003a)
int TRADITIONAL_HAN
ISO 15924 script code
Constant Value: 74 (0x0000004a)
int UCAS
Unified Canadian Aboriginal Symbols (alias)
Constant Value: 40 (0x00000028)
int UGARITIC
Ugaritic Script in Unicode 4
Constant Value: 53 (0x00000035)
int UNWRITTEN_LANGUAGES
ISO 15924 script code
Constant Value: 102 (0x00000066)
int VISIBLE_SPEECH
ISO 15924 script code
Constant Value: 100 (0x00000064)
int WARANG_CITI
ISO 15924 script code
Constant Value: 146 (0x00000092)
int WESTERN_SYRIAC
ISO 15924 script code
Constant Value: 96 (0x00000060)
boolean breaksBetweenLetters (int script)
Returns true if the script allows line breaks between letters (excluding hyphenation). Such a script typically requires dictionary-based line breaking. For example, Hani and Thai.
Parameters | |
---|---|
script |
int : script code |
Returns | |
---|---|
boolean |
true if the script allows line breaks between letters |
int[] getCode (ULocale locale)
Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. Returns MALAYALAM given "Malayalam" or "Mlym". Returns LATIN given "en" or "en_US".
Parameters | |
---|---|
locale |
ULocale : ULocale |
Returns | |
---|---|
int[] |
The script codes array, or null if the code cannot be found. |
int[] getCode (String nameOrAbbrOrLocale)
Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. Returns MALAYALAM given "Malayalam" or "Mlym". Returns LATIN given "en" or "en_US".
Note: To search by short or long script alias only, use getCodeFromName(String) instead. That does a fast lookup without accessing the locale data.
Parameters | |
---|---|
nameOrAbbrOrLocale |
String : name of the script or ISO 15924 code or locale |
Returns | |
---|---|
int[] |
The script codes array, or null if the code cannot be found. |
int[] getCode (Locale locale)
Gets the script codes associated with the given locale or ISO 15924 abbreviation or name. Returns MALAYALAM given "Malayalam" or "Mlym". Returns LATIN given "en" or "en_US".
Parameters | |
---|---|
locale |
Locale : Locale |
Returns | |
---|---|
int[] |
The script codes array, or null if the code cannot be found. |
int getCodeFromName (String nameOrAbbr)
Returns the script code associated with the given Unicode script property alias (name or abbreviation). Short aliases are ISO 15924 script codes. Returns MALAYALAM given "Malayalam" or "Mlym".
Parameters | |
---|---|
nameOrAbbr |
String : name of the script or ISO 15924 code |
Returns | |
---|---|
int |
The script code value, or INVALID_CODE if the code cannot be found. |
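The name-or-abbreviation lookup can be tried on a plain JVM without ICU. The following is a minimal sketch that uses the JDK's `Character.UnicodeScript.forName` as a stand-in for `getCodeFromName`; it is not the UScript implementation. The class `ScriptNameLookup` and its `lookup` method are hypothetical names, and an empty `Optional` plays the role of INVALID_CODE, because the JDK method throws IllegalArgumentException for unknown names instead of returning a sentinel.

```java
import java.lang.Character.UnicodeScript;
import java.util.Optional;

// Hypothetical helper; a JDK stand-in for UScript.getCodeFromName(String).
final class ScriptNameLookup {

    // Resolves a long script name ("Malayalam") or a 4-letter ISO 15924
    // abbreviation ("Mlym"). An empty result stands in for INVALID_CODE.
    static Optional<UnicodeScript> lookup(String nameOrAbbr) {
        try {
            return Optional.of(UnicodeScript.forName(nameOrAbbr));
        } catch (IllegalArgumentException e) {
            // The JDK throws for unknown names; UScript returns INVALID_CODE.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(lookup("Malayalam"));
        System.out.println(lookup("Mlym"));
        System.out.println(lookup("NoSuchScript"));
    }
}
```

On Android the corresponding call would be `UScript.getCodeFromName("Mlym")`, compared against `UScript.INVALID_CODE` to detect an unknown alias.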
String getName (int scriptCode)
Returns the long Unicode script name, if there is one. Otherwise returns the 4-letter ISO 15924 script code. Returns "Malayalam" given MALAYALAM.
Parameters | |
---|---|
scriptCode |
int : int script code |
Returns | |
---|---|
String |
long script name as given in PropertyValueAliases.txt, or the 4-letter code |
Throws | |
---|---|
IllegalArgumentException |
if the script code is not valid |
String getSampleString (int script)
Returns the script sample character string. This string normally consists of one code point but might be longer. The string is empty if the script is not encoded.
Parameters | |
---|---|
script |
int : script code |
Returns | |
---|---|
String |
the sample character string |
int getScript (int codepoint)
Gets the script code associated with the given codepoint. Returns UScript.MALAYALAM given 0x0D02.
Parameters | |
---|---|
codepoint |
int : UChar32 codepoint |
Returns | |
---|---|
int |
The script code |
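As a rough illustration of code-point-to-script lookup, the sketch below uses the JDK's `Character.UnicodeScript.of` in place of `getScript`. This is a substitute API chosen so the example runs on any JVM; it returns an enum rather than the int constants of this class. The class name `ScriptOfCodePoint` and the method `scriptOf` are hypothetical.

```java
import java.lang.Character.UnicodeScript;

// Hypothetical demo; a JDK stand-in for UScript.getScript(int).
final class ScriptOfCodePoint {

    // Returns the Unicode script of a code point as a JDK enum constant.
    static UnicodeScript scriptOf(int codePoint) {
        return UnicodeScript.of(codePoint);
    }

    public static void main(String[] args) {
        System.out.println(scriptOf(0x0D02)); // the code point used in the description
        System.out.println(scriptOf('A'));    // a Latin letter
        System.out.println(scriptOf(0x4E2D)); // a Han ideograph
    }
}
```

The last call reflects the note in the class overview: Han ideographs carry the Han (Hani) script property value rather than Hans or Hant.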
int getScriptExtensions (int c, BitSet set)
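Sets code point c's Script_Extensions as script code integers into the output BitSet. If c does not have Script_Extensions, then the one Script code is put into the set and also returned. If c is not a valid code point, then the one UNKNOWN code is put into the set and also returned. Some characters are commonly used in multiple scripts. For more information, see UAX #24: http://www.unicode.org/reports/tr24/.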
The Script_Extensions property is provisional. It may be modified or removed in future versions of the Unicode Standard, and thus in ICU.
Parameters | |
---|---|
c |
int : code point |
set |
BitSet : set of script code integers; will be cleared, then bits are set corresponding to c's Script_Extensions |
Returns | |
---|---|
int |
negative number of script codes in c's Script_Extensions, or the non-negative single Script value |
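The return convention (a negative count versus a non-negative single script code) is easy to misread, so here is a small sketch of how a caller might interpret the result together with the BitSet. It does not call ICU: the class `ScriptExtensionsReader` and its methods are hypothetical helpers, and the bit positions used in `main` are placeholder values rather than script codes taken from this page.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical helper for reading the (return value, BitSet) pair that
// getScriptExtensions(int c, BitSet set) produces. It does not call ICU.
final class ScriptExtensionsReader {

    // True when the code point has explicit Script_Extensions, which the
    // method signals with a negative return value.
    static boolean hasExplicitExtensions(int result) {
        return result < 0;
    }

    // Collects the script codes whose bits are set, in ascending order.
    static List<Integer> scriptCodes(BitSet set) {
        List<Integer> codes = new ArrayList<>();
        for (int code = set.nextSetBit(0); code >= 0; code = set.nextSetBit(code + 1)) {
            codes.add(code);
        }
        return codes;
    }

    public static void main(String[] args) {
        // On Android the pair would come from:
        //   BitSet set = new BitSet();
        //   int result = UScript.getScriptExtensions(c, set);
        // Here the output is simulated with two placeholder script codes.
        BitSet set = new BitSet();
        set.set(2);
        set.set(38);
        int result = -set.cardinality();
        System.out.println(hasExplicitExtensions(result) + " " + scriptCodes(set));
    }
}
```

Iterating with `nextSetBit` avoids scanning every possible script code and works the same way whether the set holds one code or several.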
String getShortName (int scriptCode)
Returns the 4-letter ISO 15924 script code, which is the same as the short Unicode script name if Unicode has names for the script. Returns "Mlym" given MALAYALAM.
Parameters | |
---|---|
scriptCode |
int : int script code |
Returns | |
---|---|
String |
short script name (4-letter code) |
Throws | |
---|---|
IllegalArgumentException |
if the script code is not valid |
UScript.ScriptUsage getUsage (int script)
Returns the script usage according to UAX #31 Unicode Identifier and Pattern Syntax. Returns NOT_ENCODED if the script is not encoded in Unicode.
Parameters | |
---|---|
script |
int : script code |
Returns | |
---|---|
UScript.ScriptUsage |
script usage |
boolean hasScript (int c, int sc)
Do the Script_Extensions of code point c contain script sc? If c does not have explicit Script_Extensions, then this tests whether c has the Script property value sc.
Some characters are commonly used in multiple scripts. For more information, see UAX #24: http://www.unicode.org/reports/tr24/.
The Script_Extensions property is provisional. It may be modified or removed in future versions of the Unicode Standard, and thus in ICU.
Parameters | |
---|---|
c |
int : code point |
sc |
int : script code |
Returns | |
---|---|
boolean |
true if sc is in Script_Extensions(c) |
boolean isCased (int script)
Returns true if, in modern (or most recent) usage of the script, case distinctions are customary. For example, Latn and Cyrl.
Parameters | |
---|---|
script |
int : script code |
Returns | |
---|---|
boolean |
true if the script is cased |
boolean isRightToLeft (int script)
Returns true if the script is written right-to-left. For example, Arab and Hebr.
Parameters | |
---|---|
script |
int : script code |
Returns | |
---|---|
boolean |
true if the script is right-to-left |