Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

ios - How to express Strings in Swift using Unicode hexadecimal values (UTF-16)

I want to write a Unicode string using hexadecimal values in Swift. I have read the documentation for String and Character so I know that I can use special Unicode characters directly in strings like the following:

var variableString = "Cat‼🐱" // "Cat" + Double Exclamation + cat emoji

But I would like to do it using the Unicode code points. The docs (and this question) show it for characters, but are not very clear about how to do it for strings.

(Note: Although the answer seems obvious to me now, it wasn't obvious at all just a short time ago. I am answering my own question below as a means of learning how to do this and also to help myself understand Unicode terminology and how Swift Characters and Strings work.)


1 Reply


Updated for Swift 3

Character

The Swift syntax for forming a hexadecimal code point is

\u{n}

where n is a hexadecimal number up to 8 digits long. The valid range for a Unicode scalar is U+0 to U+D7FF and U+E000 to U+10FFFF inclusive. (The U+D800 to U+DFFF range is for surrogate pairs, which are not scalars themselves, but are used in UTF-16 for encoding the higher value scalars.)
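The valid range described above can be checked at runtime. The following is a minimal sketch, not part of the original answer: `isValidScalarValue` is a hypothetical helper name, and it relies on the failable `Unicode.Scalar(UInt32)` initializer returning nil for surrogates and out-of-range values.

```swift
// Hypothetical helper (not from the original answer): reports whether a
// numeric value is a valid Unicode scalar value.
func isValidScalarValue(_ value: UInt32) -> Bool {
    // Unicode.Scalar(UInt32) is a failable initializer; it returns nil for
    // surrogates (U+D800...U+DFFF) and for values above U+10FFFF.
    return Unicode.Scalar(value) != nil
}

// Probe the boundaries of the two valid ranges.
let boundaryValues: [UInt32] = [0x43, 0xD7FF, 0xD800, 0xDFFF, 0xE000, 0x10FFFF, 0x110000]
for value in boundaryValues {
    print(String(value, radix: 16, uppercase: true), isValidScalarValue(value))
}
```

Only the values inside U+D800 to U+DFFF and the value above U+10FFFF should be reported as invalid.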

Examples:

// The following forms are equivalent. They all produce "C".
let char1: Character = "\u{43}"
let char2: Character = "\u{0043}"
let char3: Character = "\u{00000043}"

// Higher value Unicode scalars are done similarly
let char4: Character = "\u{203C}" // ‼ (DOUBLE EXCLAMATION MARK character)
let char5: Character = "\u{1F431}" // 🐱 (cat emoji)

// Characters can be made up of multiple scalars
let char7: Character = "\u{65}\u{301}" // é = "e" + accent mark
let char8: Character = "\u{65}\u{301}\u{20DD}" // é⃝ = "e" + accent mark + circle
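To see that several scalars really do form a single Character, it may help to compare the different counts. This is a small illustrative sketch (the variable names are made up for the example); it assumes the standard library's behavior of counting grapheme clusters in `count` and comparing strings by canonical equivalence.

```swift
let decomposed = "\u{65}\u{301}"   // "e" followed by COMBINING ACUTE ACCENT
let precomposed = "\u{E9}"         // LATIN SMALL LETTER E WITH ACUTE

// One user-perceived character, built from two Unicode scalars.
let characterCount = decomposed.count
let scalarCount = decomposed.unicodeScalars.count

// Swift compares strings by canonical equivalence, so both spellings match
// even though their scalars differ.
let sameText = (decomposed == precomposed)

print(characterCount, scalarCount, sameText) // 1 2 true
```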

String

Strings are composed of characters. See the following examples for some ways to form them using hexadecimal code points.

Examples:

var string1 = "\u{0043}\u{0061}\u{0074}\u{203C}\u{1F431}" // Cat‼🐱

// pass an array of characters to a String initializer
let catCharacters: [Character] = ["\u{0043}", "\u{0061}", "\u{0074}", "\u{203C}", "\u{1F431}"] // ["C", "a", "t", "‼", "🐱"]
let string2 = String(catCharacters) // Cat‼🐱
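Since the question mentions UTF-16, here is a rough sketch of how the same string looks through its different views. The names are illustrative only. Note that the cat emoji is one scalar but two UTF-16 code units (a surrogate pair).

```swift
let cat = "\u{0043}\u{0061}\u{0074}\u{203C}\u{1F431}"

let catCharacterCount = cat.count                // 5 characters
let catScalarCount = cat.unicodeScalars.count    // 5 Unicode scalars
let catUTF16Count = cat.utf16.count              // 6 code units (surrogate pair for U+1F431)

// The UTF-16 code units as uppercase hexadecimal text.
let catUTF16Hex = cat.utf16.map { String($0, radix: 16, uppercase: true) }
print(catUTF16Hex) // ["43", "61", "74", "203C", "D83D", "DC31"]
```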

Converting Hex Values at Runtime

At runtime you can convert hexadecimal or Int values into a Character or String by first converting them to a UnicodeScalar.

Examples:

// hex values
let value0: UInt8  = 0x43     // 67
let value1: UInt16 = 0x203C   // 8252
let value2: UInt32 = 0x1F431  // 128049

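// convert hex to UnicodeScalar (the UInt8 initializer cannot fail)
let scalar0 = UnicodeScalar(value0)
// the UInt16 and UInt32 initializers are failable, so unwrap them to make sure
// the values are valid Unicode scalars (this guard must sit inside a function)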
guard
    let scalar1 = UnicodeScalar(value1),
    let scalar2 = UnicodeScalar(value2) else {
    return
}

// convert to Character
let character0 = Character(scalar0) // C
let character1 = Character(scalar1) // ‼
let character2 = Character(scalar2) // 🐱

// convert to String
let string0 = String(scalar0) // C
let string1 = String(scalar1) // ‼
let string2 = String(scalar2) // 🐱

// convert hex array to String
let myHexArray = [0x43, 0x61, 0x74, 0x203C, 0x1F431] // an Int array
var myString = ""
for hexValue in myHexArray {
    if let scalar = UnicodeScalar(hexValue) {
        myString.append(Character(scalar))
    }
}
print(myString) // Cat‼🐱
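Two related cases are not covered by the answer above: hex values that arrive as text, and values that are UTF-16 code units rather than scalars. The following is a sketch under those assumptions. `character(fromHex:)` is a hypothetical helper name, while `UInt32(_:radix:)` and `String(decoding:as:)` are standard library calls.

```swift
// Hypothetical helper: parses hex text such as "1F431" into a Character.
// Returns nil if the text is not hexadecimal or is not a valid scalar value.
func character(fromHex hex: String) -> Character? {
    guard let value = UInt32(hex, radix: 16),
          let scalar = Unicode.Scalar(value) else {
        return nil
    }
    return Character(scalar)
}

// Build a string from hex text, skipping anything that fails to convert.
let hexStrings = ["43", "61", "74", "203C", "1F431"]
let parsed = String(hexStrings.compactMap { character(fromHex: $0) })

// UTF-16 code units (including the surrogate pair D83D DC31 for U+1F431)
// can be decoded directly instead of going through Unicode scalars.
let utf16Units: [UInt16] = [0x0043, 0x0061, 0x0074, 0x203C, 0xD83D, 0xDC31]
let decoded = String(decoding: utf16Units, as: UTF16.self)

print(parsed, decoded, parsed == decoded)
```

Both routes should produce the same five-character string. The surrogate values only make sense as a pair in the UTF-16 route, which is why `character(fromHex: "D83D")` returns nil.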
