Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


encoding - Emoji value range

I am trying to strip all emoji characters from a string (like a sanitizer), but I cannot find a complete set of emoji values.

What is the complete set of UTF-16 values for emoji characters?


Fight the dragon too long and you become the dragon yourself; gaze into the abyss too long and the abyss gazes back…

1 Reply


The Unicode standard's Unicode® Technical Report #51 includes a list of emoji (emoji-data.txt):

...
21A9 ;  text ;  L1 ;    none ;  j   # V1.1 (↩) LEFTWARDS ARROW WITH HOOK
21AA ;  text ;  L1 ;    none ;  j   # V1.1 (↪) RIGHTWARDS ARROW WITH HOOK
231A ;  emoji ; L1 ;    none ;  j   # V1.1 (⌚) WATCH
231B ;  emoji ; L1 ;    none ;  j   # V1.1 (⌛) HOURGLASS
...

I believe you would want to remove each character listed in this document that has a Default_Emoji_Style of emoji (the second field in each row).

There is no way to identify the emoji characters in Unicode other than by referring to a definition list like this one. As the reference to the FAQ says, they are spread across different blocks.
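As a rough illustration of the list-driven approach, here is a minimal Python sketch. It assumes the semicolon-separated layout shown in the excerpt (code point first, Default_Emoji_Style second) and uses only the four sample rows, with the comments shortened. The names `parse_emoji_style`, `strip_emoji` and `utf16_units` are made up for this example, and the real emoji-data.txt may contain rows (such as multi-code-point sequences) that this sketch simply skips.

```python
# Sample rows in the layout of the emoji-data.txt excerpt (comments shortened).
# The real file is much longer; load it from disk instead of using this string.
SAMPLE = """\
21A9 ;  text ;  L1 ;    none ;  j   # V1.1 LEFTWARDS ARROW WITH HOOK
21AA ;  text ;  L1 ;    none ;  j   # V1.1 RIGHTWARDS ARROW WITH HOOK
231A ;  emoji ; L1 ;    none ;  j   # V1.1 WATCH
231B ;  emoji ; L1 ;    none ;  j   # V1.1 HOURGLASS
"""


def parse_emoji_style(data):
    """Return the set of code points whose second field is 'emoji'."""
    code_points = set()
    for line in data.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not line:
            continue
        fields = [field.strip() for field in line.split(";")]
        if len(fields) < 2 or fields[1] != "emoji":
            continue
        parts = fields[0].split()
        if len(parts) != 1:
            # Assumption: rows holding more than one code point are ignored here.
            continue
        code_points.add(int(parts[0], 16))
    return code_points


def strip_emoji(text, emoji_code_points):
    """Remove every character whose code point is in the given set."""
    return "".join(ch for ch in text if ord(ch) not in emoji_code_points)


def utf16_units(code_point):
    """Return the UTF-16 code units (one or two) for a single code point."""
    raw = chr(code_point).encode("utf-16-be")
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]


EMOJI = parse_emoji_style(SAMPLE)
cleaned = strip_emoji("a\u231ab\u21a9c", EMOJI)  # WATCH is removed, the arrow stays
```

With these sample rows only U+231A and U+231B end up in the set, so the text-style arrows are left alone. Since the question asks about UTF-16: a code point up to U+FFFF is a single UTF-16 code unit, while anything above that is encoded as a surrogate pair, which `utf16_units` makes visible.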

