I would suggest using the members of string, but with an explicit encoding:
byte[] bytes = text.getBytes("UTF-8");
String text = new String(bytes, "UTF-8");
By using an explicit encoding (and one which supports all of Unicode) you avoid the problems of just calling text.getBytes()
etc:
- You're explicitly using a specific encoding, so you know which encoding to use later, rather than relying on the platform default.
- You know it will support all of Unicode (as opposed to, say, ISO-Latin-1).
EDIT: Even though UTF-8 is the default encoding on Android, I'd definitely be explicit about this. For example, this question only says "in Java or Android" - so it's entirely possible that the code will end up being used on other platforms.
Basically given that the normal Java platform can have different default encodings, I think it's best to be absolutely explicit. I've seen way too many people using the default encoding and losing data to take that risk.
EDIT: In my haste I forgot to mention that you don't have to use the encoding's name - you can use a Charset
instead. Using Guava I'd really use:
byte[] bytes = text.getBytes(Charsets.UTF_8);
String text = new String(bytes, Charsets.UTF_8);
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…