I need to clean a string that is copied and pasted from various Microsoft Office applications (Excel, Access, and Word), each with its own encoding.
I'm using json_encode for debugging so that I can see every single encoded character.
I've been able to clean everything I've found so far with str_replace, but I have no luck with \u00a0.
$string = 'mail@mail.com\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0;mail@mail.com'; // this is the output from json_encode
$clean = str_replace("\u00a0", "", $string);
returns:
mail@mail.com\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0;mail@mail.com
That is exactly the same string; str_replace completely ignores \u00a0.
Is there a way around this? Also, I feel like I'm reinventing the wheel: is there a function or class that completely strips EVERY possible character of EVERY possible encoding?
____EDIT____
After the first two replies I need to clarify that my example DOES work, because it's the output from json_encode, not the actual string!
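To make the distinction concrete, here is a minimal sketch of what the actual string probably looks like and one way to strip the character from it. It assumes the real input is UTF-8 (json_encode only accepts valid UTF-8, so that seems likely), in which case the non-breaking space U+00A0 is stored as the two bytes 0xC2 0xA0 rather than as the six characters \u00a0. The function name strip_nbsp and the sample string are hypothetical and only for illustration.

```php
<?php
// Hypothetical helper: removes every non-breaking space (U+00A0) from a UTF-8 string.
function strip_nbsp(string $text): string
{
    // In UTF-8, U+00A0 is encoded as the byte sequence 0xC2 0xA0.
    return str_replace("\xC2\xA0", '', $text);
}

// Hypothetical sample input containing real non-breaking spaces,
// not the escaped text that json_encode prints.
$nbsp   = "\xC2\xA0";
$string = "mail@mail.com{$nbsp} {$nbsp} {$nbsp};mail@mail.com";

$clean = strip_nbsp($string);

// Alternative: with the /u modifier, \x{00A0} matches the code point itself.
$cleanRegex = preg_replace('/\x{00A0}/u', '', $string);

// Both approaches should agree; ordinary spaces are left in place.
var_dump($clean === $cleanRegex);
echo json_encode($clean), PHP_EOL;
```

If the input were in a single-byte encoding such as Windows-1252 instead, the non-breaking space would be the single byte 0xA0, so the search string would have to change accordingly.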