Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
573 views
in Technique[技术] by (71.8m points)

php - multibyte strtr() -> mb_strtr()

Does anyone have written multibyte variant of function strtr() ? I need this one.

Edit 1 (example of desired usage):

Example:
$from = '??????yáí??ň??'; // these chars are in UTF-8
$to   = 'llsctzyai?dnao';

// input - in UTF-8
$str  = 'K?de? ?at?ov u?í koňa ?ra? k?ru.';
$str  = mb_strtr( $str, $from, $to );

// output - str without diacritic
// $str = 'Krdel datlov uci kona zrat koru.';
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I believe strtr is multi-byte safe, either way since str_replace is multi-byte safe you could wrap it:

function mb_strtr($str, $from, $to)
{
  return str_replace(mb_str_split($from), mb_str_split($to), $str);
}

Since there is no mb_str_split function you also need to write your own (using mb_substr and mb_strlen), or you could just use the PHP UTF-8 implementation (changed slightly):

function mb_str_split($str) {
    return preg_split('~~u', $str, null, PREG_SPLIT_NO_EMPTY);;

}

However if you're looking for a function to remove all (latin?) accentuations from a string you might find the following function useful:

function Unaccent($string)
{
    return preg_replace('~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml|caron);~i', '$1', htmlentities($string, ENT_QUOTES, 'UTF-8'));
}

echo Unaccent('??????yáí??ň?'); // llsctzyairdna
echo Unaccent('I?t?rnati?nàliz?ti?n'); // Internationalizaetion

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...