Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
822 views
in Technique[技术] by (71.8m points)

php - file_get_contents() converts UTF-8 to ISO-8859-1

I am trying to get search results from yahoo.com.

But file_get_contents() converts UTF-8 charset (charset, that yahoo uses) content to ISO-8859-1.

Try:

$filename = "http://search.yahoo.com/search;_ylt=A0oG7lpgGp9NTSYAiQBXNyoA?p=naj%C5%A1%C5%A5astnej%C5%A1%C3%AD&fr2=sb-top&fr=yfp-t-701&type_param=&rd=pref";

echo file_get_contents($filename);

Scripts as

header('Content-Type: text/html; charset=UTF-8');

or

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

or

$er = mb_convert_encoding($filename , 'UTF-8');

or

$s2 = iconv("ISO-8859-1","UTF-8",$filename );

or

echo utf8_encode(file_get_contents($filename));

NOT help, because after getting web content speciall characters as ? ? ? are replaced with question marks ???

I would appreciate any kind of help.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

This seems to be a content negotiation problem as file_get_contents probably sends a request that only accepts ISO 8859-1 as character encoding.

You can create a custom stream context for file_get_contents using stream_context_create that explicitly states that you accept UTF-8:

$opts = array('http' => array('header' => 'Accept-Charset: UTF-8, *;q=0'));
$context = stream_context_create($opts);

$filename = "http://search.yahoo.com/search;_ylt=A0oG7lpgGp9NTSYAiQBXNyoA?p=naj%C5%A1%C5%A5astnej%C5%A1%C3%AD&fr2=sb-top&fr=yfp-t-701&type_param=&rd=pref";
echo file_get_contents($filename, false, $context);

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...