Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

php - Bad Request. Connecting to sites via curl on host and system

I have this cURL code in PHP:

curl_setopt($ch, CURLOPT_URL, trim("http://stackoverflow.com/questions/tagged/java")); 
curl_setopt($ch, CURLOPT_PORT, 80); //ignore explicit setting of port 80
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); 
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_ENCODING, "");
curl_setopt($ch, CURLOPT_HTTPHEADER, $v);
curl_setopt($ch, CURLOPT_VERBOSE, true);

The contents of HTTPHEADER are:

Proxy-Connection: Close
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1017.2 Safari/535.19
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Cookie: __qca=blabla
Connection: Close

Each of these is an individual item in the array $v.

When I upload the file to my host and run the code, I get:

400 Bad request

Your browser sent an invalid request.

But when I run it on my own system using command-line PHP, I get:

< HTTP/1.1 200 OK
< Vary: Accept-Encoding
< Cache-Control: private
< Content-Type: text/html; charset=utf-8
< Content-Encoding: gzip
< Date: Sat, 03 Mar 2012 21:50:17 GMT
< Connection: close
< Set-Cookie: buncha cookies; path=/; HttpOnly
< Content-Length: 22151
< 
* Closing connection #0

This happens not only with Stack Overflow but also with 4shared, while Google and other sites work fine.

Thanks for any help.


1 Reply

This is more of a comment than an answer: from your question it is not clear what specifically triggers the 400 error or, more concretely, where it comes from.

Is it output produced by your own server? Or is it the curl response from the remote site, which your script then prints?
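One way to answer those two questions is to ask curl itself what happened after the transfer. The following is only a sketch: describeCurlOutcome() and fetchAndDescribe() are hypothetical helper names, not part of the curl extension, while curl_errno(), curl_error() and curl_getinfo() with CURLINFO_HTTP_CODE are the standard calls.

```php
<?php
// Hypothetical helper: turn the result of a finished transfer into a short description.
function describeCurlOutcome(int $errno, string $error, int $httpCode): string
{
    if ($errno !== 0) {
        // curl itself failed (DNS, connect, timeout, ...); no usable HTTP response.
        return "transport error {$errno}: {$error}";
    }
    if ($httpCode >= 400) {
        // Some server (or a proxy in between) answered curl with an error status.
        return "HTTP error status {$httpCode} returned to curl";
    }
    return "HTTP status {$httpCode}";
}

// Hypothetical wrapper: perform a request and describe how it ended.
function fetchAndDescribe(string $url): string
{
    $handle = curl_init($url);
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);
    curl_exec($handle);

    return describeCurlOutcome(
        curl_errno($handle),
        curl_error($handle),
        (int) curl_getinfo($handle, CURLINFO_HTTP_CODE)
    );
}
```

If a call such as fetchAndDescribe("http://stackoverflow.com/questions/tagged/java") reports a 400 status, the error page was received by curl from the remote side or from something in between. If the 400 page shows up although curl reports something else, it more likely comes from the web server that hosts your script.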

To make debugging easier, here is a slightly different way of configuring the request with the curl extension that you might find useful. The function curl_setopt_array sets multiple options at once and returns false as soon as one of them cannot be set. It lets you define the complete request configuration up front, so you can more easily swap it for a second (debug) configuration:

$curlDefault = array(
    CURLOPT_PORT => 80, //ignore explicit setting of port 80
    CURLOPT_RETURNTRANSFER => TRUE,
    CURLOPT_FOLLOWLOCATION => TRUE,
    CURLOPT_ENCODING => '',
    CURLOPT_HTTPHEADER => array(
        'Proxy-Connection: Close',
        'User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1017.2 Safari/535.19',
        'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Encoding: gzip,deflate,sdch',
        'Accept-Language: en-US,en;q=0.8',
        'Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3',
        'Cookie: __qca=blabla',
        'Connection: Close',
    ),
    CURLOPT_VERBOSE => TRUE, // TRUE to output verbose information. Writes output to STDERR, or the file specified using CURLOPT_STDERR.
);

$url = "http://stackoverflow.com/questions/tagged/java";
$handle = curl_init($url);
curl_setopt_array($handle, $curlDefault);
$html = curl_exec($handle);
curl_close($handle);

This might help you to improve the code and to debug things.
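As a rough illustration of swapping in a second (debug) configuration and checking the return value of curl_setopt_array, consider the sketch below. buildCurlOptions() and the $curlDebug array are assumptions made for this example; the option values are only illustrative.

```php
<?php
// Hypothetical helper: overlay debug options on top of a default configuration.
function buildCurlOptions(array $default, array $override): array
{
    // array_replace() keeps the integer CURLOPT_* keys intact.
    // array_merge() would renumber integer keys and must not be used here.
    return array_replace($default, $override);
}

$curlDefault = array(
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_FOLLOWLOCATION => true,
    CURLOPT_ENCODING       => '',
);

// Assumed debug overrides, chosen only for illustration.
$curlDebug = array(
    CURLOPT_FOLLOWLOCATION => false, // stop at the first response to inspect it
    CURLOPT_HEADER         => true,  // include response headers in the returned string
);

$options = buildCurlOptions($curlDefault, $curlDebug);

$handle = curl_init();
if (!curl_setopt_array($handle, $options)) {
    // false means at least one option could not be set
    throw new RuntimeException('curl_setopt_array() rejected an option');
}
```

Keeping the defaults and the overrides in separate arrays means the debug variant can be dropped again without touching the normal configuration.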

Furthermore, you are using the CURLOPT_VERBOSE option. By default this writes the verbose information to STDERR, which you normally do not get to see when the script runs on a web host. Instead, you can redirect it with CURLOPT_STDERR and add it to the output to better see what is going on:

...
    CURLOPT_VERBOSE => TRUE, // TRUE to output verbose information. Writes output to STDERR, or the file specified using CURLOPT_STDERR.
    CURLOPT_STDERR => $verbose = fopen('php://temp', 'w+'), // capture the verbose log in a temporary read/write stream
);

$url = "http://stackoverflow.com/questions/tagged/java";
$handle = curl_init($url);
curl_setopt_array($handle, $curlDefault);
$html = curl_exec($handle);
$urlEndpoint = curl_getinfo($handle, CURLINFO_EFFECTIVE_URL);
rewind($verbose); // move back to the start of the captured log before reading it
echo "Verbose information:\n<pre>", htmlspecialchars(stream_get_contents($verbose)), "</pre>\n";
curl_close($handle);

This gives output along the following lines:

Verbose information:
* About to connect() to stackoverflow.com port 80 (#0)
*   Trying 64.34.119.12...
* connected
* Connected to stackoverflow.com (64.34.119.12) port 80 (#0)
> GET /questions/tagged/java HTTP/1.1
Host: stackoverflow.com
Proxy-Connection: Close
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1017.2 Safari/535.19
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Cookie: __qca=blabla
Connection: Close

< HTTP/1.1 200 OK
< Cache-Control: private
< Content-Type: text/html; charset=utf-8
< Content-Encoding: gzip
< Vary: Accept-Encoding
< Date: Mon, 05 Mar 2012 17:33:11 GMT
< Connection: close
< Content-Length: 10537
< 
* Closing connection #0

This should give you the information needed to track things down if they are request- or curl-related. You can then easily change parameters and see whether it makes a difference. Also compare the curl version installed locally with the one on the server. To obtain it, use curl_version:

$curlVersion = curl_version();
echo $curlVersion['version']; // e.g. 7.24.0
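For the "change parameters and see whether it makes a difference" step, one possible approach is to drop one request header at a time and repeat the request with each reduced list. This is only a sketch: headerVariants() is a hypothetical helper, and the shortened header list is just sample data taken from the question.

```php
<?php
// Hypothetical helper: build one header list per header, each with that header left out.
function headerVariants(array $headers): array
{
    $variants = array();
    foreach ($headers as $index => $header) {
        $remaining = $headers;
        unset($remaining[$index]);
        // Key each variant by the header that was dropped and re-index the rest.
        $variants[$header] = array_values($remaining);
    }
    return $variants;
}

$headers = array(
    'Proxy-Connection: Close',
    'Accept-Encoding: gzip,deflate,sdch',
    'Connection: Close',
);

foreach (headerVariants($headers) as $dropped => $remaining) {
    // Each $remaining list could be passed as CURLOPT_HTTPHEADER for one test request.
    echo "Without [{$dropped}]: ", implode(' | ', $remaining), "\n";
}
```

If the 400 response disappears for exactly one variant, the dropped header is a good candidate for the cause.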

Hope this helps you to track things down.

