PHP curl: Asynchronous HTTP Requests (repost)

PHP does not support threads, yet we often want to do things in parallel, for example executing several HTTP requests at the same time. If we were to use multiple processes, there would be two problems:

1. It is not cross-platform.

2. The overhead of spawning a process seems rather high.

So we turn to asynchronous I/O to get a parallel-like effect. I wrote a (very rudimentary) implementation of this myself a long time ago; nowadays curl does it for us. Documentation for it is still scarce online, so here is a small contribution.

Document 1:

Let’s get one thing out in the open. Curl is sweet. It does its job very well, and I’m absolutely thrilled it exists.

If you’re using curl in your PHP app to make web requests, you’ve probably realized that by doing them one after the other, the total time is the sum of all the individual requests put together. That’s lame.
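For contrast, here is a minimal sketch of that naive sequential version (this snippet is an addition, not from the original post; the URLs are just placeholders):

<?php
// Naive sequential version: each curl_exec() blocks until its
// request finishes, so total time is the sum of all request times.
$urls = array('http://www.google.com', 'http://www.microsoft.com');
foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $body = curl_exec($ch);   // blocks here
    curl_close($ch);
    echo strlen($body) . " bytes from $url\n";
}
?>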

Unfortunately, curl_multi_exec is poorly documented in the PHP manual.

Let’s say that your app is hitting APIs from these servers:

Google: 0.1s

Microsoft: 0.3s

rustyrazorblade.com: 0.5s

Your total time will be 0.9s, just for the API calls.

By using curl_multi_exec, you can execute those requests in parallel, and you’ll only be limited by the slowest request, which is about 0.5s to rustyrazorblade.com in this case, assuming your download bandwidth is not slowing you down.

Sample code:

<?php
$nodes = array('http://www.google.com', 'http://www.microsoft.com', 'http://www.rustyrazorblade.com');
$node_count = count($nodes);

$curl_arr = array();
$master = curl_multi_init();

// Create one easy handle per URL and attach it to the multi handle.
for ($i = 0; $i < $node_count; $i++)
{
    $url = $nodes[$i];
    $curl_arr[$i] = curl_init($url);
    curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($master, $curl_arr[$i]);
}

// Drive all transfers to completion. curl_multi_select() blocks until
// there is socket activity, so we are not busy-waiting and burning CPU.
do {
    curl_multi_exec($master, $running);
    if ($running > 0 && curl_multi_select($master) === -1) {
        usleep(100); // select failed; back off briefly before retrying
    }
} while ($running > 0);

echo "results: ";
for ($i = 0; $i < $node_count; $i++)
{
    // curl_multi_getcontent() only returns the body because
    // CURLOPT_RETURNTRANSFER was set on each handle.
    $results = curl_multi_getcontent($curl_arr[$i]);
    echo($i . "\n" . $results . "\n");
    curl_multi_remove_handle($master, $curl_arr[$i]);
    curl_close($curl_arr[$i]);
}
curl_multi_close($master);
echo 'done';
?>

  

It’s really not documented on php.net how to use curl_multi_getcontent, so hopefully this helps someone.
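A related addition (mine, not from the original post): while a handle is still open, curl_getinfo() and curl_error() can tell you how its transfer went. A minimal sketch, assuming $curl_arr and $nodes from the sample above and that the handles have not been closed yet:

<?php
// Run this before curl_close() is called on the handles.
foreach ($curl_arr as $i => $ch) {
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE); // HTTP status code
    $error  = curl_error($ch);                       // transport-level error, if any
    echo $nodes[$i] . ' -> HTTP ' . $status . ($error ? " (error: $error)" : '') . "\n";
}
?>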

Document 2:

<?php
function getMultipleDocuments($nodes, $referer){
    set_time_limit(90);
    if (!$referer) {
        $referer = $nodes[0];
    }
    $node_count = count($nodes);

    $curl_arr = array();
    $master = curl_multi_init();

    for ($i = 0; $i < $node_count; $i++)
    {
        $curl_arr[$i] = curl_init($nodes[$i]);
        curl_setopt($curl_arr[$i], CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($curl_arr[$i], CURLOPT_FRESH_CONNECT, true);
        curl_setopt($curl_arr[$i], CURLOPT_CONNECTTIMEOUT, 10);
        curl_setopt($curl_arr[$i], CURLOPT_RETURNTRANSFER, true);
        curl_setopt($curl_arr[$i], CURLOPT_REFERER, $referer);
        curl_setopt($curl_arr[$i], CURLOPT_TIMEOUT, 30);

        curl_multi_add_handle($master, $curl_arr[$i]);
    }

    $finalresult = array();
    do {
        curl_multi_exec($master, $running);

        // Drain every queued completion message. curl_multi_info_read()
        // returns false when there are no messages left, which also fixes
        // a bug in the original code: it indexed $info['handle'] without
        // checking the return value, and only read one message per change
        // in $running, so simultaneous completions could be missed.
        while ($info = curl_multi_info_read($master)) {
            $handle = $info['handle'];
            $i = array_search($handle, $curl_arr, true);
            $finalresult[$i] = curl_multi_getcontent($handle);
            curl_multi_remove_handle($master, $handle);
            curl_close($handle);
            echo 'downloaded '.$nodes[$i].'. We can process it further straight away, but for this example, we will not.';
            ob_flush(); flush();
        }

        // Block until there is socket activity instead of busy-looping.
        if ($running > 0 && curl_multi_select($master) === -1) {
            usleep(100);
        }
    } while ($running > 0);
    curl_multi_close($master);

    set_time_limit(30); // restore a normal execution limit

    // Keys are the indexes of the URLs in $nodes, values are the bodies,
    // in the order the downloads completed.
    return $finalresult;
}

$nodes = array('http://mediumSpeedSite.org', 'http://fastSpeedSite.com', 'http://quiteSlowSite.com');
$returnedDocs = getMultipleDocuments($nodes, null);
?>
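
To round this off (an addition of mine, not from the original): the returned array is keyed by each URL's index in $nodes, so you can restore the original request order with ksort() before consuming it:

<?php
ksort($returnedDocs); // back to the original request order
foreach ($returnedDocs as $i => $doc) {
    echo $nodes[$i] . ': ' . strlen($doc) . " bytes\n";
}
?>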