(PHP 5)
curl_multi_exec — Run the sub-connections of the current cURL handle
int curl_multi_exec ( resource $mh , int &$still_running )
Processes each of the handles in the stack. This method can be called whether or not a handle needs to read or write data.
Example #1 curl_multi_exec() example
This example will create two cURL handles, add them to a multi handle, and run them in parallel.
<?php
// create both cURL resources
$ch1 = curl_init();
$ch2 = curl_init();
// set the URL and other appropriate options
curl_setopt($ch1, CURLOPT_URL, "http://lxr.php.net/");
curl_setopt($ch1, CURLOPT_HEADER, 0);
curl_setopt($ch2, CURLOPT_URL, "http://www.php.net/");
curl_setopt($ch2, CURLOPT_HEADER, 0);
// create the multiple cURL handle
$mh = curl_multi_init();
// add the two handles
curl_multi_add_handle($mh,$ch1);
curl_multi_add_handle($mh,$ch2);
$active = null;
// execute the handles
do {
    $mrc = curl_multi_exec($mh, $active);
} while ($mrc == CURLM_CALL_MULTI_PERFORM);
while ($active && $mrc == CURLM_OK) {
    if (curl_multi_select($mh) != -1) {
        do {
            $mrc = curl_multi_exec($mh, $active);
        } while ($mrc == CURLM_CALL_MULTI_PERFORM);
    }
}
// close all the handles
curl_multi_remove_handle($mh, $ch1);
curl_multi_remove_handle($mh, $ch2);
curl_multi_close($mh);
?>
Daniel G Zylberberg (2012-07-11 18:00:53)
This function waits for the last page. It can take a configuration array with cURL options, for example to make a POST request or to set timeouts.
It returns the same array it was given, but adds "error" and "data" to each row: "error" is the error description string if something failed, and "data" is the response.
<?php
// author: Daniel G Zylberberg
// date 11 jul 2012
// $res: array with structure 0=>array("url"=>"blah"),1=>array("url"=>"some url")
// $options (optional): array with curl options (timeout, postfields, etc)
// returns the same array it was given, adding "data" (the html content)
// and "error" (the error description string, if something failed) to each row
function multiCurl($res, $options = array()){
if(count($res) <= 0) return false;
$handles = array();
if(!$options) // add default options
$options = array(
CURLOPT_HEADER=>0,
CURLOPT_RETURNTRANSFER=>1,
);
// add curl options to each handle
foreach($res as $k=>$row){
$options[CURLOPT_URL] = $row['url'];
$handles[$k] = curl_init();
curl_setopt_array($handles[$k], $options);
}
$mh = curl_multi_init();
foreach($handles as $k => $handle){
curl_multi_add_handle($mh,$handle);
//echo "<br>adding handle {$k}";
}
$running_handles = null;
//execute the handles
do {
$status_cme = curl_multi_exec($mh, $running_handles);
} while ($status_cme == CURLM_CALL_MULTI_PERFORM);
while ($running_handles && $status_cme == CURLM_OK) {
if (curl_multi_select($mh) != -1) {
do {
$status_cme = curl_multi_exec($mh, $running_handles);
// echo "<br>''threads'' running = {$running_handles}";
} while ($status_cme == CURLM_CALL_MULTI_PERFORM);
}
}
foreach($res as $k=>$row){
$res[$k]['error'] = curl_error($handles[$k]);
if(!empty($res[$k]['error']))
$res[$k]['data'] = '';
else
$res[$k]['data'] = curl_multi_getcontent( $handles[$k] ); // get results
// close current handler
curl_multi_remove_handle($mh, $handles[$k] );
}
curl_multi_close($mh);
return $res; // return response
}
$res = array(
"11"=>array("url"=>"http://localhost/public_html/test/sleep.php?t=1"),
"13"=>array("url"=>"http://localhost/public_html/test/sleep.php?t=3"),
"25"=>array("url"=>"this doesn't exist"),
);
print_r( multiCurl($res));
?>
---------- sleep.php -------------------------------------
<?php
sleep($_GET['t']);
echo "sleep for {$_GET['t']} seconds and show this.";
?>
emmet at trovit dot com (2012-06-25 16:57:55)
Here's something that had me pulling my hair out for quite a while. I was trying to download multiple files and save each one to a file. If you want to read a file again in the same script that downloaded it, always close the original file handle you opened for the connection BEFORE reading from the file again, even if you open a new file handle to do so. If you don't, file_get_contents() or fread() will truncate the file and only return a limited portion of it (40960 characters in my case), without any other explanation or error. The file will exist (and be complete) on your disk; PHP just won't show it.
I (perhaps mistakenly) thought this was a bug, so I created a bug report; see it for examples of the code I used to recreate this mysterious behavior:
https://bugs.php.net/bug.php?id=62409
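The pattern described above can be sketched like this (the URL and the filename are placeholders, not part of the original report):

```php
<?php
// Sketch of the fix described above: close the file handle that was
// given to cURL before reading the downloaded file back in.
// www.example.com and download.html are placeholders.
$fp = fopen('download.html', 'w');
$ch = curl_init('http://www.example.com/');
curl_setopt($ch, CURLOPT_FILE, $fp);
curl_exec($ch);
curl_close($ch);
fclose($fp); // flush PHP's write buffer to disk first ...
$html = file_get_contents('download.html'); // ... then the full file is visible
?>
```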
gnuffo1 at gmail dot com (2010-12-11 20:02:57)
Make sure you always remove the individual curl handles from the multi handle. For a CLI script I wrote that fetched the same pages over and over again every few seconds, I had originally omitted doing this, not even realising I had done so. It worked perfectly on my Windows 7 box. However, on my CentOS server, after about 45 seconds, the script kept dying with absolutely no warnings or errors. After hours of debugging I finally realised I wasn't removing the handles from the multi handle and lo and behold, it fixed it.
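A minimal sketch of the pattern described above, for a long-running CLI loop that re-uses one multi handle (the URL and iteration count are placeholders):

```php
<?php
// Re-using one multi handle across many requests: remove each easy
// handle after every pass, or the multi handle accumulates dead
// handles and the script eventually dies, as described above.
$mh = curl_multi_init();
for ($i = 0; $i < 100; $i++) {
    $ch = curl_init('http://www.example.com/');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_multi_add_handle($mh, $ch);
    $running = null;
    do {
        curl_multi_exec($mh, $running);
        curl_multi_select($mh); // avoid busy-waiting
    } while ($running > 0);
    $body = curl_multi_getcontent($ch);
    curl_multi_remove_handle($mh, $ch); // the crucial line
    curl_close($ch);
    sleep(5);
}
curl_multi_close($mh);
?>
```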
zhongeren at gmail dot com (2010-09-19 19:55:58)
Download all the images given in the URL array in parallel!
<?php
/**
*
*@param $picsArr Array of rows like [0] => array("url" => ...);
*the array will be filled with the image data, which you can use directly or save in the next step.
**/
function getAllPics(&$picsArr){
if(count($picsArr)<=0) return false;
$hArr = array();//handle array
foreach($picsArr as $k=>$pic){
$h = curl_init();
curl_setopt($h,CURLOPT_URL,$pic['url']);
curl_setopt($h,CURLOPT_HEADER,0);
curl_setopt($h,CURLOPT_RETURNTRANSFER,1);//return the image value
array_push($hArr,$h);
}
$mh = curl_multi_init();
foreach($hArr as $k => $h) curl_multi_add_handle($mh,$h);
$running = null;
do{
curl_multi_exec($mh,$running);
curl_multi_select($mh); // avoid busy-waiting while transfers run
}while($running > 0);
// get the result and save it in the result ARRAY
foreach($hArr as $k => $h){
$picsArr[$k]['data'] = curl_multi_getcontent($h);
}
//close all the connections
foreach($hArr as $k => $h){
$info = curl_getinfo($h);
preg_match("/^image\/(.*)$/",$info['content_type'],$matches);
$tail = $matches[1]; // image type (e.g. "jpeg"), usable as a file extension
curl_multi_remove_handle($mh,$h);
}
curl_multi_close($mh);
return true;
}
?>
dirkwybe at gmail dot com (2009-12-12 13:50:50)
If you have problems with client timeouts (browser, wget) - like I did - add some keep alive code.
<?php
$locations = array(
"file1" => "[url here]",
"file2" => "[url here]",
"file3" => "[url here]",
"file4" => "[url here]"
);
$mh = curl_multi_init();
$threads = null;
foreach ($locations as $name => $url)
{
$c[$name]=curl_init($url);
$f[$name]=fopen ($name.".xml", "w");
curl_setopt ($c[$name], CURLOPT_FILE, $f[$name]);
curl_setopt ($c[$name], CURLOPT_TIMEOUT,600);
curl_multi_add_handle ($mh,$c[$name]);
}
$t1 = time();
do
{
$n=curl_multi_exec($mh,$threads);
if (time() > $t1 + 2)
{
echo "keep-alive" ."<br/>";
$t1 = time();
}
}
while ($threads > 0);
foreach ($locations as $name => $url)
{
curl_multi_remove_handle($mh,$c[$name]);
curl_close($c[$name]);
fclose ($f[$name]);
}
curl_multi_close($mh);
?>
jakob dot voss at gbv dot de (2009-02-14 05:10:12)
With the current examples, using curl_multi_exec is still not much better than just using file_get_contents: your script has to needlessly wait at least the maximum of all connection times (instead of the sum). A better solution would be to run the connections as another thread in parallel to other parts of the script:
<?php
$cm = new CurlManager();
$handle1 = $cm->register($url1);
$handle2 = $cm->register($url2);
$cm->start(); # start connections
$result1 = $cm->result($handle1); # surely NULL because connection has not been finished
sleep($seconds); # do something else (including use of other CurlManagers!)
$result1 = $cm->result($handle1); # maybe NULL if connection takes longer than $seconds
$cm->finish(); # wait if some connections are still open
$result1 = $cm->result($handle1);
$result2 = $cm->result($handle2);
?>
But I am not sure whether such a CurlManager is possible, especially if you want to use multiple of them (for instance because some URLs are not known at the start of the script). It's a pain that PHP does not support real threads.
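Such a manager does seem possible, because curl_multi_exec() never blocks: each call lets the transfers make progress, and the script can do other work in between. Below is a rough, hypothetical sketch of the idea; the class and method names are taken from the pseudo-code above, not from any real library.

```php
<?php
// Hypothetical sketch of the CurlManager idea described above.
class CurlManager {
    private $mh;
    private $handles = array();
    private $done = array();
    public function __construct() {
        $this->mh = curl_multi_init();
    }
    public function register($url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_multi_add_handle($this->mh, $ch);
        $this->handles[] = $ch;
        return count($this->handles) - 1; // handle id
    }
    public function start() {
        $this->pump();
    }
    // Let the transfers progress without blocking; call this often.
    private function pump() {
        do {
            $status = curl_multi_exec($this->mh, $running);
        } while ($status == CURLM_CALL_MULTI_PERFORM);
        // Record which easy handles have finished.
        while ($msg = curl_multi_info_read($this->mh)) {
            $this->done[(int)$msg['handle']] = true;
        }
        return $running;
    }
    public function result($id) {
        $this->pump();
        $ch = $this->handles[$id];
        if (!isset($this->done[(int)$ch])) {
            return null; // connection not finished yet
        }
        return curl_multi_getcontent($ch);
    }
    public function finish() {
        while ($this->pump() > 0) {
            curl_multi_select($this->mh); // wait for socket activity
        }
    }
}
?>
```

Multiple independent CurlManagers work naturally here, since each wraps its own multi handle.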
jorge dot hebrard at gmail dot com (2009-01-24 21:40:00)
Based on the below comments, I made a simple class.
See note below code.
<?php
/**
* Multiple Curl Handlers
* @author Jorge Hebrard ( jorge.hebrard@gmail.com )
**/
class curlNode{
static private $listenerList;
private $callback;
public function __construct($url){
$new =& self::$listenerList[];
$new['url'] = $url;
$this->callback =& $new;
}
/**
* Callbacks needs 3 parameters: $url, $html (data of the url), and $lag (execution time)
**/
public function addListener($callback){
$this->callback['callback'] = $callback;
}
/**
* curl_setopt() wrapper. Enjoy!
**/
public function setOpt($key,$value){
$this->callback['opt'][$key] = $value;
}
/**
* Request all the created curlNode objects, and invoke associated callbacks.
**/
static public function request(){
//create the multiple cURL handle
$mh = curl_multi_init();
$running=null;
# Setup all curl handles
# Loop through each created curlNode object.
foreach(self::$listenerList as &$listener){
$url = $listener['url'];
$current =& $ch[];
# Init curl and set default options.
# This could be improved by sharing a default options array.
$current = curl_init();
curl_setopt($current, CURLOPT_URL, $url);
# Since we don't want to display multiple pages in a single php file, do we?
curl_setopt($current, CURLOPT_HEADER, 0);
curl_setopt($current, CURLOPT_RETURNTRANSFER, 1);
# Set defined options, set through curlNode->setOpt();
if (isset($listener['opt'])){
foreach($listener['opt'] as $key => $value){
curl_setopt($current, $key, $value);
}
}
curl_multi_add_handle($mh,$current);
$listener['handle'] = $current;
$listener['start'] = microtime(1);
} unset($listener);
# Main loop execution
do {
# Exec until there's no more data in this iteration.
# curl_multi_exec() returns CURLM_CALL_MULTI_PERFORM while it still has work pending.
while(($execrun = curl_multi_exec($mh, $running)) == CURLM_CALL_MULTI_PERFORM);
if($execrun != CURLM_OK) break; # This should never happen. Optional line.
# Get information about the handle that just finished the work.
while($done = curl_multi_info_read($mh)) {
# Call the associated listener
foreach(self::$listenerList as $listener){
# Strict compare handles.
if ($listener['handle'] === $done['handle']) {
# Get content
$html = curl_multi_getcontent($done['handle']);
# Call the callback.
call_user_func($listener['callback'],
$listener['url'],
$html,(microtime(1)-$listener['start']));
# Remove the finished handle (optional, the script works without it).
curl_multi_remove_handle($mh, $done['handle']);
}
}
}
# Required, or else we would end up with a endless loop.
# Without it, even when the connections are over, this script keeps running.
if (!$running) break;
# curl_multi_select() blocks until there is activity on one of the sockets,
# so we don't burn CPU between iterations.
while (($res = curl_multi_select($mh)) === 0);
if ($res === false) break; # Select error, should never happen.
} while (true);
# Finish out our script ;)
curl_multi_close($mh);
}
}
$curlGoogle = new curlNode('http://www.google.com/');
$curlGoogle->setOpt(CURLOPT_HEADER, 0);
$curlGoogle->addListener('callbackGoogle');
$curlMySpace = new curlNode('http://www.myspace.com/');
$curlMySpace->addListener('callbackMySpace');
curlNode::request();
?>
NOTE: I think curl has a bug managing local files, when multiple handles are set.
I tested with a local file, giving me 0.1~ sec. latency.
Then I added another handle, google. With this new addition, my original file had a latency of 1.0~ sec. latency!!
But with external sites (try Yahoo and Google), the latency is not a problem. Google alone gives me 1.7~ sec. latency, and with Yahoo, Google gives me 1.8~ sec. latency.
If anyone can explain this, I would be very grateful.
manixrock-~-gmail-~-com (2008-12-07 03:13:46)
For anyone trying to dynamically add links to mcurl and removing them as they complete (for example to keep a constant of 5 links downloading), I found that the following code works:
<?php
$mcurl = curl_multi_init();
for(;;) {
curl_multi_select($mcurl);
while(($mcRes = curl_multi_exec($mcurl, $mcActive)) == CURLM_CALL_MULTI_PERFORM);
if($mcRes != CURLM_OK) break;
while($done = curl_multi_info_read($mcurl)) {
// code that parses, adds or removes links to mcurl
}
// here you should check if all the links are finished and break, or add new links to the loop
}
curl_multi_close($mcurl);
?>
You should note the return value of curl_multi_select() is ignored. I found that if you do the curl_multi_exec() loop only if the return value of curl_multi_select() is != -1 then new links added to mcurl are ignored. This may be a bug. Hope this saves someone some time.
Jimmy Ruska (2008-09-13 11:56:52)
> replying to viczbk.ru
Just sharing my attempt at it.
> while (($res = curl_multi_select($mh)) === 0) {};
This worked on my Windows computer (PHP 5.2.5), but when I ran the curl program on my new CentOS server (PHP 5.1.6) the function never updates unless curl_multi_exec() is added to the loop, which defeats the point of using it to save cycles. curl_multi_select() also lets you set a timeout, but it doesn't seem to help; then again, I don't see why people wouldn't use it anyway.
Even when curl_multi_select($mh) does work, there's no way to know which of the sockets changed status, or whether they received partial data, completely finished, or just timed out. It's not reliable if you want to remove/add handles as soon as things finish. Try calling just example.com or some other website with very little data: curl_multi_select($mh) can return a non-zero value a couple of thousand times before finishing. Lazily adding a usleep(25000) or some similarly small amount can also help avoid wasting cycles.
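The usleep() suggestion above could look like this. It polls curl_multi_exec() with a short sleep instead of spinning on curl_multi_select(); the multi handle here is empty only so the fragment is self-contained, in practice it would already have easy handles attached.

```php
<?php
// Poll curl_multi_exec() with a short sleep instead of busy-waiting.
// In real use, add easy handles to $mh before this loop.
$mh = curl_multi_init();
$running = null;
do {
    while (curl_multi_exec($mh, $running) == CURLM_CALL_MULTI_PERFORM);
    usleep(25000); // wait 25 ms between polls instead of spinning
} while ($running > 0);
curl_multi_close($mh);
?>
```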
viczbk.ru (2008-02-08 20:04:21)
http://curl.haxx.se/libcurl/c/libcurl-multi.html
"When you've added the handles you have for the moment (you can still add new ones at any time), you start the transfers by calling curl_multi_perform(3).
curl_multi_perform(3) is asynchronous. It will only execute as little as possible and then return back control to your program. It is designed to never block. If it returns CURLM_CALL_MULTI_PERFORM you better call it again soon, as that is a signal that it still has local data to send or remote data to receive."
So it seems the loop in sample script should look this way:
<?php
$running=null;
//execute the handles
do {
while (CURLM_CALL_MULTI_PERFORM === curl_multi_exec($mh, $running));
if (!$running) break;
while (($res = curl_multi_select($mh)) === 0) {};
if ($res === false) {
echo "<h1>select error</h1>";
break;
}
} while (true);
?>
This worked fine (PHP 5.2.5 @ FBSD 6.2) without running non-blocked loop and wasting CPU time.
However this seems to be the only use of curl_multi_select, coz there's no simple way to bind it with other PHP wrappers for select syscall.
robert dot reichel at boboton dot com (2007-10-20 05:54:39)
I was testing PHP code provided by dtorop933@gmail.com in curl_multi_exec section of PHP Manual.
The part of the code '$err = curl_error($conn[$i])' should return the error message for each cURL session, but it does not.
curl_error() works fine with curl_exec(). Is there any other way to get the session error message with curl_multi_exec(), or is there a bug in the cURL library?
The script was tested with Windows XP and PHP-5.2.4
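One possible workaround for the curl_error() issue described above is to read the per-transfer result code from curl_multi_info_read() instead. This is a hedged sketch, not tested against that exact script; the URLs are placeholders, and the second one is deliberately invalid.

```php
<?php
// Read the per-transfer CURLE_* result code from curl_multi_info_read()
// rather than relying on curl_error() after a multi transfer.
$mh  = curl_multi_init();
$ch1 = curl_init('http://www.example.com/');
$ch2 = curl_init('http://no-such-host.invalid/');
curl_setopt($ch1, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch2, CURLOPT_RETURNTRANSFER, 1);
curl_multi_add_handle($mh, $ch1);
curl_multi_add_handle($mh, $ch2);
$running = null;
do {
    curl_multi_exec($mh, $running);
    curl_multi_select($mh);
} while ($running > 0);
// Each finished transfer reports its own error code in 'result'.
while ($done = curl_multi_info_read($mh)) {
    if ($done['result'] !== CURLE_OK) {
        echo "transfer failed with cURL error code {$done['result']}\n";
    } else {
        $html = curl_multi_getcontent($done['handle']);
    }
    curl_multi_remove_handle($mh, $done['handle']);
}
curl_multi_close($mh);
?>
```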
shichuanr at gmail dot com (2007-09-06 05:12:13)
For people who have problems running the example script above (Example #1), you can alter it a little bit to make it work properly by replacing the existing chunk of code with the one below:
<?php
// create both cURL resources
$ch1 = curl_init();
$ch2 = curl_init();
// set URL and other appropriate options
curl_setopt($ch1, CURLOPT_URL, "http://www.google.com/");
curl_setopt($ch1, CURLOPT_HEADER, 0);
curl_setopt($ch2, CURLOPT_URL, "http://www.php.net/");
curl_setopt($ch2, CURLOPT_HEADER, 0);
//create the multiple cURL handle
$mh = curl_multi_init();
//add the two handles
curl_multi_add_handle($mh,$ch1);
curl_multi_add_handle($mh,$ch2);
$running=null;
//execute the handles
do {
curl_multi_exec($mh,$running);
} while ($running > 0);
//close the handles
curl_multi_remove_handle($mh,$ch1);
curl_multi_remove_handle($mh,$ch2);
curl_multi_close($mh);
?>
substr("iscampifriese",1,11) dot " at beer dot com" (2007-07-04 06:10:21)
If you are using multi handles and you wish to re-execute any of them (if they timed out or something), then you need to remove the handles from the multi handle and then re-add them in order to get them to re-execute. Otherwise cURL will just give you back the same results again without actually retrying.
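The retry rule above can be sketched like this ($mh and $ch stand for a multi handle and an easy handle from an earlier curl_multi_exec() loop; the errno check is one plausible way to detect a timeout):

```php
<?php
// Removing and re-adding the handle makes cURL actually re-execute
// the request instead of returning the cached result again.
if (curl_errno($ch)) {                 // e.g. the transfer timed out
    curl_multi_remove_handle($mh, $ch);
    curl_multi_add_handle($mh, $ch);
    // ... then run the curl_multi_exec() loop again
}
?>
```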