I am using a PHP script to download an XML file from an external URL with cURL, but I am running into a problem: cURL sometimes fails to download the complete file. The problem occurs even more often when the script is run by cron on my hosting server.

This is the script:

<?php
header('Content-type:text/html; charset=utf-8');

//initialize downloading xml file tries
$xml_dl_attempts = 0;

//set filename of output xml file
$findex = 0;
while(file_exists("xml".$findex.".xml"))
{
    $findex++;
}
$filename = "xml".$findex.".xml";

//filename for log file
$logfilename = "log.txt";

//Open (append) logfile for write.
$logfileout = fopen($logfilename, 'a');
fwrite($logfileout, "Starting attempts to download the xml file at ".date("H:i:s Y-m-d")."\r\n");

//Attempt to download xml file up to 8 times
do {
    //Sleep 3 seconds before retrying download
    if($xml_dl_attempts > 0 ) sleep(3);

    //Increase number of download attempts
    $xml_dl_attempts++;
    //Write to logfile
    fwrite($logfileout, date("H:i:s Y-m-d").": Download attempt number ".$xml_dl_attempts.": ");

    //Download xml file using curl
    $ch = curl_init();
    $url = 'http://www.opap.gr/web/services/rs/betting/availableBetGames/sport/program/4100/0/sport-1.xml?localeId=el_GR';

    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, false);
    curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

    set_time_limit(300); 
    curl_setopt($ch, CURLOPT_TIMEOUT, 300);

    $outfile = fopen($filename, 'w');
    if (!$outfile)
    {
        exit;
    }
    curl_setopt($ch, CURLOPT_FILE, $outfile);

    if(curl_exec($ch)==false)
    {
        fwrite($logfileout, "curl_error: ".curl_error($ch));
    }
    fclose($outfile);
    curl_close($ch);

    //Clear errors
    libxml_use_internal_errors(true);
    libxml_clear_errors();

    //Parse xml file
    $xml = simplexml_load_file($filename);

    //Check for errors
    if($err = libxml_get_last_error())
    {
        fwrite($logfileout, "failed\r\n");
    }
} while($err !== false && $xml_dl_attempts < 8); //repeat if xml was not completely downloaded

//Log success if the last attempt parsed without errors
if(!$err)
{
    fwrite($logfileout, "successfull\r\n");
}
fwrite($logfileout, "End.\r\n");
fclose($logfileout);
?>

As you can see, I check whether the SimpleXML parser reports an error while parsing the downloaded XML file. If it does, the process is repeated, up to a limit of 8 attempts. I also write a log file.

Below is the log file for one day:

Starting attempts to download the xml file at 18:35:00 2012-09-25

18:35:00 2012-09-25: Download attempt number : failed

18:35:03 2012-09-25: Download attempt number : failed

18:35:07 2012-09-25: Download attempt number : successfull

End.

Starting attempts to download the xml file at 19:35:00 2012-09-25

19:35:00 2012-09-25: Download attempt number 1: failed

19:35:03 2012-09-25: Download attempt number 2: failed

19:35:06 2012-09-25: Download attempt number 3: failed

19:35:10 2012-09-25: Download attempt number 4: failed

19:35:13 2012-09-25: Download attempt number 5: failed

19:35:16 2012-09-25: Download attempt number 6: failed

19:35:20 2012-09-25: Download attempt number 7: failed

19:35:23 2012-09-25: Download attempt number 8: successfull

End.

Starting attempts to download the xml file at 20:35:00 2012-09-25

20:35:00 2012-09-25: Download attempt number 1: failed

20:35:04 2012-09-25: Download attempt number 2: failed

20:35:08 2012-09-25: Download attempt number 3: successfull

End.

Starting attempts to download the xml file at 21:35:00 2012-09-25

21:35:00 2012-09-25: Download attempt number 1: failed

21:35:04 2012-09-25: Download attempt number 2: failed

21:35:07 2012-09-25: Download attempt number 3: failed

21:35:11 2012-09-25: Download attempt number 4: successfull

End.

Starting attempts to download the xml file at 22:35:00 2012-09-25

22:35:00 2012-09-25: Download attempt number 1: failed

22:35:04 2012-09-25: Download attempt number 2: failed

22:35:07 2012-09-25: Download attempt number 3: successfull

End.

Starting attempts to download the xml file at 23:35:00 2012-09-25

23:35:00 2012-09-25: Download attempt number 1: failed

23:35:03 2012-09-25: Download attempt number 2: failed

23:35:07 2012-09-25: Download attempt number 3: failed

23:35:10 2012-09-25: Download attempt number 4: failed

23:35:14 2012-09-25: Download attempt number 5: failed

23:35:17 2012-09-25: Download attempt number 6: failed

23:35:21 2012-09-25: Download attempt number 7: successfull

End.

Starting attempts to download the xml file at 00:35:00 2012-09-26

00:35:00 2012-09-26: Download attempt number 1: successfull

End.

Starting attempts to download the xml file at 01:35:00 2012-09-26

01:35:00 2012-09-26: Download attempt number 1: failed

01:35:04 2012-09-26: Download attempt number 2: failed

01:35:07 2012-09-26: Download attempt number 3: failed

01:35:11 2012-09-26: Download attempt number 4: failed

01:35:14 2012-09-26: Download attempt number 5: failed

01:35:18 2012-09-26: Download attempt number 6: failed

01:35:21 2012-09-26: Download attempt number 7: failed

01:35:30 2012-09-26: Download attempt number 8: failed

End.

Starting attempts to download the xml file at 02:35:00 2012-09-26

02:35:00 2012-09-26: Download attempt number 1: failed

02:35:03 2012-09-26: Download attempt number 2: failed

02:35:07 2012-09-26: Download attempt number 3: failed

02:35:10 2012-09-26: Download attempt number 4: failed

02:35:13 2012-09-26: Download attempt number 5: failed

02:35:17 2012-09-26: Download attempt number 6: failed

02:35:20 2012-09-26: Download attempt number 7: failed

02:35:24 2012-09-26: Download attempt number 8: failed

End.

Starting attempts to download the xml file at 03:35:00 2012-09-26

03:35:00 2012-09-26: Download attempt number 1: failed

03:35:04 2012-09-26: Download attempt number 2: failed

03:35:07 2012-09-26: Download attempt number 3: failed

03:35:10 2012-09-26: Download attempt number 4: failed

03:35:14 2012-09-26: Download attempt number 5: failed

03:35:17 2012-09-26: Download attempt number 6: failed

03:35:21 2012-09-26: Download attempt number 7: failed

03:35:30 2012-09-26: Download attempt number 8: failed

End.

Starting attempts to download the xml file at 04:35:00 2012-09-26

04:35:00 2012-09-26: Download attempt number 1: failed

04:35:03 2012-09-26: Download attempt number 2: failed

04:35:07 2012-09-26: Download attempt number 3: failed

04:35:10 2012-09-26: Download attempt number 4: failed

04:35:14 2012-09-26: Download attempt number 5: failed

04:35:17 2012-09-26: Download attempt number 6: failed

04:35:21 2012-09-26: Download attempt number 7: failed

04:35:24 2012-09-26: Download attempt number 8: successfull

End.

Starting attempts to download the xml file at 05:35:00 2012-09-26

05:35:00 2012-09-26: Download attempt number 1: failed

05:35:04 2012-09-26: Download attempt number 2: failed

05:35:08 2012-09-26: Download attempt number 3: failed

05:35:11 2012-09-26: Download attempt number 4: failed

05:35:15 2012-09-26: Download attempt number 5: failed

05:35:18 2012-09-26: Download attempt number 6: failed

05:35:22 2012-09-26: Download attempt number 7: failed

05:35:25 2012-09-26: Download attempt number 8: failed

End.

Starting attempts to download the xml file at 06:35:00 2012-09-26

06:35:00 2012-09-26: Download attempt number 1: failed

06:35:03 2012-09-26: Download attempt number 2: failed

06:35:07 2012-09-26: Download attempt number 3: failed

06:35:10 2012-09-26: Download attempt number 4: failed

06:35:14 2012-09-26: Download attempt number 5: failed

06:35:17 2012-09-26: Download attempt number 6: failed

06:35:21 2012-09-26: Download attempt number 7: failed

06:35:24 2012-09-26: Download attempt number 8: failed

End.

Starting attempts to download the xml file at 07:35:00 2012-09-26

07:35:00 2012-09-26: Download attempt number 1: failed

07:35:04 2012-09-26: Download attempt number 2: failed

07:35:07 2012-09-26: Download attempt number 3: failed

07:35:11 2012-09-26: Download attempt number 4: failed

07:35:14 2012-09-26: Download attempt number 5: failed

07:35:18 2012-09-26: Download attempt number 6: failed

07:35:21 2012-09-26: Download attempt number 7: failed

07:35:24 2012-09-26: Download attempt number 8: failed

End.

Starting attempts to download the xml file at 08:35:00 2012-09-26

08:35:00 2012-09-26: Download attempt number 1: failed

08:35:03 2012-09-26: Download attempt number 2: failed

08:35:06 2012-09-26: Download attempt number 3: failed

08:35:10 2012-09-26: Download attempt number 4: failed

08:35:13 2012-09-26: Download attempt number 5: failed

08:35:16 2012-09-26: Download attempt number 6: failed

08:35:20 2012-09-26: Download attempt number 7: failed

08:35:23 2012-09-26: Download attempt number 8: failed

End.

Starting attempts to download the xml file at 09:35:00 2012-09-26

09:35:00 2012-09-26: Download attempt number 1: failed

09:35:04 2012-09-26: Download attempt number 2: failed

09:35:07 2012-09-26: Download attempt number 3: successfull

End.

Starting attempts to download the xml file at 10:35:00 2012-09-26

10:35:00 2012-09-26: Download attempt number 1: failed

10:35:03 2012-09-26: Download attempt number 2: failed

10:35:06 2012-09-26: Download attempt number 3: failed

10:35:10 2012-09-26: Download attempt number 4: failed

10:35:13 2012-09-26: Download attempt number 5: failed

10:35:17 2012-09-26: Download attempt number 6: failed

10:35:20 2012-09-26: Download attempt number 7: successfull

End.

Starting attempts to download the xml file at 11:35:00 2012-09-26

11:35:00 2012-09-26: Download attempt number 1: failed

11:35:03 2012-09-26: Download attempt number 2: failed

11:35:07 2012-09-26: Download attempt number 3: successfull

End.

Starting attempts to download the xml file at 12:35:00 2012-09-26

12:35:00 2012-09-26: Download attempt number 1: failed

12:35:04 2012-09-26: Download attempt number 2: failed

12:35:07 2012-09-26: Download attempt number 3: failed

12:35:11 2012-09-26: Download attempt number 4: failed

12:35:14 2012-09-26: Download attempt number 5: failed

12:35:17 2012-09-26: Download attempt number 6: failed

12:35:21 2012-09-26: Download attempt number 7: successfull

End.

Starting attempts to download the xml file at 13:35:00 2012-09-26

13:35:00 2012-09-26: Download attempt number 1: failed

13:35:03 2012-09-26: Download attempt number 2: successfull

End.

Starting attempts to download the xml file at 14:35:00 2012-09-26

14:35:00 2012-09-26: Download attempt number 1: failed

14:35:03 2012-09-26: Download attempt number 2: failed

14:35:07 2012-09-26: Download attempt number 3: failed

14:35:10 2012-09-26: Download attempt number 4: successfull

End.

Starting attempts to download the xml file at 15:35:00 2012-09-26

15:35:00 2012-09-26: Download attempt number 1: failed

15:35:03 2012-09-26: Download attempt number 2: failed

15:35:07 2012-09-26: Download attempt number 3: failed

15:35:10 2012-09-26: Download attempt number 4: failed

15:35:13 2012-09-26: Download attempt number 5: failed

15:35:17 2012-09-26: Download attempt number 6: failed

15:35:20 2012-09-26: Download attempt number 7: failed

15:35:24 2012-09-26: Download attempt number 8: failed

End.

Starting attempts to download the xml file at 16:35:00 2012-09-26

16:35:00 2012-09-26: Download attempt number 1: failed

16:35:03 2012-09-26: Download attempt number 2: failed

16:35:07 2012-09-26: Download attempt number 3: successfull

End.

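The problem is that sometimes the complete file is retrieved after a few attempts, and sometimes every attempt fails. Another thing to note is that curl_exec does not return an error when the XML is incomplete.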
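One way to notice a truncated transfer without parsing the XML is to compare the size the server announced with the number of bytes that actually arrived. The following is only a minimal sketch: it assumes the server sends a Content-Length header, which may not be the case here, and `transfer_looks_complete` is a hypothetical helper name, not part of cURL.

```php
<?php
/**
 * Hypothetical helper: decide whether a transfer looks complete.
 *
 * $expected is the Content-Length announced by the server, or -1 when
 * no such header was sent; $received is the number of body bytes read.
 * Returns null when the sizes alone cannot answer the question.
 */
function transfer_looks_complete(float $expected, float $received): ?bool
{
    if ($expected < 0) {
        return null; // no Content-Length: completeness is unknown
    }
    return $received >= $expected;
}

// Possible use after curl_exec($ch) and before curl_close($ch):
//   $expected = curl_getinfo($ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD);
//   $received = curl_getinfo($ch, CURLINFO_SIZE_DOWNLOAD);
//   $complete = transfer_looks_complete($expected, $received);
```

When the helper returns null, parsing the file as the script already does remains the only check available.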

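Unfortunately, the server that hosts the XML does not support ranges, so I cannot resume an incomplete file. I could raise the attempt limit to 50, but the script still downloads some data on each failed attempt. For a 1 MB XML file, 30 failed attempts of 500 KB each followed by a successful 31st attempt would mean 16 MB of data downloaded. I want to run this script every hour, so I think this would hurt my server's bandwidth.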
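The 16 MB estimate is 30 failed attempts at 0.5 MB each plus one full 1 MB download. A small sketch of that arithmetic, where `total_transfer_mb` is a hypothetical name used only for illustration:

```php
<?php
// Total data transferred when $failed attempts each fetch $partialMb
// before one complete download of $fullMb succeeds.
function total_transfer_mb(int $failed, float $partialMb, float $fullMb): float
{
    return $failed * $partialMb + $fullMb;
}

// 30 failed attempts of 0.5 MB, then one complete 1 MB download.
echo total_transfer_mb(30, 0.5, 1.0), " MB\n"; // prints "16 MB"
```

Run once per hour, a worst case like this would add up to roughly 24 times that amount per day.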

Why does cURL fail to download the complete file? Is there an option to make it behave like a browser, which always gets the file in the end?

Thanks.

1 Answer

The problem is at the source, that is, the server.

I tried running a scraper on scraperwiki, and this is what it shows:

(first screenshot)

I also ran into the same problem when I tried to load the XML myself; it worked on the third try.

In the following picture you can see that in the first two requests the server closes the connection, but not in the third (the successful one).

(second screenshot)

So the problem is with the server, and if it is not yours, there is nothing you can do about it (other than notifying the server's administrator, of course!).

Note: scraperwiki is trusted by many people, so I assume its internet connection is very good. You can therefore safely blame this on a server fault. #jboss

answered 2012-09-26T17:57:26.103