
I am trying to parse a CSV file in PHP. My problem is the following: if a field starts with "é" or "í", the parser eats all of those characters from the start of the field.

The problem only occurs on my host; it does not occur locally with XAMPP (newer PHP version). The PHP version on my host, where the bug appears, is 5.2.6-1+lenny9.

The code is nothing but a single fgetcsv call:

while (($program = fgetcsv($handle, 0, ',', '"')) !== FALSE) {...}

The array returned here already contains the "eaten" version, for example when inspected with print_r.

Is there anything I can do? It looks like a bug in PHP that has since been fixed. One workaround I came up with is to protect the field by putting a comma at the end of it (my CSV source, Google Spreadsheets, automatically wraps a field in " " if it contains a comma). Then I can write a function that deletes the last character if it is a comma (any help on this?).

Is it (or was it) a known bug in PHP, and is there a solution for it? If not, can you help me with the delete-last-character-if-it's-a-comma function?
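
For the trailing-comma workaround, a minimal sketch of such a function could look like the following. The name strip_trailing_comma is made up for illustration, and it assumes that at most one trailing comma should be removed per field.

```php
<?php
// Hypothetical helper (name is an assumption): remove exactly one
// trailing comma from a field, and leave the field unchanged otherwise.
function strip_trailing_comma($field)
{
    // A comma is a single ASCII byte, so byte-wise substr() is safe here
    // even when the rest of the field is UTF-8.
    if (substr($field, -1) === ',') {
        return substr($field, 0, -1);
    }
    return $field;
}
```

It could then be applied to every field of a parsed row with array_map('strip_trailing_comma', $program).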


1 Answer


The actual problem is that your web server runs with a locale that does not allow multibyte character sets. When I set the locale to C, I get the same result:

<?php print_r(str_getcsv("ée, íi, zz, bb, "));

$ LC_ALL=C php test_getcsv.php

The é and í are cut off: [0] => e [1] => i [2] => zz

But if you run it like this:

$ LC_ALL=de_DE.UTF-8 php test_getcsv.php

you get the correct result: [0] => ée [1] => íi [2] => zz

You need to find out which locales are available on your server and call setlocale(LC_ALL, "xy_zz.UTF-8") with one of them at the start of your script.
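
As a rough sketch of that step: setlocale() also accepts an array of locale names and tries them in order, so a script can list a few likely candidates. The function name pick_utf8_locale and the candidate names below are assumptions; which locales are actually installed depends on the server (locale -a lists them on most Linux systems).

```php
<?php
// Hypothetical helper (name is an assumption): try several UTF-8 locale
// names and return the one that was activated, or false if none exists.
function pick_utf8_locale(array $candidates)
{
    // setlocale() tries each array entry in order and returns the name
    // of the locale it set, or false when none of them is installed.
    return setlocale(LC_ALL, $candidates);
}

// Candidate names are guesses; replace them with locales from your server.
$chosen = pick_utf8_locale(array('en_US.UTF-8', 'en_US.utf8', 'de_DE.UTF-8', 'C.UTF-8'));

if ($chosen === false) {
    // No candidate is installed, so the locale was left unchanged.
    error_log('No UTF-8 locale available; CSV parsing may still drop leading multibyte characters');
}
```

Run this before the first fgetcsv() call. Note that LC_ALL also changes number and date formatting for the rest of the script.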

answered 2011-03-29T10:12:18.423