I have several CSV files with comma-separated values, and some of the column values can contain characters such as ,.<>!/\;&

I am trying to convert them to comma-separated CSV in which every field is enclosed in double quotes.

Sample data:

DateCreated,DateModified,SKU,Name,Category,Description,Url,OriginalUrl,Image,Image50,Image100,Image120,Image200,Image300,Image400,Price,Brand,ModelNumber
2012-10-19 10:52:50,2013-06-11 02:07:16,34,Austral Foldaway 45 Rotary Clothesline,Home & Garden > Household Supplies > Laundry Supplies > Drying Racks & Hangers,"Watch the Product Video            Plenty of Space to Hang a Family Wash  Austral's Foldaway 45 rotary clothesline is a folding head rotary clothes hoist beautifully finished in either Beige or Heritage Green.  Even though the Foldaway 45 is compact, you still get a large 45 metres of line space, big enough for a full family wash.  If you want the advantage of a rotary hoist, but dont want to lose your yard, then the Austral Foldaway 45 is the clothesline for you.&nbsp;  Installation Note:&nbsp;A core hole is only required when installing into existing concrete, e.g. a pathway. Not required in the ground(grass/soil).  To watch video on YouTube, click the following link:&nbsp;Austral Foldaway 45 Rotary Clothesline      &nbsp;            //           Customer Video Reviews  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;",https://track.commissionfactory.com.au/p/10604/1718695,http://www.lifestyleclotheslines.com.au/austral-foldaway-45-rotary-clothesline/,http://content.commissionfactory.com.au/Products/7228/1718695.jpg,http://content.commissionfactory.com.au/Products/7228/1718695@50x50.jpg,http://content.commissionfactory.com.au/Products/7228/1718695@100x100.jpg,http://content.commissionfactory.com.au/Products/7228/1718695@120x120.jpg,http://content.commissionfactory.com.au/Products/7228/1718695@200x200.jpg,http://content.commissionfactory.com.au/Products/7228/1718695@300x300.jpg,http://content.commissionfactory.com.au/Products/7228/1718695@400x400.jpg,309.9000 AUD,Austral,FA45GR

And the output I am trying to achieve is:

"DateCreated","DateModified","SKU","Name","Category","Description","Url","OriginalUrl","Image","Image50","Image100","Image120","Image200","Image300","Image400","Price","Brand","ModelNumber"
"2012-10-19 10:52:50","2013-06-11 02:07:16","34","Austral Foldaway 45 Rotary Clothesline","Home & Garden > Household Supplies > Laundry Supplies > Drying Racks & Hangers","Watch the Product Video            Plenty of Space to Hang a Family Wash  Austral's Foldaway 45 rotary clothesline is a folding head rotary clothes hoist beautifully finished in either Beige or Heritage Green.  Even though the Foldaway 45 is compact, you still get a large 45 metres of line space, big enough for a full family wash.  If you want the advantage of a rotary hoist, but dont want to lose your yard, then the Austral Foldaway 45 is the clothesline for you.&nbsp;  Installation Note:&nbsp;A core hole is only required when installing into existing concrete, e.g. a pathway. Not required in the ground(grass/soil).  To watch video on YouTube, click the following link:&nbsp;Austral Foldaway 45 Rotary Clothesline      &nbsp;            //           Customer Video Reviews  &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;","https://track.commissionfactory.com.au/p/10604/1718695","http://www.lifestyleclotheslines.com.au/austral-foldaway-45-rotary-clothesline/","http://content.commissionfactory.com.au/Products/7228/1718695.jpg","http://content.commissionfactory.com.au/Products/7228/1718695@50x50.jpg","http://content.commissionfactory.com.au/Products/7228/1718695@100x100.jpg","http://content.commissionfactory.com.au/Products/7228/1718695@120x120.jpg","http://content.commissionfactory.com.au/Products/7228/1718695@200x200.jpg","http://content.commissionfactory.com.au/Products/7228/1718695@300x300.jpg","http://content.commissionfactory.com.au/Products/7228/1718695@400x400.jpg","309.9000 AUD","Austral","FA45GR"

Any help is greatly appreciated.

3 Answers

First, let's try a simple (and "not good enough") solution that just adds double quotes around every field, including fields that are already double-quoted, which is not what you want:

sed -r 's/([^,]*)/"\1"/g'

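The first part looks for sequences that contain no commas, the second part wraps each of them in double quotes, and the trailing 'g' means the substitution is applied as many times as it matches on each line.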

This turns

abc,345, some words ,"some text","text,with,commas"

into "abc","345"," some words ",""some text"",""text","with","commas""

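A few things to note:

  • It quotes " some words " correctly, spaces and all, but that means the leading and trailing spaces end up inside the quotes. I think that is fine, but it can be fixed if not.

  • If a field is already quoted, it gets quoted again, which is bad. This needs fixing.

  • If a field is already quoted and its text contains commas (which should not be treated as field separators), the pieces between those commas are quoted individually. This also needs fixing.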
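
The behaviour noted above can be reproduced outside sed. The following is a minimal Python sketch, not the sed command itself: it assumes that splitting on every comma is a fair stand-in for the `[^,]*` pattern, and the helper name `naive_quote` is made up for illustration.

```python
def naive_quote(line: str) -> str:
    """Wrap every comma-separated piece in double quotes, ignoring existing quotes."""
    # Splitting on every comma mirrors the naive pattern: a comma always ends
    # a field, even when it sits inside an already quoted string.
    pieces = line.split(",")
    return ",".join(f'"{piece}"' for piece in pieces)


sample = 'abc,345, some words ,"some text","text,with,commas"'
print(naive_quote(sample))
```

Running it on the sample line shows both problems at once: the quoted fields are wrapped a second time, and the commas inside the last field split it into three pieces.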

So we need to match either of two different patterns: a quoted string, or a field that contains no commas (and no quotes).

sed -r 's/([^,"]*|"[^"]*")/"\1"/g'

The result is now

"abc","345"," some words ",""some text"",""text,with,commas""

As you can see, the text that was originally quoted now has doubled double quotes. These have to be removed with a second sed command:

sed -r 's/([^,"]*|"[^"]*")/"\1"/g' | sed 's/""/"/g'

which results in

"abc","345"," some words ","some text","text,with,commas"

Yay!
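
For comparison, here is a hedged Python sketch of the same "quoted string or plain field" alternation done in a single pass, so that no second clean-up step is needed. It is not a translation of the sed pipeline: the lookarounds are an addition, needed because Python's `re` module would otherwise also match empty strings between fields, and the names `FIELD` and `quote_fields` are hypothetical. As far as I can tell, the `s/""/"/g` clean-up would also rewrite a field that was empty in the input, which this variant avoids. Like the sed version, it does not understand escaped quotes inside quoted fields.

```python
import re

# A field is either a complete quoted string or a run of characters without
# commas or quotes. The lookarounds pin each match to a field boundary
# (start of line or just after a comma, up to the next comma or end of line).
FIELD = re.compile(r'(?:^|(?<=,))("[^"]*"|[^,"]*)(?=,|$)')


def quote_fields(line: str) -> str:
    """Quote unquoted fields and leave already quoted fields untouched."""
    def wrap(match):
        field = match.group(1)
        # Already quoted fields pass through unchanged; the rest get wrapped.
        return field if field.startswith('"') else f'"{field}"'
    return FIELD.sub(wrap, line)


print(quote_fields('abc,345, some words ,"some text","text,with,commas"'))
```

The function works on one line at a time, so a quoted field that spans several lines is outside what this sketch handles.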

answered 2013-07-24T15:28:06.170

Try the following solution. It uses the Text::CSV_XS module, which should work because its parser correctly handles commas inside fields.

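#!/usr/bin/env perl

use strict;
use warnings;
use Text::CSV_XS;

die qq|Usage: perl $0 <csv-file>\n| unless @ARGV == 1;

open my $fh, '<', shift or die qq|ERROR: Could not open input file: $!\n|;

# always_quote puts double quotes around every field on output;
# binary allows non-ASCII bytes and embedded newlines inside fields.
my $csv = Text::CSV_XS->new( {
        binary       => 1,
        always_quote => 1,
} );

# Parse each record (quoted commas stay inside their field) and re-emit it.
while ( my $row = $csv->getline( $fh ) ) {
        $csv->print( *STDOUT, $row );
        print "\n";
}
# getline also returns false on a parse error, so report anything short of EOF.
$csv->eof or $csv->error_diag;
close $fh;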
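
The same parse-then-requote approach can be sketched with Python's standard `csv` module. This is an illustration under assumptions, not part of the answer above: the function name `requote` is made up, and the input is passed as a string rather than read from a file named on the command line.

```python
import csv
import io


def requote(text: str) -> str:
    """Parse CSV text and write it back with every field double-quoted."""
    out = io.StringIO()
    # QUOTE_ALL is the csv module's counterpart to always_quote.
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    # newline="" leaves line endings inside quoted fields for the reader to handle.
    for row in csv.reader(io.StringIO(text, newline="")):
        writer.writerow(row)
    return out.getvalue()


print(requote('abc,345, some words ,"some text","text,with,commas"\n'), end="")
```

Because a real CSV parser does the splitting, commas inside quoted fields and empty fields are handled without any special cases.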
answered 2013-07-24T09:01:51.033