matlab - How to import a complex csv file into numeric vectors in Matlab

Question

I am wondering how to read a complex csv file that contains strings, doubles, chars, and so on.

For example, could you give me a command that successfully extracts the numeric values from this csv file?

For example:

yield curve data 2013-10-04     
Yields in percentages per annum.        


Parameters - AAA-rated bonds        
Series key   Parameters  Description
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0  2.03555 Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 0 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA1  -2.009068   Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 1 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA2  24.54184    Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 2 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA3  -21.80556   Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Beta 3 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.TAU1   5.351378    Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Tau 1 - Euro, provided by ECB
YC.B.U2.EUR.4F.G_N_A.SV_C_YM.TAU2   4.321162    Euro area (changing composition) - Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous compounding - yield error minimisation - Yield curve parameters, Tau 2 - Euro, provided by ECB

This is part of the information in the file. I tried csvread('yc_latest.csv', 6, 1, [6,1,6,1]) to get the value 2.03555, but it gave me the following error:

   Error using dlmread (line 139)
    Mismatch between file and format string.
    Trouble reading number from file (row 1u, field 3u) ==> "Euro area (changing composition) -
    Government bond, nominal, all issuers whose rating is triple A - Svensson model - continuous
    compounding - yield error minimisation - Yield curve parameters, Beta 0

    Error in csvread (line 50)
        m=dlmread(filename, ',', r, c, rng);

score 2 · Accepted Answer

This is a fairly hacky solution. Unfortunately, Matlab is not good at reading csv files, so this kind of hack is a sad necessity. On the bright side, you probably only have to write this sort of code once.

fid = fopen('yc_latest.csv');   %// open the file

%// parse as csv, skipping the first six lines
contents = textscan(fid, '%s %f %[^\n]', 'HeaderLines', 6); 

%// unpack the fields and give them meaningful names
[seriesKey, parameters, description]   = contents{:};

fclose(fid);                    %// don't forget this!
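
Once the fields are unpacked, a single value such as the 2.03555 asked about in the question can be picked out by its series key. The following is only a sketch: the two stand-in rows are copied from the excerpt in the question so that the snippet runs on its own, and the names isBeta0 and beta0 are made up for illustration.

```matlab
% Stand-in values copied from the excerpt in the question; in practice
% seriesKey and parameters come from the textscan call.
seriesKey  = {'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0'; ...
              'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA1'};
parameters = [2.03555; -2.009068];

% Build a logical mask that marks the row whose key matches exactly,
% then use it to index into the numeric column.
isBeta0 = strcmp(seriesKey, 'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0');
beta0   = parameters(isBeta0);
```

Looking the value up by key rather than by row number keeps the code working if the rows in the file are ever reordered.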

score 0 · Accepted Answer

An alternative to Chris's solution:

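fid = fopen('yc_latest.csv');
Rows = textscan(fid, '%s', 'Delimiter', '\n'); % creates a temporary cell array with the rows
fclose(fid);
Rows = Rows{1}; % textscan returns a 1x1 cell; unwrap the cell array of row strings

% looks for the lines with a Euro value:
value = strfind(Rows, 'Euro');
Idx = find(~cellfun('isempty', value));

% parses the numeric field of each matching row (skips the key and the description)
Columns = cellfun(@(x) textscan(x, '%*s %f %*[^\n]', 'CollectOutput', 1), Rows(Idx), 'UniformOutput', 0);
Columns = cellfun(@(c) transpose(c{1}), Columns, 'UniformOutput', 0);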

The indices of all rows that contain an actual Euro value are stored in Idx.

score 0 · Accepted Answer

You might want to use textscan like this.

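Each line is parsed using the usual delimiters (tab, space). The format %*s skips the first element (YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0); the star means the field is read but not stored. %f then reads the value, and finally %*[^\n] skips the rest of the line.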

fid = fopen(filename);                                
C = textscan(fid, '%*s%f%*[^\n]', 'HeaderLines', 6); 
fclose(fid);

values   = C{1};
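
To try the format string without the real file, here is a minimal sketch that writes a few lines modeled on the excerpt in the question to a temporary file and parses them. The temporary file name and the shortened descriptions are made up for illustration, and the sketch assumes the columns are separated by tabs or spaces, as they appear in the excerpt.

```matlab
% Write a small sample file modeled on the excerpt in the question.
% The file name and the shortened descriptions are hypothetical.
sampleFile = [tempname '.csv'];
fid = fopen(sampleFile, 'w');
fprintf(fid, 'yield curve data 2013-10-04\n');
fprintf(fid, 'Yields in percentages per annum.\n');
fprintf(fid, '\n\n');                                  % two blank lines
fprintf(fid, 'Parameters - AAA-rated bonds\n');
fprintf(fid, 'Series key\tParameters\tDescription\n');
fprintf(fid, 'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA0\t2.03555\tEuro area - Beta 0\n');
fprintf(fid, 'YC.B.U2.EUR.4F.G_N_A.SV_C_YM.BETA1\t-2.009068\tEuro area - Beta 1\n');
fclose(fid);

% Skip the six header lines, then keep only the numeric second field.
fid = fopen(sampleFile);
C = textscan(fid, '%*s%f%*[^\n]', 'HeaderLines', 6);
fclose(fid);
delete(sampleFile);                                    % clean up the sample

values = C{1};   % numeric column vector, one entry per data row
```

If the real file turns out to use a different separator, the 'Delimiter' option of textscan would have to be set accordingly.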
