matlab - What is a simple way to achieve data compression in MATLAB?

Question

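I am working on an assignment where I have to take a large matrix containing data and somehow compress it to a more manageable size. However, the data must remain reusable as input for something else (a toolbox, for example). This is what I have done so far. For this example matrix, I use the find function to get a matrix of all the indices where the values are nonzero. But I do not know how to use that as input in a way that preserves the original figure information. I am curious whether other people have other, better (simple) solutions to this.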

number_1 =     [0 0 0 0 0 0 0 0 0 0 ...
                0 0 1 1 1 1 0 0 0 0 ...     
                0 1 1 0 1 1 0 0 0 0 ...
                0 1 1 0 1 1 0 0 0 0 ...
                0 0 0 0 1 1 0 0 0 0 ...
                0 0 0 0 1 1 0 0 0 0 ...
                0 0 0 0 1 1 0 0 0 0 ...
                0 0 0 0 1 1 0 0 0 0 ...
                0 0 0 0 1 1 0 0 0 0 ...
                0 0 0 0 1 1 0 0 0 0 ...
                0 0 0 0 1 1 0 0 0 0 ...
                0 1 1 1 1 1 1 1 1 0 ...
                0 0 0 0 0 0 0 0 0 0]; 

number = number_1;
compressed_number = find(number);
compressed_number = compressed_number';
disp(compressed_number)

score 1 · Accepted Answer

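If you have only 1s and 0s, and the fill factor is not terribly small, your best bet is to store the numbers as binary numbers. If you need the original size, save it separately. I have expanded the code, showing the intermediate steps a little more clearly, and also showing the amount of storage needed for the different arrays. Note - I reshaped your data into a 13x10 array because it displays better.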

number_1 = [0 0 0 0 0 0 0 0 0 0 ...
    0 0 1 1 1 1 0 0 0 0 ...
    0 1 1 0 1 1 0 0 0 0 ...
    0 1 1 0 1 1 0 0 0 0 ...
    0 0 0 0 1 1 0 0 0 0 ...
    0 0 0 0 1 1 0 0 0 0 ...
    0 0 0 0 1 1 0 0 0 0 ...
    0 0 0 0 1 1 0 0 0 0 ...
    0 0 0 0 1 1 0 0 0 0 ...
    0 0 0 0 1 1 0 0 0 0 ...
    0 0 0 0 1 1 0 0 0 0 ...
    0 1 1 1 1 1 1 1 1 0 ...
    0 0 0 0 0 0 0 0 0 0];

n1matrix = reshape(number_1, 10, [])'; % make it nicer to display;
% transpose because data is stored column-major (row index changes fastest).

disp('the original data in 13 rows of 10:');
disp(n1matrix);

% create a matrix with 8 rows and enough columns
n1 = numel(number_1);
nc = ceil(n1/8); % "enough columns"
npad = zeros(8, nc);
npad(1:n1) = number_1; % fill the first n1 elements: the rest is zero

binVec = 2.^(7-(0:7)); % 128, 64, 32, 16, 8, 4, 2, 1 ... powers of two

compressed1 = uint8(binVec * npad); % 128 * bit 1 + 64 * bit 2 + 32 * bit 3...

% showing what we did...
disp('Organizing into groups of 8, and their decimal representation:')
for ii = 1:nc
    fprintf(1,'%d    ', npad(:, ii));
    fprintf(1, '=  %d\n', compressed1(ii));
end

% now the inverse operation: using dec2bin to turn decimals into binary
% this function returns strings, so some further processing is needed
% original code used de2bi (no typo) but that requires a communications toolbox
% like this the code is more portable
decompressed = dec2bin(compressed1, 8); % force 8 digits per byte, even if all values are small
disp('the string representation of the numbers recovered:');
disp(decompressed); % this looks a lot like the data in groups of 8, but it's a string

% now we turn them back into the original array
% remember it is a string right now, and the values are stored
% in column-major order so we need to transpose
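recovered = ('1'==decompressed'); % all '1' characters become logical 1 (8 rows, nc columns)
recovered = reshape(recovered(1:n1), 10, [])'; % drop the padding bits, restore 13 rows of 10
display(recovered);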

% alternative solution #1: use logical array
compressed2 = (n1matrix==1);
display(compressed2);

recovered = double(compressed2); % looks just the same...

% other suggestions 1: use find
compressed3 = find(n1matrix);  % fewer elements, but each element is 8 bytes
compressed3b = uint8(compressed3);  % if you know you have fewer than 256 elements

% or use `sparse`
compressed4 = sparse(n1matrix);

% or use logical sparse:
compressed5 = sparse((n1matrix==1));


whos number_1 comp*


the original data in 13 rows of 10:

     0     0     0     0     0     0     0     0     0     0
     0     0     1     1     1     1     0     0     0     0
     0     1     1     0     1     1     0     0     0     0
     0     1     1     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     1     1     1     1     1     1     1     1     0
     0     0     0     0     0     0     0     0     0     0

Organizing into groups of 8, and their decimal representation:
0    0    0    0    0    0    0    0    =  0
0    0    0    0    1    1    1    1    =  15
0    0    0    0    0    1    1    0    =  6
1    1    0    0    0    0    0    1    =  193
1    0    1    1    0    0    0    0    =  176
0    0    0    0    1    1    0    0    =  12
0    0    0    0    0    0    1    1    =  3
0    0    0    0    0    0    0    0    =  0
1    1    0    0    0    0    0    0    =  192
0    0    1    1    0    0    0    0    =  48
0    0    0    0    1    1    0    0    =  12
0    0    0    0    0    0    1    1    =  3
0    0    0    0    0    0    0    0    =  0
1    1    0    0    0    0    0    1    =  193
1    1    1    1    1    1    1    0    =  254
0    0    0    0    0    0    0    0    =  0
0    0    0    0    0    0    0    0    =  0

the string representation of the numbers recovered:
00000000
00001111
00000110
11000001
10110000
00001100
00000011
00000000
11000000
00110000
00001100
00000011
00000000
11000001
11111110
00000000
00000000

compressed2 =

     0     0     0     0     0     0     0     0     0     0
     0     0     1     1     1     1     0     0     0     0
     0     1     1     0     1     1     0     0     0     0
     0     1     1     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     1     1     1     1     1     1     1     1     0
     0     0     0     0     0     0     0     0     0     0


recovered =

     0     0     0     0     0     0     0     0     0     0
     0     0     1     1     1     1     0     0     0     0
     0     1     1     0     1     1     0     0     0     0
     0     1     1     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     0     0     0     1     1     0     0     0     0
     0     1     1     1     1     1     1     1     1     0
     0     0     0     0     0     0     0     0     0     0

  Name              Size             Bytes  Class      Attributes

  compressed1       1x17                17  uint8                
  compressed2      13x10               130  logical              
  compressed3      34x1                272  double          
  compressed3b     34x1                 34  uint8     
  compressed4      13x10               632  double     sparse    
  compressed5      13x10               394  logical    sparse    
  number_1          1x130             1040  double
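
As a rough illustration of the advice to keep the original size separately, the bit-packing approach can be condensed into a round trip like the one below. This is only a sketch: the test pattern and the names data, origSize, packed and restored are made up for the example and are not part of the listing above.

```matlab
% Sketch: pack a 0/1 matrix into bytes, then restore it from bytes + size.
data = double(mod(reshape(1:130, 13, 10), 3) == 0); % hypothetical 13x10 test pattern

% --- pack ---
origSize = size(data);            % must be stored alongside the bytes
nBits    = numel(data);
nBytes   = ceil(nBits / 8);
bits     = zeros(8, nBytes);
bits(1:nBits) = data(:);          % column-major fill; unused tail stays zero
weights  = 2.^(7:-1:0);           % 128, 64, ..., 1 (first bit is most significant)
packed   = uint8(weights * bits); % one byte per column of 8 bits

% --- unpack ---
chars    = dec2bin(packed, 8);    % nBytes-by-8 char array, always 8 digits
bitsBack = (chars' == '1');       % 8-by-nBytes logical, same layout as bits
restored = reshape(double(bitsBack(1:nBits)), origSize); % drop padding, restore shape

assert(isequal(restored, data));  % the round trip is lossless
```

The padding bits in the last byte are the reason the size (or at least the element count) has to travel with the packed data: without it there is no way to tell padding zeros from real zeros.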

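As you can see, the original array takes 1040 bytes and the compressed array takes 17. That is almost 64x compression (about 61x; not exactly 64x because 130 is not a multiple of 8, so the last byte is partly padding). Only very sparse data sets will compress better with another method. The only one that comes close (and it is super fast) is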

compressed3b = uint8(find(number_1));

which, at 34 bytes, is definitely a contender for small arrays (< 256 elements).
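
To use the index list as input again, the original array has to be rebuilt from the indices plus the original size. A minimal sketch, assuming a hypothetical 13x10 pattern (the names original, idx and restored are illustrative only):

```matlab
% Sketch: rebuild a 0/1 matrix from the linear indices returned by find.
original = zeros(13, 10);
original([14 15 27 130]) = 1;       % hypothetical nonzero positions

idx      = uint8(find(original));   % safe only while numel(original) <= 255
origSize = size(original);          % keep the size next to the indices

restored = zeros(origSize);
restored(double(idx)) = 1;          % set the stored positions back to 1

assert(isequal(restored, original));
```

Note that uint8 saturates at 255, so for larger arrays the indices would be silently clipped; uint16 or uint32 would be the safer choice there.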

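Note - when you save data in Matlab (using save(fileName, 'variableName')), some compression happens automatically. This leads to an interesting and surprising result. When you take each of the variables above and save it to a file using Matlab's save, the file sizes (in bytes) become: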

number_1     195
compressed1  202
compressed2  213
compressed3  219
compressed3b 222
compressed4  256
compressed5  252
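
One way such file sizes could be measured is sketched below. The temporary file name is an assumption for the example, and the size you get will depend on the MATLAB version and MAT-file format, so no expected number is given. For variables this small, the file is likely dominated by the MAT-file header and per-variable metadata, which would explain why all the sizes land so close together.

```matlab
% Sketch: measure the on-disk size of one variable written with save.
compressed1 = uint8([0 15 6 193 176 12 3 0 192 48 12 3 0 193 254 0 0]); % bytes from the output above

fileName = [tempname '.mat'];       % hypothetical temporary file
save(fileName, 'compressed1');      % default MAT-file format
info = dir(fileName);               % info.bytes is the file size in bytes
fprintf('compressed1: %d bytes on disk\n', info.bytes);
delete(fileName);                   % clean up
```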

On the other hand, if you create a binary file yourself using

fid = fopen('myFile.bin', 'wb');
fwrite(fid, compressed1)
fclose(fid)

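fwrite writes uint8 by default, so the file sizes are 130, 17, 130, 34 and 34 bytes (for number_1, compressed1, compressed2, compressed3 and compressed3b respectively). Sparse arrays cannot be written this way. This still shows the "complicated" compression giving the best result.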
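
Reading such a file back might look like the sketch below. The file name and the variable names packed and readBack are assumptions for the example. A raw binary file stores only the bytes, so the array dimensions have to be kept somewhere else.

```matlab
% Sketch: write packed bytes to a raw binary file and read them back.
packed = uint8([0 15 6 193 176 12 3 0 192 48 12 3 0 193 254 0 0]); % bytes from the output above

fileName = [tempname '.bin'];       % hypothetical temporary file
fid = fopen(fileName, 'w');
fwrite(fid, packed, 'uint8');       % one byte per element
fclose(fid);

info = dir(fileName);               % info.bytes equals numel(packed) here

fid = fopen(fileName, 'r');
readBack = fread(fid, Inf, 'uint8=>uint8')'; % fread returns a column; transpose to a row
fclose(fid);
delete(fileName);                   % clean up

assert(isequal(readBack, packed));
```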
