merge - Interleaving SAS datasets (by common patient number)

Question

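I need to interleave SAS datasets, but only for patient IDs that exist in both. With a MERGE statement I would use "in" and "if", but here I need to stack the data rather than join it side by side. The two datasets have the same variables.

Any ideas?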

score 1 · Accepted Answer

This is a bit of a workaround, but if the datasets have the same structure, you can try the following. It assumes you are matching on the variable ID.

proc sql;
select t1.*
from
  TABLE_A t1
where ID in (select ID from TABLE_B)
union all
select t2.*
from
  TABLE_B t2  
where ID in (select ID from TABLE_A)
;quit;
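
Note that the query above stacks the matching rows but does not by itself guarantee ID order, so the result is not strictly interleaved. A minimal sketch of an ordered variant follows, assuming the same TABLE_A and TABLE_B with identical columns in the same order; the output table name want_sql is hypothetical. The ORDER BY applies to the combined result, but the relative order of TABLE_A and TABLE_B rows within one ID is not guaranteed.

```sas
proc sql;
  /* want_sql is a hypothetical output name, not from the original answer */
  create table want_sql as
  select t1.*
  from TABLE_A t1
  where ID in (select ID from TABLE_B)
  union all
  select t2.*
  from TABLE_B t2
  where ID in (select ID from TABLE_A)
  order by ID;  /* sorts the whole stacked result by ID */
quit;
```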

score 0 · Accepted Answer

If each dataset has exactly one row per ID, this is fairly easy to do in the data step. Both datasets must be sorted by ID.

data have_1;
  do id = 1 to 20 by 2;
    output;
  end;
run;

data have_2;
  do id = 1 to 20 by 3;
    output;
  end;
run;

data want;
  set have_1 have_2;
  by id;
  if not (first.id and last.id);
run;

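Basically, a row is output unless it is both the first and the last row for its ID, that is, unless it is the only row for that ID. With one row per ID in each dataset, an ID has two rows exactly when it appears in both datasets. This does not work if either dataset has more than one row per ID.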
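
To check which dataset each kept row came from, the same step can record the source dataset. This is only a sketch under the one-row-per-ID assumption; the names want_src, source and _src are hypothetical. It relies on the INDSNAME= option of the SET statement, whose variable is not written to the output, so it is copied into a regular variable.

```sas
data want_src;
  length source $41;                  /* long enough for LIBREF.MEMNAME */
  set have_1 have_2 indsname=_src;    /* _src holds the name of the contributing dataset */
  by id;
  if not (first.id and last.id);      /* keep only IDs that have more than one row */
  source = _src;                      /* copy, because _src itself is not kept */
run;

proc print data=want_src noobs;
run;
```

With the sample data above, the IDs common to both datasets are 1, 7, 13 and 19, so the result should contain eight rows, one from WORK.HAVE_1 and one from WORK.HAVE_2 for each of those IDs.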

score 0 · Accepted Answer

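If one or both datasets have duplicate rows per ID, there are many other solutions. This is the one most similar to the MERGE idea.

A double DoW loop passes over the data twice for each ID: once to check the condition, and once to actually output. This lets you look at all rows for an ID, determine whether the condition holds, and then go over the same rows again to act on that result.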

data have_1;
  do id = 1 to 20 by 2;
    output;
    output;
  end;
run;

data have_2;
  do id = 1 to 20 by 3;
    output;
    output;
  end;
run;



data want;
  _a=0;  *initialize temporary variables;
  _b=0;  *they are reset once for each ID;
  do _n_ = 1 by 1 until (last.id);  *first pass: find out which datasets contain this ID;
    set have_1(in=a) have_2(in=b);
    by id;
    if a then _a=1;  *save that value temporarily;
    if b then _b=1;  *again temporary;
  end;
  do _n_ = 1 by 1 until (last.id);  *second pass: re-read the same rows;
    set have_1 have_2;
    by id;
    if _a and _b then output;  *only output the rows that have both _a and _b;
  end;
  drop _a _b;  *keep the temporary flags out of the output;
run;
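
As a quick sanity check on the step above, one could count the rows and distinct IDs in want. This is a sketch that assumes the sample have_1 and have_2 shown earlier in this answer, where each ID appears twice in each dataset; the column aliases n_rows and n_ids are hypothetical.

```sas
proc sql;
  /* IDs present in both sample datasets are 1, 7, 13 and 19 */
  select count(*)           as n_rows,
         count(distinct id) as n_ids
  from want;
quit;
```

With that sample data, the four common IDs each contribute two rows from each dataset, so the expected counts are 16 rows and 4 IDs.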
