c++ - Using a temporary array to cut down on code: is it inefficient?

Question

I'm new to C++ (and SO), so apologies if this is obvious.

I've started using temporary arrays in my code to cut down on repetition and to make it easier to do the same thing to several objects. So instead of:

MyObject obj1, obj2, obj3, obj4;

obj1.doSomming(arg);
obj2.doSomming(arg);
obj3.doSomming(arg);
obj4.doSomming(arg);

I'm doing:

MyObject obj1, obj2, obj3, obj4;
MyObject* objs[] = {&obj1, &obj2, &obj3, &obj4};

for (int i = 0; i !=4; ++i)
    objs[i]->doSomming(arg);

Does this hurt performance? For instance, does it cause unnecessary memory allocation? Is it good practice? Thanks.

score 6 · Accepted Answer

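In general, you should not worry about performance at this level. Performance problems often turn out to be somewhere quite different from where you expected, especially if you don't have much experience optimizing for performance.

You should always think about writing clear code first. If performance matters, think about it in algorithmic terms first (i.e. big-O). Then measure, and let the measurements guide where you spend your optimization effort.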
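As a rough illustration of "measure first", here is a minimal timing sketch using `std::chrono`. The helper name `averageNanoseconds` and its parameters are hypothetical, not something from the question, and a serious benchmark would also need warm-up runs and protection against the optimizer removing the work entirely.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not from the original post): calls `work` `repeats`
// times and returns the average wall-clock time per call in nanoseconds.
template <typename Work>
double averageNanoseconds(Work work, int repeats) {
    assert(repeats > 0);  // avoid dividing by zero below

    using Clock = std::chrono::steady_clock;  // monotonic clock, suited to intervals
    const auto start = Clock::now();
    for (int i = 0; i != repeats; ++i)
        work();
    const auto stop = Clock::now();

    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / repeats;
}
```

You could pass each variant of the loop as a lambda, for example `averageNanoseconds([&] { obj1.doSomming(arg); }, 1000)`, and compare the numbers before deciding that anything needs optimizing.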

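That said, you could make the code even clearer and more straightforward by dropping the intermediate array and simply storing the objects themselves in an array: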

MyObject obj[4];

for (int i = 0; i != 4; ++i)
  obj[i].doSomming(arg);
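If C++11 is available, a range-based `for` removes the hard-coded `4` as well. The sketch below assumes a stand-in `MyObject` that simply records its last argument, and `callOnAll` is a hypothetical name used only so the effect can be checked.

```cpp
// Hypothetical stand-in for the question's MyObject: it records the last
// argument it received so the effect of the loop can be observed.
struct MyObject {
    int last = 0;
    void doSomming(int arg) { last = arg; }
};

// Calls doSomming(arg) on every element, then sums what each one recorded.
int callOnAll(int arg) {
    MyObject obj[4];

    for (MyObject& o : obj)  // visits each element; no index or size needed
        o.doSomming(arg);

    int sum = 0;
    for (const MyObject& o : obj)
        sum += o.last;
    return sum;  // equals 4 * arg, since all four objects saw the call
}
```

The loop bound now comes from the array's type, so changing the array size cannot leave a stale `4` behind.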

But to answer the question: no, an optimizing compiler should generally have no trouble with the pointer-array version.

For example, take this code:

#include <cstdio>

struct MyObject {
    void doSomming() {
        std::printf("Hello\n");
    }
};

void foo1() {
    MyObject obj1, obj2, obj3, obj4;

    obj1.doSomming();
    obj2.doSomming();
    obj3.doSomming();
    obj4.doSomming();
}

void foo2() {
    MyObject obj1, obj2, obj3, obj4;
    MyObject* objs[] = {&obj1, &obj2, &obj3, &obj4};

    for (int i = 0; i !=4; ++i)
        objs[i]->doSomming();
}

void foo3() {
    MyObject obj[4];

    for (int i = 0; i !=4; ++i)
        obj[i].doSomming();
}

I generated the LLVM IR for it (because it is more compact than actual assembly). At -O3 this gives:

define void @_Z4foo1v() nounwind uwtable ssp {
entry:
  %puts.i = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %puts.i1 = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %puts.i2 = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %puts.i3 = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  ret void
}

define void @_Z4foo2v() nounwind uwtable ssp {
entry:
  %puts.i = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %puts.i.1 = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %puts.i.2 = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %puts.i.3 = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  ret void
}

define void @_Z4foo3v() nounwind uwtable ssp {
entry:
  %puts.i = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %puts.i.1 = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %puts.i.2 = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %puts.i.3 = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  ret void
}

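At -O3 the loops are unrolled and the code is identical to the original version. At -Os the loops are not unrolled, but the pointer indirection and even the arrays disappear, because they are no longer needed after inlining: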

define void @_Z4foo2v() nounwind uwtable optsize ssp {
entry:
  br label %for.body

for.body:                                         ; preds = %entry, %for.body
  %i.05 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
  %puts.i = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %inc = add nsw i32 %i.05, 1
  %cmp = icmp eq i32 %inc, 4
  br i1 %cmp, label %for.end, label %for.body

for.end:                                          ; preds = %for.body
  ret void
}

define void @_Z4foo3v() nounwind uwtable optsize ssp {
entry:
  br label %for.body

for.body:                                         ; preds = %entry, %for.body
  %i.03 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
  %puts.i = tail call i32 @puts(i8* getelementptr inbounds ([6 x i8]* @str, i64 0, i64 0)) nounwind
  %inc = add nsw i32 %i.03, 1
  %cmp = icmp eq i32 %inc, 4
  br i1 %cmp, label %for.end, label %for.body

for.end:                                          ; preds = %for.body
  ret void
}
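On the memory-allocation part of the question, here is a minimal sketch showing that the temporary pointer array is an ordinary automatic (stack) array whose size is fixed at compile time. `MyObject` here is an empty stand-in and `pointerArrayBytes` is a hypothetical name, not something from the question.

```cpp
#include <cstddef>

// Empty stand-in for the question's MyObject.
struct MyObject {
    void doSomming() {}
};

// Builds the temporary pointer array and returns how many bytes it occupies.
std::size_t pointerArrayBytes() {
    MyObject obj1, obj2, obj3, obj4;
    MyObject* objs[] = {&obj1, &obj2, &obj3, &obj4};

    // The array has automatic storage: its size is known at compile time
    // and creating it involves no new/malloc.
    static_assert(sizeof(objs) == 4 * sizeof(MyObject*),
                  "objs is exactly four pointers");

    for (MyObject* o : objs)
        o->doSomming();

    return sizeof(objs);
}
```

Even without optimization, the cost is four pointers of stack space, and as the IR above shows, an optimizing build can remove the array altogether.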
