c++ - Realloc()/Resizing an object in C++ for a string implementation

Question

Are C++ objects the same as C structs when they are represented in memory?

For example, in C you can do something like this:

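#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct myObj {
       int myInt;
       char myVarChar;
};

int main() {
       /* room for the int plus "test" and its terminator */
       struct myObj * testObj = (struct myObj *) malloc(sizeof(int)+5);
       testObj->myInt = 3;
       strcpy((char*)&testObj->myVarChar, "test");
       printf("String: %s", (char *) &testObj->myVarChar);
       free(testObj);
}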

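I don't think C++ lets you overload the + operator for the built-in char * type.

So I would like to create my own lightweight string class that has none of the extra overhead std::string has. I think std::string is represented contiguously: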

(int)length, (char[])data

I want exactly the same functionality, but without the length prefix (saving 8 bytes of overhead).

Here is the code I am using to test, but it results in a segfault:

#include <iostream>
using namespace std;
class pString {
    public:
        char c;
        pString * pString::operator=(const char *);
};


pString * pString::operator=(const char * buff) {

    cout << "Address of this: " << (uint32_t) this << endl;
    cout << "Address of this->c: " << (uint32_t) &this->c << endl;

    realloc(this, strlen(buff)+1);
    memcpy(this, buff,  strlen(buff));
    *(this+strlen(buff)) = '\0';

    return this;
};

struct myObj {
        int myInt;
        char myVarChar;
};

int main() {

    pString * myString = (pString *) malloc(sizeof(pString));
    *myString = "testing";
    cout << "'" << (char *) myString << "'";    
}

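Edit: Nobody really understands what I want to do. Yes, I know I can have a pointer to the string inside the class, but that is 8 bytes more expensive than a plain cstring. I wanted exactly the same internal representation. Thanks for trying, though.

Edit: The end result I wanted to achieve was being able to use the + operator with no extra memory usage compared to using strcat and the like.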

const char * operator+(const char * first, const char * second);

score 16 · Accepted Answer

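You should not waste your time writing string classes. There is a reason people spent time writing them in the first place, and it is naive to think they wrote them because they wanted to create big, obfuscated, overhead-laden code that you could easily improve on in a matter of hours.

For example, your code has quadratic complexity for memory reallocation in the assignment operator: each assignment of a string one character longer uses a new memory block one byte larger, which leads to heavy memory fragmentation after a "few" assignments like this. You save a few bytes, but potentially lose megabytes to address-space and memory-page fragmentation.

Designing it this way also leaves you no way to implement the += operator efficiently. Instead of copying just the appended string, as is possible in most cases, you always have to copy the whole string, so you reach quadratic complexity again as soon as you append small strings to a bigger one a few times.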
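To put rough numbers on that, here is a small sketch that only simulates the bookkeeping; it allocates nothing. The names GrowthCost and simulate_appends are made up for this illustration, and the doubling policy is just one common choice, not what any particular std::string implementation is guaranteed to do.

```cpp
#include <cstddef>

// Hypothetical helper for illustration: simulates appending n single
// characters to a growing buffer and reports the cost of the growth policy.
struct GrowthCost {
    std::size_t reallocations = 0;  // how many times the buffer had to grow
    std::size_t bytes_copied = 0;   // bytes moved into new blocks, worst case
};

GrowthCost simulate_appends(std::size_t n, bool geometric) {
    GrowthCost cost;
    std::size_t size = 0;
    std::size_t capacity = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (size + 1 > capacity) {
            // Exact fit grows by one byte; geometric growth doubles.
            capacity = geometric ? (capacity == 0 ? 1 : capacity * 2)
                                 : size + 1;
            ++cost.reallocations;
            cost.bytes_copied += size;  // assume the old contents are copied
        }
        ++size;
    }
    return cost;
}
```

For 1000 one-character appends, the exact-fit policy grows the buffer 1000 times and copies 499500 bytes in the worst case, while doubling grows it 11 times and copies 1023 bytes.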

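Sorry, but your idea looks very likely to become terrible to maintain and orders of magnitude less efficient than typical string implementations such as std::string.

Don't worry, this is true of practically every great idea for "writing your own better version of a standard container" :)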

score 8 · Accepted Answer

struct myObj {
   //...
   char myVarChar;
};

This is not going to work. You need a fixed-size array, a pointer to char, or the struct hack. You cannot assign a pointer to this myVarChar.
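For reference, a minimal sketch of the struct-hack idea: a fixed header followed by the characters in the same malloc'd block. Record, record_text and make_record are hypothetical names for this illustration, and treating the bytes after the header as text is a widely used idiom rather than something the C++ standard spells out.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical record type: a fixed header followed by a variable-length,
// NUL-terminated string that lives in the same malloc'd block.
struct Record {
    int myInt;
};

// The text starts immediately after the header.
inline char* record_text(Record* r) {
    return reinterpret_cast<char*>(r + 1);
}

// Allocates header and text in one block; returns nullptr on failure.
// The caller releases the block with std::free().
inline Record* make_record(int value, const char* text) {
    const std::size_t len = std::strlen(text);
    void* block = std::malloc(sizeof(Record) + len + 1);
    if (block == nullptr) return nullptr;
    Record* r = new (block) Record{value};       // start the header's lifetime
    std::memcpy(record_text(r), text, len + 1);  // copies the terminator too
    return r;
}
```

Note that sizeof(Record) never changes; only the size of the allocation does, and the caller has to keep whatever pointer the allocation returned.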

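So I want to create my own lightweight string class which has no extra overhead std::string has.

What extra overhead are you referring to? Have you tested and measured to see whether std::string is really a bottleneck?

I think std::string is represented contiguously

Yes, mostly, the character buffer part. However, the following:

(int)length, (char[])data

is not required by the standard. Translated: a string implementation need not use this particular layout for its data, and it may hold additional data.

Now, your lightweight string class is fraught with errors: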

class pString {
public:
    char c; // typically this is implementation detail, should be private
    pString * pString::operator=(const char *); 
    // need ctors, dtors at least as well
    // won't you need any functions on strings?
};

Try something like this:

/* a light-weight string class */
class lwstring { 
  public:
     lwstring(); // default ctor
     lwstring(lwstring const&); // copy ctor
     lwstring(char const*); // consume C strings as well
     lwstring& operator=(lwstring const&); // assignment
     ~lwstring(); // dtor
     size_t length() const; // string length
     bool empty() const; // empty string?
  private:
     char *_myBuf;
     size_t _mySize;
};

score 5 · Accepted Answer

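Wow. What you're trying to do is a complete abuse of C++, would be totally compiler-dependent if it worked at all, and would surely land you on TheDailyWTF some day.

The reason you're getting a segfault is probably that your operator= reallocates the object to a different address, but you never update the myString pointer in main. I hesitate to even call it an object at this point, since no constructor was ever called.

I think what you're trying to do is make pString a smarter pointer to a string, but you're going about it all wrong. Let me take a crack at it.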

#include <cstdlib>
#include <cstring>
#include <iostream>
using namespace std;
class pString {
    public:
        char * c;
        pString & operator=(const char *);
        const char * c_str();
};


pString & pString::operator=(const char * buff) {

    // Print pointers as void* so the addresses are not truncated on 64-bit targets.
    cout << "Address of this: " << static_cast<const void *>(this) << endl;

    c = (char *) malloc(strlen(buff)+1);
    memcpy(c, buff,  strlen(buff));
    *(c+strlen(buff)) = '\0';

    // c is only meaningful once it has been assigned, so print it here.
    cout << "Address of this->c: " << static_cast<const void *>(c) << endl;

    return *this;
}

const char * pString::c_str() {
    return c;
}

int main() {

    pString myString;
    myString = "testing";
    cout << "'" << myString.c_str() << "'";

}

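These days I wouldn't use malloc; I'd use new/delete instead, but I left this as close to your original as possible.

You might think you are wasting the space of a pointer in your class, but you aren't: you're trading it for the pointer you previously kept in main. I hope this example makes it clear. The variables are the same size, and the amount of additional memory allocated by malloc/realloc is the same as well.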

pString myString;
char * charString;
assert(sizeof(myString) == sizeof(charString));

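PS. I should point out that this code still needs a lot of work; there are holes galore. For starters, you need a constructor to initialize the pointer and a destructor to free it when you're done. You can also write your own implementation of operator+.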
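A rough sketch of what filling those holes could look like, using new[]/delete[] instead of malloc. This is one possible shape under the assumption that copies should be deep; the duplicate helper is a made-up name, and the size check holds on mainstream compilers but is not something the standard promises.

```cpp
#include <cstddef>
#include <cstring>
#include <utility>

// Sketch of a pointer-sized string wrapper whose ownership is handled by
// the constructor, the destructor, and the copy operations.
class pString {
public:
    pString() : c(nullptr) {}
    pString(const char* s) : c(duplicate(s)) {}
    pString(const pString& other) : c(duplicate(other.c)) {}
    ~pString() { delete[] c; }

    // Copy-and-swap: taking the argument by value makes the copy up front,
    // and the old buffer is released when `other` goes out of scope.
    pString& operator=(pString other) {
        std::swap(c, other.c);
        return *this;
    }

    const char* c_str() const { return c ? c : ""; }

private:
    // Hypothetical helper: returns a heap copy of s, or nullptr for nullptr.
    static char* duplicate(const char* s) {
        if (s == nullptr) return nullptr;
        const std::size_t n = std::strlen(s) + 1;
        char* copy = new char[n];
        std::memcpy(copy, s, n);
        return copy;
    }

    char* c;  // the only data member
};

static_assert(sizeof(pString) == sizeof(char*), "wrapper is pointer-sized");
```

With this, pString s; s = "testing"; works through the converting constructor, and copies made with pString t = s; own their own buffer.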

score 2 · Accepted Answer

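Are C++ objects the same as C structs when they are represented in memory?

Strictly speaking, no. In general, yes. C++ classes and structs are identical in memory layout to C structs, with the following exceptions:

Bit fields have different packing rules
Sizes are fixed at compile time
If there are any virtual functions, the compiler adds a vtable entry to the memory layout.
If the object inherits from a base class, the new class's layout is appended to the base class's layout, including the vtable, if any.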
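The standard library can tell you whether a given type still has the C-compatible ("standard layout") guarantee. A small sketch, with type names invented for the illustration:

```cpp
#include <type_traits>

struct PlainC {            // valid in both C and C++
    int myInt;
    char myVarChar;
};

struct WithVirtual {       // C++ only: has a virtual function
    int myInt;
    char myVarChar;
    virtual ~WithVirtual() {}
};

struct Base { int a; };
struct Derived : Base { int b; };  // data members in both base and derived

static_assert(std::is_standard_layout<PlainC>::value,
              "plain struct keeps the C-compatible layout");
static_assert(!std::is_standard_layout<WithVirtual>::value,
              "a virtual function drops the guarantee");
static_assert(!std::is_standard_layout<Derived>::value,
              "data members spread over a hierarchy drop it too");
```

On common compilers sizeof(WithVirtual) is also larger than sizeof(PlainC) because of the hidden vtable pointer, although the exact sizes are implementation-specific.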

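I don't think C++ can overload the + operator for the built-in char* type. So I want to create my own lightweight string class which has no extra overhead that std::string has. I think std::string is represented contiguously

You cannot write an operator+ overload for char*: an overloaded operator needs at least one operand of class or enumeration type, and for pointers the built-in behaviour of + is pointer arithmetic. std::string overloads operator+ to append char* data to a string. The string is stored in memory as a C string plus additional information. The c_str() member function returns a pointer to the internal char array.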
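A short sketch of the difference. The function name join is made up for the example; the commented-out declaration is the one from the question.

```cpp
#include <string>

// Not allowed: both operands are built-in types, so compilers reject this.
// const char* operator+(const char* first, const char* second);

// Allowed: std::string already provides operator+ with a class-type operand,
// so converting one side is enough.
inline std::string join(const char* first, const char* second) {
    return std::string(first) + second;
}
```

The result owns its buffer, and join("foo", "bar").c_str() gives you a const char* to pass to C APIs for as long as the string object is alive.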

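In your C example, you are relying on undefined behaviour. Don't realloc like that. It can result in Bad Things, namely bizarre segfaults.

Your C++ example is also doing Bad Things by calling realloc(this). Instead, you should carry a char* and get a new char[] buffer to store the characters in, rather than using realloc(). The behaviour of such a realloc is undefined.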

score 2 · Accepted Answer

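There are a number of mistakes in your class definition and usage. If you want to store a string, you should use a pointer type such as a char* member, not an individual char. Using a single char means that only one character of memory is allocated.

Another mistake is the allocation code, where you call realloc on this. You can potentially change the memory allocated, but not the value of this. To achieve what you want you would have to assign the result back to this (something like this = (pString*)realloc(this, strlen(buff)+1);), which the language does not allow because this is not assignable, and which would be really bad practice anyway. It's much better to use realloc on a char* member.

Unfortunately, C++ proper has no alternative to realloc or expand; you must use new and delete instead and do any copying yourself.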

score 2 · Accepted Answer

Why do you write in C with classes? Why not use C++?

Answered 2009-03-16T19:18:42.813

score 2 · Accepted Answer

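I don't think "this" works the way you think it works.

Specifically, you cannot reallocate this to point to a larger buffer inside a member function, because whatever called that member function still holds a pointer to the old "this". Since it isn't passed by reference, there is no way for you to update it.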
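For contrast, this is roughly what it looks like when the pointer is passed by reference and the return value of realloc() is actually used. assign_cstring is a hypothetical helper, and it assumes dest is either null or came from malloc/realloc.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: replaces the malloc'd C string `dest` with a copy of
// `src`. `dest` is taken by reference so the caller's pointer is updated if
// realloc() moves the block. Returns false and leaves `dest` untouched when
// the allocation fails. Precondition: `src` does not point into `dest`.
inline bool assign_cstring(char*& dest, const char* src) {
    const std::size_t n = std::strlen(src) + 1;  // include the terminator
    char* moved = static_cast<char*>(std::realloc(dest, n));
    if (moved == nullptr) return false;
    std::memcpy(moved, src, n);
    dest = moved;
    return true;
}
```

A member function has no equivalent of that char*& parameter for this, which is why the same trick cannot be applied to the object itself.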

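The obvious way around this is for your class to hold a pointer to the buffer and reallocate that. However, reimplementing a string class is a good way to give yourself lots of headaches down the road. A simple wrapper function would probably accomplish what you wanted (assuming "being able to use the + operator with no extra memory usage compared to using strcat" really is what you wanted):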

void concatenate(std::string& s, const char* c) {
    s.reserve(s.size() + strlen(c));
    s.append(c);
}

Chances are that append does that internally anyway.

score 2 · Accepted Answer

You cannot change the size of an object/struct in either C or C++. Their sizes are fixed at compile time.

score 1 · Accepted Answer

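You can't realloc C++ objects. As others have pointed out, this isn't really a pointer you can modify, and there is no guarantee that it points to memory that realloc can work on.

One solution for concatenation is to implement a class hierarchy that defers the real concatenation until it is needed.

Something like this: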

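class MyConcatString;
class MyString {
public:
  // A default constructor, reserve(), length(), operator= and operator+=
  // are assumed to exist.
  MyString(const MyConcatString& c);   // defined below, once MyConcatString is complete
  MyConcatString operator + (const MyString& r) const;
};

class MyConcatString {
public:
  friend class MyString;
  MyConcatString(const MyString& l, const MyString& r):l(l), r(r) {}
  ...
  operator MyString () {
    MyString tmp;
    tmp.reserve(l.length()+r.length());
    tmp = l;
    tmp += r;
    return tmp;
  }
private:
  const MyString& l;   // const, because the constructor receives const references
  const MyString& r;
};

MyString::MyString(const MyConcatString& c) {
  reserve(c.l.length()+c.r.length());
  operator = (c.l);
  operator += (c.r);
}

MyConcatString MyString::operator + (const MyString& r) const {
  return MyConcatString(*this, r);
}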

So if you have

MyString a = "hello";
MyString b = " world";
MyString c = a + b;

the last line results in MyString c = MyConcatString(a, b);

For more details, see "The C++ Programming Language".
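A compilable sketch of the same deferred-concatenation idea, with std::string standing in for the storage so the example stays short. Text and Concat are names chosen for this illustration, not the classes from the book.

```cpp
#include <cstddef>
#include <string>

class Text;

// Proxy returned by operator+: it only remembers its operands.
class Concat {
public:
    Concat(const Text& l, const Text& r) : left(l), right(r) {}
    const Text& left;
    const Text& right;
};

class Text {
public:
    Text(const char* s) : data(s) {}
    // Materialises a deferred concatenation after reserving the full size.
    Text(const Concat& c);
    std::size_t length() const { return data.size(); }
    const std::string& str() const { return data; }
private:
    std::string data;
};

// No characters are copied here; the work happens in Text(const Concat&).
inline Concat operator+(const Text& l, const Text& r) { return Concat(l, r); }

inline Text::Text(const Concat& c) : data() {
    data.reserve(c.left.length() + c.right.length());
    data += c.left.str();
    data += c.right.str();
}
```

With Text a = "hello"; Text b = " world"; the statement Text c = a + b; builds the result in one reserved buffer. The proxy holds references, so it should be converted within the same expression rather than stored (for example with auto).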

Another solution is to wrap the char* inside a struct, but you don't seem to like that idea.

But whichever solution you choose, objects in C++ can't be relocated.

score 1 · Accepted Answer

 #include <iostream>
    using namespace std;
    class pString {
        public:
            char c[1];
            pString * pString::operator=(const char *);
    };


    pString * pString::operator=(const char * buff) {

        cout << "Address of this: " << (uint32_t) this << endl;
        cout << "Address of this->c: " << (uint32_t) &this->c << endl;

        realloc(this->c, strlen(buff)+1);
        memcpy(this->c, buff,  strlen(buff));
        *(this->c+strlen(buff)) = '\0';

        return this;
    };

    struct myObj {
            int myInt;
            char myVarChar;
    };

    int main() {

        pString * myString = (pString *) malloc(sizeof(pString));
        *myString = "testing vijay";
        cout << "'" << (char*)myString << "'";
    }


This should work as long as realloc() does not move the block, but it's not advisable.

score 1 · Accepted Answer

This is a mock-up, so don't mind the lack of const-correctness.

#include <string.h>
#include <iostream>
#include <utility>

class light_string {
public:
    light_string(const char* str) {
        size_t length = strlen(str);
        char*  buffer = new char[sizeof(size_t) + length + 1];

        memcpy(buffer, &length, sizeof(size_t));
        memcpy(buffer + sizeof(size_t), str, length);
        memset(buffer + sizeof(size_t) + length, 0, 1);

        m_str = buffer + sizeof(size_t);
    }

    ~light_string() {
        char* addr = m_str - sizeof(size_t);
        delete [] addr;
    }

    light_string& operator =(const char* str) {
        light_string s(str);
        std::swap(m_str, s.m_str);  // swap only the pointers; s then frees the old buffer

        return *this;
    }

    operator const char*() {
        return m_str;
    }

    size_t length() {
        return
            *reinterpret_cast<size_t *>(m_str - sizeof(size_t));
    }

private:
    char* m_str;
};


int main(int argc, char* argv[]) 
{
    std::cout<<sizeof(light_string)<<std::endl;

    return 0;
}

score 1 · Accepted Answer

You're moving the "this" pointer. That's not going to work. I think what you really want is just a wrapper around a buffer.

score 1 · Accepted Answer

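What you want to do does not and cannot work in C++. What you are looking for is the C99 feature of flexible arrays. It works well in C99 for two reasons: first, there are no built-in constructors, and second, there is no inheritance (at least not as a language feature). If a class inherits from another, the memory used by the subclass is packed after the memory of the parent class, but a flexible array needs to be at the end of the structure/class.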

class pString {
    char txt[];
};

class otherString : pString { // This cannot work because now
    size_t len;               // the flexible array is not at the
};                            // end

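Take std::string as an example: it was written by C++ experts, and I am sure they didn't leave out a "good trick" without a reason. If you still find that these strings don't perform well in your program, use plain C strings instead. Of course, they don't provide the sweet API you want.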

score 1 · Accepted Answer

#include <iostream>
using namespace std;
class pString {
public:
    char c;
    pString * pString::operator=(const char *);
};

pString * pString::operator=(const char * buff) {

    cout << "Address of this: " << (uint32_t) this << endl;
    cout << "Address of this->c: " << (uint32_t) &this->c << endl;

    char *newPoint = (char *)realloc(this, strlen(buff)+1);
    memcpy(newPoint, buff,  strlen(buff));
    *((char*)newPoint+strlen(buff)) = '\0';

    cout << "Address of this After: " << (uint32_t) newPoint << endl;

    return (pString*)newPoint;
};

int main() {

    pString * myString = (pString *) malloc(sizeof(pString));
    *myString = "testing";

    cout << "Address of myString: " << (uint32_t) myString << endl;

    cout << "'" << (char *) myString << "'";    
}

It works when realloc doesn't move the block, i.e.

Address of this: 1049008
Address of this->c: 1049008
Address of this After: 1049008
Address of myString: 1049008
'testing'

That works, but it fails when the following happens:

Address of this: 1049008
Address of this->c: 1049008
Address of this After: 1049024
Address of myString: 1049008
''

The obvious solution would be to have

this = (pString*) newPoint;

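But the compiler complains about an invalid lvalue in assignment. Is there a correct way to update this? (Just for completeness; I doubt I will use the code, since everyone seems to hate it.) Thanks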

score 1 · Accepted Answer

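If you want something that is basically the same as std::string except that it doesn't know how long the string is, you should learn how std::string works, which operator overloads it has, and so on, and then mimic that with just the differences you want.

There is unlikely to be any real point to this, however.

Regarding your latest update: you say you want a design in which general application code passes around naked pointers to heap objects, with no automatic cleanup.

This is, quite simply, a very bad idea.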

score 0 · Accepted Answer

This code is a mess and, as RnR and others have said, not advisable. But it works for what I want it to do:

#include <iostream>
using namespace std;

struct pString {
        /* No Member Variables, the data is the object */ 
        /* This class cannot be extended & will destroy a vtable */
    public:
        pString * pString::operator=(const char *);
};

pString& operator+(pString& first, const char *sec) {


        int lenFirst;
        int lenSec = strlen(sec);
        void * newBuff = NULL;

        if (&first == NULL)
        {
            cout << "NULL" << endl;
            lenFirst = 0; 
            newBuff = malloc(sizeof(pString)+lenFirst+lenSec+1);
        } else {
            lenFirst = strlen((char*)&first);
            newBuff= (pString*)realloc(&first, lenFirst+lenSec+1);
        }

        if (newBuff == NULL)
        {
            cout << "Realloc Failed"<< endl;
            free(&first);
            exit(0);
        }       

        memcpy((char*)newBuff+lenFirst, sec, lenSec);
        *((char*)newBuff+lenFirst+lenSec) = '\0';


        cout << "newBuff: " << (char*)newBuff << endl;

        return *(pString*)newBuff;

};


pString * pString::operator=(const char * buff) {

    cout << "Address of this: " << (uint32_t) this << endl;

    char *newPoint = (char *)realloc(this, strlen(buff)+200);
    memcpy(newPoint, buff,  strlen(buff));
    *((char*)newPoint+strlen(buff)) = '\0';

    cout << "Address of this After: " << (uint32_t) newPoint << endl;

    return (pString*)newPoint;
};


int main() {

    /* This doesn't work that well, there is something going wrong here, but it's just a proof of concept */

    cout << "Sizeof: " << sizeof(pString) << endl;

    pString * myString = NULL;

    //myString = (pString*)malloc(1);
    myString = *myString = "testing";
    pString& ref = *myString;


    //cout << "Address of myString: " << myString << endl;

    ref = ref + "test";
    ref = ref + "sortofworks" + "another" + "anothers";


    printf("FinalString:'%s'", myString);

}

score 0 · Accepted Answer

If you want performance, you can write your class as follows:

template<int max_size> class MyString
{
public:
   size_t size;
   char contents[max_size];

public:
   MyString(const char* data);
};

Set max_size to a value appropriate for the context. This way the object can be created on the stack, and no memory allocation is involved.
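A possible way to flesh out the constructor, assuming that over-long input should simply be truncated (the answer above does not say what should happen). FixedString is a hypothetical name, and the capacity is a std::size_t here instead of int.

```cpp
#include <cstddef>
#include <cstring>

template <std::size_t max_size>
class FixedString {
    static_assert(max_size > 0, "need room for the terminator");
public:
    // Copies at most max_size - 1 characters and always NUL-terminates.
    FixedString(const char* data) : size_(0) {
        const std::size_t n = std::strlen(data);
        size_ = n < max_size ? n : max_size - 1;
        std::memcpy(contents_, data, size_);
        contents_[size_] = '\0';
    }
    std::size_t size() const { return size_; }
    const char* c_str() const { return contents_; }
private:
    std::size_t size_;
    char contents_[max_size];  // storage lives inside the object itself
};
```

FixedString<16> s("testing"); keeps all seven characters, while FixedString<4> t("testing"); keeps only "tes". The price is that every object occupies max_size bytes regardless of how short the string is.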

You can also get what you want by overloading operator new:

class pstring
{
public:
    int myInt;
    char myVarchar;

    void* operator new(size_t size, const char* p);
    void operator delete(void* p); 
};

void* pstring::operator new(size_t size, const char* p)
{
    assert(sizeof(pstring)==size);
    char* pm = (char*)malloc(sizeof(int) + strlen(p) +1 );
    strcpy(sizeof(int)+pm, p);
    *(int*)(pm) = strlen(p);  /* assign myInt */
    return pm;
}

void pstring::operator delete(void* p)
{
    ::free(p);
}


pstring* ps = new("test")pstring;

delete ps;
