python - How do I join data from two MongoDB collections in Python?

Question

As a learning exercise, I am building a mini Twitter clone with Flask + MongoDB (w/pymongo). I need help joining data from two collections. I understand that MongoDB cannot perform joins, which is why I am asking how to do the join in Python.

I have a collection that stores user information. A document looks like this:

{
    "_id" : ObjectId("51a6c4e3eedc89e34ee46e32"),
    "email" : "alex@email.com",
    "message" : [
        ObjectId("51a6c5e1eedc89e34ee46e36")
    ],
    "pw_hash" : "alexhash",
    "username" : "alex",
    "who_id" : [
        ObjectId("51a6c530eedc89e34ee46e33"),
        ObjectId("51a6c54beedc89e34ee46e34")
    ],
    "whom_id" : [ ]
}

And another collection that stores the messages (tweets):

{
    "_id" : ObjectId("51a6c5e1eedc89e34ee46e36"),
    "author_id" : ObjectId("51a6c4e3eedc89e34ee46e32"),
    "text" : "alex first twit",
    "pub_date" : ISODate("2013-05-30T03:22:09.462Z")
}

As you can see, each message document references its author's "_id" in "author_id", and each user document references the "_id" of its messages in "message".

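Basically, what I want to do is take every message's "author_id", get the corresponding username from the user collection, and build a new dictionary containing "username" + "text" + "pub_date". With that, I could easily render the data in my Jinja2 template.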

I have the following code, which does roughly what I want:

def getMessageAuthor():
    author_id = []
    # get a list of author_ids for every message
    for author in coll_message.find():
        author_id.append(author['author_id'])
    # iterate through every author_ids to find the corresponding username
    for item in author_id:
        message = coll_message.find_one({"author_id": item}, {"text": 1, "pub_date": 1})
        author = coll_user.find_one({"_id": item}, {"username": 1})
        merged = dict(chain((message.items() + author.items())))

The output looks like this:

{u'username': u'alex', u'text': u'alex first twit', u'_id': ObjectId('51a6c4e3eedc89e34ee46e32'), u'pub_date': datetime.datetime(2013, 5, 30, 3, 22, 9, 462000)}

This is exactly the shape I want.

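The code does not work correctly, though, because I am calling .find_one(): I always get the first message, even when a user has two or more messages. Calling .find() might resolve this, but .find() returns a cursor, not a dictionary like .find_one() does. I don't know how to convert the cursor into the same dictionary format that .find_one() returns and then merge the results to get the same output as above.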

This is where I am stuck. I don't know how to proceed to fix this. Any help is appreciated.

Thank you.

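score 3 · Accepted Answer

Append the pair ("_id", "author_id") for each message, then use the message's own "_id" to retrieve the corresponding message as expected, and use "author_id" to get the username.

You just need a unique key to do that: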

from itertools import chain

def getMessageAuthor():
    merged_messages = []
    author_id = []
    # get a list of (message _id, author_id) pairs for every message
    for author in coll_message.find():
        author_id.append((author['_id'], author['author_id']))
    # look up each message by its own _id and its author by author_id
    for message_id, item in author_id:
        message = coll_message.find_one({"_id": message_id}, {"text": 1, "pub_date": 1})
        author = coll_user.find_one({"_id": item}, {"username": 1})
        # chain both item views; adding them with + only works on Python 2
        merged_messages.append(dict(chain(message.items(), author.items())))
    return merged_messages
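
As a side note on the cursor concern raised in the question: iterating over the cursor returned by .find() yields one dictionary per document, so the merge can also be done in a single pass without calling .find_one() for each message. The following is only a minimal sketch under some assumptions: the function name join_messages_with_authors is hypothetical, the sample data uses plain strings in place of ObjectId values, and the second message is made up so that the multi-message case is visible.

```python
def join_messages_with_authors(messages, users):
    """Merge each message with its author's username.

    `messages` and `users` are any iterables of dicts, for example the
    cursors returned by coll_message.find() and coll_user.find().
    """
    # Build an _id -> username lookup once instead of querying per message.
    usernames = {user["_id"]: user["username"] for user in users}

    merged = []
    for message in messages:  # iterating a cursor yields one dict per document
        merged.append({
            # .get() gives None if the author is missing from `users`
            "username": usernames.get(message["author_id"]),
            "text": message["text"],
            "pub_date": message["pub_date"],
        })
    return merged


# Hypothetical stand-in data shaped like the documents in the question
# (strings replace ObjectId and datetime values so no database is needed).
sample_users = [{"_id": "u1", "username": "alex"}]
sample_messages = [
    {"_id": "m1", "author_id": "u1", "text": "alex first twit", "pub_date": "2013-05-30"},
    {"_id": "m2", "author_id": "u1", "text": "alex second twit", "pub_date": "2013-05-31"},
]
timeline = join_messages_with_authors(sample_messages, sample_users)
```

With pymongo, the same function could presumably be called as join_messages_with_authors(coll_message.find(), coll_user.find({}, {"username": 1})). This loads every username into memory, which seems reasonable for a small learning project but is a trade-off to reconsider for a large user collection.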
