python - What is the best API for scanning Google Cloud Bigtable data in Python?

Translated from: https://stackoverflow.com/questions/40786823 2016-11-24T12:48:00.347


The Google Cloud sample code shows two APIs for scanning a table, HBase-style:

1) Using the bigtable object from the google.cloud module: https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/bigtable/hello/main.py

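from google.cloud import bigtable

# project_id, instance_id and table_id are defined elsewhere in the sample
client = bigtable.Client(project=project_id, admin=True)
instance = client.instance(instance_id)
table = instance.table(table_id)
partial_rows = table.read_rows(...)
# Pull every row from the stream into memory before iterating
partial_rows.consume_all()
for row_key, row in partial_rows.rows.items():
    ...  # loop body omitted in the question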

2) Using the bigtable and happybase objects from the google.cloud module: https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/bigtable/hello_happybase/main.py

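from google.cloud import bigtable
from google.cloud import happybase

# project_id, instance_id and table_name are defined elsewhere in the sample
client = bigtable.Client(project=project_id, admin=True)
instance = client.instance(instance_id)
# Wrap the Bigtable instance in a HappyBase-compatible connection
connection = happybase.Connection(instance=instance)
table = connection.table(table_name)
for key, row in table.scan():
    ...  # loop body omitted in the question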

Of these two mechanisms, which is the recommended approach for scanning Bigtable?

Also, are they suitable for use from PySpark?
