python - Download a spreadsheet from Google Docs using Python

Question

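Can you produce a Python example of how to download a Google Docs spreadsheet given its key and worksheet ID (gid)? I can't.

I've scoured versions 1, 2 and 3 of the API. I'm having no luck: I can't figure out their convoluted ATOM-like feeds API, the gdata.docs.service.DocsService._DownloadFile private method says that I'm unauthorized, and I don't want to write a whole Google Login authentication system myself. I'm about to stab myself in the face out of frustration.

I have a few spreadsheets and I want to access them like so: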

username = 'mygooglelogin@gmail.com'
password = getpass.getpass()

def get_spreadsheet(key, gid=0):
    ... (help!) ...

for row in get_spreadsheet('5a3c7f7dcee4b4f'):
    cell1, cell2, cell3 = row
    ...

Please save my face.

Update 1: I've tried the following, but no combination of Download() or Export() seems to work. (Docs for DocsService here)

import gdata.docs.service
import getpass
import os
import tempfile
import csv

def get_csv(file_path):
  return csv.reader(file(file_path).readlines())

def get_spreadsheet(key, gid=0):
  gd_client = gdata.docs.service.DocsService()
  gd_client.email = 'xxxxxxxxx@gmail.com'
  gd_client.password = getpass.getpass()
  gd_client.ssl = False
  gd_client.source = "My Fancy Spreadsheet Downloader"
  gd_client.ProgrammaticLogin()

  file_path = tempfile.mktemp(suffix='.csv')
  uri = 'http://docs.google.com/feeds/documents/private/full/%s' % key
  try:
    entry = gd_client.GetDocumentListEntry(uri)

    # XXXX - The following dies with RequestError "Unauthorized"
    gd_client.Download(entry, file_path)

    return get_csv(file_path)
  finally:
    try:
      os.remove(file_path)
    except OSError:
      pass

score 38 · Accepted Answer

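The https://github.com/burnash/gspread library is a newer, simpler way to interact with Google Spreadsheets than the gdata library suggested by the older answers, which is not only too low-level but also overly complicated.

You will also need to create and download (in JSON format) a service account key: https://console.developers.google.com/apis/credentials/serviceaccountkey

Here's an example of how to use it: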

import csv
import gspread
from oauth2client.service_account import ServiceAccountCredentials

scope = ['https://spreadsheets.google.com/feeds']
credentials = ServiceAccountCredentials.from_json_keyfile_name('credentials.json', scope)

docid = "0zjVQXjJixf-SdGpLKnJtcmQhNjVUTk1hNTRpc0x5b9c"

client = gspread.authorize(credentials)
spreadsheet = client.open_by_key(docid)
for i, worksheet in enumerate(spreadsheet.worksheets()):
    filename = docid + '-worksheet' + str(i) + '.csv'
    # Python 3: the csv module needs a text-mode file opened with newline=''
    with open(filename, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerows(worksheet.get_all_values())
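
The question asks for one worksheet identified by its gid rather than every worksheet. A minimal sketch of that selection, assuming each gspread worksheet object exposes its gid as the `id` attribute; the helper names `find_worksheet_by_gid` and `write_rows_csv` are hypothetical and not part of gspread:

```python
import csv


def find_worksheet_by_gid(spreadsheet, gid):
    """Return the worksheet whose id equals gid, or None if there is no match.

    Assumes `spreadsheet.worksheets()` returns objects with an `id`
    attribute holding the gid (an assumption about the gspread objects).
    """
    for worksheet in spreadsheet.worksheets():
        # Compare as strings so that both 0 and "0" match.
        if str(worksheet.id) == str(gid):
            return worksheet
    return None


def write_rows_csv(rows, filename):
    """Write a list of rows to a CSV file (Python 3: text mode, newline='')."""
    with open(filename, 'w', newline='', encoding='utf-8') as f:
        csv.writer(f).writerows(rows)
```

With the `spreadsheet` object from the example above, the result of `find_worksheet_by_gid(spreadsheet, gid).get_all_values()` could then be passed to `write_rows_csv`, after checking that a worksheet was actually found.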

score 20 · Accepted Answer

Just in case anyone comes across this looking for a quick fix, here's another (currently) working solution that doesn't rely on the gdata client library:

#!/usr/bin/python

import re, urllib, urllib2

class Spreadsheet(object):
    def __init__(self, key):
        super(Spreadsheet, self).__init__()
        self.key = key

class Client(object):
    def __init__(self, email, password):
        super(Client, self).__init__()
        self.email = email
        self.password = password

    def _get_auth_token(self, email, password, source, service):
        url = "https://www.google.com/accounts/ClientLogin"
        params = {
            "Email": email, "Passwd": password,
            "service": service,
            "accountType": "HOSTED_OR_GOOGLE",
            "source": source
        }
        req = urllib2.Request(url, urllib.urlencode(params))
        return re.findall(r"Auth=(.*)", urllib2.urlopen(req).read())[0]

    def get_auth_token(self):
        source = type(self).__name__
        return self._get_auth_token(self.email, self.password, source, service="wise")

    def download(self, spreadsheet, gid=0, format="csv"):
        url_format = "https://spreadsheets.google.com/feeds/download/spreadsheets/Export?key=%s&exportFormat=%s&gid=%i"
        headers = {
            "Authorization": "GoogleLogin auth=" + self.get_auth_token(),
            "GData-Version": "3.0"
        }
        req = urllib2.Request(url_format % (spreadsheet.key, format, gid), headers=headers)
        return urllib2.urlopen(req)

if __name__ == "__main__":
    import getpass
    import csv

    email = "" # (your email here)
    password = getpass.getpass()
    spreadsheet_id = "" # (spreadsheet id here)

    # Create client and spreadsheet objects
    gs = Client(email, password)
    ss = Spreadsheet(spreadsheet_id)

    # Request a file-like object containing the spreadsheet's contents
    csv_file = gs.download(ss)

    # Parse as CSV and print the rows
    for row in csv.reader(csv_file):
        print ", ".join(row)

score 18 · Accepted Answer

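You might try using the AuthSub method described in the Exporting Spreadsheets section of the documentation.

Get a separate login token for the spreadsheets service and substitute it for the export. Adding this to the get_spreadsheet code did the trick for me: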

import gdata.spreadsheet.service

def get_spreadsheet(key, gid=0):
    # ...
    spreadsheets_client = gdata.spreadsheet.service.SpreadsheetsService()
    spreadsheets_client.email = gd_client.email
    spreadsheets_client.password = gd_client.password
    spreadsheets_client.source = "My Fancy Spreadsheet Downloader"
    spreadsheets_client.ProgrammaticLogin()

    # ...
    entry = gd_client.GetDocumentListEntry(uri)
    docs_auth_token = gd_client.GetClientLoginToken()
    gd_client.SetClientLoginToken(spreadsheets_client.GetClientLoginToken())
    gd_client.Export(entry, file_path)
    gd_client.SetClientLoginToken(docs_auth_token) # reset the DocList auth token

Note that I also used Export, since Download seems to give only PDF files.

score 6 · Accepted Answer

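(Jul 2016) Rephrasing with current terminology: "How do I download a Google Sheet in CSV or XLSX format from Google Drive using Python?" (Google Docs now only refers to the cloud-based word processor/text editor, which doesn't provide access to Google Sheets spreadsheets.)

First, all the other answers are pretty much outdated or soon will be, because they use the GData ("Google Data") Protocol, ClientLogin, or AuthSub, all of which have been deprecated. The same is true for any code or library that uses the Google Sheets API v3 or older.

Modern Google API access uses API keys (for accessing public data), OAuth2 client IDs (for accessing data owned by users), or service accounts (for accessing data owned by applications/in the cloud), primarily with the Google Cloud client libraries for GCP APIs and the Google APIs Client Libraries for non-GCP APIs. For this task, it would be the latter, for Python.

To make it happen, your code needs authorized access to the Google Drive API, perhaps to query for the specific Sheets to download, and then to perform the actual export(s). Since this is likely a common operation, I wrote a blog post sharing a code snippet that does this for you. If you wish to pursue this further, I've got another pair of posts, along with a video, that outline how to upload files to and download files from Google Drive.

Note that there is also a newer Google Sheets API v4, but it's primarily for spreadsheet-oriented operations, i.e., inserting data, reading spreadsheet rows, formatting cells, creating charts, adding pivot tables, etc., not for file-based requests like exporting, where the Drive API is the correct one to use.

I posted a demo on my blog of exporting a Google Sheet as CSV from Drive. The core part of the script: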

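# imports used by this excerpt; `creds` is assumed to come from the usual
# OAuth2 authorization boilerplate, which is not shown here
from __future__ import print_function
import os
from googleapiclient import discovery
from httplib2 import Http

# setup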
FILENAME = 'inventory'
SRC_MIMETYPE = 'application/vnd.google-apps.spreadsheet'
DST_MIMETYPE = 'text/csv'
DRIVE = discovery.build('drive', 'v3', http=creds.authorize(Http()))

# query for file to export
files = DRIVE.files().list(
    q='name="%s" and mimeType="%s"' % (FILENAME, SRC_MIMETYPE), orderBy='modifiedTime desc,name').execute().get('files', [])

# export 1st match (if found)
if files:
    fn = '%s.csv' % os.path.splitext(files[0]['name'].replace(' ', '_'))[0]
    print('Exporting "%s" as "%s"... ' % (files[0]['name'], fn), end='')
    data = DRIVE.files().export(fileId=files[0]['id'], mimeType=DST_MIMETYPE).execute()
    if data:
        with open(fn, 'wb') as f:
            f.write(data)
        print('DONE')

For more on using Google Sheets with Python, see my answer to a similar question. You can also download Sheets in XLSX and the other formats supported by Drive.
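
Exporting to another format only changes the MIME type passed to `files().export()`. A minimal sketch, assuming `drive` is an authorized Drive v3 service object like `DRIVE` in the snippet above; the `EXPORT_MIMETYPES` table and the `export_sheet` helper are hypothetical names introduced here for illustration:

```python
# Target MIME types for a few export formats.
EXPORT_MIMETYPES = {
    'csv': 'text/csv',
    'xlsx': 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
    'pdf': 'application/pdf',
}


def export_sheet(drive, file_id, fmt, out_path):
    """Export the Google Sheet `file_id` in format `fmt` and save it to `out_path`."""
    if fmt not in EXPORT_MIMETYPES:
        raise ValueError('unsupported export format: %r' % fmt)
    # As in the snippet above, execute() is expected to return the
    # exported file content as bytes.
    data = drive.files().export(
        fileId=file_id, mimeType=EXPORT_MIMETYPES[fmt]).execute()
    with open(out_path, 'wb') as f:
        f.write(data)
    return out_path
```

For example, `export_sheet(DRIVE, files[0]['id'], 'xlsx', 'inventory.xlsx')` would save the first match as an Excel workbook. The output file name here is only an illustration.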

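If you're completely new to Google APIs, you need to take a further step back and review these videos first:

How to use Google APIs and create API projects: the UI has changed, but the concepts are still the same
Walkthrough of authorization boilerplate code (Python): you can use any supported language to access Google APIs; if you don't use Python, treat it as pseudocode to help you get started
Listing your files in Google Drive, plus a code deep-dive post

If you already have experience with G Suite APIs and want to see more videos on using both APIs: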

score 3 · Accepted Answer

This no longer works as of gdata 2.0.1.4:

gd_client.SetClientLoginToken(spreadsheets_client.GetClientLoginToken())

Instead, you have to do:

gd_client.SetClientLoginToken(gdata.gauth.ClientLoginToken(spreadsheets_client.GetClientLoginToken()))

score 2 · Accepted Answer

The following code works in my case (Ubuntu 10.4, python 2.6.5 gdata 2.0.14)

import gdata.docs.service
import gdata.spreadsheet.service
gd_client = gdata.docs.service.DocsService()
gd_client.ClientLogin(email,password)
spreadsheets_client = gdata.spreadsheet.service.SpreadsheetsService()
spreadsheets_client.ClientLogin(email,password)
#...
file_path = file_path.strip()+".xls"
docs_token = gd_client.auth_token
gd_client.SetClientLoginToken(spreadsheets_client.GetClientLoginToken())
gd_client.Export(entry, file_path)  
gd_client.auth_token = docs_token

score 2 · Accepted Answer

I wrote pygsheets as an alternative to gspread, but built on the Google API v4. It has an export method for exporting a spreadsheet.

import pygsheets

gc = pygsheets.authorize()

# Open spreadsheet and then worksheet
sh = gc.open('my new ssheet')
wks = sh.sheet1

#export as csv
wks.export(pygsheets.ExportType.CSV)

score 1 · Accepted Answer

I further simplified @Cameron's answer by removing the unnecessary object orientation. This makes the code smaller and easier to understand. I also edited the URL, which might work better.

#!/usr/bin/python
import re, urllib, urllib2

def get_auth_token(email, password):
    url = "https://www.google.com/accounts/ClientLogin"
    params = {
        "Email": email, "Passwd": password,
        "service": 'wise',
        "accountType": "HOSTED_OR_GOOGLE",
        "source": 'Client'
    }
    req = urllib2.Request(url, urllib.urlencode(params))
    return re.findall(r"Auth=(.*)", urllib2.urlopen(req).read())[0]

def download(spreadsheet, worksheet, email, password, format="csv"):
    # gid is passed as a query parameter; a "#gid=..." fragment would not reach the server
    url_format = 'https://docs.google.com/spreadsheets/d/%s/export?exportFormat=%s&gid=%s'

    headers = {
        "Authorization": "GoogleLogin auth=" + get_auth_token(email, password),
        "GData-Version": "3.0"
    }
    req = urllib2.Request(url_format % (spreadsheet, format, worksheet), headers=headers)
    return urllib2.urlopen(req)


if __name__ == "__main__":
    import getpass
    import csv

    spreadsheet_id = ""             # (spreadsheet id here)
    worksheet_id = ''               # (gid here)
    email = ""                      # (your email here)
    password = getpass.getpass()

    # Request a file-like object containing the spreadsheet's contents
    csv_file = download(spreadsheet_id, worksheet_id, email, password)

    # Parse as CSV and print the rows
    for row in csv.reader(csv_file):
        print ", ".join(row)
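
One detail of the export URL is worth checking: anything after `#` is a URL fragment, which HTTP clients are not supposed to send to the server, so a gid placed there cannot select a worksheet. A minimal sketch (Python 3) that puts the gid in the query string instead, assuming the `export?format=...&gid=...` form of the endpoint; the helper name `build_export_url` is hypothetical:

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_export_url(spreadsheet_id, gid=0, fmt='csv'):
    """Build an export URL with the format and gid as query parameters."""
    query = urlencode({'format': fmt, 'gid': gid})
    return 'https://docs.google.com/spreadsheets/d/%s/export?%s' % (
        spreadsheet_id, query)


if __name__ == '__main__':
    url = build_export_url('SPREADSHEET_ID', gid=123)
    parts = urlsplit(url)
    print(url)
    print(parse_qs(parts.query))   # the gid is part of the query string
    print(repr(parts.fragment))    # and the fragment is empty
```

Building the query with `urlencode` also takes care of escaping, which manual `%` formatting does not.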

score 1 · Accepted Answer

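I'm using this: curl 'https://docs.google.com/spreadsheets/d/1-lqLuYJyHAKix-T8NR8wV8ZUUbVOJrZTysccid2-ycs/gviz/tq?tqx=out:csv' on a sheet that is set to publicly readable.

So if you can work with public sheets, all you need is a Python version of curl.

If you have a sheet with some tabs you don't want to reveal, create a new sheet and import the ranges you want to publish into a tab on it.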
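
A "Python version of curl" for this case can be sketched with the standard library alone (Python 3). It assumes the sheet is publicly readable, as described above; `gviz_csv_url`, `parse_csv` and `fetch_public_sheet` are hypothetical helper names, and the URL pattern is the one from the curl command:

```python
import csv
import io
import urllib.request


def gviz_csv_url(key):
    """Build the CSV URL used in the curl command for spreadsheet `key`."""
    return 'https://docs.google.com/spreadsheets/d/%s/gviz/tq?tqx=out:csv' % key


def parse_csv(text):
    """Parse CSV text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text, newline='')))


def fetch_public_sheet(key, timeout=30):
    """Download a publicly readable sheet and return its rows."""
    with urllib.request.urlopen(gviz_csv_url(key), timeout=timeout) as resp:
        return parse_csv(resp.read().decode('utf-8'))
```

Calling `fetch_public_sheet(key)` with the spreadsheet key then gives rows that can be iterated directly, e.g. `for row in fetch_public_sheet(key): ...`.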

score 1 · Accepted Answer

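Downloading a spreadsheet from Google Docs is pretty simple using gsheets.

You can follow the detailed documentation at

https://pypi.org/project/gsheets/

or follow the steps below. I recommend reading through the documentation for better coverage.

pip install gsheets
Log into the Google Developers Console with the Google account whose spreadsheets you want to access. Create (or select) a project and enable the Drive API and Sheets API (under Google Apps APIs).
Go to the Credentials for your project and create New credentials > OAuth client ID > of type Other. In the list of your OAuth 2.0 client IDs, click Download JSON for the client ID you just created. Save the file as client_secrets.json in your home directory (user directory).
Use the following code snippet: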

    from gsheets import Sheets
    sheets = Sheets.from_files('~/client_secrets.json')  # the file saved in the previous step
    print(sheets) # will ensure authenticate connection
    
    s = sheets.get("{SPREADSHEET_URL}")
    print(s) # will ensure your file is accessible 
    
    s.sheets[1].to_csv('Spam.csv', encoding='utf-8', dialect='excel') # will download the file as csv

score 0 · Accepted Answer

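This isn't a complete answer, but Andreas Kahler wrote up an interesting CMS solution using Google Docs + Google App Engine + Python. Not having any experience in this area, I can't tell exactly which portion of the code may be of use to you, but check it out. I know it interfaces with a Google Docs account and works with files, so I have a feeling you'll recognize what's going on. It should at least point you in the right direction.

Google AppEngine + Google Docs + Some Python = Simple CMS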

score 0 · Accepted Answer

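Gspread is indeed a big improvement over GoogleCL and Gdata (both of which I've used and have thankfully phased out in favor of Gspread). I think this code is even quicker than the earlier answer for getting the contents of a sheet: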

import gspread

username = 'sdfsdfsds@gmail.com'
password = 'sdfsdfsadfsdw'
sheetname = "Sheety Sheet"

client = gspread.login(username, password)
spreadsheet = client.open(sheetname)

worksheet = spreadsheet.sheet1
contents = []
for rows in worksheet.get_all_values():
    contents.append(rows)

score 0 · Accepted Answer

(Mar 2019, Python 3) My data is usually not sensitive, and I usually use a table format similar to CSV.

In such a case, you can simply publish the sheet to the web and then use it as a CSV file on a server.

(Publish it using File -> Publish to the web ... -> Sheet 1 -> Comma separated values (.csv) -> Publish.)

import csv
import io
import requests

url = "https://docs.google.com/spreadsheets/d/e/<GOOGLE_ID>/pub?gid=0&single=true&output=csv"  # you can get the whole link in the 'Publish to the web' dialog
r = requests.get(url)
r.encoding = 'utf-8'
csvio = io.StringIO(r.text, newline="")
data = []
for row in csv.DictReader(csvio):
    data.append(row)
