Downloading Chrome Extension Files with Python

Mar 26, 2023

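For research purposes I need to download a large number of Chrome extensions, and I don't want to update them by hand.

At first I planned to study how Chrome itself downloads extensions from the Chrome Web Store, but then figured I can't be the only one who needs this.

So I searched GitHub and, sure enough, found plenty of downloaders written by all sorts of experts.

I based my own version on gcarq/inoxunpack and tonystark93/crx-download, adapting them to my needs.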

The extension download URL

According to the Stack Overflow question How to download a CRX file from the Chrome web store for a given ID?, building the download URL requires a few parameters:

  • extension id
  • os
  • nacl_arch
  • chrome version

    Extracting the extension id from the web store URL

    The id can be read off the extension's Chrome Web Store URL, for example

    https://chrome.google.com/webstore/detail/picture-in-picture-extens/hkgfoiooedgoejojocmhlaklaeopbecg?hl=zh-TW

    Here the id is the string hkgfoiooedgoejojocmhlaklaeopbecg.

    So I copied the regex that extracts the id from tonystark93/crx-download:

    ^https?:\/\/chrome.google.com\/webstore\/.+?\/([a-z]{32})(?=[\/#?]|$)

    Ported to Python:

    import re

    CHROME_URL_PATTERN = r"https?:\/\/chrome.google.com\/webstore\/.+?\/([a-z]{32})(?=[\/#?]|$)"

    def get_extension_id(extension_url: str) -> str:
        # re.search returns None when nothing matches, so check before calling .group()
        match = re.search(CHROME_URL_PATTERN, extension_url)
        if match is None:
            raise ValueError("Invalid extension url")
        return match.group(1)
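As a quick sanity check, the sketch below runs the same pattern against the example URL from earlier and against a URL that contains no id. It repeats the function so that it can run on its own; the second URL is made up purely for illustration.

```python
import re

CHROME_URL_PATTERN = r"https?:\/\/chrome.google.com\/webstore\/.+?\/([a-z]{32})(?=[\/#?]|$)"


def get_extension_id(extension_url: str) -> str:
    """Return the 32-letter extension id embedded in a web store URL."""
    match = re.search(CHROME_URL_PATTERN, extension_url)
    if match is None:
        raise ValueError("Invalid extension url")
    return match.group(1)


url = ("https://chrome.google.com/webstore/detail/"
       "picture-in-picture-extens/hkgfoiooedgoejojocmhlaklaeopbecg?hl=zh-TW")
print(get_extension_id(url))  # hkgfoiooedgoejojocmhlaklaeopbecg

try:
    # Hypothetical bad input: no 32-letter id after /webstore/
    get_extension_id("https://example.com/not-a-store-url")
except ValueError as err:
    print(err)  # Invalid extension url
```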

    With the extension_id in hand, it can be combined with parameters such as os, prodversion, and acceptformat to build the download URL:

    import urllib.parse

    extension_id = get_extension_id(extension_url)

    # Target URL shape:
    # https://clients2.google.com/service/update2/crx?response=redirect&prodversion=${version}&acceptformat=crx2,crx3&x=id%3D${result[1]}%26uc&nacl_arch=${nacl_arch}
    url = 'https://clients2.google.com/service/update2/crx'
    # _os and chromium_version must be defined beforehand (not shown here)
    params = {
        'response': 'redirect',
        'os': _os,
        'prodversion': chromium_version,
        'acceptformat': 'crx2,crx3',
        'nacl_arch': 'x86-64',
        # urlencode escapes "=" and "&", so this is sent as one "x" parameter
        'x': 'id={}&installsource=ondemand&uc'.format(extension_id),
    }
    full_url = f"{url}?{urllib.parse.urlencode(params)}"
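To see what the finished URL looks like, here is a minimal sketch with every input filled in. The defaults for os and prodversion (`linux` and `110.0.0.0`) are placeholders chosen for illustration, not values the update server is known to require, and `build_crx_url` is a helper name made up for this example.

```python
import urllib.parse

UPDATE_URL = "https://clients2.google.com/service/update2/crx"


def build_crx_url(extension_id: str,
                  os_name: str = "linux",          # assumed placeholder value
                  prodversion: str = "110.0.0.0",  # assumed placeholder value
                  nacl_arch: str = "x86-64") -> str:
    """Assemble the CRX download URL for one extension id."""
    params = {
        "response": "redirect",
        "os": os_name,
        "prodversion": prodversion,
        "acceptformat": "crx2,crx3",
        "nacl_arch": nacl_arch,
        # urlencode percent-encodes "=" and "&", so this whole string
        # travels as the single query parameter "x".
        "x": f"id={extension_id}&installsource=ondemand&uc",
    }
    return f"{UPDATE_URL}?{urllib.parse.urlencode(params)}"


print(build_crx_url("hkgfoiooedgoejojocmhlaklaeopbecg"))
```

In the printed URL the `x` value shows up as `id%3D...%26installsource%3Dondemand%26uc`, which matches the `id%3D...%26uc` shape in the URL template.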

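    Downloading and unpacking

    To keep dependencies to a minimum, I use Python's built-in urllib to download the extension file:

    import datetime
    import os
    import urllib.parse
    import urllib.request

    # download extension
    with urllib.request.urlopen(full_url) as response:
        data = response.read()
        print(f"Downloading file from {response.url}...")

        # after the redirect, the final URL should point at a .crx file
        if not urllib.parse.urlparse(response.url).path.endswith('.crx'):
            raise RuntimeError('Something went wrong during GET {}'.format(response.url))

    # timestamp without spaces or colons, so the file name is valid on every OS
    timestamp = datetime.datetime.now().strftime('%Y%m%d_%H%M%S')
    filename = f"{extension_name}_{timestamp}.crx"

    # write extension to the target directory
    crx_path = os.path.join(target_directory, filename)
    print(f"Saving file to {crx_path}...")
    with open(crx_path, 'wb') as fp:
        fp.write(data)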
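Reading the whole response into memory is not strictly necessary. The sketch below is a variation on the download step, not the original script: it streams the body to disk with `shutil.copyfileobj`. The function name `download_crx`, the 30-second timeout, and the timestamp format are all assumptions made for this example.

```python
import datetime
import os
import shutil
import urllib.parse
import urllib.request


def download_crx(full_url: str, target_directory: str, extension_name: str) -> str:
    """Stream the file behind full_url into target_directory and return its path."""
    with urllib.request.urlopen(full_url, timeout=30) as response:
        # After any redirects, response.url is the final location of the file.
        final_path = urllib.parse.urlparse(response.url).path
        if not final_path.endswith(".crx"):
            raise RuntimeError(f"Something went wrong during GET {response.url}")

        stamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
        crx_path = os.path.join(target_directory, f"{extension_name}_{stamp}.crx")
        with open(crx_path, "wb") as fp:
            shutil.copyfileobj(response, fp)  # copies the body in chunks
    return crx_path
```

The target directory has to exist before the call; the function does not create it.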

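    The downloaded file has a .crx extension, but it is essentially a ZIP archive (with an extra CRX header in front), so, following gcarq/inoxunpack, I unpack it with Python's built-in ZipFile: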

    inoxunpack/inoxunpack.py at master · gcarq/inoxunpack
    https://github.com/gcarq/inoxunpack/blob/master/inoxunpack.py#L74
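The linked inoxunpack code is not reproduced here, so the following is only a minimal sketch of the idea, not a copy of it. It relies on Python's `zipfile` being able to locate the archive even though a CRX file has extra header bytes in front of the ZIP data. The name `unpack_crx` and the paths in the usage comment are hypothetical.

```python
import zipfile


def unpack_crx(crx_path: str, target_directory: str) -> list:
    """Extract a .crx file into target_directory and return the member names."""
    # zipfile locates the central directory from the end of the file,
    # so the CRX header in front of the ZIP data does not get in the way.
    with zipfile.ZipFile(crx_path) as archive:
        archive.extractall(target_directory)
        return archive.namelist()


# Usage (hypothetical paths, for illustration only):
# names = unpack_crx("downloads/example.crx", "unpacked/example")
# print(names)
```

`extractall` creates the target directory if it does not exist yet, so each extension can go into its own folder.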

    References

    https://github.com/gcarq/inoxunpack

    https://github.com/tonystark93/crx-download

    How to download a CRX file from the Chrome web store for a given ID?
    https://stackoverflow.com/questions/7184793/how-to-download-a-crx-file-from-the-chrome-web-store-for-a-given-id
