Downloading Chrome Extension Files with Python
Mar 26, 2023
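For research purposes I need to download a large number of Chrome extensions, and I don't want to update them by hand.
My first thought was to study how Chrome itself downloads extensions from the Chrome Web Store,
but I am surely not the only one with this need.
So I searched GitHub, and sure enough there are plenty of downloaders written by all kinds of experts.
I based my own version on gcarq/inoxunpack and tonystark93/crx-download.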
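The extension download URL
From the Stack Overflow question "How to download a CRX file from the Chrome web store for a given ID?" I learned that building the download URL requires a few parameters.
Extracting the extension ID from the Web Store URL
The ID can be parsed out of the extension's Chrome Web Store URL: it is the 32-letter segment at the end of the path.
For example, hkgfoiooedgoejojocmhlaklaeopbecg is such an ID.
So I copied a regex from tonystark93/crx-download that extracts the ID: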
^https?:\/\/chrome.google.com\/webstore\/.+?\/([a-z]{32})(?=[\/#?]|$)
Ported to Python:
import re

CHROME_URL_PATTERN = r"https?:\/\/chrome.google.com\/webstore\/.+?\/([a-z]{32})(?=[\/#?]|$)"

def get_extension_id(extension_url: str) -> str:
    # re.search returns None when nothing matches, so check before calling .group()
    match = re.search(CHROME_URL_PATTERN, extension_url)
    if match is None:
        raise Exception("Invalid extension url")
    return match.group(1)
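As a quick check of the pattern, here is a minimal, self-contained sketch that runs it against a Web Store style URL. The `detail/some-extension` part of the URL is made up for illustration (only the 32-letter ID comes from this post), and `extract_id` is a hypothetical name, not a function from either referenced project.

```python
import re

# Same pattern as in the post; the dots are left unescaped as in the original regex.
CHROME_URL_PATTERN = r"https?:\/\/chrome.google.com\/webstore\/.+?\/([a-z]{32})(?=[\/#?]|$)"


def extract_id(extension_url: str) -> str:
    """Return the 32-letter extension ID, or raise ValueError if none is found."""
    match = re.search(CHROME_URL_PATTERN, extension_url)
    if match is None:
        raise ValueError(f"Invalid extension url: {extension_url}")
    return match.group(1)


# Hypothetical URL: the "detail/some-extension" slug is invented for this example.
sample_url = (
    "https://chrome.google.com/webstore/detail/some-extension/"
    "hkgfoiooedgoejojocmhlaklaeopbecg?hl=en"
)
sample_id = extract_id(sample_url)
print(sample_id)
```

The lookahead `(?=[\/#?]|$)` is what lets the match stop cleanly before a trailing query string or fragment.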
With the extension_id in hand, combine it with os, prodversion, acceptformat and the other parameters to build the download URL:
import urllib.parse

# _os and chromium_version must be set by the caller (target OS name and Chrome version string).
extension_id = get_extension_id(extension_url)
# https://clients2.google.com/service/update2/crx?response=redirect&prodversion=${version}&acceptformat=crx2,crx3&x=id%3D${result[1]}%26uc&nacl_arch=${nacl_arch}
url = 'https://clients2.google.com/service/update2/crx'
params = {
    'response': 'redirect',
    'os': _os,
    'prodversion': chromium_version,
    'acceptformat': 'crx2,crx3',
    'nacl_arch': 'x86-64',
    'x': 'id={}&installsource=ondemand&uc'.format(extension_id),
}
# urlencode percent-encodes the '=' and '&' inside the 'x' value
full_url = f"{url}?{urllib.parse.urlencode(params)}"
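The URL assembly can be wrapped in a small helper so it is easy to inspect what actually gets sent. This is only a sketch: `build_download_url` is a hypothetical name, and the default values for `os_name` and `prodversion` are placeholders I picked for the example, not values taken from the referenced projects.

```python
import urllib.parse

UPDATE_URL = "https://clients2.google.com/service/update2/crx"


def build_download_url(
    extension_id: str,
    os_name: str = "linux",      # assumption: placeholder value
    prodversion: str = "111.0",  # assumption: placeholder value
    nacl_arch: str = "x86-64",
) -> str:
    """Build the CRX download URL for the given extension ID."""
    params = {
        "response": "redirect",
        "os": os_name,
        "prodversion": prodversion,
        "acceptformat": "crx2,crx3",
        "nacl_arch": nacl_arch,
        # The whole "id=...&uc" string is a single value of the "x" parameter.
        "x": f"id={extension_id}&installsource=ondemand&uc",
    }
    return f"{UPDATE_URL}?{urllib.parse.urlencode(params)}"


url = build_download_url("hkgfoiooedgoejojocmhlaklaeopbecg")
# Decode the query string again to confirm the nested "x" value survived encoding.
query = urllib.parse.parse_qs(urllib.parse.urlparse(url).query)
print(query["x"][0])
```

Because `x` carries its own `=` and `&` characters, passing the parameters through `urlencode` rather than concatenating strings by hand keeps them from being mistaken for top-level query separators.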
Downloading and unpacking
To keep dependencies to a minimum, I use Python's built-in urllib to download the extension file:
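import datetime
import os
import urllib.parse
import urllib.request

# This fragment runs inside a function; full_url, extension_name and target_directory come from the surrounding code.
# download extension
with urllib.request.urlopen(full_url) as response:
    data = response.read()
    final_url = response.url  # URL after following the redirect
    print(f"Downloading file from {final_url}...")

# Assumption: a successful redirect ends at a .crx file; anything else means the request failed.
if not urllib.parse.urlparse(final_url).path.endswith('.crx'):
    raise RuntimeError('Something went wrong during GET {}'.format(final_url))

# write extension to temp dir
filename = f"{extension_name}_{datetime.datetime.now()}.crx"
crx_path = os.path.join(target_directory, filename)
print(f"Saving file to {crx_path}...")
with open(crx_path, 'wb') as fp:
    fp.write(data)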
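The downloaded file has a .crx extension, but it is essentially a zip archive (with a CRX header in front of the zip data), so, following gcarq/inoxunpack, I unpack it with Python's built-in ZipFile.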
inoxunpack/inoxunpack.py at master · gcarq/inoxunpack
https://github.com/gcarq/inoxunpack/blob/master/inoxunpack.py#L74
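The unpacking step can be sketched with the standard library alone. This is my own minimal version, not the code from inoxunpack; `unpack_crx` is a hypothetical name. It relies on `zipfile` locating the archive through the end-of-central-directory record at the end of the file, which is why the extra bytes at the start of a CRX file do not get in the way.

```python
import zipfile


def unpack_crx(crx_path: str, target_directory: str) -> list:
    """Extract a .crx file into target_directory and return the member names."""
    # zipfile reads the central directory from the end of the file and
    # adjusts offsets for any data prepended to the archive (the CRX header).
    with zipfile.ZipFile(crx_path) as archive:
        archive.extractall(target_directory)
        return archive.namelist()
```

Calling `unpack_crx(crx_path, some_output_dir)` after the download step should leave the extension's files, including `manifest.json`, under the output directory.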
References
https://github.com/gcarq/inoxunpack
https://github.com/tonystark93/crx-download
How to download a CRX file from the Chrome web store for a given ID?
https://stackoverflow.com/questions/7184793/how-to-download-a-crx-file-from-the-chrome-web-store-for-a-given-id