Uploading files can be painful, especially in Python. A large upload blocks a request for a long time, which hurts performance. Some frameworks are also slow at parsing multipart form data for files.
What if we could upload files from the browser to cloud storage directly? Wouldn't that be wonderful? There are lots of guides on how to do this with S3, but not with Google Cloud Storage.
Google does, however, document how to upload objects with HTML forms. The API is called the POST Object API.
Good luck. I hope you can understand the documentation. If not, here are a few tips on how to do it:
- We need a Google service account, which can be created on the Service accounts page
- We need to create a backend API that generates the form fields for uploading
- Then we use those form fields in the front end and submit the form to Google Cloud Storage
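As a rough idea of what the second step could hand to the front end, here is a minimal sketch. It assumes the backend already has the form fields as a list of single-key dicts (the shape used later in this post). The `build_upload_response` helper, the JSON layout and the example values are my own placeholders, not part of any Google API.

```python
import json

# The POST Object endpoint that the form is submitted to.
UPLOAD_URL = 'https://storage.googleapis.com'


def build_upload_response(fields):
    """Hypothetical helper: serialize form fields for the front end.

    The fields are flattened into [name, value] pairs so that their
    order survives the trip through JSON.
    """
    pairs = [[name, value] for field in fields for name, value in field.items()]
    return json.dumps({'url': UPLOAD_URL, 'fields': pairs})


# Placeholder values for illustration only.
example_fields = [{'key': 'images/cat.jpg'}, {'bucket': 'your-bucket'}]
response_body = build_upload_response(example_fields)
print(response_body)
```

The front end can then loop over the pairs, append each one to a form, and add the file last.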
The most difficult part is how to generate the form fields. Here is how:
import base64
import datetime
import json

# load the service account credentials downloaded from Google
with open('your-google-credential.json') as f:
    conf = json.load(f)

BUCKET = 'your-bucket'


def create_upload_fields(key, content_type, cache_control=None, acl='public-read'):
    # step 1. prepare form fields
    fields = [
        {'acl': acl},
        {'key': key},
        {'bucket': BUCKET},
        {'Content-Type': content_type},
    ]
    # you may add more fields
    if cache_control:
        fields.append({'Cache-Control': cache_control})

    # step 2. prepare policy json
    now = datetime.datetime.now(datetime.timezone.utc)
    # you may set a different expire time
    expires_at = now + datetime.timedelta(minutes=2)
    expiration = expires_at.strftime('%Y-%m-%dT%H:%M:%SZ')
    # every form field must be covered by a condition
    conditions = list(fields)
    # you may extend conditions here
    policy_json = {
        'expiration': expiration,
        'conditions': conditions,
    }

    # step 3. base64 policy
    policy_text = json.dumps(policy_json, separators=(',', ':'))
    policy = base64.b64encode(policy_text.encode('utf-8'))

    # step 4. sign policy (sign_policy is defined below)
    signature = base64.b64encode(sign_policy(policy))

    # step 5. return fields
    fields.extend([
        {'GoogleAccessId': conf['client_email']},
        {'policy': policy.decode('utf-8')},
        {'signature': signature.decode('utf-8')},
    ])
    return fields
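If the upload is rejected, it often helps to decode the policy field back into JSON and look at what was actually signed. The sketch below repeats steps 2 and 3 with made-up values and then reverses the base64 step. The `build_policy` and `decode_policy` helpers, the key and the bucket name are placeholders for illustration, and the signing step is left out on purpose.

```python
import base64
import datetime
import json


def build_policy(conditions, minutes=2):
    """Hypothetical helper: encode a policy document as in steps 2 and 3."""
    now = datetime.datetime.now(datetime.timezone.utc)
    expires_at = now + datetime.timedelta(minutes=minutes)
    policy_json = {
        'expiration': expires_at.strftime('%Y-%m-%dT%H:%M:%SZ'),
        'conditions': conditions,
    }
    policy_text = json.dumps(policy_json, separators=(',', ':'))
    return base64.b64encode(policy_text.encode('utf-8'))


def decode_policy(policy):
    """Reverse the base64 step so the policy can be inspected."""
    return json.loads(base64.b64decode(policy).decode('utf-8'))


# Placeholder values for illustration only.
example_conditions = [
    {'acl': 'public-read'},
    {'key': 'images/cat.jpg'},
    {'bucket': 'your-bucket'},
    {'Content-Type': 'image/jpeg'},
]
example_policy = build_policy(example_conditions)
decoded = decode_policy(example_policy)
print(json.dumps(decoded, indent=2))
```

The decoded conditions should match the values the browser submits in the form.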
The code above calls sign_policy, which is not implemented yet. We will define it as a separate function to keep the code readable. sign_policy signs with RSA, so we need to install a third-party library. My suggestion is cryptography:
pip install cryptography
You may run into trouble during installation. If so, check out the installation guide of cryptography.
Here is our code for sign_policy:
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding
from cryptography.hazmat.primitives.serialization import load_pem_private_key


def sign_policy(policy):
    # load the RSA private key from the service account credentials
    private_key = load_pem_private_key(
        conf['private_key'].encode('utf-8'),  # conf: check above code
        password=None,
        backend=default_backend(),
    )
    # sign the base64 policy with PKCS#1 v1.5 padding and SHA-256
    return private_key.sign(policy, padding.PKCS1v15(), hashes.SHA256())
This is an RS256 signature, that is, RSA with PKCS#1 v1.5 padding and SHA-256. You may find it hard to understand, and that is OK. Crypto is hard, but using the library is not: just use the sign_policy code.
With the form fields created by create_upload_fields, you can create an HTML form like this:
<form action="https://storage.googleapis.com" method="post" enctype="multipart/form-data">
  <input type="text" name="key" value="some-key-value">
  <input type="hidden" name="bucket" value="your-bucket">
  <input type="hidden" name="Content-Type" value="image/jpeg">
  <input type="hidden" name="acl" value="bucket-owner-read">
  <input type="hidden" name="GoogleAccessId" value="your-service-account-email">
  <input type="hidden" name="policy" value=".....">
  <input type="hidden" name="signature" value="....">
  <input name="file" type="file">
  <input type="submit" value="Upload">
</form>
Remember to put <input name="file" type="file"> at the end of the form.
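If you render the form on the server, the field list can be turned into HTML with a small loop. This is only a sketch: `render_upload_form` is a hypothetical helper, the example values are placeholders, and it assumes the fields are a list of single-key dicts like the one returned by create_upload_fields. Every value is escaped, and the file input is always written last.

```python
from html import escape

# The POST Object endpoint that the form is submitted to.
UPLOAD_URL = 'https://storage.googleapis.com'


def render_upload_form(fields, action=UPLOAD_URL):
    """Hypothetical helper: turn a list of single-key dicts into an HTML form."""
    lines = [
        '<form action="%s" method="post" enctype="multipart/form-data">'
        % escape(action, quote=True)
    ]
    for field in fields:
        for name, value in field.items():
            lines.append(
                '<input type="hidden" name="%s" value="%s">'
                % (escape(name, quote=True), escape(value, quote=True))
            )
    # The file input goes last, after every other field.
    lines.append('<input name="file" type="file">')
    lines.append('<input type="submit" value="Upload">')
    lines.append('</form>')
    return '\n'.join(lines)


# Placeholder values; real ones would come from create_upload_fields.
example_fields = [
    {'key': 'images/cat.jpg'},
    {'bucket': 'your-bucket'},
    {'policy': 'base64-policy'},
    {'signature': 'base64-signature'},
]
form_html = render_upload_form(example_fields)
print(form_html)
```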
You can add more fields and conditions to your form. Read the official guide to find out what they are.
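For example, policy documents also accept array-style conditions such as `starts-with` and `content-length-range`. The sketch below shows one way to add them to a condition list before it is encoded. `extend_conditions`, `key_prefix` and `max_bytes` are names I made up for illustration, so treat this as a starting point and check the official guide for the exact rules.

```python
import json


def extend_conditions(conditions, key_prefix=None, max_bytes=None):
    """Hypothetical helper: return a copy of conditions with extra rules."""
    extended = list(conditions)
    if key_prefix is not None:
        # Swap the exact-match rule on "key" for a prefix rule.
        extended = [
            c for c in extended
            if not (isinstance(c, dict) and 'key' in c)
        ]
        extended.append(['starts-with', '$key', key_prefix])
    if max_bytes is not None:
        # Limit the size of the uploaded file, in bytes.
        extended.append(['content-length-range', 0, max_bytes])
    return extended


# Placeholder values for illustration only.
base_conditions = [
    {'acl': 'public-read'},
    {'key': 'uploads/a.jpg'},
    {'bucket': 'your-bucket'},
]
extended = extend_conditions(
    base_conditions, key_prefix='uploads/', max_bytes=5 * 1024 * 1024
)
print(json.dumps(extended, separators=(',', ':')))
```

With a prefix rule, the key submitted in the form has to start with that prefix, which lets the user pick the file name while the backend controls the folder.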
Also, you may be interested in Multipart Upload to Google Cloud Storage with Authlib.
Need a more detailed guide? Support me on Patreon at https://www.patreon.com/lepture, and I'll create a video guide on uploading to Google Cloud Storage from a single page application (SPA).