How to read data in Cloud Storage using Cloud Functions - Python

I'm trying to write a Cloud Function in Python that reads JSON files containing table schemas from a directory in Cloud Storage, and from those schemas creates tables in BigQuery.
I have made a few attempts to access Cloud Storage, but without success. I previously developed something similar in Google Colab, reading these schemas from a directory on Drive, but here things seem quite different.
Can someone help me?

You can check the Streaming data from Cloud Storage into BigQuery using Cloud Functions solution guide in the GCP documentation.
If you'd like a different approach, you can refer to the downloading objects guide in the GCP docs to retrieve the data from GCS; see the sample code below.
from google.cloud import storage

def download_blob(bucket_name, source_blob_name, destination_file_name):
    """Downloads a blob from the bucket."""
    # bucket_name = "your-bucket-name"
    # source_blob_name = "storage-object-name"
    # destination_file_name = "local/path/to/file"
    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(source_blob_name)
    blob.download_to_filename(destination_file_name)
    print(
        "Blob {} downloaded to {}.".format(
            source_blob_name, destination_file_name
        )
    )
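In a Cloud Function the only writable path is /tmp, so a call might look like the following (the bucket and object names here are placeholders, not values from the question):

# Hypothetical call from inside a Cloud Function; /tmp is the writable scratch space.
download_blob("my-schemas-bucket", "schemas/customers.json", "/tmp/customers.json")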

You can also create a Cloud Function triggered by the bucket and read the uploaded file's data directly from Cloud Storage:
from google.cloud import storage

def loader(event, context):
    """Triggered by a change to a Cloud Storage bucket.
    Args:
        event (dict): Event payload.
        context (google.cloud.functions.Context): Metadata for the event.
    """
    try:
        file_name = event['name']
        bucket_name = event['bucket']
        client = storage.Client()
        bucket = client.get_bucket(bucket_name)
        file_blob = storage.Blob(file_name, bucket)
        data = file_blob.download_as_string().decode()
    except Exception as e:
        print(e)
Once you have the file contents, you can create the table in BigQuery.
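As a rough sketch of that last step, assuming each JSON schema file contains a list of field definitions with "name" and "type" keys (the file layout, project, dataset, and table names below are assumptions, not taken from the question):

import json
from google.cloud import bigquery

def create_table_from_schema(data, project_id, dataset_id, table_name):
    # `data` is the JSON string downloaded from Cloud Storage; assumed format:
    # [{"name": "id", "type": "INTEGER"}, {"name": "email", "type": "STRING"}, ...]
    fields = json.loads(data)
    schema = [
        bigquery.SchemaField(f["name"], f["type"], mode=f.get("mode", "NULLABLE"))
        for f in fields
    ]
    bq_client = bigquery.Client()
    table_id = "{}.{}.{}".format(project_id, dataset_id, table_name)
    table = bigquery.Table(table_id, schema=schema)
    bq_client.create_table(table)  # raises google.api_core.exceptions.Conflict if it already exists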

Related

How to create text file in Google Cloud Storage without saving it locally?

I know how to upload a string saved to a text file to Google Cloud Storage, using the upload_blob function below (source):
from google.cloud import storage

def upload_blob(bucket_name, source_file_name, destination_blob_name):
    """Uploads a file to the bucket."""
    # The ID of your GCS bucket
    # bucket_name = "your-bucket-name"
    # The path to your file to upload
    # source_file_name = "local/path/to/file"
    # The ID of your GCS object
    # destination_blob_name = "storage-object-name"
    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)
    blob.upload_from_filename(source_file_name)
I can create a file stored on local disk:
!touch localfile
!echo "contents of my file" > localfile
!cat localfile # outputs: contents of my file
Upload this file to Google Cloud Storage:
upload_blob('my-project','localfile','gcsfile')
It is indeed uploaded.
How can I create gcsfile in Google Cloud Storage containing the string contents of my file, without saving it first?
I tried:
import io
output = io.BytesIO()
output.write(b'First line.\n')
upload_blob('adventdalen-003',output,'out')
Doesn't work, I get:
TypeError: expected str, bytes or os.PathLike object, not _io.BytesIO
Similar but different threads:
How do I upload a base64 encoded image (string) directly to a Google Cloud Storage bucket using Node.js?
Upload JSON string to Google Cloud Storage without a file
Neither of these are in Python.
Using @johnhanley's suggestion, this is the code to implement blob.upload_from_string():
from google.cloud import storage

def write_to_blob(bucket_name, file_name):
    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(file_name)
    blob.upload_from_string("written in python")

write_to_blob(bucket_name="test-bucket", file_name="from_string.txt")
The object is saved in Google Cloud Storage, and from_string.txt contains the string "written in python".
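To address the original io.BytesIO attempt directly: upload_blob fails because upload_from_filename expects a file path, but you can hand the buffer (or its bytes) to the blob instead. A minimal sketch, reusing the same assumed bucket name:

from google.cloud import storage
import io

output = io.BytesIO()
output.write(b'First line.\n')

storage_client = storage.Client()
bucket = storage_client.bucket("test-bucket")  # assumed bucket name
blob = bucket.blob("out")
# upload_from_file reads from the current position, so rewind the buffer first
blob.upload_from_file(output, rewind=True, content_type="text/plain")
# equivalently: blob.upload_from_string(output.getvalue())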

Downloading Files from Google Cloud Storage to Remote Server

Requirement:
I need to execute a script from a remote server which downloads a particular file from Google Cloud Storage.
The script should use a service account whose key is stored in HashiCorp Vault.
I already have a HashiCorp Vault setup established, but I am not sure how to invoke it from a shell/Python script.
So if someone can help me with how to download a file from GCS using a service account key stored in HashiCorp Vault, either a Python or shell script would be fine.
You can follow the Google documentation to download an object using:
the Console
gsutil
a client library
curl + the REST API
For more information, visit this link.
For a Python utility you can use this code:
from google.cloud import storage

def download_blob(bucket_name, source_blob_name, destination_file_name):
    """Downloads a blob from the bucket."""
    # bucket_name = "your-bucket-name"
    # source_blob_name = "storage-object-name"
    # destination_file_name = "local/path/to/file"
    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    # Construct a client-side representation of a blob.
    # Note `Bucket.blob` differs from `Bucket.get_blob` as it doesn't retrieve
    # any content from Google Cloud Storage. As we don't need additional data,
    # using `Bucket.blob` is preferred here.
    blob = bucket.blob(source_blob_name)
    blob.download_to_filename(destination_file_name)
    print(
        "Blob {} downloaded to {}.".format(
            source_blob_name, destination_file_name
        )
    )
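On the Vault side, one possible approach is to store the service account's JSON key as a secret and fetch it at runtime with the hvac client, then build the storage client from that dict. A rough sketch, assuming a KV v2 secrets engine and that the whole key file is stored under a secret at gcp/storage-sa (the Vault address, token, secret path, and bucket/object names are all assumptions):

import hvac
from google.cloud import storage

# Assumed Vault connection details and secret path.
vault = hvac.Client(url="https://vault.example.com:8200", token="s.xxxxxxxx")
secret = vault.secrets.kv.v2.read_secret_version(path="gcp/storage-sa")
service_account_info = secret["data"]["data"]  # the service account JSON key as a dict

# Build a storage client directly from the in-memory key, no key file on disk.
client = storage.Client.from_service_account_info(service_account_info)
bucket = client.bucket("your-bucket-name")
bucket.blob("storage-object-name").download_to_filename("local/path/to/file")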

Unable to upload image to firebase storage from python server

I am creating an Android app which sends an image to a Python server, and from the Python server I want to upload the received image to Firebase Storage. My problem is that when I try to upload the received image from the Python server, only the filename is stored in the specified collection, but the image itself doesn't upload. My Firebase Storage looks like this:
Firebase Storage screenshot
In the attached screenshot, the first image, starting with Actual, is the one that I upload from Android, and the second, starting with processed, is the one that I am trying to upload from Python without success. The file type is also different from the one uploaded from Android. Below is the code that I am using to upload the image from my server.
Function where I receive the image from Android:
import flask

def handle_request():
    print(flask.request.files)
    print(flask.request.form)
    imagefile = flask.request.files['image']
    userID = flask.request.form['user_ID']
    filename = imagefile.filename  # not shown in the original snippet; presumably derived like this
    upload_to_firebase(imagefile, filename, userID)
Function which stores the image in Firebase Storage:
def upload_to_firebase(file, filename, userID):
    firebase = pyrebase.initialize_app(config)
    storage = firebase.storage()
    storage.child(userID + "/" + filename + "/Processed_" + filename).put(file)
    downloadURL = storage.child(userID + "/" + filename + "/Processed_" + filename).get_url(None)
Is there any way I can pass the content type image/jpeg while sending the image, or any other way to fix this? I have searched a lot for a solution but nothing has worked so far.
Notice that as explained on the Firebase documentation:
You can use the bucket references returned by the Admin SDK in conjunction with the official Google Cloud Storage client libraries to upload, download, and modify content in the buckets associated with your Firebase projects
Therefore you'll need to modify your upload_to_firebase function to something similar to what is explained in the relevant section of the Google Cloud Storage client library for Python documentation:
from google.cloud import storage

def upload_blob(bucket_name, source_file_name, destination_blob_name):
    """Uploads a file to the bucket."""
    # bucket_name = "your-bucket-name"
    # source_file_name = "local/path/to/file"
    # destination_blob_name = "storage-object-name"
    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)
    blob.upload_from_filename(source_file_name)
    print(
        "File {} uploaded to {}.".format(
            source_file_name, destination_blob_name
        )
    )
where you could obtain the bucket_name variable from the name property of the bucket object you define with the Firebase Admin SDK (e.g. if you are using the default bucket of your project):
import firebase_admin
from firebase_admin import credentials
from firebase_admin import storage

cred = credentials.Certificate('path/to/serviceAccountKey.json')
firebase_admin.initialize_app(cred, {
    'storageBucket': '<BUCKET_NAME>.appspot.com'
})
bucket = storage.bucket()
and the source_file_name will correspond to the full path of the uploaded image within the server serving your Flask application.
Notice that you could end up with disk space issues if the files uploaded to the Python server are not properly deleted or managed, so be careful in that regard.
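Alternatively, to address the content-type part of the question without writing the image to disk at all: the Cloud Storage blob can be fed the Flask file object directly and given an explicit content type. A minimal sketch, assuming the default bucket initialized with the Admin SDK snippet above (the object path simply mirrors the one in the question):

from firebase_admin import storage

def upload_to_firebase(file, filename, userID):
    # `file` is the werkzeug FileStorage object from flask.request.files['image']
    bucket = storage.bucket()  # default bucket configured in firebase_admin.initialize_app
    blob = bucket.blob(userID + "/" + filename + "/Processed_" + filename)
    blob.upload_from_file(file, content_type="image/jpeg")
    # public_url is only reachable if the object is publicly readable
    return blob.public_url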

Video Reading Problem from Google Cloud Bucket

I am trying to deploy my website to Google Cloud. However, I have a problem with video processing. My website takes a video from the user and then shows that video, or previously uploaded videos, back to the user. I can show the video on the template page, but I also need to process the video in the background, so I have to read it with OpenCV. My code works locally; however, on Google Cloud the video is stored behind a URL and OpenCV cannot read from that URL as expected. According to the sources, the solution is to download the video into the local file system:
from google.cloud import storage

def download_blob(bucket_name, source_blob_name, destination_file_name):
    """Downloads a blob from the bucket."""
    # bucket_name = "your-bucket-name"
    # source_blob_name = "storage-object-name"
    # destination_file_name = "local/path/to/file"
    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(source_blob_name)
    blob.download_to_filename(destination_file_name)
    print(
        "Blob {} downloaded to {}.".format(
            source_blob_name, destination_file_name
        )
    )
https://cloud.google.com/storage/docs/downloading-objects#code-samples
I have two problems with this code:
1. First, I do not want to download the video to my local computer. I need to keep the video in Google Cloud and read it with OpenCV from there.
2. When I try to run the above code, I still get an error because it cannot download the video into the "destination_file_name".
Could anyone help me with this problem?
Best.
Edit: I solved the problem with the help of the answers, thank you. I download the video file to the /tmp folder and then use it with OpenCV. Here is my function:
from google.cloud import storage

def download_blob(bucket_name, source_blob_name, destination_file_name):
    """Downloads a blob from the bucket."""
    # bucket_name = "your-bucket-name"
    # source_blob_name = "path/file in Google Cloud Storage bucket"
    # destination_file_name = "/tmp/path/to/file" (without extension, for example in my case ".mp4")
    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(source_blob_name)
    with open(destination_file_name, "wb") as file_obj:
        blob.download_to_file(file_obj)
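Once the file is in /tmp, OpenCV can open it like any local file. A minimal usage sketch (the bucket and object names are placeholders):

import cv2

download_blob("my-bucket", "videos/input.mp4", "/tmp/input.mp4")
cap = cv2.VideoCapture("/tmp/input.mp4")
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # process each frame here
cap.release()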
This is a duplicate question. You cannot write to the production server in the cloud. From: https://cloud.google.com/appengine/docs/standard/php/runtime#filesystem
An App Engine application cannot:
write to the filesystem. Applications can use Google Cloud Storage for
storing persistent files. Reading from the filesystem is allowed, and
all application files uploaded with the application are available.
You want to use Google Cloud Storage to upload your videos.
You can write to the /tmp directory temporarily, but that will not persist. But it may work for your need:
# destination_file_name = "/tmp/path/to/file"

Google Cloud Storage Python list_blob() not printing object list

I am a Python and Google Cloud Storage newbie.
I am writing a Python script to get a file list from a Google Cloud Storage bucket using the Google Cloud Python Client Library, and the list_blobs() function from the Bucket class is not working as I expected.
https://googlecloudplatform.github.io/google-cloud-python/stable/storage-buckets.html
Here is my python code:
from google.cloud import storage
from google.cloud.storage import Blob
client = storage.Client.from_service_account_json(service_account_json_key_path, project_id)
bucket = client.get_bucket(bucket_id)
print(bucket.list_blobs())
If I understood the documentation correctly, print(bucket.list_blobs()) should print something like this: ['testfile.txt', 'testfile2.txt'].
However, my script printed this:
"google.cloud.storage.bucket._BlobIterator object at 0x7fdfdbcac590"
The delete_blob() documentation has example code that is the same as mine.
https://googlecloudplatform.github.io/google-cloud-python/stable/storage-buckets.html
I am not sure what I am doing wrong here.
Any pointers/examples/answers will be greatly appreciated. Thanks!
An example list function:
def list_blobs(bucket_name):
    """Lists all the blobs in the bucket."""
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blobs = bucket.list_blobs()
    for blob in blobs:
        print(blob.name)
What do you see if you:
for blob in bucket.list_blobs():
    print(blob)
    print(blob.name)
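The key point is that list_blobs() returns a lazy iterator of Blob objects, not a list of names, which is why printing it shows the iterator object; you have to iterate over it (or materialize it) to see the blobs. For example, to get the plain list of names the question expected:

blob_names = [blob.name for blob in bucket.list_blobs()]
print(blob_names)  # e.g. ['testfile.txt', 'testfile2.txt']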
