Downloading Files from Google Cloud Storage to Remote Server - python

Requirement:
I need to execute a script from a remote server that downloads a particular file from Google Cloud Storage. The script should use a service account whose key is stored in HashiCorp Vault. I already have a HashiCorp Vault setup established, but I am not sure how to invoke it from a shell/Python script. Could someone help me download a file from GCS using a service account key kept in HashiCorp Vault? Either a Python or a shell script would be fine.

You can follow the Google documentation to download an object using:
Console
gsutil
a programming-language client library
curl + the REST API
For more information, visit this link.
For a Python utility you can use this code:
from google.cloud import storage

def download_blob(bucket_name, source_blob_name, destination_file_name):
    """Downloads a blob from the bucket."""
    # bucket_name = "your-bucket-name"
    # source_blob_name = "storage-object-name"
    # destination_file_name = "local/path/to/file"

    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)

    # Construct a client-side representation of a blob.
    # Note `Bucket.blob` differs from `Bucket.get_blob` as it doesn't retrieve
    # any content from Google Cloud Storage. As we don't need additional data,
    # using `Bucket.blob` is preferred here.
    blob = bucket.blob(source_blob_name)
    blob.download_to_filename(destination_file_name)

    print(
        "Blob {} downloaded to {}.".format(
            source_blob_name, destination_file_name
        )
    )
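To address the Vault part of the question, here is a minimal sketch of reading the service-account key from HashiCorp Vault's KV v2 HTTP API with the standard library and handing it to the storage client. The Vault address, the secret path, and the "key" field name are assumptions about your Vault layout, so adjust them to match your setup.

```python
import json
import urllib.request

def extract_key(payload, field="key"):
    """Pull the stored field out of a KV v2 response (data is nested twice)."""
    return payload["data"]["data"][field]

def fetch_sa_key(vault_addr, vault_token, secret_path):
    """Read a service-account JSON key from Vault,
    e.g. secret_path = "secret/data/gcp/sa-key" (assumed path)."""
    req = urllib.request.Request(
        "{}/v1/{}".format(vault_addr, secret_path),
        headers={"X-Vault-Token": vault_token},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_key(json.load(resp))

def download_with_vault_key(vault_addr, vault_token, secret_path,
                            bucket_name, source_blob_name, destination_file_name):
    from google.cloud import storage  # pip install google-cloud-storage
    sa_info = json.loads(fetch_sa_key(vault_addr, vault_token, secret_path))
    client = storage.Client.from_service_account_info(sa_info)
    client.bucket(bucket_name).blob(source_blob_name).download_to_filename(
        destination_file_name)
```

From a shell, the equivalent would be reading the key with `vault kv get -field=key secret/gcp/sa-key`, writing it to a temporary file, and pointing GOOGLE_APPLICATION_CREDENTIALS at that file before calling gsutil.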

Related

Unable to upload image to firebase storage from python server

I am creating an Android app which sends an image to a Python server, and from my Python server I want to upload that received image to Firebase Storage. My problem is that when I try to upload the received image from the Python server, only the filename is stored in the specified collection, but my image doesn't upload. My Firebase storage looks like this:
Firebase storage screenshot
In the attached screenshot, the first image, starting with "Actual", is the one that I upload from Android, and the second, starting with "Processed", is the one that I am trying to upload from Python but am unable to. The file type is also different from the one uploaded from Android. Below is the code that I am using to upload the image from my server.
Function where I receive the image from Android:
def handle_request():
    print(flask.request.files)
    print(flask.request.form)
    imagefile = flask.request.files['image']
    filename = imagefile.filename  # take the name from the uploaded file
    userID = flask.request.form['user_ID']
    upload_to_firebase(imagefile, filename, userID)
Function which stores the image to Firebase Storage:
def upload_to_firebase(file, filename, userID):
    firebase = pyrebase.initialize_app(config)
    storage = firebase.storage()
    storage.child(userID + "/" + filename + "/Processed_" + filename).put(file)
    downloadURL = storage.child(userID + "/" + filename + "/Processed_" + filename).get_url(None)
Is there any way I can pass the content type image/jpeg while sending the image, or any other way to fix this? I have searched a lot for a solution, but none has worked so far.
Notice that, as explained in the Firebase documentation:
You can use the bucket references returned by the Admin SDK in conjunction with the official Google Cloud Storage client libraries to upload, download, and modify content in the buckets associated with your Firebase projects.
Therefore you'll need to modify your upload_to_firebase function into something similar to what is explained in the relevant section of the Google Cloud Storage client library for Python:
from google.cloud import storage

def upload_blob(bucket_name, source_file_name, destination_blob_name):
    """Uploads a file to the bucket."""
    # bucket_name = "your-bucket-name"
    # source_file_name = "local/path/to/file"
    # destination_blob_name = "storage-object-name"

    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)
    blob.upload_from_filename(source_file_name)

    print(
        "File {} uploaded to {}.".format(
            source_file_name, destination_blob_name
        )
    )
where you can obtain the bucket_name variable from the name property of the bucket object you define with the Firebase Admin SDK (e.g. if you are using the default bucket of your project):
import firebase_admin
from firebase_admin import credentials
from firebase_admin import storage

cred = credentials.Certificate('path/to/serviceAccountKey.json')
firebase_admin.initialize_app(cred, {
    'storageBucket': '<BUCKET_NAME>.appspot.com'
})

bucket = storage.bucket()
and the source_file_name will correspond to the full path of the uploaded image on the server running your Flask application.
Notice that you could end up with disk space issues if you do not properly delete or manage the files uploaded to the Python server, so be careful in that regard.
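As for forcing image/jpeg, the Cloud Storage client's Blob.upload_from_file accepts a content_type argument, so the Flask file object can be uploaded directly with an explicit MIME type. A sketch under the assumption that bucket is the storage.bucket() object from the Firebase Admin SDK snippet above (the guess_content_type helper and the path layout are illustrative):

```python
import mimetypes

def guess_content_type(filename, default="application/octet-stream"):
    """Infer a MIME type such as image/jpeg from the file name."""
    return mimetypes.guess_type(filename)[0] or default

def upload_image(bucket, user_id, filename, file_obj):
    """Upload a Flask FileStorage object with an explicit content type."""
    blob = bucket.blob("{}/{}/Processed_{}".format(user_id, filename, filename))
    # Passing content_type makes the object show up as image/jpeg
    # in Firebase Storage instead of the default binary type.
    blob.upload_from_file(file_obj, content_type=guess_content_type(filename))
    return blob
```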

Video Reading Problem from Google Cloud Bucket

I am trying to deploy my website to Google Cloud. However, I have a problem with video processing. My website takes a video from the user and then shows that video, or previously uploaded videos, to the user. I can show the video on the template page, and I also need to process that video in the background, so I have to read the corresponding video with OpenCV. My code works locally. However, in the Google Cloud deployment the video is stored behind a URL, and OpenCV cannot read from a URL as expected. According to the sources, the solution is to download the video to the local filesystem:
from google.cloud import storage

def download_blob(bucket_name, source_blob_name, destination_file_name):
    """Downloads a blob from the bucket."""
    # bucket_name = "your-bucket-name"
    # source_blob_name = "storage-object-name"
    # destination_file_name = "local/path/to/file"
    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(source_blob_name)
    blob.download_to_filename(destination_file_name)
    print(
        "Blob {} downloaded to {}.".format(
            source_blob_name, destination_file_name
        )
    )
https://cloud.google.com/storage/docs/downloading-objects#code-samples
I have two problems with this code:
1. First, I do not want to download the video onto my local computer. I need to keep the video in Google Cloud, and I have to read the video with OpenCV from there.
2. When I try to run the above code, I still get an error because it cannot download the video to the destination_file_name.
Could anyone help me with this problem?
Best.
Edit: I solved the problem with the help of the answers, thank you. I download the video file to the /tmp folder and then use it with OpenCV. Here is my function:
from google.cloud import storage

def download_blob(bucket_name, source_blob_name, destination_file_name):
    """Downloads a blob from the bucket."""
    # bucket_name = "your-bucket-name"
    # source_blob_name = "path/to/file in the Google Cloud Storage bucket"
    # destination_file_name = "/tmp/path/to/file" (in my case with the ".mp4" extension)
    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(source_blob_name)
    with open(destination_file_name, "wb") as file_obj:
        blob.download_to_file(file_obj)
This is a duplicate question. You cannot write to the production server's filesystem in the cloud. From https://cloud.google.com/appengine/docs/standard/php/runtime#filesystem:
An App Engine application cannot write to the filesystem. Applications can use Google Cloud Storage for storing persistent files. Reading from the filesystem is allowed, and all application files uploaded with the application are available.
You want to use Google Cloud Storage to store your videos. You can write to the /tmp directory temporarily, but that will not persist. Still, it may work for your need:
# destination_file_name = "/tmp/path/to/file"
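Putting the two pieces together, a sketch of downloading the blob to /tmp and opening it with OpenCV; the tmp_destination helper and process_video function are illustrative names, and cv2 assumes the opencv-python package is installed:

```python
import os
import tempfile

def tmp_destination(source_blob_name):
    """Build a writable path under /tmp (the only writable location on App Engine)."""
    return os.path.join(tempfile.gettempdir(), os.path.basename(source_blob_name))

def process_video(bucket, source_blob_name):
    """Download a video blob to /tmp, read it with OpenCV, then clean up."""
    local_path = tmp_destination(source_blob_name)
    bucket.blob(source_blob_name).download_to_filename(local_path)
    import cv2  # pip install opencv-python
    cap = cv2.VideoCapture(local_path)
    try:
        ok, frame = cap.read()  # process frames here
    finally:
        cap.release()
        os.remove(local_path)  # /tmp is backed by memory, so free it when done
```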

How to read data in Cloud Storage using Cloud Functions

I'm trying to write a Cloud Function in Python which reads JSON files containing table schemas from a directory in Cloud Storage, and from these schemas I need to create tables in BigQuery.
I made some attempts to access Cloud Storage, but without success. I previously developed something similar in Google Colab, reading these schemas from a directory on Drive, but now things seem quite different.
Can someone help me?
You can check the Streaming data from Cloud Storage into BigQuery using Cloud Functions solution guide from GCP.
If you'd like a different approach, you can refer to the downloading objects guide in the GCP docs to retrieve the data from GCS; see the sample code below.
from google.cloud import storage

def download_blob(bucket_name, source_blob_name, destination_file_name):
    """Downloads a blob from the bucket."""
    # bucket_name = "your-bucket-name"
    # source_blob_name = "storage-object-name"
    # destination_file_name = "local/path/to/file"
    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(source_blob_name)
    blob.download_to_filename(destination_file_name)
    print(
        "Blob {} downloaded to {}.".format(
            source_blob_name, destination_file_name
        )
    )
You can create a Cloud Function and read the downloaded data from the file in Cloud Storage:
from google.cloud import storage

def loader(event, context):
    """Triggered by a change to a Cloud Storage bucket.
    Args:
        event (dict): Event payload.
        context (google.cloud.functions.Context): Metadata for the event.
    """
    file_name = event['name']
    bucket_name = event['bucket']
    client = storage.Client()
    bucket = client.get_bucket(bucket_name)
    file_blob = storage.Blob(file_name, bucket)
    data = file_blob.download_as_string().decode()
Once you get the data, you can create a table in BigQuery.
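For that final step, here is a sketch of turning a downloaded schema file into a BigQuery table with the google-cloud-bigquery library. The schema layout ([{"name": ..., "type": ...}, ...]) and the project/dataset/table names are assumptions about your files:

```python
import json

def parse_schema(schema_json):
    """Parse a schema document like '[{"name": "id", "type": "INTEGER"}]'
    into (name, type, mode) tuples, defaulting mode to NULLABLE."""
    return [(f["name"], f["type"], f.get("mode", "NULLABLE"))
            for f in json.loads(schema_json)]

def create_table_from_schema(project, dataset, table, schema_json):
    from google.cloud import bigquery  # pip install google-cloud-bigquery
    client = bigquery.Client(project=project)
    schema = [bigquery.SchemaField(name, type_, mode=mode)
              for name, type_, mode in parse_schema(schema_json)]
    table_ref = bigquery.Table("{}.{}.{}".format(project, dataset, table),
                               schema=schema)
    return client.create_table(table_ref)
```

You would call create_table_from_schema from the loader function above, passing the data string downloaded from the bucket as schema_json.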

Using files in google cloud storage

I want to move the ETL scripts that I used to run on a local machine to Google Cloud, using Cloud Functions + Cloud Scheduler.
The scripts move data from/to Google Analytics (as an example).
I have a problem with the location of the key file (.p12). I would like to put it in Google Cloud Storage and point the script to it.
Currently KEY_FILE_LOCATION = r'c:/local_path/file.p12'.
Connect to Google Analytics:
def initialize_analyticsreporting():
    credentials = ServiceAccountCredentials.from_p12_keyfile(
        SERVICE_ACCOUNT_EMAIL, KEY_FILE_LOCATION, scopes=SCOPES)
    http = credentials.authorize(httplib2.Http())
    analytics = build('analytics', 'v4', http=http, discoveryServiceUrl=DISCOVERY_URI)
    return analytics
I would like to use
from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket('utk_owox')
.....
but I cannot understand how to do it correctly.
Please be aware that Cloud Functions client libraries that use Application Default Credentials automatically obtain the built-in service account credentials from the Cloud Functions host at runtime. By default, the client authenticates as the YOUR_PROJECT_ID@appspot.gserviceaccount.com service account.
So if you can configure this service account to have the appropriate permissions for the resources that you need to access, it is recommended to rely on it.
Example of how to upload a file to the bucket in Python.
Example of how to download a file from the bucket in Python:
def download_blob(bucket_name, source_blob_name, destination_file_name):
    """Downloads a blob from the bucket."""
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(source_blob_name)
    blob.download_to_filename(destination_file_name)
    print('Blob {} downloaded to {}.'.format(
        source_blob_name,
        destination_file_name))
Create credentials from the file as described here:
def get_ga_service(bucket):
    download_gcs_file('auth.json', '/tmp/auth.json', bucket)
    credentials = ServiceAccountCredentials.from_json_keyfile_name(
        '/tmp/auth.json',
        scopes=['https://www.googleapis.com/auth/analytics',
                'https://www.googleapis.com/auth/analytics.edit'])
    # Build the service object.
    return build('analytics', 'v3', credentials=credentials, cache_discovery=False)
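Note that oauth2client (the source of ServiceAccountCredentials) is deprecated; a sketch of the same flow with the maintained google-auth library, under the assumption that the JSON key has already been downloaded to /tmp/auth.json as above:

```python
def analytics_scopes():
    """Full OAuth scope URLs for the Analytics API."""
    base = "https://www.googleapis.com/auth/"
    return [base + s for s in ("analytics", "analytics.edit")]

def build_analytics_service(key_path="/tmp/auth.json"):
    from google.oauth2 import service_account    # pip install google-auth
    from googleapiclient.discovery import build  # pip install google-api-python-client
    credentials = service_account.Credentials.from_service_account_file(
        key_path, scopes=analytics_scopes())
    return build('analytics', 'v3', credentials=credentials, cache_discovery=False)
```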

download_to_filename makes an empty file (google cloud storage)

Answer:
I needed to grant ownership access to the storage policy, and not just grant general API ownership to the IAM service account connected to my instance.
I have simply followed the tutorial on the website here: https://cloud.google.com/storage/docs/downloading-objects#storage-download-object-python
which says:
def download_blob(bucket_name, source_blob_name, destination_file_name):
    """Downloads a blob from the bucket."""
    storage_client = storage.Client()
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(source_blob_name)
    blob.download_to_filename(destination_file_name)
    print('Blob {} downloaded to {}.'.format(
        source_blob_name,
        destination_file_name))

download_blob([BUCKETNAME], [FILENAME], "/home/me/documents/file.png")
I don't receive an error, but the last line to be executed is blob.download_to_filename(destination_file_name), which creates an empty file.
Additional info:
My bucket is in the format "mybucketname", and my file is in the format "sdg-1234-fggr-34234.png".
I hope someone has knowledge about my issue. Is it encoding? Does download_to_filename not execute? Or is it something else?
When working with APIs it's important to grant ownership access to the storage policy, and not just grant general API ownership to the IAM service account connected to an instance.
