Automating running Python code using Azure services

Hi everyone on Stack Overflow,
I wrote two Python scripts. One script picks up local files and sends them to GCS (Google Cloud Storage). The other does the opposite: it takes the uploaded files from GCS and saves them locally.
I want to automate this process using Azure.
What would you recommend using: Azure Function App, Azure Logic App, or other services?
I'm now trying to use a Logic App. I made an .exe file using PyInstaller and am looking for a connector in the Logic App that will run my program (the .exe file). I have a trigger in the Logic App, "When a file is added or modified", but now I'm stuck when selecting the next step (connector).
Kind regards,
Anna
Adding code as requested:
from google.cloud import storage
import os
import glob
import json

# Finding path to config file that is called "gcs_config.json" in directory C:/
def find_config(name, path):
    for root, dirs, files in os.walk(path):
        if name in files:
            return os.path.join(root, name)

def upload_files(config_file):
    # Reading 3 parameters for upload from JSON file
    with open(config_file, "r") as file:
        contents = json.loads(file.read())
    print(contents)
    # Setting up login credentials
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = contents['login_credentials']
    # The ID of the GCS bucket
    bucket_name = contents['bucket_name']
    # Setting path to files
    LOCAL_PATH = contents['folder_from']
    for source_file_name in glob.glob(LOCAL_PATH + '/**'):
        # For multiple-file upload:
        # setting destination folder according to file name
        if os.path.isfile(source_file_name):
            partitioned_file_name = os.path.split(source_file_name)[-1].partition("-")
            file_type_name = partitioned_file_name[0]
            # Setting folder where files will be uploaded
            destination_blob_name = file_type_name + "/" + os.path.split(source_file_name)[-1]
            # Setting up required variables for GCS
            storage_client = storage.Client()
            bucket = storage_client.bucket(bucket_name)
            blob = bucket.blob(destination_blob_name)
            # Running upload and printing confirmation message
            blob.upload_from_filename(source_file_name)
            print("File from {} uploaded to {} in bucket {}.".format(
                source_file_name, destination_blob_name, bucket_name
            ))

config_file = find_config("gcs_config.json", "C:/")
upload_files(config_file)
gcs_config.json:
{
    "login_credentials": "C:/Users/AS/Downloads/bright-velocity-___-53840b2f9bb4.json",
    "bucket_name": "staging.bright-velocity-___.appspot.com",
    "folder_from": "C:/Users/AS/Documents/Test2/",
    "folder_for_downloaded_files": "C:/Users/AnnaShepilova/Documents/DownloadedFromGCS2/",
    "given_date": "",
    "given_prefix": ["Customer", "Account"]
}

Currently, there is no built-in connector in Logic Apps for interacting with Google Cloud services. However, Google Cloud Storage does provide a REST API that you can call from your Logic App or Function App.
But my suggestion is to use an Azure Function, because a Function is more flexible for writing your own flow to do the task.
There are references on how to run your .exe file in an Azure Function, whether you are using a local EXE or one in a cloud environment. Refer here for more information.
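To illustrate the Function approach: below is a minimal sketch, assuming the v2 Python programming model for Azure Functions and assuming the incoming files land in a hypothetical Azure Blob container named "uploads" (the container name and the forwarding logic are assumptions, not part of the original scripts). The blob trigger plays the same role as the Logic App trigger "When a file is added or modified":

# function_app.py - minimal sketch (Azure Functions, Python v2 model)
import logging
import azure.functions as func

app = func.FunctionApp()

# Fires whenever a blob is added or modified in the "uploads" container
# (hypothetical name); "AzureWebJobsStorage" is the app's storage connection.
@app.blob_trigger(arg_name="newfile",
                  path="uploads/{name}",
                  connection="AzureWebJobsStorage")
def forward_to_gcs(newfile: func.InputStream):
    logging.info("Triggered by blob: %s (%s bytes)", newfile.name, newfile.length)
    # Here you would call your existing upload logic directly, e.g. read
    # newfile.read() and send it to GCS with google-cloud-storage,
    # instead of shelling out to a PyInstaller .exe.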

Related

python delete files from google cloud storage that starts with

Using Python I am able to delete files from the bucket using prefixes too, but in the Python code a prefix means a directory.
I want to delete the files from the GCP bucket whose names start with example.
For example:
example-2022-12-07
example-2022-12-08
I followed this (Delete Files from Google Cloud Storage) but did not get the answer.
I am trying this, but it is not working:
blobs = bucket.list_blobs()
fileList = [file.name for file in blobs if 'example' in file.name]
print(fileList)
for file in fileList:
    blob = blobs.blob(file)
    blob.delete()
    print(f"Blob {blob_name} deleted.")
You can try the following code to delete files from Google Cloud Storage by using the blob.delete method, as suggested in the documentation. Note that prefix is matched against the beginning of the object name, not a directory, so prefix="example" matches both example-2022-12-07 and example-2022-12-08.
Below is an example of what you are looking for:

from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket(bucket_name)

# List all objects whose names start with the given prefix
blobs = bucket.list_blobs(prefix="example")
for blob in blobs:
    blob.delete()
    print(f"Blob {blob.name} deleted.")
You can also check thread1 and thread2.

get list of all folders in firebase storage using python

I'm using a Django app with firebase-admin, and I have files and subfolders (nested folders), as shown in the image.
Each folder has its own files and folders.
I want to get a list of all folders and files inside each root folder.
My code is:
import firebase_admin
from firebase_admin import credentials, storage

service_account_key = 'mysak.json'
cred = firebase_admin.credentials.Certificate(service_account_key)
default_app = firebase_admin.initialize_app(cred, {
    'storageBucket': 'myBucketUrl'
})
bucket = storage.bucket()
blob = list(bucket.list_blobs())  # this returns all objects and files in storage, not just the folder I want
For example, I want all files in first_stage/math so I can get a URL for each file.
I have also read the docs about Firebase storage and there is no such method.
Based on the documentation (List objects | Cloud Storage | Google Cloud), you can do something like:
bucket.list_blobs(prefix="first_stage/math")
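If you also want the subfolder names and a URL per file, here is a minimal sketch building on the answer above. It assumes the bucket handle from the question's firebase-admin setup and the first_stage/math path from the question; the algebra/ subfolder in the comment is purely hypothetical. The delimiter parameter makes list_blobs behave like a directory listing:

# Sketch: list "files" and "subfolders" under first_stage/math.
# iterator.prefixes is only populated after the blobs have been
# consumed, hence the list(...) call first.
iterator = bucket.list_blobs(prefix="first_stage/math/", delimiter="/")
files = list(iterator)          # blobs directly under the prefix
subfolders = iterator.prefixes  # e.g. {"first_stage/math/algebra/"} (hypothetical)
for blob in files:
    print(blob.name, blob.public_url)  # public_url gives a URL per file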

FileNotFoundError when trying to access a file in google cloud that exists inside a bucket storage

I get a FileNotFoundError when trying to read/access a file or a folder that exists in a bucket in Google Cloud by referencing gs://BUCKET_NAME/FolderName/.
I am using Python 3 as the kernel in a Jupyter notebook. I have a cluster configured in Google Cloud, linked to a bucket. Whenever I try to read/upload a file I get the file-not-found error:
from os import listdir
from os.path import isfile, join

def get_files(bucketName):
    files = [f for f in listdir(localFolder) if
             isfile(join(localFolder, f))]
    for file in files:
        print("file path:", file)

get_files("agriculture-bucket-gl")
I should be able to access the folder contents or to reference any file that exists inside any folder in the bucket.
Error Message:
FileNotFoundError: [Errno 2] No such file or directory: 'gs://agriculture-bucket-gl/Data sets/'
You need to access the bucket using the storage library to get the file, and then get its content.
You may find this code template helpful:
from google.cloud import storage
# Instantiates a client
client = storage.Client()
bucket_name = 'your_bucket_name'
bucket = client.get_bucket(bucket_name)
blob = bucket.get_blob('route/to/file.txt')
downloaded_blob = blob.download_as_string()
print(downloaded_blob)
To add to the previous answers, the path in the error message, FileNotFoundError: [Errno 2] No such file or directory: 'gs://agriculture-bucket-gl/Data sets/', also contains some issues. I'd try fixing the following:
- The folder name "Data sets" has a space. I'd try a name without the space.
- There is a / at the end of the path. The path should end without a slash.
If you want to access it from storage:
from google.cloud import storage

bucket_name = 'your_bucket_name'
blob_path = 'storage/path/fileThatYouWantToAccess'
storage_client = storage.Client()
bucket = storage_client.get_bucket(bucket_name)
blob = bucket.blob(blob_path)
# This is optional, if you want to download it to the tmp folder
blob.download_to_filename('/tmp/fileThatYouWantToAccess')

Create Folders inside google cloud storage bucket using python

I am trying to create a new bucket with 2 empty folders inside it on Google Cloud Storage using the Python client library.
I referred to the Python client library API for GCS (https://google-cloud-python.readthedocs.io/en/latest/storage/client.html) and found a create_bucket() method, but I would also like to create two folders, 'processed' and 'unprocessed', within the bucket, and I am not able to find a method to create folders. Any help would be appreciated.
Thanks
GCS has a flat namespace, i.e., the concept of a 'folder' is not built into the service but is rather an abstraction implemented by various clients. For example, both the Cloud Storage web UI (console.cloud.google.com/storage/browser) and gsutil implement the folder abstraction using an object name that ends with "/".
Thus, you could create folders by creating objects like your-bucket/abc/def/, but that would only be a folder to clients that know about/support that naming convention.
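Here is a minimal sketch of that convention for the two folders from the question. It assumes a hypothetical bucket name, my-new-bucket; the zero-byte objects ending in "/" are what the web UI and gsutil display as empty folders:

from google.cloud import storage

client = storage.Client()
# Create the bucket ("my-new-bucket" is a hypothetical name)
bucket = client.create_bucket("my-new-bucket")
# Upload zero-byte placeholder objects whose names end in "/";
# folder-aware clients will show these as empty folders.
for folder in ("processed/", "unprocessed/"):
    bucket.blob(folder).upload_from_string("")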
def copyFilesInFolder(self, file_name, src_blob_name, destination_blob_name):
    """Copies a blob from one folder to another within the same bucket."""
    # file_name = "your-object-name"
    # src_blob_name = "source-folder-name"
    # destination_blob_name = "destination-folder-name"
    srcBlob = src_blob_name + '/' + file_name
    destBlob = destination_blob_name + '/' + file_name
    source_blob = self.bucket.blob(srcBlob)
    blob_copy = self.bucket.copy_blob(
        source_blob, self.bucket, destBlob
    )
    print(blob_copy)
    print(
        "File {} copied from folder {} to folder {}.".format(
            file_name,
            src_blob_name,
            destination_blob_name,
        )
    )
    return True
In GCP there is no direct folder-creation concept, so you can simply save a new file under the new folder path; even if the destination folder doesn't exist, it will be created implicitly, as in the short example below.
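A one-line illustration of that implicit creation, assuming a bucket handle named bucket and a hypothetical object name and content:

# Uploading to a path whose "folder" doesn't exist yet implicitly creates it
bucket.blob("unprocessed/first-file.txt").upload_from_string("hello")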

Downloading a file from google cloud storage inside a folder

I've got a Python script that gets a list of files that have been uploaded to a Google Cloud Storage bucket and attempts to retrieve the data as a string.
The code is simply:
from google.cloud.storage import Blob

file = open(base_dir + "/" + path, 'wb')
data = Blob(path, bucket).download_as_string()
file.write(data)
My issue is that the data I've uploaded is stored inside folders in the bucket, so the path would be something like:
folder/innerfolder/file.jpg
When the Google library attempts to download the file, it does so with a GET request, which turns the above path into:
https://www.googleapis.com/storage/v1/b/bucket/o/folder%2Finnerfolder%2Ffile.jpg
Is there any way to stop this happening / to download the file through this path? Cheers.
Yes, you can do this with the Python storage client library.
Just install it with pip install --upgrade google-cloud-storage and then use the following code:
from google.cloud import storage
# Initialise a client
storage_client = storage.Client("[Your project name here]")
# Create a bucket object for our bucket
bucket = storage_client.get_bucket(bucket_name)
# Create a blob object from the filepath
blob = bucket.blob("folder_one/foldertwo/filename.extension")
# Download the file to a destination
blob.download_to_filename(destination_file_name)
You can also use .download_as_string(), but as you're writing it to a file anyway, downloading straight to the file may be easier.
The only slightly awkward thing to be aware of is that the filepath is the path after the bucket name, so it doesn't line up exactly with the path in the web interface.
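For instance, for the object from the question, the blob name is everything after the bucket in the gs:// URL; the local destination filename below is an arbitrary assumption:

# gs://bucket/folder/innerfolder/file.jpg -> blob name "folder/innerfolder/file.jpg"
blob = bucket.blob("folder/innerfolder/file.jpg")
blob.download_to_filename("file.jpg")  # local name can be anything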
