List files in Google Cloud Virtual Machines before SCP - python

I am trying to download specific files from a Google Cloud virtual machine. Most of the directories that my gcloud command searches contain just one file for a given name. However, some directories contain multiple files with similar names but different timestamps. Is there a command I can use to list the files within a directory on the VM so I can find the latest file name before using SCP?
I am currently using the following f-string via os.system to download the files. However, this does not work for the case where multiple files are in the directory.
download_file = f"gcloud compute scp {project}:/nfs-client/example/documents/ID-{ID}/files/response* --zone=europe-west2-c ./temp-documents/ID-{ID}.xml"
os.system(download_file)

You can use the gcloud compute ssh command to get the latest file name from a folder, with something like:
gcloud compute ssh example-instance --zone=us-central1-a --command "ls -t /nfs-client/example/documents/ID-{ID}/files/response* | head -1"
Then substitute the output of that command into your scp command to fetch the latest file.
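A minimal Python sketch of that two-step approach, using only subprocess from the standard library; the instance name, zone, and ID below are placeholders standing in for the question's variables:

import subprocess

instance = "example-instance"   # placeholder: your VM name (the question calls it project)
zone = "europe-west2-c"         # placeholder: your VM's zone
ID = "12345"                    # placeholder: the document ID

# Step 1: ask the VM for the newest matching file (ls -t sorts newest first).
latest = subprocess.check_output(
    ["gcloud", "compute", "ssh", instance, f"--zone={zone}",
     "--command", f"ls -t /nfs-client/example/documents/ID-{ID}/files/response* | head -1"],
    text=True,
).strip()

# Step 2: copy just that one file down to the local temp directory.
subprocess.run(
    ["gcloud", "compute", "scp", f"{instance}:{latest}",
     f"./temp-documents/ID-{ID}.xml", f"--zone={zone}"],
    check=True,
)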

You can use:
gcloud compute scp instance-2:$(gcloud compute ssh instance-2 --zone=europe-west2-c --command "ls -t /nfs-client/example/documents/ID-{ID}/files/response* | head -1") /temp-documents/ID-{ID}.xml --zone=europe-west2-c

It sounds like what you really want to do is download all the files in the directory. You can do this by passing scp the --recurse flag, like this:
command = f"gcloud compute scp --recurse {project}:/nfs-client/example/documents/ID-{ID}/files/ --zone=europe-west2-c ./temp-documents/ID-{ID}".format(project, ID)
os.system(command)
This will create a directory with the ID, and then put all the response files into that directory.
If, on the other hand, you'd really like to list the files, you could get a list by using gcloud compute ssh, and then fetching the output. You'll need to use something like subprocess.Popen instead of os.system, though:
import subprocess

# --zone must come before the "--" separator; everything after "--" runs on the VM.
command = f"gcloud compute ssh {project} --zone=europe-west2-c -- ls /nfs-client/example/documents/ID-{ID}/files/"
process = subprocess.Popen(command.split(), stdout=subprocess.PIPE)
out, _ = process.communicate()
files = out.decode().split()  # crude parse; see notes below
A few things of note here:
Fetching a list of files via SSH like this is pretty hacky and prone to breakage. Shelling out to gcloud for this is inelegant, and you're better off putting those files in something like Google Cloud Storage so you can access them easily from your local machine.
The command needs to have your instance name (which you seem to be calling project) and your ID templated in -- I assume you're doing that and just put in a placeholder.
You also need to parse the output -- parsing with str.split() works okay, but can be error prone, particularly if there are spaces in the filenames. There are ways to handle this (one sketch follows below), but that's another rabbit hole.
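For the parsing concern above, a hedged sketch that lists one entry per line and sorts newest first; it still assumes no newlines in filenames, and that project and ID are defined as in the snippets above:

import subprocess

cmd = [
    "gcloud", "compute", "ssh", project, "--zone=europe-west2-c",
    "--command", f"ls -1t /nfs-client/example/documents/ID-{ID}/files/",
]
# Splitting on newlines instead of whitespace keeps filenames with spaces intact.
out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
files = [line for line in out.splitlines() if line.strip()]
latest = files[0] if files else None   # newest file, or None if the directory is empty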

Related

Asterisk AGI AWS "ProfileNotFound: The config profile (foo) could not be found"

To give you some background, I have a bash script being launched from Asterisk via a Python AGI; it runs against Amazon Polly and generates a .sln file. I have this working on a CentOS server but am attempting to migrate it to a Debian server.
This is the line of code that is giving me problems:
aws polly synthesize-speech --output-format pcm --debug --region us-east-2 --profile asterisk --voice-id $voice --text "$1" --sample-rate 8000 $filename.sln >/dev/null
I keep getting this error
ProfileNotFound: The config profile (foo) could not be found
This is an example of my /root/.aws/config
[default]
region = us-east-2
output = json
[profile asterisk]
region = us-east-2
output = json
[asterisk]
region = us-east-2
output = json
The /root/.aws/credentials looks similar but with the keys in them.
I've even tried storing all of this data in environment variables and going with the default profile to get past this, but then it throws "unable to locate credentials" or "must define region" (I got past the latter by defining the region inline). It's almost as if Asterisk is running this in some isolated session that the config/credentials files can't reach. From my research, and from how I set it up, it is currently running as root, so that should not be an issue.
Any help is much appreciated, thanks!
Asterisk should be run under the asterisk user for security.
Likely on your previous install it ran as root, so everything worked.
Please ensure you have set up the AWS configuration for the asterisk user, or create a sudoers entry and use sudo.
If you use the System() command, it also has no shell (bash), so you have to start it via a bash script and set up PATH and other required variables yourself.
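Since the AGI side is Python, one way to sidestep the "isolated session" problem is to make the environment explicit when shelling out, pointing the AWS CLI at config/credentials files the asterisk user can read. This is a sketch of that idea, not a drop-in fix; the paths, voice, and text below are placeholders:

import os
import subprocess

env = dict(os.environ)
# AWS_CONFIG_FILE / AWS_SHARED_CREDENTIALS_FILE are honored by the AWS CLI.
env["AWS_CONFIG_FILE"] = "/etc/asterisk/aws/config"                   # placeholder path readable by the asterisk user
env["AWS_SHARED_CREDENTIALS_FILE"] = "/etc/asterisk/aws/credentials"  # placeholder path
env["PATH"] = env.get("PATH", "/usr/bin:/bin") + ":/usr/local/bin"    # make sure the aws binary is found

subprocess.run(
    ["aws", "polly", "synthesize-speech",
     "--output-format", "pcm",
     "--region", "us-east-2",
     "--profile", "asterisk",
     "--voice-id", "Joanna",        # placeholder voice
     "--text", "hello world",       # placeholder text
     "--sample-rate", "8000",
     "/tmp/out.sln"],               # placeholder output file
    env=env, check=True,
)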

Uploading Python files to GCP and execute code

I was trying to create a VM and upload some python files to GCP, and run the code on my buckets.
I created the VM, SSHed into it, and set up the instance with all the Python libraries that I need. Now I'm trying to upload my Python files to the VM so that I can execute the code.
So on my Mac, I did gcloud init and then tried the following:
gcloud compute scp /Users/username/Desktop/LZ_demo_poc/helper_functions.py /home/user_name/lz_text_classification/
However, I keep getting these error messages.
WARNING: `gcloud compute copy-files` is deprecated. Please use `gcloud compute scp` instead. Note that `gcloud compute scp` does not have recursive copy on by default. To turn on recursion, use the `--recurse` flag.
ERROR: (gcloud.compute.copy-files) Source(s) must be remote when destination is local. Got sources: [/Users/username/Desktop/LZ_demo_poc/helper_functions.py], destination: /home/user_name/lz_text_classification/
Can anyone help me with the process of running a Python script on GCP using data that is saved in buckets?
You need to also specify the instance to which you want the files copied, otherwise the destination is interpreted as a local path, which leads to the second line of your error message. From the gcloud compute scp examples:
Conversely, files from your local computer can be copied to a virtual
machine:
$ gcloud compute scp ~/localtest.txt ~/localtest2.txt \
example-instance:~/narnia
In your case it should be something like:
gcloud compute scp /Users/abrahammathew/Desktop/LZ_demo_poc/helper_functions.py your_instance_name:/home/abraham_mathew/lz_text_classification/
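For reference, a hedged sketch of the full round trip from the local machine using Python's subprocess; the instance name, zone, and remote username are placeholders:

import subprocess

instance = "your_instance_name"   # placeholder
zone = "us-central1-a"            # placeholder: your VM's zone

# Copy the script up to the VM.
subprocess.run(
    ["gcloud", "compute", "scp",
     "/Users/username/Desktop/LZ_demo_poc/helper_functions.py",
     f"{instance}:/home/user_name/lz_text_classification/",
     f"--zone={zone}"],
    check=True,
)

# Run it on the VM; the script itself can read the bucket data there (e.g. via gsutil).
subprocess.run(
    ["gcloud", "compute", "ssh", instance, f"--zone={zone}",
     "--command", "python3 /home/user_name/lz_text_classification/helper_functions.py"],
    check=True,
)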

Using python to ssh into multiple servers and grab the file with the same postfix

I normally use a bash script to grab all the files onto my local machine and use glob to process them. I'm just wondering what would be the best way to use Python (instead of another bash script) to ssh into each server and process those files?
My current program runs as:
import glob

for filename in glob.glob('*-err.txt'):
    input_open = open(filename, 'rb')
    for line in input_open:
        # do something
        ...
My files all end with -err.txt, and the directories where they reside on the remote servers all have the same name, /documents/err/. I am not able to install third-party libraries as I don't have permission.
UPDATE
I am trying not to scp the files from the server but to read them on the remote server instead.
I want to use a local Python script, run LOCALLY, to read files on the remote server.
The simplest way to do it is to use paramiko's SCP support to copy files from the remote server (How to scp in python?).
If you are not allowed to install any libraries, you can create an SSH key pair so that connecting to the server does not require a password (https://www.debian.org/devel/passwordlessssh). You can then do, for each file:
import os
os.system('scp user@host:/path/to/file/on/remote/machine /path/to/local/file')
Note that using os.system is usually considered less portable than using libraries. If you give a script that uses system('scp ...') to someone who does not have an SSH key pair set up, they will experience problems.
Looks like you want to use a local Python script remotely. This has been answered here.
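Given the update (read on the remote server, no copying, standard library only), a minimal sketch that streams the remote files over ssh and processes them locally, assuming passwordless SSH keys are already set up; the host list is a placeholder:

import subprocess

servers = ["user@host1", "user@host2"]   # placeholder host list

for server in servers:
    # cat the matching files remotely; nothing is written to the local disk.
    result = subprocess.run(
        ["ssh", server, "cat /documents/err/*-err.txt"],
        capture_output=True, text=True, check=True,
    )
    for line in result.stdout.splitlines():
        # do something with each line
        ...

Note this concatenates all matching files per server; if per-file boundaries matter, list the files first (ssh server 'ls /documents/err/*-err.txt') and cat them one at a time.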

How do I execute a remote Python file from a .sh executable?

I am using a Mac.
I have a Python file on a server. Let's say it's here: http://jake.com/python_file.py. I want to make a script that I can run in Terminal by double-clicking, and that will run the Python file in Terminal on the local machine. Any ideas?
Would I have to use SSH to connect to the server, download the file to a temporary location on the hard drive (/tmp?), and then delete it when I'm done? But there's another problem with this: it would have to download the Python file to a location in the user's home folder, because I don't think users have the necessary permissions to write to folders like /tmp or /var.
I was looking around for a solution to my problem and found this. It talks about how to execute a remote script using SSH on Unix, but I tried doing this with my Python file and it didn't work.
In case you didn't realize, the main reason I am looking to do this is so that a user can run the file locally while being unable to read or edit the actual Python file, which is stored on the server.
If anyone has any ideas on how to accomplish this (whether using the ideas mentioned above or not) please let me know I'd really appreciate it!
Thanks,
Jake
How about a Python script to download and execute the code, like so?
import requests

# Fetch the remote script's source and execute it in the current namespace.
py1 = requests.get('https://example.com/file.py').content
exec(py1, globals(), locals())
Note: I'm using the requests library, but you could just as easily use the built-in httplib HTTPSConnection (http.client in Python 3). It's just more verbose.
Note 2: When you say you don't want the user to be able to "read/edit the actual Python file", they will be able to read it if they open the URL themselves and view the content; they are just less likely to edit the file locally and mess something up. You can also deploy updates to your Python script at the URL rather than having to copy them to every machine. This could be a security risk depending on the scope of what you are using it for.
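For completeness, a sketch of the standard-library variant mentioned in the first note, using urllib.request on Python 3 (same placeholder URL):

from urllib.request import urlopen

# Fetch the remote script's source over HTTPS with only the standard library.
with urlopen('https://example.com/file.py') as resp:
    source = resp.read()

exec(source, globals(), locals())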
Assuming that the remote Python script is a single file with no dependencies other than those of the Python standard library, and that a compatible version of Python is installed on the user's local machine, there are a few obvious ways to do it.
Use ssh:
#!/usr/bin/env sh
ssh user@host cat /path/to/python/script.py | python
Or use scp:
#!/usr/bin/env sh
TMPFILE=/tmp/$$
scp user@host:/path/to/python/script.py $TMPFILE
python $TMPFILE
rm $TMPFILE
The first one has the obvious advantage of not requiring any mucking about with copying files and cleaning up afterwards.
If you wanted to execute the python script on the remote server, this can also be done with ssh:
ssh user@host python /path/to/python/script.py
Standard input and output will be a terminal on the user's local machine.

ssh - getting metadata of many remote files

There's a remote file system which I can access using SSH.
I need to:
1. scan this file system to find all the files newer than a given datetime
2. retrieve a list of those files' names, sizes, and modified timestamps
Some restrictions:
I can't upload a script to this remote server; I can only run commands through SSH.
There could be well over 100k files on the remote server, and this process should happen at least once a minute, so the number of SSH calls should be minimal, preferably just 1.
I've already managed to get (1) using this:
touch -am -t {timestamp} /tmp/some_filename; find {path} -newer /tmp/some_filename; rm /tmp/some_filename
I thought I could move in the direction of piping the results into "xargs ls -l" and then parsing the output to extract the size and timestamp from there, but then I found this article...
Also, I'm running the command from Python (i.e. it's not just a command line), so it's OK to do some post-processing on the results coming from the SSH command.
I suggest writing or modifying your Python script on the server side as follows:
1. When no data has been acquired in a while, acquire initial data using the touch/find script you provided, making calls on the found files to get the needed properties.
2. Then, in the Python script on the server, subscribe to inotify() data to get updates.
3. When a remote client connects and needs all this data, provide the latest update by combining 1 and 2.
inotify is a Linux kernel API that allows you to monitor file system events on a directory in real time.
See:
https://serverfault.com/questions/30292/automatic-notification-of-new-or-changed-files-in-a-folder-or-share
http://linux.die.net/man/7/inotify
https://github.com/seb-m/pyinotify
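If uploading anything to the server really isn't an option (per the restrictions above), one way to cover both (1) and (2) in a single SSH call is to extend the asker's find command with -printf; a rough sketch, assuming GNU find on the remote host and passwordless SSH (host, path, and timestamp are placeholders):

import subprocess

host = "user@remote-host"     # placeholder
path = "/path/to/scan"        # placeholder
timestamp = "202401010000"    # placeholder, touch -t format: [[CC]YY]MMDDhhmm

# One SSH call: mark the cutoff time, then print name, size (bytes), mtime (epoch) per file.
remote_cmd = (
    f"touch -am -t {timestamp} /tmp/some_filename; "
    f"find {path} -newer /tmp/some_filename -type f -printf '%p\\t%s\\t%T@\\n'; "
    "rm /tmp/some_filename"
)
out = subprocess.run(["ssh", host, remote_cmd],
                     capture_output=True, text=True, check=True).stdout

files = []
for line in out.splitlines():
    name, size, mtime = line.split("\t")   # breaks if a filename contains a tab
    files.append((name, int(size), float(mtime)))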
