GitHub Actions - Where are downloaded files saved? - python

I've seen plenty of questions and docs about how to download artifacts generated in a workflow to pass between jobs. However, I've only found one thread about persisting downloaded files between steps of the same job, and am hoping someone can help clarify how this should work, as the answer on that thread doesn't make sense to me.
I'm building a workflow that navigates a site using Selenium and exports data manually (sadly there is no API). When running this locally, I am able to navigate the site just fine and click a button that downloads a CSV. I can then re-import that CSV for further processing (ultimately, it's getting cleaned and sent to Redshift). However, when I run this in GitHub Actions, I am unclear where the file is downloaded to, and am therefore unable to re-import it. Some things I've tried:
1. Echoing the working directory when the workflow runs, and setting up my pandas.read_csv() call to import the file from that directory.
2. Downloading the file and then echoing os.listdir() to print the contents of the working directory (see the sketch after this list). When I do this, the CSV file is not listed, which makes me believe it was not saved to the working directory as expected (which would explain why #1 doesn't work).
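For reference, the check in step 2 amounts to something like the following (a debugging sketch, not a fix):

import os

# Print the job's working directory and its contents right after
# triggering the download, to confirm whether the CSV landed there
print(os.getcwd())
print(os.listdir(os.getcwd()))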
FWIW, the website in question does not give me the option to choose where the file downloads. When run locally, I hit the button on the site, and it automatically exports a CSV to my Downloads folder. So I'm at the mercy of wherever GitHub decides to save the file.
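For what it's worth, the save location is usually decided by the browser profile on the runner rather than by GitHub itself; with Chrome, Selenium can pin it through an experimental preference. A minimal sketch, assuming a Chrome-based setup (the /tmp/downloads path is an illustration, and headless sessions may need additional download-behavior configuration):

from selenium import webdriver

# Illustrative target directory; any writable absolute path on the runner works
DOWNLOAD_DIR = "/tmp/downloads"

options = webdriver.ChromeOptions()
options.add_experimental_option(
    "prefs",
    {
        "download.default_directory": DOWNLOAD_DIR,  # pin the save location
        "download.prompt_for_download": False,       # never show a dialog
    },
)
driver = webdriver.Chrome(options=options)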
Last, because I feel like someone will suggest this - it is not an option for me to use read_html() to scrape the file from the page's HTML.
Thanks in advance!

Related

How do I add a dynamic text file to a very simple heroku app?

I have a very simple Heroku app that is basically running one Python script every ten minutes or so, 24/7. The script uses a text file to store a really simple queue of information (no sensitive info) that gets updated every time it runs.
I have Heroku set to deploy the app via GitHub, but it seems like way too much work to make it programmatically commit, push, and redeploy the entire thing just to update the queue in the text file. How can I attach this file to Heroku in a way that lets it be updated easily? I've been playing around with the free database add-ons, but those also seem overcomplicated (in the sense that I've got no clue how to use them).
I'm also totally open to accusations that I'm making mountains out of molehills when I could easily be using some other easier platform to freely run this script 24/7 with the queue file.
At this point I'm sure that nobody cares, but this answer is for you, future troubleshooter.
It turns out that the Heroku script works fine with a text-file queue. Once the queue is included in the Heroku deployment, the script will pull from the queue and even update it, giving the correct behavior. All you have to do is put the queue file in with the GitHub repo and open/modify it from your Python code as you normally would.
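A minimal sketch of that read-update-rewrite pattern (queue.txt is a placeholder name):

# Read the queued items, skipping blank lines
with open("queue.txt") as fh:
    items = [line.strip() for line in fh if line.strip()]

current = items.pop(0) if items else None  # next queued item, if any
# ... process `current` here ...

# Write the remaining queue back to the same file
with open("queue.txt", "w") as fh:
    for item in items:
        fh.write(item + "\n")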
The confusing thing is that this does not change the files on GitHub. It leaves the queue in the GitHub repo as the same text file it was when it was originally pushed. This means that pulling and pushing the repo is a little confusing, because the stored queue gets out of date very fast.
Thanks for the question, me, I'm happy to help.

Accessing Google Keep data through script?

I am trying to create a script that gets the data from a Google Keep list. I was thinking Google Takeout might do part of what I want, but I cannot find an API to automate the downloads. Does anyone know a way to grab this data via script (Python/bash) so that I can easily extract what I need?
I am not sure if it is allowed or not, but you could log in with a requests session and navigate to the page you wish to parse with BeautifulSoup.
I've written quite a similar script in Python; you can find it on GitHub. I think it's pretty self-explanatory, but if you require any more help, feel free to ask.
You could use the Selenium library for that.
I used the framework to scrape the keep.google.com webpage for all the notes and export them to a CSV file.
This might be helpful; I made the script to back up my notes to my computer:
https://github.com/darshkpatel/GoogleKeep_Backup
There is no API for Google Keep at this time. I don't think you're going to be able to automate Google Takeout either; the best you will be able to do is run it manually, then create your own application to import the data wherever it is you want it.
Here is an automated solution for this question: the exportKeep repo (cloned below).
Or just execute these commands in the terminal:
git clone https://github.com/Dmitry9/exportKeep.git;
cd exportKeep;
npm install;
npm run scrape;
After all dependencies are installed (it could take a minute or so), a Chrome instance will navigate to the sign-in page. After you post your credentials, it will scroll to the bottom of the window to force the browser to load all of the notes into the DOM. Inspecting the output of the terminal, you will find the path to the saved JSON file.
In the meantime there is an API; see here: https://developers.google.com/keep/api/reference/rest
Also, there is a Python library that implements this API (I'm not the author of the library): https://github.com/kiwiz/gkeepapi
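For example, a minimal sketch of reading notes with gkeepapi (credentials are placeholders, and the exact authentication call depends on the library version):

import gkeepapi

keep = gkeepapi.Keep()
# Older releases used login(); newer ones authenticate with a master
# token via keep.authenticate() instead
keep.login("user@gmail.com", "app-password")

# Iterate over every note and print its title and body
for note in keep.all():
    print(note.title, note.text)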

Running an exe file on Azure

I am working on an Azure web app, and inside the web app I use Python code to run an exe file. The web app receives certain inputs (numbers) from the user and stores those inputs in a text file. Afterwards, an exe file runs, reads the inputs, and generates another text file called "results". The problem is that although the code works fine on my local computer, as soon as I put it on Azure, the exe file does not get triggered by the following line of code:
subprocess.call('process.exe', cwd=case_directory.path, shell=True)
I even tried running the exe file on Azure manually from Visual Studio Team Services (formerly Visual Studio Online) via the "run from Console" option. It just did not do anything. I'd appreciate it if anyone can help me.
Have you looked at using a WebJob to host/run your executable? A WebJob can be virtually any kind of script or Windows executable. There are a number of ways to trigger your WebJob, and you also get a lot of built-in monitoring and logging for free via the Kudu interface.
@F.K I searched for some information which may be helpful for you; please see below.
According to the Python documentation for the subprocess module, using shell=True can be a security hazard. Please see the warning under "Frequently Used Arguments" for details.
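Following that warning, a sketch of the call from the question without shell=True, passing the command as a list and using an absolute path (case_dir stands in for case_directory.path from the question):

import os
import subprocess

case_dir = "/home/site/wwwroot/case"  # placeholder for case_directory.path
exe_path = os.path.join(case_dir, "process.exe")

# No shell involved; the return code helps diagnose a silent failure
return_code = subprocess.call([exe_path], cwd=case_dir)
print(return_code)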
There is a comment in the article which gives a direction for the issue; please see the screenshot below.
However, the generally recommended way to satisfy your needs is to use Azure Queue Storage, Blob Storage, and Azure WebJobs together: save the inputs into a storage queue, then have a continuous WebJob pick up items from the queue, run the processing, and save the result files into Blob Storage.
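As a rough sketch of the enqueue side with the current azure-storage-queue package (queue name and connection string are placeholders; this SDK postdates the original answer):

from azure.storage.queue import QueueClient

conn_str = "<storage-account-connection-string>"
queue = QueueClient.from_connection_string(conn_str, "input-jobs")

# The web app enqueues the user's numbers; a continuous WebJob would
# dequeue each message, run process.exe, and upload the "results" file
# to Blob Storage
queue.send_message("3,7,42")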

Authentication issue when uploading video to YouTube using YouTube API and cron

I am trying to upload a video to YouTube each afternoon using the sample YouTube Python upload script from Google (see the Python code examples on developers.google.com; I haven't enough reputation to post more links...). I would like to run it as a cronjob. I have created the client_secrets.json file and tested the script manually. It works fine when I run it manually; however, when I run the script as a cronjob I get the following error:
To make this sample run you will need to populate the
client_secrets.json file found at:
/usr/local/cron/scripts/client_secrets.json
with information from the Developers Console
https://console.developers.google.com/
For more information about the client_secrets.json file format, please
visit:
https://developers.google.com/api-client-library/python/guide/aaa_client_secrets
I've included the information in the JSON file already and the -oauth2.json file is also present in /usr/local/cron/scripts.
Is the issue because the cronjob is running the script as root and somehow the credentials in one of those two files are no longer valid? Any ideas how I can enable the upload with cron?
Cheers
James
OK, so 7 months later I've come back to this cron issue. It turns out that the upload2youtube.py example file was hardcoded to look in the current directory for the client_secrets.json file. This explains why I could run it manually from the local directory but not via cron. I've included the full path in the example file and it works fine now.
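For future troubleshooters, the fix amounts to resolving the path relative to the script itself instead of the current working directory (a sketch; the variable names are illustrative):

import os

# __file__ points at this script, so the lookup works no matter which
# directory cron happens to start the process in
SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
CLIENT_SECRETS_FILE = os.path.join(SCRIPT_DIR, "client_secrets.json")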

Edit workspace files via jenkins UI

Is there an easy way to edit our Python files in the Jenkins workspace UI?
It would be super nice if we could get code highlighting too!
There is a Jenkins plugin that allows you to edit files: Config File Provider
It can't edit arbitrary files, but you can use it to achieve what you want.
The plugin's storage takes the form of XML files in the Jenkins folder. This means that you could create a script that recreates those files wherever you need them by parsing those XML files (the plugin does this for the workspace, although it requires a build step). For instance, you could add a new custom config file like this:
Name: script.sh
Comment: /var/log
Content: ....
This will then be available in an XML file, which you could parse within a cron job to create the actual files where you need them.
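A rough sketch of that parsing idea in Python (the storage path and element names are assumptions about the plugin's on-disk format and will vary by version):

import os
import xml.etree.ElementTree as ET

# Hypothetical storage file; check your Jenkins home for the actual name
tree = ET.parse("/var/lib/jenkins/org.jenkinsci.plugins.configfiles.GlobalConfigFiles.xml")

for cfg in tree.getroot().iter():
    name = cfg.findtext("name")        # e.g. script.sh
    target = cfg.findtext("comment")   # target directory, per the convention above
    content = cfg.findtext("content")
    if name and target and content is not None:
        # Recreate the file where the Comment field says it belongs
        with open(os.path.join(target, name), "w") as fh:
            fh.write(content)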
The closest thing I can think of that Jenkins offers is a file upload. You can upload a file with local changes and then trigger a build; the uploaded file will replace the one at the specified location in the workspace. This feature can be used by making your build parameterized and adding a File Parameter option. Below is what Jenkins says in the description of this feature:
Accepts a file submission from a browser as a build parameter. The uploaded file will be placed at the specified location in the workspace, which your build can then access and use.
This is useful for many situations, such as:
Letting people run tests on the artifacts they built.
Automating the upload/release/deployment process by allowing the user to place the file.
Performing data processing by uploading a dataset.
It is possible to not submit any file. If that's the case and no file is already present at the specified location in the workspace, then nothing happens. If there's already a file present in the workspace, then that file will be kept as-is.
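For completeness, such a parameterized build can also be triggered remotely over Jenkins' HTTP API; a hedged sketch using Python's requests library (URL, job name, parameter name, and credentials are all placeholders):

import json
import requests

# "script.py" must match the File Parameter's configured location
payload = {"parameter": [{"name": "script.py", "file": "file0"}]}

response = requests.post(
    "https://jenkins.example.com/job/my-job/build",
    auth=("user", "api-token"),
    files={"file0": open("script.py", "rb")},
    data={"json": json.dumps(payload)},
)
print(response.status_code)  # 201 typically means the build was queued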
