I want to copy my own photos from a given web directory to my Raspberry Pi so I can display them in a slideshow.
I'm looking for a "simple" script to download these files using Python. I can then paste this code into my slideshow so that it refreshes the pics every day.
I suppose that the Python wget module would be the tool to use. However, I can only find examples of how to download a single file, not a whole directory.
Any ideas how to do this?
It depends on the server used to host the images and whether the script can see a list of images to download. If this list isn't available in some form, e.g. a webpage listing, JSON or XML feed, there is no way for a script to download the files, as the script doesn't "know" what's there dynamically.
Another option is for a Python script to SSH into the server, list the contents of a directory and then download the files. This presumes you have programmatic access to the server.
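For that SSH route, a minimal sketch using the third-party Paramiko library; the host name, credentials and remote path below are placeholders, and this assumes SSH/SFTP access really is enabled on the server:

import paramiko  # third-party: pip install paramiko

HOST = "photos.example.com"      # placeholder host
REMOTE_DIR = "/var/www/photos"   # placeholder remote directory

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(HOST, username="pi", password="secret")  # placeholder credentials

# List the remote directory over SFTP and pull every file down next to this script.
sftp = client.open_sftp()
for name in sftp.listdir(REMOTE_DIR):
    sftp.get(REMOTE_DIR + "/" + name, name)
sftp.close()
client.close()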
If access to the server is a no-go, and there is no dynamic list, then the last option would be to go to the website where you know the photos are, scrape their paths and download them. However, this may scrape unwanted data such as other images, icons, etc.
https://medium.freecodecamp.org/how-to-scrape-websites-with-python-and-beautifulsoup-5946935d93fe
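If scraping ends up being the way to go, a minimal sketch in the spirit of the linked article, using Requests and BeautifulSoup; the page URL is a placeholder, and the blanket img filter may well pick up icons and other unwanted images as mentioned above:

import os
from urllib.parse import urljoin

import requests                 # third-party: pip install requests
from bs4 import BeautifulSoup   # third-party: pip install beautifulsoup4

PAGE_URL = "http://example.com/photos/"  # placeholder gallery page

html = requests.get(PAGE_URL).text
soup = BeautifulSoup(html, "html.parser")

# Collect every <img> src on the page and download it next to this script.
for img in soup.find_all("img"):
    src = img.get("src")
    if not src:
        continue
    full_url = urljoin(PAGE_URL, src)
    filename = os.path.basename(full_url)
    with open(filename, "wb") as f:
        f.write(requests.get(full_url).content)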
I've seen plenty of questions and docs about how to download artifacts generated in a workflow to pass between jobs. However, I've only found one thread about persisting downloaded files between steps of the same job, and am hoping someone can help clarify how this should work, as the answer on that thread doesn't make sense to me.
I'm building a workflow that navigates a site using Selenium and exports data manually (sadly there is no API). When running this locally, I am able to navigate the site just fine and click a button that downloads a CSV. I can then re-import that CSV for further processing (ultimately, it's getting cleaned and sent to Redshift). However, when I run this in GitHub Actions, I am unclear where the file is downloaded to, and am therefore unable to re-import it. Some things I've tried:
1. Echoing the working directory when the workflow runs, and setting up my pandas.read_csv() call to import the file from that directory.
2. Downloading the file and then echoing os.listdir() to print the contents of the working directory. When I do this, the CSV file is not listed, which makes me believe it was not saved to the working directory as expected (which would explain why #1 doesn't work). Both attempts are sketched below.
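Roughly what those two attempts look like, for reference; the CSV filename is a placeholder, since I don't control what the site names the export:

import os
import pandas as pd

# Attempt 1: echo the working directory and try to read the CSV from it.
print("working directory:", os.getcwd())

# Attempt 2: list the working directory; the exported CSV never shows up here.
print(os.listdir())

df = pd.read_csv("export.csv")  # placeholder filename; fails because the file isn't there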
FWIW, the website in question does not give me the option to choose where the file downloads. When run locally, I hit the button on the site, and it automatically exports a CSV to my Downloads folder. So I'm at the mercy of wherever GitHub decides to save the file.
Lastly, because I feel like someone will suggest this: it is not an option for me to use read_html() to scrape the file from the page's HTML.
Thanks in advance!
So basically, I have a Python script that scrapes web data and stores it in JSON files. Now I want to add those JSON files to my Xcode project so I can display the data.
Right now my JSON files are stored in a folder. My first approach was to use a DB and have Python send those JSON files to the DB. For now it seems to sort of work, but these JSON files are intended to be updated daily.
My second approach was to just send those files from my python script to my Resources folder I made inside Xcode, however the files don't show up in the project.
Can anyone here who has experienced this help point me in the right direction? Are there any flaws in my approach that I've missed?
I need to get a complete list of files in a folder and all its subfolders regularly (daily or weekly) to check for changes. The folder is located on a server that I access as a network share.
This folder currently contains about 250,000 subfolders and will continue to grow in the future.
I do not have any access to the server other than the ability to mount the filesystem R/W.
The way I currently retrieve the list of files is by running Python's os.walk() over the folder. This is limited by the latency of the internet connection and currently takes about 4.5 hours to complete.
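For reference, the current approach is essentially this; the share path is a placeholder:

import os

SHARE = r"\\server\share\folder"  # placeholder network-share path

# Walk the whole tree over the network and collect every file path.
all_files = []
for root, dirs, files in os.walk(SHARE):
    for name in files:
        all_files.append(os.path.join(root, name))
print(len(all_files), "files found")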
A faster way to do this would be to create a file server-side containing the whole list of files, then transferring this file to my computer.
Is there a way to request such a recursive listing of the files from the client side?
A Python solution would be perfect, but I am open to other solutions as well.
My script is currently run on Windows, but will probably move to a Linux server in the future; an OS-agnostic solution would be best.
You have provided the answer to your question:
I do not have any access to the server other than the ability to mount the filesystem R/W.
Nothing has to be added after that, since any server-side processing requires the ability to (directly or indirectly) launch a process on the server.
If you can collaborate with the server admins, you could ask them to periodically run a server-side script that builds a compressed archive (for example a zip file) containing the files you need, and moves it to a specific location when done. Then you would only download that compressed archive, saving a lot of network bandwidth.
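If the admins do agree, a minimal sketch of such a server-side job in Python, writing the recursive file listing (which is what is actually needed here) to a compressed file that the client can then fetch over the share; the paths are placeholders:

import gzip
import os

ROOT = "/srv/share/folder"                 # placeholder: the shared folder as seen on the server
OUTPUT = "/srv/share/file_listing.txt.gz"  # placeholder: put the listing where the client can read it

# Walk the tree locally on the server (fast) and write one path per line, gzip-compressed.
with gzip.open(OUTPUT, "wt", encoding="utf-8") as out:
    for root, dirs, files in os.walk(ROOT):
        for name in files:
            out.write(os.path.join(root, name) + "\n")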
You can approach this in multiple ways. I would do it by running a script over SSH, like:
ssh xys@server 'bash -s' < local_script_togetfilenames.sh
If you prefer Python, you can run a similar Python script by giving it a #!/usr/bin/env python shebang line, assuming Python is installed on the server.
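The same idea driven entirely from the client in Python, using subprocess to run the remote command over SSH; this assumes key-based login and that find is available on the server, and the host and path are placeholders:

import subprocess

# Run `find` remotely over SSH and capture the file listing locally.
result = subprocess.run(
    ["ssh", "xys@server", "find /srv/share/folder -type f"],  # placeholder host and path
    capture_output=True,
    text=True,
    check=True,
)
file_list = result.stdout.splitlines()
print(len(file_list), "files found")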
If you want to stick to a fully Python solution, you should explore Python RPC (Remote Procedure Call).
You can use the RPyC library; see its documentation for details.
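A minimal RPyC sketch, assuming the classic RPyC server (rpyc_classic.py) has been started on the remote machine; the hostname and path are placeholders:

import rpyc  # third-party: pip install rpyc

# Classic mode exposes the remote interpreter's modules, so os.walk runs on the server.
conn = rpyc.classic.connect("server.example.com")  # placeholder hostname
remote_os = conn.modules.os

all_files = []
# Note: each iteration is still a round-trip over the network, so for a huge tree
# it may be better to build the whole list in a single remote call.
for root, dirs, files in remote_os.walk("/srv/share/folder"):  # placeholder remote path
    for name in files:
        all_files.append(remote_os.path.join(root, name))
print(len(all_files), "files found")
conn.close()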
Is it possible to make a program in Python that automatically organizes downloads from WhatsApp Web?
By default, when downloading an image (or file) from WhatsApp Web, it ends up in the folder "C:\Users\Name_User\Downloads" for Windows users.
The purpose of the program is to dynamically change the default directory and to store each download according to the number (or name) of the contact from which the file comes.
Is this possible in Python?
Sure thing, you can manipulate and list any files with the standard os and shutil modules (copy, delete, move files, create directories, etc.). Also, a third-party module called watchdog can monitor directory or even file changes.
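A minimal sketch of the watchdog idea, watching the Downloads folder and moving each new file into a per-contact subfolder; the destination root is a placeholder, and working out which contact a file belongs to is left as a hypothetical step, since WhatsApp Web does not encode the contact in the download itself:

import shutil
import time
from pathlib import Path

from watchdog.observers import Observer            # third-party: pip install watchdog
from watchdog.events import FileSystemEventHandler

DOWNLOADS = Path.home() / "Downloads"         # default WhatsApp Web target on Windows
TARGET_ROOT = Path.home() / "WhatsAppSorted"  # placeholder destination root

class DownloadHandler(FileSystemEventHandler):
    def on_created(self, event):
        if event.is_directory:
            return
        src = Path(event.src_path)
        # Hypothetical: you would have to determine the contact yourself,
        # e.g. from the filename or by driving WhatsApp Web with Selenium.
        contact = "unknown_contact"
        dest_dir = TARGET_ROOT / contact
        dest_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dest_dir / src.name))

observer = Observer()
observer.schedule(DownloadHandler(), str(DOWNLOADS), recursive=False)
observer.start()
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()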
I'm progressively adding images to a Dropbox folder remotely, which I then need to download on my Raspberry Pi 3.
The thing is, I only need the latest uploaded image in that folder, so that I can classify it remotely using some code deployed on my Raspberry Pi 3.
I don't know the Dropbox API well, so I don't know if there's any functionality to directly implement what I said above; instead I'm trying to download the entire folder with all the images locally and then select the image that I want.
Dropbox API v2 added functionality to download entire folders as zip files, but whenever I try to implement the code given in the API docs and save the file locally, the local zip file always says it's corrupt and can't be opened.
Does anyone know how this can be implemented in Python?
Edit: Or maybe shed some light on whether there's a simpler way to download the latest uploaded image in a folder without explicitly changing the code to use that specific image's name or link?
https://www.dropbox.com/developers/documentation/http/documentation#files-download_zip
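As for the edit, the "latest image only" idea could look roughly like this with the official Dropbox Python SDK; the access token, folder path and output filename are placeholders, pagination is ignored for brevity, and this is an untested sketch:

import dropbox  # third-party: pip install dropbox

dbx = dropbox.Dropbox("YOUR_ACCESS_TOKEN")  # placeholder token
FOLDER = "/photos"                          # placeholder Dropbox folder

# List the folder and keep only file entries (ignores pagination for brevity).
entries = dbx.files_list_folder(FOLDER).entries
files = [e for e in entries if isinstance(e, dropbox.files.FileMetadata)]

# Pick the most recently modified file and download just that one.
latest = max(files, key=lambda e: e.server_modified)
dbx.files_download_to_file("latest.jpg", latest.path_lower)
print("downloaded", latest.name)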
Start by getting the download working in a Linux terminal using curl, then work your way up to making the HTTP request with the Python Requests library. That way you can debug it systematically. Make sure there aren't any issues with file permissions on Dropbox or with API tokens.
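Once the curl call works, the equivalent request with the Requests library looks roughly like this; the token and folder path are placeholders. Writing the response body in binary mode matters, because saving it as text is a common way to end up with a "corrupt" zip:

import json
import requests

TOKEN = "YOUR_ACCESS_TOKEN"  # placeholder access token
FOLDER = "/photos"           # placeholder Dropbox folder path

resp = requests.post(
    "https://content.dropboxapi.com/2/files/download_zip",
    headers={
        "Authorization": "Bearer " + TOKEN,
        "Dropbox-API-Arg": json.dumps({"path": FOLDER}),
    },
)
resp.raise_for_status()

# Write the raw bytes; opening the file in text mode would corrupt the archive.
with open("photos.zip", "wb") as f:
    f.write(resp.content)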