Best practice for watching and reliably uploading files in Python?

I'm building a desktop application for Windows in Python 2.7. The primary function of this application is to watch a folder for new files. Whenever a new file appears in this folder, the app uploads it to a remote server. The process on the remote server creates a DB record for the file and stores the remote file path in that record.
Currently I'm using watchdog to monitor directory and httplib for file upload.
What approach should I take to ensure that a new file will be uploaded reliably regardless of network conditions or loss of the internet connection?
Update: What I mean by a reliable upload is that the app will finish uploading the file even if the app restarts, like Dropbox does. Some files are quite big (> 100 MB), so simple solutions like wrapping the code in try/except and starting the upload all over again are not very efficient. I know Dropbox uses librsync, but it might be overkill in this case.
What if the source file has been changed during the upload? Should I stop the upload and start over?

You could maintain a file or database of file names, timestamps and information about their upload status. Based on that data you will know which files were already sent and what still has to be uploaded after any restart of the application or the computer.
Checking the timestamps tells you when a file has been modified and the upload process should be started over.
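A minimal sketch of that idea, using watchdog plus a local SQLite database as the status store; the folder path, the schema and the upload_file() helper are illustrative placeholders, not a finished implementation:

import os
import sqlite3
import time

from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

DB_PATH = "upload_state.db"        # hypothetical location of the status database
WATCHED_DIR = r"C:\watched"        # hypothetical folder being watched

def init_db():
    # check_same_thread=False because watchdog fires callbacks on its own thread
    conn = sqlite3.connect(DB_PATH, check_same_thread=False)
    conn.execute("CREATE TABLE IF NOT EXISTS uploads ("
                 "path TEXT PRIMARY KEY, mtime REAL, status TEXT)")
    conn.commit()
    return conn

def upload_file(path):
    # Hypothetical helper: do the actual HTTP transfer (httplib, requests, ...)
    raise NotImplementedError

def mark_pending(conn, path):
    conn.execute("INSERT OR REPLACE INTO uploads VALUES (?, ?, 'pending')",
                 (path, os.stat(path).st_mtime))
    conn.commit()

def process_pending(conn):
    # Upload everything still marked pending; safe to call again after a restart.
    rows = conn.execute("SELECT path, mtime FROM uploads WHERE status = 'pending'").fetchall()
    for path, mtime in rows:
        if not os.path.exists(path):
            continue
        if os.stat(path).st_mtime != mtime:
            mark_pending(conn, path)   # file changed since it was recorded: start over
            continue
        try:
            upload_file(path)
        except Exception:
            continue                   # leave it pending, retry on the next pass
        conn.execute("UPDATE uploads SET status = 'done' WHERE path = ?", (path,))
        conn.commit()

class NewFileHandler(FileSystemEventHandler):
    def __init__(self, conn):
        self.conn = conn
    def on_created(self, event):
        if not event.is_directory:
            mark_pending(self.conn, event.src_path)
            process_pending(self.conn)

if __name__ == "__main__":
    conn = init_db()
    process_pending(conn)              # resume anything left over from the last run
    observer = Observer()
    observer.schedule(NewFileHandler(conn), WATCHED_DIR)
    observer.start()
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        observer.stop()
    observer.join()

This does not make the transfer itself resumable (the whole file is re-sent if it fails part way through), but it does survive restarts of the app or the machine.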

Related

How can I share files on my server over the web using Python?

I want to share files from my server with clients.
Files are created on my server every day (e.g. 20210701_A, 20210702_A, ...).
Each client will download today's file.
I have already written the Python code that picks the right file.
But how can my clients download the files from the web on their side?
(If possible, I want to do this with Python.)

Request recursive list of server files from client

I need to get a complete list of files in a folder and all its subfolders regularly (daily or weekly) to check for changes. The folder is located on a server that I access as a network share.
This folder currently contains about 250,000 subfolders and will continue to grow in the future.
I do not have any access to the server other than the ability to mount the filesystem R/W.
The way I currently retrieve the list of files is by using python's os.walk() function recursively on the folder. This is limited by the latency of the internet connection and currently takes about 4.5h to complete.
A faster way to do this would be to create a file server-side containing the whole list of files, then transfer this file to my computer.
Is there a way to request such a recursive listing of the files from the client side?
A python solution would be perfect, but I am open to other solutions as well.
My script is currently run on Windows, but will probably move to a Linux server in the future; an OS-agnostic solution would be best.
You have provided the answer to your question:
I do not have any access to the server other than the ability to mount the filesystem R/W.
Nothing has to be added after that, since any server-side processing requires the ability to (directly or indirectly) launch a process on the server.
If you can collaborate with the server admins, you could ask them to periodically run a server-side script that builds a compressed archive (for example a zip file) containing the files you need and moves it to a specific location when done. Then you would only download that compressed archive, saving a lot of network bandwidth.
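If the admins agree to that, the client side stays simple; a rough sketch with made-up paths that copies the pre-built archive off the mounted share in one sequential transfer and reads the listing out of it:

import shutil
import zipfile

SHARE_ARCHIVE = r"\\server\share\exports\listing.zip"   # hypothetical path on the mounted share
LOCAL_ARCHIVE = "listing.zip"

# One big sequential copy instead of stat'ing 250,000 folders over the network.
shutil.copyfile(SHARE_ARCHIVE, LOCAL_ARCHIVE)

with zipfile.ZipFile(LOCAL_ARCHIVE) as zf:
    file_list = zf.namelist()          # or zf.extractall("exported_files")

print("%d entries in the archive" % len(file_list))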
You can approach this in multiple ways. I would do it by running a script over ssh, like
ssh xys@server 'bash -s' < local_script_togetfilenames.sh
If you prefer Python, you can run a similar Python script the same way by giving it a #!/usr/bin/env python shebang, assuming Python is installed on the server.
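If you do have ssh access, a rough Python equivalent of that approach (hypothetical host and directory; assumes an ssh client and key-based login are set up) could look like:

import subprocess

HOST = "xys@server"            # hypothetical ssh login
REMOTE_DIR = "/export/data"    # hypothetical folder to list

# The server walks its own tree and streams back one path per line,
# so only the text listing crosses the network.
output = subprocess.check_output(["ssh", HOST, "find", REMOTE_DIR, "-type", "f"])
file_list = output.decode("utf-8", "replace").splitlines()
print("%d files found" % len(file_list))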
If you want to stick fully to Python, you should explore Python RPC (Remote Procedure Call).
You can use the RPyC library. The documentation is
here
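If you can get a process running on the server, a minimal RPyC sketch might look like this (hypothetical host and path; it requires the RPyC classic server, rpyc_classic.py, to be started on the remote machine first, which the asker may not be able to do):

import posixpath
import rpyc

conn = rpyc.classic.connect("server.example.com")   # hypothetical host name
remote_os = conn.modules.os                          # proxy for the os module on the server

file_list = []
for root, dirs, files in remote_os.walk("/export/data"):   # hypothetical remote path
    for name in files:
        file_list.append(posixpath.join(root, name))        # join locally to avoid extra round trips

print("%d files found" % len(file_list))

The directory traversal itself happens on the server's own disk, so the client only pulls back the results.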

Windows server file permission error when using watchdog

I have Python code that uses watchdog and pandas to automatically upload a newly added Excel file once it has been pasted into a given path.
The code works well on my local machine, but when I run it against files on Windows Server 2012 R2 I get a file permission error. What could be the best solution?
NB: I am able to access the same files using pandas read_excel() without watchdog, but I want to automate the process so that it auto-reads the files every time new files are uploaded.
A few possible reasons why you get a permission denied error:
The file is locked because someone has it open (see the retry sketch below).
Your account doesn't have permission to read/write/execute it.
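For the locked-file case, a common workaround is to retry the read for a short while after the watchdog event fires; a rough sketch with an illustrative path and retry policy (not the asker's actual code):

import time

import pandas as pd
from watchdog.observers import Observer
from watchdog.events import FileSystemEventHandler

WATCHED_DIR = r"\\fileserver\incoming"    # hypothetical network path

def read_when_ready(path, attempts=10, delay=2.0):
    # Retry read_excel while the file is still locked by whoever is writing it.
    for _ in range(attempts):
        try:
            return pd.read_excel(path)
        except (IOError, OSError):        # permission denied / sharing violation
            time.sleep(delay)
    raise RuntimeError("could not read %s after %d attempts" % (path, attempts))

class ExcelHandler(FileSystemEventHandler):
    def on_created(self, event):
        if event.is_directory or not event.src_path.lower().endswith((".xls", ".xlsx")):
            return
        df = read_when_ready(event.src_path)
        print("loaded %d rows from %s" % (len(df), event.src_path))

observer = Observer()
observer.schedule(ExcelHandler(), WATCHED_DIR, recursive=False)
observer.start()
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()

If the error is really an account-permissions problem rather than a lock, no amount of retrying helps; the account running the script needs read access on the share.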

Remote server: running a Python 2.7 script and making *.csv files publicly available

I have a Python 2.7 script that produces *.csv files. I'd like to run this Python script on a remote server and make the *.csv files publicly available to read.
Can this be done on Heroku? I've gone through the tutorial, but it seems to be geared towards people who want to create a whole web site.
If Heroku isn't the solution for me, what are the alternatives? I tried Google App Engine, but it requires Python 2.5 and won't work with 2.7.
MORE DETAILS:
I have a Python 2.7 script that analyzes all stocks that trade on the AMEX, NYSE, and NASDAQ exchanges and writes the output into *.csv files that can be read with a spreadsheet application. I want the script to automatically run every night on a remote server, and I want the *.csv files it produces to be publicly available.
Web hosting
Ok, so you should be able to achieve what you need pretty simply. There are many web hosts with Python support, and your requirement is simple: just upload your Python scripts to the web server. Then you can schedule a cron job to call your script at a specific time every day. Your script will run as scheduled and should save the csv files in the web server's document root. Keep in mind you don't need your script to run in the web server, just on the same server. The web server will simply serve your static csv files once you place them in its document root.
Desktop with dropbox
Another, maybe easier, option is to take any desktop machine and schedule your Python script to run on it each night; you can do this on Windows, Linux, or Mac. Also install Dropbox, which gives you 2 GB of free online storage. Then your script just has to save the csv files to the Dropbox/Public directory. When it does, they will automatically be synced to the Dropbox servers and can be accessed through your public URL like any other page on the internet. The 2 GB you get for free should be more than enough for a whole bunch of csv files.
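A sketch of that variant, assuming the Dropbox desktop client is installed and the Public folder lives in its usual default location (adjust the path for your machine):

import csv
import os

# The Dropbox client syncs anything written here, and files in Public/
# get a shareable URL. The path is the typical default, not guaranteed.
DROPBOX_PUBLIC = os.path.join(os.path.expanduser("~"), "Dropbox", "Public")

def save_report(rows, filename):
    out_path = os.path.join(DROPBOX_PUBLIC, filename)
    with open(out_path, "wb") as f:       # "wb" for the csv module on Python 2.7
        csv.writer(f).writerows(rows)
    return out_path

# Illustrative call: one nightly report with a made-up name and contents.
save_report([["ticker", "close"], ["AAPL", "123.45"]], "nyse_report.csv")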

Django: How to upload directly files submitted by a user to another server?

I'm using Django 1.4.
There are two servers (app server and file server).
The app server provide a web service using django, wsgi, and apache.
Users can upload files via the web service.
I'd like to upload these files directly to the file server.
"Directly" means that the files aren't uploaded via the app server.
I'd like to make the file server as simple as possible. The file server just serves files.
Ideally, transfer costs between the app server and the file server are zero.
Could somebody tell me how to do this?
You can't actually do both of these at once:
I'd like to upload these files directly to the file server.
I'd like to make the file server as simple as possible. The file server just serves files.
Under your requirements, the file server needs to both serve files and accept uploads of files.
There are a few ways to get the files onto the FileServer:
The easiest way is just to upload to the AppServer and then have it upload to another server; this is what most Amazon S3 implementations are like.
If the two machines are on the same LAN, you can mount a volume of the FileServer onto the AppServer using NFS or something similar. Users upload to the AppServer, but the data is saved to a partition that really lives on the FileServer.
You could have a file upload script run on the FileServer itself. However, you'd need to do a few complex things:
Have a mechanism to authenticate the ability to upload the file. You couldn't just use an authtkt; you'd need something that allows one-and-only-one file upload along with some sort of identifier and privilege token. I'd probably opt for an encrypted payload that is timestamped and carries the upload permission credentials plus an id for the file (see the sketch after this list).
Have a callback on successful upload from the FileServer to the AppServer, letting it know that the id in the payload has been successfully received.
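A rough sketch of that token idea using only the standard library; it uses a signed (HMAC) payload rather than an encrypted one, and the field names, expiry policy and shared secret are all illustrative:

import base64
import hashlib
import hmac
import json
import time

SHARED_SECRET = b"replace-with-a-real-secret"   # known to both AppServer and FileServer

def make_upload_token(file_id, user_id, ttl=300):
    # AppServer side: issue a one-shot, time-limited permission to upload one file.
    payload = json.dumps({"file_id": file_id,
                          "user_id": user_id,
                          "expires": int(time.time()) + ttl}).encode("utf-8")
    sig = hmac.new(SHARED_SECRET, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode("ascii") + "." + sig

def check_upload_token(token):
    # FileServer side: verify the signature and expiry before accepting the upload.
    encoded, sig = token.rsplit(".", 1)
    payload = base64.urlsafe_b64decode(encoded.encode("ascii"))
    expected = hmac.new(SHARED_SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("bad signature")
    data = json.loads(payload.decode("utf-8"))
    if data["expires"] < time.time():
        raise ValueError("token expired")
    return data    # contains file_id / user_id for the success callback to the AppServer

The file_id returned by check_upload_token() is what the FileServer would send back to the AppServer in the success callback.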
I think what you need is the static URL setting in Django 1.4 in order to make the file server's files available from the app server.
For uploading files to the file server you can write a Python or PHP script hosted on that server (assuming Apache 2 or similar) to get the job done.
If you take this approach, I think you don't need to keep track of which files are uploaded (take into account that with this solution you simply can't).
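For the serving side of that answer, a minimal settings sketch (hypothetical host and mount point; the answer says "static url", but for user uploads the relevant Django settings are MEDIA_URL and MEDIA_ROOT):

# settings.py (Django 1.4) -- illustrative values only
MEDIA_URL = "http://files.example.com/media/"   # browsers fetch uploads from the file server
MEDIA_ROOT = "/mnt/fileserver/media/"           # file server storage mounted on the app server

Note that with this arrangement uploads still pass through the app server process, so it trades away the "directly" requirement from the question.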
