I want to download a file from SkyDrive programmatically using Python on Linux.
I can't use the API as it's a OneNote file and the API can't be used to download these.
My understanding is that SD supports Webdav and there are plenty of examples where people have mounted an SD folder using davfs2 but I just want to be able to grab a specific file without mounting.
I can use the API to get the document owner's cid so don't need to jump through any windows based hoops but my - probably lame, have not really researched webdav - efforts to download the file always resort in an error.
For example using easywebdav:
import easywebdav
webdav = easywebdav.connect("d.docs.live.net/mycid")
webdav.download('me/skydrive/Documents/Getting\ Started', '/tmp/foo')
#this gives the 302 error mentioned in the comments at the end of the the 'jumping through windows hoops' link I posted above.
Is there any workaround for the redirection problem I've seen mentioned?
Do I have this wrong and when accessing files on a webdav share it makes sense, and indeed it's essential, to mount it as a file system?
If you are downloading a specific file, and already know the exact path/URL to that file (as per your example), I'm not sure that you really need to worry about the DAV extensions. Have you tried downloading the file using a simple HTTP GET, through something like urllib2?
Related
I'm using python office365 library to access sharepoint documents. I don't know how to access file via API that have been shared with me by sharing link. I need to get this file content and if possible metadata (last modify date). Could anyone help?
The user that I'm using have no access to this sharepoint folder other than a sharing link to a single file.
I tried many variations of normal file access API, bot by hand and by office365 library. I couldnt find a way to access a file when I have only sharing link to it.
My sharing link looks like that:
https://[redacted].sharepoint.com/:x:/s/[redacted]/dir1/dir2/ESd0HkNNSbJMhQFavQsr9-4BNHC2rHSWsnbs3zRdjtZsC3g so there is not really a filename here and I cannot read via API content of any folder per se because I have an error Attempted to perform an unathorized operation.. Authentication goes fine (when i mistake password I get different error).
According to my research and testing, you can use the following Rest API to read file (get file content):
https://xxxx.sharepoint.com/sites/xxx/_api/web/GetFolderByServerRelativeUrl('/sites/xxx/Library_Name/Folder Name')/Files('Document.docx')/$value
If you want to get last modify date, you can use the following Rest API to get the Modified field:
https://xxxx.sharepoint.com/sites/xxx/_api/web/lists/getbytitle('test_library')/Items?$select=Modified
I am pushing video to workers for a cloud dataflow pipeline. I have been advised to use beam directly to manage my objects. I can't understand the best practices for downloading objects. I can see the class
Apache Beam IO GCP So one could use it like so:
def read_file(element,local_path):
with beam.io.gcp.gcsio.GcsIO().open(element, 'r') as f:
Where element is the gcs path read from a previous beam step.
Checking out the available methods, downloader looks like.
f.downloader
Download with 57507840/57507840 bytes transferred from url https://www.googleapis.com/storage/v1/b/api-project-773889352370-testing/o/Clips%2F00011.MTS?generation=1493431837327161&alt=media
This message makes it seem like it has been downloaded, it has the right file size (57mb). But where does it go? I would like to pass a variable (local_path), so that subsequent process can handle the object. The class doesn't seem accept a path destination, its not in current working directory, /tmp/ or downloads folder. I'm testing locally on OSX before I deploy.
Am I using this tool correctly? I know that streaming video bytes may be preferable for large videos, we'll get to that once I understand basic functions. I'll open a separate question for streaming into memory (named pipe?) to be read by opencv.
I am in the process of setting up with Google Adwords API. They have a fantastic guide (https://developers.google.com/adwords/api/docs/guides/start), with the exception that one of the last steps is rather vague.
I have gotten to this step, pictured here (but from the link above)
I am instructed (for Python) to put the client ID and client secret into my own configuration file. All the other languages have specific files that were added to need to be edited (such as the PHP example below).
I have been working at this for the past 3 hours, and tried googling and youtubing and reading through every piece of documentation I can find. All of them just say "add the ID and secret to your config file." I have no idea what that means, or how to do it. I've gone into my python directory and found a file named "config.py", but have no idea how to add these credentials. There is a number of scripts on github (that Google links to), one of them for generating a refresh token, like I want. I have no idea how to implement this, though.
https://github.com/googleads/googleads-python-lib/tree/master/examples/adwords/authentication
Thank you in advance for any insight into adding credentials to my python config file or otherwise generating a refresh token.
I found the answer.
In short, the config file is in a directory that was not included in the instructions. It is advisable to download the entire "googleads-python-lib" directory versus just the directory "googlead".
https://github.com/googleads/googleads-python-lib
The config file (googleads.yaml) is within this "googleads-python-lib" directory. I unzipped it in my python2.7/site-packages. There are variables in this config file ready to take your authentication credentials.
Before I pose the question, some background: I'm creating a web management tool that, among other things, allows the user to download, tail, email, and move and files between predefined directories via the management panel. Many of these directories are local to the server, but some are actually located on remote hosts and accessed via SSH--however, this is transparent to the user. I've used Twisted to create a pseudo-REST API for the client to access, but since I want to avoid revealing actual server paths to the client, it requests downloads of files using a POST with an arbitrary ID to the api, as such: "http://XXXX:8880/api/transfer/download"
with POST params similar to this: {"srckey":"5","srcfile":"solar2-windows-1.10.zip"}. The idea being the client only knows the key of the directory and filename.
Pardon the excessive background--I'm hoping it will make my question more clear: The issue I have is I'm trying to allow users to download a copy of a file from one of the "remote" hosts via the management server that hosts the web panel, all without caching the file locally. I've used Twisted's File() object to stream large static files before, but since the file resides on another server, I'm trying to accomplish the same using a file object provided by Paramiko's "open()" method.
I've tried setting up a consumer/producer system similar to that used in the render methods of twisted.web.static.File, plugging in the file pointer provided by Paramiko in the appropriate places, but only the smallest text files transfer successfully--all cases cause Paramiko to throw this error:
socket.error: Socket is closed
The contents of the relevant python files are here:
serve-project.py: http://pastebin.com/YcjsQHu3
WrapSSH.py:
http://pastebin.com/XaKXJwxb
In a nutshell, I'm trying to stream the data from a Paramiko SFTPFile to an HTTP client. I suspect that my approach is majorly faulty, due to my minimal familiarity with Twisted. Anyone have suggestions on a more intelligent way to accomplish this?
I'm looking for a way to sell someone a card at an event that will have a unique code that they will be able to use later in order to download a file (mp3, pdf, etc.) only one time and mask the true file location so a savvy person downloading the file won't be able to download the file more than once. It would be nice to host the file on Amazon S3 to save on bandwidth where our server is co-located.
My thought for the codes would be to pre-generate the unique codes that will get printed on the cards and store those in a database that could also have a field that stores the number of times the file was downloaded. This way we could set how many attempts we would allow the user for downloading the file.
The part that I need direction on is how do I hide/mask the original file location so people can't steal that url and then download the file as many times as they want. I've done Google searches and I'm either not searching using the right keywords or there aren't very many libraries or snippets out there already for this type of thing.
I'm guessing that I might be able to rig something up using django.views.static.serve that acts as a sort of proxy between the actual file and the user downloading the file. The only drawback to this method I would think is that I would need to use the actual web server and wouldn't be able to store the file on Amazon S3.
Any suggestions or thoughts are greatly appreciated.
Neat idea. However, I would warn against the single-download method, because there is no guarantee that their first download attempt will be successful. Perhaps use a time-expiration method instead?
But it is certainly possible to do this with Django. Here is an outline of the basic approach:
Set up a django url for serving these files
Use a GET parameter which is a unique string to identify which file to get.
Keep a database table which has a FileField for the file to download. This table maps the unique strings to the location of the file on the file system.
To serve the file as a download, set the response headers in the view like this:
(path is the location of the file to serve)
with open(path, 'rb') as f:
response = HttpResponse(f.read())
response['Content-Type'] = 'application/octet-stream';
response['Content-Disposition'] = 'attachment; filename="%s"' % 'insert_filename_here'
return response
Since we are using this Django page to serve the file, the user cannot find out the original file location.
You can just use something simple such as mod_xsendfile. This functionality is also available in other popular webservers such lighttpd or nginx.
It works like this: when enabled your application (e.g. a trivial PHP script) can send a special response header, causing the webserver to serve a static file.
If you want it to work with S3 you will need to handle each and every request this way, meaning the traffic will go through your site, from there to AWS, back to your site and back to the client. Does S3 support symbolic links / aliases? If so you might just redirect a valid user to one of the symbolic URLs and delete that symlink after a couple of hours.