I'm making a program that downloads PDFs from the internet.
Here's an example of the code:
import httpx # <-- This also happens with the requests module
URL = "http://62.182.86.140/main/0/aee7239ffcf7871e1d6687ced1215e22/Markus%20Nix%20-%20Exploring%20Python-Entwickler%20%282005%29.djvu"
r = httpx.get(URL, timeout=20.0).content.decode("ascii")
with open(f"./example.pdf", "w") as f:
    f.write(str(r))
But when I write it to a file, none of my PDF viewers (I tried Okular and zathura) can read it.
When I download it with a program like wget, there are no problems.
When I compare the two files (one downloaded with Python, the other with wget), everything in the Python one is encoded, and I can't figure out how to decode it (.decode() doesn't work).
import httpx
def main(url):
    r = httpx.get(url, timeout=20)
    # r.content is raw bytes: open the file in binary mode and write it unchanged
    with open('file.djvu', 'wb') as f:
        f.write(r.content)

main('http://62.182.86.140/main/0/aee7239ffcf7871e1d6687ced1215e22/Markus%20Nix%20-%20Exploring%20Python-Entwickler%20%282005%29.djvu')
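The likely reason the question's version fails is that r.content is raw bytes: decoding it as ASCII and writing it in text mode either raises an error or alters bytes that are not text. The sketch below uses only the standard library and a made-up payload standing in for r.content; the helper names (sniff_format, save_binary) are hypothetical. It also checks the leading bytes, because the URL ends in .djvu while the question saves the file as .pdf, and renaming a DjVu file does not turn it into a PDF.

```python
import os
import tempfile

# Leading bytes ("magic numbers") of two document formats.
SIGNATURES = {
    b"%PDF-": "pdf",
    b"AT&TFORM": "djvu",
}


def sniff_format(data):
    """Guess the format from the first bytes; return None if unrecognized."""
    for magic, name in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None


def save_binary(path, data):
    """Write bytes exactly as received; "wb" means no encoding step is applied."""
    with open(path, "wb") as f:
        f.write(data)


# Made-up stand-in for r.content: a DjVu-style header plus non-ASCII bytes.
payload = b"AT&TFORM\x00\x00\x00\x10DJVU\xff\xfe\x00\x01"

try:
    payload.decode("ascii")
    decodable = True
except UnicodeDecodeError:
    decodable = False  # binary data is generally not valid ASCII text

path = os.path.join(tempfile.mkdtemp(), "example.djvu")
save_binary(path, payload)
with open(path, "rb") as f:
    restored = f.read()

print(sniff_format(restored), decodable, restored == payload)  # djvu False True
```

Running sniff_format on the first bytes of a real download is a quick way to tell whether the viewer is being handed the format it expects.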
I want to download a video file from the internet using the requests library and edit the video's metadata before saving it.
import requests
url = 'https://www.sample-videos.com/video123/mp4/720/big_buck_bunny_720p_5mb.mp4'
r = requests.get(url, stream=True)
with open('video.mp4', 'wb') as file:
    file.write(r.content)
I just want to change the video's metadata before saving the file.
I don't think that's possible. My approach would be to first download the video and then consider using a library such as tagpy or mutagen.
I would recommend mutagen, since I find that it has good documentation.
See the mutagen documentation for installation steps.
Example code using mutagen:
>>> import mutagen
>>> mutagen.File("11. The Way It Is.ogg")
{'album': [u'Always Outnumbered, Never Outgunned'],
'title': [u'The Way It Is'], 'artist': [u'The Prodigy'],
'tracktotal': [u'12'], 'albumartist': [u'The Prodigy'],'date': [u'2004'],
'tracknumber': [u'11']}
>>> _.info.pprint()
u'Ogg Vorbis, 346.43 seconds, 499821 bps'
>>>
To change the title, access the dictionary key and assign a new value:
from mutagen.flac import FLAC
audio = FLAC("example.flac")
audio["title"] = u"An example"
audio.pprint()
audio.save()
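The examples above use Ogg and FLAC files, while the question is about an .mp4. As a rough sketch, assuming mutagen is installed and the video has already been downloaded to disk: mutagen's MP4 class uses iTunes-style atom names as keys (for example "\xa9nam" for the title) rather than names like "title". The helpers to_mp4_tags and apply_tags, and the friendly-name mapping, are hypothetical conveniences and not part of mutagen.

```python
# Friendly names mapped to the atom keys that mutagen.mp4.MP4 uses.
MP4_KEYS = {
    "title": "\xa9nam",
    "artist": "\xa9ART",
    "album": "\xa9alb",
    "comment": "\xa9cmt",
}


def to_mp4_tags(metadata):
    """Translate {"title": "..."} style metadata into MP4 tag keys.

    MP4 text tags hold a list of strings, so each value is wrapped in a list.
    """
    unknown = set(metadata) - set(MP4_KEYS)
    if unknown:
        raise KeyError("unsupported fields: %s" % ", ".join(sorted(unknown)))
    return {MP4_KEYS[name]: [value] for name, value in metadata.items()}


def apply_tags(path, metadata):
    """Write the tags into an MP4 file that already exists on disk."""
    from mutagen.mp4 import MP4  # imported lazily so the mapping works without mutagen

    video = MP4(path)
    for key, value in to_mp4_tags(metadata).items():
        video[key] = value
    video.save()


# Usage, after the download has been written to video.mp4:
# apply_tags("video.mp4", {"title": "Big Buck Bunny", "comment": "sample clip"})
```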
To build on AzyCrw4282's answer, mutagen can be used to do what you're looking for before saving the file.
The API docs for mutagen.File() state that it takes a filething, which is "a filename or file-like object". This means that you can buffer the download in memory, modify the metadata with mutagen, then save it to disk. Be aware that the entire binary response will be held in memory, which may cause issues depending on your available system resources.
from io import BytesIO
import requests
import mutagen

# URL taken from the question
url = 'https://www.sample-videos.com/video123/mp4/720/big_buck_bunny_720p_5mb.mp4'

with requests.get(url, stream=True) as r:
    r.raise_for_status()
    buf = BytesIO()
    # Collect the streamed response in an in-memory buffer
    for chunk in r.iter_content(chunk_size=8192):
        if chunk:
            buf.write(chunk)

buf.seek(0)
video = mutagen.File(buf)
# ... do your modifications

with open('/your/file/path.mp4', 'wb') as f:
    f.write(buf.getbuffer())
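If holding the whole response in memory is a concern, one alternative is to stream the chunks straight to a file on disk and then open that file with mutagen by name, which amounts to the download-then-edit approach from the earlier answer. A minimal sketch, where write_chunks is a hypothetical helper that accepts any iterable of byte chunks (such as r.iter_content(chunk_size=8192)):

```python
import os
import tempfile


def write_chunks(chunks, path):
    """Write an iterable of byte chunks to path and return the bytes written."""
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            if chunk:  # skip empty chunks
                f.write(chunk)
                total += len(chunk)
    return total


# Demo with in-memory chunks standing in for r.iter_content(chunk_size=8192).
demo_path = os.path.join(tempfile.mkdtemp(), "video.mp4")
written = write_chunks([b"abc", b"", b"defg"], demo_path)
print(written)  # 7

# With requests, the same helper would be used roughly like this:
# with requests.get(url, stream=True) as r:
#     r.raise_for_status()
#     write_chunks(r.iter_content(chunk_size=8192), "video.mp4")
# video = mutagen.File("video.mp4")
```

Only one chunk is in memory at a time, at the cost of touching the disk before the metadata is edited.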
I am trying to automate downloading a .Z file from a website, but the file I get is 2 KB when it should be around 700 KB, and it contains a list of the contents of the page (i.e., all the files available for download). I can download it manually without a problem. I have tried urllib and urllib2 in different configurations, but each does the same thing. The urlVar and fileName variables are generated in a different part of the code; I have given an example of each here to demonstrate.
import urllib2
urlVar = "ftp://www.ngs.noaa.gov/cors/rinex/2014/100/txga/txga1000.14d.Z"
fileName = "txga1000.14d.Z"
downFile = urllib2.urlopen(urlVar)
with open(fileName, "wb") as f:
    f.write(downFile.read())
At least the urllib2 documentation suggests you should use the Request object. This works for me:
import urllib2
req = urllib2.Request("ftp://www.ngs.noaa.gov/cors/rinex/2014/100/txga/txga1000.14d.Z")
response = urllib2.urlopen(req)
data = response.read()
Data length seems to be 740725.
I was able to download what seems like the correct size for your file with the following Python 2 code:
import urllib2
filename = "txga1000.14d.Z"
url = "ftp://www.ngs.noaa.gov/cors/rinex/2014/100/txga/{}".format(filename)
reply = urllib2.urlopen(url)
buf = reply.read()
with open(filename, "wb") as fh:
    fh.write(buf)
Edit: The post above mine was answered faster and is much better. I thought I'd post this anyway since I had already tested it and written it out.
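Both answers use urllib2, which exists only in Python 2. On Python 3 the same call lives in urllib.request. The following is a sketch rather than something verified against the NOAA server: the download helper and its chunk_size parameter are hypothetical, and the small demo uses a local file:// URL with a made-up payload so that it runs without network access.

```python
import os
import pathlib
import shutil
import tempfile
import urllib.request


def download(url, dest, chunk_size=64 * 1024):
    """Copy the resource at url to dest in binary mode; return its size in bytes."""
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out, chunk_size)
    return os.path.getsize(dest)


# Offline demo: read a local file through a file:// URL.
workdir = pathlib.Path(tempfile.mkdtemp())
source = workdir / "source.Z"
source.write_bytes(b"\x1f\x9d" + b"\x00" * 98)  # made-up 100-byte payload
size = download(source.as_uri(), str(workdir / "copy.Z"))
print(size)  # 100

# For the file in the question, the call would look like:
# download("ftp://www.ngs.noaa.gov/cors/rinex/2014/100/txga/txga1000.14d.Z",
#          "txga1000.14d.Z")
```

Comparing the returned size with the expected one (about 700 KB here) is a cheap check that the file, and not a directory listing, was saved.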
Can I save images to disk using python? An example of an image would be:
The easiest way is to use urllib.urlretrieve.
Python 2:
import urllib
urllib.urlretrieve('http://chart.apis.google.com/...', 'outfile.png')
Python 3:
import urllib.request
urllib.request.urlretrieve('http://chart.apis.google.com/...', 'outfile.png')
If your goal is to download a png to disk, you can do so with urllib:
import urllib
urladdy = "http://chart.apis.google.com/chart?chxl=1:|0|10|100|1%2C000|10%2C000|100%2C000|1%2C000%2C000|2:||Excretion+in+Nanograms+per+gram+creatinine+milliliter+(logarithmic+scale)|&chxp=1,0|2,0&chxr=0,0,12.1|1,0,3&chxs=0,676767,13.5,0,lt,676767|1,676767,13.5,0,l,676767&chxtc=0,-1000&chxt=y,x,x&chbh=a,1,0&chs=640x465&cht=bvs&chco=A2C180&chds=0,12.1&chd=t:0,0,0,0,0,0,0,0,0,1,0,0,3,2,4,6,6,9,3,6,5,11,9,10,6,2,2,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0&chdl=n=87&chtt=William+MD+-+Buprenorphine+Graph"
filename = r"c:\tmp\toto\file.png"
urllib.urlretrieve(urladdy, filename)
In Python 3, you will need to use urllib.request.urlretrieve instead of urllib.urlretrieve.
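On Python 3, urlretrieve also accepts an optional reporthook callback, called with the block number, block size, and total size, which can be handy for showing progress on larger downloads. The sketch below uses only the standard library; the fetch helper and the demo file are hypothetical, and a file:// URL stands in for the chart URL so that it runs offline. The Python documentation describes urlretrieve as a legacy interface, so urlopen may be the safer long-term choice.

```python
import pathlib
import tempfile
import urllib.request


def fetch(url, dest):
    """Download url to dest with urlretrieve, recording each progress callback."""
    calls = []

    def report(block_num, block_size, total_size):
        # total_size is -1 when the length of the resource is not known
        calls.append((block_num, block_size, total_size))

    filename, _headers = urllib.request.urlretrieve(url, dest, reporthook=report)
    return filename, calls


# Offline demo: a local file:// URL stands in for the image URL.
workdir = pathlib.Path(tempfile.mkdtemp())
source = workdir / "source.png"
source.write_bytes(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16)  # PNG signature plus filler
saved, progress = fetch(source.as_uri(), str(workdir / "outfile.png"))
```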
The Google chart API produces PNG files. Just retrieve them with urllib.urlopen(url).read() or something along these lines and save them to a file the usual way.
Full example:
>>> import urllib
>>> url = 'http://chart.apis.google.com/chart?chxl=1:|0|10|100|1%2C000|10%2C000|100%2C000|1%2C000%2C000|2:||Excretion+in+Nanograms+per+gram+creatinine+milliliter+(logarithmic+scale)|&chxp=1,0|2,0&chxr=0,0,12.1|1,0,3&chxs=0,676767,13.5,0,lt,676767|1,676767,13.5,0,l,676767&chxtc=0,-1000&chxt=y,x,x&chbh=a,1,0&chs=640x465&cht=bvs&chco=A2C180&chds=0,12.1&chd=t:0,0,0,0,0,0,0,0,0,1,0,0,3,2,4,6,6,9,3,6,5,11,9,10,6,2,2,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0&chdl=n=87&chtt=William+MD+-+Buprenorphine+Graph'
>>> image = urllib.urlopen(url).read()
>>> outfile = open('chart01.png','wb')
>>> outfile.write(image)
>>> outfile.close()
As noted in other examples, urllib.urlretrieve(url, outfilename) is even more straightforward, but playing with urllib and urllib2 will surely be instructive for you.