Can't open downloaded .mp4 files downloaded with requests and urllib - python

I've downloaded some files using requests
url = 'https://www.youtube.com/watch?v=gp5tziO5lXg&feature=youtu.be'
video_name = url.split('/')[-1]
print("Downloading file:%s" % video_name)
# download the url contents in binary format
r = requests.get(url)
# open method to open a file on your system and write the contents
with open('saved.mp4', 'wb') as f:
f.write(r.content)
and using urllib.requests
url = 'https://www.youtube.com/watch?v=gp5tziO5lXg&feature=youtu.be'
video_name = url.split('/')[-1]
print("Downloading file:%s" % video_name)
# Copy a network object to a local file
urllib.request.urlretrieve(url, "saved2.mp4")
When I then try to open the .mp4 file I get the following error
Cannot play
This file cannot be played. This can happen because the file type is
not supported, the file extension is incorrect or the file is
corrupted.
0xc00d36c4
If I test it with pytube it works fine.
What's wrong with the other methods?

To answer your question, with the other methods it is not downloading the video but the page. What you may be obtaining is an html file with an mp4 file extension.
Therefore, it gives that error when trying to open the file.
If pytube works for what you need, I would suggest using that one.
If you want to download videos from other platforms, you might consider youtube-dl.

Hello you can import IPython.display for audio diplay
import IPython.display as ipd
ipd.Audio(video_name)
regards
I hope I can have solved your problem

Related

Download file using requests python

I have a list of Url of files that open as Download Dialogue box with an option to save and open.
I'm using the python requests module to download the files. While using Python IDLE I'm able to download the file with the below code.
link = fileurl
r = requests.get(link,allow_redirects=True)
with open ("a.torrent",'wb') as code:
code.write(r.content)
But when I use this code along with for loop, the file which gets downloaded is corrupted or says unable to open.
for link in links:
name = str(links.index(link)) ++ ".torrent"
r = requests.get(link,allow_redirects=True)
with open (name,'wb') as code:
code.write(r.content)
If you are trying to download a video from a website, try
r = get(video_link, stream=True)
with open ("a.mp4",'wb') as code:
for chunk in r.iter_content(chunk_size=1024):
file.write(chunk)

How to download Flickr images using photos url (does not contain .jpg, .png, etc.) using Python

I want to download image from Flickr using following type of links using Python:
https://www.flickr.com/photos/66176388#N00/2172469872/
https://www.flickr.com/photos/clairity/798067744/
This data is obtained from xml file given at https://snap.stanford.edu/data/web-flickr.html
Is there any Python script or way to download images automatically.
Thanks.
I try to find answer from other sources and compiled the answer as follows:
import re
from urllib import request
def download(url, save_name):
html = request.urlopen(url).read()
html=html.decode('utf-8')
img_url = re.findall(r'https:[^" \\:]*_b\.jpg', html)[0]
print(img_url)
with open(save_name, "wb") as fp:
fp.write(request.urlopen(img_url).read())
download('https://www.flickr.com/photos/clairity/798067744/sizes/l/', 'image.jpg')

python : wget module downloading file without any extension

I am writing small python code to download a file from follow link and retrieve original filename
and its extension.But I have come across one such follow link for which python downloads the file but it is without any extension whereas file has .txt extension when downloads using browser.
Below is the code I am trying :
from urllib.request import urlopen
from urllib.parse import unquote
import wget
filePath = 'D:\\folder_path'
followLink = 'http://example.com/Reports/Download/c4feb46c-8758-4266-bec6-12358'
response = urlopen(followLink)
if response.code == 200:
print('Follow Link(response url) :' + response.url)
print('\n')
unquote_url = unquote(response.url)
file_name = wget.detect_filename(response.url).replace('|', '_')
print('file_name - '+file_name)
wget.download(response.url,filePa
th)
file_name variable in above code is just giving 'c4feb46c-8758-4266-bec6-12358' as filename.
Where I want to download it as c4feb46c-8758-4266-bec6-12358.txt.
I have also tried to read file name from header i.e. response.info(). But not getting proper file name.
Anyone can please help me with this.I am stucked in my work.Thanks in advance.
Wget gets the filename from the URL itself. For example, if your URL was https://someurl.com/filename.pdf, it is saved as filename.pdf. If it was https://someurl.com/filename, it is saved as filename. Since wget.download returns the filename of the downloaded file, you can rename it to any extension you want with os.rename(filename, filename+'.<extension>').

Taking list of mp4 URLs and downloading file python

How would I go about navigating to a URL that's stored in a list and downloading the file? I'd preferably like to be able to store the MP4 file as it's clip title. I've used requests to retrieve the urls.
Thanks
list_clips = ['https://clips.twitch.tv/SpeedySneakyHeronKappaClaus', 'https://clips.twitch.tv/SplendidGiantPuffinThunBeast', 'https://clips.twitch.tv/ArtsyAuspiciousHamburgerThisIsSparta', 'https://clips.twitch.tv/BoringNiceHerbsSaltBae']
You can use python's requests module to download the file. Please refer the code below
import requests, os
for clips in list_clips:
clip_title = os.path.basename(clips)
r = requests.get(clips)
with open(clip_title+'.mp4', 'wb') as f:
f.write(r.content)

Download file using python without knowing its extension. - content-type - stream

Hey i just did some research and found that i could download images from urls which end with filename.extension like 000000.jpeg. i wonder now how i could downoad a picture which doesnt have any extension.
Here is my url which i want to download the image http://books.google.com/books/content?id=i2xKGwAACAAJ&printsec=frontcover&img=1&zoom=1&source=gbs_api
when i put the url directly to the browser it displays an image
furthermore here is what i tried:
from six.moves import urllib
thumbnail='http://books.google.com/books/content?id=i2xKGwAACAAJ&printsec=frontcover&img=1&zoom=1&source=gbs_api'
img=urllib.request.Request(thumbnail)
pic=urllib.request.urlopen(img)
pic=urllib.request.urlopen(img).read()
Anyhelp will be appreciated so much
This is a way to do it using HTTP response headers :
import requests
import time
r = requests.get("http://books.google.com/books/content?id=i2xKGwAACAAJ&printsec=frontcover&img=1&zoom=1&source=gbs_api", stream=True)
ext = r.headers['content-type'].split('/')[-1] # converts response headers mime type to an extension (may not work with everything)
with open("%s.%s" % (time.time(), ext), 'wb') as f: # open the file to write as binary - replace 'wb' with 'w' for text files
for chunk in r.iter_content(1024): # iterate on stream using 1KB packets
f.write(chunk) # write the file

Categories

Resources