I use this code to download an image to a directory:
import urllib.request
URL = ...
FILENAME = r'c:\networks\work\image1.jpg'
with urllib.request.urlopen(URL) as response:
    image = response.read()
with open(FILENAME, 'wb') as output_file:
    output_file.write(image)
It works for every image, but when I try to download this kind of URL:
https://data.cyber.org.il/python/logpuzzle/p-bbjb-bbia.jpg
the download takes about 3 minutes, which is very slow. Why does this happen?
I'm a Python beginner. I have a dataset column that contains thousands of URLs, and I want to save the image at each URL with its extension. With urllib or requests I have no problem when the URL ends with the image extension, like https://web.archive.org/web/20170628093753im_/http://politicot.com/wp-content/uploads/2016/12/Sean-Spicer.jpg.
However, for URLs like link1 = https://nypost.com/wp-content/uploads/sites/2/2017/11/171106-texas-shooter-church-index.jpg?quality=90&strip=all&w=1200 or link2 = https://i2.wp.com/www.huzlers.com/wp-content/uploads/2017/03/maxresdefault.jpeg?fit=1280%2C720&ssl=1, I failed to save them.
I want to save the images from these links as image1.jpg and image2.jpeg. How can I do this?
Any help would be appreciated.
The following seems to work for me, give it a try:
import requests
urls = ['https://nypost.com/wp-content/uploads/sites/2/2017/11/171106-texas-shooter-church-index.jpg?quality=90&strip=all&w=1200',
        'https://i2.wp.com/www.huzlers.com/wp-content/uploads/2017/03/maxresdefault.jpeg?fit=1280%2C720&ssl=1']
for i, url in enumerate(urls):
    r = requests.get(url)
    filename = 'image{0}.jpg'.format(i+1)
    with open(filename, 'wb') as f:
        f.write(r.content)
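The loop above saves every file with a .jpg suffix, so the second image would not get the .jpeg name the question asks for. One way to keep the original extension is to read it from the path part of the URL, ignoring the query string. This is only a sketch: numbered_filename and default_ext are names made up for illustration, and it assumes the URL path actually ends in the image extension.

```python
import os
from urllib.parse import urlsplit


def numbered_filename(url, index, default_ext=".jpg"):
    """Build 'image<index><ext>' using the extension found in the URL path."""
    path = urlsplit(url).path        # drops '?quality=90&...' and any '#fragment'
    ext = os.path.splitext(path)[1]  # '.jpg', '.jpeg', ... or '' if there is none
    return "image{}{}".format(index, ext or default_ext)


urls = [
    "https://nypost.com/wp-content/uploads/sites/2/2017/11/171106-texas-shooter-church-index.jpg?quality=90&strip=all&w=1200",
    "https://i2.wp.com/www.huzlers.com/wp-content/uploads/2017/03/maxresdefault.jpeg?fit=1280%2C720&ssl=1",
]
names = [numbered_filename(url, i) for i, url in enumerate(urls, start=1)]
print(names)  # ['image1.jpg', 'image2.jpeg']
```

In the download loop, the result of numbered_filename(url, i+1) could replace the hard-coded 'image{0}.jpg' name.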
I am trying to download an image from an instagram media URL:
https://instagram.fybz2-1.fna.fbcdn.net/v/t51.2885-15/fr/e15/p1080x1080/106602453_613520712600632_6255422472318530180_n.jpg?_nc_ht=instagram.fybz2-1.fna.fbcdn.net&_nc_cat=108&_nc_ohc=WQizf6rhDmQAX883HrQ&oh=140f221889178fd03bf654cf18a9d9a2&oe=5F4D2AFE
Pasting this into my browser brings up the image, but when I run the code below I get the following error, which I suspect is caused by the URL containing a query string (running the same code on a simple URL ending in .jpg works without issue):
  File "C:/Users/19053/InstagramImageDownloader/downloadImage.py", line 18, in <module>
    with open(filename, 'wb') as f:
OSError: [Errno 22] Invalid argument: '106602453_613520712600632_6255422472318530180_n.jpg?_nc_ht=instagram.fybz2-1.fna.fbcdn.net&_nc_cat=108&_nc_ohc=WQizf6rhDmQAX883HrQ&oh=140f221889178fd03bf654cf18a9d9a2&oe=5F4D2AFE'
Full code as follows:
## Importing Necessary Modules
import requests # to get image from the web
import shutil # to save it locally
## Set up the image URL and filename
image_url = "https://instagram.fybz2-1.fna.fbcdn.net/v/t51.2885-15/fr/e15/p1080x1080/106602453_613520712600632_6255422472318530180_n.jpg?_nc_ht=instagram.fybz2-1.fna.fbcdn.net&_nc_cat=108&_nc_ohc=WQizf6rhDmQAX883HrQ&oh=140f221889178fd03bf654cf18a9d9a2&oe=5F4D2AFE"
filename = image_url.split("/")[-1]
# Open the url image, set stream to True, this will return the stream content.
r = requests.get(image_url, stream=True)
# Check if the image was retrieved successfully
if r.status_code == 200:
    # Set decode_content to True, otherwise the downloaded image file's size will be zero.
    r.raw.decode_content = True
    # Open a local file with wb (write binary) permission.
    with open(filename, 'wb') as f:
        shutil.copyfileobj(r.raw, f)
    print('Image successfully downloaded: ', filename)
else:
    print('Image couldn\'t be retrieved')
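The problem is with the filename: it still contains the query string, and characters such as ? are not allowed in Windows file names. Split on ? first and take the first element, then split on / and take the last element: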
import requests # to get image from the web
import shutil # to save it locally
## Set up the image URL and filename
image_url = "https://instagram.fybz2-1.fna.fbcdn.net/v/t51.2885-15/fr/e15/p1080x1080/106602453_613520712600632_6255422472318530180_n.jpg?_nc_ht=instagram.fybz2-1.fna.fbcdn.net&_nc_cat=108&_nc_ohc=WQizf6rhDmQAX883HrQ&oh=140f221889178fd03bf654cf18a9d9a2&oe=5F4D2AFE"
filename = image_url.split("?")[0].split("/")[-1]
# Open the url image, set stream to True, this will return the stream content.
r = requests.get(image_url, stream=True)
# Check if the image was retrieved successfully
if r.status_code == 200:
    # Set decode_content to True, otherwise the downloaded image file's size will be zero.
    r.raw.decode_content = True
    # Open a local file with wb (write binary) permission.
    with open(filename, 'wb') as f:
        shutil.copyfileobj(r.raw, f)
    print('Image successfully downloaded: ', filename)
else:
    print('Image couldn\'t be retrieved')
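As an alternative to chaining split calls, the standard library can separate the path from the query string. The sketch below is not part of the original answer; filename_from_url and its fallback parameter are hypothetical names. It also drops a #fragment and decodes percent-escapes such as %20, which the split approach leaves in place.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def filename_from_url(url, fallback="download"):
    """Return the last path segment of a URL, without query string or fragment."""
    path = urlsplit(url).path                 # everything between the host and '?' / '#'
    name = posixpath.basename(unquote(path))  # decode first, then keep the last segment
    return name or fallback                   # e.g. a URL ending in '/' has no file name


image_url = "https://instagram.fybz2-1.fna.fbcdn.net/v/t51.2885-15/fr/e15/p1080x1080/106602453_613520712600632_6255422472318530180_n.jpg?_nc_ht=instagram.fybz2-1.fna.fbcdn.net&_nc_cat=108&_nc_ohc=WQizf6rhDmQAX883HrQ&oh=140f221889178fd03bf654cf18a9d9a2&oe=5F4D2AFE"
filename = filename_from_url(image_url)
print(filename)  # 106602453_613520712600632_6255422472318530180_n.jpg
```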
I have 1,000 image URLs in a CSV file and I am trying to download all of the images, but not all of them get downloaded and I don't know why. Here is my code:
import time
import requests
from tqdm import tqdm
# dataset is assumed to be the already-loaded CSV with a 'url' column
print('Beginning file download with requests')
path = '/home/tt/image_scrap/image2'
for idx, url in tqdm(enumerate(dataset['url']), total=len(dataset['url'])):
    response = requests.get(url,stream=True)
    time.sleep(2)
    filename = url.split("/")[-1]
    with open(path+'/'+filename, 'wb') as f:
        f.write(response.content)
Try/except statements are really good for this type of error.
Try this:
try:
    with open(path+'/'+filename, 'wb') as f:
        f.write(response.content)
except Exception as error:
    print(error)
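Building on the try/except idea, it may help to wrap the request as well as the file write, and to record which URLs failed so they can be inspected or retried afterwards. The sketch below is an assumption-laden illustration, not the original poster's code: download_all, dest_dir and timeout are hypothetical names, and it uses the standard library's urllib.request instead of requests so that it runs without extra packages.

```python
import os
import urllib.request
from urllib.parse import urlsplit


def download_all(urls, dest_dir, timeout=30):
    """Try every URL; return (saved_paths, failures) instead of stopping at the first error."""
    os.makedirs(dest_dir, exist_ok=True)
    saved, failures = [], []
    for idx, url in enumerate(urls):
        # Name the file from the URL path only, so a '?query' never ends up in the name.
        name = os.path.basename(urlsplit(url).path) or "file{}".format(idx)
        target = os.path.join(dest_dir, name)
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                data = response.read()
            with open(target, "wb") as f:
                f.write(data)
            saved.append(target)
        except (OSError, ValueError) as error:
            # URLError, HTTPError and timeouts are OSError subclasses;
            # ValueError covers malformed URLs such as a missing scheme.
            failures.append((url, repr(error)))
    return saved, failures
```

Calling saved, failures = download_all(dataset['url'], path) and printing failures would show exactly which URLs were skipped and why.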
I have a lot of URLs pointing to .docx and .pdf files. I want to run a Python script that downloads each file from its URL and saves it in a folder. Here is what I've done for a single file; I'll put it in a for loop later:
response = requests.get('http://website.com/Motivation-Letter.docx')
with open("my_file.docx", 'wb') as f:
    f.write(response.content)
But the my_file.docx it saves is only 266 bytes and is corrupt, even though the URL is fine.
UPDATE:
I added this code and it works, but I want to save the file in a new folder.
import os
import shutil
import requests
def download_file(url, folder_name):
    local_filename = url.split('/')[-1]
    path = os.path.join("/{}/{}".format(folder_name, local_filename))
    with requests.get(url, stream=True) as r:
        with open(path, 'wb') as f:
            shutil.copyfileobj(r.raw, f)
    return local_filename
Try using the stream option:
import os
import requests
def download(url: str, dest_folder: str):
    if not os.path.exists(dest_folder):
        os.makedirs(dest_folder)  # create folder if it does not exist
    filename = url.split('/')[-1].replace(" ", "_")  # be careful with file names
    file_path = os.path.join(dest_folder, filename)
    r = requests.get(url, stream=True)
    if r.ok:
        print("saving to", os.path.abspath(file_path))
        with open(file_path, 'wb') as f:
            for chunk in r.iter_content(chunk_size=1024 * 8):
                if chunk:
                    f.write(chunk)
                    f.flush()
                    os.fsync(f.fileno())
    else:  # HTTP status code 4XX/5XX
        print("Download failed: status code {}\n{}".format(r.status_code, r.text))
download("http://website.com/Motivation-Letter.docx", dest_folder="mydir")
Note that mydir in the example above is the name of a folder in the current working directory. If mydir does not exist, the script will create it in the current working directory and save the file in it. Your user must have permission to create directories and files there.
You can also pass an absolute path in dest_folder, but check permissions first.
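To see where a relative dest_folder such as mydir ends up, the folder handling can be pulled into a small helper. This is a sketch only; ensure_folder is a hypothetical name. It uses os.makedirs with exist_ok=True, which creates the folder if needed and does nothing if it is already there, so the separate os.path.exists check is not required.

```python
import os


def ensure_folder(dest_folder):
    """Create dest_folder (and any missing parents) and return its absolute path."""
    os.makedirs(dest_folder, exist_ok=True)  # no error if the folder already exists
    return os.path.abspath(dest_folder)      # relative names resolve against os.getcwd()
```

For example, print(ensure_folder("mydir")) shows the full path under the current working directory, while an absolute path is returned unchanged.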
P.S.: avoid asking multiple questions in one post
Try this:
import urllib.request
urllib.request.urlretrieve(url, filename)
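Note that urlretrieve only writes the file; it does not create a missing folder, so the target directory has to exist first. A minimal sketch of saving into a new folder this way, where retrieve_to_folder and its parameters are hypothetical names chosen for illustration:

```python
import os
import urllib.request


def retrieve_to_folder(url, folder, filename):
    """Download url to folder/filename with urlretrieve, creating the folder if needed."""
    os.makedirs(folder, exist_ok=True)
    target = os.path.join(folder, filename)
    # urlretrieve returns a (local_path, headers) tuple.
    saved_path, _headers = urllib.request.urlretrieve(url, target)
    return saved_path
```

The Python documentation lists urlretrieve under its legacy interface, so for new code urlopen or requests may be the safer long-term choice.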
I'm using Python 2.7 and PyCharm is my editor. I want Python to go to a site, download an image from it, and save it to my directory. Currently I get no errors, but I don't think it is downloading, because the file does not show up in my directory.
import random
import urllib2
def download_web_image(url):
    name = random.randrange(1,1000)
    full_name = str(name) + ".jpg"
    urllib2.Request(url, full_name)
download_web_image("www.example.com/page1/picture.jpg")
This will do the trick. The rest can stay the same, just edit your function to include the two lines I have added.
def download_web_image(url):
    name = random.randrange(1,1000)
    full_name = str(name) + ".jpg"
    request = urllib2.Request(url)
    img = urllib2.urlopen(request).read()
    with open(full_name, 'wb') as f:  # binary mode, so the image bytes are written unchanged
        f.write(img)
Edit 1:
Exact code as requested in comments.
import urllib2
def download_web_image(url):
    request = urllib2.Request(url)
    img = urllib2.urlopen(request).read()
    with open('test.jpg', 'wb') as f:  # binary mode, so the image bytes are written unchanged
        f.write(img)
download_web_image("http://upload.wikimedia.org/wikipedia/commons/8/8c/JPEG_example_JPG_RIP_025.jpg")
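For readers on Python 3, where urllib2 no longer exists and its functionality lives in urllib.request, the same function could look roughly like this. It is a sketch rather than the original answer's code, and the full_name parameter is an addition so that the caller chooses the file name.

```python
import urllib.request


def download_web_image(url, full_name):
    """Python 3 counterpart of the urllib2 version: fetch the bytes, then write them."""
    with urllib.request.urlopen(url) as response:
        img = response.read()
    with open(full_name, "wb") as f:  # 'wb' because the image data is bytes, not text
        f.write(img)
    return full_name
```

As in the Python 2 version, the URL must include its scheme (http:// or https://); otherwise urlopen raises a ValueError.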
You are only creating a Request object; nothing is downloaded until the request is actually opened. Try the following instead:
urllib.urlretrieve(url, os.path.join(os.getcwd(), full_name)) # download and save image
Or try the requests library:
import requests
image = requests.get("http://www.example.com/page1/picture.jpg")  # requests needs the scheme (http://)
with open('picture.jpg', 'wb') as f:
    f.write(image.content)