Save binary data fetched from a URL into a Django FileField - python

Is there any way to save binary data fetched from an external URL (in my case, an Excel file) into a Django FileField, with the file uploaded to the destination defined by the Django project settings?
class FileData(models.Model):
    excel_file = models.FileField(upload_to='excel_file_path')
import requests
url = 'https://www.example.getfile.com/file_id=234'
r = requests.get(url)
# How to store the binary data response to FileField?
Thanks for any help. Please let me know if further information is needed.

You can make use of django.core.files.uploadedfile.SimpleUploadedFile to save your content as a file field of your model instance.
>>> import requests
>>> from django.core.files.uploadedfile import SimpleUploadedFile
>>> response = requests.get("https://www.example.getfile.com/file_id=234")
>>> excel_file = SimpleUploadedFile("excel.xls", response.content, content_type="application/vnd.ms-excel")
>>> file_data = FileData(excel_file=excel_file)
>>> file_data.save()

Related

urllib: Get name of file from direct download link

Python 3. Probably need to use urllib to do this,
I need to know how to send a request to a direct download link, and get the name of the file it attempts to save.
(As an example, a KSP mod from CurseForge: https://kerbal.curseforge.com/projects/mechjeb/files/2355387/download)
Of course, the file ID (2355387) will be changed. It could be from any project, but always on CurseForge. (If that makes a difference on the way it's downloaded.)
That example link results in the file:
How can I return that file name in Python?
Edit: I should note that I want to avoid saving the file, reading the name, then deleting it if possible. That seems like the worst way to do this.
With urllib.request, the response object holds the URL that was actually downloaded, which is the final URL after any redirects.
>>> from urllib.request import urlopen
>>> url = 'https://kerbal.curseforge.com/projects/mechjeb/files/2355387/download'
>>> response = urlopen(url)
>>> response.url
'https://addons-origin.cursecdn.com/files/2355/387/MechJeb2-2.6.0.0.zip'
You can use os.path.basename to get the filename:
>>> from os.path import basename
>>> basename(response.url)
'MechJeb2-2.6.0.0.zip'
from urllib import request
url = 'file download link'
filename = request.urlopen(request.Request(url)).info().get_filename()
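The two answers read the name from different places: the final URL after redirects, and the Content-Disposition header. A minimal sketch that combines them, using only the standard library. The helper name `pick_filename` and the fallback order are assumptions for illustration, not something either answer prescribes, and the headers are built by hand so the example runs without network access.

```python
from email.message import Message
from os.path import basename
from urllib.parse import unquote, urlparse


def pick_filename(headers, final_url):
    """Prefer the Content-Disposition filename, else the last URL path segment.

    `headers` is an email.message.Message; the object returned by
    urllib's response.info() is a subclass of it.
    """
    name = headers.get_filename()
    if name:
        return name
    return unquote(basename(urlparse(final_url).path))


# Offline demonstration with hand-built headers (hypothetical values).
with_header = Message()
with_header["Content-Disposition"] = 'attachment; filename="MechJeb2-2.6.0.0.zip"'
without_header = Message()

name_from_header = pick_filename(with_header, "https://example.com/download")
name_from_url = pick_filename(
    without_header,
    "https://addons-origin.cursecdn.com/files/2355/387/MechJeb2-2.6.0.0.zip",
)
```

With a real request, `pick_filename(response.info(), response.url)` would be the equivalent call.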

How to use downloaded url content with Xsendfile (Django)

I want to use mod_xsendfile (which I've downloaded and installed) to save content from URLs (external pages) that I read with urllib and urllib2 into the variable one_download. I'm new to this and not sure how to configure some of the X-Sendfile properties properly. In the code below I assume that I can pass the urllib content in one_download directly to X-Sendfile, instead of taking the intermediate step of saving it to a .txt file and then passing that file to X-Sendfile.
import urllib2,urllib
def download_from_external_url(request):
    post_data = [('name','Dave'),]
    # example url
    #url = http://www.expressen.se/kronikorer/k-g-bergstrom/sexpartiuppgorelsen-rackte-inte--det-star-klart-nu/ - for example
    result = urllib2.urlopen(url, urllib.urlencode(post_data))
    print result
    one_download = result.read()
    # testprint content in one_download in shell
    print one_download
    # pass content in one_download, in dict c, to xsendfile
    c = {"one_download":one_download}
    c['Content-Disposition']= 'attachment; one_download=%s' %smart_str(one_download)
    c["X-Sendfile"] = one_download # <-- not working
    return HttpResponse(json.dumps(c),'one_download_index.html', mimetype='application/force-download')
That's not what X-Sendfile is for; it's for serving static files you already have on disk without having to go through Django. Since you're downloading the file dynamically, and it's in memory anyway, you might as well serve it directly.

Unable to open file object created in Django

I'm developing an application that runs on an Apache server with the Django framework. My current script works fine when it runs on the local desktop (without Django): it downloads all the images from a website to a folder on the desktop. However, when I run the script on the server, Django just creates a file object that apparently has something in it (it should be Google's logo), but I can't open the file. I also create an HTML file with updated image link locations, and that file gets created fine, I assume because it's all text. I believe I may have to use a file wrapper somewhere, but I'm not sure. Any help is appreciated. My code is below. Thanks!
from django.http import HttpResponse
from bs4 import BeautifulSoup as bsoup
import urlparse
from urllib2 import urlopen
from urllib import urlretrieve
import os
import sys
import zipfile
from django.core.servers.basehttp import FileWrapper
def getdata(request):
    out = 'C:\Users\user\Desktop\images'
    if request.GET.get('q'):
        #url = str(request.GET['q'])
        url = "http://google.com"
        soup = bsoup(urlopen(url))
        parsedURL = list(urlparse.urlparse(url))
        for image in soup.findAll("img"):
            print "Old Image Path: %(src)s" % image
            #Get file name
            filename = image["src"].split("/")[-1]
            #Get full path name if url has to be parsed
            parsedURL[2] = image["src"]
            image["src"] = '%s\%s' % (out,filename)
            print 'New Path: %s' % image["src"]
            # print image
            outpath = os.path.join(out, filename)
            #retrieve images
            if image["src"].lower().startswith("http"):
                urlretrieve(image["src"], outpath)
            else:
                urlretrieve(urlparse.urlunparse(parsedURL), out) #Constructs URL from tuple (parsedURL)
        #Create HTML File and writes to it to check output (stored in same directory).
        html = soup.prettify("utf-8")
        with open("output.html", "wb") as file:
            file.write(html)
    else:
        url = 'You submitted nothing!'
    return HttpResponse(url)
My problem had to do with storing the files on the desktop. I stored the files in the Django workspace folder, changed the paths, and it worked for me.
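Separately from where the files are stored, the question rebuilds image URLs by hand from a `urlparse` tuple. A minimal Python 3 sketch of that step using `urllib.parse.urljoin`, which resolves absolute, root-relative and scheme-relative `src` values against the page URL. The helper name `resolve_image_url` and the sample paths are hypothetical.

```python
from urllib.parse import urljoin


def resolve_image_url(page_url, src):
    """Turn an <img src> value into an absolute URL relative to the page."""
    return urljoin(page_url, src)


page = "http://google.com/"
# Already absolute: returned unchanged.
absolute = resolve_image_url(page, "http://example.com/a.png")
# Root-relative: host is taken from the page URL.
root_relative = resolve_image_url(page, "/images/logo.png")
# Scheme-relative: only the scheme is taken from the page URL.
scheme_relative = resolve_image_url(page, "//cdn.example.com/logo.png")
```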

Download a remote image and save it to a Django model

I am writing a Django app which will fetch all images of particular URL and save them in the database.
But I can't figure out how to use ImageField in Django.
Settings.py
MEDIA_ROOT = os.path.join(PWD, "../downloads/")
# URL that handles the media served from MEDIA_ROOT. Make sure to use a
# trailing slash.
# Examples: "http://example.com/media/", "http://media.example.com/"
MEDIA_URL = '/downloads/'
models.py
class images_data(models.Model):
    image_id = models.IntegerField()
    source_id = models.IntegerField()
    image = models.ImageField(upload_to='images', null=True, blank=True)
    text_ind = models.NullBooleanField()
    prob = models.FloatField()
download_img.py
def spider(site):
    PWD = os.path.dirname(os.path.realpath(__file__ ))
    #site="http://en.wikipedia.org/wiki/Pune"
    hdr= {'User-Agent': 'Mozilla/5.0'}
    outfolder=os.path.join(PWD, "../downloads")
    #outfolder="/home/mayank/Desktop/dreamport/downloads"
    print "MAYANK:"+outfolder
    req = urllib2.Request(site,headers=hdr)
    page = urllib2.urlopen(req)
    soup =bs(page)
    tag_image=soup.findAll("img")
    count=1;
    for image in tag_image:
        print "Image: %(src)s" % image
        filename = image["src"].split("/")[-1]
        outpath = os.path.join(outfolder, filename)
        urlretrieve('http:'+image["src"], outpath)
        im = img(image_id=count,source_id=1,image=outpath,text_ind=None,prob=0)
        im.save()
        count=count+1
I am calling download_imgs.py inside one view like
if form.is_valid():
    url = form.cleaned_data['url']
    spider(url)
The Django documentation is always a good place to start.
class ModelWithImage(models.Model):
    image = models.ImageField(
        upload_to='images',
    )
UPDATED
So this script works. The steps are:
1. Loop over the images to download
2. Download the image
3. Save it to a temp file
4. Apply it to the model
5. Save the model
import requests
import tempfile
from django.core import files
# List of images to download
image_urls = [
    'http://i.thegrindstone.com/wp-content/uploads/2013/01/how-to-get-awesome-back.jpg',
]
for image_url in image_urls:
    # Stream the image from the url
    response = requests.get(image_url, stream=True)
    # Was the request OK?
    if response.status_code != requests.codes.ok:
        # Nope, error handling, skip file etc etc etc
        continue
    # Get the filename from the url, used for saving later
    file_name = image_url.split('/')[-1]
    # Create a temporary file
    lf = tempfile.NamedTemporaryFile()
    # Read the streamed image in sections
    for block in response.iter_content(1024 * 8):
        # If no more file then stop
        if not block:
            break
        # Write image block to temporary file
        lf.write(block)
    # Create the model you want to save the image to
    image = Image()
    # Save the temporary image to the model
    # This saves the model so be sure that it is valid
    image.image.save(file_name, files.File(lf))
Some reference links:
requests - "HTTP for Humans", I prefer this to urllib2
tempfile - Save to a temporary file rather than a permanent file on disk
Django filefield save
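The download-to-temp-file steps can be sketched without Django or the network. This is a minimal illustration, assuming the chunks come from something like `response.iter_content(...)`; the helper name `spool_chunks` and the sample bytes are hypothetical. It rewinds the file before returning so that a later reader starts at the first byte.

```python
import tempfile


def spool_chunks(chunks):
    """Write an iterable of byte chunks to a temporary file and rewind it."""
    tmp = tempfile.NamedTemporaryFile()
    for chunk in chunks:
        if not chunk:  # skip empty chunks
            continue
        tmp.write(chunk)
    tmp.flush()  # push buffered bytes out of Python's buffer
    tmp.seek(0)  # rewind so the next reader starts at the beginning
    return tmp


# Stand-in for response.iter_content(1024 * 8)
fake_chunks = [b"\x89PNG", b"", b"-rest-of-image"]
with spool_chunks(fake_chunks) as tmp:
    data = tmp.read()
```

The returned object could then be wrapped in `files.File(...)` and passed to the field's `save()` as in the answer above.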
If you want to save downloaded images without saving them to disk first (without using NamedTemporaryFile etc) then there's an easy way to do that.
This will be slightly quicker than downloading the file and writing it to disk as it is all done in memory. Note that this example is written for Python 3 - the process is similar in Python 2 but slightly different.
from django.core import files
from io import BytesIO
import requests
url = "https://example.com/image.jpg"
resp = requests.get(url)
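if resp.status_code != requests.codes.ok:
    # Error handling here (raising is only a placeholder)
    raise RuntimeError("Download failed with HTTP status %s" % resp.status_code)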
fp = BytesIO()
fp.write(resp.content)
file_name = url.split("/")[-1] # There's probably a better way of doing this but this is just a quick example
your_model.image_field.save(file_name, files.File(fp))
Where your_model is an instance of the model you'd like to save to and .image_field is the name of the ImageField.
See the documentation for io for more info.
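On the "better way" of getting the file name: `url.split("/")[-1]` keeps any query string or fragment in the name. A minimal sketch using only the standard library; the helper name `filename_from_url`, the `default` fallback and the sample URLs are hypothetical.

```python
from posixpath import basename
from urllib.parse import unquote, urlparse


def filename_from_url(url, default="download"):
    """Return the last path segment of a URL, ignoring query and fragment."""
    name = unquote(basename(urlparse(url).path))
    return name or default


plain = filename_from_url("https://example.com/image.jpg")
with_query = filename_from_url("https://example.com/img/photo%201.jpg?size=large#top")
no_path = filename_from_url("https://example.com/")
```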
# this is my solution
from django.core import files
from django.core.files.base import ContentFile
import requests
from .models import MyModel
def download_img():
    r = requests.get("remote_file_url", allow_redirects=True)
    filename = "remote_file_url".split("/")[-1]
    my_model = MyModel(
        file=files.File(ContentFile(r.content), filename)
    )
    my_model.save()
    return
As an example of what I think you're asking:
In forms.py:
imgfile = forms.ImageField(label = 'Choose your image', help_text = 'The image should be cool.')
In models.py:
imgfile = models.ImageField(upload_to='images/%m/%d')
So there will be a POST request from the user (when the user completes the form). That request will contain basically a dictionary of data. The dictionary holds the submitted files. To focus the request on the file from the field (in our case, an ImageField), you would use:
request.FILES['imgfile']
You would use that when you construct the model object (instantiating your model class):
newPic = ImageModel(imgfile = request.FILES['imgfile'])
To save that the simple way, you'd just use the save() method bestowed upon your object (because Django is that awesome):
if form.is_valid():
    newPic = Pic(imgfile = request.FILES['imgfile'])
    newPic.save()
Your image will be stored, by default, to the directory you indicate for MEDIA_ROOT in settings.py.
Accessing the image in the template:
<img src="{{ MEDIA_URL }}{{ image.imgfile.name }}"></img>
The urls can be tricky, but here's a basic example of a simple url pattern to call the stored images:
urlpatterns += patterns('',
    url(r'^media/(?P<path>.*)$', 'django.views.static.serve', {
        'document_root': settings.MEDIA_ROOT,
    }),
)
I hope it helps.
Similar to @boltsfrombluesky's answer, you can do this in Python 3 without any external dependencies like so:
from os.path import basename
import urllib.request
from urllib.parse import urlparse
import tempfile
from django.core.files.base import File
def handle_upload_url_file(url, obj):
    img_temp = tempfile.NamedTemporaryFile(delete=True)
    req = urllib.request.Request(
        url, data=None,
        headers={
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'
        }
    )
    with urllib.request.urlopen(req) as response:
        img_temp.write(response.read())
    img_temp.flush()
    filename = basename(urlparse(url).path)
    result = obj.image.save(filename, File(img_temp))
    img_temp.close()
    return result
Try doing it this way instead of assigning path to the image...
import urllib2
from django.core.files.temp import NamedTemporaryFile
def handle_upload_url_file(url):
    img_temp = NamedTemporaryFile()
    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20120427 Firefox/15.0a1')]
    img_temp.write(opener.open(url).read())
    img_temp.flush()
    return img_temp
Use the above function like this:
new_image = images_data()
#rest of the data in new_image and then do this.
new_image.image.save(slug_filename,File(handle_upload_url_file(url)))
#here slug_filename is just filename that you want to save the file with.
If you are saving the image by overriding the model's save method to change the file name, and are struggling with random invalid file names in Django (like me), you can follow the code below (copied from the accepted answer):
lf = tempfile.NamedTemporaryFile()
for block in response.iter_content(1024*8):
    if not block:
        break
    lf.write(block)
lf.name = name  # Set your custom file name here
dc = ImageFile(file=files.File(lf))
dc.file.save()
I have configured my storage with django-storages in order to upload media content directly to S3. For some reason I wasn't able to replace the file name. After some R&D it worked.
Note: I have used FileField in the model, hence a few lines of code are not needed.
def qrcodesave(request):
    import urllib2
    url = "http://chart.apis.google.com/chart?cht=qr&chs=300x300&chl=s&chld=H|0"
    opener = urllib2.urlopen(url)
    mimetype = "application/octet-stream"
    response = HttpResponse(opener.read(), mimetype=mimetype)
    response["Content-Disposition"]= "attachment; filename=aktel.png"
    return response

How to read Excel files from a stream (not a disk-backed file) in Python?

XLRD is installed and tested:
>>> import xlrd
>>> workbook = xlrd.open_workbook('Sample.xls')
When I read the file through html form like below, I'm able to access all the values.
xls_file = request.params['xls_file']
print xls_file.filename, xls_file.type
I'm using Pylons module, request comes from: from pylons import request, tmpl_context as c
My questions:
Is xls_file read through request.params an object?
How can I read xls_file and make it work with xlrd?
Update:
The xls_file is uploaded to the web server, but the xlrd library expects a filename instead of an open file object. How can I make the uploaded file work with xlrd? (Thanks to Martijn Pieters; I was unable to formulate the question clearly.)
xlrd does support providing data directly without a filepath, just use the file_contents argument:
xlrd.open_workbook(file_contents=fileobj.read())
From the documentation:
file_contents – A string or an mmap.mmap object or some other behave-alike object. If file_contents is supplied, filename will not be used, except (possibly) in messages.
My situation is not exactly the same as the question's, but I think it is similar, so I can give some hints.
I am using Django REST framework's request instead of the Pylons request.
If I write simple code like the following:
@api_view(['POST'])
@renderer_classes([JSONRenderer])
def upload_files(request):
    file_obj = request.FILES['file']
    from xlrd import open_workbook
    wb = open_workbook(file_contents=file_obj.read())
    result = {"code": "0", "message": "success", "data": {}}
    return Response(status=200, data=result)
Here we can read the file using open_workbook(file_contents=file_obj.read()), as mentioned in the previous answer.
But if you write the code in the following way:
from rest_framework.views import APIView
from rest_framework.parsers import MultiPartParser
class FileUploadView(APIView):
    parser_classes = (MultiPartParser,)
    def put(self, request, filename, format=None):
        file_obj = request.FILES.get('file')
        from xlrd import open_workbook
        wb = open_workbook(file_contents=file_obj.read())
        # do some stuff with uploaded file
        return Response(status=204)
Note that you must use MultiPartParser rather than FileUploadParser here; using FileUploadParser raises a BOF error.
So I suspect the behaviour is also affected by how you write the API.
For me this code works. Python 3
xlrd.open_workbook(file_contents=fileobj.content)
You could try something like...
import xlrd
def newopen(fileobject, modes):
    return fileobject
oldopen = __builtins__.open
__builtins__.open = newopen
InputWorkBook = xlrd.open_workbook(fileobject)
__builtins__.open = oldopen
You may have to wrap the fileobject in StringIO if it isn't already a file handle.
