I use pdfkit and wkhtmltopdf to generate PDF documents. When I generate the first PDF all is well. When I quickly (within 5 seconds) generate another, I get the error [Errno 9] Bad file descriptor. If I close the error (step back in the browser) and open again, it will create the PDF.
My views.py:
from django.http import HttpResponse
import pdfkit

# point pdfkit at the wkhtmltopdf binary
config = pdfkit.configuration(wkhtmltopdf='C:/wkhtmltopdf/bin/wkhtmltopdf.exe')
pdfgen = pdfkit.from_url(url, printname, configuration=config)
# read the generated file back and return it as the response
pdf = open(printname, 'rb')
response = HttpResponse(pdf.read())
response['Content-Type'] = 'application/pdf'
response['Content-Disposition'] = 'attachment; filename=' + filename
pdf.close()
return response
Maybe an important note: I run this site on IIS 8; when running from the command line (python manage.py runserver) the error is not present.
Any guidelines on how to handle this error would be great.
When I quickly (within 5 seconds) generate another
This point suggests that your code is fine and that the problem lies with your browser rejecting the URL, as Peter suggests.
Most probably the cause of the error lies in file buffering: the file may not be fully flushed to disk before you read it back. Consider flushing the buffer at the appropriate places.
With no further information forthcoming, I'll convert my comment to an answer...
Most likely the issue is that your URL is being rejected by the web server when you try the quick reload (via from_url), or that you are having problems accessing the local file you are trying to create.
You could try to eliminate the latter by just writing straight to a variable by passing False as your output file name - e.g. pdf = pdfkit.from_url('google.com', False).
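For example, a minimal sketch of the whole view with the temporary file removed entirely (the view name is illustrative; url and filename are the same variables as in the question):

import pdfkit
from django.http import HttpResponse

def pdf_view(request, url, filename):
    config = pdfkit.configuration(wkhtmltopdf='C:/wkhtmltopdf/bin/wkhtmltopdf.exe')
    # Passing False as the output path makes from_url return the PDF as
    # bytes instead of writing a file, so no file descriptor is involved.
    pdf_bytes = pdfkit.from_url(url, False, configuration=config)
    response = HttpResponse(pdf_bytes, content_type='application/pdf')
    response['Content-Disposition'] = 'attachment; filename=' + filename
    return response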
If that doesn't solve it, your issue is almost certainly with the server rejecting the URL - and so you need to look at the diagnostics on that server.
I'm using Locust for load testing - creating a lot of POST requests to a server.
Because I need to generate different fields for every request, the best way to do it, in my opinion, is to read the body from a file, change the relevant fields, and send the request.
The problem occurs when I open the file: I see in the Jenkins log that there is a FileNotFound exception, even though I can see the file in the Git repo from which Jenkins runs the code.
I tried putting the full path in the with statement but still got the same exception.
...
with open('postRequest.json', 'r') as jsonFile:
    data = json.load(jsonFile)
    data["a"] = b
    data["x"] = y
    data["something"] = something_else
    return json.dumps(data)
Jenkins fails opening the file.
Note : The code works when I don't read the file, but just create a very long JSON string.
Thanks all!! ;)
The issue was resolved: in Jenkins, the full path was different from what I thought it was.
Anyway, I ran pwd and saw where I was - added the path where the file was and it worked.
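A more robust variant is to resolve the path relative to the script itself, so it no longer depends on whichever working directory Jenkins uses; a minimal sketch, assuming postRequest.json sits next to the locustfile:

import json
import os

# __file__ is this script's own path, so the JSON file is found
# regardless of the directory Jenkins runs the job from.
BASE_DIR = os.path.dirname(os.path.abspath(__file__))

with open(os.path.join(BASE_DIR, 'postRequest.json'), 'r') as jsonFile:
    data = json.load(jsonFile)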
Thanks, friends!
I am using the urllib2 module in Python 2.7 with Spyder 3.0 to batch-download text files by reading a text file that contains a list of them:
import sys
import urllib2

# Python 2 hack to avoid Unicode encode errors when writing
reload(sys)
sys.setdefaultencoding('utf-8')

with open('ocean_not_templated_url.txt', 'r') as text:
    lines = text.readlines()
    for line in lines:
        url = urllib2.urlopen(line.strip('ï \xa0\t\n\r\v'))
        # turn the URL into a filesystem-safe filename
        with open(line.strip('\n\r\t ').replace('/', '!').replace(':', '~'), 'wb') as out:
            for d in url:
                out.write(d)
I've already discovered a bunch of weird characters in the URLs, which I've since stripped; however, the script fails when nearly 90% complete, giving the following error:
I thought it to be a non-breaking space (denoted by \xa0 in the code), but it still fails. Any ideas?
That's an odd URL!
Specify the communication protocol. Try prefixing the URL with http:// and a domain name if the file exists on the web.
Files always reside somewhere, in some server's directory or locally on your system, so there must be a path to such files, for example:
http://127.0.0.1/folder1/samuel/file1.txt
The same example, with localhost as an alias for 127.0.0.1 (generally):
http://localhost/folder1/samuel/file1.txt
That might solve the problem. Just think about where your file exists and how it should be addressed...
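For instance, urllib2 needs a scheme in every URL, whether the target is remote or local (these paths are illustrative):

import urllib2

remote = urllib2.urlopen('http://localhost/folder1/samuel/file1.txt')  # over HTTP
local = urllib2.urlopen('file:///C:/folder1/samuel/file1.txt')         # local file via file://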
Update:
I experimented quite a bit on this. I think I know why that error is raised! :D
I speculate that the file which stores your URLs actually has a sneaky empty line near the end. I can say it's near the end because you said it executes about 90% of the way and then fails. So the urllib2 function get_type is unable to process that empty URL and throws unknown url type.
I think that's the problem! Remove that empty line in the file ocean_not_templated_url.txt and try it out!
Just check and let me know! :P
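If you would rather guard against blank lines in code than edit the file, a small tweak to the loop from the question (same variable names) would be:

for line in lines:
    target = line.strip('ï \xa0\t\n\r\v')
    if not target:
        continue  # a blank line would fail with "unknown url type"
    url = urllib2.urlopen(target)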
I created a simple threaded Python server, and I have two values for the format parameter: one is JSON (return string data) and the other is zip. When a user selects format=zip as one of the input parameters, I need the server to return a zip file. How should I return a file to the user from my server's do_GET()? Do I just return the URL where the file can be downloaded, or can I send the file back to the user directly? If option two is possible, how do I do this?
Thank you
You should send the file back to the user directly, and add a Content-Type header with the correct media type, such as application/zip.
So the header could look like this:
Content-Type: application/zip
The issue was that I hadn't closed the zipfile object before I tried to return it. It appeared there was a lock on the file.
To return a zip file from a simple Python HTTP server on a GET request, you need to do the following:
Set the Content-Type header to 'application/zip'
self.send_header("Content-type:", "application/zip")
Create the zip file using the zipfile module
Using the file path (e.g. c:/temp/zipfile.zip), open the file in 'rb' mode to read the binary data
openObj = open( < path > , 'rb')
Write the file's contents back to the browser, then close the file object
self.wfile.write(openObj.read())
openObj.close()
del openObj
That's about it. Thank you all for your help.
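Pulling those steps together, here is a minimal sketch of a handler; the archive path and its contents are illustrative, not from the original code:

import zipfile
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer

class ZipHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        zip_path = 'c:/temp/zipfile.zip'  # example path from the steps above
        # build the archive and close it before serving, so no lock remains
        archive = zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED)
        archive.write('result.json')  # hypothetical file to include
        archive.close()

        self.send_response(200)
        self.send_header('Content-Type', 'application/zip')
        self.end_headers()

        openObj = open(zip_path, 'rb')
        self.wfile.write(openObj.read())
        openObj.close()

HTTPServer(('', 8000), ZipHandler).serve_forever()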
I am trying to write a program that reads a webpage looking for file links, which it then attempts to download using curl/libcurl/pycurl. I have everything up to the pycurl correctly working, and when I use a curl command in the terminal, I can get the file to download. The curl command looks like the following:
curl -LO https://archive.org/download/TheThreeStooges/TheThreeStooges-001-WomanHaters1934moeLarryCurleydivxdabaron19m20s.mp4
This results in one redirect (a file that reads as all 0s on the output) and then it correctly downloads the file. When I remove the -L flag (so the command is just -O) it only reaches the first line, where it doesn't find a file, and stops.
But when I try to do the same operation using pycurl in a Python script, I am unable to successfully set [Curl object].FOLLOWLOCATION to 1, which is supposed to be the equivalent of the -L flag. The python code looks like the following:
import pycurl

c = pycurl.Curl()  # get a Curl object
fp = open(file_name, 'wb')
c.setopt(c.URL, full_url)      # set the url
c.setopt(c.FOLLOWLOCATION, 1)  # follow redirects, equivalent to curl -L
c.setopt(c.WRITEDATA, fp)
c.perform()
When this runs, it gets to c.perform() and shows the following:
python2.7: src/pycurl.c:272: get_thread_state: Assertion `self->ob_type == p_Curl_Type' failed.
Is it missing the redirect, or am I missing something else earlier because I am relatively new to cURL?
When I enabled verbose output for the c.perform() step, I was able to uncover what I believe was the underlying problem. The first line of that output indicated that an open connection was being reused.
I had originally packaged the code into an object-oriented setup, as opposed to a script, so the Curl object was being reused without ever being closed. After the first connection attempt, which failed because I hadn't set the options correctly, it kept reusing the connection to the website/server (which presumably still had the wrong connection settings).
The problem was resolved by having the script close any existing Curl objects and create a new one before each file download.
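A minimal sketch of that fix, assuming full_url and file_name are supplied by the caller:

import pycurl

def download(full_url, file_name):
    c = pycurl.Curl()  # a fresh handle, so no stale connection is reused
    fp = open(file_name, 'wb')
    c.setopt(c.VERBOSE, True)      # prints connection reuse and redirect details
    c.setopt(c.URL, full_url)
    c.setopt(c.FOLLOWLOCATION, 1)  # follow redirects, like curl -L
    c.setopt(c.WRITEDATA, fp)
    try:
        c.perform()
    finally:
        c.close()  # close the handle instead of leaving it open for reuse
        fp.close()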
If a would-be HTTP server written in Python 2.6 has local access to a file, what would be the most correct way for that server to return the file to a client, on request?
Let's say this is the current situation:
header('Content-Type', file.mimetype)
header('Content-Length', file.size)  # file size in bytes
header('Content-MD5', file.hash)     # an MD5 hash of the entire file
return open(file.path, 'rb').read()
All the files are .zip or .rar archives no bigger than a couple of megabytes.
With the current situation, browsers handle the incoming download weirdly. No browser knows the file's name, for example, so they use a random or default one. (Firefox even saved the file with a .part extension, even though it was complete and completely usable.)
What would be the best way to fix this and other errors I may not even be aware of, yet?
What headers am I not sending?
Thanks!
This is how I send a ZIP file:
req.send_response(200)
req.send_header('Content-Type', 'application/zip')
req.send_header('Content-Disposition', 'attachment; filename=%s' % filename)
Most browsers handle it correctly.
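Combined with the headers from the question, a fuller set might look like this; the Content-Disposition line carrying a filename is the key addition:

req.send_response(200)
req.send_header('Content-Type', file.mimetype)  # e.g. application/zip
req.send_header('Content-Length', file.size)    # file size in bytes
req.send_header('Content-MD5', file.hash)       # optional integrity check
req.send_header('Content-Disposition', 'attachment; filename=%s' % filename)
req.end_headers()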
If you don't have to return the response body (that is, if you are given a stream for the response body by your framework) you can avoid holding the file in memory with something like this:
fp = open(path_to_the_file, 'rb')
try:
    while True:
        chunk = fp.read(8192)  # stream the file in 8 KB chunks
        if not chunk:
            return
        response.write(chunk)
finally:
    fp.close()
What web framework are you using?