Apache + Python: Serving binary files

I have an Apache server running Python CGI scripts (Python 3). A client sends a GET request for a virtual file, and I need to send back the right file depending on the client's user agent. This works for text files, but when I try to serve binary files such as images (.jpg) or .zip archives, the downloaded file is corrupted.
When I inspect it I can see b'\x00\x....', so I think the byte conversion went wrong somewhere.
I have tried sys.stdout.write, but it expects a str, not bytes. I have also tried changing the headers, for example the content type, but that does not work.
reqFile = open(filePath,'rb')
content = reqFile.read()
print("Content-Type:image/jpg")
print("Accept-Ranges:bytes")
print("Content-Length:"+str(len(content)))
print()
print(content)
Thanks in advance !!

OK, I have found that print() appends a '\n' character and, when given bytes, writes their b'...' representation rather than the raw bytes. So for binary files I recommend writing to sys.stdout.buffer.
import sys
with open(filePath, "rb") as file:
    content = file.read()
length = len(content)
print("Content-type:application/x-download")
print("Content-length:%d" % length)
print()
# Flush the text-layer headers before writing raw bytes to the underlying buffer.
sys.stdout.flush()
sys.stdout.buffer.write(content)
Don't forget to call sys.stdout.flush() before writing to sys.stdout.buffer, so that the headers are sent before the body.
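As a variation on the answer above, the headers can also be encoded and written to the same binary stream as the body, which avoids mixing the text and binary layers of sys.stdout. This is only a minimal sketch: the function name write_binary_response and its parameters are made up for illustration, and an in-memory buffer stands in for sys.stdout.buffer.

```python
import io


def write_binary_response(out, content, content_type="application/octet-stream"):
    """Write CGI headers and a binary body to a binary stream (hypothetical helper)."""
    headers = (
        "Content-Type: {}\r\n"
        "Content-Length: {}\r\n"
        "\r\n"
    ).format(content_type, len(content))
    # Headers are plain ASCII, so they can be encoded and sent as bytes.
    out.write(headers.encode("ascii"))
    # The body is written untouched: no repr(), no extra newline.
    out.write(content)
    out.flush()


# In a CGI script the stream would be sys.stdout.buffer, for example:
#     write_binary_response(sys.stdout.buffer, content, "image/jpeg")
# Here an in-memory buffer stands in for it.
demo = io.BytesIO()
write_binary_response(demo, b"\x00\x01\xff", "image/jpeg")
```

Because everything goes through one binary stream, the ordering of headers and body no longer depends on when the text layer is flushed.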

Related

How can I store a .CHM Help File as text with Python?

I am using Pastebin to store the code of my python program to keep it updated on several computers. I am now trying to similarly maintain an updated help window. I saw that I could use .chm files to keep a full help dialog in a single file, but the files do not translate to text well.
I used a sample .chm file from Microsoft, I opened the file ("Viewhlp.chm") with notepad and copied the text to Pastebin, and then used the script below to attempt to recreate the .chm file. This does not work. It gives a "cannot open the file" message when opening directly and is simply ignored with PyWin32.
Is there another single file format for help dialogs that I can load with python?
import urllib2, sys
helpUrl = "http://pastebin.com/raw.php?i=a8rF2i8a"
originalPath = "Viewhlp.chm"
newPath = "NewHlp.chm"
try:
    helpData = urllib2.urlopen(helpUrl)
except urllib2.URLError:
    sys.exit()
currentHelp = helpData.read()
with open(newPath, mode="wb") as helpFile:
    helpFile.write(currentHelp)
# briefly display using PyWin32 or just open the chm files directly
import win32help
win32help.HtmlHelp(0, None, win32help.HH_INITIALIZE, None)
link = win32help.HH_AKLINK()
link.indexOnFail = 1
link.url = ""
link.msgText = ""
link.msgTitle = ""
link.window = ""
win32help.HtmlHelp(0, originalPath, win32help.HH_KEYWORD_LOOKUP, link)
win32help.HtmlHelp(0, newPath, win32help.HH_KEYWORD_LOOKUP, link)
Notepad won't display the non-printing characters properly. Probably the easiest thing to do would be to base64-encode the .chm, then open the encoded version in Notepad before you copy it to Pastebin. Then decode it when you read it (this needs import base64 at the top of the script):
currentHelp = base64.b64decode(helpData.read())
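The encoding half of that round trip is not shown above, so here is a small sketch of both directions. The sample bytes are a stand-in for the real contents of a .chm file, and the variable names are illustrative only.

```python
import base64

# Stand-in for the raw bytes of a .chm file (any binary data works).
original = bytes(range(256))

# Encode to ASCII text that survives copy/paste through a text-only service.
encoded_text = base64.b64encode(original).decode("ascii")

# After downloading the text again, decode it back to the original bytes.
restored = base64.b64decode(encoded_text)
```

The encoded text is about a third larger than the binary file, which is the price for being safe to paste as plain text.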
One way I convert things/documents like this is by installing a "Generic / Text Only" printer on my Windows system, and then selecting it and picking the "print to file" option in the printing dialog that appears when I try to print something from the associated application.
This results in a plain text file with what would have been printed in it. There's probably some way to automate it, although I've never tried.

Python ftplib Corrupting Files?

I'm downloading files in Python using ftplib and up until recently everything seemed to be working fine. I am downloading files as such:
ftpSession = ftplib.FTP(host,username,password)
ftpSession.cwd('rlmfiles')
ftpFileList = filter(lambda x: 'PEDI' in x, ftpSession.nlst())
ftpFileList.sort()
for f in ftpFileList:
    tempFile = open(os.path.join(localDirectory,f),'wb')
    ftpSession.retrbinary('RETR '+f,tempFile.write)
    tempFile.close()
ftpSession.quit()
sys.exit(0)
Up until recently it was downloading the files I needed just fine, as expected. Now, however, the files I'm downloading are corrupted and just contain long strings of garbage ASCII. I know the problem is not with the files posted on the FTP server, because I also have a Perl script that downloads them successfully from the same server.
If it is any additional info, here's what the debugger puts out in the command prompt when downloading a file:
Has anyone encountered any issues with corrupted file contents using retrbinary() in Python's ftplib?
I'm really stuck/frustrated and haven't come across anything related to possible corruption here. Any help is appreciated.
I just ran into this issue yesterday when I was attempting to download text files. Not sure if that is what you were doing, but since you say it has ASCII garbage in it, I assume you opened it in a text editor because it was supposed to be text.
If this is the case, the problem is that the file is a text file and you are trying to download it in binary mode.
What you want to do instead is retrieve the file in ASCII transfer mode.
tempFile = open(os.path.join(localDirectory,f),'w') # Changed 'wb' to 'w'
ftpSession.retrlines('RETR '+f,tempFile.write) # Changed retrbinary to retrlines
Unfortunately, this strips all the new-line characters out of the file. Yuck!
So then you need to add the stripped out new-line characters again:
tempFile = open(os.path.join(localDirectory,f),'w')
textLines = []
ftpSession.retrlines('RETR '+f,textLines.append)
tempFile.write('\n'.join(textLines))
This should work, but it doesn't look as nice as it could. So a little cleanup effort would get us:
temporaryFile = open(os.path.join(localDirectory, currentFile), 'w')
textLines = []
retrieveCommand = 'RETR '
ftpSession.retrlines(retrieveCommand + currentFile, textLines.append)
temporaryFile.write('\n'.join(textLines))
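An alternative to collecting every line in a list is to hand retrlines a callback that writes each line followed by a newline as it arrives. The sketch below simulates the calls retrlines would make, so it runs without an FTP server; make_line_writer is a hypothetical helper name, not part of ftplib.

```python
import io


def make_line_writer(handle):
    """Return a callback that writes each received line plus a newline."""
    def write_line(line):
        handle.write(line + "\n")
    return write_line


# Simulate what retrlines() does: call the callback once per line,
# with the line terminator already stripped.
output = io.StringIO()
callback = make_line_writer(output)
for received in ["first", "second", "third"]:
    callback(received)
```

With a real session this would be used as ftpSession.retrlines('RETR ' + f, make_line_writer(tempFile)), followed by tempFile.close().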

Python - writing lines from file into IRC buffer

Ok, so I am trying to write a Python script for XCHAT that will allow me to type "/hookcommand filename" and then will print that file line by line into my irc buffer.
EDIT: Here is what I have now
__module_name__ = "scroll.py"
__module_version__ = "1.0"
__module_description__ = "script to scroll contents of txt file on irc"
import xchat, random, os, glob, string
def gg(ascii):
    ascii = glob.glob("F:\irc\as\*.txt")
    for textfile in ascii:
        f = open(textfile, 'r')
def gg_cb(word, word_eol, userdata):
    ascii = gg(word[0])
    xchat.command("msg %s %s"%(xchat.get_info('channel'), ascii))
    return xchat.EAT_ALL
xchat.hook_command("gg", gg_cb, help="/gg filename to use")
Well, your first problem is that you're referring to a variable ascii before you define it:
ascii = gg(ascii)
Try making that:
ascii = gg(word[0])
Next, you're opening each file returned by glob... only to do absolutely nothing with them. I'm not going to give you the code for this: please try to work out what it's doing or not doing for yourself. One tip: the xchat interface is an extra complication. Try to get it working in plain Python first, then connect it to xchat.
There may well be other problems - I don't know the xchat api.
When you say "not working", try to specify exactly how it's not working. Is there an error message? Does it do the wrong thing? What have you tried?

Return an image to the browser in python, cgi-bin

I'm trying to set up a python script in cgi-bin that simply returns a header with content-type: image/png and returns the image. I've tried opening the image and returning it with print f.read() but that isn't working.
EDIT:
the code I'm trying to use is:
print "Content-type: image/png\n\n"
with open("/home/user/tmp/image.png", "r") as f:
    print f.read()
This is using Apache on Ubuntu Server 10.04. When I load the page in Chrome I get the broken-image icon, and when I load the page in Firefox I get: The image "http://localhost/cgi-bin/test.py" cannot be displayed, because it contains errors.
You may need to open the file as "rb" (in Windows-based environments this is usually the case).
Simply printing may not work (as it adds '\n' and stuff); better to just write it to sys.stdout.
The statement print "Content-type: image/png\n\n" actually prints 3 newlines (as print automatically adds one "\n" at the end). This may break your PNG file.
Try:
sys.stdout.write( "Content-type: image/png\r\n\r\n" + file(filename,"rb").read() )
HTTP response headers are terminated with carriage-return, new-line (\r\n).
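The newline count mentioned in this answer is easy to check. The sketch below uses the Python 3 print() function, which appends a trailing newline just as the Python 2 print statement does, and an in-memory buffer in place of standard output.

```python
import io

buffer = io.StringIO()
# print() appends its own "\n" after the two already in the string.
print("Content-type: image/png\n\n", file=buffer)
newline_count = buffer.getvalue().count("\n")

# write() emits exactly what it is given, so the header block ends
# with a single blank line and nothing leaks into the image data.
fixed = io.StringIO()
fixed.write("Content-type: image/png\r\n\r\n")
fixed_newline_count = fixed.getvalue().count("\n")
```

The extra newline from print lands after the blank line that ends the headers, so it becomes the first byte of the response body.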
Are you including the blank line after the header? If not, it's not the end of your headers!
print 'Content-type: image/png'
print
print f.read()

Downloading text files with Python and ftplib.FTP from z/os

I'm trying to automate downloading of some text files from a z/os PDS, using Python and ftplib.
Since the host files are EBCDIC, I can't simply use FTP.retrbinary().
FTP.retrlines(), when used with open(file, 'w').writelines as its callback, doesn't, of course, provide EOLs.
So, for starters, I've come up with this piece of code which "looks OK to me", but as I'm a relative Python noob, can anyone suggest a better approach? Obviously, to keep this question simple, this isn't the final, bells-and-whistles thing.
Many thanks.
#!python.exe
from ftplib import FTP
class xfile (file):
    def writelineswitheol(self, sequence):
        for s in sequence:
            self.write(s+"\r\n")
sess = FTP("zos.server.to.be", "myid", "mypassword")
sess.sendcmd("site sbd=(IBM-1047,ISO8859-1)")
sess.cwd("'FOO.BAR.PDS'")
a = sess.nlst("RTB*")
for i in a:
    sess.retrlines("RETR "+i, xfile(i, 'w').writelineswitheol)
sess.quit()
Update: Python 3.0, platform is MingW under Windows XP.
z/os PDSs have a fixed record structure, rather than relying on line endings as record separators. However, the z/os FTP server, when transmitting in text mode, provides the record endings, which retrlines() strips off.
Closing update:
Here's my revised solution, which will be the basis for ongoing development (removing built-in passwords, for example):
import ftplib
import os
from sys import exc_info
sess = ftplib.FTP("undisclosed.server.com", "userid", "password")
sess.sendcmd("site sbd=(IBM-1047,ISO8859-1)")
for dir in ["ASM", "ASML", "ASMM", "C", "CPP", "DLLA", "DLLC", "DLMC", "GEN", "HDR", "MAC"]:
    sess.cwd("'ZLTALM.PREP.%s'" % dir)
    try:
        filelist = sess.nlst()
    except ftplib.error_perm as x:
        if (x.args[0][:3] != '550'):
            raise
    else:
        try:
            os.mkdir(dir)
        except:
            continue
        for hostfile in filelist:
            lines = []
            sess.retrlines("RETR "+hostfile, lines.append)
            pcfile = open("%s/%s"% (dir,hostfile), 'w')
            for line in lines:
                pcfile.write(line+"\n")
            pcfile.close()
        print ("Done: " + dir)
sess.quit()
My thanks to both John and Vinay
Just came across this question as I was trying to figure out how to recursively download datasets from z/OS. I've been using a simple python script for years now to download ebcdic files from the mainframe. It effectively just does this:
def writeline(line):
    file.write(line + "\n")
file = open(filename, "w")
ftp.retrlines("retr " + filename, writeline)
You should be able to download the file as a binary (using retrbinary) and use the codecs module to convert from EBCDIC to whatever output encoding you want. You should know the specific EBCDIC code page being used on the z/OS system (e.g. cp500). If the files are small, you could even do something like (for a conversion to UTF-8):
file = open(ebcdic_filename, "rb")
data = file.read()
converted = data.decode("cp500").encode("utf8")
file = open(utf8_filename, "wb")
file.write(converted)
file.close()
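To see what the codec conversion does without needing a file from the mainframe, the sketch below round-trips a short string through cp500. The sample text is made up, and cp500 is used only because it is the code page named above; the real system may use a different one.

```python
text = "HELLO WORLD 123"

# Encode to EBCDIC (code page 500) to imitate bytes fetched with retrbinary.
ebcdic_bytes = text.encode("cp500")

# Decode back to a Python string, then re-encode as UTF-8 for local storage.
decoded = ebcdic_bytes.decode("cp500")
utf8_bytes = decoded.encode("utf-8")
```

The EBCDIC bytes differ from the ASCII ones for every character here (for example, "H" is 0xC8 rather than 0x48), which is why a binary download looks like garbage until it is decoded.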
Update: If you need to use retrlines to get the lines and your lines are coming back in the correct encoding, your approach will not work, because the callback is called once for each line. So in the callback, sequence will be the line, and your for loop will write individual characters in the line to the output, each on its own line. So you probably want to do self.write(sequence + "\r\n") rather than the for loop. It still doesn't feel especially right to subclass file just to add this utility method, though - it probably needs to be in a different class in your bells-and-whistles version.
Your writelineswitheol method appends '\r\n' instead of '\n' and then writes the result to a file opened in text mode. The effect, no matter what platform you are running on, will be an unwanted '\r'. Just append '\n' and you will get the appropriate line ending.
Proper error handling should not be relegated to a "bells and whistles" version. You should set up your callback so that your file open() is in a try/except and retains a reference to the output file handle, your write call is in a try/except, and you have a callback_obj.close() method which you use when retrlines() returns to explicitly file_handle.close() (in a try/except) -- that way you get explicit error handling e.g. messages "can't (open|write to|close) file X because Y" AND you save having to think about when your files are going to be implicitly closed and whether you risk running out of file handles.
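One possible shape for such a callback object is sketched below. The class name LineSink and the choice to re-raise as RuntimeError are assumptions made for illustration; the demo writes to a temporary directory and calls the object directly, the way retrlines() would call it once per line.

```python
import os
import tempfile


class LineSink:
    """Callback object that writes lines to a file with explicit error messages."""

    def __init__(self, path):
        self.path = path
        try:
            self.handle = open(path, "w")
        except OSError as exc:
            raise RuntimeError("can't open file %s because %s" % (path, exc)) from exc

    def __call__(self, line):
        # retrlines() passes each line without its terminator, so add one.
        try:
            self.handle.write(line + "\n")
        except OSError as exc:
            raise RuntimeError("can't write to file %s because %s" % (self.path, exc)) from exc

    def close(self):
        try:
            self.handle.close()
        except OSError as exc:
            raise RuntimeError("can't close file %s because %s" % (self.path, exc)) from exc


# Demo: simulate two lines arriving from retrlines().
demo_path = os.path.join(tempfile.mkdtemp(), "member.txt")
sink = LineSink(demo_path)
for received in ["line one", "line two"]:
    sink(received)
sink.close()

with open(demo_path) as check:
    written = check.read()
```

With a real session this would look like sink = LineSink(hostfile), then sess.retrlines("RETR " + hostfile, sink), then sink.close(), so every file handle is closed explicitly before the next download starts.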
Python 3.x ftplib.FTP.retrlines() should give you str objects which are in effect Unicode strings, and you will need to encode them before you write them -- unless the default encoding is latin1 which would be rather unusual for a Windows box. You should have test files with (1) all possible 256 bytes (2) all bytes that are valid in the expected EBCDIC codepage.
[a few "sanitation" remarks]
You should consider upgrading your Python from 3.0 (a "proof of concept" release) to 3.1.
To facilitate better understanding of your code, use "i" as an identifier only as a sequence index and only if you irredeemably acquired the habit from FORTRAN 3 or more decades ago :-)
Two of the problems discovered so far (appending line terminator to each character, wrong line terminator) would have shown up the first time you tested it.
When you use ftplib's retrlines to download a file from z/OS, the lines passed to the callback have no trailing '\n'.
This differs from the Windows ftp command 'get xxx'.
We can copy the function 'retrlines' in ftplib.py to a new function 'retrlines_zos'.
Just copy the whole code of retrlines, and change the 'callback' line to:
...
callback(line + "\n")
...
I tested and it worked.
You want a lambda function and a callback, like so:
def writeLineCallback(line, file):
    file.write(line + "\n")
ftpcommand = "RETR {}{}{}".format("'",zOsFile,"'")
filename = "newfilename"
with open( filename, 'w' ) as file :
    callback_lambda = lambda x: writeLineCallback(x,file)
    ftp.retrlines(ftpcommand, callback_lambda)
This will download file 'zOsFile' and write it to 'newfilename'
