reading output from a text file

reading output from a text file - python

For instance, the following test.py script can be displayed at ../cgi-bin/test.py, but is there a way I could display the same output from a text file (like test.txt) instead of text.py, so that the url would be something like ../text.txt?
--- test.py ---
def printxt():
print "Content-Type: text/plain"
print """my text here..."""

If you point your browser to www.yoursite.com/yourtext.txt, you'll discover that most browsers are fully capable of displaying text files without any additional code.

You appear to be asking if web servers can serve, and web browsers display text files. The answer is yes. Put the text file underneath DOCUMENTROOT and it's all good.
Or did I misunderstand the question?

If you want to use Python to do this in the context of a larger program (the example you just gave would be useless if that's all that you wanted it to do), you can simply use the standard:
file = open("filename.txt", "r")
for line in file:
print line
You already know about the Content-type line which of course would be the same.
If the file is small enough to be read into memory all at once without causing problems, you can use "print file.read()" in order to have file.read() read the entire file as a single string, then print that out.

Related

Why is Django FieldFIle readline() returning the hex version of a text file?

Having an odd problem.
I have a Django app that opens a file (represented as a Django FieldFile) and reads each row using readline() as below:
with file.open(mode='r') as f:
row = f.readline()
# do something with row...
The file is text, utf-8 encoded and lines are terminated with \r\n.
The problem is each row is being read as the hex representation of the string, so instead of "Hello" I get "48656c6c6f".
A few stranger things:
It previously worked properly, but at some point an update has broken it (I've tried rolling back to previous commits and it is still wonky, so possibly a dependency has updated and not something from my requirements.txt). Missed it in my testing because it is in a very rarely used part of the app.
If I read the same file using readlines() instead of readline() I see the correct string representation of the file wrapped in [b'...']
The file reads normally if I do it using straight Python open() and readline() from an interpreter
Forcing text mode with mode='rt' doesn't change the behaviour, neither does mode='rb'
The file is stored in a Minio bucket, so the defaut storage is storages.backends.s3boto3.S3Boto3Storage from django-storages and not the default Django storage class. This means that boto3, botocore and s3fs are also in the mix, making it more confusing for me to debug.
Scratching my head at why this worked before and what I'm doing wrong.
Environment is Python 3.8, Django 2.2.8 and 3.0 (same result) running in Docker containers.
EDIT
Let me point out that the fix for this is simply using
row = f.readline().decode()
but I would still like to figure out what's happening.
EDIT 2
Further to this, FieldFile.open() is reading the file as a binary file, whereas plain Python open() is reading the file as a text file.

This seems very weird.
I think you will see the solution immediately after trying following (I will then update my answer or delete it if it really doesn't help to find it, but I'm quite confident)
A assume, that there is some code, that is monkeypatching file.open or the django view function.
What I suggest is:
Start your code with manage.py runserver
Ad following code to manage.py (as the very first lines)
import file
print("ID of file.open at manage startup is", id(file.open)
Then add code to your view directly one line above the file.open
print("ID of file.open before opening is", id(file.open)
If both ids are different, then something monkeypatched your open function.
If both are the same, then the problem must be somewhere else.
If you don not see the output of these two prints, something might have monkeypatched your view.
If this doesn't work, then try to use
open() instead of file.open()
Is there any particular reason you use file.open()
Addendum 1:
So what you sai is, that file is an object instance of a class is it a FileField?
In any case can you obtain the name of the file and open it with a normal open() to see whether it is only file.open() that does funny things or whether it is also open() reading it this stange way.
Did you just open the file from command line with cat filename (or if under windows with type filename?
If that doesn't work we could add traces to follow each line of the source code that is being executed.
Addendum 2:
Well if you can't try this in a manage.py runserver, what happens if you try to read the file with a manage.py shell?
Just open the shell and type something like:
from <your_application>.models import <YourModel>
entry = <YourModel>.objects.get(id=<idofentry>)
line1 = entry.<filefieldname>.open("r").read().split("\n")[0]
print("line1 = %r" % line1)
If this is still not conclusive, (but only if you can reproduce the issue with the management shell, then create a small file containing the lines.
from <your_application>.models import <YourModel>
entry = <YourModel>.objects.get(id=<idofentry>)
import pdb; pdb.set_trace()
line1 = entry.<filefieldname>.open("r").read().split("\n")[0]
print("line1 = %r" % line1)
And import it from the management shell.
The code should enter the debugger and now you can single step through the open function and see whether you end up on sime weird function in some monkeypatch.

How to use a python program stored in a text file?

I'm making a troubleshooting program in which I need to take a python program which is stored in a text file, but I can't use the 'import' module. To clarify this, there would be a python program stored as a '.txt' file, and in the main program I would take this text file and be able to use it as a subprogram. I've tried doing this, but I have had no clue of how to go about it, especially since I do not have much experience of Python.
Below is roughly the program. I don't know how to format it either, but here goes:
phonechoice = input("What type of phone do you have?")
if 'iphone' in phonechoice:
#here I would load a text file which contains the program for the iphone
#which asks them what problem they have with their phone and gives a solution
I'm wondering how I can do this. I thought how I could do this and maybe I could 'copy and paste' the program, line by line, into a definition, which I could then use. Would this work, and if it doesn't then in what other way could I do it?

Rename the text file to a python file, i.e. change the extension to ".py". This does not change the fact that it is a text file, just like renaming a picture.jpg file to picture.txt does not change the fact that it's an image file.
If you have some wacky requirement to import a module saved in file with a .txt extension, you can not use an import statement. But it is still possible to import like this:
import imp
my_module = imp.load_source('my_module', 'example.txt')

I am a bit reluctant to answer a "homework" type question, but I will give you some pointers on what you need to do. If I have a text file with this in it:
def main():
print("Hello")
main()
I could execute the code with the exec function like this:
with open("filename.txt") as file: #filename should be the name of the file
data = file.read()
exec(data) #this executes the code
The output would be as expected:
Hello
Hopefully this will shed some light on your problem!

Python: How to get the URL to a file when the file is received from a pipe?

I created, in Python, an executable whose input is the URL to a file and whose output is the file, e.g.,
file:///C:/example/folder/test.txt --> url2file --> the file
Actually, the URL is stored in a file (url.txt) and I run it from a DOS command line using a pipe:
type url.txt | url2file
That works great.
I want to create, in Python, an executable whose input is a file and whose output is the URL to the file, e.g.,
a file --> file2url --> URL
Again, I am using DOS and connecting executables via pipes:
type url.txt | url2file | file2url
Question: file2url is receiving a file. How do I get the file's URL (or path)?

In general, you probably can't.
If the url is not stored in the file, I seems very difficult to get the url. Imagine someone reads a text to you. Without further information you have no way to know what book it comes from.
However there are certain usecases where you can do it.
Pipe the url together with the file.
If you need the url and you can do that, try to keep the url together with the file. Make url2file pipe your url first and then the file.
Restructure your pipeline
Maybe you don't need to find the url for the file, if you restructure your pipeline.
Index your files
If only a certain files could potentially be piped into file2url, you could precalculate a hash for all files and store it in your program together with the url. In python you would do this using a dict where the key is the file (as a string) and the value is the url. You could use pickle to write the dict object to a file and load it at the start of your program.
Then you could simply lookup the url from this dict.
You might want to research how databases or search functions in explorers handle indexing or alternative solutions.
Searching for the file
You could use one significant line of the file and use something like grep or head on linux to search all files of your computer for this line. Note that grep and head are programs, not python functions. For DOS, you might need to google the equivalent programs.
FYI: grep searches for one line of text inside a file.
head puts out the first few lines of a file. I suggest comparing only the first few lines of files to avoid searching through huge file.
Searching all files on the computer might take very long.
You could only search files with the same size as your piped input.
Use url.txt
If file2url knows the location of the file url.txt, then you could look up all files in url.txt until you find a file identical to the file that was piped into your program. You could combine this with the hashing/ indexing solution.

'file2url' receives the data via standard input (like keyboard).
The data is transferred by the kernel and it doesn't necessarily have to have any file-system representation. So if there's no file there's no URL or path to that for you to get.

Let's try to do it by obvious way:
$ cat test.py | python test.py
import sys
print ''.join(sys.stdin.readlines())
print sys.stdin.name
<stdin>
So, filename is "< stdin>" because, for the python there is no filename - only input.
Another way is a system-dependent. Find a command line, which was used, for example, but no garantee that is will be works.

save text file and display its content on the browser

If the following script.py writes "some text here" to output.txt file, my URL will be http://my_name/script.py. My question is, how can I read the output.txt as soon as (right after) the following function creates it, so that my URL reads like http://my_name/output.txt.
Many thanks in advance.
#------ script.py -------
def write_txt(){
f=('./output.txt', 'w')
f.write("some text here")
}

try webbrowser lib.
import webbrowser
myurl = "file:///mydir/output.txt"
webbrowser.open(myurl)
However:
Note that on some platforms, trying to
open a filename using this function,
may work and start the operating
system’s associated program.
That is: your file will probably be open in your default text editor (p.e. notepad). A possible solution is to give a custom extension to your file (p.e. output.url) and to associate the extension to your browser (not tested)

Depends on various factors, like OS and webserver used.
Pipe the output to the browser specifying a correct content-type, or, given you script writes to an accessible location, issue a HTTP redirect code pointing to that location.

How to separate content from a file that is a container for binary and other forms of content

I am trying to parse some .txt files. These files serve as containers for a variable number of 'children' files that are set off or identified within the container with SGML tags. With python I can easily separate the children files. However I am having trouble writing the binary content back out as a binary file (say a gif or jpg). In the simplest case the container might have an embedded html file followed by a graphic that is called by the html. I am assuming that my problem is because I am reading the original .txt file using open(filename,'r'). But that seems the only option to find the sgml tags to split the file.
I would appreciate any help to identify some relevant reading material.
I appreciate the suggestions but I am still struggling with the most basic questions. For example when I open the file with wordpad and scroll down to the section tagged as a gif I see this:
<FILENAME>h65803h6580301.gif
<DESCRIPTION>GRAPHIC
<TEXT>
begin 644 h65803h6580301.gif
M1TE&.#EA(P)I`=4#`("`#,#`P$!`0+^_OW]_?_#P\*"#H.##X-#0T&!#8!`0
M$+"PL"`#('!P<)"0D#`P,%!04#\_/^_O[Y^?GZ^OK]_?WX^/C\_/SV]O;U]?
I can handle finding the section easily enough but where does the gif file begin. Does the header start with 644, the blanks after the word begin or the line beginning with MITE?
Next, when the file is read into python does it do anything to the binary code that has to be undone when it is read back out?
I can find the lines where the graphics begin:
filerefbin=file('myfile.txt','rb')
wholeFile=filerefbin.read()
import re
graphicReg=re.compile('<DESCRIPTION>GRAPHIC')
locationGraphics=graphicReg.finditer(wholeFile)
graphicsTags=[]
for match in locationGraphics:
graphicsTags.append(match.span())
I can easily use the same process to get to the word begin, or to identify the filename and get to the end of the filename in the 'first' line. I have also successefully gotten to the end of the embedded gif file. But I can't seem to write out the correct combination of things so when I double click on h65803h6580301.gif when it has been isolated and saved I get to see the graphic.
Interestingly, when I open the file in rb, the line endings appear to still be present even though they don't seem to have any effect in notebpad. So that is clearly one of my problems I might need to readlines and join the lines together after stripping out the \n
I love this site and I love PYTHON
This was too easy once I read bendin's post. I just had to snip the section that began with the word begin and save that in a txt file and then run the following command:
import uu
uu.decode(r'c:\test2.txt',r'c:\test.gif')
I have to work with some other stuff for the rest of the day but I will post more here as I look at this more closely. The first thing I need to discover is how to use something other than a file, that is since I read the whole .txt file into memory and clipped out the section that has the image I need to work with the clipped section instead of writing it out to test2.txt. I am sure that can be done its just figuring out how to do it.

What you're looking at isn't "binary", it's uuencoded. Python's standard library includes the module uu, to handle uuencoded data.
The module uu requires the use of temporary files for encoding and decoding. You can accomplish this without resorting to temporary files by using Python's codecs module like this:
import codecs
data = "Let's just pretend that this is binary data, ok?"
uuencode = codecs.getencoder("uu")
data_uu, n = uuencode(data)
uudecode = codecs.getdecoder("uu")
decoded, m = uudecode(data_uu)
print """* The initial input:
%(data)s
* Encoding these %(n)d bytes produces:
%(data_uu)s
* When we decode these %(m)d bytes, we get the original data back:
%(decoded)s""" % globals()

You definitely need to be reading in binary mode if the content includes JPEG images.
As well, Python includes an SGML parser, http://docs.python.org/library/sgmllib.html .
There is no example there, but all you need to do is setup do_ methods to handle the sgml tags you wish.

You need to open(filename,'rb') to open the file in binary mode. Be aware that this will cause python to give You confusing, two-byte line endings on some operating systems.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.