How to print the last byte of a file .txt - python

python
I have a text file (.txt) and I need to print the last byte of the file.
How do I do that when I do not know the size of the document?

The documentation provides an API that can be used to solve this problem. You will need to do the following things in order:
1. Open the file in binary mode ('rb'). In Python 3, a file opened in text mode does not support seeking to a nonzero offset relative to the end.
2. Move the file pointer to the last byte. This can be achieved using the seek member function of the file object: use os.SEEK_END with an offset of -1 to land one byte before the end of the file.
3. Read one byte with the read function.
4. Print that byte.
5. If you did not use a context manager (the with keyword) while opening the file, you should call close to close the file before exiting the program.
The trick here is the seek method, which can specify an offset relative to the end of the file.
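The steps above can be sketched roughly as follows. This is a minimal sketch, not the only way to do it: last_byte is a hypothetical helper name, the empty-file check is an addition that the steps do not mention, and the demo writes to a throwaway temporary file rather than a real text.txt.

```python
import os
import tempfile


def last_byte(path):
    """Return the final byte of the file, or an empty bytes object if the file is empty."""
    with open(path, "rb") as f:               # binary mode, so end-relative seeks are allowed
        if f.seek(0, os.SEEK_END) == 0:       # seek returns the new position, i.e. the size
            return b""
        f.seek(-1, os.SEEK_END)               # one byte before the end of the file
        return f.read(1)


# Demo on a throwaway file (hypothetical content).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello\n")
result = last_byte(tmp.name)
os.remove(tmp.name)
print(result)  # b'\n'
```

Only the last byte is read from disk, so the size of the file does not matter.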

The following should work:
with open("text.txt") as file:
    text = file.read()
byte_array = bytearray(text, "utf8")
print(byte_array[-1:])
If you need the binary representation:
with open("text.txt") as file:
    text = file.read()
byte_array = bytearray(text, "utf8")
binary_byte_list = []
for byte in byte_array:
    binary_representation = bin(byte)
    binary_byte_list.append(binary_representation)
print(binary_byte_list[-1:])

You could do it like this using seek, which obviates the need to read the entire file into memory:
import os
with open('foo.txt', 'rb') as foo:
    foo.seek(-1, os.SEEK_END)
    b = foo.read()
    print(b)
In this case the last character is a newline, and therefore:
Output:
b'\n'
Note:
The file is opened in binary mode.

Related

how to change byte type in python

I have a small problem that has caused me a lot of trouble. Basically, I want to convert an image to bytes, store a string version of those bytes in a txt file, then read the file contents and transform them back into bytes and then into an image. I've got the first part more or less ready (it works, but it was written quickly and badly), but the conversion from string to bytes gives me problems.
When I read the image bytes, they look something like this: b'GIF89aP\x00P\x00\xe3'
But when I read them from the txt file with 'rb', or just convert the str to bytes, I get this: b'GIF89aP\\x00P\\x00\\xe3'
and with this I can't write it to an image.
I've tried to read and learn about this, but I couldn't find anything that would help.
The code is here. I know it's really messy, but I just need it to work:
file = open('p.gif', 'rb')
image = file.read()
str_b = str(image)
leng = len(str_b)
print(leng)
str_b = str_b[:0] + str_b[0+2:]
leng =- 1
str_b = str_b[:leng]
print(image)
#a = open('bytearray', 'w+')
#a.write(str_b)
#a.close
a = open('bytearray', 'r')
a = a.read()
temp = a.encode('utf-8')
print(temp)
#b = open('check', 'w+')
#b.write(str(string))
#print(string)
image_result = open('decoded.jpg', 'wb') # create a writable image and write the decoding result
image_result.write(temp)
Basically, my goal right now is to get bytes that look like this: b'GIF89aP\x00P\x00\xe3'
Please do not use eval as suggested above. eval has serious security vulnerabilities and will execute any Python code you pass to it. You could accidentally read a text file containing code that reformats the disk, and it would simply be executed. This is just an example, but you get my point: it's bad practice and just results in more problems. See https://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html if you want some examples of why eval is bad.
Anyway, let's try to fix your code.
Instead of converting your bytes to a string by wrapping them in str(), I would suggest you use Base64 together with .decode(). Raw image bytes are generally not valid UTF-8, so decoding them directly with .decode("utf-8") would fail; Base64 turns them into plain ASCII text that is safe to store in a text file.
Fixed Code:
import base64

with open('p.gif', 'rb') as file:
    image = file.read()  # read the raw image bytes
# Base64 output is pure ASCII, so it can be decoded to a str safely
str_image = base64.b64encode(image).decode("ascii")
with open('image.txt', 'w') as file:
    file.write(str_image)  # write the image as a string to a text file
with open('image.txt', 'r') as file:
    str_from_file = file.read()  # read the text file and store the string
file_bytes = base64.b64decode(str_from_file)  # decode the string back to the original bytes
print(type(str_from_file))  # type is str
print(type(file_bytes))  # type is bytes
I hope this fixes your issue and also doesn't introduce vulnerabilities into what you're building.
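If a text file already holds the output of str(image), including the leading b' and the trailing quote, ast.literal_eval from the standard library can turn it back into bytes without executing arbitrary code. The following is a minimal sketch under that assumption; bytes_from_repr is a hypothetical helper name, and the sample bytes are the ones quoted in the question.

```python
import ast


def bytes_from_repr(text):
    """Parse the text of a bytes literal (the output of str(some_bytes)) back into bytes."""
    value = ast.literal_eval(text)  # only evaluates literals, never calls functions
    if not isinstance(value, bytes):
        raise ValueError("text does not contain a bytes literal")
    return value


original = b'GIF89aP\x00P\x00\xe3'
as_text = str(original)              # the escaped form that ends up in the text file
restored = bytes_from_repr(as_text)
print(restored == original)          # True
```

Unlike eval, literal_eval rejects anything that is not a plain Python literal, so a file containing a function call raises an exception instead of running.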

save text file in upper case using Python

I'm trying to make a program that converts all words to uppercase.
a = open("file.txt",encoding='UTF-8')
for b in a:
    c = b.rstrip()
    print(c.upper())
a.close()
This is my code.
It prints the uppercase text, but it doesn't save it to 'file.txt'.
I want to convert all words to uppercase.
How can I solve it?
Here's how you can do it (provided that you are working with a small file):
Open the file in read mode and store the uppercase text in a variable; then open the file again in write mode and write the content into it.
with open('file.txt', 'r') as src:
    y = src.read().upper()
with open('file.txt', 'w') as out:
    out.write(y)
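You can actually do this "in place" by reading and writing one byte at a time, provided you open the file for both reading and writing in binary mode ("r+b") and step back over each byte before overwriting it.
with open("file.txt", "r+b") as f:
    while (b := f.read(1)) != b'':
        f.seek(-1, 1)       # step back over the byte just read
        f.write(b.upper())  # overwrite it in place
This is safe because each byte is overwritten by exactly one byte, so nothing that has not yet been read is ever clobbered. The file object's underlying buffering and your system's disk cache mean this isn't as inefficient as it looks.
(This does make one assumption: that only ASCII letters need converting. bytes.upper() changes only the ASCII letters a-z, so non-ASCII characters are left as they are. That restriction matters, because the encoded length of a character is not always the same as that of its uppercase form; for example, "ß".upper() is "SS". If you need full Unicode uppercasing, you should be able to read and write at least a line at a time, though not in place:
with open("input.txt") as inh, open("output.txt", "w") as outh:
    for line in inh:
        print(line.upper(), file=outh, end="")  # line already ends with a newline
)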
First read the text file into a string:
with open('file.txt', 'r') as file:
    data = file.read()
Then convert the data to uppercase:
data_revise = data.upper()
Finally, write the revised text to the file:
with open('data/try.txt', 'w') as fout:
    fout.write(data_revise)
You can write all changes to a temporary file and replace the original after all data has been processed. You can use either map() or a generator expression:
with open(r"C:\original.txt") as inp_f, open(r"C:\temp.txt", "w+") as out_f:
    out_f.writelines(map(str.upper, inp_f))
with open(r"C:\original.txt") as inp_f, open(r"C:\temp.txt", "w+") as out_f:
    out_f.writelines(s.upper() for s in inp_f)
To replace the original file you can use shutil.move():
import shutil
...
shutil.move(r"C:\temp.txt", r"C:\original.txt")
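The temporary-file approach can also be sketched with the tempfile module and os.replace, which avoids hard-coding a temp path. This is a variation on the answer above, not the same code: uppercase_file is a hypothetical helper name, the UTF-8 default is an assumption, and the demo works on a throwaway file in a temporary directory.

```python
import os
import tempfile


def uppercase_file(path, encoding="utf-8"):
    """Rewrite the file at path in uppercase via a temporary file in the same directory."""
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, encoding=encoding) as src, tempfile.NamedTemporaryFile(
        "w", encoding=encoding, dir=directory, delete=False
    ) as dst:
        dst.writelines(line.upper() for line in src)  # stream line by line
    os.replace(dst.name, path)  # swap the finished copy in for the original


# Demo on a throwaway file (hypothetical content).
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "file.txt")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("hello\nworld\n")
uppercase_file(demo_path)
with open(demo_path, encoding="utf-8") as f:
    result = f.read()
print(result)
```

Creating the temporary file in the same directory keeps the final replace on one filesystem. The sketch does not clean up the temporary file if an error occurs midway.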

How can I read the last bytes of a file?

I'm adding a string at the end of a binary file; the problem is that I don't know how to get that string back. I append the string to the binary file in ASCII format using this code:
f=open("file", "ab")
f.write("Testing string".encode('ascii'))
f.close()
The string length will be max 40 characters, if it is shorter it will be padded with zeros. But I don't know how to get the string back from the binary file since it is at the end and how to rewrite the file without the string. Thank you in advance.
Since you opened the file in append mode, you can't read from it like that.
You will need to reopen it in read mode like so:
f = open("file", "rb")
fb = f.read()
f.close()
For future reference, an easier way to open files is like this:
with open("file", "rb") as f:
    fb = f.read()
At this point you can use fb. Written this way, the file is closed automatically once the with block has finished.
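To fetch only the trailing string and then remove it, one option is to seek relative to the end of the file and truncate. The sketch below assumes the string always occupies exactly 40 bytes, padded on the right with zero bytes, as the question describes; TRAILER_SIZE and the three function names are hypothetical, and the demo uses a throwaway file.

```python
import os
import tempfile

TRAILER_SIZE = 40  # assumed fixed length of the padded string


def append_trailer(path, text):
    """Append text as ASCII, padded with zero bytes to TRAILER_SIZE."""
    data = text.encode("ascii")
    if len(data) > TRAILER_SIZE:
        raise ValueError("string is longer than the trailer")
    with open(path, "ab") as f:
        f.write(data.ljust(TRAILER_SIZE, b"\x00"))


def read_trailer(path):
    """Read the last TRAILER_SIZE bytes and strip the zero padding."""
    with open(path, "rb") as f:
        f.seek(-TRAILER_SIZE, os.SEEK_END)
        raw = f.read(TRAILER_SIZE)
    return raw.rstrip(b"\x00").decode("ascii")


def remove_trailer(path):
    """Cut the last TRAILER_SIZE bytes off the file."""
    with open(path, "r+b") as f:
        f.seek(-TRAILER_SIZE, os.SEEK_END)
        f.truncate()  # truncates at the current position


# Demo with hypothetical binary content.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"\x01\x02\x03binary payload")
append_trailer(demo_path, "Testing string")
recovered = read_trailer(demo_path)
remove_trailer(demo_path)
with open(demo_path, "rb") as f:
    remaining = f.read()
os.remove(demo_path)
print(recovered, remaining)
```

If the file is shorter than 40 bytes, the end-relative seek raises OSError, so real code may want to check the size first.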

Write to file and save to ftp with python 2.6

I'm trying to store a file I create on an ftp server.
I've been able to create the temp file and store it as an empty file, but I haven't been able to write any data to the file before storing it.
Here is the partially working code:
#Loggin to server.
ftp = FTP(Integrate.ftp_site)
ftp.login(paths[0], paths[1])
ftp.cwd(paths[3])
f = tempfile.SpooledTemporaryFile()
# Throws error.
f.write(bytes("hello", 'UTF-8'))
#No error, doesn't work.
#f.write("hello")
#Also, doesn't throw error, and doesn't write anything to the file.
# f.write("hello".encode('UTF-8'))
file_name = "test.txt"
ftp.storlines("Stor " + file_name, f)
#Done.
f.close()
ftp.quit()
What am I doing wrong?
Thanks
Seeking!
To know where to read or write in the file (or file-like object), Python keeps a pointer to a location in the file. The documentation simply calls it "the file's current position". So, if you have a file with these lines in it:
hello world
how are you
You can read it with Python like in the following code. Note that the tell() function tells you the file's position.
>>> f = open('file.txt', 'r')
>>> f.tell()
0
>>> f.readline()
'hello world\n'
>>> f.tell()
12
Python is now twelve characters "into" the file. If you'd count the characters, that means it's right after the newline character (\n is a single character). Continuing to read from the file with readlines() or any other reading function will use this position to know where to start reading.
Writing to the file will also use and increment the position. This means that if, after writing to the file you read from the file, Python will start reading at the position it has saved (which is right after whatever you just wrote), not the beginning of the file.
The ftp.storlines() function reads the file with readline(), which likewise only starts reading at the file's current position, so after whatever you wrote. You can solve this by seeking back to the start of the file before calling ftp.storlines(). Use f.seek(0) to reset the file position to the very start of the file.
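The effect of the file position can be seen without an FTP server. Here is a minimal Python 3 sketch (the question itself uses Python 2.6) that writes to a SpooledTemporaryFile, tries to read before and after seeking, and compares the results; the variable names are illustrative only.

```python
import tempfile

with tempfile.SpooledTemporaryFile() as f:
    f.write(b"hello\n")
    before = f.readline()  # position is at the end, so nothing is read
    f.seek(0)              # rewind to the start of the file
    after = f.readline()   # now the written line is returned

print(before)  # b''
print(after)   # b'hello\n'
```

In the question's code, the equivalent fix would be a call to f.seek(0) between the write and the ftp.storlines() call.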

Python failed to parse txt file but the file is confirmed to be 'txt' file

I have a piece of python code that reads from a txt file properly, but my colleague gave me another set of files that appears to be of type txt file as well. But when I ran the same python code, each line is read incorrectly.
For the new files, if the line is 240,022414114120,-500,Bauer_HS5,0
It would be read as str:2[]4[]0 []0[]2[]2[]4..... All those little rectangles between each character and the leading question mark characters are all invalid characters.
And it will further get converted to something like this:
[['\xff\xfe2\x004\x000\x00', '\x000\x002\x002\x004\x001\x004\x001\x001\x004\x001\x002\x000\x00', '\x00-\x005\x000\x000\x00',......
However, if I manually create a normal text file and copy/paste the content from the input file, the parser is able to read each line correctly. So I am thinking the input files are of a different type than a normal text file. But the files' suffix is indeed 'txt'.
The files come from a device that regularly sends files to our server. This parser works fine for another device that does the same thing. And the files from both devices are all of type 'txt'.
Each line is read with: for line in self._infile.xreadlines():
I am very confused why it would behave this way.
My python code is following.
def __init__(self, infile=sys.stdin, outfile=sys.stdout):
    if isinstance(infile, basestring):
        infile = open(infile)
    if isinstance(outfile, basestring):
        outfile = open(outfile, "w")
    self._infile = infile
    self._outfile = outfile
def sort(self):
    lines = []
    last_second = None
    for line in self._infile.xreadlines():
        line = line.replace('\r\n', '')
        fields = line.split(',')
        if len(fields) < 2:
            continue
        second = fields[1]
        if last_second and second != last_second:
            lines = sorted(lines, self._sort_lines)
            self._outfile.write("".join([','.join(x) for x in lines]))
            #self._outfile.write("\r\n")
            lines = []
        last_second = second
        lines.append(fields)
    if lines:
        lines = sorted(lines, self._sort_lines)
        self._outfile.write("".join([','.join(x) for x in lines]))
        #self._outfile.write("\r\n")
    self._infile.close()
    self._outfile.close()
The start of the file you described as coming from your colleague is "\xff\xfe". These two characters make up a "byte order mark" that indicates that the file is encoded with the "UTF-16-LE" encoding (that is, 16-bit Unicode with the lower byte first). Your Python script is reading with an 8-bit encoding (probably whatever your system's default encoding is), so you're seeing lots of extra null characters (the high bytes of the 16-bit characters).
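A quick way to check for this byte order mark is to compare the first bytes of the file against the constants in the codecs module. This is only a rough sketch: sniff_utf16_bom is a hypothetical helper, the sample bytes are built by hand to mimic the start of the line in the question, and a UTF-32-LE file would begin with the same two bytes, so the check is a heuristic.

```python
import codecs


def sniff_utf16_bom(data):
    """Return a codec name if data starts with a UTF-16 byte order mark, else None."""
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return None


sample = codecs.BOM_UTF16_LE + "240,0".encode("utf-16-le")
print(sample[:6])                  # b'\xff\xfe2\x004\x00'
print(sniff_utf16_bom(sample))     # utf-16-le
decoded = sample.decode("utf-16")  # the "utf-16" codec reads the mark and strips it
print(decoded)                     # 240,0
```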
I can't speak to how the file got a different encoding. Windows text editors (like notepad.exe) are somewhat notorious for silently reencoding files in unhelpful ways if you're not careful with them, so it may be that your colleague previewed the file in an editor and then saved it before forwarding it on to you.
Anyway, the simplest fix is probably to reencode the file. There are various utilities to do this on various OSs, or you could write your own easily enough. Here's a quick and dirty function to reencode a file in Python (which will hopefully raise an exception if the encoding parameters are wrong, but perhaps not always):
def reencode_file(filename, from_encoding="UTF-16", to_encoding="ascii"):
    with open(filename, "rb") as f:
        in_bytes = f.read()  # read bytes
    # "UTF-16" (rather than "UTF-16-LE") reads the byte order mark and strips it
    text = in_bytes.decode(from_encoding)  # decode to unicode
    out_bytes = text.encode(to_encoding)  # reencode to new encoding
    with open(filename, "wb") as f:
        f.write(out_bytes)  # write back to the file
If the file you get is going to always be encoded in UTF-16, you could change your regular script to decode it automatically. In Python 2.7, I'd suggest using the io module's open function for this (it is the same code that the regular open uses in Python 3). Note however that the file object returned won't support the xreadlines method which has been deprecated for a long time (just iterate over the file directly instead).
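Decoding while reading might look roughly like this. It is a sketch under assumptions: read_fields is a hypothetical stand-in for the parsing loop in the question, the "utf-16" codec is used so that the byte order mark is consumed automatically, and the demo file is generated on the spot from the sample line in the question.

```python
import io
import os
import tempfile


def read_fields(path, encoding="utf-16"):
    """Read a comma-separated text file, decoding it while iterating over its lines."""
    rows = []
    with io.open(path, encoding=encoding) as infile:
        for line in infile:  # iterate directly instead of calling xreadlines()
            fields = line.rstrip("\r\n").split(",")
            if len(fields) >= 2:
                rows.append(fields)
    return rows


# Demo: write a UTF-16 file with a byte order mark, then parse it.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(u"240,022414114120,-500,Bauer_HS5,0\r\n".encode("utf-16"))
rows = read_fields(demo_path)
os.remove(demo_path)
print(rows)
```

With the encoding handled at open time, the rest of the parser sees ordinary text and needs no other changes.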
