Python get rid of bytes b' ' - python

import save
string = ""
with open("image.jpg", "rb") as f:
    byte = f.read(1)
    while byte != b"":
        byte = f.read(1)
        print(byte)
I'm getting bytes like:
b'\x00'
How do I get rid of this b''?
Let's say I wanna save the bytes to a list, and then save this list as the same image again. How do I proceed?
Thanks!

You can use bytes.decode function if you really need to "get rid of b": http://docs.python.org/3.3/library/stdtypes.html#bytes.decode
But it seems from your code that you do not really need to do this, you really need to work with bytes.

The b"..." is just Python's notation for byte strings. It's not really in the data; it only appears when the value is printed. Does it cause you any real problems?

The b'' is only part of the string representation of the data that is written when you print it.
Using decode will not help you here because you only want the bytes, not the characters they represent. Slicing the string representation will help even less because then you are still left with a string of several useless characters ('\', 'x', and so on), not the original bytes.
There is no need to modify the string representation of the data, because the data is still there. Just use it instead of the string (i.e. don't use print). If you want to copy the data, you can simply do:
data = file1.read(...)
...
file2.write(data)
If you want to output the binary data directly from your program, use the sys.stdout.buffer:
import sys
sys.stdout.buffer.write(data)
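To tie this back to the "save the bytes to a list, then save the list as the same image" part of the question, here is a minimal sketch, assuming Python 3. The helper names file_to_list and list_to_file are made up for illustration, and a small temporary file stands in for image.jpg so the example runs anywhere.

```python
import os
import tempfile


def file_to_list(path):
    """Read a whole file and return its bytes as a list of ints (0-255)."""
    with open(path, "rb") as f:
        return list(f.read())


def list_to_file(values, path):
    """Write a list of ints (0-255) back to disk as raw bytes."""
    with open(path, "wb") as f:
        f.write(bytes(values))


# Demo on a throwaway file (a stand-in for a real image).
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.bin")
    dst = os.path.join(tmp, "out.bin")
    with open(src, "wb") as f:
        f.write(b"\x00\xff\xd8GIF")
    values = file_to_list(src)
    list_to_file(values, dst)
    with open(dst, "rb") as f:
        round_tripped = f.read()

print(values)  # plain integers, no b'' anywhere
print(round_tripped == b"\x00\xff\xd8GIF")
```

In Python 3, iterating over a bytes object already yields integers, so the list never contains b'...' items. The list is only needed if you want to modify individual values; for a straight copy, writing the bytes object directly is simpler.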

To operate on binary data you can use the array module.
Below you will find an iterator that operates on 4096-byte chunks of data instead of reading everything into memory at once.
import array
def bytesfromfile(f):
    while True:
        raw = array.array('B')
        raw.frombytes(f.read(4096))  # fromstring() was removed in Python 3.9
        if not raw:
            break
        yield raw
with open("image.jpg", 'rb') as fd:
    for byte in bytesfromfile(fd):
        for b in byte:
            pass  # do something with b
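If you don't need an array object, the same chunked reading can probably be done with the two-argument form of iter(). This is only a sketch: iter_chunks is a hypothetical helper name, and an in-memory io.BytesIO stream stands in for the opened image file.

```python
import io
from functools import partial


def iter_chunks(f, size=4096):
    """Return an iterator over bytes objects of at most `size` bytes."""
    # iter(callable, sentinel) keeps calling f.read(size) until it returns b"".
    return iter(partial(f.read, size), b"")


# Demo: an in-memory stream stands in for open("image.jpg", "rb").
stream = io.BytesIO(bytes(range(256)) * 40)  # 10240 bytes in total
sizes = [len(chunk) for chunk in iter_chunks(stream)]
print(sizes)
```

Each chunk is an ordinary bytes object, so it can be passed straight to write() on an output file.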

This is one way to get rid of the b'':
import sys
print(b)
If you want to save the bytes later it's more efficient to read the entire file in one go rather than building a list, like this:
with open('sample.jpg', mode='rb') as fh:
    content = fh.read()
with open('out.jpg', mode='wb') as out:
    out.write(content)

Here is one solution
print(str(byte)[2:-1])

Related

how to change byte type in Python

I have a small problem that has caused me a lot of trouble. Basically, I want to convert an image to bytes, store the string version of those bytes in a txt file, and then read the file contents and transform them back into bytes and then into an image. I've got the first part kind of ready (it works, but it's made quickly and badly), but the conversion from string to bytes gives me problems.
When I read the image bytes, it's something like this: b'GIF89aP\x00P\x00\xe3'
But when I read it from the txt file with 'rb', or just transform the str to bytes, it gives me this: b'GIF89aP\\x00P\\x00\\xe3'
And with this I can't write it to an image.
So I've tried to read and learn anything about this, but I couldn't find anything that would help.
The code is here, and I know it's really messy, but I just need it to work:
file = open('p.gif', 'rb')
image = file.read()
str_b = str(image)
leng = len(str_b)
print(leng)
str_b = str_b[:0] + str_b[0+2:]
leng =- 1
str_b = str_b[:leng]
print(image)
#a = open('bytearray', 'w+')
#a.write(str_b)
#a.close
a = open('bytearray', 'r')
a = a.read()
temp = a.encode('utf-8')
print(temp)
#b = open('check', 'w+')
#b.write(str(string))
#print(string)
image_result = open('decoded.jpg', 'wb') # create a writable image and write the decoding result
image_result.write(temp)
Basically, my goal right now is to get bytes that look like this: b'GIF89aP\x00P\x00\xe3'
Please do not use eval as suggested above. eval has serious security vulnerabilities and will execute any Python code you pass to it. You could accidentally read a text file that contains code to reformat the disk, and it would just execute. This is just an example, but you get my point: it's bad practice and just results in more problems. See https://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html if you want some examples of why eval is bad.
Anyway, let's try to fix your code.
Instead of converting your bytes to a string by wrapping them in str(), I would suggest you use .decode() and .encode().
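Fixed Code:
with open('p.gif', 'rb') as file:
    image = file.read() # read file bytes
# latin-1 maps every byte value 0-255 to exactly one character, so any binary
# data decodes without errors (utf-8 would raise UnicodeDecodeError on most images)
str_image = image.decode("latin-1") # using decode we changed the bytes to a string
# newline='' stops Python from translating line endings inside the image data
with open('image.txt', 'w', encoding='latin-1', newline='') as file:
    file.write(str_image) # write image as string to a text file
with open('image.txt', 'r', encoding='latin-1', newline='') as file:
    str_from_file = file.read() # read the text file and store the string
file_bytes = str_from_file.encode("latin-1") # encode the image str back to bytes
print(type(str_from_file)) # type is str
print(type(file_bytes)) # type is bytes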
I hope this fixes your issue and also doesn't introduce vulnerabilities into what you're building.
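If the text file only needs to hold the image safely and does not have to look like b'GIF89a...', Base64 is a common alternative. Note that this swaps in a different technique from the .decode()/.encode() approach above. It is a sketch under that assumption; the helper names bytes_to_text_file and text_file_to_bytes are hypothetical, and the GIF header bytes from the question stand in for a full image.

```python
import base64
import os
import tempfile


def bytes_to_text_file(data, path):
    """Store arbitrary bytes in a text file as Base64 (ASCII characters only)."""
    with open(path, "w", encoding="ascii") as f:
        f.write(base64.b64encode(data).decode("ascii"))


def text_file_to_bytes(path):
    """Read a Base64 text file and return the original bytes."""
    with open(path, "r", encoding="ascii") as f:
        return base64.b64decode(f.read())


original = b"GIF89aP\x00P\x00\xe3"  # header bytes quoted in the question
with tempfile.TemporaryDirectory() as tmp:
    txt = os.path.join(tmp, "image.txt")
    bytes_to_text_file(original, txt)
    restored = text_file_to_bytes(txt)

print(restored == original)
```

Base64 output contains only letters, digits, '+', '/' and '=', so the text file survives editors, copy and paste, and line-ending conversion, at the cost of being about a third larger than the image.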

How can I read the last bytes of a file?

I'm adding a string at the end of a binary file; the problem is that I don't know how to get that string back. I append the string to the binary file in ASCII format with this code:
f=open("file", "ab")
f.write("Testing string".encode('ascii'))
f.close()
The string will be at most 40 characters long; if it is shorter, it will be padded with zeros. But I don't know how to get the string back from the binary file, since it is at the end, or how to rewrite the file without the string. Thank you in advance.
Since you opened the file in append mode, you can't read from it like that.
You will need to reopen it in read mode like so:
f = open("file", "rb")
fb = f.read()
f.close()
For future reference, an easier way to open files is like this:
with open("file", "rb") as f:
    fb = f.read()
At that point you can use fb. Opened this way, the file is closed automatically once the with block has finished.
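To get just the trailing string back without reading the whole file, you can seek relative to the end of the file. The sketch below assumes the trailer is always exactly 40 bytes and that "padded with zeros" means NUL bytes (b"\x00"); both are assumptions about the question, and append_tail, read_tail and remove_tail are hypothetical helper names.

```python
import os
import tempfile

TAIL_LEN = 40  # fixed trailer size, taken from the question


def append_tail(path, text):
    """Append `text` as ASCII, padded with NUL bytes to TAIL_LEN bytes."""
    raw = text.encode("ascii")
    if len(raw) > TAIL_LEN:
        raise ValueError("string is longer than TAIL_LEN bytes")
    with open(path, "ab") as f:
        f.write(raw.ljust(TAIL_LEN, b"\x00"))


def read_tail(path):
    """Return the trailing string with its NUL padding removed."""
    with open(path, "rb") as f:
        f.seek(-TAIL_LEN, os.SEEK_END)  # position TAIL_LEN bytes before EOF
        return f.read(TAIL_LEN).rstrip(b"\x00").decode("ascii")


def remove_tail(path):
    """Cut the last TAIL_LEN bytes off the file in place."""
    with open(path, "r+b") as f:
        f.seek(-TAIL_LEN, os.SEEK_END)
        f.truncate()  # truncates at the current position


# Demo on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file")
    with open(path, "wb") as f:
        f.write(b"\x01\x02\x03 binary payload")
    append_tail(path, "Testing string")
    tail = read_tail(path)
    remove_tail(path)
    with open(path, "rb") as f:
        remaining = f.read()

print(tail)
print(remaining)
```

Truncating in place avoids rewriting the whole file. If the padding is actually the character "0" rather than NUL bytes, the ljust and rstrip arguments would need to change accordingly.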

Python - Read text file to String

I have the following Python script, which double hashes a hex value:
import hashlib
linestring = open('block_header.txt', 'r').read()
header_hex = linestring.encode("hex")  # Problem!!!
print header_hex
header_bin = header_hex.decode('hex')
hash = hashlib.sha256(hashlib.sha256(header_bin).digest()).digest()
hash.encode('hex_codec')
print hash[::-1].encode('hex_codec')
My text file "block_header.txt" (hex) looks like this:
0100000081cd02ab7e569e8bcd9317e2fe99f2de44d49ab2b8851ba4a308000000000000e320b6c2fffc8d750423db8b1eb942ae710e951ed797f7affc8892b0f1fc122bc7f5d74df2b9441a42a14695
Unfortunately, the result from printing the variable header_hex looks like this (not like the txt file):
303130303030303038316364303261623765353639653862636439333137653266653939663264653434643439616232623838353162613461333038303030303030303030303030653332306236633266666663386437353034323364623862316562393432616537313065393531656437393766376166666338383932623066316663313232626337663564373464663262393434316134326131343639350a
I think the problem is in this line:
header_hex = linestring.encode("hex")
If I remove the ".encode("hex")"-part, then I get the error
unhandled TypeError "Odd-length string"
Can anyone give me a hint what might be wrong?
Thank you a lot :)
You're doing too much encoding/decoding.
Like others mentioned, if your input data is hex, then it's a good idea to strip leading / trailing whitespace with strip().
Then, you can use decode('hex') to turn the hex ASCII into binary. After performing whatever hashing you want, you'll have the binary digest.
If you want to be able to "see" that digest, you can turn it back into hex with encode('hex').
The following code works on your input file with any kinds of whitespace added at the beginning or end.
import hashlib
def multi_sha256(data, iterations):
    for i in xrange(iterations):
        data = hashlib.sha256(data).digest()
    return data
with open('block_header.txt', 'r') as f:
    hdr = f.read().strip().decode('hex')
_hash = multi_sha256(hdr, 2)
# Print the hash (in hex)
print 'Hash (hex):', _hash.encode('hex')
# Save the hash to a hex file
open('block_header_hash.hex', 'w').write(_hash.encode('hex'))
# Save the hash to a binary file
open('block_header_hash.bin', 'wb').write(_hash)
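The code above is Python 2 (print statement, xrange, str.encode('hex')). A rough Python 3 equivalent could look like the sketch below, assuming Python 3.5 or later for bytes.hex(). The function name double_sha256_hex is hypothetical, and the header is inlined with a trailing newline to mimic what reading block_header.txt returns.

```python
import hashlib


def double_sha256_hex(header_hex):
    """Double SHA-256 of hex-encoded data; returns the byte-reversed digest as hex."""
    header = bytes.fromhex(header_hex.strip())  # strip() drops the trailing newline
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return digest[::-1].hex()  # reversed, like hash[::-1] in the question


# Contents of block_header.txt from the question, including a trailing newline.
HEADER_HEX = (
    "0100000081cd02ab7e569e8bcd9317e2fe99f2de44d49ab2b8851ba4a308000000000000"
    "e320b6c2fffc8d750423db8b1eb942ae710e951ed797f7affc8892b0f1fc122b"
    "c7f5d74df2b9441a42a14695\n"
)

result = double_sha256_hex(HEADER_HEX)
print(result)
```

The trailing newline is the likely source of the original "Odd-length string" error: with it, the text has an odd number of characters, so it cannot be split into hex pairs.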

Python 2.6: Creating image from array

Python rookie here! So, I have a data file which stores a list of bytes, representing pixel values in an image. I know that the image is 3-by-3 pixels. Here's my code so far:
import PIL.Image
# Part 1: read the data
data = []
file = open("test.dat", "rb")
for i in range(0, 9):
    byte = file.read(1)
    data.append(byte)
file.close()
# Part2: create the image
image = PIL.Image.frombytes('L', (3, 3), data)
image.save('image.bmp')
I have a couple of questions:
In part 1, is this the best way to read a binary file and store the data in an array?
In part 2, I get the error "TypeError: must be string or read-only buffer, not list."
Any help on either of these?
Thank you!
Part 1
If you know that you need exactly nine bytes of data, that looks like a fine way to do it, though it would probably be cleaner/clearer to use a context manager and skip the explicit loop:
with open('test.dat', 'rb') as infile:
    data = list(infile.read(9)) # read nine bytes and convert to a list
Part 2
According to the documentation, the data you must pass to PIL.Image.frombytes is:
data – A byte buffer containing raw data for the given mode.
A list isn't a byte buffer, so you're probably wasting your time converting the input to a list. My guess is that if you pass it the byte string directly, you'll get what you're looking for. This is what I'd try:
with open('test.dat', 'rb') as infile:
    data = infile.read(9) # Don't convert the bytestring to a list
image = PIL.Image.frombytes('L', (3, 3), data) # pass in the bytestring
image.save('image.bmp')
Hopefully that helps; obviously I can't test it over here since I don't know what the content of your file is.
Of course, if you really need the bytes as a list for some other reason (doubtful--you can iterate over a string just as well as a list), you can always either convert them to a list when you need it (datalist = list(data)) or join them into a string when you make the call to PIL:
image = PIL.Image.frombytes('L', (3, 3), ''.join(datalist))
Part 3
This is sort of an aside, but it's likely to be relevant: do you know what version of PIL you're using? If you're using the actual, original Python Imaging Library, you may also be running into some of the many problems with that library--it's super buggy and unsupported since about 2009.
If you are, I highly recommend getting rid of it and grabbing the Pillow fork instead, which is the live, functional version. You don't have to change any code (it still installs a module called PIL), but the Pillow library is superior to the original PIL by leaps and bounds.

reading *.his (image) file in Python

I am trying to read an image file which is in *.his format. Honestly, I do not know much about this format. After spending some time on Google, I figured out that it's a binary format and that it can be read in the ImageJ software using its raw import. On further inquiry, I found the following details of the *.his file:
Image type = 16-bit unsigned
Matrix dimensions in pixels = w1024 x h1024
Skip header info = 100 Bytes (The number of bytes in the file before the first byte of image data).
Little-Endian Byte Order
With this information in hand, I started out ...
I just wanted to print the values one by one to see the output:
from struct import unpack
f = open("file.his", 'rb')
f.seek(100)
try:
    byte = f.read(2)
    while byte != "":
        byte = f.read(2)
        print unpack('<H', byte)
finally:
    f.close()
It prints some numbers out and then the error message:
.....
(64846,)
(64846,)
(64830,)
Traceback (most recent call last):
print unpack('
Can someone please suggest how to read this kind of file? I still think 'unpack' is the right function; however, if someone has similar experience, any response is greatly appreciated.
Rky.
I've done a very similar task with an *.inr image file; maybe the logic could help you. Here is what you could apply:
1-Reading the file
First you need to read the file.
file = open(hisfile, 'rb') # binary mode, because the file holds raw image data
inp = file.readlines()
2-Get header
In my case I ran a for loop until the number of characters reached 256. In your case you need to count the bytes, so you could print line by line to find out where to stop, or use this to count them:
len(line) # number of bytes in the line (sys.getsizeof would also count the object's overhead)
3-Data
When you already know that the following lines are the raw data, you need to put them in one variable with a for loop:
raw_data = b''
for line in inp:
    raw_data += line
4-Convert the data
To convert the bytes to a numpy array you could do:
import numpy
data = numpy.frombuffer(raw_data, dtype='<u2') # little-endian unsigned 16-bit
And then apply the shape:
data = data.reshape((1024,1024)).transpose() # Check whether the transpose is relevant for you; in my case it was fundamental.
Maybe if you have an example of the file I could try to read it and help you more. Of course, you could do the whole process in one for loop using ifs.
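Going only by the details listed in the question (fixed-size header to skip, 16-bit unsigned pixels, little-endian), a standard-library-only reader might look like the sketch below. It has not been tested on a real *.his file: read_uint16_image is a hypothetical helper, and the demo builds a tiny 3x2 fake file instead of a 1024x1024 one.

```python
import array
import os
import struct
import sys
import tempfile


def read_uint16_image(path, width, height, header_bytes):
    """Skip the header, then read width*height little-endian uint16 pixels as rows."""
    pixels = array.array("H")
    if pixels.itemsize != 2:
        raise RuntimeError("array type 'H' is not 16 bits wide on this platform")
    count = width * height
    with open(path, "rb") as f:
        f.seek(header_bytes)
        raw = f.read(count * 2)
    if len(raw) != count * 2:
        raise ValueError("file is too short for the given dimensions")
    pixels.frombytes(raw)
    if sys.byteorder == "big":
        pixels.byteswap()  # the file is little-endian; swap on big-endian hosts
    return [pixels[r * width:(r + 1) * width].tolist() for r in range(height)]


# Demo: a fake file with a 100-byte header and six pixels.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tiny.his")
    with open(path, "wb") as f:
        f.write(b"\x00" * 100)
        f.write(struct.pack("<6H", 1, 2, 3, 64846, 64830, 0))
    rows = read_uint16_image(path, width=3, height=2, header_bytes=100)

print(rows)
```

For the file described in the question the call would presumably be read_uint16_image("file.his", 1024, 1024, 100). Reading all pixel bytes in one go also avoids calling unpack on the empty string that read() returns at end of file, which is likely what triggered the error in the original loop.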
