How to get string data from a python PIL image object?

How to get string data from a python PIL image object? - python

I'm trying to send the data of a gif file of a desktop through a socket to a remote desktop (for desktop sharing) but i can't get the string for the data using PIL, i don't know how to convert the Pil objects to string, here is my code (btw i know i can just write to a file then read the data like that but i think that is inefficient and i think that there is a better way maybe?).
from PIL import ImageGrab
import cStringIO
fakie = cStringIO.StringIO()
ImageGrab.grab().save(fakie, 'GIF')
data = fakie.getvalue()
fakie.close()
# This last bit of code is to see if the var data stored the right info in a str bc i need to send it through a socket
with open('C:\something\something\Desktop\image.gif', 'w') as f:
f.write(data)
The problem is that after the file is written the gif picture only displays the top 1/10 of the page (the gif file is messed up), and so i'm wondering if the problem lies within my computer or my code (i'm using vista on a VERY old computer, at least 6 years i think and i'm getting a new one soon). Any input is appreciated.

As #HYRY puts it, you must open the image file with "wb" mode instead of "w" - Without the "b" Python defaults to open it in text mode - in windows it means that whenever a 0x0a byte is written to the file, the O.S. writes a 0x0d 0x0a sequence instead, because it translates line ending sequences to Windows native line endings.
In the "wb" mode, there is no translation, and your image file won't be corrupted.

Related

python write file vs matlab write file, read by another software

I've encountered this strange problem with opening/closing files in python. I am trying to do the same thing in python that i was doing successfully in matlab, and i am getting a problem communicating with some software through text files. I've come up with a strange workaround to solve the problem, but i dont understand why it works.
I have software that communicates with a some lab equipment. To communicate with this software, i write a file ('wavefile.txt') to a specific folder, containing parameters to send to the device. I then write another file named 'request.txt' containing the location of this first file ('wavefile.txt') which contains the parameters to send to the device. The software is constantly checking this folder to find the file named 'request.txt' and once it finds it, it will read the parameters in the file which is specified by the text in 'request.txt' and then delete 'request.txt'. The software/equipment developer instructs to give a 50 ms second delay before closing the 'request.txt' file.
original matlab code that works:
home = cd;
cd \\CREOL-FAST-01\data
fileID = fopen('request.txt', 'wt');
proj = 'C:\\dazzler\\data\\wavefile.txt';
fprintf(fileID, proj);
pause(0.05);
fclose('all');
cd(home);
original python code that does not work:
home = os.getcwd()
os.chdir(r'\\CREOL-FAST-01\data')
with open('request.txt', 'w') as file:
proj = r'C:\dazzler\data\wavefile.txt'
file.write(proj)
time.sleep(0.05)
os.chdir(home)
Every time the device program reads the 'request.txt' when its working with matlab, it deletes it immediately after matlab closes it. When i run that code with python it works SOMETIMES, maybe 1 in every 5 tries will be successful and the parameters are sent. The 'request.txt' file is always deleted with the python code above,but the parameters i've input are clearly not sent to my lab device. My guess is that when i write the file in python, the device program is able to read it before python writes the text to it, so its just opening the blank file, not applying any parameters, and then deleting it.
My workaround in python:
home = os.getcwd()
os.chdir(r'\\CREOL-FAST-01\data')
fileh = open('request.txt', 'w+')
proj = r'C:\dazzler\data\wavefile.txt'
fileh.write(proj)
time.sleep(0.05)
print(fileh.read())
time.sleep(0.05)
fileh.close()
This method in python seems to work 100% of the time. I open the file in w+ mode, and using fileh.read() is absolutely necessary. if i delete that line and still include the extra sleeptime, it will again work about 1 in 5 tries. This seems really strange to me. Any explanation, or better solutions?

My guess (which could be wrong) is that the file is being read before it is completely flushed. I would try using the flush() method after the write to make sure that the complete data is written to the file. You might also need the os.fsync() method to make sure the data is flushed properly. Try something like this:
home = os.getcwd()
os.chdir(r'\\CREOL-FAST-01\data')
with open('request.txt', 'w') as file:
proj = r'C:\dazzler\data\wavefile.txt'
file.write(proj)
file.flush()
os.fsync()
time.sleep(0.05)
os.chdir(home)

Not knowing any details about the particular equipment and other software you are using it's hard to say. One guess is the difference in buffering on write calls.
From this blog post on Matlab's fwrite: "The default behavior for fprintf and fwrite is to flush the file buffer after each call to either of these functions"
Whereas for Python's open:
When no buffering argument is given, the default buffering policy
works as follows:
Binary files are buffered in fixed-size chunks; the size of the buffer
is chosen using a heuristic trying to determine the underlying
device’s “block size” and falling back on io.DEFAULT_BUFFER_SIZE. On
many systems, the buffer will typically be 4096 or 8192 bytes long.
“Interactive” text files (files for which isatty() returns True) use
line buffering. Other text files use the policy described above for
binary files.
To test this guess change:
with open('request.txt', 'w') as file:
proj = r'C:\dazzler\data\wavefile.txt'
to:
with open('request.txt', 'w', buffer=1) as file:
proj = 'C:\\dazzler\\data\\wavefile.txt\n'

The problem is probably that you are doing the delay while the file is still open and thus not written to disk. Try something like this:
home = os.getcwd()
os.chdir(r'\\CREOL-FAST-01\data')
with open('request.txt', 'w') as file:
proj = r'C:\dazzler\data\wavefile.txt'
file.write(proj)
time.sleep(0.05)
os.chdir(home)
The only difference here is that the sleep is done after the file is closed (the file is closed when the with block ends), and thus the delay doesn't happen until after the text is written to disk.
To put it in words, what you are doing is:
Open (and create) file
Write text to a buffer (in memory, not on disk)
Wait 50 ms
Close (and write) the file
What you want to do is:
Open (and create) file
Write text to a buffer (in memory, not on disk)
Close (and write) the file
Wait 50 ms
So what you end up with is a period of at least 50 ms where the text file has been created, but where there is nothing in it because the text is sitting in your computer memory not on disk.
To be honest, I would put as little in the with block as possible, to avoid issues like this. So I would write it like so:
home = os.getcwd()
os.chdir(r'\\CREOL-FAST-01\data')
proj = r'C:\dazzler\data\wavefile.txt'
with open('request.txt', 'w') as file:
file.write(proj)
time.sleep(0.05)
os.chdir(home)
Also keep in mind that you also can't do the opposite: assume that no text is written until you close. For small files like this, that will probably happen. But when the file is written to disk depends on a lot of factors. When a file is closed, it is written, but it may be written before that too.

Why does setting the "hidden file" attribute on a file make it read only for Python (3.5) on Windows 7 [duplicate]

I want to replace the contents of a hidden file, so I attempted to open it in w mode so it would be erased/truncated:
>>> import os
>>> ini_path = '.picasa.ini'
>>> os.path.exists(ini_path)
True
>>> os.access(ini_path, os.W_OK)
True
>>> ini_handle = open(ini_path, 'w')
But this resulted in a traceback:
IOError: [Errno 13] Permission denied: '.picasa.ini'
However, I was able to achieve the intended result with r+ mode:
>>> ini_handle = open(ini_path, 'r+')
>>> ini_handle.truncate()
>>> ini_handle.write(ini_new)
>>> ini_handle.close()
Q. What is the difference between the w and r+ modes, such that one has "permission denied" but the other works fine?
UPDATE: I am on win7 x64 using Python 2.6.6, and the target file has its hidden attribute set. When I tried turning off the hidden attribute, w mode succeeds. But when I turn it back on, it fails again.
Q. Why does w mode fail on hidden files? Is this known behaviour?

It's just how the Win32 API works. Under the hood, Python's open function is calling the CreateFile function, and if that fails, it translates the Windows error code into a Python IOError.
The r+ open mode corresponds to a dwAccessMode of GENERIC_READ|GENERIC_WRITE and a dwCreationDisposition of OPEN_EXISTING. The w open mode corresponds to a dwAccessMode of GENERIC_WRITE and a dwCreationDisposition of CREATE_ALWAYS.
If you carefully read the remarks in the CreateFile documentation, it says this:
If CREATE_ALWAYS and FILE_ATTRIBUTE_NORMAL are specified, CreateFile fails and sets the last error to ERROR_ACCESS_DENIED if the file exists and has the FILE_ATTRIBUTE_HIDDEN or FILE_ATTRIBUTE_SYSTEM attribute. To avoid the error, specify the same attributes as the existing file.
So if you were calling CreateFile directly from C code, the solution would be to add in FILE_ATTRIBUTE_HIDDEN to the dwFlagsAndAttributes parameter (instead of just FILE_ATTRIBUTE_NORMAL). However, since there's no option in the Python API to tell it to pass in that flag, you'll just have to work around it by either using a different open mode or making the file non-hidden.

Here are the detailed differences:-
``r'' Open text file for reading. The stream is positioned at the
beginning of the file.
``r+'' Open for reading and writing. The stream is positioned at
the
beginning of the file.
``w'' Truncate file to zero length or create text file for writing.
The stream is positioned at the beginning of the file.
``w+'' Open for reading and writing. The file is created if it does
not
exist, otherwise it is truncated. The stream is positioned at
the beginning of the file.
``a'' Open for writing. The file is created if it does not exist.
The
stream is positioned at the end of the file. Subsequent writes
to the file will always end up at the then current end of file,
irrespective of any intervening fseek(3) or similar.
``a+'' Open for reading and writing. The file is created if it does
not
exist. The stream is positioned at the end of the file. Subse-
quent writes to the file will always end up at the then current
end of file, irrespective of any intervening fseek(3) or similar.
From python documentation - http://docs.python.org/2/tutorial/inputoutput.html#reading-and-writing-files:-
On Windows, 'b' appended to the mode opens the file in binary mode, so
there are also modes like 'rb', 'wb', and 'r+b'. Python on Windows
makes a distinction between text and binary files; the end-of-line
characters in text files are automatically altered slightly when data
is read or written. This behind-the-scenes modification to file data
is fine for ASCII text files, but it’ll corrupt binary data like that
in JPEG or EXE files. Be very careful to use binary mode when reading
and writing such files. On Unix, it doesn’t hurt to append a 'b' to
the mode, so you can use it platform-independently for all binary
files.
So if you are using w mode, you are actually trying to create a file and you may not have the permissions to do it. r+ is the appropriate choice.
If you are in a situation where you do not yet know where your .picasi.ini exists or not and your windows user has file creation permissions in that directory and you want to append new information instead of starting at the beginning of the file (a.k.a "append"), then a+ will be the appropriate choice.
It has nothing to do with whether your file is hidden or not.

Thanks for this thread; I had the same issue today. My workaround is as follows. Works with Python 3.7
import os
GuiPanelDefaultsFileName = 'panelDefaults.json'
GuiPanelValues = {
'-FileName-' : os.getcwd() + '\\_AcMovement.xlsx',
'-DraftEmail-' : True,
'-MonthComboBox-' : 'Jun',
'-YearComboBox-' : '2020'
}
# Unhide the file via OS
if os.path.isfile(GuiPanelDefaultsFileName):
os.system(f'attrib -h {GuiPanelDefaultsFileName}')
# Write dict values to json
with open(GuiPanelDefaultsFileName, 'w') as fp:
json.dump(GuiPanelValues, fp, indent=4)
# Make it hidden again
os.system(f'attrib +h {GuiPanelDefaultsFileName}')

Problems with cross platform tell / seek in Python

Having a weird bug with Python 2.7.3 file reading. If I do this sort of thing:
end_of_header = f.tell()
print f.readline()
f.seek(end_of_header)
print f.readline()
the results are different. The file was written in Linux / Mac (not sure) and I'm trying to run it on Windows 7. If I run it in Linux it works. I have tried opening the file with both 'b' and 'U' tags and its not working. I have tried various encodings by opening with the codecs module.
Is the readline() causing the problem?
Some context is that there is a header after which there are a long trajectory (can be in the GB range) I need to be able read the header and process it, then read the file one line at a time. I may need to go back to the start of the file (end of the header) at any time though.

As you say of Windows and Linux/Mac , I think you have
a problem of different newlines ( http://www.editpadpro.com/tricklinebreak.html )
used by the operating system in which the file was written and the one in which it is read.
And the problem arises because you opened the file in a not-binary mode.
Try to open the file in binary mode, that is to say with 'rb' or 'rb+' or 'ab' or 'ab+' according what you want to do.

Do different OS's affect the md5 checksum of a file in Python?

So I have a python script that uses the pyserial library to send a file over serial to another computer. I wrote some script to calculate the md5 checksum of the file before and after being sent over serial and I have encountered some problems.
Example:
I sent a simple file named third.txt containing a list of numbers 1 through 10. Simple file, nothing fancy or large. The checksum of the file before transmitting is completely different than the checksum of the file after transmitting on the other computer, even though the files are clearly the same.
I checked to see if there was something wrong with my code by simply moving the file over a USB and doing the checksum calulations this way. This time it worked.
Any ideas why this is happening and how I might possibly fix it?
Here is my checksum code before sending. This is not the exact code, but basically what I did.
<<Code that waits for command from client>>
with open(file_loc) as file_to_read:
data = file_to_read.read()
md5a = hashlib.md5(data).hexdigest()
ser.write('\n' + md5a + '\n')
Here is my checksum code after sending.
with open(file_loc) as file_to_read:
data = file_to_read.read()
md5b = hashlib.md5(data).hexdigest()
print('Sending Checksum Command')
ser.write("\n<<SENDCHECKSUM>>\n")
md5a = ser.readline()
print(md5a)
print(md5b)
if md5a == md5b:
print("Correct File Transmission")
else:
print("The checksum indicated incorrect file transmission, please check.")
ser.flush()

Yes, opening a file in text mode potentially can result in different data being read as newlines are translated for you from the platform native format to \n. Thus, files containing \r\n will give you a different checksum when read on Windows vs. a POSIX platform.
Open files in binary mode instead:
with open(file_loc, 'rb') as file_to_read:
Note that the same applies when writing a file. If you receive data from a POSIX system using \n line endings, and you write this to a file opened for writing in text mode on Windows, you'll end up with \r\n line endings in the written file.
If you are using Python 3, you are complicating matters some more. When you are opening files in text mode, you are translating the data from encoded bytes to decoded Unicode values. What codec is used for that can also differ from OS to OS, and even from machine to machine. The default is locale-defined (using locale.getpreferredencoding(False)), and as long as the data is decodable by the default locale, you can get very different results from reading a file using a different codec. You really want to ensure you use the same codec by setting it explicitly, or better still, open files in binary mode.
Since hashlib requires you to feed it byte strings, this is less of a problem when trying to calculate the digest (you'd have run into that problem and at least have to think about codecs there), but this applies to file transfers too; writing to text file will encode the data to the default codec.

How to separate content from a file that is a container for binary and other forms of content

I am trying to parse some .txt files. These files serve as containers for a variable number of 'children' files that are set off or identified within the container with SGML tags. With python I can easily separate the children files. However I am having trouble writing the binary content back out as a binary file (say a gif or jpg). In the simplest case the container might have an embedded html file followed by a graphic that is called by the html. I am assuming that my problem is because I am reading the original .txt file using open(filename,'r'). But that seems the only option to find the sgml tags to split the file.
I would appreciate any help to identify some relevant reading material.
I appreciate the suggestions but I am still struggling with the most basic questions. For example when I open the file with wordpad and scroll down to the section tagged as a gif I see this:
<FILENAME>h65803h6580301.gif
<DESCRIPTION>GRAPHIC
<TEXT>
begin 644 h65803h6580301.gif
M1TE&.#EA(P)I`=4#`("`#,#`P$!`0+^_OW]_?_#P\*"#H.##X-#0T&!#8!`0
M$+"PL"`#('!P<)"0D#`P,%!04#\_/^_O[Y^?GZ^OK]_?WX^/C\_/SV]O;U]?
I can handle finding the section easily enough but where does the gif file begin. Does the header start with 644, the blanks after the word begin or the line beginning with MITE?
Next, when the file is read into python does it do anything to the binary code that has to be undone when it is read back out?
I can find the lines where the graphics begin:
filerefbin=file('myfile.txt','rb')
wholeFile=filerefbin.read()
import re
graphicReg=re.compile('<DESCRIPTION>GRAPHIC')
locationGraphics=graphicReg.finditer(wholeFile)
graphicsTags=[]
for match in locationGraphics:
graphicsTags.append(match.span())
I can easily use the same process to get to the word begin, or to identify the filename and get to the end of the filename in the 'first' line. I have also successefully gotten to the end of the embedded gif file. But I can't seem to write out the correct combination of things so when I double click on h65803h6580301.gif when it has been isolated and saved I get to see the graphic.
Interestingly, when I open the file in rb, the line endings appear to still be present even though they don't seem to have any effect in notebpad. So that is clearly one of my problems I might need to readlines and join the lines together after stripping out the \n
I love this site and I love PYTHON
This was too easy once I read bendin's post. I just had to snip the section that began with the word begin and save that in a txt file and then run the following command:
import uu
uu.decode(r'c:\test2.txt',r'c:\test.gif')
I have to work with some other stuff for the rest of the day but I will post more here as I look at this more closely. The first thing I need to discover is how to use something other than a file, that is since I read the whole .txt file into memory and clipped out the section that has the image I need to work with the clipped section instead of writing it out to test2.txt. I am sure that can be done its just figuring out how to do it.

What you're looking at isn't "binary", it's uuencoded. Python's standard library includes the module uu, to handle uuencoded data.
The module uu requires the use of temporary files for encoding and decoding. You can accomplish this without resorting to temporary files by using Python's codecs module like this:
import codecs
data = "Let's just pretend that this is binary data, ok?"
uuencode = codecs.getencoder("uu")
data_uu, n = uuencode(data)
uudecode = codecs.getdecoder("uu")
decoded, m = uudecode(data_uu)
print """* The initial input:
%(data)s
* Encoding these %(n)d bytes produces:
%(data_uu)s
* When we decode these %(m)d bytes, we get the original data back:
%(decoded)s""" % globals()

You definitely need to be reading in binary mode if the content includes JPEG images.
As well, Python includes an SGML parser, http://docs.python.org/library/sgmllib.html .
There is no example there, but all you need to do is setup do_ methods to handle the sgml tags you wish.

You need to open(filename,'rb') to open the file in binary mode. Be aware that this will cause python to give You confusing, two-byte line endings on some operating systems.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

How to get string data from a python PIL image object? - python

Related

python write file vs matlab write file, read by another software

Why does setting the "hidden file" attribute on a file make it read only for Python (3.5) on Windows 7 [duplicate]

Problems with cross platform tell / seek in Python

Do different OS's affect the md5 checksum of a file in Python?

How to separate content from a file that is a container for binary and other forms of content

Categories

Resources