PIL changed how it converts RGB to L (greyscale)? - python

I have the exact same image processing code running on two different machines inside conda environments, and I expect them both to give identical results, but they don't. So I did a lot of digging and found that the PIL function Image.convert() is giving different outputs on the two machines for exactly the same input, which can be explained by the PIL version on one machine being 6.2.1, and on the other machine being 7.0.0.
Looking at the documentation of 6.2.1 and 7.0.0 though, absolutely nothing changed about the formula they use to convert RGB to L in Image.convert(). I cloned the git repo and diffed the implementation, and the only change I can find is this:
diff --git a/src/libImaging/Convert.c b/src/libImaging/Convert.c
index 60513c66..9f572254 100644
--- a/src/libImaging/Convert.c
+++ b/src/libImaging/Convert.c
@@ -44,7 +44,8 @@
#define L(rgb)\
((INT32) (rgb)[0]*299 + (INT32) (rgb)[1]*587 + (INT32) (rgb)[2]*114)
#define L24(rgb)\
- ((rgb)[0]*19595 + (rgb)[1]*38470 + (rgb)[2]*7471)
+ ((rgb)[0]*19595 + (rgb)[1]*38470 + (rgb)[2]*7471 + 0x8000)
+
#ifndef round
double round(double x) {
Note that for my conversion from RGB to L (8-bit), the change in the L24() macro above should be irrelevant.
I don't understand where the difference is coming from and would like to understand it. Can anyone with more PIL insight show how/why/where it changed? Is there any note in the documentation that I've missed that this was changed?
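For reference, a quick way to test which of the two arithmetic variants from the diff above a given installation actually matches is to recompute both in numpy and compare (a diagnostic sketch; "input.png" stands in for the real image):
import numpy as np
from PIL import Image

rgb = np.asarray(Image.open("input.png").convert("RGB"), dtype=np.uint32)
r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]

# The two fixed-point conversions from Convert.c: truncated vs. rounded (+ 0x8000)
trunc = (r * 19595 + g * 38470 + b * 7471) >> 16
rounded = (r * 19595 + g * 38470 + b * 7471 + 0x8000) >> 16

pil_l = np.asarray(Image.open("input.png").convert("RGB").convert("L"), dtype=np.uint32)
print("matches truncated:", np.array_equal(pil_l, trunc))
print("matches rounded:  ", np.array_equal(pil_l, rounded))
Running this on both machines should at least pin down which arithmetic each installed version uses, even if it does not explain where the change is documented.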

Passing a float array pointer through a python extension/wrapper – SndObj-library

So I'm feeling that Google is getting tired of trying to help me with this.
I've been trying to experiment some with the SndObj library as of late, and more specifically the python wrapper of it.
The library is kind enough to include a Python example to play around with, the only issue being getting it to work. The last line below is giving me a world of hurt:
from sndobj import SndObj, SndRTIO, HarmTable, Oscili, SND_OUTPUT
from scipy import zeros, pi, sin, float32
import numpy
sine = numpy.array([256],float32)
for i in range(sine.size):
    sine[i] = 0.5 * sin((2 * pi * i) / sine.size)
sine *= 32768
obj = SndObj()
obj.PushIn(sine,256)
In the original code it was:
obj.PushIn(sine)
That gave me the error
TypeError: SndObj_PushIn() takes exactly 3 arguments (2 given)
Alright, fair enough. I checked the (automatically generated) documentation and some example code around the web and found that it also wants an integer size. Said and done (I like how the bundled example ships with what I'm guessing is dated code).
Anyway, new argument; new error:
TypeError: in method 'SndObj_PushIn', argument 2 of type 'float *'
I'm not experienced at all in C++, which I believe is the library's "native" (excuse my lack of proper terminology) language, but I'm pretty sure I've picked up that it wants a float array/vector as its second argument (the first being self). However, I am having a hard time accomplishing that. Isn't what I've got a float array/vector already? I've also, among other things, tried using float instead of float32 in the first line and float(32768) in the fourth, to no avail.
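(For what it's worth, the numpy side can at least be checked: a float32 array really is a packed buffer of C floats, as the small sketch below shows. Whether the SWIG-generated wrapper will accept it as a float * is a separate question, since plain SWIG bindings usually need a dedicated typemap or helper class for that. The zeros(256) call here is only for illustration and not taken from my script.)
import numpy

sine = numpy.zeros(256, dtype=numpy.float32)   # illustration only
print(sine.dtype, sine.itemsize)               # float32, 4 bytes per element
print(sine.ctypes.data)                        # address of the raw C buffer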
Any help, suggestion or tip would be much appreciated!
EDIT:
Became unsure of the float vector/array part and went to the auto-docs again:
int SndObj::PushIn(float *vector,
                   int size)
So I'd say that at least the C++ side wants a float array/vector, although I can of course still be wrong about the Python wrapper.
UPDATE
As per Prune's request (saying that the error message isn't asking for a float vector, but saying that that's the error), I tried inputting different integer (int, int32, etc.) vectors instead. However, seeing that I still got the same error message and keeping the EDIT above in mind, I'd say that it's actually supposed to be a float vector after all.
UPDATE2
After some hints from saulspatz I've changed the question title and tags to better formulate my problem. I did some further googling according to this as well, but am yet to dig out anything useful.
UPDATE3
SOLVED
Actually, the problem is the opposite: PushIn takes an array of integers. The error message is complaining that you gave it floats. Try this in place of your call to PushIn
int_sine = numpy.array([256],int32)
int_sine = [int(x) for x in sine]
and then feed int_sine instead of sine to PushIn.
I don't really have an answer to your question, but I have some information for you that's too long to fit in a comment, and that I think may prove useful. I looked at the source of what I take to be the latest version, SndObj 2.6.7. In SndObj.h the definition of PushIn is
int PushIn(float *in_vector, int size){
    for(int i = 0; i < size; i++){
        if(m_vecpos >= m_vecsize) m_vecpos = 0;
        m_output[m_vecpos++] = in_vector[i];
    }
    return m_vecpos;
}
so it's clear that size is the number of elements to push. (I presume this would be the number of elements in your array, and 256 is right.) The float* means a pointer to float; in_vector is just an identifier. I read the error message to mean that the function received a float when it was expecting a pointer to float. In a C++ program, you might pass a pointer to float by passing the name of an array of floats, though this is not the only way to do it.
I don't know anything about how python extensions are programmed, I'm sorry to say. From what I'm seeing, obj.PushIn(sine,256) looks right, but that's a naive view.
Perhaps with this information, you can formulate another question (or find another tag) that will attract the attention of someone who knows about writing python extensions in C/C++.
I hope this helps.
So I finally managed to get it working (with some assistance from the very friendly wrapper author)!
It turns out that there is a floatArray class in the sndobj wrapper which is used for passing float arrays to the C++ functions. I'm guessing that it was added after numpy-test.py was written, which is what threw me for a loop.
Functioning code:
from sndobj import SndObj, SndRTIO, SND_OUTPUT, floatArray
from scipy import pi, sin
# ---------------------------------------------------------------------------
# Test PushIn
# Create 1 frame of a sine wave in a numpy array
sine = floatArray(256)
for i in range(256):
    sine[i] = float(32768 * 0.5 * sin((2 * pi * i) / 256))
obj = SndObj()
obj.PushIn(sine,256)
outp = SndRTIO(1, SND_OUTPUT)
outp.SetOutput(1, obj)
# Repeatedly output the 1 frame of sine wave
duration = outp.GetSr() * 2 # 2 seconds
i = 0
vector_size = outp.GetVectorSize()
while i < duration:
    outp.Write()
    i += vector_size

How to use the DICOM LUT decoder with gdcm in Python - passing buffers

I need to use GDCM for converting DICOM images to PNG format. While this example works, it does not seem to take the LUT into account, and thus I get a mixture of inverted and non-inverted images. While I'm familiar with both C++ and Python, I can't quite grasp the black magic inside the wrapper. The documentation is purely written in C++ and I need some help connecting the dots.
The main task
Convert the following section in the example:
def gdcm_to_numpy(image):
    ....
    gdcm_array = image.GetBuffer()
    result = numpy.frombuffer(gdcm_array, dtype=dtype)
    ....
to something like this:
def gdcm_to_numpy(image):
    ....
    gdcm_array = image.GetBuffer()
    lut = image.GetLUT()
    gdcm_decoded = lut.Decode(gdcm_array)
    result = numpy.frombuffer(gdcm_decoded, dtype=dtype)
    ....
Now this gives the error:
NotImplementedError: Wrong number or type of arguments for overloaded function 'LookupTable_Decode'.
Possible C/C++ prototypes are:
gdcm::LookupTable::Decode(std::istream &,std::ostream &) const
gdcm::LookupTable::Decode(char *,size_t,char const *,size_t) const
From looking at the GetBuffer definition, bool GetBuffer(char *buffer) const;, I guess the first parameter is the output variable. I also guess that the latter, 4-argument version is the one I should aim for. Unfortunately I have no clue what the size_t arguments should be. I've tried
gdcm_in_size = sys.getsizeof(gdcm_array)
gdcm_out_size = sys.getsizeof(gdcm_array)*3
gdcm_decoded = lut.Decode(gdcm_out_size, gdcm_array, gdcm_in_size)
also
gdcm_in_size = ctypes.sizeof(gdcm_array)
gdcm_out_size = ctypes.sizeof(gdcm_array)*3
gdcm_decoded = lut.Decode(gdcm_out_size, gdcm_array, gdcm_in_size)
but with no success.
Update - test with the ImageApplyLookupTable according to @malat's suggestion
...
lutfilt = gdcm.ImageApplyLookupTable()
lutfilt.SetInput(image)
if not lutfilt.Apply():
    print("Failed to apply LUT")
gdcm_decoded = lutfilt.GetOutputAsPixmap()\
               .GetBuffer()
dtype = get_numpy_array_type(pf)
result = numpy.frombuffer(gdcm_decoded, dtype=dtype)
...
Unfortunately I get "Failed to apply LUT" printed and the images are still inverted. See the image below; ImageJ suggests that it has an inverting LUT.
As a simple solution, I would apply the LUT first, in which case you'll need to use ImageApplyLookupTable. It internally calls the gdcm::LookupTable API; see the examples shipped with GDCM for reference.
Of course the correct solution would be to pass the DICOM LUT and convert it to a PNG LUT.
Update: now that you have posted the screenshot, I understand what is going on on your side. You are not trying to apply the DICOM Lookup Table; you are trying to change the rendering of two DICOM DataSets with different Photometric Interpretations, namely MONOCHROME1 vs MONOCHROME2.
In this case you can change that in software using gdcm.ImageChangePhotometricInterpretation. Technically this type of rendering is best done by your graphics card (but that is a different story).
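A rough sketch of how that might look from Python (untested; the calls mirror the C++ gdcm::ImageChangePhotometricInterpretation interface, so exact spellings may differ between GDCM versions, and "input.dcm" is just a placeholder):
import gdcm

reader = gdcm.ImageReader()
reader.SetFileName("input.dcm")
if not reader.Read():
    raise RuntimeError("could not read the DICOM file")

# Re-code MONOCHROME1 pixel data as MONOCHROME2 so all images render the same way
change = gdcm.ImageChangePhotometricInterpretation()
change.SetPhotometricInterpretation(
    gdcm.PhotometricInterpretation(gdcm.PhotometricInterpretation.MONOCHROME2))
change.SetInput(reader.GetImage())
if not change.Change():
    raise RuntimeError("could not change the photometric interpretation")

decoded_image = change.GetOutput()  # feed this to the numpy conversion instead of the original image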

Maximum value of kernel statistics - python

Question: What is the maximum value of kernel statistic counters and how can I handle it in python code?
Context: I calculate some statistics based on kernel statistics (e.g. /proc/partitions; it will be a customized Python iostat version). But I have a problem with overflowed values showing up as negative values. The original iostat code (https://github.com/sysstat/sysstat/blob/master/iostat.c) comments:
* Counters overflows are possible, but don't need to be handled in
* a special way: The difference is still properly calculated if the
* result is of the same type as the two values.
My language is Python and I do need to care about overflow in my case. It probably also depends on the architecture (32/64-bit). I've tried 2^64-1 (on a 64-bit system), but with no success.
The following function will work for 32-bit counters:
def calc_32bit_diff(old, new):
    return (new - old + 0x100000000) % 0x100000000

print(calc_32bit_diff(1, 42))
print(calc_32bit_diff(2147483647, -2147483648))
print(calc_32bit_diff(-2147483648, 2147483647))
This obviously won't work if the counter wraps around more than once between two consecutive reads (but then no other method would work either, since the information has been irrecoverably lost).
Writing a 64-bit version of this is left as an exercise for the reader. :)
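Taking up that exercise, the 64-bit counterpart is the same idea with 2**64 as the modulus (a sketch, with the same single-wrap caveat as above):
def calc_64bit_diff(old, new):
    # 2**64 = 0x10000000000000000; adding it keeps a wrapped (negative)
    # difference positive before the modulo brings it back into range.
    return (new - old + 0x10000000000000000) % 0x10000000000000000

print(calc_64bit_diff(1, 42))
print(calc_64bit_diff(2**64 - 1, 0))   # counter wrapped once -> difference of 1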

Outdated opencv script doesn't work, didn't find equivalents

After reading this, I tried the accepted answer, which gives a Python script that uses cv to grab an image from the webcam and calculate the brightness to apply to the screen.
When I tried it, though, the libraries just seemed not to work. Apparently, since that answer was posted (in 2011), changes were made to cv, and highgui, for example, is not referenced anymore. I tried to find equivalents and somehow work around it, but it seems to be poorly documented, or my problem is just too specific. I've already searched here about it with no luck.
Here is my slightly modified version of the script:
import opencv
import opencv.highgui
import time
import commands
def get_image():
    image = opencv.highgui.cvQueryFrame(camera)
    return opencv.adaptors.Ipl2PIL(image)

camera = opencv.highgui.cvCreateCameraCapture(-1)

while 1:
    image = get_image()
    image.thumbnail((32, 24, ))
    image = tuple(ord(i) for i in image.tostring())
    x = int((int((max(image) / 256.0) * 10) + 1) ** 0.5 / 3 * 10)
    cmd = ("sudo su -c 'echo " + str(x) +
           " > /sys/devices/virtual/backlight/acpi_video0/brightness'")
    status, output = commands.getstatusoutput(cmd)
    assert status is 0
(had to change opencv to cv because even with python-opencv installed opencv couldn't be found).
Any light?
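For reference, a rough equivalent of the same loop using the newer cv2 bindings might look like the sketch below. It is untested, keeps the original brightness formula and backlight path, and substitutes the grayscale maximum of the downscaled frame for the raw byte maximum, so treat it as a starting point rather than a drop-in answer.
import subprocess
import cv2

camera = cv2.VideoCapture(0)   # first available webcam

while True:
    ok, frame = camera.read()
    if not ok:
        continue
    small = cv2.resize(frame, (32, 24))              # counterpart of thumbnail((32, 24))
    grey = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY)   # single brightness channel
    x = int((int((int(grey.max()) / 256.0) * 10) + 1) ** 0.5 / 3 * 10)
    cmd = ("sudo su -c 'echo " + str(x) +
           " > /sys/devices/virtual/backlight/acpi_video0/brightness'")
    subprocess.call(cmd, shell=True)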

Seeming discrepancy in shutil.disk_usage()

I am using the shutil.disk_usage() function to find the current disk usage of a particular path (amount available, used, etc.). As far as I can tell, this is a wrapper around os.statvfs() calls. I'm finding that it is not giving the answers I'd expect, compared to the output of "du" on Linux.
I have obscured some of the paths below for company privacy reasons, but the output and code are otherwise undoctored. I am using Python 3.3.2 64-bit version.
#!/apps/python/3.3.2_64bit/bin/python3
# test of shutil.disk_usage
import shutil
BytesPerGB = 1024 * 1024 * 1024
(total, used, free) = shutil.disk_usage("/data/foo/")
print ("Total: %.2fGB" % (float(total)/BytesPerGB))
print ("Used: %.2fGB" % (float(used)/BytesPerGB))
(total1, used1, free1) = shutil.disk_usage("/data/foo/utils/")
print ("Total: %.2fGB" % (float(total1)/BytesPerGB))
print ("Used: %.2fGB" % (float(used1)/BytesPerGB))
Which outputs:
/data/foo/drivecode/me % disk_usage_test.py
Total: 609.60GB
Used: 291.58GB
Total: 609.60GB
Used: 291.58GB
As you can see, the main problem is I would expect the second amount for "Used" to be much smaller, as it is a subset of the first directory.
/data/foo/drivecode/me % du -sh /data/foo/utils
2.0G /data/foo/utils
As much as I trust "du," I find it hard to believe the Python module would be incorrect either. So perhaps it is just my understanding of Linux filesystems that could be the issue. :)
I wrote a module (based heavily on someone's code here at SO) which recursively gets the disk_usage, which I was using until now. It appears to match the "du" output but is MUCH, much slower than the shutil.disk_usage() function, so I'm hoping I can make that one work.
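(For reference, the recursive approach I mean is essentially the sketch below, minus error reporting; it walks the tree and sums file sizes, which is why it is so much slower than a single statvfs call.)
import os

def recursive_usage(path):
    """Sum the sizes of all files below path, roughly what `du` reports."""
    total = 0
    for dirpath, dirnames, filenames in os.walk(path):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass   # unreadable or vanished file
    return total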
Thanks much in advance.
The problem is that shutil uses the statvfs system call underneath to determine the space used. That system call has no file-path granularity as far as I'm aware, only file-system granularity. What this means is that the path you provide only helps to identify the file system you want to query; the result is not restricted to that path.
In other words, you gave it the path /data/foo/utils and then it determined which file system backs this file path. Then it queried the file system. This becomes apparent when you consider how the used parameter is defined in shutil:
used = (st.f_blocks - st.f_bfree) * st.f_frsize
Where:
fsblkcnt_t f_blocks; /* size of fs in f_frsize units */
fsblkcnt_t f_bfree; /* # free blocks */
unsigned long f_frsize; /* fragment size */
This is why it's giving you the total space used on the entire file system.
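You can see this directly by calling os.statvfs yourself; the expressions below are the same ones the POSIX implementation of shutil.disk_usage evaluates, and they only describe the file system as a whole:
import os

st = os.statvfs("/data/foo/utils/")

total = st.f_blocks * st.f_frsize                  # size of the whole file system
free = st.f_bavail * st.f_frsize                   # space available to unprivileged users
used = (st.f_blocks - st.f_bfree) * st.f_frsize    # space used on the whole file system

print(total, used, free)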
Indeed, it seems to me like the du command itself also traverses the file structure and adds up the file sizes (see the GNU coreutils du command's source code).
shutil.disk_usage returns the usage of the whole disk (i.e. of the mount point which backs the path), not the actual file usage under that path. It is the equivalent of running df /path/to/mount, not du /path/to/files. Notice that for both directories you got the exact same usage.
From the docs: "Return disk usage statistics about the given path as a named tuple with the attributes total, used and free, which are the amount of total, used and free space, in bytes."
Update for anyone stumbling upon this after 2013:
Depending on your Python version and OS, shutil.disk_usage might support files and directories for the path variable. Here's the breakdown:
Windows:
3.3 - 3.5: only supports mountpoint/filesystem
3.6 - 3.7: directory support
3.8+: file & directory support
Unix:
3.3 - 3.5: only supports mountpoint/filesystem
3.6+: file & directory support
