I'm trying to write an encoder for a special image format. My intention is to implement it in pure Python as an image plugin for Pillow.
There is an entry in its docs (Writing Your Own Image Plugin) that gives some hints as to how to implement the decoder. With that and some help I got at the project's issues section I managed to get a decoder working.
However, all their encoders are written in C and the ImageFile.PyEncoder superclass (from which all Python encoder classes should inherit) is not even implemented. The documentation is also very sparse in this respect.
Given this state of affairs, is it possible to get such an encoder working? If so, I'd like to know what methods to write, where to get the image data from and where to write the encoded result to.
Related issues:
Writing your own image plugin documentation missing information;
PyEncoder doesn't exist
Edit 1: I'm not looking for a detailed encoder implementation. It's just that the docs don't state what the structure should be in Python. If it serves as proof of work, here is a repo with my work so far.
I managed to get an encoder working by manually implementing the encoder class. The required public methods are:
__init__: sets self.mode and gets the parameters for the encoder, if any;
(@property) pushes_fd: simply return self._pushes_fd;
setimage: sets self.im (the Pillow image object) and gets the image size;
setfd: sets self.fd (the file object to write to);
encode_to_pyfd: this is the method that you have to write. Read your input data from self.im* and write it to self.fd;
cleanup: perform encoder-specific cleanup (just pass works).
The class also needs to set _pushes_fd = True.
Though a bit of a kludge, it was easier than I expected.
*I found it better to work with list(self.im) instead of self.im directly. The former allows slicing and all other list operations.
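As a rough illustration of the structure listed above, here is a minimal sketch. It is deliberately Pillow-free so that it runs on its own: ToyEncoder, the "TOY0" header and the plain list of pixel values standing in for self.im are all assumptions made for the example. The (length, status) return value of encode_to_pyfd reflects my reading of how Pillow's save loop treats it, so check that against your Pillow version.

```python
import io


class ToyEncoder:
    """Hand-written encoder with the methods listed above (hypothetical format)."""

    _pushes_fd = True  # tells the caller that we write to self.fd ourselves

    def __init__(self, mode, *args):
        self.mode = mode
        self.args = args  # encoder parameters, if any
        self.im = None
        self.fd = None

    @property
    def pushes_fd(self):
        return self._pushes_fd

    def setimage(self, im, extents=None):
        # In Pillow this would be the core image object; here it is a list.
        self.im = im

    def setfd(self, fd):
        self.fd = fd

    def encode_to_pyfd(self):
        pixels = list(self.im)  # a list allows slicing and other list operations
        payload = bytes(pixels)
        # Made-up container: 4-byte magic, 4-byte big-endian length, raw pixels.
        self.fd.write(b"TOY0" + len(payload).to_bytes(4, "big") + payload)
        return len(payload), 0  # assumed convention: (length, status), 0 = OK

    def cleanup(self):
        pass


buf = io.BytesIO()
encoder = ToyEncoder("L")
encoder.setimage([0, 127, 255])  # stand-in for three 8-bit grayscale pixels
encoder.setfd(buf)
status = encoder.encode_to_pyfd()
encoder.cleanup()
```

In a real plugin the class would be registered with Pillow and driven by its save machinery rather than called by hand as above.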
I want to use a pre-trained MXNet model on s390x architecture but it doesn't seem to work. This is because the pre-trained models are in little-endian whereas s390x is big-endian. So, I'm trying to use https://numpy.org/devdocs/reference/generated/numpy.lib.format.html which works on both little-endian as well as big-endian.
One way I've found to solve this is to load the model parameters on an x86 machine, call asnumpy, and save them through numpy, then load the parameters on the s390x machine using numpy and convert them to MXNet. But I'm not really sure how to code it. Can anyone please help me with that?
UPDATE
It seems the question is unclear. So, I'm adding an example that better explains what I want to do in 3 steps -
Load a preexisting model from MXNet, something like this -
net = mx.gluon.model_zoo.vision.resnet18_v1(pretrained=True, ctx=mx.cpu())
Export the model. The following code saves the model parameters in a .param file, but this binary file has endian issues. So, instead of saving the model directly with the MXNet API, I want to save the parameters using numpy (https://numpy.org/devdocs/reference/generated/numpy.lib.format.html), because that would make the binary file (.npy) endian-independent. I am not sure how I can convert the parameters of the MXNet model into numpy format and save them.
gluon.contrib.utils.export(net, path="./my_model")
Load the model. The following code loads the model from .param file.
net = gluon.contrib.utils.import(symbol_file="my_model-symbol.json",
param_file="my_model-0000.params",
ctx = 'cpu')
Instead of loading using the MXNet API, I want to use numpy to load .npy file that we created in step 2. After we have loaded the .npy file, we need to convert it to MXNet. So, I can finally use the model in MXNet.
Starting from the code snippets posted in the other question, Save/Load MXNet model parameters using NumPy :
It appears that mxnet has an option to store data internally as numpy arrays:
mx.npx.set_np(True, True)
Unfortunately, this option doesn't do what I hoped (my IPython session crashed).
The parameters are a dict of mxnet.gluon.parameter.Parameter instances, each of them containing attributes of other special datatypes. Disentangling this so that you can store it as a large number of pure numpy arrays (or a collection of them in an .npz file) is a hopeless task.
Fortunately, python has pickle to convert complex data structures into something more or less portable:
# (mxnet/resnet setup skipped)
parameters = resnet.collect_params()
import pickle
with open('foo.pkl', 'wb') as f:
    pickle.dump(parameters, f)
To restore the parameters:
with open('foo.pkl', 'rb') as f:
    parameters_loaded = pickle.load(f)
Essentially, it looks like resnet.save_parameters() as defined in mxnet/gluon/block.py gets the parameters (using _collect_params_with_prefix()) and writes them to a file using a custom write function which appears to be compiled from C (I didn't check the details).
You can save the parameters using pickle instead.
For loading, load_parameters (also in block.py) contains this code (with sanity checks removed):
for name in loaded:
    params[name]._load_init(loaded[name], ctx, cast_dtype=cast_dtype, dtype_source=dtype_source)
Here, loaded is a dict as loaded from the file. From examining the code, I don't fully grasp exactly what is being loaded - params seems to be a local variable in the function that is not used anymore. But it's worth a try to start from here, by writing a replacement for the load_parameters function. You can "monkey-patch" a function into an existing class by defining a function outside the class like this:
def my_load_parameters(self, *args, **kwargs):
    ...  # put your modified implementation here

mx.gluon.Block.load_parameters = my_load_parameters
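To make the monkey-patching step concrete, here is a small self-contained sketch. It does not use mxnet at all: the Block class below is a stand-in I made up for the example, and the replacement loader simply reads a pickled dict, which is only one possible way to fill in the modified implementation.

```python
import os
import pickle
import tempfile


class Block:
    """Stand-in for a library class whose loader we want to replace."""

    def __init__(self):
        self.params = {}

    def load_parameters(self, filename):
        raise NotImplementedError("original loader, not usable in this sketch")


def my_load_parameters(self, filename):
    """Replacement loader: read a pickled dict of parameters."""
    with open(filename, "rb") as f:
        self.params = pickle.load(f)


# Assigning the function to the class attribute patches every instance.
Block.load_parameters = my_load_parameters

# Write a small pickle file to load back (hypothetical parameter name).
path = os.path.join(tempfile.mkdtemp(), "foo.pkl")
with open(path, "wb") as f:
    pickle.dump({"dense0_weight": [0.5, -1.25]}, f)

net = Block()
net.load_parameters(path)  # now runs my_load_parameters
```

Because the function is assigned on the class, instances created before the patch pick up the new method as well.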
Disclaimers/warnings:
Even if you get save/load via pickle to work on a single big-endian system, it's not guaranteed to work between different-endian systems. The pickle protocol itself is endian-neutral, but if floating-point values (deep inside the mxnet.gluon.parameter.Parameter) were stored as a raw data buffer in machine-endian convention, then pickle is not going to magically guess that groups of 8 bytes in the buffer need to be reversed. I think numpy arrays are endian-safe when pickled.
Pickle is not very robust if the underlying class definitions change between pickling and unpickling.
Never unpickle untrusted data.
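The raw-buffer concern in the first warning can be seen with the standard library alone. This is only a sketch of the general problem, not of what mxnet does internally: the weights list is made up, and struct stands in for whatever writes the buffer. Bytes written with an explicit byte order read back correctly on any machine, while the same bytes read with the opposite order give garbage.

```python
import struct

# Made-up float32 "weights"; all three are exactly representable.
weights = [0.5, -1.25, 3.0]

fmt_le = "<%df" % len(weights)  # explicit little-endian float32
fmt_be = ">%df" % len(weights)  # explicit big-endian float32

# Serialize with a stated byte order instead of the machine's native one.
blob = struct.pack(fmt_le, *weights)

# Reading with the same stated order works regardless of the host CPU.
restored = list(struct.unpack(fmt_le, blob))

# Reading the same bytes with the wrong order silently yields other numbers.
misread = list(struct.unpack(fmt_be, blob))
```

Any format that records its byte order alongside the data avoids the guesswork; a raw machine-endian buffer does not.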
I am trying to implement U-net and I am using https://github.com/jakeret/tf_unet/tree/master/scripts as a reference. I don't understand which dataset they used. Please give me some idea of, or a link to, a dataset I can use.
In their GitHub README.md they show three different datasets that they applied their implementation to. Their implementation is dataset agnostic, so it shouldn't matter too much what data they used if you're trying to solve your own problem with your own data. But if you're looking for a toy data set to play around with, check out their demos. There you'll see two readily available examples and how they can be used:
demo_radio_data.ipynb which uses an astronomical radio data example set from here: http://people.phys.ethz.ch/~ast/cosmo/bgs_example_data/
demo_toy_problem.ipynb which uses their built-in data generator of a noisy image with circles that are to be detected.
The latter is probably the easiest one when it comes to just having something to play with. To see how the data is generated, check out the class:
image_gen.py -> GrayScaleDataProvider
(with an IDE like PyCharm you can just jump into the according classes in the demo source code)
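To give a feel for what such a toy generator produces, here is a rough standard-library sketch of the idea: a noisy grayscale image with a few filled circles, plus a mask marking the circle pixels. This is not the code from tf_unet; the function name, parameters and defaults are all made up for illustration.

```python
import random


def make_circle_sample(size=32, n_circles=3, noise=0.1, seed=0):
    """Return (image, mask) as nested lists; hypothetical helper, not tf_unet's."""
    rng = random.Random(seed)
    image = [[0.0] * size for _ in range(size)]
    mask = [[0] * size for _ in range(size)]

    for _ in range(n_circles):
        cx, cy = rng.randrange(size), rng.randrange(size)
        radius = rng.randint(2, size // 6)
        for y in range(size):
            for x in range(size):
                if (x - cx) ** 2 + (y - cy) ** 2 <= radius * radius:
                    image[y][x] = 1.0  # bright circle
                    mask[y][x] = 1     # segmentation label

    # Add Gaussian noise to every pixel of the image (the mask stays clean).
    for y in range(size):
        for x in range(size):
            image[y][x] += rng.gauss(0.0, noise)

    return image, mask


image, mask = make_circle_sample()
```

The network's job is then to recover the mask from the noisy image.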
I understand the workings of OO Programming but have little practical experience in actually using it for more than one or two classes. When it comes to practically using it I struggle with the OO Design part. I've come to the following case which could benefit from OO:
I have a few sets of data from different sources, some from file, others from the internet through an API and others of even a different source. Some of them are quite alike when it comes to the data they contain and some of them are really different. I want to visualize this data, and since almost all of the data is based on a location I plan on doing this on a map (using Folium in python to create a leafletjs based map) with markers of some sort (with a little bit of information in a popup). In some cases I also want to create a pdf with an overview of data and save it to disk.
I came up with the following (start of an) idea for the classes (written in python to show the idea):
class locationData(object):
    """For all the location-based data; will implement coordinates and a name, for example."""

class fileData(locationData):
    """For the data that is loaded from disk."""

class measurementData(fileData):
    """Measurements loaded from disk."""

class modelData(fileData):
    """Model results loaded from disk."""

class VehicleData(locationData):
    """Vehicle data loaded from a database."""

class terrainData(locationData):
    """Some information about, for example, a mountain."""

class dataToPdf(object):
    """For writing data to PDFs."""

class dataFactory(object):
    """For creating the objects."""

class fileDataReader(object):
    """For loading the data that is on disk."""

class vehicleDatabaseReader(object):
    """To read the vehicle data from the DB."""

class terrainDataReader(object):
    """Reads terrain data."""

class Data2HTML(object):
    """Puts the data in Folium objects."""
Considering the data to output, I figured that each data class could render its own data (since it knows what information it has) in, for example, a render() method. The output of the render method (maybe a dict) would then be used in data2pdf or data2html, although I'm not exactly sure how to do this yet.
Would this be a good start for OO design? Does anybody have suggestions or improvements?
The other day I described my approach for a similar question. I think you can use it. I think the best approach would be to have one object that can retrieve and return your data and another one that can show it as you wish: maybe a map, maybe a graph, or anything else you would like to have.
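A minimal sketch of that split, under my own assumptions: every class and field name below is hypothetical, the reader works on an in-memory list instead of a file, API or database, and the "map" renderer only builds plain dicts that a library such as Folium could later turn into markers.

```python
from dataclasses import dataclass


@dataclass
class LocationRecord:
    """Plain data holder shared by all sources."""
    name: str
    lat: float
    lon: float
    info: str = ""


class ListReader:
    """Retrieves data; a file, API or database reader would expose the same read()."""

    def __init__(self, rows):
        self._rows = rows

    def read(self):
        return [LocationRecord(*row) for row in self._rows]


class TextRenderer:
    """Shows records as a text overview (could feed a PDF writer)."""

    def render(self, records):
        return "\n".join(
            f"{r.name} ({r.lat:.2f}, {r.lon:.2f}): {r.info}" for r in records
        )


class MarkerRenderer:
    """Shows records as marker descriptions (could feed a map library)."""

    def render(self, records):
        return [{"location": (r.lat, r.lon), "popup": r.name} for r in records]


rows = [("Summit", 46.5, 8.0, "terrain"), ("Truck 7", 52.1, 5.2, "vehicle")]
records = ListReader(rows).read()
overview = TextRenderer().render(records)
markers = MarkerRenderer().render(records)
```

With this layout, adding a new source means writing another reader, and adding a new output means writing another renderer; neither change touches the other side.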
What do you think?
Thanks
From my current understanding, png is relatively easier to decode than bitmap-based formats like jpg in python and is already implemented in python elsewhere. For my own purposes though I need the jpg format.
What are good resources for building a jpg library from scratch? At the moment I only wish to support the resizing of images, but this would presumably involve both encoding/decoding ops.
Edit: to make myself clearer: I am hoping that there is a high-level, design-oriented treatment of how to implement a jpg library in code, specifically the considerations when encoding/decoding, perhaps even pseudocode. Maybe it doesn't exist, but better to ask and stand on the shoulders of giants rather than reinvent the wheel.
Use PIL, it already has highlevel APIs for image handling.
If you say "I don't want to use PIL" (and remember, there are private/unofficial ports to 3.x) then I would say read the wikipedia article on JPEG, as it will describe the basics, and also links to in depth articles/descriptions of the JPEG format.
Once you have read over that, pull up the source code for PIL's JPEG handling to see what they are doing there (it is surprisingly simple stuff). The only thing they really import is Image, which is a class they made to hold the raw image data.
I have an array of pixels which I wish to save to an image file. Python appears to have a few libraries which can do this for me, so I'm going to use one of them, passing in my pixel array and using functions I didn't write to write the image headers and data to disk.
How do I do unit testing for this situation?
I can:
Test that the pixel array I'm passing to the external library is what I expect it to be.
Test that the external library functions I call give me the expected return values.
Manually verify that the image looks like I'm expecting (by opening the image and eyeballing it).
I can't:
Test that the image file is correct. To do that I'd have to either generate an image to compare to (but how do I generate that 'trustworthy' image?), or write a unit-testable image-writing module (so I wouldn't need to bother with the external library at all).
Is this enough to provide coverage for my code? Is testing the interface between my code and the external library sufficient, leaving me to trust that the output of the external library (the image file) is correct through manual eyeballing?
How do you write unit tests to ensure that the external libraries you use do what you expect them to?
I'm a bit rusty on Python.
But this is how I would approach it.
Generate the image once and verify it manually. Compute a checksum of that file (MD5 perhaps). The automated tests then compute the same checksum for the image they produce and compare it with the one recorded from the manual test.
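A rough sketch of that approach using only the standard library. The write_image function below is a made-up stand-in for the external library call (it writes a tiny binary PGM file), and in a real test suite GOLDEN_MD5 would be a constant recorded after eyeballing the image, not computed in the same run.

```python
import hashlib
import os
import tempfile


def file_md5(path):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_image(path, pixels):
    """Stand-in for the external library: writes a tiny binary PGM image."""
    height, width = len(pixels), len(pixels[0])
    with open(path, "wb") as f:
        f.write(b"P5\n%d %d\n255\n" % (width, height))
        for row in pixels:
            f.write(bytes(row))


pixels = [[0, 255], [255, 0]]
workdir = tempfile.mkdtemp()

# Step 1 (done once): write the image, check it by eye, record its checksum.
golden = os.path.join(workdir, "golden.pgm")
write_image(golden, pixels)
GOLDEN_MD5 = file_md5(golden)

# Step 2 (every test run): write the image again and compare checksums.
candidate = os.path.join(workdir, "candidate.pgm")
write_image(candidate, pixels)
matches = file_md5(candidate) == GOLDEN_MD5
```

One caveat: a byte-level checksum can also change when the library itself changes its encoder settings or metadata between versions, even if the picture looks identical, so a failure means "look again", not necessarily "the pixels are wrong".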
Hope this helps.