I want to build a simple face recognition project, but every time I start my program it takes a really long time to encode all the images and then load those encoded images. Can someone please tell me whether I can pre-encode all the images and then load the encodings directly? I am a beginner, please help me.
I tried converting all the faces into NumPy arrays, but that did not quite work out. I have a feeling that this is the direction I should go in; please tell me whether I am right or not.
In short: if you can already get the face encodings from your images into some data structure, what stops you from storing that data structure permanently for future use? Use pickle, an SQL database (the simplest one is sqlite3), a NoSQL database, whatever.
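As a minimal sketch of the pickle route, assuming the encodings are already held in a dict that maps a person's name to a list of floats (the file name and the two helper functions are hypothetical, not part of any face recognition library):

```python
import pickle
from pathlib import Path

CACHE_FILE = Path("face_encodings.pkl")  # hypothetical cache location


def save_encodings(encodings, cache_file=CACHE_FILE):
    """Write a {name: encoding} dict to disk once, after the slow encoding step."""
    with open(cache_file, "wb") as f:
        pickle.dump(encodings, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_encodings(cache_file=CACHE_FILE):
    """Return the cached dict, or None if nothing has been cached yet."""
    if not Path(cache_file).exists():
        return None
    with open(cache_file, "rb") as f:
        return pickle.load(f)
```

On startup the program would call load_encodings() first and only fall back to encoding the images (followed by save_encodings()) when it returns None. pickle can store NumPy arrays as well as plain lists; only unpickle files you created yourself.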
Related
I'm working on a project to break down 3D models, but I'm quite lost. I hope you can help me.
I'm getting a 3D model from Autodesk BIM, and the format could be native or a generic CAD format (.stp, .igs, .x_t, .stl). Then I need to somehow "measure" the maximum dimensions in order to model a raw-material body, which will always have the shape of a huge panel. Once I have both bodies, I will take the difference to extract the solids I need to analyze; on each of these bodies, I need to extract the faces, and then the lines or curves of each face.
This sounds like something really easy to do in CAD software, but the idea is to automate the process. I was looking into OpenSCAD, but it seems to work only for modeling geometry and doesn't handle imported solids well. I'm leaving a picture with the idea of what I need to do in the link below.
So, any idea how I can do this? Which language and library can help with this project?
I can see this automation being possible with a few in-between steps:
1. OpenSCAD can handle differences well, so your "Extract Bodies" step seems plausible.
1.5. Before going further, you'll have to explain how you "filtered out" the cylinder. Will you do this manually? If you don't, it will be considered for analysis and you will get a lot of faces as a result.
2. I don't think OpenSCAD provides you with a vertex array. However, it can save to .STL, which is kinda easy to parse with the programming language of your choice. You'll have to study the .stl file structure a bit (this sounds much more frightening than it is: if you open an STL file with an editor you will probably realize immediately what's happening).
3. Once you've parsed the file, you can calculate the lines with high school math.
This is not an easy, GUI way to do what you ask, but if you have a few skills you'll have your automation, and depending on the number of projects you have it may be worth it.
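The parsing and "high school math" steps can be sketched roughly as follows. This assumes an ASCII (not binary) STL file, and all three function names are made up for illustration. Note that an STL only stores triangles, so the edges found here are triangle edges, not the original CAD faces or curves.

```python
def parse_ascii_stl(text):
    """Return a list of triangles, each a tuple of three (x, y, z) vertices."""
    triangles, current = [], []
    for line in text.splitlines():
        parts = line.split()
        if parts and parts[0] == "vertex":
            current.append(tuple(float(v) for v in parts[1:4]))
            if len(current) == 3:
                triangles.append(tuple(current))
                current = []
    return triangles


def bounding_box(triangles):
    """Return (min_xyz, max_xyz) over every vertex, i.e. the maximum dimensions."""
    vertices = [v for tri in triangles for v in tri]
    mins = tuple(min(v[i] for v in vertices) for i in range(3))
    maxs = tuple(max(v[i] for v in vertices) for i in range(3))
    return mins, maxs


def unique_edges(triangles):
    """Return the set of undirected triangle edges as sorted vertex pairs."""
    edges = set()
    for a, b, c in triangles:
        for p, q in ((a, b), (b, c), (c, a)):
            edges.add(tuple(sorted((p, q))))
    return edges


SAMPLE = """solid demo
facet normal 0 0 1
  outer loop
    vertex 0 0 0
    vertex 2 0 0
    vertex 0 3 0
  endloop
endfacet
facet normal 0 0 1
  outer loop
    vertex 2 0 0
    vertex 2 3 0
    vertex 0 3 0
  endloop
endfacet
endsolid demo
"""

triangles = parse_ascii_stl(SAMPLE)
print(bounding_box(triangles))
print(len(unique_edges(triangles)))
```

The bounding box gives the size of the raw-material panel; the two sample triangles share one edge, which is why a set is used to collect edges.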
I have been working on this project and found that the library "trimesh" is better suited to this problem. Give it a shot and save some time.
I have read a lot of essays and articles about image compression algorithms. There are many algorithms, and I can only understand some of them because I'm a student and I haven't gone to high school yet. I read this article, which helped me a lot, especially the part on page 3 (Run length code). It's a very easy and helpful algorithm, but I don't know how to make a new image format. I am a Python developer, but I don't know how to make a new format that has its own algorithm and program, like .jpeg, .jpg, .png, .bmp.
(Sorry, I have studied English for only 1 year, so if I have made mistakes in grammar or vocabulary, please excuse me.)
Sure, you can make your own image file format. Choose a filename extension, define how it will be stored and write Python code to:
read the format from disk into a Numpy array, and
write an image contained in a Numpy array to disk
That way you will be interoperable with all the major image processing libraries such as OpenCV, scikit-image, PIL, wand.
Have a look at how NetPBM works to get started with a simple format. Maybe look at the PCX format if you like the thought of RLE.
Read up on how to write binary to a file with Python.
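As a rough sketch of what such a reader and writer could look like, here is an entirely made-up grayscale format: a 4-byte signature, the width and height as two little-endian 16-bit integers, then run-length pairs of (count, pixel value). The signature "RLE1", the layout, and the function names are all assumptions for illustration.

```python
import struct

MAGIC = b"RLE1"  # hypothetical 4-byte signature for this made-up format


def rle_encode(pixels):
    """Encode 8-bit pixels as (count, value) byte pairs, count capped at 255."""
    out = bytearray()
    i = 0
    while i < len(pixels):
        value = pixels[i]
        run = 1
        while i + run < len(pixels) and pixels[i + run] == value and run < 255:
            run += 1
        out += bytes((run, value))
        i += run
    return bytes(out)


def rle_decode(data):
    """Expand (count, value) byte pairs back into raw pixels."""
    out = bytearray()
    for i in range(0, len(data) - 1, 2):
        out += bytes([data[i + 1]]) * data[i]
    return bytes(out)


def write_image(path, width, height, pixels):
    """Write header plus RLE-compressed pixels (row-major, one byte each)."""
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match width * height")
    with open(path, "wb") as f:
        f.write(MAGIC + struct.pack("<HH", width, height))
        f.write(rle_encode(pixels))


def read_image(path):
    """Return (width, height, pixels) read back from a file in this format."""
    with open(path, "rb") as f:
        if f.read(4) != MAGIC:
            raise ValueError("not an RLE1 file")
        width, height = struct.unpack("<HH", f.read(4))
        pixels = rle_decode(f.read())
    return width, height, pixels
```

The pixels are plain bytes here to keep the sketch dependency-free; numpy.frombuffer(pixels, dtype=numpy.uint8).reshape(height, width) would turn the result into a Numpy array for the libraries mentioned above.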
I am absolutely baffled by how many unhelpful error messages I've received while trying to use this supposedly simple API to write TFRecords in a manner that doesn't take 30 minutes every time I have a new dataset.
Task:
I'd like to feed a list of image paths and a list of labels to a tf.data.Dataset, parse them in parallel to read the images and encode as tf.train.Examples, use tf.data.Dataset.shard to distribute them into different TFRecord shards (e.g. train-001-of-010.tfrecord, train-002-of-010.tfrecord, etc.), and for each shard finally write them to the corresponding file.
Since I've been debugging this for hours, I haven't ended up with any single particular error to fix; otherwise I would provide it. I've struggled to find any up-to-date tutorial that doesn't either (a) come from 2017 and use queue runners, (b) use a tf.Session (I'm using TensorFlow 1.15, but the official docs keep telling me to phase out sessions), (c) conveniently do the record creation in pure Python, which makes for a simple tutorial but is too slow for any actual application, or (d) use already created TFRecords and just skip the whole process.
If necessary, I can put together an example of what I'm talking about. But since I'm getting stuck at every level of the process, at the moment it seems unhelpful.
TL;DR:
If anyone has used tf.data.Dataset to create TFRecord shards in parallel, please point me in a better direction than Google has.
I have two LMDB files. With the first one my network trains fine, while with the other one it doesn't really work (the loss starts and stays at 0). So I figured that maybe there's something wrong with the second LMDB. I tried writing some Python code (mostly taken from here) to fetch the data from my LMDBs and inspect it, but so far no luck with either of the two databases. The LMDBs contain images as data and bounding box information as labels.
Doing this:
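import caffe
from caffe.proto import caffe_pb2

datum = caffe_pb2.Datum()
for key, value in lmdb_cursor:  # lmdb_cursor: cursor over the opened LMDB
    datum.ParseFromString(value)
    label = datum.label
    data = caffe.io.datum_to_array(datum)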
on either one of the LMDBs gives me a key which is correctly the name of the image, but that datum.ParseFromString function is not able to retrieve anything from value. label is always 0, while data is an empty ndarray. Nonetheless, the data is there, value is a binary string of around 140 KB which correctly accounts for the size of the image plus the bounding box information I guess.
I tried browsing several answers and discussions dealing with reading data from LMDBs in python, but I couldn't find any clue on how to read structured information such as bounding box labels. My guess is that the parsing function expects a digit label and interprets the first bytes as such, with the remaining data being then lost due to the binary string not making any sense anymore?
I know for a fact that at least the first LMDB is correct since my network performs correctly in both training and testing using it.
Any inputs will be greatly appreciated!
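The basic element stored in your LMDB is not a Datum, but rather an AnnotatedDatum. Therefore, you need to approach it with a little care:
annotated_datum = caffe_pb2.AnnotatedDatum()
annotated_datum.ParseFromString(value)
datum = annotated_datum.datum  # the image itself
annotations = annotated_datum.annotation_group  # should store the annotations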
I am looking for a technique for detecting incomplete JPEG files that is more flexible than the common one, which consists of searching for the EOI marker at the end of the file. The idea is to not mark an image as incomplete when only a small percentage of the image data is missing. However, this does not seem to be an easy thing to do.
The first idea was to try to detect which MCUs are corrupted or missing, but given that the data is compressed with Huffman tables, that is not an easy job.
The other option I had was to calculate the amount of content expected after the SOS marker; then, if the real content does not match the expected amount, I can calculate how much image data is missing.
Can anybody help me with this? I have tried several tools such as ImageMagick and PIL (Python), but I cannot find a proper solution.
Thanks.
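One possible starting point is to walk the marker segments by hand. The sketch below is only an illustration: the function name and the returned dict are made up. It reports the frame size from the SOF segment, how many entropy-coded bytes follow SOS, and whether EOI was reached. The JPEG header does not state how many compressed bytes to expect, so this does not by itself give the missing percentage; that would still require decoding the MCUs.

```python
import struct

# Start-of-frame markers (0xC4 DHT, 0xC8 JPG and 0xCC DAC are not frames).
SOF_MARKERS = {0xC0, 0xC1, 0xC2, 0xC3, 0xC5, 0xC6, 0xC7,
               0xC9, 0xCA, 0xCB, 0xCD, 0xCE, 0xCF}


def inspect_jpeg(data):
    """Walk the marker segments of a JPEG byte string and summarize them."""
    info = {"width": None, "height": None, "scan_bytes": 0, "has_eoi": False}
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker, not a JPEG")
    pos, in_scan = 2, False
    while pos < len(data):
        if data[pos] != 0xFF:
            if not in_scan:
                break  # unexpected byte between segments: give up
            info["scan_bytes"] += 1
            pos += 1
            continue
        if pos + 1 >= len(data):
            break  # lone 0xFF right at the truncation point
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        if marker == 0x00 or 0xD0 <= marker <= 0xD7:
            # Stuffed zero or restart marker: still part of the scan data.
            if in_scan:
                info["scan_bytes"] += 2
            pos += 2
            continue
        if marker == 0xD9:  # EOI
            info["has_eoi"] = True
            break
        if marker in (0x01, 0xD8):  # standalone markers without a length
            pos += 2
            continue
        in_scan = False
        if pos + 4 > len(data):
            break  # segment header itself is truncated
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS and pos + 9 <= len(data):
            info["height"], info["width"] = struct.unpack(
                ">HH", data[pos + 5:pos + 9])
        pos += 2 + length
        if marker == 0xDA:  # SOS: entropy-coded data follows its header
            in_scan = True
    return info
```

Comparing scan_bytes of a suspect file against complete files of the same size and quality gives, at best, a heuristic for how much is missing.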