I have written a motion detection/video program using OpenCV 2 which saves video output for x seconds. If motion is detected during that time, the output is saved under an alternate file name, but if no motion is detected the file is overwritten. To avoid needless wear on a flash-based memory system, I want to write the file to RAM first, and only save it to non-volatile memory if motion is detected.
I am trying to create this file in RAM using pyfilesystem (fs.memoryfs):
import numpy as np
import cv2, time, os, threading, thread
from Tkinter import *
from fs.memoryfs import MemoryFS

class stuff:
    mem = MemoryFS()
    output = mem.createfile('output.avi')
    rectime = 0
    delay = 0
    kill = 0
    cap = cv2.VideoCapture(0)
    #out = cv2.VideoWriter('C:\motion\\output.avi', cv2.cv.CV_FOURCC('F','M','P','4'), 30, (640,480), True)
    out = cv2.VideoWriter(output, cv2.cv.CV_FOURCC('F','M','P','4'), 30, (640,480), True)
This is the motion detection part:
if value > 100:
    print "saving"
    movement = time.time()
    while time.time() < int(movement) + stuff.rectime:
        stuff.out.write(frame)
        ret, frame = stuff.cap.read()
    if stuff.out.isOpened() is True:
        stuff.out.release()
    os.rename(stuff.output, 'c:\motion\\' + time.strftime('%m-%d-%y_%H-%M-%S') + '.avi')
The os.rename call raises TypeError: must be string, not None.
I'm clearly using memoryfs incorrectly, but cannot find any examples of its use.
EDIT
I use the following line to open the file object and write to it
stuff.out.open(stuff.output, cv2.cv.CV_FOURCC(*'FMP4'),24,(640,480),True)
however, this returns False; I'm not sure, but it appears it can't open the file object.
To move your file from MemoryFS to the real file system, you should read the original file and write it to the destination file, something like:
with mem.open('output.avi', 'rb') as orig:
    with open('c:\\motion\\' + time.strftime('%m-%d-%y_%H-%M-%S') + '.avi', 'wb') as dest:
        dest.write(orig.read())
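If you are on pyfilesystem 2.x, its copy helpers can do the same thing. A minimal sketch, assuming the fs 2.x API and that c:\motion already exists:

import time
from fs.memoryfs import MemoryFS
from fs.osfs import OSFS
from fs.copy import copy_file

mem = MemoryFS()
# ... your code writes 'output.avi' into mem here ...
dest_name = time.strftime('%m-%d-%y_%H-%M-%S') + '.avi'
copy_file(mem, 'output.avi', OSFS('c:\\motion'), dest_name)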
Related
I was wondering if it is possible to 'stream' data using the OpenCV VideoWriter class in Python?
Normally for handling data in memory that would otherwise go to disk I use BytesIO (or StringIO).
My attempt to use BytesIO fails though:
import sys

import cv2
from io import BytesIO

stream = cv2.VideoCapture(0)
fourcc = cv2.VideoWriter_fourcc(*'x264')
data = BytesIO()
# added these to try to make data appear more like a string
data.name = 'stream.{}'.format('av1')
data.__str__ = lambda x: x.name
try:
    video = cv2.VideoWriter(data, fourcc=fourcc, fps=30., frameSize=(640, 480))
    start = data.tell()

    # Check if camera opened successfully
    if (stream.isOpened() == False):
        print("Unable to read camera feed", file=sys.stderr)
        exit(1)

    # record loop
    while True:
        _, frame = stream.read()
        video.write(frame)
        data.seek(start)
        # do stuff with frame bytes
        # ...
        data.seek(start)
finally:
    try:
        video.release()
    except:
        pass
    finally:
        stream.release()
However, instead of writing to the BytesIO object, I end up with the following traceback:
Traceback (most recent call last):
File "video_server.py", line 54, in talk_to_client
video = cv2.VideoWriter(data, fourcc=fourcc, fps=fps, frameSize=(width, height))
TypeError: Required argument 'apiPreference' (pos 2) not found
... So when I modify the VideoWriter call to be cv2.VideoWriter(data, apiPreference=0, fourcc=fourcc, fps=30., frameSize=(640, 480)) (I read that 0 means auto, but I also tried cv2.CAP_FFMPEG), I instead get the following error:
Traceback (most recent call last):
File "video_server.py", line 54, in talk_to_client
video = cv2.VideoWriter(data, apiPreference=0, fourcc=fourcc, fps=fps, frameSize=(width, height))
TypeError: bad argument type for built-in operation
So my question is, is it possible to write encoded video using the cv2.VideoWriter class in memory and if so how is it done?
At this point I'm fresh out of ideas, so any help would be most welcome :-)
Unfortunately, OpenCV doesn't support encoding to (or decoding from) memory. You must write to (or read from) disk for VideoWriter (or VideoCapture) to work.
If you want some convenience to load a video that is already in memory, use a temporary file: https://docs.python.org/3/library/tempfile.html#tempfile.NamedTemporaryFile
import tempfile
import cv2

my_video_bytes = download_video_in_memory()

with tempfile.NamedTemporaryFile() as temp:
    temp.write(my_video_bytes)
    video_stream = cv2.VideoCapture(temp.name)
    # do your stuff.
This will still create a file on disk unfortunately. But hey, at least you don't have to manage that yourself. There is a SpooledTemporaryFile() implementation that will stay in memory, but, unfortunately, it won't create a file system name that OpenCV can reference.
EDIT UPDATE:
Exploring the Linux interface a bit more, it looks like you can very much utilize a temporary file and have it exist only in memory by using tmpfs.
Tested on an Ubuntu machine: a NamedTemporaryFile() object without specifying a dir will follow TMPDIR (which happens to point to /tmp on my machine). Now, I'm not sure about other operating systems, but my machine did not have /tmp mounted with tmpfs; it was mounted on the main partition.
But, it's super easy to just create your own "in memory file system" like so: sudo mount -t tmpfs -o size=1G tmpfs my_folder
If we check the output of df -h my_folder, we see this:
Filesystem Size Used Avail Use% Mounted on
tmpfs 1.0G 0 1.0G 0% my_folder
Fricken wicked!!
So now if I do this in Python:
with tempfile.NamedTemporaryFile(dir="my_folder") as temp:
I just created a temporary file on Linux (via mkstemp) in a completely in-memory file system. Of course, I do incur the cost of copying the data from my Python process's memory into another memory location. But hey, on big files, that's probably cheaper than writing to disk.
Make sure that your in-memory file system has enough RAM to hold the video file; otherwise, by design, tmpfs will utilize swap space and put the file on disk anyway.
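Putting the pieces together, a minimal sketch (assuming my_folder is the tmpfs mount from above, and download_video_in_memory() stands in for however you obtain the bytes):

import tempfile
import cv2

my_video_bytes = download_video_in_memory()

# The temporary file lives in the tmpfs mount, so nothing is written to disk.
with tempfile.NamedTemporaryFile(dir="my_folder", suffix=".mp4") as temp:
    temp.write(my_video_bytes)
    temp.flush()
    video_stream = cv2.VideoCapture(temp.name)
    # ... read frames as usual ...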
If you use Linux, you can create a ramdisk and write to it.
mount -t tmpfs -o size=512m tmpfs /mnt/ramdisk
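A rough sketch of how that serves the original motion-detection flow (the paths and the motion_detected flag are placeholders; on OpenCV 2 use cv2.cv.CV_FOURCC instead):

import shutil
import time
import cv2

ram_path = '/mnt/ramdisk/output.avi'   # lives only in RAM
out = cv2.VideoWriter(ram_path, cv2.VideoWriter_fourcc(*'FMP4'), 30.0, (640, 480))

# ... write frames with out.write(frame) and set motion_detected as you go ...
out.release()

if motion_detected:
    dest = '/home/pi/recordings/' + time.strftime('%m-%d-%y_%H-%M-%S') + '.avi'
    shutil.move(ram_path, dest)        # copy off the ramdisk onto flash, then remove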
I have an .obj file in which I previously saved an image, encoded as base64 and written with pickle.
The problem occurs when I try to load the .obj file with pickle, decode the base64 data back into an image, and load it with pygame.
The function that loads the image:
def mainDisplay_load(self):
    main_folder = path.dirname(__file__)
    img_main_folder = path.join(main_folder, "sd_graphics")

    # loadImg
    self.mainTerminal = pg.image.load(path.join(img_main_folder, self.main_uncode("tr.obj"))).convert_alpha()
The function that decodes the file:
def main_uncode(self, object):
    openFile = open(object, "rb")
    str = pickle.load(openFile)
    openFile.close()
    fileData = base64.b64decode(str)
    return fileData
The error I get when the code is run:
str = pickle.load(openFile)
EOFError: Ran out of input
How can I fix it?
Python version: 3.6.2
Pygame version: 1.9.3
Update 1
This is the code I used to create the .obj file:
import base64, pickle

with open("terminal.png", "rb") as imageFile:
    str = base64.b64encode(imageFile.read())
    print(str)

file_pi = open("tr.obj", "wb")
pickle.dump(str, file_pi)
file_pi.close()

file_pi2 = open("tr.obj", "rb")
str2 = pickle.load(file_pi2)
file_pi2.close()

imgdata = base64.b64decode(str2)
filename = 'some_image.jpg'  # I assume you have a way of picking unique filenames
with open(filename, 'wb') as f:
    f.write(imgdata)
Once the file is created, it is loaded back and a second image is written out. This checks whether the image is identical or there are errors in the conversion.
As you can see, I used part of that code to load the image, but instead of saving it to disk, it is loaded into pygame. And that's where the error occurs.
Update 2
I finally managed to solve it.
In the main code:
def mainDisplay_load(self):
    self.all_graphics = pg.sprite.Group()
    self.graphics_menu = pg.sprite.Group()

    # loadImg
    self.img_mainTerminal = mainGraphics(self, 0, 0, "sd_graphics/tr.obj")
In the library containing graphics classes:
import pygame as pg
import base64 as bs
import pickle as pk
from io import BytesIO as by
from lib.setting import *

class mainGraphics(pg.sprite.Sprite):
    def __init__(self, game, x, y, object):
        self.groups = game.all_graphics, game.graphics_menu
        pg.sprite.Sprite.__init__(self, self.groups)
        self.game = game
        self.object = object
        self.outputGraphics = by()
        self.x = x
        self.y = y
        self.eventType()
        self.rect = self.image.get_rect()
        self.rect.x = self.x * tilesizeDefault
        self.rect.y = self.y * tilesizeDefault

    def eventType(self):
        openFile = open(self.object, "rb")
        str = pk.load(openFile)
        openFile.close()
        self.outputGraphics.write(bs.b64decode(str))
        self.outputGraphics.seek(0)
        self.image = pg.image.load(self.outputGraphics).convert_alpha()
For the question of why I should do such a thing, it is simple:
any attacker with sufficient motivation can still get to it easily
Python is free and open.
On the one hand, we have a person who intentionally sets out to modify and recover the hidden data. Since Python is an open language, the most motivated people can crack the game or program and retrieve the same data, just as they can with even more complicated and protected languages.
On the other hand, we have a person who knows only the basics, or not even that: a person who cannot access the files without learning more about the language or how the files are encoded.
So you can understand that, from my point of view, encoding the files does not need to protect against a motivated person, because even with a more complex and protected language that person will be able to get what he wants. The protection is aimed at people who have no knowledge of the language.
So, if the error you get is indeed "pickle: ran out of input", that probably means you messed up your directories in the code above and are trying to read an empty file with the same name as your .obj file.
Actually, as it is, this line in your code:
self.mainTerminal = pg.image.load(path.join(img_main_folder, self.main_uncode("tr.obj"))).convert_alpha()
is completely broken. Just read it and you can see the problem: you are passing to the main_uncode method just the file name, without directory information. And even if that had worked by chance, as I pointed out in the comments a while ago, you would then try to use the unserialized image data as a filename from which to read your image. (You, or someone else, probably intended main_uncode to write the image data to a temporary file so that Pygame could read it, but as it stands it just returns the raw image data as a string.)
Therefore, fixing the above call by passing an actual path to main_uncode, and further modifying it to write the temporary data to a file and return its path, would fix the snippets of code above.
Second thing: I can't figure out why you need this ".obj" file at all. If it is just "security through obscurity", hoping people who get your bundled files can't open the images, that is far from a recommended practice. To sum up just one point: it will delay legitimate uses of your file (you yourself do not seem to be able to use it), while any attacker with sufficient motivation can still get to it easily. By opening an image, base64-encoding it, pickling it, and then doing the reverse process, you are essentially performing a no-op. Moreover, a pickle file can serialize and write complex Python objects to disk, but a base64 serialization of an image could be written directly to a file, with no need for pickle.
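For instance, a minimal sketch of that last point, skipping pickle entirely (the file names here are just examples):

import base64
from io import BytesIO
import pygame as pg

# Write the base64 text straight to a file...
with open("terminal.png", "rb") as image_file, open("tr.b64", "wb") as out:
    out.write(base64.b64encode(image_file.read()))

# ...and read it back for pygame; no pickle step is needed.
with open("tr.b64", "rb") as f:
    surface = pg.image.load(BytesIO(base64.b64decode(f.read())))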
Third thing: just use with to open all the files, not just the ones you read with the imaging library. And take your time to learn a little bit more about Python.
I'm writing a Python (3.4.3) program that uses VIPS (8.1.1) on Ubuntu 14.04 LTS to read many small tiles using multiple threads and put them together into a large image.
In a very simple test:
from concurrent.futures import ThreadPoolExecutor
from multiprocessing import Lock
from gi.repository import Vips

lock = Lock()
canvas = Vips.Image.black(8000, 1000, bands=3)

def do_work(x):
    global canvas
    img = Vips.Image.new_from_file('part.tif')  # RGB tiff image
    with lock:
        canvas = canvas.insert(img, x * 1000, 0)

with ThreadPoolExecutor(max_workers=8) as executor:
    for x in range(8):
        executor.submit(do_work, x)

canvas.write_to_file('complete.tif')
I get the correct result. In my full program, the work for each thread involves reading binary data from a source file, turning it into TIFF format, reading the image data, and inserting it into the canvas. It seems to work, but when I try to examine the result I run into trouble. Because the image is extremely large (~50000x100000 pixels), I couldn't save the entire image in one file, so I tried
canvas = canvas.resize(.5)
canvas.write_to_file('test.jpg')
This takes an extremely long time, and the resulting JPEG has only black pixels. If I resize three times, the program gets killed. I also tried
canvas.extract_area(20000,40000,2000,2000).write_to_file('test.tif')
This results in the error message segmentation fault (core dumped), but it does save an image. There are image contents in it, but they seem to be in the wrong place.
I'm wondering what the problem could be?
Below is the code for the complete program. The same logic was also implemented using OpenCV + sharedmem (sharedmem handled the multiprocessing part) and it worked without a problem.
import os
import subprocess
import pickle
from multiprocessing import Lock
from concurrent.futures import ThreadPoolExecutor
import threading
import numpy as np
from gi.repository import Vips

lock = Lock()

def read_image(x):
    with open(file_name, 'rb') as fin:
        fin.seek(sublist[x]['dataStartPos'])
        temp_array = np.fromfile(fin, dtype='int8', count=sublist[x]['dataSize'])

    name_base = os.path.join(rd_path, threading.current_thread().name + 'tempimg')
    with open(name_base + '.jxr', 'wb') as fout:
        temp_array.tofile(fout)
    subprocess.call(['./JxrDecApp', '-i', name_base + '.jxr', '-o', name_base + '.tif'])
    temp_img = Vips.Image.new_from_file(name_base + '.tif')

    with lock:
        global canvas
        canvas = canvas.insert(temp_img, sublist[x]['XStart'], sublist[x]['YStart'])

def assemble_all(filename, ramdisk_path, scene):
    global canvas, sublist, file_name, rd_path, tilesize_x, tilesize_y
    file_name = filename
    rd_path = ramdisk_path

    file_info = fetch_pickle(filename)  # A custom function
    # this info includes where to begin reading image data, image size and coordinates
    tilesize_x = file_info['sBlockList_P0'][0]['XSize']
    tilesize_y = file_info['sBlockList_P0'][0]['YSize']
    sublist = [item for item in file_info['sBlockList_P0'] if item['SStart'] == scene]
    max_x = max([item['XStart'] for item in file_info['sBlockList_P0']])
    max_y = max([item['YStart'] for item in file_info['sBlockList_P0']])
    canvas = Vips.Image.black((max_x + tilesize_x), (max_y + tilesize_y), bands=3)

    with ThreadPoolExecutor(max_workers=4) as executor:
        for x in range(len(sublist)):
            executor.submit(read_image, x)

    return canvas
The above module (imported as mcv) is called in the driver script:
canvas = mcv.assemble_all(filename, ramdisk_path, 0)
To examine the content, I used
canvas.extract_area(25000, 40000, 2000, 2000).write_to_file('test_vips1.jpg')
I think your problem has to do with the way libvips calculates pixels.
In systems like OpenCV, images are huge areas of memory. You perform a series of operations, and each operation modifies a memory image in some way.
libvips is not like this, though the interface looks similar. In libvips, when you perform an operation on an image, you are actually just adding a new section to a pipeline. It's only when you finally connect the output to some sink (a file on disk, or a region of memory you want filled with image data, or an area of the display) that libvips will actually do any calculations. libvips will then use a recursive algorithm to run a large set of worker threads up and down the whole length of the pipeline, evaluating all of the operations you created at the same time.
To make an analogy with programming languages, systems like OpenCV are imperative, libvips is functional.
The good thing about the way libvips does things is that it can see the whole pipeline at once and it can optimise away most of the memory use and make good use of your CPU. The bad thing is that long sequences of operations can need large amounts of stack to evaluate (whereas with systems like OpenCV you are more likely to be bounded by image size). In particular, the recursive system used by libvips to evaluate means that pipeline length is limited by the C stack, about 2MB on many operating systems.
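For example, a minimal sketch of that lazy evaluation (file names are placeholders; nothing is computed until the final write):

import pyvips

# Nothing is computed yet: each call just appends an operation to the pipeline.
img = pyvips.Image.new_from_file('input.tif', access='sequential')
img = img.resize(0.5)
img = img.flip('horizontal')

# Pixels are only calculated here, when the pipeline is connected to a sink.
img.write_to_file('output.tif')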
Here's a simple test program that does more or less what you are doing:
#!/usr/bin/python3

import sys
import pyvips

if len(sys.argv) < 4:
    print("usage: %s image-in image-out n" % sys.argv[0])
    print("  make an n x n grid of image-in")
    sys.exit(1)

tile = pyvips.Image.new_from_file(sys.argv[1])
outfile = sys.argv[2]
size = int(sys.argv[3])

img = pyvips.Image.black(size * tile.width, size * tile.height, bands=3)
for y in range(size):
    for x in range(size):
        img = img.insert(tile, x * tile.width, y * tile.height)

# we're not interested in huge files for this test, just write a small patch
img.crop(10, 10, 100, 100).write_to_file(outfile)
You run it like this:
time ./bigjoin.py ~/pics/k2.jpg out.tif 2
real 0m0.176s
user 0m0.144s
sys 0m0.031s
It loads k2.jpg (a 2k x 2k JPG image), repeats that image into a 2 x 2 grid, and saves a small part of it. This program will work well with very large images, try removing the crop and running as:
./bigjoin.py huge.tif out.tif[bigtiff] 10
and it'll copy the huge tiff image 100 times into a REALLY huge tiff file. It'll be quick and use little memory.
However, this program will become very unhappy with small images being copied many times. For example, on this machine (a Mac), I can run:
./bigjoin.py ~/pics/k2.jpg out.tif 26
But this fails:
./bigjoin.py ~/pics/k2.jpg out.tif 28
Bus error: 10
With a 28 x 28 output, that's 784 tiles. The way we've built the image, repeatedly inserting a single tile, that's a pipeline 784 operations long -- long enough to cause a stack overflow. On my Ubuntu laptop I can get pipelines up to about 2,900 operations long before it starts failing.
There's a simple way to fix this program: build a wide rather than a deep pipeline. Instead of inserting a single image each time, make a set of strips, then join the strips. Now the pipeline depth will be proportional to the square root of the number of tiles. For example:
img = pyvips.Image.black(size * tile.width, size * tile.height, bands=3)
for y in range(size):
    strip = pyvips.Image.black(size * tile.width, tile.height, bands=3)
    for x in range(size):
        strip = strip.insert(tile, x * tile.width, 0)
    img = img.insert(strip, 0, y * tile.height)
Now I can run:
./bigjoin2.py ~/pics/k2.jpg out.tif 200
Which is 40,000 images joined together.
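If your libvips is new enough to have arrayjoin (added around 8.2, if I remember right), another option is to build the whole grid in a single operation, which keeps the pipeline depth constant. A rough sketch, reusing the same command-line arguments:

import sys
import pyvips

tile = pyvips.Image.new_from_file(sys.argv[1])
size = int(sys.argv[3])

# One operation joins all the tiles into a size x size grid.
img = pyvips.Image.arrayjoin([tile] * (size * size), across=size)
img.crop(10, 10, 100, 100).write_to_file(sys.argv[2])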
I'm trying to diagnose why my Python server app is leaking memory. The app takes a request for an image URL, resizes the image using Vips, and returns it. After every request the memory usage grows roughly by the size of the original image.
from fapws import base
import fapws._evwsgi as evwsgi
from gi.repository import Vips
import urllib2
import hmac
import hashlib
import base64
import StringIO
from boto.s3.connection import S3Connection
from boto.s3.bucket import Bucket

def start():
    evwsgi.start('0.0.0.0', '80')
    evwsgi.set_base_module(base)

    def lfrThumbnail(environ, start_response):
        try:
            parameters = environ['PATH_INFO'].split('/')
            s3File = 'my s3 url' + parameters[0]
            width = float(parameters[1])
            height = float(parameters[2])
            hmacSignatureUser = parameters[3]
            hmacSignature = ...  # some hashing code
            if not (hmacSignatureUser == hmacSignature):
                print hmacSignatureUser
                print hmacSignature
                print hmacSignatureUser == hmacSignature
                raise Exception
            bufferedImage = urllib2.urlopen(s3File).read()
            image = Vips.Image.new_from_buffer(bufferedImage, '')
            imageWidth = float(image.width)
            imageHeight = float(image.height)
            imageAspectRatio = imageWidth / imageHeight
            if (width > imageWidth) or (height > imageHeight):
                image = image
            elif abs((imageAspectRatio / (width / height)) - 1) < 0.05:
                image = image.resize(width / imageWidth)
            else:
                scaleRatioWidth = width / imageWidth
                scaleRatioHeight = height / imageHeight
                maxScale = max(scaleRatioWidth, scaleRatioHeight)
                image = image.resize(maxScale)
                cropStartX = (image.width - width) / 2
                cropStartY = (image.height - height) / 2
                image = image.crop(cropStartX, cropStartY, width, height)
        except Exception, e:
            start_response('500 INTERNAL SERVER ERROR', [('Content-Type', 'text')])
            return ['Error generating thumbnail']

        start_response('200 OK', [
            ('Content-Type', 'image/jpeg'),
            ('Cache-Control: max-stale', '31536000')
        ])

        return [image.write_to_buffer('.jpg[Q=90]')]

    evwsgi.wsgi_cb(('/lfr/', lfrThumbnail))
    evwsgi.set_debug(0)
    evwsgi.run()

if __name__ == '__main__':
    start()
I've tried using muppy and the pympler tracker, but each diff after the image open/close operations showed only a couple of bytes being used.
Could the external C libraries be the cause of the memory leak? If so, how does one debug that?
In case it's relevant, I'm running the Python server inside a Docker container.
I'm the libvips maintainer. It sounds like you're seeing the effect of the vips operation cache: vips keeps the last few operations in memory and reuses the results if it can. This can be a huge performance win in some cases.
For a web service, you're probably caching elsewhere so you won't want this, or you won't want a large cache at least. You can control the cache size with vips_cache_set_max() and friends:
http://www.vips.ecs.soton.ac.uk/supported/current/doc/html/libvips/VipsOperation.html#vips-cache-set-max
From Python it's:
Vips.cache_set_max(0)
To turn off the cache completely. You can set the cache to limit by memory use, file descriptor use, or number of operations.
There are a couple of other useful things you can set to watch resource usage. Vips.leak_set(True) makes vips report leaked objects on exit, and also report peak pixel buffer memory use. Vips.cache_set_trace(True) makes it trace all operations as they are called, and shows cache hits.
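Put together, that debugging setup looks something like this at the top of the server (a sketch using the calls named above):

from gi.repository import Vips

Vips.cache_set_max(0)        # drop the operation cache entirely
Vips.leak_set(True)          # report leaked objects and peak memory use at exit
Vips.cache_set_trace(True)   # log operations and cache hits as they happen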
In your code, I would also enable sequential mode. Add access = Vips.Access.SEQUENTIAL to your new_from_buffer().
The default behaviour is to open images for full random access (since vips doesn't know what operations you'll end up running on the image). For things like JPG, this means that vips will decode the image to a large uncompressed array on open. If the image is under 100mb, it'll keep this array in memory.
However for a simple resize, you only need to access pixels top-to-bottom, so you can hint sequential access on open. In this mode, vips will only decompress a few scanlines at once from your input and won't ever keep the whole uncompressed image around. You should see a nice drop in memory use and latency.
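Applied to the code above, that hint is a one-line change:

image = Vips.Image.new_from_buffer(bufferedImage, '',
                                   access=Vips.Access.SEQUENTIAL)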
There are a lot of other things you could handle, like exif autorotate, colour management, transparency, jpeg shrink-on-load, and many others, I'm sure you know. The sources to vipsthumbnail might be a useful reference:
https://github.com/jcupitt/libvips/blob/master/tools/vipsthumbnail.c
I have a small script which uses the Python Imaging Library module for Python 3.3 (on Win7, 8 GB of RAM) to take a screenshot of a small (~40x50 pixel) area of the screen once each second and compare it to an image I already have, in order to detect a particular pattern and execute two other modules I created if it is found. The script seems to work flawlessly for the first 30 minutes or so, but then it crashes and I get the following error:
Traceback (most recent call last):
File "<string>", line 420, in run_nodebug
File "C:\Users\Nate Simon\Dropbox\CaptchaLibrary\detectNRun.py", line 68, in <module>:
im2 = ImageGrab.grab((left,upper,right,lower))
File "C:\Python33\lib\site-packages\PIL\ImageGrab.py", line 47, in grab:
size, data = grabber()
MemoryError
I've adjusted the time between screenshots and all it does is delay when the program crashes.
Here is what seems to be the offending code:
im2 = ImageGrab.grab((left, upper, right, lower))  # Take a screenshot at given coordinates

for x in range(im2.size[0]):  # This section just converts the image to black/white for better comparing, but might be relevant.
    for y in range(im2.size[1]):
        pixel = im2.getpixel((x, y))
        if pixel[0] < 40 or pixel[1] < 40 or pixel[2] < 40:
            color = (0, 0, 0)
        else:
            color = (255, 255, 255)
        im2.putpixel((x, y), color)
There are no lists, dictionaries, or databases being added to in this script; every time it runs, the old screenshot is overwritten in memory (it is never saved to disk).
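As an aside, here is a hedged sketch of the same black/white conversion done with numpy (assuming numpy is available); it avoids the many Python-level getpixel/putpixel calls, although it is unrelated to the MemoryError itself:

import numpy as np
from PIL import Image, ImageGrab

im2 = ImageGrab.grab((left, upper, right, lower))
arr = np.array(im2.convert('RGB'))
mask = (arr < 40).any(axis=2)   # True where any channel is below 40
arr[mask] = 0                   # those pixels become black
arr[~mask] = 255                # everything else becomes white
im2 = Image.fromarray(arr)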
Also possibly relevant: from the time module I am using sleep() for delays and time() to keep track of system time. I am also using win32api for mouse/keyboard inputs and using tkinter to read the clipboard in the following lines:
c = Tk()
c.withdraw()
result = c.clipboard_get()
c.destroy()
In another section, the clipboard is cleared before new data is added with c.clipboard_clear()
I was unable to find an adequate solution to completely solve the memory issue (or even replicate it under different conditions), so I simply increased the interval between actions from 1 second to 15 seconds and I have yet to get the memory error again.