Python 3.4: OSError: [Errno 12] Cannot allocate memory

I am initializing a bunch of shared structures, including two 1048576 by 16 lists and several multiprocessing arrays, in the file dijk_inner_mp.py:
N1=1048576
DEG1=16
P1=1
W = [[0 for x in range(DEG1)] for x in range(N1)]
W_index = [[0 for x in range(DEG1)] for x in range(N1)]
u = multiprocessing.Array('i',range(P1))
D = multiprocessing.Array('i',range(N1))
Q = multiprocessing.Array('i',range(N1))
l = [multiprocessing.Lock() for i in range(0,N1)]
After the initialization I create P1 processes that work on the allocated arrays. However, I keep running into this error on execution:
File "dijk_inner_mp.py", line 20, in <module>
l = [multiprocessing.Lock() for i in range(0,N1)]
File "dijk_inner_mp.py", line 20, in <listcomp>
l = [multiprocessing.Lock() for i in range(0,N1)]
File "/usr/lib/python3.4/multiprocessing/context.py", line 66, in Lock
return Lock(ctx=self.get_context())
File "/usr/lib/python3.4/multiprocessing/synchronize.py", line 163, in __init__
SemLock.__init__(self, SEMAPHORE, 1, 1, ctx=ctx)
File "/usr/lib/python3.4/multiprocessing/synchronize.py", line 60, in __init__
unlink_now)
OSError: [Errno 12] Cannot allocate memory
I have tried increasing the swap file size to a few GB after seeing other questions about this issue, but that didn't seem to help. I also reduced N1 from 1M to 131K and ended up with the same error. Any ideas on how to circumvent this issue?

Every instance of multiprocessing.Lock() maps a new semaphore file in /dev/shm/ into memory.
man mmap:
ENOMEM The process's maximum number of mappings would have been exceeded.
(Errno 12 is defined as ENOMEM.)
The system's maximum number of mappings is controlled by the kernel parameter vm.max_map_count; you can read it with /sbin/sysctl vm.max_map_count. You will almost certainly find that this value on your system (65530 by default) is lower than the number of locks you want to create.
For ways to alter vm.max_map_count see e. g. this Linux Forums thread.
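If raising vm.max_map_count is not an option, another way out is to share a much smaller pool of locks and stripe the N1 slots across it. A minimal sketch (the pool size and the lock_for helper are illustrative assumptions, not part of the original code):
import multiprocessing

N1 = 1048576
NLOCKS = 1024  # assumed pool size, well below a typical vm.max_map_count

# One semaphore-backed lock per stripe instead of one per slot.
locks = [multiprocessing.Lock() for _ in range(NLOCKS)]

def lock_for(index):
    # Map a slot index onto its stripe's lock.
    return locks[index % NLOCKS]

# Example: guard slot 123456 of the shared arrays.
with lock_for(123456):
    pass  # read/modify D[123456], Q[123456], ...
Distinct slots occasionally contend for the same lock, but correctness is preserved because a given slot always maps to the same lock.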

Related

Changing a value in multiple files at the same time in Python

I want to change the value of beta in the Test.py files that sit in multiple folders, all at the same time and without opening each file by hand, but I am getting an error. How do I do this?
import os
N=[8,10, 23,29, 36, 37, 41,42, 45, 46, 47]
I=[]
for i in N:
    os.read(rf'C:\Users\User\{i}\Test.py')
    beta=1e-1
The error is
in <module>
os.read(rf'C:\Users\User\OneDrive - Technion\Research_Technion\Python_PNM\All_ND\var_6.0_beta_0.1\{i}\220_beta_1.0_50.0_6.0ND.py')
TypeError: read expected 2 arguments, got 1
Syntax: os.read(fd, n)
Parameters:
fd: a file descriptor representing the file to be read.
n: the number of bytes to read from the file associated with the descriptor fd.
Seems like you forgot the second argument n. Note also that os.read expects a file descriptor obtained from os.open, not a path string.
see - https://www.geeksforgeeks.org/python-os-read-method/#:~:text=read()%20method%20in%20Python,bytes%20left%20to%20be%20read.
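If the actual goal is to rewrite the beta assignment inside each Test.py, plain file reading and writing is the simpler route. A minimal sketch, assuming each file assigns beta on a line of its own (the regex is an illustrative assumption):
import re

N = [8, 10, 23, 29, 36, 37, 41, 42, 45, 46, 47]
for i in N:
    path = rf'C:\Users\User\{i}\Test.py'
    with open(path) as f:
        text = f.read()
    # Replace any line of the form "beta = <value>" with the new value.
    text = re.sub(r'^beta\s*=.*$', 'beta=1e-1', text, flags=re.MULTILINE)
    with open(path, 'w') as f:
        f.write(text)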

Having a problem with handling errors on Python

Well, after quite some research, I've figured out I have no idea what is wrong with my code. This is a Newton's method optimization routine that I am working on, and I want to test it in a for loop in another program on several functions, along with several other optimization methods, but there is one problematic function that just won't let the loop keep going to test everything.
The thing is that Newton's method requires a certain matrix to be non-singular in order to solve a system, and on this problematic function that matrix ends up being singular at some point, so I get an error message telling me that and the whole process stops. I knew nothing about try and except error handling up until this point, so after some research I figured it out and implemented it in my code to try to avoid this error; however, it still stops the entire program because of the error. Would anyone take a look at it for me?
This is the code for the problematic part:
while np.dot(grad(x),grad(x)) > 10**(-4):
    if f(x+a*p) <= f(x)+c1*a*np.dot(grad(x),p):
        x = x+a*p
        try:
            p = -np.linalg.solve(hess(x),grad(x))
        except:
            break
        k = k+1
    else:
        a = 0.9*a
    if k >= 100:
        break
return [x, f(x),grad(x)]
hess(x) ends up being a singular matrix at some point and raises an exception, but the code never skips the try block and runs the except block as I think it should. What is going on?
I've tried on a smaller problem like
i = 0
while i < 10:
    if 2 < 3:
        try:
            -np.linalg.solve(A,b)
        except:
            print(9)
            break
    i = i+1
with A a singular matrix, and it works just fine, printing a single "9". So why doesn't this happen at all in the main program? Perhaps it has something to do with the fact that I call this function from another program or something like that?
I could try to get around this problem with an if statement testing whether the matrix is singular beforehand and breaking the loop if so, but I feel that would be quite costly per iteration and would skew the later comparison of the methods, so I want to avoid it as much as possible.
This is what I get when I execute my code:
Traceback (most recent call last):
File "<ipython-input-1-31bd7312c08f>", line 1, in <module>
runfile('C:/Users/CLIENTE/Desktop/Coisas/Estudos 2.0/Ñ Linear/Testes.py', wdir='C:/Users/CLIENTE/Desktop/Coisas/Estudos 2.0/Ñ Linear')
File "C:\Users\CLIENTE\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 704, in runfile
execfile(filename, namespace)
File "C:\Users\CLIENTE\Anaconda3\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 108, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "C:/Users/CLIENTE/Desktop/Coisas/Estudos 2.0/Ñ Linear/Testes.py", line 52, in <module>
effer = método(dado)
File "C:\Users\CLIENTE\Desktop\Coisas\Estudos 2.0\Ñ Linear\Método_de_Newton.py", line 27, in MétodoDeNewton
p = -np.linalg.solve(hess(x),grad(x))
File "C:\Users\CLIENTE\Anaconda3\lib\site-packages\numpy\linalg\linalg.py", line 394, in solve
r = gufunc(a, b, signature=signature, extobj=extobj)
File "C:\Users\CLIENTE\Anaconda3\lib\site-packages\numpy\linalg\linalg.py", line 89, in _raise_linalgerror_singular
raise LinAlgError("Singular matrix")
LinAlgError: Singular matrix
Have you tried catching the fully-qualified name of the Exception?
while np.dot(grad(x),grad(x)) > 10**(-4):
    if f(x+a*p) <= f(x)+c1*a*np.dot(grad(x),p):
        x = x+a*p
        try:
            p = -np.linalg.solve(hess(x),grad(x))
        # HEY! Look here.
        except np.linalg.LinAlgError:
            break
        k = k+1
    else:
        a = 0.9*a
    if k >= 100:
        break
return [x, f(x),grad(x)]
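As a quick standalone check (not from the original answer), numpy raises exactly this exception type when the system is singular, so catching np.linalg.LinAlgError is the right granularity:
import numpy as np

A = np.zeros((2, 2))  # deliberately singular
b = np.ones(2)
try:
    np.linalg.solve(A, b)
except np.linalg.LinAlgError as err:
    print('caught:', err)  # caught: Singular matrix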

Unable to list all vertices present in Janusgraph with ".toList()" using Gremlinpython

I have tried testing what is in a graph that I created to see whether nodes were indeed created.
The code to create a small graph for testing:
from gremlin_python import statics
from gremlin_python.structure.graph import Graph
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.strategies import *
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
graph = Graph()
g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','g'))
# in a loop add nodes and properties to get a small graph for testing
t = g.addV('testnode').property('val',1)
for i in range(2,11):
    t = g.addV('testnode').property('val', i)
t.iterate()
# proceed to create edge (as_ and from_ contain an underscore because as & from are python's reserved words)
g.V().has("val", 2).as_("a").V().has("val", 4).as_("b").addE("link").property("someproperty", "abc").from_("a").to("b").iterate()
list1 = []
list1 = g.V().has("val", 2).toList()
print(len(list1))
to which I would expect the value "1" to be printed in the terminal, which happened correctly in earlier tests (and now fails).
However, this returns an error:
Traceback (most recent call last):
File "test_addingVEs.py", line 47, in <module>
list1 = g.V().has("val_i", 2).toList()
File "/home/user/.local/lib/python3.5/site-packages/gremlin_python/process/traversal.py", line 52, in toList
return list(iter(self))
File "/home/user/.local/lib/python3.5/site-packages/gremlin_python/process/traversal.py", line 43, in __next__
self.traversal_strategies.apply_strategies(self)
File "/home/user/.local/lib/python3.5/site-packages/gremlin_python/process/traversal.py", line 346, in apply_strategies
traversal_strategy.apply(traversal)
File "/home/user/.local/lib/python3.5/site-packages/gremlin_python/driver/remote_connection.py", line 143, in apply
remote_traversal = self.remote_connection.submit(traversal.bytecode)
File "/home/user/.local/lib/python3.5/site-packages/gremlin_python/driver/driver_remote_connection.py", line 54, in submit
results = result_set.all().result()
File "/usr/lib/python3.5/concurrent/futures/_base.py", line 405, in result
return self.__get_result()
File "/usr/lib/python3.5/concurrent/futures/_base.py", line 357, in __get_result
raise self._exception
File "/home/user/.local/lib/python3.5/site-packages/gremlin_python/driver/resultset.py", line 81, in cb
f.result()
File "/usr/lib/python3.5/concurrent/futures/_base.py", line 398, in result
return self.__get_result()
File "/usr/lib/python3.5/concurrent/futures/_base.py", line 357, in __get_result
raise self._exception
File "/usr/lib/python3.5/concurrent/futures/thread.py", line 55, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/user/.local/lib/python3.5/site-packages/gremlin_python/driver/connection.py", line 77, in _receive
self._protocol.data_received(data, self._results)
File "/home/user/.local/lib/python3.5/site-packages/gremlin_python/driver/protocol.py", line 98, in data_received
"{0}: {1}".format(status_code, message["status"]["message"]))
gremlin_python.driver.protocol.GremlinServerError: 598:
A timeout occurred during traversal evaluation of [RequestMessage
{, requestId=d56cce63-77f3-4c1f-9c14-3f5f33d4a67b, op='bytecode', processor='traversal', args={gremlin=[[], [V(), has(val, 2)]], aliases={g=g}}}]
- consider increasing the limit given to scriptEvaluationTimeout
The .toList() function did work previously, but not anymore.
Is there anything wrong in my code, or should I look elsewhere for a possible cause?
Well, the error is indicating the problem:
A timeout occurred during traversal evaluation of [RequestMessage
{, requestId=d56cce63-77f3-4c1f-9c14-3f5f33d4a67b, op='bytecode', processor='traversal', args={gremlin=[[], [V(), has(val, 2)]], aliases={g=g}}}]
- consider increasing the limit given to scriptEvaluationTimeout
Of course, assuming the default scriptEvaluationTimeout of 30 seconds, the query you are executing should not take that long to return a result unless you have a significant number of vertices and no index on "val". So given that your graph is really small, I don't see why such an execution would take so long.
I don't know what your test environment is like, but if you're running all of JanusGraph/Cassandra on a highly underpowered machine, I guess something highly resource-starved could take a long time to execute. I would try to increase the scriptEvaluationTimeout as suggested in the error to see just how high you have to raise it to get the result back. If you don't have an index on val you probably should add one anyway (though I don't think that's your problem unless the vertex count is bigger than your code indicates).
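If your gremlin_python version is recent enough to expose with_() on the traversal source, you can also experiment with the timeout per request instead of editing the server config. A hedged sketch (the option name varies by TinkerPop version, 'scriptEvaluationTimeout' on older servers and 'evaluationTimeout' on newer ones; the 65000 ms value is just an illustration):
# Raise the evaluation timeout for this one traversal only.
list1 = g.with_('scriptEvaluationTimeout', 65000).V().has('val', 2).toList()
print(len(list1))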

Python: fromfile array too big and/or gzip will not write

My goal is to extract a large subimage from an even larger uncompressed image (30000, 65536) without reading the whole image into memory, then save the subimage in a compressed format. At the moment I only care about determining how well the compression worked as an indicator of image complexity; I don't need to save the image in a viewable format, but I would love to. This is my first Python script, and I am getting stuck on the size limits of some function calls.
I get two related errors based on two alternate attempts (with boring lines removed):
Version 1:
fd = open(fname,'rb')
h5file=h5py.File(h5fname, "w")
data = h5file.create_dataset("data", (Height, Width), dtype='i', maxshape=(None, None)) # try i8?
data = fromfile(file=fd, dtype='h', count=Height*Width) #FAIL
fd.close()
h5file.close()
outfilez = gzip.open(outfilename,'wb')
outfilez.write(data)
outfilez.close()
Error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:...\sitecustomize.py", line 523, in runfile
execfile(filename, namespace)
File "C:...\Script_v3.py", line 183, in <module>
data = fromfile(file=fd, dtype=datatype, count=BandHeight_tracks*Width)
ValueError: array is too big.
Version 2 (for loop to reduce fromfile usage):
fd = open(fname,'rb')
h5file=h5py.File(h5fname, "w")
data = h5file.create_dataset("data", (Height, Width), dtype='i', maxshape=(None, None)) # try i8?
for i in range(0, Height-1):
    data[i:] = fromfile(file=fd, dtype='h', count=Width)
fd.close()
h5file.close()
outfilez = gzip.open(outfilename,'wb')
outfilez.write(data)
outfilez.close()
Error (I do not get this with the other version):
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:...\sitecustomize.py", line 523, in runfile
execfile(filename, namespace)
File "C:...\Script_v4.py", line 195, in <module>
outfilez.write(data)
File "C:...\gzip.py", line 235, in write
self.crc = zlib.crc32(data, self.crc) & 0xffffffffL
TypeError: must be string or read-only buffer, not Dataset
I am running the code using Spyder on a 64-bit Win7 machine with 16 GB RAM. The images are at most 4 GB.
You should look at whether NumPy arrays suit your needs:
NumPy is a general-purpose array-processing package designed to efficiently manipulate large multi-dimensional arrays of arbitrary records without sacrificing too much speed for small multi-dimensional arrays. NumPy is built on the Numeric code base and adds features introduced by numarray as well as an extended C-API and the ability to create arrays of arbitrary type which also makes NumPy suitable for interfacing with general-purpose data-base applications.
Solution as I found out eventually:
Error 1:
Ignore. If I need to operate with this much memory, switch to C++
Error 2:
The dtype was somehow not set. From the terminal we see:
>>> data[0]
array([2, 2, 0, ..., 3, 4, 2])
It should be:
>>> data[0]
array([2, 2, 0, ..., 3, 4, 2], dtype=int16)
The fix is that I added the following line after the for loop:
data=int16(data)
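For the original goal of pulling a subimage out of a huge raw file without reading it all, numpy.memmap may be worth a look. A minimal sketch, assuming a headerless int16 raster of the stated shape (the window coordinates are illustrative):
import gzip
import numpy as np

Height, Width = 30000, 65536
# Map the raw image file into virtual memory without reading it all in.
img = np.memmap(fname, dtype='int16', mode='r', shape=(Height, Width))
# Slicing only touches the pages that back the requested window.
sub = np.array(img[1000:2000, 5000:9096])  # copies just the subimage
with gzip.open(outfilename, 'wb') as fz:
    fz.write(sub.tobytes())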

Read multiple HDF5 files in Python using multiprocessing

I'm trying to read a bunch of HDF5 files ("a bunch" meaning N > 1000 files) using PyTables and multiprocessing. Basically, I create a class to read and store my data in RAM; it works perfectly fine in a sequential mode and I'd like to parallelize it to gain some performance.
I tried a dummy approach for now, creating a new method flatten() to my class to parallelize file reading. The following example is a simplified example of what I'm trying to do. listf is a list of strings containing the name of the files to read, nx and ny are the size of the array I want to read in the file:
import numpy as np
import multiprocessing as mp
import tables

class data:
    def __init__(self, listf, nx, ny, nproc=0):
        self.listinc = []
        for i in range(len(listf)):
            self.listinc.append((listf[i], nx, ny))

    def __del__(self):
        del self.listinc

    def get_dsets(self, tuple_inc):
        listf, nx, ny = tuple_inc
        x = np.zeros((nx, ny))
        f = tables.openFile(listf)
        x = np.transpose(f.root.x[:ny,:nx])
        f.close()
        return(x)

    def flatten(self):
        nproc = mp.cpu_count()*2
        def worker(tasks, results):
            for i, x in iter(tasks.get, 'STOP'):
                print i, x
                results.put(i, self.get_dsets(x))
        tasks = mp.Queue()
        results = mp.Queue()
        manager = mp.Manager()
        lx = manager.list()
        for i, out in enumerate(self.listinc):
            tasks.put((i, out))
        for i in range(nproc):
            mp.Process(target=worker, args=(tasks, results)).start()
        for i in range(len(self.listinc)):
            j, res = results.get()
            lx.append(res)
        for i in range(nproc):
            tasks.put('STOP')
I tried different things (including, as in this simplified example, using a manager to retrieve the data) but I always get a TypeError: an integer is required.
I do not use ctypes arrays because I don't really need shared arrays (I just want to retrieve my data), and after retrieving the data I want to play with it using NumPy.
Any thought, hint or help would be highly appreciated!
Edit: The complete error I get is the following:
Process Process-341:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/home/toto/test/rd_para.py", line 81, in worker
results.put(i, self.get_dsets(x))
File "/usr/lib/python2.7/multiprocessing/queues.py", line 101, in put
if not self._sem.acquire(block, timeout):
TypeError: an integer is required
The answer was actually very simple...
In the worker, since I am putting an (index, data) pair, I can't do:
result.put(i, self.get_dsets(x))
but instead have to do:
result.put((i, self.get_dsets(x)))
which then works perfectly well. Queue.put's signature is put(obj, block=True, timeout=None), so the second positional argument was being interpreted as block and passed down to the semaphore's acquire(), hence the TypeError: an integer is required.
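A tiny standalone reproduction (not from the original post) shows why the extra argument blows up inside Queue.put:
import multiprocessing as mp

q = mp.Queue()
q.put((1, 'payload'))   # correct: a single tuple object
print(q.get())          # -> (1, 'payload')
# q.put(2, 'payload')   # wrong: 'payload' is taken as the block parameter
#                       # and handed to the semaphore's acquire(); on the
#                       # poster's Python 2.7 this raised
#                       # "TypeError: an integer is required"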
