I am planning to use MPI scatter and gather for one of my requirements. I have to use MPI scatter and gather in Python to parse a JSON file, scatter the data, and then use it to filter out the required properties. I have no issues parsing the JSON.
if rank == 0:
    for x in newData1:
        for j in range(size):
            result[j].append(x)
    data = result
    print(data)
else:
    data = None

chunks = comm.scatter(chunks, root=0)
newData1 is a list of dictionaries in Python, and

cntr = function_name(chunks)

is the function call that manipulates the scattered data. When I try to print the scattered data I get the output below:
rank 0
chunks 0
rank 1
chunks 1
Below is the error I get when I try to run my program:
Traceback (most recent call last):
File "mpi.py", line 192, in <module>
cntr = function_name(chunks)
File "mpi.py", line 60, in make_grid_list
for i in range(len(data)):
TypeError: object of type 'int' has no len()
(the same traceback is printed by the other rank)
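The TypeError suggests each rank is receiving a single int rather than a list. With mpi4py, comm.scatter expects the root rank to supply a list with exactly size elements (one per rank), and every rank then receives the corresponding element. Here is a minimal sketch of that pattern, assuming newData1 is the parsed list of dictionaries and function_name is your filtering function:

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

if rank == 0:
    # Split newData1 into exactly `size` chunks; element j goes to rank j.
    chunks = [newData1[i::size] for i in range(size)]
else:
    chunks = None

# Each rank now receives one chunk: a list of dictionaries, not an int.
chunk = comm.scatter(chunks, root=0)
print("rank", rank, "received", len(chunk), "items")

cntr = function_name(chunk)          # filter the scattered data
results = comm.gather(cntr, root=0)  # collect the per-rank results on rank 0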
I have the following problem. I am following this example about spatial regression in Python:
import numpy
import libpysal
import spreg
import pickle
# Read spatial data
ww = libpysal.io.open(libpysal.examples.get_path("baltim_q.gal"))
w = ww.read()
ww.close()
w_name = "baltim_q.gal"
w.transform = "r"
The example above works, but I would like to read my own spatial matrix, which I currently have as a list of lists. This is my approach:
ww = libpysal.io.open(matrix)
But I got this error message:
Traceback (most recent call last):
File "/usr/lib/python3.8/code.py", line 90, in runcode
exec(code, self.locals)
File "<input>", line 1, in <module>
File "/home/vojta/Desktop/INTERNET_HANDEL/ZASILKOVNA/optimal-delivery-branches/venv/lib/python3.8/site-packages/libpysal/io/fileio.py", line 90, in __new__
cls.__registry[cls.getType(dataPath, mode, dataFormat)][mode][0]
File "/home/vojta/Desktop/INTERNET_HANDEL/ZASILKOVNA/optimal-delivery-branches/venv/lib/python3.8/site-packages/libpysal/io/fileio.py", line 105, in getType
ext = os.path.splitext(dataPath)[1]
File "/usr/lib/python3.8/posixpath.py", line 118, in splitext
p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not list
This is what the matrix looks like:
[[0, 2, 1], [2, 0, 4], [1, 4, 0]]
EDIT:
If I try to pass my matrix to GM_Lag like this:
model = spreg.GM_Lag(
y,
X,
w=matrix,
)
I get the following error:
warn("w must be API-compatible pysal weights object")
Traceback (most recent call last):
File "/usr/lib/python3.8/code.py", line 90, in runcode
exec(code, self.locals)
File "<input>", line 2, in <module>
File "/home/vojta/Desktop/INTERNET_HANDEL/ZASILKOVNA/optimal-delivery-branches/venv/lib/python3.8/site-packages/spreg/twosls_sp.py", line 469, in __init__
USER.check_weights(w, y, w_required=True)
File "/home/vojta/Desktop/INTERNET_HANDEL/ZASILKOVNA/optimal-delivery-branches/venv/lib/python3.8/site-packages/spreg/user_output.py", line 444, in check_weights
if w.n != y.shape[0] and time == False:
AttributeError: 'list' object has no attribute 'n'
EDIT 2:
This is how I read the list of lists:
import pickle
with open("weighted_matrix.pkl", "rb") as f:
matrix = pickle.load(f)
How can I pass a list of lists to spreg.GM_Lag? Thanks.
Why do you want to pass it to the libpysal.io.open method? If I understand this code correctly, you first open a file and then read it (the read method seems to be returning a list). So in your case, where you already have the matrix, you don't need to open or read any file.
What will be needed, though, is to know what w is supposed to look like here: w = ww.read(). If it is a simple matrix, then you can initialize w = matrix. If the read method also formats the data in a certain way, you'll need to do it another way. If you could describe the expected behavior of the read method (e.g. what the input file contains and what is returned), that would be useful.
As mentioned, since the data is formatted into a libpysal.weights object, you must build one yourself. This can supposedly be done with libpysal.weights.W. (I read the doc too fast.)
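A rough sketch of what that could look like for the matrix in the question, assuming the pickled list of lists is a full, symmetric weight matrix with zeros on the diagonal (libpysal.weights.util also has a full2W helper that, if I remember correctly, converts a full array directly):

import numpy as np
from libpysal.weights import W

matrix = [[0, 2, 1], [2, 0, 4], [1, 4, 0]]
arr = np.asarray(matrix)

# W expects, for every observation id, the ids of its neighbors and the
# matching weights, so pull the non-zero entries out of each row.
neighbors = {}
weights = {}
for i in range(arr.shape[0]):
    nz = np.flatnonzero(arr[i])
    neighbors[i] = nz.tolist()
    weights[i] = arr[i, nz].tolist()

w = W(neighbors, weights)
w.transform = "r"  # row-standardize, as in the original example

# w can now be passed as spreg.GM_Lag(y, X, w=w)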
For example, if I am given the input 5 and 18, I want to convert 5 into five ones, i.e. 11111 % 18 = 5. I can do this using print(int('1' * N) % M),
but I want the same for very large numbers, e.g. N = 338692981500 and M = 1838828.
N should now be converted into 111111111111111111........1111 (N ones), and 111...1 % 1838828 = 482531. When I do this I get a memory error.
N,M=map(int,input().split())
print(int(('1'*N))%M)
338692981500 1838828
Traceback (most recent call last):
File "testing.py", line 687, in <module>
result=int ('1'*N)%M;
MemoryError
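Building the string needs roughly N bytes of memory, which is hopeless for N = 338692981500. The number made of N ones equals (10**N - 1) // 9, and Python's three-argument pow can compute 10**N modulo 9*M without ever materializing the huge integer, which keeps the division by 9 exact. A sketch of that idea:

def repunit_mod(n, m):
    # 111...1 (n ones) equals (10**n - 1) // 9. Working modulo 9*m keeps the
    # intermediate value tiny while the division by 9 remains exact.
    return ((pow(10, n, 9 * m) - 1) % (9 * m)) // 9

n, m = map(int, input().split())
print(repunit_mod(n, m))

For the small case, repunit_mod(5, 18) returns 5, the same as int('11111') % 18, and the large input finishes almost instantly instead of raising a MemoryError.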
I have a DataFrame with a column named price, and I want to draw a distribution plot for that column. I want to assign the plot a name matching the column name, so that even when I have many distributions I can call the required plot separately wherever I need it; the column names are dynamic.
x = 'price'
y = sns.distplot(df[x])
exec("%s = %s" % (x,y))
print(price)
I have tried this code, but it throws an error like this:
Traceback (most recent call last):
File "/home/mahesh/.local/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3287, in run_code
last_expr = (yield from self._async_exec(code_obj, self.user_ns))
File "<ipython-input-36-f28fdca73b33>", line 8, in async-def-wrapper
File "<string>", line 1
price = AxesSubplot(0.125,0.125;0.775x0.755)
^
SyntaxError: invalid syntax
One way is to use a function:
x = df.price

def displot(j):
    sns.distplot(j)

displot(x)
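If the goal is only to look plots up later by (dynamic) column name, a dictionary keyed by column name is an exec-free alternative; a small sketch, where df and the list of columns come from your own data:

import seaborn as sns

plots = {}
for col in ["price"]:          # whatever dynamic column names you have
    plots[col] = sns.distplot(df[col])

print(plots["price"])          # fetch the Axes for 'price' wherever it is needed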
I am trying to read lines of numbers starting at line 7, compile them into a list until there is no more data, and then calculate the standard deviation and %rms of that list. It seems straightforward, but I keep getting this error:
Traceback (most recent call last):
File "rmscalc.py", line 21, in <module>
std = np.std(values)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/core/fromnumeric.py", line 2817, in std
keepdims=keepdims)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/core/_methods.py", line 116, in _std
keepdims=keepdims)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/core/_methods.py", line 86, in _var
arrmean = um.add.reduce(arr, axis=axis, dtype=dtype, keepdims=True)
TypeError: cannot perform reduce with flexible type
Here is my code:
import numpy as np
import glob
import os

values = []
line_number = 6
road = '/Users/allisondavis/Documents/HCl'

for pbpfile in glob.glob(os.path.join(road, 'pbpfile*')):
    lines = open(pbpfile, 'r').readlines()
    while line_number < 400:
        if lines[line_number] == '\n':
            break
        else:
            variables = lines[line_number].split()
            values.append(variables)
            line_number = line_number + 3

print values

a = np.asarray(values).astype(np.float32)
std = np.std(a)
rms = std * 100
print rms
Edit: It produces an rms value (which is wrong, and I'm not sure why yet), but the following error message is confusing. I need the count to be high (I picked 400 just to ensure it would get through the entire file, no matter how large):
Traceback (most recent call last):
File "rmscalc.py", line 13, in <module>
if lines[line_number] == '\n':
IndexError: list index out of range
values is a string array and so is a. Convert a into a numeric type using astype. For example,
a = np.asarray(values).astype(np.float32)
std = np.std(a)
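The IndexError in the edit is a separate issue: line_number is never reset between files, and the hard-coded limit of 400 lets the index run past the end of shorter files. A sketch of one way to handle both, keeping the every-third-line layout from the question:

import glob
import os

import numpy as np

values = []
road = '/Users/allisondavis/Documents/HCl'

for pbpfile in glob.glob(os.path.join(road, 'pbpfile*')):
    lines = open(pbpfile, 'r').readlines()
    line_number = 6                  # reset the counter for every file
    while line_number < len(lines):  # never index past the end of the file
        if lines[line_number] == '\n':
            break
        values.append(lines[line_number].split())
        line_number += 3

a = np.asarray(values).astype(np.float32)
print(np.std(a) * 100)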
I have a pandas DataFrame that I am reading from a CSV. It includes three columns: a subject line and two numbers I am not using yet.
>>> input
0 1 2
0 Stress Free Christmas Gift They'll Love 0.010574 8
I have converted the list of subjects to a NumPy array, and I want to use CountVectorizer for naive Bayes. When I do that, I get the following error:
>>> cv=CountVectorizer()
>>> subjects=np.asarray(input[0])
>>> cv.fit_transform(subjects)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Python/2.7/site-packages/sklearn/feature_extraction/text.py", line 780, in fit_transform
vocabulary, X = self._count_vocab(raw_documents, self.fixed_vocabulary)
File "/Library/Python/2.7/site-packages/sklearn/feature_extraction/text.py", line 715, in _count_vocab
for feature in analyze(doc):
File "/Library/Python/2.7/site-packages/sklearn/feature_extraction/text.py", line 229, in <lambda>
tokenize(preprocess(self.decode(doc))), stop_words)
File "/Library/Python/2.7/site-packages/sklearn/feature_extraction/text.py", line 195, in <lambda>
return lambda x: strip_accents(x.lower())
AttributeError: 'float' object has no attribute 'lower'
These items should definitely all be strings. When I read the csv in with the csv library instead and created an array of that column, I didn't have any problems. Any ideas?
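One likely cause: pandas turns empty cells in that column into NaN, which is a float, and that is exactly what makes .lower() fail inside CountVectorizer; the csv module would have handed you empty strings instead, which is why that route worked. A sketch of a fix, assuming the CSV is read as in the question (df stands in for your input frame; the filename is hypothetical):

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

df = pd.read_csv("subjects.csv", header=None)  # hypothetical filename

# Drop missing subjects (NaN floats) and force the rest to str, so every
# document handed to CountVectorizer has a .lower() method.
subjects = df[0].dropna().astype(str).values

cv = CountVectorizer()
X = cv.fit_transform(subjects)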