Map function doesn't work in for loop - python

I'm trying to merge two lists if they contain a certain word.
My code works fine until I try to transfer it to under a function or under a for loop.
When I do I get:
TypeError: argument 2 to map() must support iteration
I also tried replacing map(None, a,b) with itertools.imap(None, a,b) as suggested in other posts but get :
TypeError: 'int' object is not iterable
Any suggestions?
a = 0
b = 0
row_combine = []
for row in blank3:
    if 'GOVERNMENTAL' in row:
        a = row
    if 'ACTIVITIES' in row:
        b = row
    c = map(None, a,b) #problem is here
    for row in c:
        row1 = []
        if row[0] == None:
            row1.append(''.join([''] + [row[1]]))
        else:
            row1.append(''.join([row[0]] + [' '] + [row[1]]))
        row_combine.append(''.join(row1))
output for a:
a = [' ', u'GOVERNMENTAL', u'BUSINESS-TYPE']
output for b:
b = [u'ASSETS', u'ACTIVITIES', u'ACTIVITIES', u'2009', u'2008', u'JEDO']
need it to be:
[ u'ASSETS', u'GOVERNMENTAL ACTIVITIES', u'BUSINESS-TYPE ACTIVITIES', u'2009', u'2008', u'JEDO']
hence the for loop after the map function.

If after iterating through blank3 you never encounter both 'GOVERNMENTAL' and 'ACTIVITIES', a or b could still be 0, an int that cannot be iterated, which will cause map to fail. You could start a and b off as empty lists, or check your input before calling map().
Meanwhile, instead of the for loop:
row_combine = map(lambda x, y: ((x or '') + ' ' + (y or '')).strip(), a, b)
Which yields:
[u'ASSETS', u'GOVERNMENTAL ACTIVITIES', u'BUSINESS-TYPE ACTIVITIES', u'2009', u'2008', u'JEDO']
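Note that map(None, a, b) and the None-padding behaviour relied on above are Python 2 only; on Python 3, map stops at the shortest input and itertools.imap no longer exists. A minimal sketch of the same merge on Python 3, assuming a and b hold the two lists shown in the question (the helper name merge_rows is made up for illustration):

```python
from itertools import zip_longest

def merge_rows(a, b):
    """Join a and b element-wise, padding the shorter list with ''."""
    return [
        (x + ' ' + y).strip()
        for x, y in zip_longest(a, b, fillvalue='')
    ]

a = [' ', 'GOVERNMENTAL', 'BUSINESS-TYPE']
b = ['ASSETS', 'ACTIVITIES', 'ACTIVITIES', '2009', '2008', 'JEDO']
row_combine = merge_rows(a, b)
print(row_combine)
```

Using fillvalue='' avoids the (x or '') guards, since missing entries are already empty strings.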

Related

Why won't len(temp_list) update its value in a loop where values of temp_list changes every iteration?

I am trying to create a dataframe data with two columns, 'word' and 'misspelling'. My attempt has five parts: one function, three dataframes, and one loop.
A function which generates misspellings (taken from Peter Norvig):
def generate(word):
    letters = 'abcdefghijklmnopqrstuvwxyz'
    splits = [(word[:i], word[i:]) for i in range(len(word) +1)]
    deletes = [L + R[1:] for L, R in splits if R]
    transposes = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R)>1]
    replaces = [L + c + R[1:] for L, R in splits if R for c in letters]
    inserts = [L + c + R for L, R in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)
A dataframe with words to generate the misspelling:
wl = ['a', 'is', 'the']
word_list = pd.DataFrame(wl, columns = ['word'])
An empty dataframe meant to be filled up in the loop:
data = pd.DataFrame(columns = ['word', 'misspelling'])
An empty dataframe meant to temporarily hold the values from the function 'generate' in the loop:
temp_list = pd.DataFrame(columns = ['misspelling'])
A loop that will fill up the dataframe data:
y = 0
for a in range(len(word_list)):
    temp_list['misspelling'] = pd.DataFrame(generate(word_list.at[a,'word']))
    data = pd.concat([data,temp_list], ignore_index = True)
    print(len(temp_list)) #to check the length of 'temp_list' in each loop
    for x in range(len(temp_list)):
        data.at[y,'word'] = word_list.at[a,'word']
        y = y + 1
    y = data.index[-1] + 1
    temp_list.drop(columns = ['misspelling'])
When I check data outside of the loop, I expect it to have 390 rows in total, which is len(generate('is')) + len(generate('a')) + len(generate('the')).
Instead, data has only 234 rows. When I checked which variable was off, it turned out to be len(temp_list), which I expected to change on every iteration since new values replace the old ones.
len(temp_list) stays the same, so temp_list['misspelling'] = pd.DataFrame(generate(word_list.at[a,'word'])) never holds more than len(generate('a')) rows ('a' being the first value in word_list), even though the generated misspellings in temp_list are different on each iteration.
I thought adding temp_list.drop(columns = ['misspelling']) at the end of the outer loop would reset temp_list, but it does not seem to reset len(temp_list).
temp_list.drop() with inplace=False (which is the default) does not modify the existing dataframe, but returns a new one. However, even if you fix that, it still won’t work, because you would also need to drop the index, and I’m not sure that’s even possible.
I don’t quite understand what you are trying to do (for example, the for x in ... loop never uses x) but I suspect you might be better off using plain Python lists instead of dataframes.
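The plain-list approach suggested above might look like the sketch below. It reuses generate from the question unchanged; the name rows and the use of sorted() are assumptions added for illustration, and the dataframe conversion is only mentioned in a comment so the example runs without pandas.

```python
def generate(word):
    # Same edit-distance-1 generator as in the question.
    letters = 'abcdefghijklmnopqrstuvwxyz'
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [L + R[1:] for L, R in splits if R]
    transposes = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R) > 1]
    replaces = [L + c + R[1:] for L, R in splits if R for c in letters]
    inserts = [L + c + R for L, R in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)

wl = ['a', 'is', 'the']

# One (word, misspelling) pair per generated string; sorted() only makes
# the order reproducible, since sets are unordered.
rows = [(word, m) for word in wl for m in sorted(generate(word))]

print(len(rows))
# If a dataframe is needed at the end, the pairs could be passed to
# pd.DataFrame(rows, columns=['word', 'misspelling']) in one step.
```

Building the full list first and converting once sidesteps the index alignment that kept temp_list at a fixed length.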

I want to make a list of list in python in loop

I want to get an output like
['a','aa','aaa','aaaa','aaaaa'],['b','bb','bbb','bbbb','bbbbb'],['c','cc','ccc','cccc','ccccc']
using the code snippet below
a = ["a","b","g","f"]
b =[1,2,3,4,5]
e = []
f = []
for i in a:
    for j in b:
        e.append(i*j)
    f.append(e)
print(f)
Can anybody help me with this please
You're failing to reset e to a new empty list after each outer loop, so you're inserting a bunch of aliases to the same list. Just move the e = [] between the two fors (and indent to match), e.g.:
for i in a:
    e = []
    for j in b:
        ...
so you make a new list to populate on each loop.
Alternatively, as noted in the comments, a nested listcomp would do the trick:
f = [[x * y for x in b] for y in a]
removing the need for a named e at all (and running somewhat faster to boot).
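The aliasing described in this answer can be checked directly with the is operator. A small sketch with shortened inputs (the names g, shared and distinct are introduced here for illustration only):

```python
a = ["a", "b"]
b = [1, 2, 3]

# Buggy version: the same list object is appended to f on every pass.
e = []
f = []
for i in a:
    for j in b:
        e.append(i * j)
    f.append(e)
shared = f[0] is f[1]        # both entries refer to one list

# Fixed version: a fresh list is created for each element of a.
g = []
for i in a:
    e = []
    for j in b:
        e.append(i * j)
    g.append(e)
distinct = g[0] is not g[1]  # each entry is its own list

print(shared, distinct)
print(g)
```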
It looks like you forgot to reset your e list on every iteration. This will yield your desired outcome:
a = ["a","b","g","f"]
b =[1,2,3,4,5]
e = []
f = []
for i in a:
    e= []
    for j in b:
        e.append(i*j)
    f.append(e)
print(f)
The only error is that you are not resetting e to [] after each iteration in a. Let me explain:
This is the code you need:
for i in a:
    for j in b:
        e.append(i*j)
    f.append(e)
    e = []
You need the e = [] at the end. Without it, e is ['a', 'aa', ...] at the end of the first iteration, and during the second iteration it grows to ['a', 'aa', 'aaa', 'aaaa', 'aaaaa', 'b', 'bb', 'bbb', ...]. Setting it back to an empty list stops this.
Python has a property that if you multiply a string by a number, the string is repeated that many times. For example:
print(2*"ab")
abab
You can use this for your answer:
a = ["a","b","g","f"]
b =[1,2,3,4,5]
c = [[x*y for x in b] for y in a]
print(c)
Lengthy, but this works fine too
a = ['a','b','c','d']
b = [1,2,3,4,5]
nest_array = []
for i in range(0,len(a)):
    arr = []
    for val in b:
        str1 = ""
        while len(str1) < val:
            if len(str1) == val:
                break
            else:
                str1 += a[i]
        arr.append(str1)
    nest_array.append(arr)
print(nest_array)

Traversing through a list that has b' ahead

When I write
a1 = list([b'1,690569\n1,315892\n1,189226\n2,834328\n2,1615927\n2,1194519\n'])
print(a1)
for edge_ in a1:
    print('edge =' + str(edge_))
    z[(edge_[0], edge_[1])] = 1
    print('edge_0 =' + str(edge_[0]))
    print('edge_1 =' + str(edge_[1]))
print(z)
I get the output as
[b'1,690569\n1,315892\n1,189226\n2,834328\n2,1615927\n2,1194519\n']
edge =b'1,690569\n1,315892\n1,189226\n2,834328\n2,1615927\n2,1194519\n'
edge_0 =49
edge_1 =44
{(49, 44): 1}
Can anyone explain why it is 49 and 44? These values are coming irrespective of the element inside the list.
Firstly, as others have already mentioned, the single element of your list is a bytes object, as the 'b' prefix shows. Indexing a bytes object in Python 3 returns an integer (the byte value), not a one-character string. You don't need to use 'list()' by the way.
a1 = [b'1,690569\n1,315892\n1,189226\n2,834328\n2,1615927\n2,1194519\n']
Given that z is an empty dictionary (i.e. z = dict())
Below is just adding a tuple as a key and an integer as value:
z[(edge_[0], edge_[1])] = 1
We can see the following:
edge_ = a1[0] = b'1,690569\n1,315892\n1,189226\n2,834328\n2,1615927\n2,1194519\n'
edge_[0] = a1[0][0] = ord(b'1') = 49
edge_[1] = a1[0][1] = ord(b',') = 44
Hence z[(edge_[0], edge_[1])] = 1 becomes:
z[(49, 44)] = 1
z = {(49, 44): 1}
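If the intent was to use each "x,y" line as a dictionary key (an assumption; the question does not say so), the bytes would first have to be decoded and split into lines. A minimal sketch under that assumption, with blob, left and right as illustrative names:

```python
a1 = [b'1,690569\n1,315892\n1,189226\n2,834328\n2,1615927\n2,1194519\n']

z = {}
for blob in a1:
    # decode() turns bytes into str; splitlines() yields one "x,y" per line
    for line in blob.decode('ascii').splitlines():
        left, right = line.split(',')
        z[(int(left), int(right))] = 1

print(z)
```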

write a function and return all the values within the function for enumerate

I would like my function to return all the indices and values for the loop:
Code:
a = [["one"], ["two"], ["three"], ["four"]]
def count(a):
    for i, x in enumerate(a):
        return i,x
aa = count(a)
print(aa)
Output:
(0, ['one'])
Expected output:
(0 ['one'])
(1 ['two'])
(2 ['three'])
(3 ['four'])
You are only getting the first item because you return the first time you loop. Try something like this. Modify it as you need.
a = [["one"], ["two"], ["three"], ["four"]]
def count(a):
    output_str = ""
    for i, x in enumerate(a):
        output_str += str(i) + "," + str(x) + "\n"
    return output_str
aa = count(a)
print(aa)
Output
0,['one']
1,['two']
2,['three']
3,['four']
As @C. Yduqoli suggested, this can also be done with yield, although it involves an additional step to consume the generator, like so:
a = [["one"], ["two"], ["three"], ["four"]]
def count(a):
    for i, x in enumerate(a):
        yield i,x
aa = count(a)
for item in aa:
    print(item)
Output
(0, ['one'])
(1, ['two'])
(2, ['three'])
(3, ['four'])
Since enumerate() returns an iterable, not a sequence, you can turn it into a sequence with list() or tuple():
a = [["one"], ["two"], ["three"], ["four"]]
def count(a):
    return list(enumerate(a))
aa = count(a)
print(aa)
Note that the output of the above code is not exactly what you expect, since it prints the whole list on a single line.
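To get one pair per line from the list-returning version, the caller can loop over the result. A short sketch (the name pairs is introduced here for illustration):

```python
a = [["one"], ["two"], ["three"], ["four"]]

def count(a):
    # Materialise every (index, value) pair before returning.
    return list(enumerate(a))

pairs = count(a)
for pair in pairs:
    print(pair)
```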

Mapping two list of lists based on its items into list pairs in Python

I have two list of lists which basically need to be mapped to each other based on their matching items (list). The output is a list of pairs that were mapped. When the list to be mapped is of length one, we can look for direct matches in the other list. The problem arises, when the list to be mapped is of length > 1 where I need to find, if the list in A is a subset of B.
Input:
A = [['point'], ['point', 'floating']]
B = [['floating', 'undefined', 'point'], ['point']]
My failed Code:
C = []
for a in A:
    for b in B:
        if a == b:
            C.append([a, b])
        else:
            if set(a).intersection(b):
                C.append([a, b])
print C
Expected Output:
C = [
[['point'], ['point']],
[['point', 'floating'], ['floating', 'undefined', 'point']]
]
Just add a length condition to the elif statement:
import pprint
A = [['point'], ['point', 'floating']]
B = [['floating', 'undefined', 'point'], ['point']]
C = []
for a in A:
    for b in B:
        if a==b:
            C.append([a,b])
        elif all(len(x)>=2 for x in [a,b]) and not set(a).isdisjoint(b):
            C.append([a,b])
pprint.pprint(C)
output:
[[['point'], ['point']],
[['point', 'floating'], ['floating', 'undefined', 'point']]]
Just for interest's sake, here's a "one line" implementation using itertools.ifilter (Python 2 only).
from itertools import ifilter
C = list(ifilter(
    lambda x: x[0] == x[1] if len(x[0]) == 1 else set(x[0]).issubset(x[1]),
    ([a,b] for a in A for b in B)
))
EDIT:
Having read the most recent comments on the question, I think I may have misinterpreted what exactly is considered to be a match. In which case, something like this may be more appropriate.
C = list(ifilter(
    lambda x: x[0] == x[1] if len(x[0])<2 or len(x[1])<2 else set(x[0]).intersection(x[1]),
    ([a,b] for a in A for b in B)
))
Either way, the basic concept is the same. Just change the condition in the lambda to match exactly what you want to match.
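itertools.ifilter exists only in Python 2. On Python 3 the same idea can be written as a list comprehension; the sketch below assumes the subset rule from the question (single-item lists need an exact match, longer lists must be a subset), and the helper name matches is made up for illustration.

```python
A = [['point'], ['point', 'floating']]
B = [['floating', 'undefined', 'point'], ['point']]

def matches(a, b):
    # Single-item lists must match exactly; longer lists must be a subset.
    if len(a) == 1:
        return a == b
    return set(a).issubset(b)

C = [[a, b] for a in A for b in B if matches(a, b)]
print(C)
```

Keeping the condition in a named function makes it easy to swap in a different matching rule.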
