I have two separate lists and would like to select pieces of both. The first list changes sign in places; wherever it crosses zero, I want to split both lists at the indices of those zero crossings.
I can write a function that returns the zero-crossing indices, but I do not know how to select from both lists using those indices.
import numpy as np

def func_1(arr1):
    # indices i where the sign changes between arr1[i] and arr1[i+1]
    zero_crossings = np.where(np.diff(np.sign(arr1)))[0]
    return zero_crossings

def func_2(arr1, arr2, zero_crossings):
    res_list1 = []
    res_list2 = []
    for i in zero_crossings:
        res_list1.append(arr1[i:i+1])
        res_list2.append(arr2[i:i+1])
    return res_list1, res_list2
zero_crossing = [3,12,18]
list1 = [-12,-14,-16,-10,1,3,6,2,,1,5,5,3,-1,-12,-2,-3,-5,-3,3,2,5,2]
list2 = [0.00040409 0.00041026 0.0004162 ... 0.00116538 0.001096 0.00102431]
The expected results:
new_list_1 = [list1[0:3]+list1[3:12]+list1[12:18]]
new_list_2 = [list2[0:3]+list2[3:12]+list2[12:18]]
segments = []
for i in range(len(zero_crossing) - 1):
    # one segment per pair of consecutive zero-crossing indices
    segments.append(list1[zero_crossing[i]:zero_crossing[i+1]])
I want to use list1 to find where the sign changes, and then use that list of sign-change indices to select the values from both lists.
All efforts will be appreciated.
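A minimal sketch of one way to do this, assuming each piece should end at a zero-crossing index (inclusive) and that the same boundaries apply to both lists. The helper names sign, zero_crossings and split_at and the short sample lists are made up for illustration; the indices are meant to be the same ones np.where(np.diff(np.sign(arr)))[0] would give.

```python
def sign(x):
    # -1, 0 or 1, like numpy.sign for plain numbers
    return (x > 0) - (x < 0)

def zero_crossings(values):
    # indices i where the sign changes between values[i] and values[i + 1]
    return [i for i in range(len(values) - 1)
            if sign(values[i]) != sign(values[i + 1])]

def split_at(values, crossings):
    # assumption: a piece ends at the crossing index, the next one starts right after it
    bounds = [0] + [i + 1 for i in crossings] + [len(values)]
    return [values[a:b] for a, b in zip(bounds, bounds[1:])]

# hypothetical sample data, not the lists from the question
list1 = [-2, -1, 3, 4, -5, 6]
list2 = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

crossings = zero_crossings(list1)
parts1 = split_at(list1, crossings)
parts2 = split_at(list2, crossings)
print(crossings)
print(parts1)
print(parts2)
```

Because both lists are cut at the same boundaries, parts1[k] and parts2[k] always cover the same index range.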
Hi, I am trying to solve a problem where I have to return the indices of the entries that belong to the same person, grouped into sublists. By "same person" I mean entries that share the same username, phone or email (any one of them).
I understand that these identifiers are mostly unique, but for the sake of the question let's assume they can repeat.
E.g.
data = [("username1","phone_number1", "email1"),
("usernameX","phone_number1", "emailX"),
("usernameZ","phone_numberZ", "email1Z"),
("usernameY","phone_numberY", "emailX"),
("username2","phone_number2", "emailX")]
Expected output:
[[0,1,3,4],[2]]
Explanation: 0 and 1 have the same phone, and 1, 3 and 4 have the same email, so they all fall into one category; index 2 falls into the other category.
My approach until now is :
data = [("username1","phone_number1", "email1"),
("usernameX","phone_number1", "emailX"),
("usernameZ","phone_numberZ", "email1Z"),
("usernameY","phone_numberY", "emailX"),
]
def match(t1,t2):
    if(t1[0] == t2[0] or t1[1] == t2[1] or t1[2] == t2[2]):
        return True
    else:
        return False
# print(match(data[1],data[3]))
together = []
for i in range(len(data)):
    temp = {i}
    for j in range(len(data)):
        if(match(data[i],data[j])):
            temp.add(j)
    together.append(temp)
for i in range(len(data)):
    ans = together[i]
    for j in range(i+1,len(data)):
        if(bool(ans.intersection(together[j]))):
            ans = ans.union(together[j])
    print(ans)
I am not able to reach the desired result.
Any help is appreciated. Thank you.
A first solution is similar to yours, with some enhancements:
Leveraging any for the match, so that it does not need to know the number of items inside the tuples.
Checking whether a user is already identified as part of "together", to skip useless comparisons.
Here it is:
together = set()
for user_idx, user in enumerate(data):
    if user_idx in together:
        # That user is already matched. Caveat: a later user that matches
        # only this one is not detected once this user is skipped.
        continue
    # No need to check with previous users
    for other_idx, other in enumerate(data[user_idx + 1 :], user_idx + 1):
        # Match
        if any(val_ref == val_other for val_ref, val_other in zip(user, other)):
            together.update((user_idx, other_idx))
isolated = set(range(len(data))) ^ together
Another solution goes through a numpy array to identify isolated users. With numpy it is easy to compare a user to every other user (i.e. to the original array). An isolated user matches only once, with itself, on each of its fields; hence summing the match counts along the fields returns, for an isolated user, the length of the tuple of fields.
import numpy as np

data = np.array(data)
# For each user, match it with the whole matrix
matches = sum(user == data for user in data)
# Isolated users only match with themselves, hence only have 1 on their line
isolated = set(np.where(np.sum(matches, axis=1) == data.shape[1])[0])
# Together are other users
together = set(range(len(data))) ^ set(isolated)
See the matches array for a better understanding:
[[1 2 1]
[1 2 3]
[1 1 1]
[1 1 3]
[1 1 3]]
However, it does not leverage any of the optimisations mentioned before.
Still, numpy is fast, so it should be OK.
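Both snippets above separate matched users from isolated ones, but the question asks for the groups themselves. A sketch of one possible way to get them is a small union-find (disjoint set) keyed on (field position, value); this is a swapped-in technique, not part of the solutions above, and the function name group_same_person is made up for illustration.

```python
from collections import defaultdict

def group_same_person(records):
    # parent[i] points towards the representative of record i's group
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    # first record index seen for each (field position, value) pair
    first_seen = {}
    for idx, record in enumerate(records):
        for field_pos, value in enumerate(record):
            key = (field_pos, value)
            if key in first_seen:
                union(idx, first_seen[key])
            else:
                first_seen[key] = idx

    groups = defaultdict(list)
    for idx in range(len(records)):
        groups[find(idx)].append(idx)
    return sorted(groups.values())

data = [("username1", "phone_number1", "email1"),
        ("usernameX", "phone_number1", "emailX"),
        ("usernameZ", "phone_numberZ", "email1Z"),
        ("usernameY", "phone_numberY", "emailX"),
        ("username2", "phone_number2", "emailX")]
result = group_same_person(data)
print(result)  # [[0, 1, 3, 4], [2]]
```

Each record is visited once and each field is looked up in a dictionary, so the pairwise comparison of every user with every other user is avoided.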
I've created a function to combine specific items in a python list, but I suspect there is a better way I can't find despite extreme googling. I need the code to be fast, as I'm going to be doing this thousands of times.
mergeleft takes a list of items and a list of 1-based field numbers. In the example below, I call it as mergeleft(fields,(2,4,5)). Items 5, 4, and 2 of the list fields will each be concatenated to the item immediately to their left. In this case, 3 and d get concatenated to c; b gets concatenated to a. The result is the list ['ab', 'cd3', 'f'].
fields = ['a','b','c','d', 3,'f']
def mergeleft(x, fieldnums):
    if 1 in fieldnums: raise Exception('Cannot merge field 1 left')
    if max(fieldnums) > len(x): raise IndexError('Fieldnum {} exceeds available fields {}'.format(max(fieldnums),len(x)))
    y = []
    deleted_rows = ''
    for i,l in enumerate(reversed(x)):
        if (len(x) - i) in fieldnums:
            deleted_rows = str(l) + deleted_rows
        else:
            y.append(str(l)+deleted_rows)
            deleted_rows = ''
    y.reverse()
    return y
print(mergeleft(fields,(2,4,5)))
# Returns ['ab','cd3','f']
fields = ['a','b','c','d', 3,'f']
This assumes a list of indices in monotonic ascending order.
I reverse the order, so that I'm merging right-to-left.
For each given index, I merge that element into the one on the left, converting to string at each point.
Do note that I've changed the fieldnums type to list, so that it's easily reversible. You can also just traverse the tuple in reverse order.
def mergeleft(lst, fieldnums):
    fieldnums.reverse()
    for pos in fieldnums:
        # Merge this field left
        lst[pos-2] = str(lst[pos-2]) + str(lst[pos-1])
        lst = lst[:pos-1] + lst[pos:]
    return lst
print(mergeleft(fields,[2,4,5]))
Output:
['ab', 'cd3', 'f']
Here's a decently concise solution, probably among many.
def mergeleft(x, fieldnums):
    if 1 in fieldnums: raise Exception('Cannot merge field 1 left')
    if max(fieldnums) > len(x): raise IndexError('Fieldnum {} exceeds available fields {}'.format(max(fieldnums),len(x)))
    ret = list(x)
    for i in reversed(sorted(set(fieldnums))):
        # fieldnums are 1-based: field i is ret[i-1], its left neighbour is ret[i-2]
        ret[i-2] = str(ret[i-2]) + str(ret.pop(i-1))
    return ret
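Since speed was a concern in the question, here is a sketch of a single forward pass that avoids repeated slicing or popping. It assumes the same 1-based field numbers as above and, like the original function, converts every item to a string; the name mergeleft_single_pass is made up for illustration, and no timing has been measured.

```python
def mergeleft_single_pass(items, fieldnums):
    merge = set(fieldnums)  # set membership tests are O(1) on average
    if 1 in merge:
        raise ValueError('Cannot merge field 1 left')
    if merge and max(merge) > len(items):
        raise IndexError('Fieldnum {} exceeds available fields {}'.format(max(merge), len(items)))
    merged = []
    for number, item in enumerate(items, start=1):
        if number in merge:
            merged[-1] += str(item)   # glue onto whatever is currently last
        else:
            merged.append(str(item))  # start a new output field
    return merged

fields = ['a', 'b', 'c', 'd', 3, 'f']
result = mergeleft_single_pass(fields, (2, 4, 5))
print(result)  # ['ab', 'cd3', 'f']
```

The input list is left untouched, and the field numbers can be given in any order because they are only used for membership tests.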
I have a list with the following structure:
[('0','927','928'),('2','693','694'),('2','742','743'),('2','776','777'),('2','804','805'),
('2','987','988'),('2','997','998'),('2','1019','1020'),
('2','1038','1039'),('2','1047','1048'),('2','1083','1084'),('2','659','660'),
('2','677','678'),('2','743','744'),('2','777','778'),('2','805','806'),('2','830','831')]
The first number is an id, the second is the position of a word and the third is the position of a second word. What I need to do, and am struggling with, is find sets of words next to each other.
These results are given for searches of 3 words, so there are the positions of word 1 with word 2 and the positions of word 2 with word 3. For example:
I run the phrase query "women in science" and get the values in the list above, so ('2','776','777') is the result for 'women in' and ('2','777','778') is the result for 'in science'.
I need a way to match these results up so that, for every document, the words are grouped together according to the number of words in the query (so if there are 4 words in the query, there will be 3 results that need to be matched together).
Is this possible?
You need to quickly find word info by its position. Create a dictionary keyed by word position:
# from your example; I wonder why you use strings and not numbers.
positions = [('0','927','928'),('2','693','694'),('2','742','743'),('2','776','777'),('2','804','805'),
('2','987','988'),('2','997','998'),('2','1019','1020'),
('2','1038','1039'),('2','1047','1048'),('2','1083','1084'),('2','659','660'),
('2','677','678'),('2','743','744'),('2','777','778'),('2','805','806'),('2','830','831')]
# create the dictionary
dict_by_position = {w_pos:(w_id, w_next) for (w_id, w_pos, w_next) in positions}
Now it's a piece of cake to follow chains:
>>> dict_by_position['776']
('2', '777')
>>> dict_by_position['777']
('2', '778')
Or programmatically:
def followChain(start, position_dict):
    result = []
    scanner = start
    while scanner in position_dict:
        next_item = position_dict[scanner]
        result.append(next_item)
        unused_id, scanner = next_item # unpack the (id, next_position)
    return result
>>> followChain('776', dict_by_position)
[('2', '777'), ('2', '778')]
Finding all chains that are not subchains of each other:
# a position that is the second word of some pair cannot start a maximal chain
next_positions = set(w_next for (w_id, w_pos, w_next) in positions)
for start in dict_by_position:
    if start not in next_positions:
        chain = followChain(start, dict_by_position)
        print(chain) # or do something reasonable instead
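A sketch of the same idea that keeps documents apart and only reports chains long enough for the query. It assumes positions can be converted to integers and that each (id, position) pair has at most one follower; the function name phrase_chains and its words_in_query parameter are made up for illustration.

```python
def phrase_chains(pairs, words_in_query):
    # map (doc id, word position) -> position of the following word
    next_pos = {(doc, int(first)): int(second) for doc, first, second in pairs}
    # positions that are the second word of some pair cannot start a chain
    has_previous = {(doc, int(second)) for doc, first, second in pairs}
    chains = []
    for doc, start in next_pos:
        if (doc, start) in has_previous:
            continue
        chain = [start]
        while (doc, chain[-1]) in next_pos:
            chain.append(next_pos[(doc, chain[-1])])
        if len(chain) >= words_in_query:
            chains.append((doc, chain))
    return sorted(chains)

positions = [('0','927','928'),('2','693','694'),('2','742','743'),('2','776','777'),('2','804','805'),
             ('2','987','988'),('2','997','998'),('2','1019','1020'),
             ('2','1038','1039'),('2','1047','1048'),('2','1083','1084'),('2','659','660'),
             ('2','677','678'),('2','743','744'),('2','777','778'),('2','805','806'),('2','830','831')]
result = phrase_chains(positions, 3)
print(result)
```

For the three-word query this keeps only the runs of three consecutive positions in a document and drops the isolated pairs.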
The following will do what you're asking, as I understand it - it's not the prettiest output in the world, and I think that if possible you should be using numbers if numbers are what you're trying to work with.
There are probably more elegant solutions, and simplifications that could be made to this:
positions = [('0','927','928'),('2','693','694'),('2','742','743'),('2','776','777'),('2','804','805'),
('2','987','988'),('2','997','998'),('2','1019','1020'),
('2','1038','1039'),('2','1047','1048'),('2','1083','1084'),('2','659','660'),
('2','677','678'),('2','743','744'),('2','777','778'),('2','805','806'),('2','830','831')]
sorted_dict = {}
sorted_list = []
grouped_list = []
doc_ids = []
def sort_func(positions):
    for item in positions:
        if item[0] not in doc_ids:
            doc_ids.append(item[0])
    for doc_id in doc_ids:
        sorted_set = []
        for item in positions:
            if item[0] != doc_id:
                continue
            else:
                if item[1] not in sorted_set:
                    sorted_set.append(item[1])
                if item[2] not in sorted_set:
                    sorted_set.append(item[2])
        sorted_list = sorted(sorted_set)
        sorted_dict[doc_id] = sorted_list
    for k in sorted_dict:
        group = []
        grouped_list = []
        for i in sorted_dict[k]:
            try:
                if int(i)-1 == int(sorted_dict[k][sorted_dict[k].index(i)-1]):
                    group.append(i)
                else:
                    if group != []:
                        grouped_list.append(group)
                    group = [i]
            except IndexError:
                group.append(i)
                continue
        if grouped_list != []:
            sorted_dict[k] = grouped_list
        else:
            sorted_dict[k] = group
    return sorted_dict
My output for the above was:
{'0': ['927', '928'], '2': [['1019', '1020'], ['1038', '1039'], ['1047', '1048'], ['1083', '1084'], ['659', '660'], ['677', '678'], ['693', '694'], ['742', '743', '744'], ['776', '777', '778'], ['804', '805', '806'], ['830', '831'], ['987', '988']]}
I am trying to create a 2D list, and I keep getting the same error: "TypeError: list indices must be integers, not tuple". I do not understand why, or how to use a 2D list correctly.
Total = 0
server = xmlrpclib.Server(url);
mainview = server.download_list("", "main")
info = [[]]
info[0,0] = hostname
info[0,1] = time
info[0,2] = complete
info[0,3] = Errors
for t in mainview:
    Total += 1
    print server.d.get_hash(t)
    info[Total, 0] = server.d.get_hash(t)
    info[Total, 1] = server.d.get_name(t)
    info[Total, 2] = server.d.complete(t)
    info[Total, 3] = server.d.message(t)
    if server.d.complete(t) == 1:
        Complete += 1
    else:
        Incomplete += 1
    if (str(server.d.message(t)).__len__() >= 3):
        Error += 1
info[0,2] = Complete
info[0,3] = Error
Everything works, except for trying to deal with info.
Your mistake is in how you access the 2D list. Change:
info[0,0] = hostname
info[0,1] = time
info[0,2] = complete
info[0,3] = Errors
to:
info[0].append(hostname)
info[0].append(time)
info[0].append(complete)
info[0].append(Errors)
The same goes for info[Total, 0] and the rest; there, a new row has to be added first, e.g. with info.append([]).
The way you created info, it is a list containing only one element, namely an empty list. When working with lists, you have to address the nested items like
info[0][0] = hostname
For initialization, you have to create a list of lists by e.g.
# create list of lists of 0, size is 10x10
info = [[0]*10 for i in range(10)]
When using numpy arrays, you can address the elements as you did.
One advantage of "lists of lists" is that the entries of the "2D list" need not all have the same data type!
info = [[] for i in range(4)] # create 4 empty lists inside a list
info[0].append(hostname)   # info[0] is still empty, so info[0][0] would raise IndexError
info[0].append(time)
info[0].append(complete)
info[0].append(Errors)
You need to create the 2d array first.
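To tie the answers together, here is a small sketch of building the table row by row. The downloads list is a hypothetical stand-in for the values the XML-RPC server would return, and "hostname" and "time" are placeholder strings; only the list handling is the point.

```python
# hypothetical stand-in data: (hash, name, complete flag, message)
downloads = [("hash1", "name1", 1, ""),
             ("hash2", "name2", 0, "Tracker error")]

# row 0 is the header/summary row; the counters are filled in at the end
info = [["hostname", "time", 0, 0]]

for d_hash, name, complete, message in downloads:
    # append a whole new row instead of assigning to info[Total, 0]
    info.append([d_hash, name, complete, message])

complete_count = sum(1 for row in info[1:] if row[2] == 1)
error_count = sum(1 for row in info[1:] if len(str(row[3])) >= 3)

# nested lists are indexed with two pairs of brackets, not info[0, 2]
info[0][2] = complete_count
info[0][3] = error_count

try:
    info[0, 2]
except TypeError as exc:
    print("tuple index fails:", exc)

print(info)
```

Indexing with info[0, 2] passes the tuple (0, 2) as a single index, which is what triggers the TypeError from the question.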