Related
I am writing a Python code to remove equal same characters from two strings which lies on the same indices. For example remove_same('ABCDE', 'ACBDE') should make both arguments as BC and CB. I know that string is immutable here so I have converted them to list. I am getting an out of index error.
def remove_same(l_string, r_string):
l_list = list(l_string)
r_list = list(r_string)
i = 0
while i != len(l_list):
print(f'in {i} length is {len(l_list)}')
while l_list[i] == r_list[i]:
l_list.pop(i)
r_list.pop(i)
if i == len(l_list) - 1:
break
if i != len(l_list):
i += 1
return l_list[0] == r_list[0]
I would avoid using a while loop in that case, I think this is a better and more clear solution:
def remove_same(s1, s2):
l1 = list(s1)
l2 = list(s2)
out1 = []
out2 = []
for c1, c2 in zip(l1, l2):
if c1 != c2:
out1.append(c1)
out2.append(c2)
s1_out = "".join(out1)
s2_out = "".join(out2)
print(s1_out)
print(s2_out)
It could be shortened using some list comprehensions but I was trying to be as explicit as possible
I feel this could be a problem.
while l_list[i] == r_list[i]:
l_list.pop(i)
r_list.pop(i)
This could reduce size of list and it can go below i.
Do a dry run on this, if l_list = ["a"] and r_list = ["a"].
It is in general not a good idea to modify a list in a loop. Here is a cleaner, more Pythonic solution. The two strings are zipped and processed in parallel. Each pair of equal characters is discarded, and the remaining characters are arranged into new strings.
a = 'ABCDE'
b = 'ACFDE'
def remove_same(s1, s2):
return ["".join(s) for s
in zip(*[(x,y) for x,y in zip(s1,s2) if x!=y])]
remove_same(a, b)
#['BC', 'CF']
Here you go:
def remove_same(l_string, r_string):
# if either string is empty, return False
if not l_string or not r_string:
return False
l_list = list(l_string)
r_list = list(r_string)
limit = min(len(l_list), len(r_list))
i = 0
while i < limit:
if l_list[i] == r_list[i]:
l_list.pop(i)
r_list.pop(i)
limit -= 1
else:
i += 1
return l_list[0] == r_list[0]
print(remove_same('ABCDE', 'ACBDE'))
Output:
False
My Python code generates some matrices (one at a time) through a loop over some index called i.
Storing matrices with names like mat_0, mat_1,..., mat_i is straightforward but I was wondering if it somehow possible to store matrices like iterable elements like mat[0], mat[1],...,mat[i]?
Note: The matrices are stored in scipy sparse coo_matrix format.
Edit 1 : The index i does not necessarily follow a proper sequence and may loop over some random numbers like 0,2,3,7,... In that case the matrices have to be stored as mat[0], mat[2], mat[3], mat[7],... and so on.
Edit 2: Minimal working code
import numpy as np
from math import sqrt
from scipy.sparse import coo_matrix, csr_matrix
primesqt = np.array([1.41421356, 1.73205080, 2.23606797, 2.64575131, 3.31662479, 3.60555127, 4.12310562, 4.35889894, 4.79583152, 5.38516480, 5.56776436, 6.08276253, 6.40312423, 6.55743852, 6.85565460, 7.28010988, 7.68114574, 7.81024967, 8.18535277, 8.42614977, 8.54400374, 8.88819441, 9.11043357, 9.43398113, 9.84885780, 10.04987562, 10.14889156, 10.34408043, 10.44030650, 10.63014581, 11.26942766, 11.44552314, 11.70469991, 11.78982612, 12.20655561, 12.28820572, 12.52996408, 12.76714533, 12.92284798, 13.15294643, 13.37908816, 13.45362404, 13.82027496, 13.89244398, 14.03566884, 14.10673597, 14.52583904, 14.93318452, 15.06651917, 15.13274595])
def bg(n, k, min_elem, max_elem):
allowed = range(max_elem, min_elem-1, -1)
def helper(n, k, t):
if k == 0:
if n == 0:
yield t
elif k == 1:
if n in allowed:
yield t + (n,)
elif min_elem * k <= n <= max_elem * k:
for v in allowed:
yield from helper(n - v, k - 1, t + (v,))
return helper(n, k, ())
def BinarySearch(lys, val):
first = 0
last = len(lys)-1
index = -1
while (first <= last) and (index == -1):
mid = (first+last)//2
if lys[mid] == val:
index = mid
else:
if val<lys[mid]:
last = mid -1
else:
first = mid +1
return index
m = 4
dim = 16
nmax = 1
a = []
for n in range(0,(nmax*m)+1):
for x in bg(n, m, 0, nmax):
a.append(x)
T = np.zeros(dim)
for ii in range(dim):
for jj in range(m):
T[ii] += primesqt[jj]*float(a[ii][jj])
ind = np.argsort(T)
T = sorted(T)
all_bs = [0,2,3,7] # i_list
# Evaluate 'mat_ee' for each 'ee' given in the list 'all_bs'
for ee in all_bs:
row = []
col = []
val = []
for ii in range(m):
for vv in range(dim):
Tg = 0
if a[vv][ii]+1 < nmax+1:
k = np.copy(a[vv])
elem = sqrt(float(k[ii]+1.0))+ee
k[ii] = k[ii]+1
# Generate tag Tg for elem != 0
for jj in range(m):
Tg += float((primesqt[jj])*k[jj])
# Search location of non-zero element in sorted T
location = BinarySearch(T, Tg)
uu = ind[location]
row.append(uu)
col.append(vv)
val.append(elem)
mat_ee = (coo_matrix((val, (row, col)), shape=(dim, dim)).tocsr()) # To be stored as mat[0], mat[2], mat[3], mat[7]
print(mat_ee)
A dictionary would allow you to reference an object using an arbitrary (but immutable) object. In your case, you could store the matrices mat_ee in each iteration of the outer loop (for ee in all_bs:) using that ee index:
csr_matrices = {}
for ee in all_bs:
# your inner loops, all the way to…
mat_ee = (coo_matrix((val, (row, col)),
shape=(dim, dim))
.tocsr())
csr_matrices[ee] = mat_ee
From that moment, you can access the elements of the dictionaries using the indices you had in all_bs:
print(csr_matrices[2])
and when you inspect the dictionary, you’ll notice it only contains the keys you specified:
print(csr_matrices.keys())
You could use a List of your objects.
items_list = list()
for something:
result = function
items_list.append(result)
I need to get a high correlation group from the correlation coefficient matrix, keep one of them and exclude the other。But I don't know how to do it gracefully and efficiently.
Here's a similar answer, but hopefully it will be done using a vector-like matrix.:
Merge arrays if they contain one or more of the same value
For example:
a = np.array([[1,0,0,0,0,1],
[0,1,0,1,0,0],
[0,0,1,0,1,1],
[0,1,0,1,0,0],
[0,0,1,0,1,0],
[1,0,1,0,0,1]])
Diagonal:
(0,0),(1,1),(2,2)...(5,5)
Other:
(0,5),(1,3),(2,4),(2,5)
These three pairs because each other contains merged into a group of:
(0,2,4,5) = (0,5),(2,4),(2,5)
So ultimately I need the output:
(I will use the results to index other data and therefore decide to keep the largest index in each group)
out = [(0,2,4,5),(1,3)]
I think the simplest approach is to take a nested loop and iterate through all the elements multiple times. I would like to have a more concise and efficient way to achieve, thank you
This is a loop implementation, I'm sorry to write it hard to see:
a = np.array([[1,0,0,0,0,1],
[0,1,0,1,0,0],
[0,0,1,0,1,1],
[0,1,0,1,0,0],
[0,0,1,0,1,0],
[1,0,1,0,0,1]])
a[np.tril_indices(6, -1)]= 0
a[np.diag_indices(6)] = 0
g = list(np.c_[np.where(a)])
p = {}; index = 1
while len(g)>0:
x = g.pop(0)
if not p:
p[index] = list(x)
for i,l in enumerate(g):
if np.in1d(l,x[0]).any()|np.in1d(l,x[1]).any():
n = list(g.pop(i))
p[index].extend(n)
else:
T = False
for key,v in p.items():
if np.in1d(v,x[0]).any()|np.in1d(v,x[1]).any():
v.extend(list(x))
T = True
if T==False:
index += 1; p[index] = list(x)
for i,l in enumerate(g):
if np.in1d(l,x[0]).any()|np.in1d(l,x[1]).any():
n = list(g.pop(i))
p[index].extend(n)
for key,v in p.items():
print key,np.unique(v)
out:
1 [0 2 4 5]
2 [1 3]
The central problem of merging / consolidating the pairs with common extrema can be solved using this answer.
Hence, the above code may be rewritten like:
a = np.array([[1,0,0,0,0,1],
[0,1,0,1,0,0],
[0,0,1,0,1,1],
[0,1,0,1,0,0],
[0,0,1,0,1,0],
[1,0,1,0,0,1]])
a[np.tril_indices(6, -1)]= 0
a[np.diag_indices(6)] = 0
g = np.c_[np.where(a)].tolist()
def consolidate(items):
items = [set(item.copy()) for item in items]
for i, x in enumerate(items):
for j, y in enumerate(items[i + 1:]):
if x & y:
items[i + j + 1] = x | y
items[i] = None
return [sorted(x) for x in items if x]
p = {i + 1: x for i, x in enumerate(sorted(consolidate(g)))}
So I want to sort a list of namedtuples using insertion sort in python, and here is my code.
def sortIt(tripods):
for i in range(1, len(tripods)):
tmp = tripods[i][3]
k = i
while k > 0 and tmp < tripods[k - 1][3]:
tripods[k] = tripods[k - 1]
k -= 1
tripods[k] = tmp
Whenever it runs it gives me a TypeError: int object is not suscriptable.
Tripods is a list of namedTuples and the index 3 of the namedTuple is what I want the list to be ordered by.
def sortIt(tripods):
for i in range(1, len(tripods)):
tmp = tripods[i]
k = i
while k > 0 and tmp[3] < tripods[k - 1][3]:
tripods[k] = tripods[k - 1]
k -= 1
tripods[k] = tmp
I have modified your code. You were assigning tripods[i][3] to tmp, which is a int. That's why the compiler warns you int is not suscriptable.
I want to write a function that determines if a sublist exists in a larger list.
list1 = [1,0,1,1,1,0,0]
list2 = [1,0,1,0,1,0,1]
#Should return true
sublistExists(list1, [1,1,1])
#Should return false
sublistExists(list2, [1,1,1])
Is there a Python function that can do this?
Let's get a bit functional, shall we? :)
def contains_sublist(lst, sublst):
n = len(sublst)
return any((sublst == lst[i:i+n]) for i in xrange(len(lst)-n+1))
Note that any() will stop on first match of sublst within lst - or fail if there is no match, after O(m*n) ops
If you are sure that your inputs will only contain the single digits 0 and 1 then you can convert to strings:
def sublistExists(list1, list2):
return ''.join(map(str, list2)) in ''.join(map(str, list1))
This creates two strings so it is not the most efficient solution but since it takes advantage of the optimized string searching algorithm in Python it's probably good enough for most purposes.
If efficiency is very important you can look at the Boyer-Moore string searching algorithm, adapted to work on lists.
A naive search has O(n*m) worst case but can be suitable if you cannot use the converting to string trick and you don't need to worry about performance.
No function that I know of
def sublistExists(list, sublist):
for i in range(len(list)-len(sublist)+1):
if sublist == list[i:i+len(sublist)]:
return True #return position (i) if you wish
return False #or -1
As Mark noted, this is not the most efficient search (it's O(n*m)). This problem can be approached in much the same way as string searching.
My favourite simple solution is following (however, its brutal-force, so i dont recommend it on huge data):
>>> l1 = ['z','a','b','c']
>>> l2 = ['a','b']
>>>any(l1[i:i+len(l2)] == l2 for i in range(len(l1)))
True
This code above actually creates all possible slices of l1 with length of l2, and sequentially compares them with l2.
Detailed explanation
Read this explanation only if you dont understand how it works (and you want to know it), otherwise there is no need to read it
Firstly, this is how you can iterate over indexes of l1 items:
>>> [i for i in range(len(l1))]
[0, 1, 2, 3]
So, because i is representing index of item in l1, you can use it to show that actuall item, instead of index number:
>>> [l1[i] for i in range(len(l1))]
['z', 'a', 'b', 'c']
Then create slices (something like subselection of items from list) from l1 with length of2:
>>> [l1[i:i+len(l2)] for i in range(len(l1))]
[['z', 'a'], ['a', 'b'], ['b', 'c'], ['c']] #last one is shorter, because there is no next item.
Now you can compare each slice with l2 and you see that second one matched:
>>> [l1[i:i+len(l2)] == l2 for i in range(len(l1))]
[False, True, False, False] #notice that the second one is that matching one
Finally, with function named any, you can check if at least one of booleans is True:
>>> any(l1[i:i+len(l2)] == l2 for i in range(len(l1)))
True
The efficient way to do this is to use the Boyer-Moore algorithm, as Mark Byers suggests. I have done it already here: Boyer-Moore search of a list for a sub-list in Python, but will paste the code here. It's based on the Wikipedia article.
The search() function returns the index of the sub-list being searched for, or -1 on failure.
def search(haystack, needle):
"""
Search list `haystack` for sublist `needle`.
"""
if len(needle) == 0:
return 0
char_table = make_char_table(needle)
offset_table = make_offset_table(needle)
i = len(needle) - 1
while i < len(haystack):
j = len(needle) - 1
while needle[j] == haystack[i]:
if j == 0:
return i
i -= 1
j -= 1
i += max(offset_table[len(needle) - 1 - j], char_table.get(haystack[i]));
return -1
def make_char_table(needle):
"""
Makes the jump table based on the mismatched character information.
"""
table = {}
for i in range(len(needle) - 1):
table[needle[i]] = len(needle) - 1 - i
return table
def make_offset_table(needle):
"""
Makes the jump table based on the scan offset in which mismatch occurs.
"""
table = []
last_prefix_position = len(needle)
for i in reversed(range(len(needle))):
if is_prefix(needle, i + 1):
last_prefix_position = i + 1
table.append(last_prefix_position - i + len(needle) - 1)
for i in range(len(needle) - 1):
slen = suffix_length(needle, i)
table[slen] = len(needle) - 1 - i + slen
return table
def is_prefix(needle, p):
"""
Is needle[p:end] a prefix of needle?
"""
j = 0
for i in range(p, len(needle)):
if needle[i] != needle[j]:
return 0
j += 1
return 1
def suffix_length(needle, p):
"""
Returns the maximum length of the substring ending at p that is a suffix.
"""
length = 0;
j = len(needle) - 1
for i in reversed(range(p + 1)):
if needle[i] == needle[j]:
length += 1
else:
break
j -= 1
return length
Here is the example from the question:
def main():
list1 = [1,0,1,1,1,0,0]
list2 = [1,0,1,0,1,0,1]
index = search(list1, [1, 1, 1])
print(index)
index = search(list2, [1, 1, 1])
print(index)
if __name__ == '__main__':
main()
Output:
2
-1
Here is a way that will work for simple lists that is slightly less fragile than Mark's
def sublistExists(haystack, needle):
def munge(s):
return ", "+format(str(s)[1:-1])+","
return munge(needle) in munge(haystack)
def sublistExists(x, y):
occ = [i for i, a in enumerate(x) if a == y[0]]
for b in occ:
if x[b:b+len(y)] == y:
print 'YES-- SUBLIST at : ', b
return True
if len(occ)-1 == occ.index(b):
print 'NO SUBLIST'
return False
list1 = [1,0,1,1,1,0,0]
list2 = [1,0,1,0,1,0,1]
#should return True
sublistExists(list1, [1,1,1])
#Should return False
sublistExists(list2, [1,1,1])
Might as well throw in a recursive version of #NasBanov's solution
def foo(sub, lst):
'''Checks if sub is in lst.
Expects both arguments to be lists
'''
if len(lst) < len(sub):
return False
return sub == lst[:len(sub)] or foo(sub, lst[1:])
def sublist(l1,l2):
if len(l1) < len(l2):
for i in range(0, len(l1)):
for j in range(0, len(l2)):
if l1[i]==l2[j] and j==i+1:
pass
return True
else:
return False
I know this might not be quite relevant to the original question but it might be very elegant 1 line solution to someone else if the sequence of items in both lists doesn't matter. The result below will show True if List1 elements are in List2 (regardless of order). If the order matters then don't use this solution.
List1 = [10, 20, 30]
List2 = [10, 20, 30, 40]
result = set(List1).intersection(set(List2)) == set(List1)
print(result)
Output
True
if iam understanding this correctly, you have a larger list, like :
list_A= ['john', 'jeff', 'dave', 'shane', 'tim']
then there are other lists
list_B= ['sean', 'bill', 'james']
list_C= ['cole', 'wayne', 'jake', 'moose']
and then i append the lists B and C to list A
list_A.append(list_B)
list_A.append(list_C)
so when i print list_A
print (list_A)
i get the following output
['john', 'jeff', 'dave', 'shane', 'tim', ['sean', 'bill', 'james'], ['cole', 'wayne', 'jake', 'moose']]
now that i want to check if the sublist exists:
for value in list_A:
value= type(value)
value= str(value).strip('<>').split()[1]
if (value == "'list'"):
print "True"
else:
print "False"
this will give you 'True' if you have any sublist inside the larger list.