I am struggling to create a function that searches a list to see whether any of its strings is a substring of any other string in the same list. If a substring is found, it should return that string's index; if none is found, it should return False.
For example.
lst1 = ["red", "yellow", "green", "yellowhammer"]
lst2 = ["red", "yellow", "green"]
In this example, lst1 would return a value of 1 as yellow is a substring of yellowhammer and lst2 would return a value of False as there are no substrings.
I have tried the following
templst = lst1
for i in templst:
    if i in lst1:
        return i
    else:
        return False
However, this does not work: each string always finds itself, so the code returns a value even when there are no substrings and it should return False.
The following code should accomplish what you need. The details of how this is accomplished are commented within.
# Lists that OP provided
lst1 = ["red", "yellow", "green", "yellowhammer"]
lst2 = ["red", "yellow", "green"]
# Function that checks the list
def checkList(myList):
    # Create a variable to hold the concatenated string
    total = ""
    # Build the concatenated string
    for item in myList:
        total += item
    # Loop through the list again
    for i, item in enumerate(myList):
        # Count the amount of times each item appears in the concatenation
        curr = total.count(item)
        # If its more than one, since it will always appear once
        if(curr > 1):
            # Return its index
            return i
    # Otherwise, return False
    return False
# Test the two test samples
list_1_ans = checkList(lst1)
list_2_ans = checkList(lst2)
# Print out results
print("First Test Answer: {} | Second Test Answer: {}".format(list_1_ans, list_2_ans))
Yields:
First Test Answer: 1 | Second Test Answer: False
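One caveat with the concatenation approach: a string can also match across the boundary between two neighbouring items, which is not a substring of any single item. The sketch below reproduces the concatenation check under a hypothetical name (check_list_concat) and compares it with a pairwise check (check_list_pairwise, also a hypothetical name) on an input chosen to show the difference. It is an illustration, not a replacement for the answer above.

```python
def check_list_concat(items):
    # Same idea as checkList above: count occurrences in the joined string.
    total = "".join(items)
    for i, item in enumerate(items):
        if total.count(item) > 1:
            return i
    return False


def check_list_pairwise(items):
    # Compare each item only against the other individual items.
    for i, item in enumerate(items):
        if any(i != j and item in other for j, other in enumerate(items)):
            return i
    return False


boundary_case = ["ab", "cd", "bc"]  # joined: "abcdbc", "bc" spans "ab" + "cd"
print(check_list_concat(boundary_case))    # 2 (match across a boundary)
print(check_list_pairwise(boundary_case))  # False
```

On the two lists from the question both functions agree (1 and False).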
First, to create a function, you want to use the def keyword. This is a function that takes a list of strings as its input and returns either a bool or an int, so with type hints it'll look like:
from typing import List, Union
def index_of_substring(strings: List[str]) -> Union[bool, int]:
    """The first index whose item is a substring of a different
    item in the same list, or False if no such item exists."""
    # implement me
    pass
Now we need to implement the body of the function in terms of its strings argument. Since we want to return the index within the list, it makes sense to iterate over the range of list indices:
for i in range(len(strings)):
Each i in this loop is an index (e.g. in your lst1 it'll be a number from 0 to 3). Now we want to ask the question "is the item at this index a substring of any item other than itself?"
To answer that question, we want to ask about other indexes in the list that we can compare to our current index i; we'll call those other indexes j:
for j in range(len(strings))
and the conditions we want to satisfy relative to i are:
strings[i] in strings[j] and i != j
We can put this all together in a list comprehension that will give us a list which tells us which items in the range satisfy that and condition:
[strings[i] in strings[j] and i != j for j in range(len(strings))]
and we want to know if any of those items are True:
any([strings[i] in strings[j] and i != j for j in range(len(strings))])
If they are, we want to return i. We want to repeat this check for each i, and if none of them are true, we want to return False. The complete function looks like:
def index_of_substring(strings: List[str]) -> Union[bool, int]:
    for i in range(len(strings)):
        if any([strings[i] in strings[j] and i != j for j in range(len(strings))]):
            return i
    return False
and we can call it like this:
print(index_of_substring(lst1))
print(index_of_substring(lst2))
which prints:
1
False
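One thing to keep in mind when calling a function with this return type: index 0 and False are both falsy in Python (and 0 == False is True), so a caller writing "if not result" cannot tell "found at index 0" from "not found". A possible variant, sketched here under the assumption that changing the return value is acceptable, returns None instead of False. The name index_of_substring_or_none is hypothetical.

```python
from typing import List, Optional


def index_of_substring_or_none(strings: List[str]) -> Optional[int]:
    """First index whose item is a substring of a different item, else None."""
    for i, candidate in enumerate(strings):
        if any(i != j and candidate in other for j, other in enumerate(strings)):
            return i
    return None


result = index_of_substring_or_none(["yellow", "yellowhammer"])
if result is not None:  # an explicit None check keeps index 0 distinguishable
    print("substring at index", result)
```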
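The following function returns the index of the first string that is a substring of a different string in the list, or False if there is none:
def check_subs(lst1):
    # indices of the strings that are contained in some other (different) string
    answer = [i for i, x in enumerate(lst1) if any(x in y and x != y for y in lst1)]
    if answer:
        return answer[0]
    else:
        return False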
If it were just a matter of checking whether the letters in a test_string are also in a control_string, I would not have had this problem. I would simply use the code below.
if set(test_string.lower()) <= set(control_string.lower()):
    return True
But I also face the rather convoluted task of discerning whether the overlapping letters in the control_string are in the same sequential order as those in test_string.
For example,
test_string = 'Dih'
control_string = 'Danish'
True
test_string = 'Tbl'
control_string = 'Bottle'
False
I thought of using a for loop to compare the indices of the letters, but it is quite hard to think of the appropriate algorithm.
for i in test_string.lower():
    for j in control_string.lower():
        if i==j:
            index_factor = control_string.index(j)
My plan is to compare the primary index factor to the next factor, and if primary index factor turns out to be larger than the other, the function returns False.
I am stuck on how to compare those index_factors in a for loop.
How should I approach this problem?
You could just join the characters in your test string to a regular expression, allowing for any other characters .* in between, and then re.search that pattern in the control string.
>>> import re
>>> test, control = "Dih", "Danish"
>>> re.search('.*'.join(test), control) is not None
True
>>> test, control = "Tbl", "Bottle"
>>> re.search('.*'.join(test), control) is not None
False
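If the test string may contain characters that are special in regular expressions (such as "." or "*"), joining it directly would change the meaning of the pattern. A minimal sketch, assuming literal matching is what is wanted, escapes each character with re.escape first. The function name is_subsequence_regex is hypothetical, and re.DOTALL is added so that ".*" can also skip over newlines.

```python
import re


def is_subsequence_regex(test, control):
    # Escape every character so it is matched literally, then allow
    # arbitrary characters between consecutive test characters.
    pattern = ".*".join(re.escape(ch) for ch in test)
    return re.search(pattern, control, flags=re.DOTALL) is not None


print(is_subsequence_regex("Dih", "Danish"))  # True
print(is_subsequence_regex("Tbl", "Bottle"))  # False
print(is_subsequence_regex("a.c", "abc"))     # False: "." must match literally
```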
Without using regular expressions, you can create an iter from the control string and use two nested loops [1]: break from the inner loop when the character is found, and return False from its else clause when it is not, until all the characters in test are found in control. It is important to create the iter, even though control is already iterable, so that the inner loop will continue where it last stopped.
def check(test, control):
    it = iter(control)
    for a in test:
        for b in it:
            if a == b:
                break
        else:
            return False
    return True
You could even do this in one (well, two) lines using all and any:
def check(test, control):
    it = iter(control)
    return all(any(a == b for b in it) for a in test)
Complexity for both approaches should be O(n), with n being the max number of characters.
[1] This is conceptually similar to what @jpp does, but IMHO a bit clearer.
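The same shared-iterator idea can be written with the in operator, since a membership test on an iterator consumes items until it finds a match (or runs out). The sketch below uses a hypothetical name, is_subsequence, and does no case folding; lowercase the inputs first if that is needed.

```python
def is_subsequence(test, control):
    remaining = iter(control)
    # Each "ch in remaining" advances the iterator past the matched
    # character, so the next search starts where the last one stopped.
    return all(ch in remaining for ch in test)


print(is_subsequence("dih", "danish"))  # True
print(is_subsequence("tbl", "bottle"))  # False
```

Because the iterator is never rewound, repeated letters in test need repeated letters in control: "aa" is found in "aba" but not in "a".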
Here's one solution. The idea is to iterate through the control string first and yield a value if it matches the next test character. If the total number of matches equals the length of test, then your condition is satisfied.
def yield_in_order(x, y):
    iterstr = iter(x)
    current = next(iterstr)
    for i in y:
        if i == current:
            yield i
            current = next(iterstr)
def checker(test, control):
    x = test.lower()
    return sum(1 for _ in zip(x, yield_in_order(x, control.lower()))) == len(x)
test1, control1 = 'Tbl', 'Bottle'
test2, control2 = 'Dih', 'Danish'
print(checker(test1, control1)) # False
print(checker(test2, control2)) # True
@tobias_k's answer has a cleaner version of this. If you want some additional information, e.g. how many letters align before a break is found, you can trivially adjust the checker function to return sum(1 for _ in zip(x, yield_in_order(...))).
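As a rough illustration of that "how many letters align" variant, here is a self-contained sketch that counts the leading test characters found in order. It is written with a plain iterator rather than the generator above, and the name matched_prefix_length is hypothetical.

```python
def matched_prefix_length(test, control):
    remaining = iter(control.lower())
    count = 0
    for ch in test.lower():
        # "ch in remaining" consumes control up to and including the match
        if ch not in remaining:
            break
        count += 1
    return count


print(matched_prefix_length("Tbl", "Bottle"))  # 1: only "t" is found in order
print(matched_prefix_length("Dih", "Danish"))  # 3: every letter aligns
```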
You can use find(letter, last_index) to find the next occurrence of the desired letter at or after the position reached so far.
def same_order_in(test, control):
    index = 0
    control = control.lower()
    for i in test.lower():
        index = control.find(i, index)
        if index == -1:
            return False
        # index += 1 # uncomment to check multiple occurrences of same letter in test string
    return True
If the test string has duplicate letters, for example:
test_string = 'Diih'
control_string = 'Danish'
With the line left commented out, same_order_in(test_string, control_string) == True
and with the line uncommented, same_order_in(test_string, control_string) == False
Recursion is one way to solve such problems.
Here's one that checks for sequential ordering.
def sequentialOrder(test_string, control_string, len1, len2):
    if len1 == 0: # base case 1
        return True
    if len2 == 0: # base case 2
        return False
    if test_string[len1 - 1] == control_string[len2 - 1]:
        return sequentialOrder(test_string, control_string, len1 - 1, len2 - 1) # Recursion
    return sequentialOrder(test_string, control_string, len1, len2-1)
test_string = 'Dih'
control_string = 'Danish'
print(sequentialOrder(test_string, control_string, len(test_string), len(control_string)))
Outputs:
True
and False for
test_string = 'Tbl'
control_string = 'Bottle'
Here's an iterative approach that does the same thing:
def sequentialOrder(test_string,control_string,len1,len2):
    i = 0
    j = 0
    while j < len1 and i < len2:
        if test_string[j] == control_string[i]:
            j = j + 1
        i = i + 1
    return j==len1
test_string = 'Dih'
control_string = 'Danish'
print(sequentialOrder(test_string,control_string,len(test_string) ,len(control_string)))
An elegant solution using a generator:
def foo(test_string, control_string):
    if all(c in control_string for c in test_string):
        gen = (char for char in control_string if char in test_string)
        if all(x == test_string[i] for i, x in enumerate(gen)):
            return True
    return False
print(foo('Dzn','Dahis')) # False
print(foo('Dsi','Dahis')) # False
print(foo('Dis','Dahis')) # True
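First check that all the letters in test_string are contained in control_string. Then check that the shared letters appear in control_string in the same order as in test_string. Note that this assumes no shared letter occurs more than once in control_string; with repeated letters it can give a wrong answer or raise an IndexError.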
A simple way is making use of the key argument in sorted, which serves as a key for the sort comparison:
def seq_order(l1, l2):
    intersection = ''.join(sorted(set(l1) & set(l2), key = l2.index))
    return intersection == l1
This computes the intersection of the two sets of letters and sorts it by each letter's first position in the longer string. You then only need to compare the result with the shorter string to see whether they are the same.
The function returns True or False accordingly. Using your examples:
seq_order('Dih', 'Danish')
#True
seq_order('Tbl', 'Bottle')
#False
seq_order('alp','apple')
#False
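Because the set-based version collapses repeated letters, it reports False whenever the shorter string contains the same letter twice, even if the letters do appear in order. The sketch below shows this next to a position-tracking alternative built on str.find. The name seq_order_with_repeats is hypothetical and is not part of the answer above.

```python
def seq_order(l1, l2):
    # set-based version from the answer above
    intersection = ''.join(sorted(set(l1) & set(l2), key=l2.index))
    return intersection == l1


def seq_order_with_repeats(l1, l2):
    pos = 0
    for ch in l1:
        pos = l2.find(ch, pos)  # search only at or after the current position
        if pos == -1:
            return False
        pos += 1                # step past the matched character
    return True


print(seq_order('all', 'ball'))               # False: the set keeps one 'l'
print(seq_order_with_repeats('all', 'ball'))  # True
```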
I've been stuck on this for a while and I keep running into problems. I'm trying to create a function that returns True if at least one pair of adjacent elements in a list is equal.
Test cases:
[1, 2, 3] -> False
[1, 2, 2, 3] -> True
[2, 6, 3, 6] -> False
['a', 'b', 'c', 'd', 'd'] -> True
def equal_adjacent_elements(l):
    for x in range(len(l)):
        if l[x] == l[x+1] or l[x] == l[x-1]:
            return True
        else:
            return False
The problems I run into are assertion errors and I believe it's because of my loop. Once I find a pair that is equal the returned value won't stay the same because my loops will evaluate the next values in the list. So I just need to find at least one pair, I don't know how I would do that though.
You can zip the list with itself offset by 1 and use any to short-circuit the find-one pattern:
def equal_adjacent_elements(l):
    return any(x == y for x, y in zip(l, l[1:]))
I made a few changes. It should work now.
def equal_adjacent_elements(l):
    for x in range(len(l)-1):
        if l[x] == l[x+1]:
            return True
    return False
or a shorter one using any:
def equal_adjacent_elements(l):
    return any( l[x] == l[x+1] for x in range(len(l)-1) )
There are two problems here:
- the indices can go out of bounds;
- you return False as soon as you meet two consecutive elements that are not equal, whereas you should only return False after the for loop.
The index problem is that x ranges from 0 up to (but excluding) len(l). That means x-1 ranges from -1 (which refers to the last element of the list) to len(l)-2, and x+1 ranges from 1 to len(l) inclusive. The list has no element at index len(l), so you index outside the list bounds, which is an error.
A more Pythonic approach, however, is to zip(..) the list l with the "tail" of the list (the list omitting the first element), so zip(l, l[1:]).
We can iterate over every pair x,y of l and l[1:]. In case x == y, we have found such a pair and return True.
In case we fail to find such a pair, we eventually return False.
So a correct implementation is:
def equal_adjacent_elements(l):
    for x,y in zip(l, l[1:]):
        if x == y:
            return True
    return False
Python however has a builtin any, that will return True from the moment the iterable we give it contains at least one True; and False if no element has truthiness True.
So we can convert the above code to something using the any builtin:
def equal_adjacent_elements(l):
    return any(x == y for x,y in zip(l, l[1:]))
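The slice l[1:] builds a copy of almost the whole list. For large lists, one possible variation (a sketch, assuming l is a list or another sequence that can be iterated more than once) is to use itertools.islice for the offset view instead:

```python
from itertools import islice


def equal_adjacent_elements(l):
    # islice(l, 1, None) iterates from the second element without copying
    return any(x == y for x, y in zip(l, islice(l, 1, None)))


print(equal_adjacent_elements([1, 2, 3]))                  # False
print(equal_adjacent_elements([1, 2, 2, 3]))               # True
print(equal_adjacent_elements([2, 6, 3, 6]))               # False
print(equal_adjacent_elements(['a', 'b', 'c', 'd', 'd']))  # True
```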
I'm kind of new to Python, but I hope this is easy enough:
def equal_adjacent_elements(l):
    try:
        for i in range(0, len(l)):
            if l[i] == l[i+1]:
                return True
    except IndexError:
        return False
    # reached only for an empty list, where the loop body never runs
    return False
I have this exercise: from a list, return two lists, one with the positive numbers and the other with the negative numbers.
My code:
def fuc(list):
    negatives = []
    positives = []
    for i in list:
        if i > 0:
            positives.append(i)
            print(i)
        else:
            negatives.append(i)
            print(i)
print(fuc([1,-1,2,-2,3,-3,4,-4,5,-5]))
This code doesn't return two lists (negative and positive). I want to know how to get two lists from the original list.
Adding the else keyword and returning the values would work, but there's a nicer approach using a ternary expression to determine which list to append to:
def func(l):
    negatives = []
    positives = []
    for i in l:
        (positives if i >= 0 else negatives).append(i)
    return negatives,positives
That is, if you consider 0 as positive, else you'd have to filter it out and the interest would be limited.
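If zero should belong to neither list, one way to make that explicit is to use two list comprehensions with strict comparisons. This is only a sketch; the name split_by_sign is hypothetical, and it passes over the input twice.

```python
def split_by_sign(numbers):
    negatives = [n for n in numbers if n < 0]
    positives = [n for n in numbers if n > 0]
    return negatives, positives  # zeros end up in neither list


negatives, positives = split_by_sign([1, -1, 2, -2, 0])
print(negatives)  # [-1, -2]
print(positives)  # [1, 2]
```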
If you have the choice of using numpy you can do something like this.
import numpy as np
def fuc(myList):
    myList=np.array(myList)
    neg=myList[np.less(myList,0)]
    pos=myList[np.greater(myList,0)]
    return list(neg),list(pos)
I've written a small program:
def check(xrr):
    """ goes through the list and returns True if the list
    does not contain common pairs, IE ([a,b,c],[c,d,e]) = true
    but ([a,b,c],[b,a,c]) = false, note the lists can be longer than 2 tuples"""
    x = xrr[:]
    #sorting the tuples
    sorted(map(sorted,x))
    for i in range(len(x)-1):
        for j in range(len(x)-1):
            if [x[i]] == [x[i+1]] and [x[j]] == [x[j+1]]:
                return False
    return True
But it doesn't seem to work right. This is probably something extremely basic, but after a couple of days of trying on and off, I can't get my head around where the error is.
Thanks in advance
There are so many problems with your code as others have mentioned. I'll try to explain how I would implement this function.
It sounds like what you want to do is actually this: You generate a list of pairs from the input sequences and see if there are any duplicates among the pairs. When you formulate the problem like this it gets much easier to implement.
First we need to generate the pairs. It can be done in many ways, the one you would probably do is:
def pairs( seq ):
    ret = []
    # go to the 2nd last item of seq
    for k in range(len(seq)-1):
        # append a pair
        ret.append((seq[k], seq[k+1]))
    return ret
Now we want to treat (a,b) and (b,a) as the same tuple, so we simply sort the tuples:
def sorted_pairs( seq ):
    ret = []
    for k in range(len(seq)-1):
        x,y = (seq[k], seq[k+1])
        if x <= y:
            ret.append((x,y))
        else:
            ret.append((y,x))
    return ret
Now solving the problem is pretty straight forward. We just need to generate all these tuples and add them to a set. Once we see a pair twice we are done:
def has_common_pairs( *seqs ):
    """ checks if there are any common pairs among any of the seqs """
    # store all the pairs we've seen
    seen = set()
    for seq in seqs:
        # generate pairs for each seq in seqs
        pair_seq = sorted_pairs(seq)
        for pair in pair_seq:
            # have we seen the pair before?
            if pair in seen:
                return True
            seen.add(pair)
    return False
Now the function you were trying to implement is quite simple:
def check(xxr):
    return not has_common_pairs(*xxr)
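To see the pieces working together, here is a compact, self-contained sketch of the same idea with a few sample calls. It folds the pair generation into has_common_pairs using zip, which is a stylistic choice made for this example rather than the layout used above, and it assumes the inputs are lists of mutually comparable items.

```python
def has_common_pairs(*seqs):
    seen = set()
    for seq in seqs:
        for a, b in zip(seq, seq[1:]):
            # normalise so that (a, b) and (b, a) count as the same pair
            pair = (a, b) if a <= b else (b, a)
            if pair in seen:
                return True
            seen.add(pair)
    return False


def check(xrr):
    return not has_common_pairs(*xrr)


print(check([["a", "b", "c"], ["c", "d", "e"]]))  # True: no shared pair
print(check([["a", "b", "c"], ["b", "a", "c"]]))  # False: (a, b) appears twice
print(check([["a", "b", "a"]]))                   # False: repeat within one list
```

The last call shows a design consequence worth knowing: a pair repeated inside a single sequence also counts as a duplicate.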
PS: You can generalize the sorted_pairs function to work on any kind of iterable, not only those that support indexing. For completeness' sake I'll paste it below, but you don't really need it here and it's harder to understand:
def sorted_pairs( seq ):
    """ yield pairs (fst, snd) generated from seq
    where fst <= snd for all fst, snd"""
    it = iter(seq)
    try:
        fst = next(it)
    except StopIteration:
        # empty input: there are no pairs to yield
        return
    for snd in it:
        if fst <= snd:
            yield fst, snd
        else:
            yield snd, fst
        fst = snd
I would recommend using a set for this:
def check(xrr):
    s = set()
    for t in xrr:
        u = tuple(sorted(t))
        if u in s:
            return False
        s.add(u)
    return True
This way, you don't need to sort the whole list and you stop when the first duplicate is found.
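If stopping early does not matter, the same normalise-then-deduplicate idea can be sketched as a single comparison of sizes. This variant always processes the whole input and assumes xrr supports len() (for example a list of tuples), so it trades the early exit for brevity.

```python
def check(xrr):
    # Each tuple is reduced to its sorted form; duplicates collapse in the set.
    normalized = {tuple(sorted(t)) for t in xrr}
    return len(normalized) == len(xrr)


print(check([(1, 2, 3), (2, 3, 5)]))  # True: no duplicates
print(check([(1, 2, 3), (3, 2, 1)]))  # False: same elements, different order
```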
There are several errors in your code. One is that sorted returns a new list, and you just drop the return value. Another one is that you have two nested loops over your data where you would need only one. Here is the code that makes your approach work:
def check(xrr):
    x = sorted(map(sorted,xrr))
    for i in range(len(x)-1):
        if x[i] == x[i+1]:
            return False
    return True
This could be shortened to
def check(xrr):
    x = sorted(map(sorted,xrr))
    return all(a != b for a, b in zip(x[:-1], x[1:]))
But note that the first code I gave will be more efficient.
BTW, a list in Python is [1, 2, 3], while a tuple is (1, 2, 3).
sorted doesn't alter the source, it returns a new list.
def check(xrr):
    # list(...) so the result supports len() and slicing on Python 3
    xrrs = list(map(sorted, xrr))
    for i in range(len(xrrs)):
        if xrrs[i] in xrrs[i+1:]: return False
    return True
I'm not sure that's what's being asked, but if I understood it correctly, I'd write:
def check(lst):
    return any(not set(seq).issubset(lst[0]) for seq in lst[1:])
print(check([(1, 2, 3), (2, 3, 5)])) # True
print(check([(1, 2, 3), (3, 2, 1)])) # False
Here is a more general solution. Note that it finds duplicates rather than 'non-duplicates'; that is clearer than returning a negated result.
def has_duplicates(seq):
    seen = set()
    for item in seq:
        if hasattr(item, '__iter__'):
            item = tuple(sorted(item))
        if item in seen:
            return True
        seen.add(item)
    return False
This is a more general solution for finding duplicates:
def get_duplicates(seq):
    seen = set()
    duplicates = set()
    for item in seq:
        item = tuple(sorted(item))
        if item in seen:
            duplicates.add(item)
        else:
            seen.add(item)
    return duplicates
Also, it is better to look for duplicates than for 'non-duplicates'; it saves a lot of confusion. You're better off using a general, readable solution than single-purpose functions.