Checking if two elements in a tuple have the same value - python

I have a tuple containing, say, 100 string values. I want to check whether any two string elements in the tuple are the same.
I tried to do something like this with nested loops:
def hasDuplicates(arr: tuple):
    ctr = 0
    # arr looks something like this & len(arr) == 100
    # arr = ('abc', 'bcd', 'sdf', 'abc', 'pqr', ...)
    for m in arr:
        for n in arr:
            if n == m:
                ctr += 1
    # while looping, len(arr) times every element
    # will be compared with itself
    if ctr > len(arr):
        return True
    return False
...which worked, but I think there is a better way to do this. Can anyone suggest a better solution? :)

If I understand correctly, you can just convert your tuple to a set and check whether it has the same length as the original tuple.
def has_duplicates(iterable):
    l = list(iterable) # in case iterable is an iterator
    return len(set(l)) != len(l)
Demo:
>>> tup = ('abc', 'bcd', 'sdf', 'abc', 'pqr')
>>> has_duplicates(tup)
True
>>> has_duplicates(range(100))
False
Won't work for infinite iterators :)
~edit~
A more general version that does not have to build a potentially long list and set upfront:
def has_duplicates(iterable):
    seen = set()
    for x in iterable:
        if x in seen:
            return True
        seen.add(x)
    return False
Of course, both versions require the elements of your iterable to be hashable.
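If the elements are not hashable (lists, for instance), neither version above will run. Here is a minimal sketch of a fallback, assuming the elements at least support ==. The name has_duplicates_unhashable is made up for illustration, and the approach is quadratic, so it only suits small inputs:

```python
def has_duplicates_unhashable(iterable):
    # Hypothetical helper: relies only on ==, so it also works for
    # unhashable elements such as lists, at the cost of O(n^2) comparisons.
    seen = []
    for x in iterable:
        if x in seen:  # list membership uses ==, not hashing
            return True
        seen.append(x)
    return False

print(has_duplicates_unhashable([[1, 2], [3], [1, 2]]))  # True
print(has_duplicates_unhashable([[1, 2], [3]]))          # False
```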

You can also check this using the built-in any function and the count method:
arr = ('abc', 'bcd', 'sdf', 'abc', 'pqr')
def sameStrings(arr):
    return any(arr.count(elem) > 1 for elem in arr)
print(sameStrings(arr))
Output:
True
Edit
Updating the answer with the solution proposed by @timgeb, using Counter from the collections module:
from collections import Counter
arr = ('abc', 'bcd', 'sdf', 'abc', 'pqr')
def sameStrings(arr):
    myCounter = Counter(arr)
    return max(myCounter.values()) > 1
print(sameStrings(arr))
Output:
True
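If you also want to know which values repeat, the same Counter can report them. This is only a sketch; find_duplicates is a hypothetical name, not something from the answers above:

```python
from collections import Counter

def find_duplicates(items):
    # Hypothetical helper: returns the values that occur more than once,
    # in order of first appearance (Counter keeps insertion order).
    counts = Counter(items)
    return [value for value, n in counts.items() if n > 1]

arr = ('abc', 'bcd', 'sdf', 'abc', 'pqr')
print(find_duplicates(arr))  # ['abc']
```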

has_duplicates = len(set(some_tuple)) != len(some_tuple)

Related

Getting the nth char of each string in a list of strings

Let's say I have the list ['abc', 'def', 'gh']. I need to get a string made of the first char of each string, then the second char of each string, and so on.
So the result would look like this: "adgbehcf". But the problem is that the last string in the list could have only one or two chars.
I already tried a nested for loop, but that didn't work.
Code:
n = 3 # The encryption number
for i in range(n):
    x = [s[i] for s in partiallyEncrypted]
    fullyEncrypted.append(x)
A version using itertools.zip_longest:
from itertools import zip_longest
lst = ['abc', 'def', 'gh']
strg = ''.join(''.join(item) for item in zip_longest(*lst, fillvalue=''))
print(strg)
To get an idea of why this works, it may help to have a look at
for tpl in zip_longest(*lst, fillvalue=''):
    print(tpl)
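For reference, here is what that loop sees, written as a small self-contained sketch (the variable name columns is just for illustration). Each tuple is one "column" across the strings, with '' filling in for the shorter last string:

```python
from itertools import zip_longest

lst = ['abc', 'def', 'gh']
columns = list(zip_longest(*lst, fillvalue=''))
print(columns)
# [('a', 'd', 'g'), ('b', 'e', 'h'), ('c', 'f', '')]

# Joining each column, then joining the columns, gives the result.
print(''.join(''.join(col) for col in columns))
# adgbehcf
```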
I guess you can use:
from itertools import zip_longest
l = ['abc', 'def', 'gh']
print("".join(filter(None, [i for sub in zip_longest(*l) for i in sub])))
# adgbehcf
Having:
l = ['abc', 'def', 'gh']
This would work:
s = ''
In [18]: for j in range(0, len(max(l, key=len))):
    ...:     for elem in l:
    ...:         if len(elem) > j:
    ...:             s += elem[j]
In [28]: s
Out[28]: 'adgbehcf'
Please don't use this:
(''.join(''.join(y) for y in zip(*x)) +
 ''.join(y[-1] for y in x if len(y) == max(len(j) for j in x)))

Filtering out sublist from list based on contents of entire sublist?

So here is what I have:
lst = [["111","101","000"],["1001","1100","1111"],["00","11","00"]]
And I want to filter out the sublists that contain only strings of "0"*len(string) and "1"*len(string). The result should look like this:
[["111","101","000"],["1001","1100","1111"]]
Break up the task into smaller parts. Then combine to get the solution:
# check that a string is all 0 or all 1
def check_string(s):
    size = len(s)
    return s in ('0'*size, '1'*size)
# check that a list contains only strings that satisfy check_string
def check_list(l):
    return all(check_string(s) for s in l)
lst = [["111","101","000"],["1001","1100","1111"],["00","11","00"]]
result = [l for l in lst if not check_list(l)]
Then we have
>>> print(result)
[['111', '101', '000'], ['1001', '1100', '1111']]
Here's one way to do it with regular expressions:
import re
[[y for y in x if not (re.match('1+$', y) or re.match('0+$', y))] for x in lst]
And here is a better clever way inspired by the answer here:
[[y for y in x if not (y == len(y) * y[0])] for x in lst]
With a list comprehension and a generator expression:
lst = [x for x in lst if not all(y == y[0]*len(y) for y in x)]
Note: This is better than @Tum's answer because it takes each sublist as a whole (e.g., ["111","101","000"]) rather than individually accepting or rejecting each value (e.g., accepting "101" but rejecting "111" and "000", leaving ["101"]).
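The difference between the two approaches is easiest to see side by side. The sketch below assumes non-empty strings; is_uniform, per_string and per_sublist are illustrative names, not taken from the answers:

```python
lst = [["111", "101", "000"], ["1001", "1100", "1111"], ["00", "11", "00"]]

def is_uniform(s):
    # True when s is one character repeated, e.g. "000" or "11".
    # Assumes s is non-empty.
    return s == s[0] * len(s)

# Per-string filtering: edits the inside of each sublist.
per_string = [[s for s in sub if not is_uniform(s)] for sub in lst]

# Per-sublist filtering: keeps or drops each sublist as a whole.
per_sublist = [sub for sub in lst if not all(is_uniform(s) for s in sub)]

print(per_string)   # [['101'], ['1001', '1100'], []]
print(per_sublist)  # [['111', '101', '000'], ['1001', '1100', '1111']]
```

Only the per-sublist version produces the result the question asks for.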
You can do so using the filter function as follows:
import re
orig_list = [["111","101","000"], ["1001","1100","1111"], ["00","11","00"]]
def checker(item):
    # keep the sublist if at least one string mixes 0s and 1s
    for s in item:
        if not (re.search(r'^1+$', s) or re.search(r'^0+$', s)):
            return True
    return False
new_list = list(filter(checker, orig_list))
print(new_list)
Output:
[['111', '101', '000'], ['1001', '1100', '1111']]
One more solution:
[lst[j] for j in set([k for k, i in enumerate(lst) for m in i if m[0]*len(m) != m])]
Note that m[0] raises an IndexError for an empty string, so think about what an empty string should mean in your case. You can exclude it as well.
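One way to sidestep the empty-string problem is to count distinct characters instead of indexing. This is a sketch under one assumption that the question does not settle: an empty string is treated as not being "all 0s or all 1s". The name is_uniform_safe is hypothetical:

```python
def is_uniform_safe(s):
    # set(s) holds the distinct characters of s. An empty string gives an
    # empty set, so it is reported as not uniform instead of raising.
    return len(set(s)) == 1

lst = [["111", "101", "000"], ["00", "", "11"], ["00", "11", "00"]]
result = [sub for sub in lst if not all(is_uniform_safe(s) for s in sub)]
print(result)  # [['111', '101', '000'], ['00', '', '11']]
```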

Given a list of strings, determine if one string is a prefix of another string

I want to write a Python function which checks if one string is a prefix string of another; not an arbitrary sub string of another; must be prefix. If it is, return True. For instance,
list = ['abc', 'abcd', 'xyx', 'mno']
Return True because 'abc' is a prefix of 'abcd'.
list = ['abc', 'xyzabc', 'mno']
Return False
I tried startswith() and a list comprehension, but it didn't quite work.
I'd appreciate any help or pointers.
First sort the given list by string length: a prefix is never longer than the string it prefixes, so after sorting, the shorter strings come first. Then iterate over the sorted list, comparing the current element only with the elements after it. This small optimization roughly halves the number of comparisons, since we no longer have to compare each element with every other element in both directions.
lst1 = ['abc', 'abcd', 'xyx', 'mno']
lst2 = ['abc', 'xyzabc', 'mno']
lst3 = ["abc", "abc"]
def check_list(lst):
    lst = list(set(lst)) # if you want to avoid redundant strings
    lst.sort(key=len)
    n = len(lst)
    for i in range(n):
        for j in range(i+1, n):
            if lst[j].startswith(lst[i]):
                return True
    return False
print(check_list(lst1))
print(check_list(lst2))
print(check_list(lst3))
Output:
True
False
False # in case you use lst = list(set(lst))
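A different technique from the one above, offered only as a sketch: sort lexicographically instead of by length. If a is a prefix of b, every string that sorts between them also starts with a, so it is enough to compare each string with its immediate successor. The name has_prefix_pair is made up for illustration:

```python
def has_prefix_pair(strings):
    # Hypothetical alternative to check_list: sort lexicographically and
    # compare each string only with its immediate successor.
    ordered = sorted(set(strings))  # set() drops exact duplicates
    return any(b.startswith(a) for a, b in zip(ordered, ordered[1:]))

print(has_prefix_pair(['abc', 'abcd', 'xyx', 'mno']))  # True
print(has_prefix_pair(['abc', 'xyzabc', 'mno']))       # False
print(has_prefix_pair(['abc', 'abc']))                 # False
```

This replaces the nested loops with one sort and one pass over adjacent pairs.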
Using itertools
import itertools
list1 = ["abc", "xyz", "abc123"]
products = itertools.product(list1, list1)
is_substringy = any(x.startswith(y) for x, y in products if x != y)
This isn't very optimised, but depending on the amount of data you've got to deal with, the code is fairly elegant (and short); that might trump speed in your use case.
This assumes that you don't have pure repeats in the list however (but you don't have that in your example).
import itertools
mlist = ['abc', 'abcd', 'xyx', 'mno']
# combinations of list elements, 2 by 2, without repetition
In [638]: for i,j in itertools.combinations(mlist,2):
   .....:     print (i,j)
   .....:
('abc', 'abcd')
('abc', 'xyx')
('abc', 'mno')
('abcd', 'xyx')
('abcd', 'mno')
('xyx', 'mno')
# r holds the final result: True if there is any pair where one is a prefix of the other
r=False
In [639]: for i,j in itertools.combinations(mlist,2):
   .....:     r = r or i.startswith(j) # if j is a prefix of i (logical or)
   .....:     r = r or j.startswith(i) # if i is a prefix of j
   .....:
In [640]: r
Out[640]: True
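The same pairwise check can be folded into a single any() call, which also stops at the first matching pair. A minimal sketch, with any_prefix as an illustrative name; note that exact duplicates count as prefixes of each other here:

```python
from itertools import combinations

def any_prefix(strings):
    # Each unordered pair is visited once, so test both directions.
    return any(a.startswith(b) or b.startswith(a)
               for a, b in combinations(strings, 2))

print(any_prefix(['abc', 'abcd', 'xyx', 'mno']))  # True
print(any_prefix(['abc', 'xyzabc', 'mno']))       # False
```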

How to check float string?

I have a list which consists of irregular words and float numbers. I'd like to delete all the float numbers from the list, but first I need a way to detect them. I know str.isdigit() can detect integers, but it doesn't work for float numbers. How can I do it?
My code is like this:
my_list = ['fun','3.25','4.222','cool','82.356','go','foo','255.224']
for i in my_list:
    if i.isdigit() == True:
        my_list.pop(i)
# Can't work, i.isdigit returns False
Use exception handling and a list comprehension. Don't modify the list while iterating over it.
>>> def is_float(x):
...     try:
...         float(x)
...         return True
...     except ValueError:
...         return False
>>> lis = ['fun','3.25','4.222','cool','82.356','go','foo','255.224']
>>> [x for x in lis if not is_float(x)]
['fun', 'cool', 'go', 'foo']
To modify the same list object use slice assignment:
>>> lis[:] = [x for x in lis if not is_float(x)]
>>> lis
['fun', 'cool', 'go', 'foo']
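Keep in mind that float() accepts more than plain decimals, which may or may not be what you want. The sketch below shows a few such inputs, followed by a stricter check. The pattern and the name is_plain_decimal are assumptions for illustration, not part of the answer above:

```python
import re

def is_float(x):
    try:
        float(x)
        return True
    except ValueError:
        return False

# float() also parses exponents, nan/inf, padded strings and integers:
for text in ['1e5', 'nan', 'inf', ' 3.5 ', '42']:
    print(repr(text), is_float(text))  # every line prints True

# Hypothetical stricter check: optional sign, digits, a dot, digits.
PLAIN_DECIMAL = re.compile(r'[+-]?\d+\.\d+')

def is_plain_decimal(x):
    return PLAIN_DECIMAL.fullmatch(x) is not None

print(is_plain_decimal('3.25'))  # True
print(is_plain_decimal('1e5'))   # False
print(is_plain_decimal('42'))    # False
```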
Easy way:
new_list = []
for item in my_list:
    try:
        float(item)
    except ValueError:
        new_list.append(item)
Using regular expressions:
import re
expr = re.compile(r'\d+(?:\.\d*)')
new_list = [item for item in my_list if not expr.match(item)]
A point about using list.pop():
When you use list.pop() to alter an existing list, you are shortening the list, which shifts the indices of the remaining elements. This will lead to unexpected results if you are simultaneously iterating over the list. Also, pop() takes the index as an argument, not the element, and you are iterating over the elements of my_list. It is better to create a new list as I have done above.
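Both problems are easy to reproduce. In this small sketch (the list items is made up for illustration), removing elements during iteration skips '2', and passing an element to pop() raises a TypeError:

```python
items = ['1', '2', 'a', '3']
for item in items:
    if item.isdigit():
        items.remove(item)  # shifts the remaining elements left
print(items)  # ['2', 'a']  ('2' was skipped by the loop)

try:
    ['x', 'y'].pop('x')  # pop() expects an integer index
except TypeError as exc:
    print('TypeError:', exc)
```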
A dead simple list comprehension, adding only slightly to isdigit:
my_list = [s for s in my_list if not all(c.isdigit() or c == "." for c in s)]
This will remove string representations of both int and float values (i.e. any string s where all characters c are numbers or a full stop).
As I understand OP the function should only remove floats. If integers should stay - consider this solution:
def is_float(x):
    try:
        # True only when the value has a fractional part (also for negatives)
        return int(float(x)) != float(x)
    except (ValueError, OverflowError):
        return False
my_list = ['fun', '3.25', 'cool', '82.356', 'go', 'foo', '255.224']
list_int = ['fun', '3.25', 'cool', '82.356', 'go', 'foo', '255.224', '42']
print([item for item in my_list if not is_float(item)])
print([item for item in list_int if not is_float(item)])
Output
['fun', 'cool', 'go', 'foo']
['fun', 'cool', 'go', 'foo', '42']
Regular expressions would do the trick. This code checks whether each string has the format of a float (including floats starting or ending with a decimal point), and if the string is not a float, adds it to the new list.
import re
my_list = ['fun','3.25','4.222','cool','82.356','go','foo','255.224']
new_list = []
for st in my_list:
    # fullmatch makes sure the whole string is a float, not just a part of it
    if not re.fullmatch(r'[0-9]+[.][0-9]*|[.][0-9]+', st):
        new_list.append(st)
print(new_list)
Creating a new list avoids working on the same list you are iterating on.
Ewan's answer is cleaner and quicker, I think.

Basic list operations using python

These three functions are part of my study guide, and I would greatly appreciate some assistance.
In each case, the function returns a value (so use the return statement): it does not print the value (no print statement) or mutate (change the value of) any of its arguments.
1) The repl function takes three arguments:
◦ old is any value;
◦ new is any value;
◦ xs is a list.
Example:
>>> repl('zebra', 'donkey', ['mule', 'horse', 'zebra', 'sheep', 'zebra'])
['mule', 'horse', 'donkey', 'sheep', 'donkey']
It returns a new list formed by replacing every occurrence of old in xs with new.
It must not mutate the list xs; i.e., after return from the function, the actual argument given for xs must be what it was before.
>>> friends = ['jules', 'james', 'janet', 'jerry']
>>> repl('james', 'henry', friends)
['jules', 'henry', 'janet', 'jerry']
>>> friends
['jules', 'james', 'janet', 'jerry']
2) The search function looks for a value in a list. It takes two arguments:
◦ y is the value being searched for.
◦ xs is the list being searched in.
It returns the index of the first occurrence of y in xs, if it occurs; −1 otherwise.
Examples:
>>> words = ['four', 'very', 'black', 'sheep']
>>> search('four', words)
0
>>> search('sheep', words)
3
>>> search('horse', words)
-1
3) The doubles function is given a list of numbers and returns a new list containing the doubles of every number in the given list.
Example:
>>> doubles([1, 3, 7, 10])
[2, 6, 14, 20]
It must not mutate the given list:
>>> salaries = [5000, 7500, 15000]
>>> doubles(salaries)
[10000, 15000, 30000]
>>> salaries
[5000, 7500, 15000]
This is to be done without using any list methods except append. (In particular, you may not use index or count for the search function.)
However, you can use the len function and the list operations +, *, indexing, slicing, and == for comparing lists or elements. You will need to use some of these but not all.
Any help is greatly appreciated, as I mentioned in the introduction.
So far, all I have is:
def repl (find, replacement, s):
    newString = ''
    for c in s:
        if c != find:
            newString = newString + c
        else:
            newString = newString + replacement
    return newString
def search(y, xs):
    n = len(xs)
    for i in range(n):
        if xs[i] == y:
            return i
    return -1
and....
def search(key,my_list):
    if key in my_list:
        return my_list.index(key)
    else:
        return
I'm not sure what needs to be returned after the else statement.
def repl(old,new,my_list):
    final = []
    for x in my_list:
        if x == old:  # compare by value; "is" tests identity
            final.append(new)
        else:
            final.append(x)
    return final
def search(key,my_list):
    if key in my_list:
        return my_list.index(key)
    else:
        return -1
def doubles(my_list):
    return [x*2 for x in my_list]
I suspect this lesson is about list comprehensions:
doubles = lambda my_list: [x*2 for x in my_list]
repl = lambda old_t,new_t,my_list: [x if x != old_t else new_t for x in my_list]
print(repl("cow","mouse",["cow","rat","monkey","elephant","cow"]))
print(doubles([1,2,3,4,'d']))
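For completeness, here is a sketch of all three functions that stays within the stated constraints: the only list method used is append, plus len, indexing and ==. It is one possible reading of the exercise, not an official solution:

```python
def repl(old, new, xs):
    # Build a new list so the argument xs is never mutated.
    result = []
    for x in xs:
        if x == old:
            result.append(new)
        else:
            result.append(x)
    return result

def search(y, xs):
    # Return the index of the first occurrence of y, or -1 if absent.
    for i in range(len(xs)):
        if xs[i] == y:
            return i
    return -1

def doubles(xs):
    # Return a new list with every number doubled.
    result = []
    for x in xs:
        result.append(x * 2)
    return result

friends = ['jules', 'james', 'janet', 'jerry']
print(repl('james', 'henry', friends))  # ['jules', 'henry', 'janet', 'jerry']
print(friends)                          # ['jules', 'james', 'janet', 'jerry']
print(search('sheep', ['four', 'very', 'black', 'sheep']))  # 3
print(doubles([1, 3, 7, 10]))           # [2, 6, 14, 20]
```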
