Python: recognising if a list has no duplicate numbers

I'm currently working on a game which uses lists and gives users choices depending on whether or not there are duplicates in a list (a really basic text-based game). However, I can't seem to code something that recognises when there are no duplicates without looping through and checking whether the count for each number is greater than 1. I really just need something that checks that the list contains no duplicates so that I can then write the rest of the program.
So the pseudocode would look something like this:
numbers = [1, 3, 5, 6]
check list for duplicates
if no duplicates:
    do x

Use the set function with sorted:
if sorted(set(y)) == sorted(y):
    pass
set removes duplicates from the given list, so it's easy to check whether your list has duplicates. Sorting both sides keeps the comparison independent of the order in which the user entered the numbers, because a set does not preserve the original order.
A simpler solution, if you don't need sorted, is:
len(y) != len(set(y))
It's faster because you don't need to sort both lists. It returns True if the list contains duplicates and False otherwise.
Check for duplicates in a flat list

You can check the length of the set of the list and compare with its regular length:
l = [1, 3, 32, 4]
if len(set(l)) == len(l):
    pass
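If the list can get long, a loop that stops at the first repeat may be worth having. The sketch below is not from either answer: has_duplicates is a hypothetical helper name, and it assumes the items are hashable (plain numbers are).

```python
def has_duplicates(items):
    """Return True as soon as a repeated value is seen (items must be hashable)."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False


numbers = [1, 3, 5, 6]
if not has_duplicates(numbers):
    print("no duplicates")  # stands in for "do x" from the question
```

For short lists the len(set(l)) == len(l) comparison is just as good and easier to read.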

Related

python: Simplest way to sort a list by a calculated value

If I have a master_list (a list of lists), I want to know the simplest way to sort it by some value (I'll call it score). The score (a float) is derived by calling a function that calculates it based on the items in master_list[i] (a list).
The way I was going about it is like this:
for i in range(len(master_list)):
    # call get_score(master_list[i]) (function that calculates score for master_list[i]) and
    # insert to index 0 in master_list[i]
    master_list[i].insert(0, get_score(master_list[i]))
sorted_list = sorted(master_list)
for i in range(len(master_list)):
    master_list[i].pop(0)
My function get_score() returns a single float for one of the lists in master_list.
It is important not to modify the original master_list, which is why I am removing the score from master_list[i]. What I wish to know is whether there is any way to accomplish this without adding the score to each master_list[i] and then removing it.
I also tried something like this but I don't believe it would work:
score_sorted = sorted(master_list, key=get_score(item for item in master_list))
Expected output:
master_list = [['b', 1], ['a', 2], ['c', 3]]
If the scores for master_list[i] are 3, 2, 1 for the items respectively, then the output would be:
master_list_with_score = [[3, 'b', 1], [2, 'a', 2], [1, 'c', 3]]
sorted_by_score_master_list = [['c', 3], ['a', 2], ['b', 1]]
Sorry about the formatting, it's my first time posting. Let me know if there is need for any clarification
You're just supposed to provide the function itself; sorted will call it on each list element for you.
sorted(master_list, key=get_score)
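Here is a small runnable sketch of that call using the sample data from the question. The body of get_score is a made-up stand-in (the real scoring function was never posted), chosen only so that the scores come out as 3, 2, 1.

```python
def get_score(item):
    # Hypothetical stand-in for the asker's scoring function:
    # yields 3.0, 2.0 and 1.0 for the three sample rows below.
    return 4.0 - item[1]


master_list = [['b', 1], ['a', 2], ['c', 3]]

# sorted() calls get_score once per element and returns a new list.
sorted_by_score = sorted(master_list, key=get_score)

print(sorted_by_score)  # [['c', 3], ['a', 2], ['b', 1]]
print(master_list)      # [['b', 1], ['a', 2], ['c', 3]] (left untouched)
```

Because sorted builds a new list and the scores are never stored in the rows, there is nothing to insert and pop afterwards.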
I would keep a separate list of scores.
scores = [get_score(item) for item in master_list]  # one score per item, in the same order as master_list
sorted(zip(scores, master_list))  # pairs sorted by score (ties fall back to comparing the items)
If you want a separate sorted list:
sorted_master_list = [item for (score, item) in sorted(zip(scores, master_list))]

Organize/Formatting the python code to one line

Is there any way to rewrite the Python code below in one line?
for i in range(len(main_list)):
    if main_list[i] != []:
        for j in range(len(main_list[i])):
            main_list[i][j][6] = main_list[i][j][6].strftime('%Y-%m-%d')
Something like this:
[main_list[i][j][6]=main_list[i][j][6].strftime('%Y-%m-%d') for i in range(len(main_list)) if main_list[i] != [] for j in range(len(main_list[i]))]
I got a SyntaxError for this.
Actually, I'm trying to store all the values fetched from a table in one list. Since the table contains a date datatype, I need to convert it to a string, as I ran into a malformed string error.
So my approach is to convert that element of the list from datetime.date to str, and I got it working. I just wanted it to work in one line.
Use the explicit for loop. There's no better option.
A list comprehension is used to create a new list, not to modify certain elements of an existing list.
You may be able to update values via a list comprehension, e.g. [L.__setitem__(i, 'some_value') for i in range(len(L))], but this is not recommended as you are using a side-effect and in the process creating a list of None values which you then discard.
You could also write a convoluted list comprehension with a ternary statement indicating when you meet the 6th element in a 3rd nested sublist. But this will make your code difficult to maintain.
In short, use the for loop.
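For what it's worth, the explicit loop can be tidied by iterating over the elements directly instead of over indices. This is only a sketch: the sample rows are invented, and the only thing taken from the question is that the date sits at index 6.

```python
import datetime

# Hypothetical sample data: each row keeps a date at index 6, as in the question.
main_list = [
    [],
    [[1, 2, 3, 4, 5, 6, datetime.date(2016, 8, 18)]],
    [[7, 8, 9, 10, 11, 12, datetime.date(2017, 1, 2)]],
]

for sublist in main_list:  # an empty sublist simply yields no rows
    for row in sublist:
        row[6] = row[6].strftime('%Y-%m-%d')

print(main_list[1][0][6])  # 2016-08-18
```

The check for empty sublists disappears because looping over an empty list does nothing.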
You're getting a syntax error because you're not allowed to perform assignments within a list comprehension: an assignment is a statement, and a comprehension only accepts expressions. This also discourages overly complex list comprehensions in favour of for loops.
Obviously you shouldn't do this on one line, but this is how to do it:
import datetime
# Example from your comment:
type1 = "some type"
main_list = [[], [],
             [[1, 2, 3, datetime.date(2016, 8, 18), type1],
              [3, 4, 5, datetime.date(2016, 8, 18), type1]], [], []]
def fmt_times(lst):
    """Format the fourth value of each element of each non-empty sublist"""
    for i in range(len(lst)):
        if lst[i] != []:
            for j in range(len(lst[i])):
                lst[i][j][3] = lst[i][j][3].strftime('%Y-%m-%d')
    return lst
def fmt_times_one_line(main_list):
    """Format the fourth value of each element of each non-empty sublist"""
    return [[] if main_list[i] == [] else [[main_list[i][j][k] if k != 3 else main_list[i][j][k].strftime('%Y-%m-%d') for k in range(len(main_list[i][j]))] for j in range(len(main_list[i])) ] for i in range(len(main_list))]
import copy
# Deep copy needed because fmt_times modifies the sublists.
assert fmt_times(copy.deepcopy(main_list)) == fmt_times_one_line(main_list)
The list comprehension is a functional thing. If you know how map() works in python or javascript then it's the same thing. In a map() or comprehension we generally don't mutate the data we're mapping over (and python discourages attempting it) so instead we recreate the entire object, substituting only the values we wanted to modify.
One line?
main_list = convert_list(main_list)
You will have to put a few more lines somewhere else though:
import datetime
def convert_list(main_list):
    # Walk the (possibly nested) list in place, formatting every date found.
    for i, ml in enumerate(main_list):
        if isinstance(ml, list) and len(ml) > 0:
            main_list[i] = convert_list(ml)
        elif isinstance(ml, datetime.date):
            main_list[i] = ml.strftime('%Y-%m-%d')
    return main_list
You might be able to whack this together with a list comprehension but it's a terrible idea (for reasons better explained in the other answer).

How to count number of unique lists within list?

I've tried using Counter and itertools, but since a list is unhashable, they don't work.
My data looks like this: [ [1,2,3], [2,3,4], [1,2,3] ]
I would like to know that the list [1,2,3] appears twice, but I can't figure out how to do this. I was thinking of just converting each list to a tuple, then hashing with that. Is there a better way?
>>> from collections import Counter
>>> li=[ [1,2,3], [2,3,4], [1,2,3] ]
>>> Counter(str(e) for e in li)
Counter({'[1, 2, 3]': 2, '[2, 3, 4]': 1})
The method that you state also works, as long as there are no nested mutables in each sublist (such as [ [1,2,3], [2,3,4,[11,12]], [1,2,3] ]):
>>> Counter(tuple(e) for e in li)
Counter({(1, 2, 3): 2, (2, 3, 4): 1})
If you do have other unhashable types nested in the sublists, use the str or repr method, since that deals with all nested sublists as well. Or recursively convert everything to tuples (more work).
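The recursive conversion is not much more work in practice. A minimal sketch, assuming the only unhashable values in the data are lists (to_hashable is a hypothetical helper name, not something from the standard library):

```python
from collections import Counter


def to_hashable(value):
    """Recursively turn lists into tuples; leave every other value as it is."""
    if isinstance(value, list):
        return tuple(to_hashable(v) for v in value)
    return value


li = [[1, 2, 3], [2, 3, 4, [11, 12]], [1, 2, 3]]
counts = Counter(to_hashable(e) for e in li)

print(counts[(1, 2, 3)])  # 2
```

Unlike the str approach, the keys stay real tuples, so they can be converted back or compared with other data without parsing strings.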
ll = [ [1,2,3], [2,3,4], [1,2,3] ]
print(len(set(map(tuple, ll))))
Also, if you wanted to count the occurrences of a unique* list:
print(ll.count([1,2,3]))
(*unique by value, not by reference)
I think using the Counter class on tuples, like
Counter(tuple(item) for item in li)
will be optimal in terms of elegance and "Pythonicity": it's probably the shortest solution, it's perfectly clear what you want to achieve and how it's done, and it combines standard tools (and thus avoids reinventing the wheel).
The only performance drawback I can see is, that every element has to be converted to a tuple (in order to be hashable), which more or less means that all elements of all sublists have to be copied once. Also the internal hash function on tuples may be suboptimal if you know that list elements will e.g. always be integers.
In order to improve on performance, you would have to:
1. Implement some kind of hash algorithm working directly on lists (more or less reimplementing the hashing of tuples but for lists)
2. Somehow reimplement the Counter class in order to use this hash algorithm and provide some suitable output (this class would probably use a dictionary using the hash values as key and a combination of the "original" list and the count as value)
At least the first step would need to be done in C/C++ in order to match the speed of the internal hash function. If you know the type of the list elements you could probably even improve the performance.
As for the Counter class, I do not know if its standard implementation is in Python or in C; if the latter is the case, you'll probably also have to reimplement it in C in order to achieve the same (or better) performance.
So the question "Is there a better solution" cannot be answered (as always) without knowing your specific requirements.
lists = [[1, 2, 3], [2, 3, 4], [1, 2, 3]]  # renamed from "list" to avoid shadowing the built-in
repeats = []
unique = 0
for i in lists:
    count = 0
    if i not in repeats:
        for i2 in lists:
            if i == i2:
                count += 1
        if count > 1:
            repeats.append(i)
        elif count == 1:
            unique += 1
print("Repeated items:")
for r in repeats:
    print(r, end=" ")
print("\nUnique items:", unique)
This loops through the list to find repeated sequences, skipping items that have already been detected as repeats, adds each repeated sequence to the repeats list, and counts the number of lists that occur only once.

Wildcard for nested list-query

I've got a nested list and I'd like to check whether i is contained on the lowest level of my list (i is the first of two elements of one "sublist").
1) Is there a direct way to do this?
2) I tried the following:
for i in randomlist:
    if [i, randomlist.count(i)] in list1:
Is there a way to replace randomlist.count(i) with a wildcard? I tried *,%,..., but none of these worked. Any ideas?
Thanks in advance!
I think what you want is:
if any(l[0] == i for l in list1):
This will only check the first item in each sub-list, which is effectively the same as having a wild-card second element.
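As a quick illustration of that check, here is a sketch with sample data in the [number, count] shape; has_first_element is a hypothetical helper name used only to make the check reusable.

```python
def has_first_element(nested, value):
    """True if any sublist starts with value, whatever its second element is."""
    return any(sub[0] == value for sub in nested)


list1 = [[86, 4], [67, 1], [89, 1]]

print(has_first_element(list1, 86))  # True
print(has_first_element(list1, 14))  # False
```

any stops at the first match, so it does not scan the whole list once the number has been found.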
It seems that this is the actual problem:
"Input shows nested list with numbers and their counts in sublists: [[86, 4], [67, 1], [89, 1],...] Output: I need to know whether a number with its count is already in the list (in order not to add it a second time), but the count is unknown during the for loop."
There are two ways to approach this problem. First, if the list does not have duplicates, simply convert it to a dictionary:
numbers = dict([[86,4],[67,1],[89,1]])
Now each number is the key, and the count a value. Next, if you want to know if a number is not in the dictionary, you have many ways to do that:
# Fetch the number
try:
    count = numbers[14]
except KeyError:
    print('{} does not exist'.format(14))
# Another way to write the above is:
count = numbers.get(14)
if count is None:  # "not count" would also fire for a stored count of 0
    print('{} does not exist'.format(14))
# From a list of numbers, add them only if they don't
# exist in the dictionary:
for i in list_of_numbers:
    if i not in numbers:
        numbers[i] = some_value
If there are already duplicates in the original list, you can still convert it into a dictionary but you need to do some extra work if you want to preserve all the values for the numbers:
from collections import defaultdict
numbers = defaultdict(list)
for key, value in original_list:
    numbers[key].append(value)
Now if you have duplicate numbers, all their values are stored in a list. You can still follow the same logic:
for i in new_numbers:
    numbers[i].append(new_value)
Except now if the number already existed, the new_value will just be added to the list of existing values.
Finally, if all you want to do is add to the list if the first number doesn't exist:
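numbers = set(i[0] for i in original_list)
for i in new_numbers:
    if i not in numbers:
        # append the pair as one sublist; "+= [i, some_value]" would add two separate items
        original_list.append([i, some_value])
        numbers.add(i)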

Delete all occurrences of specific values from list of lists python

As far as I can see this question (surprisingly?) has not been asked before, unless I am failing to spot an equivalent question due to lack of experience. (Similar questions have been asked about 1D lists.)
I have a list_A that has int values in it.
I want to delete all occurrences of all the values specified in list_A from my list_of_lists. As a novice coder I can hack something together here myself using list comprehensions and for loops, but given what I have read about the inefficiency of deleting elements from within lists, I am looking for advice from more experienced users about the fastest way to go about this.
list_of_lists = [
    [1,2,3,4,5,6,8,9],
    [0,2,4,5,6,7],
    [0,1,6],
    [0,4,9],
    [0,1,3,5],
    [0,1,4],
    [0,1,2],
    [1,8],
    [0,7],
    [0,3]
]
Further info
I am not looking to eliminate duplicates (there is already a question on here about that). I am looking to eliminate all occurrences of selected values.
list_A may typically have 200 values in it
list_of_lists will have a similar (long tailed) distribution to that shown above but in the order of up to 10,000 rows by 10,000 columns
Output can be a modified version of original list_of_lists or completely new list - whichever is quicker
Last but not least (thanks to RemcoGerlich for drawing attention to this): I need to eliminate empty sublists from within the list of lists
Many thanks
Using a list comprehension should work:
new_list = [[i for i in l if i not in list_A] for l in list_of_lists]
After that, if you want to remove empty lists, you can do:
for i in new_list[:]:  # loop over a copy: removing from a list while iterating over it skips elements
    if not i:
        new_list.remove(i)
or, as @ferhatelmas pointed out in the comments:
new_list = [i for i in new_list if i]
To speed up the membership test (and drop any duplicates in list_A), you can convert it to a set beforehand with list_A = set(list_A)
I'd write a function that just takes one list (or iterable) and a set, and returns a new list with the values from the set removed:
def list_without_values(L, values):
    return [l for l in L if l not in values]
Then I'd turn list_A into a set for fast checking, and loop over the original list:
set_A = set(list_A)
list_of_lists = [list_without_values(L, set_A) for L in list_of_lists]
This should be fast enough, and readability is what matters most.
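Putting the pieces together, including the asker's extra requirement of dropping empty sublists, might look like the sketch below. The function name remove_values and the small sample data are made up for illustration, and no timings on the 10,000 by 10,000 case are claimed.

```python
def remove_values(list_of_lists, values):
    """Return a new list of lists without the given values and without empty sublists."""
    unwanted = set(values)  # set membership tests take constant time on average
    filtered = ([x for x in sub if x not in unwanted] for sub in list_of_lists)
    return [sub for sub in filtered if sub]


list_of_lists = [[1, 2, 3], [0, 2], [0, 1], [1, 8]]
list_A = [0, 1, 8]  # hypothetical values to delete

print(remove_values(list_of_lists, list_A))  # [[2, 3], [2]]
```

This builds a new list rather than deleting in place, which avoids the repeated element shifting that makes in-place deletion from long lists slow.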
