Check if a string is present in multiple lists (Python)

What is the most efficient way to check whether a given string is present in all three of the given lists and, if so, set itemInAllLists to True?
Below is the general idea of what I'm trying to achieve.
item = 'test-element'
list_a = ['a','random','test-element']
list_b = ['light','apple','table']
list_c = ['car','field','test-element','chair']
itemInAllLists = False
if item in [list_a] and item in [list_b] and item in [list_c]:
    itemInAllLists = True

Have a look at Python's built-in all. It returns True if every element of an iterable is true.
If you put all your lists in a combined list, you can use a generator expression to check each list.
all(item in lst for lst in [list_a, list_b, list_c])
As deceze mentions, you don't have to do it this way. What you are doing works as well and might be easier to read. Using all or any might be better suited for more lists or when you create them dynamically.
For your code to work, you just have to remove the brackets. As written, item in [list_a] asks whether item is equal to the whole list list_a, which is never true for a string:
if item in list_a and item in list_b and item in list_c:
    pass
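As a minimal sketch of the all approach applied to the lists from the question (the helper name in_all_lists is hypothetical, introduced here only for illustration):

```python
item = 'test-element'
list_a = ['a', 'random', 'test-element']
list_b = ['light', 'apple', 'table']
list_c = ['car', 'field', 'test-element', 'chair']

def in_all_lists(value, *lists):
    """Return True only if value occurs in every one of the given lists."""
    # all() stops at the first list that does not contain value
    return all(value in lst for lst in lists)

item_in_all_lists = in_all_lists(item, list_a, list_b, list_c)
print(item_in_all_lists)                   # False, because list_b lacks 'test-element'
print(in_all_lists(item, list_a, list_c))  # True
```

If the same lists are searched many times, converting each one to a set first is likely to make the membership tests cheaper, at the cost of building the sets once.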

Related

Quicker way to filter lists based on a check to external variable?

I have a variable = 'P13804'
I also have a list like this:
['1T9G\tA\t2.9\tP11310\t241279.81', '1T9G\tS\t2.9\tP38117\t241279.81', '1T9G\tD\t2.9\tP11310\t241279.81', '1T9G\tB\t2.9\tP11310\t241279.81', '1T9G\tR\t2.9\tP13804\t241279.81', '1T9G\tC\t2.9\tP11310\t241279.81']
If you split each item in this list on tabs, you can see that the fourth field (index 3) is sometimes 'P11310' and sometimes 'P13804'.
I want to remove the items from the list where that field does not match my variable of interest (in this case, P13804).
I know a way to do this is:
var = 'P13804'
new_list = []
for each_item in list1:
    split_each_item = each_item.split('\t')
    if split_each_item[3] == var:  # keep only the items that match
        new_list.append(each_item)
print(new_list)
In reality, the lists are really long, and I have a lot of variables to check. Does anyone have a faster way of doing this?
It is generally more efficient in Python to build a list with a comprehension than repeatedly appending to it. So I would use:
var = 'P13804'
new_list = [i for i in list1 if i.split('\t')[3] == var]
According to timeit, it saves more or less 20% of the elapsed time.
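Since the question mentions many variables to check against the same long list, another option is to split each row only once and group the rows by that field. This is a sketch of a different technique from the comprehension above, not something the original answer proposed; group_by_field and its field_index parameter are hypothetical names.

```python
from collections import defaultdict

list1 = [
    '1T9G\tA\t2.9\tP11310\t241279.81',
    '1T9G\tS\t2.9\tP38117\t241279.81',
    '1T9G\tD\t2.9\tP11310\t241279.81',
    '1T9G\tB\t2.9\tP11310\t241279.81',
    '1T9G\tR\t2.9\tP13804\t241279.81',
    '1T9G\tC\t2.9\tP11310\t241279.81',
]

def group_by_field(rows, field_index=3):
    """Map each value of the chosen tab-separated field to the rows carrying it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row.split('\t')[field_index]].append(row)
    return groups

groups = group_by_field(list1)           # one pass over the data
matches = groups.get('P13804', [])       # each later lookup is a dict access
print(matches)
```

Whether this beats repeated comprehensions depends on how many variables are looked up per list, so it is worth timing on real data.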

Relationship between elements of two list: how to exploit it in Python?

So here is my minimal working example:
# I have a list
list1 = [1,2,3,4]
#I do some operation on the elements of the list
list2 = [2**j for j in list1]
# Then I want to have these items all shuffled around, so for instance
list2 = np.random.permutation(list2)
# Now here is my problem: I want to know which element of the new list2 came from which element of list1. I am looking for something like this:
list1.index(something)
# Basically, given an element of list2, I want to know where it came from in list1. I really can't think of a simple way of doing this, but there must be an easy way!
Can you please suggest an easy solution? This is a minimal working example; the main point is that I have a list, I apply some operation to its elements, and I assign the results to a new list. Then the items get shuffled around and I need to work out where they came from.
enumerate, as everyone has said, is the best option, but there is an alternative if you know the mapping. You can write a function that inverts it (for example, one that decodes if the original function encodes).
Then use decoded_list = list(map(decode_function, encoded_list)) to get a new list. By comparing this list with the original list, you can achieve your goal.
enumerate is better if you are certain that the encoded list was produced from that same list by encode_function within your own code.
However, if you are importing the new list from elsewhere, e.g. from a table on a website, my approach is the way to go.
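A rough sketch of the decoding idea for the question's operation, 2**j. The decode function below is an assumption that only holds when every encoded value is an exact power of two, and it uses the standard-library random module rather than NumPy to stay self-contained:

```python
import random

list1 = [1, 2, 3, 4]
encoded = [2**j for j in list1]

rng = random.Random(0)        # seeded only to make the sketch repeatable
rng.shuffle(encoded)

def decode(y):
    """Invert y = 2**j for exact powers of two (assumption)."""
    return y.bit_length() - 1

decoded = [decode(y) for y in encoded]
# position in list1 that each shuffled element came from
origins = [list1.index(x) for x in decoded]
print(list(zip(encoded, origins)))
```

Like the other index-based answers, this relies on the elements of list1 being unique, because list1.index returns the first match only.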
You could use a permutation index:
import numpy as np
# I have a list
list1 = [1,2,3,4]
#I do some operation on the elements of the list
list2 = [2**j for j in list1]
# Then I want to have these items all shuffled around, so for instance
index_list = range(len(list2))
index_list = np.random.permutation(index_list)
list3 = [list2[i] for i in index_list]
Then, given input_element:
answer = index_list[list3.index(input_element)]
Based on your code:
import numpy as np
# I have a list
list1 = [1,2,3,4]
# I do some operation on the elements of the list
list2 = [2**j for j in list1]
# record each value together with its index
index_list2 = list(enumerate(list2))
# Then I want to have these items all shuffled around, so for instance
index_list3 = np.random.permutation(index_list2)
idx, list3 = zip(*index_list3)
# Get the position of element_input in list3, then look up that position in idx; that is the answer you want.
answer = idx[list3.index(element_input)]
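The same bookkeeping can be sketched without NumPy by shuffling (index, value) pairs and building a lookup table. The names pairs and origin_of are hypothetical, and the dictionary assumes the transformed values are unique:

```python
import random

list1 = [1, 2, 3, 4]

# keep each transformed value together with the index it came from
pairs = [(i, 2**j) for i, j in enumerate(list1)]

rng = random.Random(0)        # seeded only to make the sketch repeatable
rng.shuffle(pairs)

list3 = [value for _, value in pairs]            # the shuffled values
origin_of = {value: i for i, value in pairs}     # value -> index in list1

for value in list3:
    print(value, 'came from list1[%d]' % origin_of[value])
```

A dictionary lookup avoids the repeated linear scans that list.index performs, which may matter for long lists.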
def index3_to_1(index):
    y = list3[index]
    x = int(round(np.log2(y)))  # inverse of y = f(x) = 2**x for your operation
    return list1.index(x)
This assumes that the operation you apply to get list2 is invertible. It also assumes that each element in list1 is unique.

How does the list comprehension to flatten a Python list work?

I recently looked for a way to flatten a nested Python list, like this: [[1,2,3],[4,5,6]], into this: [1,2,3,4,5,6].
Stack Overflow was helpful as ever and I found a post with this ingenious list comprehension:
l = [[1,2,3],[4,5,6]]
flattened_l = [item for sublist in l for item in sublist]
I thought I understood how list comprehensions work, but apparently I haven't got the faintest idea. What puzzles me most is that, besides the comprehension above, this also runs (although it doesn't give the same result):
exactly_the_same_as_l = [item for item in sublist for sublist in l]
Can someone explain how Python interprets these things? Based on the second comprehension, I would expect Python to interpret it back to front, but apparently that is not always the case. If it were, the first comprehension should throw an error, because 'sublist' does not exist. My mind is completely warped, help!
Let's take a look at your list comprehension, but first let's start with a list comprehension at its simplest.
l = [1,2,3,4,5]
print([x for x in l]) # prints [1, 2, 3, 4, 5]
You can think of this as a for loop structured like so:
for x in l:
    print(x)
Now let's look at another one:
l = [1,2,3,4,5]
a = [x for x in l if x % 2 == 0]
print(a) # prints [2, 4]
That is exactly the same as this:
a = []
l = [1,2,3,4,5]
for x in l:
    if x % 2 == 0:
        a.append(x)
print(a) # prints [2, 4]
Now let's take a look at the examples you provided.
l = [[1,2,3],[4,5,6]]
flattened_l = [item for sublist in l for item in sublist]
print(flattened_l) # prints [1, 2, 3, 4, 5, 6]
In a list comprehension, start at the leftmost for clause and work your way to the right, nesting each subsequent clause inside the previous one. The leading expression, item in this case, is what gets added to the list. The comprehension is equivalent to:
l = [[1,2,3],[4,5,6]]
flattened_l = []
for sublist in l:
    for item in sublist:
        flattened_l.append(item)
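Now for the last one:
exactly_the_same_as_l = [item for item in sublist for sublist in l]
Using the same knowledge, we can write it as a for loop and see how it would behave:
for item in sublist:
    for sublist in l:
        exactly_the_same_as_l.append(item)
The only reason this one runs is that sublist already existed when the outer loop started: in Python 2, evaluating the flattened_l comprehension left its loop variable sublist behind in the enclosing scope. It is a scoping accident that this did not throw an error. If you ran it without evaluating flattened_l first, you would get a NameError. (In Python 3, comprehension variables no longer leak, so the second comprehension raises a NameError unless sublist is defined some other way.)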
The for loops are evaluated from left to right. Any list comprehension can be re-written as a for loop, as follows:
l = [[1,2,3],[4,5,6]]
flattened_l = []
for sublist in l:
    for item in sublist:
        flattened_l.append(item)
The above is the correct code for flattening a list, whether you choose to write it concisely as a list comprehension, or in this extended version.
The second list comprehension you wrote will raise a NameError, as 'sublist' has not yet been defined. You can see this by writing the list comprehension as a for loop:
l = [[1,2,3],[4,5,6]]
flattened_l = []
for item in sublist:
    for sublist in l:
        flattened_l.append(item)
The only reason you didn't see the error when you ran your code was that sublist had already been defined by your first list comprehension (in Python 2, a list comprehension's loop variable leaks into the enclosing scope; Python 3 does not do this).
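The left-to-right rule and the NameError can both be checked with a short sketch. It assumes Python 3 and uses hypothetical names (nested, inner_list, reversed_order) chosen so that nothing is accidentally defined beforehand:

```python
nested = [[1, 2, 3], [4, 5, 6]]

# for clauses are read left to right, outermost loop first
flattened = [item for inner_list in nested for item in inner_list]

# the equivalent explicit loops, in the same order
looped = []
for inner_list_ in nested:
    for item_ in inner_list_:
        looped.append(item_)

def reversed_order(data):
    """Try the clauses in the wrong order and report what happens."""
    try:
        # undefined_inner is read by the first clause before the second binds it
        return [item for item in undefined_inner for undefined_inner in data]
    except NameError as exc:
        return exc

print(flattened == looped)
print(repr(reversed_order(nested)))
```

The first for clause's iterable is evaluated before any later clause runs, which is why the reversed version fails when the name does not already exist.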
For more information, you may want to check out Guido's tutorial on list comprehensions.
For the lazy dev that wants a quick answer:
>>> a = [[1,2], [3,4]]
>>> [i for g in a for i in g]
[1, 2, 3, 4]
While this approach definitely works for flattening lists, I wouldn't recommend it unless your sublists are known to be very small (1 or 2 elements each).
I've done a bit of profiling with timeit and found that this takes roughly 2-3 times longer than using a single loop and calling extend…
def flatten(l):
    flattened = []
    for sublist in l:
        flattened.extend(sublist)
    return flattened
While it's not as pretty, the speedup is significant. I suppose this works so well because extend can more efficiently copy the whole sublist at once instead of copying each element, one at a time. I would recommend using extend if you know your sublists are medium-to-large in size. The larger the sublist, the bigger the speedup.
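To check the timing claim on your own machine, a sketch along these lines could be used. The list sizes and repeat count are arbitrary assumptions, and no particular speedup is implied; results vary with Python version and sublist size.

```python
import timeit

def flatten_comprehension(nested):
    return [item for sublist in nested for item in sublist]

def flatten_extend(nested):
    flattened = []
    for sublist in nested:
        flattened.extend(sublist)
    return flattened

# 100 sublists of 1000 elements each (arbitrary sizes for illustration)
nested = [list(range(1000)) for _ in range(100)]

t_comp = timeit.timeit(lambda: flatten_comprehension(nested), number=20)
t_ext = timeit.timeit(lambda: flatten_extend(nested), number=20)
print('comprehension: %.4fs, extend: %.4fs' % (t_comp, t_ext))
```

Varying the sublist length in this sketch is a simple way to see how the gap changes between very small and large sublists.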
One final caveat: obviously, this only holds true if you need to eagerly form this flattened list. Perhaps you'll be sorting it later, for example. If you're ultimately going to just loop through the list as-is, this will not be any better than using the nested loops approach outlined by others. But for that use case, you want to return a generator instead of a list for the added benefit of laziness…
def flatten(l):
    return (item for sublist in l for item in sublist) # note the parens
Note, of course, that this sort of comprehension will only "flatten" a list of lists (or a list of other iterables). Also, if you pass it a list of strings you'll "flatten" it into a list of characters.
To generalize this in a meaningful way you first want to be able to cleanly distinguish between strings (or bytearrays) and other types of sequences (or other Iterables). So let's start with a simple function:
from collections.abc import Iterable
def non_str_seq(p):
    '''p is putatively a sequence and not a string, bytes, or bytearray'''
    return isinstance(p, Iterable) and not isinstance(p, (str, bytes, bytearray))
Using that, we can then build a recursive function to flatten any sequence of objects:
def flatten(s):
    '''Recursively flatten any sequence of objects
    '''
    results = list()
    if non_str_seq(s):
        for each in s:
            results.extend(flatten(each))
    else:
        results.append(s)
    return results
There are probably more elegant ways to do this, but it works for all the Python built-in types that I know of. Simple objects (numbers, strings, None, True, False) are all returned wrapped in a list. Dictionaries are returned as lists of their keys (in the dictionary's iteration order).
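One possibly tidier variant of the recursive idea is a generator that yields leaves lazily. This is a sketch assuming Python 3.3 or later (for yield from); the name iter_flatten is hypothetical and not part of the answer above.

```python
from collections.abc import Iterable

def iter_flatten(obj):
    """Lazily yield the leaves of an arbitrarily nested iterable."""
    if isinstance(obj, Iterable) and not isinstance(obj, (str, bytes, bytearray)):
        for each in obj:
            # recurse into nested containers, passing their leaves through
            yield from iter_flatten(each)
    else:
        # strings and non-iterables are treated as single leaves
        yield obj

nested = [1, [2, [3, 'four']], ('five', 6)]
print(list(iter_flatten(nested)))  # [1, 2, 3, 'four', 'five', 6]
```

Wrapping the call in list() gives the eager behaviour of the function above, while iterating over it directly avoids building the intermediate lists.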

Python list index splitting and manipulation

My question seems simple, but for a Python novice like myself this is getting too complex to follow, so here's the situation:
I need to take a list such as:
L = [(a, b, c), (d, e, d), (etc, etc, etc), (etc, etc, etc)]
and make each index an individual list so that I may pull elements from each index specifically. The problem is that the list I am actually working with contains hundreds of indices such as the ones above and I cannot make something like:
L_new = list(L['insert specific index here'])
for each one, as that would mean filling up the memory with hundreds of lists corresponding to individual indices of the first list, which would be far too time- and memory-consuming from my point of view. So my question is this: how can I separate those indices and then pull individual parts from them without needing to create hundreds of individual lists (at least to the point where I won't need hundreds of individual lines to create them)?
I might be misreading your question, but I'm inclined to say that you don't actually have to do anything to be able to index your tuples. See my comment, but: L[0][0] will give "a", L[0][1] will give "b", L[2][1] will give "etc" etc...
If you really want a clean way to turn this into a list of lists you could use a list comprehension:
cast = [list(entry) for entry in L]
In response to your comment: if you want to access across dimensions I would suggest list comprehension. For your comment specifically:
crosscut = [entry[0] for entry in L]
In response to comment 2: This is largely a part of a really useful operation called slicing. Specifically to do the referenced operation you would do this:
multiple_index = [entry[0:3] for entry in L]
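A small sketch with concrete data shows what each of these expressions produces. The tuple contents are made up for illustration, and the zip(*L) line is an extra not mentioned in the answer above:

```python
L = [('a', 'b', 'c'), ('d', 'e', 'f'), ('g', 'h', 'i')]

single = L[1][2]                          # 'f': third element of the second tuple
cast = [list(entry) for entry in L]       # list of lists instead of tuples
crosscut = [entry[0] for entry in L]      # ['a', 'd', 'g']: first element of each tuple
first_two = [entry[0:2] for entry in L]   # slices: [('a', 'b'), ('d', 'e'), ('g', 'h')]
columns = list(zip(*L))                   # transpose: one tuple per position

print(single, crosscut, first_two, columns[1])
```

None of these needs a separate named list per tuple; plain indexing reads directly from L.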
Depending on your readability preferences there are actually a number of possibilities here:
list_of_lists = []
for sublist in L:
    list_of_lists.append(list(sublist))
# or, with an explicit iterator:
iterator = iter(L)
for _ in range(len(L)):
    print(list(next(iterator)))
# Or, inside a generator function, yield list(next(iterator)) if you want lazy evaluation
What you have there is a list of tuples; access them like a list of lists.
L[3][2]
will get the third element from the fourth tuple in your list L (indices start at 0).
Two ways of using the inner lists:
for index, sublist in enumerate(L):
    # do something with sublist
    pass
or with an iterator
iterator = iter(L)
sublist = next(iterator) # <-- yields the first sublist
In both cases, the elements of sublist can be reached via
direct index
sublist[2]
iteration
iterator = iter(sublist)
next(iterator) # <-- yields first elem of sublist
for elem in sublist:
    # do something with my elem
    pass

determine if a list contains other lists

If I have a list, is there any way to check if it contains any other lists?
What I mean is, I want to know if a list has this structure: [] as opposed to this structure: [[]]
So, compare [1,2,3,4] to [1,[2,3],4].
This is complicated by the fact that I have a list of strings.
Well, phihag's solution seems to be working so far, but what I'm doing is this:
uniqueCrossTabs = list(itertools.chain.from_iterable(uniqueCrossTabs))
in order to flatten a list if it has other lists in it.
But since my list contains strings, if this is done on an already flattened list, I get a list of each character of each string that was in the original list.
This is not the behavior I was looking for, so checking whether the list needs to be flattened before flattening is necessary.
any(isinstance(el, list) for el in input_list)
You can take phihag's answer even further if you actually want a list of all the lists inside the list:
output_list = list(filter(lambda x: isinstance(x, list), input_list))
lst1 in lst2
Yields True iff lst1 is an element of lst2.
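Putting the isinstance check together with the flattening step from the question might look like the sketch below. The function name flatten_if_nested is hypothetical, and it only flattens one level, leaving strings intact:

```python
from itertools import chain

def flatten_if_nested(items):
    """Flatten one level of nesting, but only if some element is a list."""
    if any(isinstance(el, list) for el in items):
        # wrap non-list elements so chain does not split strings into characters
        return list(chain.from_iterable(
            el if isinstance(el, list) else [el] for el in items
        ))
    return list(items)

print(flatten_if_nested(['ab', 'cd']))                  # ['ab', 'cd']
print(flatten_if_nested(['ab', ['cd', 'ef'], 'gh']))    # ['ab', 'cd', 'ef', 'gh']
```

Calling it on an already flat list of strings returns the strings unchanged, which avoids the character-splitting problem described in the question.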
