Convert double for loop with break to list comprehension - python

I have a type something like this:
class T:
    id: int
    data: Any
I have a list of Ts with unique ids. I am provided another list of Ts with new data. So I want a new list that replaces any of the items where I have a new T with that new T.
current = [T(1), T(2), T(3)]
new = [T(2)*]
output = [T(1), T(2)*, T(3)]
I have a working double for loop with a break that I'm trying to turn into a list comprehension if it's cleaner.
output = []
for item in current_items:
    for new_item in new_items:
        if new_item.id == item.id:
            item = new_item
            break
    output.append(item)
This is what I tried, but without a way to break out of the inner loop once the condition matches, it obviously doesn't work.
output = [
    new_item if new_item.id == item.id else item
    for item in current_items
    for new_item in new_items
]

Let's feed next a generator expression finding elements in b with the same id as the current element in a, and default to the current element in a if no matching id is found in b.
>>> from dataclasses import dataclass
>>> @dataclass
... class T:
...     id: int
...     name: str
...
>>> a = [T(1, "foo"), T(2, "bar"), T(3, "baz")]
>>> b = [T(2, "wooble")]
>>> [next((y for y in b if y.id == x.id), x) for x in a]
[T(id=1, name='foo'), T(id=2, name='wooble'), T(id=3, name='baz')]
Using next will replicate the behavior of your loop with a break on finding a match. The entire b list will only be iterated if a match isn't found.
If we didn't care about efficiency, we could generate a list of all matching items in b and then take the first one if it's not empty.
[c[0] if (c := [y for y in b if y.id == x.id]) else x for x in a]
But that both looks uglier and is potentially less efficient both in terms of runtime complexity and space, as it generates a bunch of useless lists.
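If the list of new items is large, scanning it once per current item may become the bottleneck. A minimal sketch of a dict-based lookup follows, assuming ids are hashable; the function name merge_by_id is made up for illustration. One behavioural difference from the loop with a break: if the new list repeats an id, the last occurrence wins here rather than the first.

```python
from dataclasses import dataclass

@dataclass
class T:
    id: int
    name: str

def merge_by_id(current_items, new_items):
    # Map each id to its replacement. If new_items repeats an id,
    # the last one wins (the loop with break keeps the first).
    new_by_id = {item.id: item for item in new_items}
    # dict.get falls back to the original item when no replacement exists.
    return [new_by_id.get(item.id, item) for item in current_items]

a = [T(1, "foo"), T(2, "bar"), T(3, "baz")]
b = [T(2, "wooble")]
merged = merge_by_id(a, b)
print(merged)
```

This builds the lookup once and then does one dictionary access per current item, instead of one scan of the new list per current item.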

If all you need is the new items whose ids already appear among the current items, create a set of the current item ids and keep only the new items whose id is in that set. Note that this yields just the replacements, not the full merged list from the question.
current_item_ids = {
    item.id
    for item in current_items
}
output = [
    item
    for item in new_items
    if item.id in current_item_ids
]

Related

How can I search an item in a list of lists?

I am doing a school project about an inventory system, and I am having trouble programming the search function.
Take an example:
ilist = [ [1,2,3,4,5], [6,7,8,9,10], [...], ...]
I would like to search for 1 and display the list containing 1.
search = input('By user:')
for item in ilist:
    if item == search :
        print(item)
It does not work this way and I get this error:
list index out of range error
You have a nested list, and you are comparing the input against each inner list ('1' won't match [1,2,3,4,5]).
So you have to loop over each inner list as well, and convert the input to int:
ilist = [ [1,2,3,4,5], [6,7,8,9,10]]
search = input('By user:')
for item in ilist:
    for i in item:
        if i == int(search):
            print(i)
This builds on your way of coding; it could be improved further from here.
Two problems:
ilist is a list of lists, and you're comparing search to each list
Each member in each list is of type int, while search is of type str
In short, change this:
if item == search
To this:
if int(search) in item
You can use in to check whether the element is in each list:
search = int(input('By user:'))
for item in ilist:
    if search in item:
        print(item)
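The fixes above can be wrapped into a small function that also copes with input that is not a whole number. This is only a sketch; the names find_sublists and search_inventory are made up for illustration, and the text that would normally come from input() is passed in as an argument.

```python
def find_sublists(nested, target):
    """Return every inner list that contains target."""
    return [inner for inner in nested if target in inner]

def search_inventory(nested, raw_text):
    # input() always returns a string, so convert before comparing with ints.
    try:
        target = int(raw_text)
    except ValueError:
        return []  # not a whole number, so nothing can match
    return find_sublists(nested, target)

ilist = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]
print(search_inventory(ilist, "1"))    # [[1, 2, 3, 4, 5]]
print(search_inventory(ilist, "abc"))  # []
```

In the real program you would call search_inventory(ilist, input('By user:')).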

isolating a sub list from a big list in python

I have a big list in Python, like this small example:
['MLEEDMEVAIKMVVVGNGAVGKSSMIQRYCKGIFTKDYKKTIGVDFLERQIQVNDEDVRLMLWDTAGQEEFDAITKAYYRGAQACVLVFSTTDRESFEAV', 'MDHTEGSPAEEPPAHAPSPGKFGERPPPKRLTREAMRNYLKERGDQTVLILHAKVAQKSYGNEKRFFCPPPCVYLMGSGWKKKKEQMERDGCSEQESQPCAFIGIGNSDQEMQQLNLEGKNYCTAKTLYISDSDKRKHFMLSVKMFYGNSDDIGVFLSKRIKVISKPSKKKQSLKNADLCIASGTKVALFNRLRSQTVSTRYLHVEGGNFHASSQQWGAFFIHLLDDDESEGEEFTVRDGYIHYGQTVKLVCSVTGMALPRLIIRKVDKQTALLDADDPVSQLHKCAFYLKDTERMYLCLSQERIIQFQATPCPKEPNKEMINDGASWTIISTDKAEYTFYEGMGPVLAPVTPVPVVESLQLNGGGDVAMLELTGQNFTPNLRVWFGDVEAETMYRCGESMLCVVPDISAFREGWRWVRQPVQVPVTLVRNDGIIYSTSLTFTYTPEPGPRPHCSAAGAILRANSSQVPPNESNTNSEGSYTNASTNSTSVTSSTATVVS']
In the file there are many items, and each item is a sequence of characters. I want to make a new list containing only the items that have exactly one W. For the small example, the expected output would be:
['MLEEDMEVAIKMVVVGNGAVGKSSMIQRYCKGIFTKDYKKTIGVDFLERQIQVNDEDVRLMLWDTAGQEEFDAITKAYYRGAQACVLVFSTTDRESFEAV']
I am trying to do that in Python and wrote the following code:
newlist = []
for item in mylist:
    for c in item:
        if c == 'W':
            newlist.append(item)
But it does not return what I want. Do you know how to fix it?
Use str.count.
Example:
res = []
mylist = ['MLEEDMEVAIKMVVVGNGAVGKSSMIQRYCKGIFTKDYKKTIGVDFLERQIQVNDEDVRLMLWDTAGQEEFDAITKAYYRGAQACVLVFSTTDRESFEAV', 'MDHTEGSPAEEPPAHAPSPGKFGERPPPKRLTREAMRNYLKERGDQTVLILHAKVAQKSYGNEKRFFCPPPCVYLMGSGWKKKKEQMERDGCSEQESQPCAFIGIGNSDQEMQQLNLEGKNYCTAKTLYISDSDKRKHFMLSVKMFYGNSDDIGVFLSKRIKVISKPSKKKQSLKNADLCIASGTKVALFNRLRSQTVSTRYLHVEGGNFHASSQQWGAFFIHLLDDDESEGEEFTVRDGYIHYGQTVKLVCSVTGMALPRLIIRKVDKQTALLDADDPVSQLHKCAFYLKDTERMYLCLSQERIIQFQATPCPKEPNKEMINDGASWTIISTDKAEYTFYEGMGPVLAPVTPVPVVESLQLNGGGDVAMLELTGQNFTPNLRVWFGDVEAETMYRCGESMLCVVPDISAFREGWRWVRQPVQVPVTLVRNDGIIYSTSLTFTYTPEPGPRPHCSAAGAILRANSSQVPPNESNTNSEGSYTNASTNSTSVTSSTATVVS']
for item in mylist:
    if item.count("W") == 1:
        res.append(item)
print(res)
or
res = [item for item in mylist if item.count("W") == 1]
Output:
['MLEEDMEVAIKMVVVGNGAVGKSSMIQRYCKGIFTKDYKKTIGVDFLERQIQVNDEDVRLMLWDTAGQEEFDAITKAYYRGAQACVLVFSTTDRESFEAV']
The problem is that you are iterating over each character in each string and appending whenever the condition is met. Moreover, your logic can't "undo" a list.append operation if another W is found. So if W occurs twice in a string, you append that string twice.
Instead, you can use a list comprehension with str.count:
res = [i for i in mylist if i.count('W') == 1]
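To see the difference on something shorter than the protein sequences, here is a small sketch using made-up three-letter strings. The function names naive_filter and exactly_one_w are only for illustration; naive_filter reproduces the nested loop from the question.

```python
def naive_filter(strings):
    # The nested loop from the question: appends once per 'W' found.
    result = []
    for item in strings:
        for c in item:
            if c == "W":
                result.append(item)
    return result

def exactly_one_w(strings):
    # Keep only strings containing exactly one 'W'.
    return [s for s in strings if s.count("W") == 1]

samples = ["AWB", "WAW", "ABC"]
print(naive_filter(samples))   # ['AWB', 'WAW', 'WAW']
print(exactly_one_w(samples))  # ['AWB']
```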

How to make list with sublists based on pattern(word) from a input list?

I have a list in which blocks of entries are separated by foo, as follows:
a=['E3P.B99990001.pdb_138:6.923:0.241:6.116', 'E3P.B99990001.pdb_397:15.856:3.506:8.144', 'foo',
'E3P.B99990002.pdb_138:4.499:4.286:8.260', 'E3P.B99990002.pdb_397:14.897:3.238:9.338', 'foo']
Here, I want to make a Mainlist with a sublist for each block, where the blocks are separated by the pattern "foo", as follows:
Mainlist=[
['E3P.B99990001.pdb_138:6.923:0.241:6.116',
'E3P.B99990001.pdb_397:15.856:3.506:8.144'], #sublist1 (read input list values until the first "foo" and make the first sublist)
['E3P.B99990002.pdb_138:6.923:0.241:6.116',
'E3P.B99990002.pdb_397:15.856:3.506:8.144'] #sublist2 (read input list values from the first "foo" to the second "foo" and make the second sublist)
]
The main idea is to make separate sublists using "foo" as the delimiter.
I hope it's understandable. If someone knows how to do this, could you help me out?
Thank you in advance.
CODE BASED ON @Brien gives the exact answer:
ATOM_COORDINATE = []
sub = []
for item in a:
    if item == 'foo':
        ATOM_COORDINATE.append(sub)
        sub = []
    else:
        sub.append(item)
print(ATOM_COORDINATE)
OUTPUT:
[
['E3P.B99990001.pdb_138:6.923:0.241:6.116', 'E3P.B99990001.pdb_397:15.856:3.506:8.144', 'E3P.B99990001.pdb_424:8.558:1.315:6.627', 'E3P.B99990001.pdb_774:14.204:-5.490:24.812', 'E3P.B99990001.pdb_865:15.545:4.258:10.007', 'E3P.B99990001.pdb_929:16.146:-6.081:24.770'],
['E3P.B99990002.pdb_138:4.499:4.286:8.260', 'E3P.B99990002.pdb_397:14.897:3.238:9.338', 'E3P.B99990002.pdb_424:5.649:5.914:8.639', 'E3P.B99990002.pdb_774:12.114:-6.864:23.897', 'E3P.B99990002.pdb_865:15.200:3.910:11.227', 'E3P.B99990002.pdb_929:13.649:-6.894:22.589']
]
# assuming your original list is called biglist
Mainlist = []
sublist = []
for item in biglist:
    if item == 'foo':
        Mainlist.append(sublist)
        sublist = []
    else:
        sublist.append(item)
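An alternative sketch uses itertools.groupby from the standard library. The function name split_on and the short sample values are made up for illustration. Note two differences from the loop above: trailing items after the last "foo" are kept rather than dropped, and two consecutive "foo" entries do not produce an empty sublist.

```python
from itertools import groupby

def split_on(items, delimiter="foo"):
    # groupby yields runs of consecutive items that share the same key;
    # here the key is whether the item equals the delimiter.
    return [
        list(group)
        for is_delim, group in groupby(items, key=lambda x: x == delimiter)
        if not is_delim  # skip the runs made of delimiters
    ]

a = ['x1', 'x2', 'foo', 'y1', 'y2', 'foo', 'z1']
print(split_on(a))  # [['x1', 'x2'], ['y1', 'y2'], ['z1']]
```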

python modify item in list, save back in list

I have a hunch that I need to access an item in a list (of strings), modify that item (as a string), and put it back in the list at the same index.
I'm having difficulty getting an item back into the same index:
for item in list:
    if "foo" in item:
        item = replace_all(item, replaceDictionary)
        list[item] = item
        print item
Now I get an error:
TypeError: list indices must be integers, not str
due to this line: list[item] = item
which makes sense! But I do not know how to put the item back into the list at that same index using Python.
What is the syntax for this? Ideally the for loop can keep track of the index I am currently at.
You could do this:
for idx, item in enumerate(list):
    if 'foo' in item:
        item = replace_all(...)
        list[idx] = item
You need to use the enumerate function (see the Python docs):
for place, item in enumerate(list):
    if "foo" in item:
        item = replace_all(item, replaceDictionary)
        list[place] = item
        print item
Also, it's a bad idea to use list as a variable name. It is not a reserved word, but it shadows the built-in list type.
Since you had problems with enumerate, here is an alternative from the itertools library:
import itertools
# Python 2: itertools.izip; in Python 3, use the built-in zip instead
for place, item in itertools.izip(itertools.count(0), list):
    if "foo" in item:
        item = replace_all(item, replaceDictionary)
        list[place] = item
        print item
A common idiom to change every element of a list looks like this:
for i in range(len(L)):
    item = L[i]
    # ... compute some result based on item ...
    L[i] = result
This can be rewritten using enumerate() as:
for i, item in enumerate(L):
    # ... compute some result based on item ...
    L[i] = result
See enumerate.
For Python 3:
ListOfStrings = []
ListOfStrings.append('foo')
ListOfStrings.append('oof')
for idx, item in enumerate(ListOfStrings):
    if 'foo' in item:
        ListOfStrings[idx] = "bar"
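Another option is to rebuild the contents with a list comprehension and assign them back through a slice, which modifies the same list object in place. This is a sketch only: the question does not show replace_all, so the version below is a hypothetical stand-in that applies each key/value pair of the dict with str.replace, and the sample strings are made up.

```python
def replace_all(text, replacements):
    # Hypothetical stand-in for the question's helper:
    # apply every old -> new substitution in turn.
    for old, new in replacements.items():
        text = text.replace(old, new)
    return text

strings = ["foo bar", "baz", "food"]
replacements = {"foo": "qux"}
alias = strings  # a second name for the same list object

# Slice assignment replaces the contents without creating a new list object,
# so every reference to the list sees the change.
strings[:] = [replace_all(s, replacements) if "foo" in s else s for s in strings]
print(strings)  # ['qux bar', 'baz', 'quxd']
print(alias is strings)  # True
```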

Deleting dicts with near-duplicate values from a list of dicts - Python

I want to clean up a list of dicts, according to the following rules:
1) The list of dicts is already sorted, so the earlier dicts are preferred.
2) In the lower dicts, if the ['name'] and ['code'] string values match with the same key values of any dict higher up on the list, and if the absolute value of the difference of the int(['cost']) between those 2 dicts is < 2; then that dict is assumed to be a duplicate of the earlier dict, and is deleted from the list.
Here is one dict from the list of dicts:
{
    'name':"ItemName",
    'code':"AAHFGW4S",
    'from':"NDLS",
    'to':"BCT",
    'cost':str(29.95)
}
What is the best way to delete duplicates like this?
There may be a more Pythonic way of doing this, but this is the basic approach:
def is_duplicate(a, b):
    # same name and code, and costs less than 2 apart
    return (a['name'] == b['name'] and a['code'] == b['code']
            and abs(float(a['cost']) - float(b['cost'])) < 2)
newlist = []
for a in oldlist:
    isdupe = False
    for b in newlist:
        if is_duplicate(a, b):
            isdupe = True
            break
    if not isdupe:
        newlist.append(a)
Since you say the cost are integers you can use that:
def neardup( items ):
    forbidden = set()
    for elem in items:
        key = elem['name'], elem['code'], int(elem['cost'])
        if key not in forbidden:
            yield elem
            for diff in (-1,0,1): # add all keys invalidated by this
                key = elem['name'], elem['code'], int(elem['cost'])-diff
                forbidden.add(key)
Here is a less tricky way that really calculates the difference:
from collections import defaultdict
def neardup2( items ):
    # this is a mapping `(name, code) -> [cost1, cost2, ... ]`
    forbidden = defaultdict(list)
    for elem in items:
        key = elem['name'], elem['code']
        curcost = float(elem['cost'])
        # an item is new if we never saw the key before
        if (key not in forbidden or
            # or if all the known costs differ by at least 2
            all(abs(cost-curcost) >= 2 for cost in forbidden[key])):
            yield elem
            forbidden[key].append(curcost)
Both solutions avoid rescanning the whole list for every item. After all, the cost only gets interesting if (name, code) are equal, so you can use a dictionary to look up all candidates fast.
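For a quick check of the difference-based approach, here is a condensed, self-contained sketch run on a few made-up rows. The sample data and the variable names (rows, kept, seen_costs) are assumptions for illustration, and costs are parsed with float because the question stores them as strings such as '29.95'.

```python
from collections import defaultdict

def neardup2(items):
    # Maps (name, code) -> list of costs already kept for that pair.
    seen_costs = defaultdict(list)
    for elem in items:
        key = elem['name'], elem['code']
        cost = float(elem['cost'])
        # Keep the item only if every kept cost for this key is at least 2 away.
        # all() over an empty list is True, so unseen keys are always kept.
        if all(abs(prev - cost) >= 2 for prev in seen_costs[key]):
            yield elem
            seen_costs[key].append(cost)

rows = [
    {'name': 'A', 'code': 'X1', 'cost': '29.95'},
    {'name': 'A', 'code': 'X1', 'cost': '30.50'},  # within 2 of the first: dropped
    {'name': 'A', 'code': 'X1', 'cost': '35.00'},  # far enough away: kept
    {'name': 'B', 'code': 'X1', 'cost': '29.95'},  # different name: kept
]
kept = list(neardup2(rows))
print([r['cost'] for r in kept])  # ['29.95', '35.00', '29.95']
```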
Kind of a convoluted problem, but I think something like this would work:
for i, d in enumerate(dictList):
    # iterate through the list of dicts, starting with the first
    for d2 in dictList[i+1:]:
        # check against all of the other dicts "beneath" it
        # eg,
        # if d['name'] == d2['name'] and d['code'] == d2['code']:
        #     --check the cost stuff here--
        pass
