nested for loops in python with lists - python

Folks - I have two lists
list1=['a','b']
list2=['y','z']
I would like to send the variables to a function like below:
associate_address(list1[0],list2[0])
associate_address(list1[1],list2[1])
my script:
for l in list1:
    for i in list2:
        conn.associate_address(i,l)
I receive the below output:
conn.associate_address(a,y)
conn.associate_address(a,z)
I would like it to look like this:
conn.associate_address(a,y)
conn.associate_address(b,z)

Use the zip function, like this:
list1=['a','b']
list2=['y','z']
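for i, j in zip(list1, list2):
    print(i, j)
Output (under Python 2, where print(i, j) prints a tuple; Python 3 prints a y and b z without the parentheses):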
('a', 'y')
('b', 'z')

Why do you suppose this is?
>>> for x in [1,2]:
...     for y in ['a','b']:
...         print(x, y)
...
1 a
1 b
2 a
2 b
A nested loop runs in full for each iteration of its parent loop. Think about truth tables:
p q
0 0
0 1
1 0
1 1
Or combinations:
Choose an element from a set of two elements.
2 C 1 = 2
Choose one element from each set, where each set contains two elements.
(2 C 1) * (2 C 1) = 4
Let's say you have a list of 10 elements. Iterating over it with a for loop will take 10 iterations. If you have another list of 5 elements, iterating over it with a for loop will take 5 iterations. Now, if you nest these two loops, you will have to perform 50 iterations to cover every possible combination of the elements of each list.
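To make the counting concrete, here is a small sketch that counts how often the loop body runs for nested loops versus zip. The list contents and variable names are made up for illustration; only the sizes (10 and 5) come from the paragraph above.

```python
# Hypothetical lists of 10 and 5 elements, matching the sizes in the text
tens = list(range(10))
fives = list("abcde")

# Nested loops visit every combination: 10 * 5 iterations
nested_count = 0
for t in tens:
    for f in fives:
        nested_count += 1

# zip pairs elements by position and stops at the shorter list
zipped_count = 0
for t, f in zip(tens, fives):
    zipped_count += 1

print(nested_count, zipped_count)  # 50 5
```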
You have many options to solve this.
# use tuples to describe your pairs
lst = [('a','y'), ('b','z')]
for pair in lst:
    conn.associate_address(pair[0], pair[1])
# use a dictionary to create a key-value relationship
dct = {'a':'y', 'b':'z'}
for key in dct:
    conn.associate_address(key, dct[key])
# use zip to combine pairwise elements in your lists
lst1, lst2 = ['a', 'b'], ['y', 'z']
for p, q in zip(lst1, lst2):
    conn.associate_address(p, q)
# use an index instead, and sub-index your lists
lst1, lst2 = ['a', 'b'], ['y', 'z']
for i in range(len(lst1)):
    conn.associate_address(lst1[i], lst2[i])

I would recommend using a dict instead of 2 lists since you clearly want them associated.
Dicts are explained here
Once you have your dicts set up you will be able to say
>>> mylist['a']
'y'
>>> mylist['b']
'z'
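If the data already arrives as two lists, one way to get such a dict is to combine dict and zip. This is only a sketch: the name mapping is made up here, and it assumes the elements of the first list are unique and hashable, since they become the keys.

```python
list1 = ['a', 'b']
list2 = ['y', 'z']

# zip pairs the lists by position; dict turns each pair into key -> value
mapping = dict(zip(list1, list2))

print(mapping['a'])  # y
print(mapping['b'])  # z

# Iterate over the associated pairs, e.g. to pass them to a function
for key, value in mapping.items():
    print(key, value)
```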

Related

How to combine the elements of two lists using zip function in python?

I have two different lists and I would like to know how I can print each element of one list with each element of the other. I know I could use two for loops (one for each list), however I want to use the zip() function because there is more that I will be doing in this for loop, for which I will require parallel iteration.
I therefore attempted the following but the output is as shown below.
lasts = ['x', 'y', 'z']
firsts = ['a', 'b', 'c']
for last, first in zip(lasts, firsts):
    print (last, first, "\n")
Output:
x a
y b
z c
Expected Output:
x a
x b
x c
y a
y b
y c
z a
z b
z c
I believe the function you are looking for is itertools.product:
lasts = ['x', 'y', 'z']
firsts = ['a', 'b', 'c']
from itertools import product
for last, first in product(lasts, firsts):
    print (last, first)
x a
x b
x c
y a
y b
y c
z a
z b
z c
Another alternative that also produces an iterator is a nested generator expression:
iPairs=( (l,f) for l in lasts for f in firsts)
for last, first in iPairs:
    print (last, first)
Honestly, I think you won't be able to do it with zip directly, because you are looking for different behaviour.
Forcing it to work with zip through syntactic sugar will only make the code harder to debug.
But if you insist (with l1 and l2 standing for your two lists):
zip([val for val in l1 for _ in range(len(l2))],
    [val for _ in l1 for val in l2])
This first repeats each element of the first list to get xxxyyyzzz, then repeats the whole second list to get abcabcabc.
Or:
for last,first in [(l,f) for l in lasts for f in firsts]:
    print(last, first, "\n")
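As a sanity check, the zip-based workaround can be compared against itertools.product. This is a sketch using the lists from the question; the names repeated_lasts, cycled_firsts, via_zip and via_product are made up for illustration.

```python
from itertools import product

lasts = ['x', 'y', 'z']
firsts = ['a', 'b', 'c']

# Repeat each element of lasts once per element of firsts: x x x y y y z z z
repeated_lasts = [val for val in lasts for _ in range(len(firsts))]
# Repeat the whole of firsts once per element of lasts: a b c a b c a b c
cycled_firsts = [val for _ in lasts for val in firsts]

via_zip = list(zip(repeated_lasts, cycled_firsts))
via_product = list(product(lasts, firsts))

print(via_zip == via_product)  # True
```

Both produce the same nine pairs in the same order, which suggests product is the more direct way to say it.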

Chained Comparison with Loop Explanation in Python

Beginner Here! I came across some python code about the zip() function being combined with the sum() function, but the code does not make sense to me and I was wondering if I could get an explanation:
list_1 = ['a', 'a', 'a', 'b']
list_2 = ['a', 'b', 'b', 'b', 'c']
print(sum(a != b for a, b in zip(list_1, list_2)))
a and b are not defined, but are being compared? Is it also looping through "a" with b for a? What is a and b in this case? How are they being added together with sum()? What is being looped through? If I can have some help understanding this, it would be greatly appreciated.
Thanks in advance!
When confronted with code like this, it's helpful to break it into bite-sized pieces and see what each does. Here's an annotated version:
list_1 = ['a', 'a', 'a', 'b']
list_2 = ['a', 'b', 'b', 'b', 'c']
print(list(zip(list_1, list_2))) # you need to pass this to list() because zip is a lazy iterator
# corresponding pairs from each list
# zip() truncates to the shortest list, so `c` is ignored
# [('a', 'a'), ('a', 'b'), ('a', 'b'), ('b', 'b')]
print([(a, b) for a, b in zip(list_1, list_2)])
# same thing as above using a list comprehension
# loops over each pair in the zip and makes a tuple of (a,b)
print([a != b for a, b in zip(list_1, list_2)])
# [False, True, True, False]
# compare each item in those pairs. Are they different?
print(sum(a != b for a, b in zip(list_1, list_2)))
# 2
# take advantage of the fact that True = 1 and False = 0
# and sum those up -> 0 + 1 + 1 + 0
It's also helpful to look up things like zip() and list comprehensions, although for many people they make more sense when seen in action.
The for construct in the code is a generator expression. The same syntax appears inside the brackets of a list comprehension, but without the brackets it can be passed directly to a function that accepts an iterable, such as sum.
If you want to see what the generator actually produces, you can do:
x = [a != b for a, b in zip(list_1, list_2)]
This is equivalent to:
x = list(a != b for a, b in zip(list_1, list_2))
The values in the list x are bool values that are True where values in the two lists compare unequal and False where they compare equal. If one list is longer than the other, the values past the shorter list are skipped.
Back to your code, instead of creating a list, it's left as a generator and passed directly to sum, which will operate on any iterable. Since bool values are just integers (with False = 0 and True = 1), this just sums the number of differing values between the two lists (ignoring the extra values in the longer list).
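A short sketch of the two facts this relies on: bool values behave as integers, and the generator expression does the same work as an explicit counting loop. The names mismatches and count are made up for illustration.

```python
# bool is a subclass of int, so True and False behave as 1 and 0 in arithmetic
print(isinstance(True, int))  # True
print(True + True)            # 2

list_1 = ['a', 'a', 'a', 'b']
list_2 = ['a', 'b', 'b', 'b', 'c']

# The generator expression yields one bool per pair; sum adds them up
mismatches = sum(a != b for a, b in zip(list_1, list_2))
print(mismatches)  # 2

# An equivalent explicit loop
count = 0
for a, b in zip(list_1, list_2):
    if a != b:
        count += 1
print(count)  # 2
```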

partial intersection - multiple groups

I am not sure how to approach my problem, thus I haven't been able to see if it already exists (apologies in advance)
Group Item
A 1
A 2
A 3
B 1
B 3
C 1
D 2
D 3
I want to know all combinations of groups that share more than X items (2 in this example). And I want to know which items they share.
RESULT:
A-B: 2 (item 1 and item 3)
A-D: 2 (item 2 and item 3)
The list of groups and items is really long and the maximum number of item matches across groups is probably not more than 3-5.
NB More than 2 groups can have shared items - e.g. A-B-E: 3
So it's not sufficient to only compare two groups at a time. I need to compare all combination of groups.
My thoughts
First round: one pile of all groups - are at least two values shared amongst all?
Second round: All-1 group (all combinations)
Third round: All-2 groups (all combinations)
Until I reach the comparison between only two groups (all combinations).
However this seems super heavy performance-wise!! And I have no idea of how to do this.
What are your thoughts?
Thanks!
Unless you have additional information to restrict the search, I would just process all subsets (having size >= 2) of the set of unique groups.
For each subset, I would search the items belonging to all members of the set:
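from itertools import chain, combinations

a = df['Group'].unique()
# every subset of groups with at least two members
for cols in chain(*(combinations(a, i) for i in range(2, len(a) + 1))):
    vals = df['Item'].unique()
    # narrow vals down to the items present in every group of the subset
    for col in cols:
        vals = df.loc[(df.Group==col)&(df.Item.isin(vals)), 'Item'].unique()
    if len(vals) > 0:
        print(cols, vals)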
it gives:
('A', 'B') [1 3]
('A', 'C') [1]
('A', 'D') [2 3]
('B', 'C') [1]
('B', 'D') [3]
('A', 'B', 'C') [1]
('A', 'B', 'D') [3]
This is how I would approach the problem. It may not be the most efficient way to deal with it, but it has the merit of being clear.
For each group, list all the items it contains.
Then, for each pair of groups, list all shared items (for instance, for each item of group A, check whether it is also an item of group B).
Check whether the number of shared items is higher than your threshold X.
It's not an off-the-shelf function, but it should be rather easy (or at least a good exercise) to implement.
Have fun !
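The steps above can be sketched in plain Python with sets. This is one possible reading rather than a definitive implementation: the names rows, items_by_group, shared and MIN_SHARED are made up, the threshold is assumed to mean "at least 2 shared items" (which matches the expected result in the question), and only pairs of groups are compared.

```python
from collections import defaultdict
from itertools import combinations

# The (group, item) rows from the question
rows = [("A", 1), ("A", 2), ("A", 3), ("B", 1), ("B", 3),
        ("C", 1), ("D", 2), ("D", 3)]
MIN_SHARED = 2  # assumed threshold: keep pairs sharing at least this many items

# Step 1: for each group, collect the set of items it contains
items_by_group = defaultdict(set)
for group, item in rows:
    items_by_group[group].add(item)

# Steps 2 and 3: intersect every pair of groups and apply the threshold
shared = {}
for g1, g2 in combinations(sorted(items_by_group), 2):
    common = items_by_group[g1] & items_by_group[g2]
    if len(common) >= MIN_SHARED:
        shared[(g1, g2)] = sorted(common)

print(shared)  # {('A', 'B'): [1, 3], ('A', 'D'): [2, 3]}
```

Covering combinations of three or more groups would mean looping over larger combinations and intersecting all of their sets.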
Here is a new solution that will work with all combinations.
Steps:
Build a "grouped" series that lists, for each item, all the groups the item is in.
From each row of "grouped", generate all possible combinations of groups that share that item.
For each combination, count the items in "grouped" whose group list contains it; if there are 2 or more common items, add the combination to a dictionary.
Note: this only loops through group combinations that have at least one common item, so if you have lots of groups it already filters out the large share of possible combinations that have nothing in common.
import pandas as pd
from itertools import combinations

d = {
    "Group": "A,A,A,B,B,C,D,D".split(","),
    "Item": [1,2,3,1,3,1,2,3]
}
df = pd.DataFrame(d)

# for each item, the list of groups that contain it
grouped = df.groupby("Item")["Group"].apply(list)

# every combination (size >= 2) of groups that share at least one item;
# a set removes duplicates and allows combinations of different sizes
all_combinations_with_common = sorted({
    comb
    for item in grouped
    for i in range(2, len(item) + 1)
    for comb in combinations(sorted(item), i)
})

commons = {}
REPEAT_COUNT = 2
for comb in all_combinations_with_common:
    # True for each item whose group list contains every group in comb
    items = grouped.apply(lambda x: set(comb).issubset(x))
    if items.sum() >= REPEAT_COUNT:
        commons["-".join(comb)] = grouped[items].index.values
print(commons)
output
{'A-B': array([1, 3]), 'A-D': array([2, 3])}

Check if there are 2 or 3 elements with same value in a list/tuple/etc.

I have a 5-element list and I want to know if there are 2 or 3 equal elements (or two equal and three equal). This "check" would be a part of if condition. Let's say I'm too lazy or stupid to write:
if (a==b and c==d and c==e) or .......... or .........
i know it might be written like this, but not exactly how:
if (a==b and (c==b and (c==e or ....
How do I do it? I also know that you can write something similar to this:
if (x,y for x in [5element list] for y in [5element list] x==y, x not y:
If you just want to check for multiple occurrences and the objects are of a hashable type, the following solution could work for you.
You could create a list of your objects
>>> l = [a, b, c, d, e]
Then, you could create a set from the same list and compare the length of the list and the length of the set. If the set has fewer elements than the list, you know you must have multiple occurrences.
>>> if len(set(l)) < len(l):
...
Use count. You just want [i for i in myList if myList.count(i) > 1]. This list contains the repeated elements; if it's non-empty, you have repeated elements.
Edit: SQL != python, removed 'where'. Also, this will get slow for bigger lists because count scans the whole list for every element, but for 5 elements it'll be fine.
You can use collections.Counter, which counts the occurrence of every element in your list.
Once you have the count, just check that your wanted values (2 or 3) are present among the counts.
from collections import Counter
my_data = ['a', 'b', 'a', 'c', 'd']
c = Counter(my_data)
counts = set(c.values())
print(2 in counts or 3 in counts)
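Since the question also asks about "two equal and three equal", the same Counter idea can report both cases separately. This is a sketch; the helper name describe_repeats and the sample lists are made up for illustration.

```python
from collections import Counter

def describe_repeats(values):
    """Report whether any value occurs exactly twice and/or exactly three times."""
    counts = set(Counter(values).values())
    return {"pair": 2 in counts, "triple": 3 in counts}

print(describe_repeats(['a', 'b', 'a', 'c', 'd']))  # {'pair': True, 'triple': False}
print(describe_repeats(['a', 'a', 'a', 'b', 'b']))  # {'pair': True, 'triple': True}
print(describe_repeats(['a', 'b', 'c', 'd', 'e']))  # {'pair': False, 'triple': False}

# Used as part of an if condition, as in the question
hand = ['a', 'a', 'a', 'b', 'c']
result = describe_repeats(hand)
if result["pair"] or result["triple"]:
    print("found 2 or 3 equal elements")
```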

How to copy data in Python

After entering a command I am given data, that I then transform into a list. Once transformed into a list, how do I copy ALL of the data from that list [A], and save it - so when I enter a command and am given a second list of data [B], I can compare the two; and have data that is the same from the two lists cancel out - so what is not similar between [A] & [B] is output. For example...
List [A]
1
2
3
List [B]
1
2
3
4
Using Python, I now want to compare the two lists to each other, and then output the differences.
Output = 4
Hopefully this makes sense!
You can use set operations.
a = [1,2,3]
b = [1,2,3,4]
print(set(b) - set(a))
To output the data in list format, you can use the following print statement:
print(list(set(b) - set(a)))
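Note that set(b) - set(a) only reports what is in b but missing from a. If "what is not similar between [A] & [B]" should cover both directions, the symmetric difference operator ^ may be closer to what is wanted. A small sketch, where the third list c is made up for illustration:

```python
a = [1, 2, 3]
b = [1, 2, 3, 4]

# Elements of b that are not in a
print(set(b) - set(a))  # {4}

# Elements that are in exactly one of the two lists, whichever it is
c = [0, 1, 2, 3]
print(sorted(set(b) ^ set(c)))  # [0, 4]
```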
>>> b=[1,2,3,4]
>>> a=[1,2,3]
>>> [x for x in b if x not in a]
[4]
for element in b:
    if element in a:
        a.remove(element)
This modifies a in place and leaves the result as a list, not a set. It also takes duplicates into account, because each element of b removes at most one matching element from a. That way [1,2,1] - [1,2] gives [1], not [].
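A duplicate-aware difference that leaves both lists untouched can also be sketched with collections.Counter. This swaps in a different technique from the loop above, and the name difference is made up for illustration.

```python
from collections import Counter

a = [1, 2, 1]
b = [1, 2]

# Counter subtraction keeps only positive counts, so each element of b
# cancels at most one matching element of a; a and b are left unmodified
difference = list((Counter(a) - Counter(b)).elements())
print(difference)  # [1]
```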
Try itertools.zip_longest (named izip_longest in Python 2):
import itertools
a = [1,2,3]
b = [1,2,3,4]
[y for x, y in itertools.zip_longest(a, b) if x != y]
# [4]
You could easily modify this further to return a tuple for each difference, where the first item in the tuple is the position in b and the second item is the value.
[(i, pair[1]) for i, pair in enumerate(itertools.zip_longest(a, b)) if pair[0] != pair[1]]
# [(3, 4)]
For entering the data use a loop:
def enterList():
    result = []
    while True:
        value = input()  # raw_input() in Python 2
        if value:
            result.append(value)
        else:
            # an empty line ends the list
            return result
A = enterList()
B = enterList()
For comparing you can use zip to build pairs and compare each of them:
for a, b in zip(A, B):
    if a != b:
        print(a, "!=", b)
This will truncate the comparison at the length of the shorter list; use the solution from another answer given here, based on itertools.zip_longest() (izip_longest() in Python 2), to handle that.
