I want to compare tuples in two or more lists and print their intersection. Every tuple has 25 elements (some of them empty), and the number of tuples varies from list to list.
So far I have tried taking the intersection of two lists, using the code below:
res_final = set(tuple(x) for x in res).intersection(set(tuple(x) for x in res1))
output:
set()
(res and res1 are my lists)
Hope this example helps:
import numpy as np
np.random.seed(0) # random seed for repeatability
a_ = np.random.randint(15,size=(1000,2)) # create random data for tuples
b_ = np.random.randint(15,size=(1000,2)) # create random data for tuples
a, b = set(tuple(d) for d in a_), set(tuple(d) for d in b_) # set of tuples
intersection = a&b # intersection
print(intersection) # result
In the code, matrices of random variables are created, then the rows are converted to tuples. Then we get the set of tuples and finally the important part for you, the intersection of the tuples.
If your input looks something like this:
in_1 = [(1, 1), (2, 2), (3, 3)]
in_2 = [(4, 4), (5, 5), (1, 1)]
in_3 = [(6, 6), (7, 7), (1, 1)]
ins = [in_1, in_2, in_3]
then I think you can use itertools.combinations to find pairwise intersections, and then take a set from them in order to remove duplicates.
from itertools import combinations
intersected = []
for first, second in combinations(ins, 2):
    elems = set(first).intersection(set(second))
    intersected.extend(elems)
dedup_intersected = set(intersected)
print(dedup_intersected)
# {(1, 1)}
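Note that the pairwise approach keeps any tuple shared by at least two of the lists. If you only want tuples that appear in every list, a minimal sketch (reusing the same sample data; `common_to_all` is just a name picked for illustration, and `ins` is assumed to be non-empty) is to hand all the sets to set.intersection in one call:

```python
in_1 = [(1, 1), (2, 2), (3, 3)]
in_2 = [(4, 4), (5, 5), (1, 1)]
in_3 = [(6, 6), (7, 7), (1, 1)]
ins = [in_1, in_2, in_3]

# Convert each list to a set, then intersect all of the sets at once.
common_to_all = set.intersection(*(set(lst) for lst in ins))
print(common_to_all)  # {(1, 1)}
```

With this sample data both approaches give the same result, but they differ as soon as a tuple is shared by only two of the three lists.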
I am a newbie in Python and programming. I am trying to generate combinations and weed out the ones that meet certain conditions.
In the case below, I have generated all possible combinations of the numbers 1-100, but I don't know where to go from here.
import itertools
i_list = []
for i in range (1, 101):
    i_list.append(i)
comb = itertools.combinations(i_list,2)
for combinations in list(comb):
    print (combinations)
This runs fine and will generate a list from 1-100, and give me an output of
(1,2) (1,3).........(98,99) (98,100) (99,100)
Now my goal is to weed out the combinations with a difference < 5. For example, (1,2) has a difference of less than 5, so it should not be output; (1,8) has a difference greater than 5, so it should be output. I hope that makes sense.
Can anyone guide me through the thought process and suggest an easy approach?
You can use itertools.filterfalse for this and then iterate over the result.
Also, with iterators, you should hold off on calling list() until you really need a list. There's no reason to do that here, because you only ever iterate. Skipping the conversion lets you work with very large sets without spending the memory and time to run through the iterator just to build a list that you then iterate over again:
from itertools import combinations, filterfalse
comb = combinations(range(1, 101),2)
filtered = filterfalse(lambda x: abs(x[0] - x[1]) < 5, comb)
for combinations in filtered:
    print (combinations)
The objects produced by range(), combinations and filterfalse are all lazy, so they never start evaluating until you start looping over them. This allows you to defer any work until it needs to be done or to iterate over part of a large set without calculating the entire thing.
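As a small illustration of that laziness, here is a sketch that pulls only the first three surviving pairs with itertools.islice (the name `first_three` is chosen here for illustration). Nothing beyond the third match is ever generated:

```python
from itertools import combinations, filterfalse, islice

comb = combinations(range(1, 101), 2)
filtered = filterfalse(lambda x: abs(x[0] - x[1]) < 5, comb)

# islice stops pulling from the pipeline after three items have come through.
first_three = list(islice(filtered, 3))
print(first_three)  # [(1, 6), (1, 7), (1, 8)]
```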
You can use a list comprehension to restrict the generated values to be kept inside the list:
from itertools import combinations
comb = [ x for x in combinations(range(1,101),2) if x[1]-x[0]>4 ]
print (comb)
Output:
[(1, 6), (1, 7), (1, 8), ... snip ..., (93, 99), (93, 100), (94, 99), (94, 100), (95, 100)]
combinations emits each pair in the order of its input, and range is ascending, so x[1] is always greater than x[0] and no abs() is needed around x[1]-x[0]. The condition if x[1]-x[0]>4 then keeps only the pairs you want.
This should accomplish what you are asking:
>>> import itertools
>>> combinations = itertools.combinations(range(1, 101), 2)
>>> generator = ((a, b) for a, b in combinations if b - a >= 5)
>>> for pair in generator:
...     print(pair, end=' ')
(1, 6) (1, 7) (1, 8) (1, 9) (1, 10) (1, 11) (1, 12) (1, 13) (1, 14) (1, 15) ...
Alternatively, you can try this instead to do the exact same thing:
>>> generator = ((a, b) for a in range(1, 96) for b in range(a + 5, 101))
>>> for pair in generator:
...     print(pair, end=' ')
(1, 6) (1, 7) (1, 8) (1, 9) (1, 10) (1, 11) (1, 12) (1, 13) (1, 14) (1, 15) ...
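If you want to convince yourself that the two variants really agree, a quick sketch is to materialize both (fine at this size) and compare them. The names `filtered` and `direct` are only for illustration:

```python
from itertools import combinations

# Variant 1: generate every pair, then filter on the difference.
filtered = [(a, b) for a, b in combinations(range(1, 101), 2) if b - a >= 5]

# Variant 2: only ever generate pairs that already satisfy the condition.
direct = [(a, b) for a in range(1, 96) for b in range(a + 5, 101)]

print(filtered == direct)  # True
print(len(direct))         # 4560
```

The second variant does less work because it never produces the pairs that would be thrown away.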
Given string = 'january', how can I generate the following cases?
Case 1 (replacing 1 character): take j and replace it with every ASCII letter (a-z), then do the same with a, n, u, a, r, y.
Basically we would have
(Aanuary, Banuary, ..., Zanuary) + (jAnuary, jBnuary, ..., jZnuary) + ... + (januarA, januarB, ..., januarZ)
I have done this part using the following code. However, I have no idea how to do it for more than one letter, since there are so many permutations.
monthName= 'january'
asci_letters = ['a' , 'b' , .... , 'z']
lst = list(monthName)
indxs = [i for i , _ in enumerate(monthName)]
oneLetter=[]
for i in indxs:
    word = monthName
    pos = list(word)
    for j in asci_letters:
        pos[i] = j
        changed = ("".join(pos))
        oneLetter.append(changed)
Case 2: taking 2 characters and replacing them:
(AAnuary, ABnuary, ..., AZnuary) + (BAnuary, BBnuary, ..., BZnuary) + (AaAuary, AaBuary, ..., AaZuary) + ... + (januaAB, ..., januaAZ)
Case 3: doing the same for 3 characters
Case 7: doing the same for 7 characters (the length of the string)
To summarize, I want to create all possible cases of replacing 1 letter, 2 letters, 3 letters, up to all letters of a string.
It's very likely that you can't hold all these permutations in memory because it will quickly become very crowded.
But to get all indices for the cases you can use itertools.combinations. For 1 it will give the single indices:
from itertools import combinations
string_ = 'january'
length = len(string_)
print(list(combinations(range(length), 1)))
# [(0,), (1,), (2,), (3,), (4,), (5,), (6,)]
Likewise you can get the indices for case 2-7:
print(list(combinations(range(length), 2)))
# [(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6), (1, 2), (1, 3), (1, 4),
# (1, 5), (1, 6), (2, 3), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6),
# (4, 5), (4, 6), (5, 6)]
Then it's just a matter of inserting the itertools.product of string.ascii_uppercase at the given indices:
from itertools import product
import string
print(list(product(string.ascii_uppercase, repeat=1)))
# [('A',), ('B',), ('C',), ('D',), ('E',), ('F',), ('G',), ('H',), ('I',),
# ('J',), ('K',), ('L',), ('M',), ('N',), ('O',), ('P',), ('Q',), ('R',),
# ('S',), ('T',), ('U',), ('V',), ('W',), ('X',), ('Y',), ('Z',)]
Likewise for different repeats given the "case".
Putting this all together:
def all_combinations(a_string, case):
    lst = list(a_string)
    length = len(lst)
    for combination in combinations(range(length), case):
        for inserter in product(string.ascii_uppercase, repeat=case):
            return_string = lst.copy()
            for idx, newchar in zip(combination, inserter):
                return_string[idx] = newchar
            yield ''.join(return_string)
Then you can get all desired permutations for each case by:
list(all_combinations('january', 2)) # case2
list(all_combinations('january', 4)) # case4
list(all_combinations('january', 7)) # case7
Or if you need all of them:
res = []
for case in [1, 2, 3, 4, 5, 6, 7]:
    res.extend(all_combinations('january', case))
But that will require a lot of memory.
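To get a feel for how much memory that would take, here is a rough sketch that counts the strings per case without generating them. It assumes a 26-letter alphabet and Python 3.8+ (for math.comb); the variable names are only illustrative:

```python
from math import comb  # available in Python 3.8+

length = len('january')
alphabet_size = 26

total = 0
for case in range(1, length + 1):
    # Choose `case` positions, then one of 26 letters for each chosen position.
    count = comb(length, case) * alphabet_size ** case
    print(case, count)
    total += count

print(total)  # 10460353202, which equals 27**7 - 1
```

That is over ten billion strings of seven characters each, so iterating over the generator and handling one string at a time is the safer route.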
You can use itertools.combinations_with_replacement for this, which gives you a lazy iterator over the letter combinations (each selection appears once, in sorted order):
from itertools import combinations_with_replacement
# First param is an iterable of possible values, second the length of the
# resulting combinations
combinations = combinations_with_replacement('ABCDEFGHIJKLMNOPQRSTUVWXYZ',7)
# Then you can iterate like this:
for combination in combinations:
    pass  # Do stuff here
Don't try to convert this iterator to a list of all values, because you will probably get a MemoryError.
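One thing worth checking before relying on this: combinations_with_replacement does not emit every ordering of the letters. A small sketch on a two-letter alphabet (the names `sorted_only` and `every_ordering` are just for illustration) shows how it differs from itertools.product:

```python
from itertools import combinations_with_replacement, product

letters = 'AB'

# Each selection is emitted once, in the order of the input: no 'BA'.
sorted_only = [''.join(t) for t in combinations_with_replacement(letters, 2)]
print(sorted_only)     # ['AA', 'AB', 'BB']

# product emits every ordering, including 'BA'.
every_ordering = [''.join(t) for t in product(letters, repeat=2)]
print(every_ordering)  # ['AA', 'AB', 'BA', 'BB']
```

So if every possible replacement string is needed, product(letters, repeat=n) is probably the closer fit.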
For the distance you might want to use the Python distance package (you need to install it via pip first).
For your case, where you want the combinations of the characters A-Z with lengths up to 7 (the length of January):
import distance
from itertools import combinations_with_replacement
str_to_compare_with = "JANUARY"
for i in range(len(str_to_compare_with)):
    combinations = combinations_with_replacement('ABCDEFGHIJKLMNOPQRSTUVWXYZ', i+1)
    # Then you can iterate like this:
    for combination in combinations:
        # This calculates the Hamming distance between the combination and the string you want to compare to.
        # Note: Hamming distance is only defined for strings of equal length.
        # What you do with the distance (save it to a file, etc.) is up to you.
        hamming_dist = distance.hamming(''.join(combination), str_to_compare_with)
This should do everything that you wanted with help of product and permutations:
from itertools import product, permutations
monthName= 'january'
letters = list('abcdefghijklmnopqrstuvwxyz')
n = len(monthName)
indxs = range(n)
mn = list(monthName)
cases = {k: [] for k in range(2, n+1)}
for num in range(2, n+1):
    letter_combos = list(product(*[letters for _ in range(num)]))
    positions = permutations(indxs, num)
    for p in positions:
        for l in letter_combos:
            l = iter(l)
            for i in p:
                mn[i] = next(l)
            mn = ''.join(mn)
            cases[num].append(mn)
            mn = list(monthName)
If you want to know how it is working, you can test this with a subset of letters, say from A-E:
x = []
for i in range(65,70): #subset of letters (A-E)
    x.append(chr(i))
def recurse(string,index,arr):
    if(index>len(string)-1):
        return
    for i in range(index,len(string)):
        for item in x:
            temp = string[:i]+item+string[i+1:]
            arr.append(temp)
            recurse(temp,i+1,arr)
arr = []
recurse('abc',0,arr)
print(arr)
I have a list of tuples (num, id):
l = [(1000, 1), (2000, 2), (5000, 3)]
The second element of each tuple contains the identifier. Say that I want to remove the tuple with the id of 2, how do I do that?
I.e. I want the new list to be: l = [(1000,1), (5000, 3)]
I have tried l.remove(2) but it won't work.
You can use a list comprehension with a filter to achieve this.
l = [(1000, 1), (2000, 2), (5000, 3)]
m = [(val, key) for (val, key) in l if key != 2]
Or using filter:
l = [(1000, 1), (2000, 2), (5000, 3)]
print(list(filter(lambda x: x[1] != 2, l)))
output:
[(1000, 1), (5000, 3)]
l.remove(2) fails because the value 2 itself is not an element of the list; the elements are tuples. Instead, build a list of the second elements of your tuples, find the position of 2 in it, and remove the tuple at that position:
del_pos = [x[1] for x in l].index(2)
l.pop(del_pos)
Note that this removes only the first such element. If your instance is not unique, then use one of the other solutions. I believe that this is faster, but handles only the single-appearance case.
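One caveat with the index-based version is that .index(2) raises ValueError when no tuple has that id. A minimal sketch of a variant that tolerates a missing id (`remove_first_by_id` is a hypothetical helper name, not something from the standard library):

```python
def remove_first_by_id(pairs, target_id):
    """Remove the first (num, id) tuple whose id equals target_id, in place.

    Returns the removed tuple, or None if no tuple has that id.
    """
    for pos, (_, id_) in enumerate(pairs):
        if id_ == target_id:
            return pairs.pop(pos)
    return None

l = [(1000, 1), (2000, 2), (5000, 3)]
print(remove_first_by_id(l, 2))  # (2000, 2)
print(l)                         # [(1000, 1), (5000, 3)]
print(remove_first_by_id(l, 9))  # None
```

Like the pop-based version, it stops at the first match and does not build an intermediate list of ids.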
Simple list comprehension:
[x for x in l if x[1] != 2]
You may do it with:
>>> l = [(1000, 1), (2000, 2), (5000, 3)]
>>> for i in list(l):
...     if i[1] == 2:
...         l.remove(i)
In case you want to remove only the first occurrence, add a break below the remove line.
One more possible solution (note that it drops any tuple that contains 2 in either position):
r = [(1000, 1), (2000, 2), (5000, 3)]
t = [i for i in r if i.count(2) == 0]
As part of learning Python I have set myself some challenges to see the various ways of doing things. My current challenge is to create a list of pairs using a list comprehension. Part one is to make a list of pairs where x and y must not be the same (x != y) and order matters ((x, y) is not equal to (y, x)).
return [(x,y) for x in listOfItems for y in listOfItems if not x==y]
Using my existing code, is it possible to modify it so that if (x,y) already exists in the list as (y,x), it is excluded from the results? I know I could compare items afterwards, but I want to see how much control you can have with a list comprehension.
I am using Python 2.7.
You should use a generator function here:
def func(listOfItems):
    seen = set() #use set to keep track of already seen items, sets provide O(1) lookup
    for x in listOfItems:
        for y in listOfItems:
            if x!=y and (y,x) not in seen:
                seen.add((x,y))
                yield x,y
>>> lis = [1,2,3,1,2]
>>> list(func(lis))
[(1, 2), (1, 3), (1, 2), (2, 3), (1, 2), (1, 3), (1, 2), (2, 3)]
def func(seq):
    seen_pairs = set()
    all_pairs = ((x,y) for x in seq for y in seq if x != y)
    for x, y in all_pairs:
        if ((x,y) not in seen_pairs) and ((y,x) not in seen_pairs):
            yield (x,y)
            seen_pairs.add((x,y))
Alternatively, you can use a generator expression (here: all_pairs), which is like a list comprehension but lazily evaluated. Generator expressions are very helpful, especially when iterating over combinations, products, etc.
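If no seen-set bookkeeping is wanted at all, another possible sketch leans on itertools.combinations, which never yields both (x, y) and (y, x). This assumes Python 3.7+ (rather than the 2.7 mentioned in the question) and hashable items; `unordered_pairs` is a name made up for this example:

```python
from itertools import combinations

def unordered_pairs(items):
    # dict.fromkeys drops duplicate items while keeping first-seen order.
    distinct = list(dict.fromkeys(items))
    # combinations pairs each item only with the items that come after it.
    return list(combinations(distinct, 2))

print(unordered_pairs([1, 2, 3, 1, 2]))  # [(1, 2), (1, 3), (2, 3)]
```

Because duplicates are removed up front, x != y holds automatically for every pair.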
Using product and ifilter as well as the unique_everseen recipe from itertools
>>> x = [1, 2, 3, 1, 2]
>>> x = product(x, x)
>>> x = unique_everseen(x)
>>> x = ifilter(lambda z: z[0] != z[1], x)
>>> for y in x:
...     print y
...
(1, 2)
(1, 3)
(2, 1)
(2, 3)
(3, 1)
(3, 2)
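The session above is Python 2. For reference, a rough Python 3 equivalent might look like the sketch below, where the built-in filter replaces ifilter and unique_everseen is written out in a simplified form (no key argument), since the recipe is not importable from itertools itself:

```python
from itertools import product

def unique_everseen(iterable):
    # Simplified version of the itertools recipe (no key function).
    seen = set()
    for element in iterable:
        if element not in seen:
            seen.add(element)
            yield element

x = [1, 2, 3, 1, 2]
pairs = filter(lambda z: z[0] != z[1], unique_everseen(product(x, x)))
result = list(pairs)
print(result)
# [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)]
```

As in the original output, both (1, 2) and (2, 1) are kept; only exact duplicate pairs are removed.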