extracting item with most common probability in python list - python

I have a list [[1, 2, 7], [1, 2, 3], [1, 2, 3, 7], [1, 2, 3, 5, 6, 7]] and I need [1,2,3,7] as the final result (this is a kind of reverse engineering). One approach is to check the pairwise intersections:
while(i<dlistlen):
    j=i+1
    while(j<dlistlen):
        il = dlist1[i]
        jl = dlist1[j]
        tmp = list(set(il) & set(jl))
        print tmp
        #print i,j
        j=j+1
    i=i+1
This gives me the following output:
[1, 2]
[1, 2, 7]
[1, 2, 7]
[1, 2, 3]
[1, 2, 3]
[1, 2, 3, 7]
[]
It looks like I am close to getting [1,2,3,7] as my final answer, but I can't figure out how to get there. Please note that the first list ([[1, 2, 7], [1, 2, 3], [1, 2, 3, 7], [1, 2, 3, 5, 6, 7]]) may contain more items, leading to one more final answer besides [1,2,3,7]. For now, though, I only need to extract [1,2,3,7].
Please note that this is not homework; I am creating my own clustering algorithm that fits my needs.

You can use the Counter class to keep track of how often elements appear.
>>> from itertools import chain
>>> from collections import Counter
>>> l = [[1, 2, 7], [1, 2, 3], [1, 2, 3, 7], [1, 2, 3, 5, 6, 7]]
>>> #use chain(*l) to flatten the lists into a single list
>>> c = Counter(chain(*l))
>>> print c
Counter({1: 4, 2: 4, 3: 3, 7: 3, 5: 1, 6: 1})
>>> #sort keys in order of descending frequency
>>> sortedValues = sorted(c.keys(), key=lambda x: c[x], reverse=True)
>>> #show the four most common values
>>> print sortedValues[:4]
[1, 2, 3, 7]
>>> #alternatively, show the values that appear in more than 50% of all lists
>>> print [value for value, freq in c.iteritems() if float(freq) / len(l) > 0.50]
[1, 2, 3, 7]

It looks like you're trying to find the largest intersection of two list elements. This will do that:
from itertools import combinations
# convert all list elements to sets for speed
dlist = [set(x) for x in dlist]
intersections = (x & y for x, y in combinations(dlist, 2))
longest_intersection = max(intersections, key=len)
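For reference, here is the same approach as a self-contained sketch run against the list from the question. The variable names sets and longest are illustrative only.

```python
from itertools import combinations

dlist = [[1, 2, 7], [1, 2, 3], [1, 2, 3, 7], [1, 2, 3, 5, 6, 7]]

# Convert every sublist to a set so that & computes the intersection.
sets = [set(x) for x in dlist]

# Intersect every unordered pair and keep the largest result.
longest = max((a & b for a, b in combinations(sets, 2)), key=len)

print(sorted(longest))  # [1, 2, 3, 7]
```

If several pairs tie for the largest intersection, max returns the first one it encounters.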

Related

Split one list of numbers into several based on a simple condition, lead and lag in lists [duplicate]

Is there an easy way to split the list l below into 3 lists? I want to cut the list wherever the sequence starts over, so every list should start with 1.
l = [1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4]
l1 = [1, 2, 3, 4, 5]
l2 = [1, 2, 3, 4]
l3 = [1, 2, 3, 4]
My original thought was to look at the lead value and implement a condition inside a for loop that would cut the list when x.lead < x. But how do I use lead and lag with lists in Python?
NumPy solution
import numpy as np
l = [1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4]
parts = [list(i) for i in np.split(l,np.flatnonzero(np.diff(l)-1)+1)]
print(parts)
Output:
[[1, 2, 3, 4, 5], [1, 2, 3, 4], [1, 2, 3, 4]]
Explanation: numpy.diff gives the differences between adjacent elements. Subtracting 1 turns every ordinary step of +1 into 0, so numpy.flatnonzero returns the positions where the difference is anything other than 1. The output of numpy.diff is one element shorter than its input, so 1 is added to those positions to get the indices at which numpy.split should cut. Finally, each piece is converted to a list, as otherwise you would end up with NumPy arrays.
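The same steps can be traced in plain Python, which also shows one way to get a "lead" value from a list: pairing l with l[1:] via zip. This is only a sketch of the idea, not the NumPy code itself, and the names diffs, cuts and bounds are made up for illustration.

```python
l = [1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4]

# Differences between each element and its successor
# (the role numpy.diff plays above).
diffs = [nxt - cur for cur, nxt in zip(l, l[1:])]

# Positions where the step is not +1, shifted by one so they point at
# the first element of each new run.
cuts = [i + 1 for i, d in enumerate(diffs) if d != 1]

# Slice between consecutive cut points (the role of numpy.split).
bounds = [0] + cuts + [len(l)]
parts = [l[a:b] for a, b in zip(bounds, bounds[1:])]

print(cuts)   # [5, 9]
print(parts)  # [[1, 2, 3, 4, 5], [1, 2, 3, 4], [1, 2, 3, 4]]
```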
What about this:
l = [1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4]
one_indices = [i for i, e in enumerate(l) if e == 1]
slices = []
for count, item in enumerate(one_indices):
    if count == len(one_indices) - 1:
        slices.append((item, None))
    else:
        slices.append((item, one_indices[count + 1]))
sequences = [l[x[0] : x[1]] for x in slices]
print(sequences)
Out:
[[1, 2, 3, 4, 5], [1, 2, 3, 4], [1, 2, 3, 4]]
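Another way, without NumPy:
l = [1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4]
start = 0
newlist = []
for i,v in enumerate(l):
    # a 1 anywhere except at the very beginning starts a new sublist
    if i!=0 and v==1:
        newlist.append(l[start:i])
        start = i
# append the final run, which is not followed by another 1
newlist.append(l[start:i+1])
print(newlist)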
Working Demo: https://rextester.com/RYCV85570

How to weave two lists using recursion in Python

I want to weave two lists and output all the possible results.
For example,
input: two lists l1 = [1, 2], l2 = [3, 4]
output: [1, 2, 3, 4], [1, 3, 2, 4], [1, 3, 4, 2], [3, 1, 2, 4], [3, 1, 4, 2], [3, 4, 1, 2]
Note: I need to keep the order in each list (e.g. 1 is always before 2, and 3 is always before 4)
The way I am solving this is by removing the head from one list, recursing, and then doing the same thing with the other list. The code is below:
all_possibles = []
def weaveLists(first, second, added):
    if len(first) == 0 or len(second) == 0:
        res = added[:]
        res += first[:]
        res += second[:]
        all_possibles.append(res)
        return
    cur1 = first[0]
    added.append(cur1)
    first = first[1:]
    weaveLists(first, second, added)
    added = added[:-1]
    first = [cur1] + first
    cur2 = second[0]
    added.append(cur2)
    second = second[1:]
    weaveLists(first, second, added)
    added = added[:-1]
    second = [cur2] + second
weaveLists([1, 2], [3, 4], [])
print(all_possibles)
The result I got is:
[[1, 2, 3, 4], [1, 3, 2, 4], [1, 3, 4, 2], [1, 3, 1, 2, 4], [1, 3, 1, 4, 2], [1, 3, 1, 4, 1, 2]]
I couldn't figure out why for the last three lists, the heading 1 from the first list is not removed.
Can anyone help? Thanks!
The reason you get those unexpected results is that you mutate added at this place:
added.append(cur1)
...this will affect the caller's added list (unintentionally). While the "undo" operation is not mutating the list:
added = added[:-1]
This creates a new list, and therefore this "undo" action does not roll back the change in the list of the caller.
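The difference between mutating and rebinding can be seen in isolation with a small sketch. The function append_then_slice is hypothetical and exists only to demonstrate the effect.

```python
def append_then_slice(added):
    added.append(99)    # mutates the list object shared with the caller
    added = added[:-1]  # rebinds the local name to a new, shorter list
    return added

caller_list = [1]
local_result = append_then_slice(caller_list)

print(caller_list)   # [1, 99]  (the append leaked out to the caller)
print(local_result)  # [1]      (the "undo" only affected the local copy)
```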
The easy fix is to replace the call to append with:
added = added + [cur1]
And the same should happen in the second block.
It is easier if you pass the new values for the recursive call on-the-fly, and replace those two code blocks with just:
weaveLists(first[1:], second, added + [first[0]])
weaveLists(first, second[1:], added + [second[0]])
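Putting that together, a complete version might look like the sketch below. It differs from the question's code in two ways that are choices made here, not part of the answer above: the function is named weave_lists, and it returns the results instead of appending to a global list.

```python
def weave_lists(first, second, added=None):
    """Return every interleaving of first and second that keeps
    the internal order of each list."""
    if added is None:
        added = []
    # Once either list is exhausted, only one completion remains.
    if not first or not second:
        return [added + first + second]
    # Take the head of first, then the head of second. New lists are
    # built with +, so no caller's list is ever mutated.
    return (weave_lists(first[1:], second, added + [first[0]])
            + weave_lists(first, second[1:], added + [second[0]]))

result = weave_lists([1, 2], [3, 4])
for woven in result:
    print(woven)
```

For the inputs [1, 2] and [3, 4] this produces the six lists given as the expected output in the question.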
Here is another way to do it: we generate the possible indices of the items of the first list inside the weaved list, and fill the list accordingly.
We can generate the indices with itertools.combinations: it's the combinations of the indices of the weaved list, taking len(first_list) of them each time.
from itertools import combinations

def weave(l1, l2):
    total_length = len(l1) + len(l2)
    # indices at which to put items from l1 in the weaved output
    for indices in combinations(range(total_length), r=len(l1)):
        out = []
        it1 = iter(l1)
        it2 = iter(l2)
        for i in range(total_length):
            if i in indices:
                out.append(next(it1))
            else:
                out.append(next(it2))
        yield out
Sample run:
l1 = [1, 2]
l2 = [3, 4]

for w in weave(l1, l2):
    print(w)

[1, 2, 3, 4]
[1, 3, 2, 4]
[1, 3, 4, 2]
[3, 1, 2, 4]
[3, 1, 4, 2]
[3, 4, 1, 2]
Another sample run with a longer list:
l1 = [1, 2]
l2 = [3, 4, 5]

for w in weave(l1, l2):
    print(w)

[1, 2, 3, 4, 5]
[1, 3, 2, 4, 5]
[1, 3, 4, 2, 5]
[1, 3, 4, 5, 2]
[3, 1, 2, 4, 5]
[3, 1, 4, 2, 5]
[3, 1, 4, 5, 2]
[3, 4, 1, 2, 5]
[3, 4, 1, 5, 2]
[3, 4, 5, 1, 2]

Creating a new list based on their position

I have a list that contains
'3,2,5,4,1','3,1,2,5,4','2,5,1,4,3'
These numbers all belong to the same list (call it list 1), but each row is stored as a string.
In the "first row", 3 occurs at position 0, 2 at position 1, 5 at position 2, and so on.
In the "second row", 3 occurs at position 0, 1 at position 1, 2 at position 2, and so on.
I would like to write a loop, or anything else that does not rely on imported functions, that creates a final list which looks like
0: [3, 3, 2]
1: [2, 1, 5]
2: [5, 2, 1]
3: [4, 5, 4]
4: [1, 4, 3]
Transposition of a two-dimensional list can be done simply with zip():
In [1]: l = [[3,2,5,4,1],
   ...:      [3,1,2,5,4],
   ...:      [2,5,1,4,3]]
In [2]: t = list(zip(*l))
In [3]: t
Out[3]: [(3, 3, 2), (2, 1, 5), (5, 2, 1), (4, 5, 4), (1, 4, 3)]
To output that in the format described above:
In [4]: for n,line in enumerate(t):
   ...:     print("{}: {}".format(n, list(line)))
   ...:
0: [3, 3, 2]
1: [2, 1, 5]
2: [5, 2, 1]
3: [4, 5, 4]
4: [1, 4, 3]
A single line of code using a dictionary comprehension and a list comprehension:
>>> { col:[row[col] for row in l] for col in range(len(l[0])) }
=> {0: [3, 3, 2], 1: [2, 1, 5], 2: [5, 2, 1], 3: [4, 5, 4], 4: [1, 4, 3]}
# driver values:
IN : l = [[3,2,5,4,1],
          [3,1,2,5,4],
          [2,5,1,4,3]]
Note to OP: the structure of your desired output suggests a dictionary; a list cannot be written with explicit index labels like that.
Edit: since the OP's list is a list of strings, first convert it to a list of lists of ints using map, then continue as above.
>>> l = ['3,2,5,4,1', '3,1,2,5,4', '2,5,1,4,3']
>>> l = [list(map(int,s.split(','))) for s in l]
>>> l
=> [[3, 2, 5, 4, 1], [3, 1, 2, 5, 4], [2, 5, 1, 4, 3]]
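Since the question asks for a loop without imported functions, here is a sketch that starts from the strings and uses only plain loops and built-ins. The names rows, parsed and by_position are illustrative, and the code assumes every row has the same number of items.

```python
rows = ['3,2,5,4,1', '3,1,2,5,4', '2,5,1,4,3']

# Parse each comma-separated string into a list of ints.
parsed = []
for row in rows:
    parsed.append([int(item) for item in row.split(',')])

# Build one output list per position (assumes equal-length rows).
by_position = []
for pos in range(len(parsed[0])):
    by_position.append([row[pos] for row in parsed])

for pos, values in enumerate(by_position):
    print("{}: {}".format(pos, values))
```

This prints the five lines given as the desired output in the question.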

Python: Find-replace on lists

I first want to note that my question is different from what's in this link:
finding and replacing elements in a list (python)
What I want to ask is whether there is some known API or conventional way to achieve such a functionality (If it's not clear, a function/method like my imaginary list_replace() is what I'm looking for):
>>> list = [1, 2, 3]
>>> list_replace(list, 3, [3, 4, 5])
>>> list
[1, 2, 3, 4, 5]
An API that can limit the number of replacements would be better:
>>> list = [1, 2, 3, 3, 3]
>>> list_replace(list, 3, [8, 8], 2)
>>> list
[1, 2, 8, 8, 8, 8, 3]
Another optional improvement would be for the item being replaced to be a list itself, rather than a single value:
>>> list = [1, 2, 3, 3, 3]
>>> list_replace(list, [2, 3], [8, 8], 2)
>>> list
[1, 8, 8, 3, 3]
Is there any API that looks at least similar and performs these operations, or should I write it myself?
Try:
def list_replace(ls, val, l_insert, num = 1):
    l_insert_len = len(l_insert)
    indx = 0
    for i in range(num):
        indx = ls.index(val, indx) # raises ValueError if val is not found
        ls = ls[:indx] + l_insert + ls[(indx + 1):]
        indx += l_insert_len
    return ls
This function works for both the first and the second case.
It won't work with your third requirement.
Demo
>>> list = [1, 2, 3]
>>> list_replace(list, 3, [3, 4, 5])
[1, 2, 3, 4, 5]
>>> list = [1, 2, 3, 3, 3]
>>> list_replace(list, 3, [8, 8], 2)
[1, 2, 8, 8, 8, 8, 3]
Note
It returns a new list; The list passed in will not change.
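If in-place behaviour like that in the question is wanted, one option is to assign the result back through a full slice, which replaces the contents of the existing list object. The wrapper name list_replace_in_place is hypothetical, and list_replace is repeated from above so the sketch runs on its own.

```python
def list_replace(ls, val, l_insert, num=1):
    l_insert_len = len(l_insert)
    indx = 0
    for i in range(num):
        indx = ls.index(val, indx)  # raises ValueError if val is not found
        ls = ls[:indx] + l_insert + ls[(indx + 1):]
        indx += l_insert_len
    return ls

def list_replace_in_place(ls, val, l_insert, num=1):
    # Slice assignment mutates the caller's list instead of rebinding a name.
    ls[:] = list_replace(ls, val, l_insert, num)

values = [1, 2, 3, 3, 3]
list_replace_in_place(values, 3, [8, 8], 2)
print(values)  # [1, 2, 8, 8, 8, 8, 3]
```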
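How about this? It covers the first two requirements; with a list as the search pattern, only the first element of each match is dropped, as the last example below shows: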
def list_replace(origen,elem,new,cantidad=None):
    n=0
    resul=list()
    len_elem=0
    if isinstance(elem,list):
        len_elem=len(elem)
    for i,x in enumerate(origen):
        if x==elem or elem==origen[i:i+len_elem]:
            if cantidad and n<cantidad:
                resul.extend(new)
                n+=1
                continue
            elif not cantidad:
                resul.extend(new)
                continue
        resul.append(x)
    return resul
>>> list_replace([1,2,3,4,5,3,5,33,23,3],3,[42,42])
[1, 2, 42, 42, 4, 5, 42, 42, 5, 33, 23, 42, 42]
>>> list_replace([1,2,3,4,5,3,5,33,23,3],3,[42,42],2)
[1, 2, 42, 42, 4, 5, 42, 42, 5, 33, 23, 3]
>>> list_replace([1,2,3,4,5,3,5,33,23,3],[33,23],[42,42,42],2)
[1, 2, 3, 4, 5, 3, 5, 42, 42, 42, 23, 3]
Given this isn't hard to write, and not a very common use case, I don't think it will be in the standard library. What would it be named, replace_and_flatten? It's quite hard to explain what that does, and justify the inclusion.
Explicit is also better than implicit, so...
def replace_and_flatten(lst, searched_item, new_list):
    def _replace():
        for item in lst:
            if item == searched_item:
                yield from new_list  # element matches, yield all the elements of the new list instead
            else:
                yield item  # element doesn't match, yield it as is
    return list(_replace())  # convert the iterable back to a list
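The generator approach extends naturally to the replacement limit asked for in the question. The sketch below adds a limit parameter, which is an assumption made here and not part of the answer above; None means "replace every match". Like the original, it needs Python 3.3 or later for yield from.

```python
def replace_and_flatten(lst, searched_item, new_list, limit=None):
    """Return a new list in which matches of searched_item are
    replaced by the items of new_list, at most limit times."""
    def _replace():
        done = 0
        for item in lst:
            if item == searched_item and (limit is None or done < limit):
                done += 1
                yield from new_list  # splice in the replacement items
            else:
                yield item           # keep the element unchanged
    return list(_replace())

print(replace_and_flatten([1, 2, 3], 3, [3, 4, 5]))        # [1, 2, 3, 4, 5]
print(replace_and_flatten([1, 2, 3, 3, 3], 3, [8, 8], 2))  # [1, 2, 8, 8, 8, 8, 3]
```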
I developed my own function; you are welcome to use and review it.
Note that, in contrast to the examples in the question, my function creates and returns a new list. It does not modify the provided list.
Working examples:
list = [1, 2, 3]
l2 = list_replace(list, [3], [3, 4, 5])
print('Changed: {0}'.format(l2))
print('Original: {0}'.format(list))
list = [1, 2, 3, 3, 3]
l2 = list_replace(list, [3], [8, 8], 2)
print('Changed: {0}'.format(l2))
print('Original: {0}'.format(list))
list = [1, 2, 3, 3, 3]
l2 = list_replace(list, [2, 3], [8, 8], 2)
print('Changed: {0}'.format(l2))
print('Original: {0}'.format(list))
I also print the original list each time, so you can see that it is not modified:
Changed: [1, 2, 3, 4, 5]
Original: [1, 2, 3]
Changed: [1, 2, 8, 8, 8, 8, 3]
Original: [1, 2, 3, 3, 3]
Changed: [1, 8, 8, 3, 3]
Original: [1, 2, 3, 3, 3]
Now, the code (tested with Python 2.7 and with Python 3.4):
def list_replace(lst, source_sequence, target_sequence, limit=0):
    if limit < 0:
        raise Exception('A negative replacement limit is not supported')
    source_sequence_len = len(source_sequence)
    target_sequence_len = len(target_sequence)
    original_list_len = len(lst)
    if source_sequence_len > original_list_len:
        return list(lst)
    new_list = []
    i = 0
    replace_counter = 0
    while i < original_list_len:
        suffix_is_long_enough = source_sequence_len <= (original_list_len - i)
        limit_is_satisfied = (limit == 0 or replace_counter < limit)
        if suffix_is_long_enough and limit_is_satisfied:
            if lst[i:i + source_sequence_len] == source_sequence:
                new_list.extend(target_sequence)
                i += source_sequence_len
                replace_counter += 1
                continue
        new_list.append(lst[i])
        i += 1
    return new_list
I developed a function for you (it works for your 3 requirements):
def list_replace(lst,elem,repl,n=0):
    ii=0
    if type(repl) is not list:
        repl = [repl]
    if type(elem) is not list:
        elem = [elem]
    if type(elem) is list:
        length = len(elem)
    else:
        length = 1
    for i in range(len(lst)-(length-1)):
        if ii>=n and n!=0:
            break
        e = lst[i:i+length]
        if e==elem:
            lst[i:i+length] = repl
            if n!=0:
                ii+=1
    return lst
I've tried with your examples and it works ok.
Tests made:
print(list_replace([1,2,3], 3, [3, 4, 5]))
print(list_replace([1, 2, 3, 3, 3], 3, [8, 8], 2))
print(list_replace([1, 2, 3, 3, 3], [2, 3], [8, 8], 2))
NOTE: never use list as a variable name. It shadows the built-in type, and this function needs that object for the "is list" checks.

How can I get a set of (possibly overlapping) slices in a Python list based on elements that match a criteria?

Suppose I have a python list l=[1,2,3,4,5]. I would like to find all x-element lists starting with elements that satisfy a function f(e), or the sublist going to the end of l if there aren't enough items. For instance, suppose f(e) is e%2==0, and x=3 I'd like to get [[2,3,4],[4,5]].
Is there an elegant or "pythonic" way to do this?
>>> f = lambda e: e % 2 == 0
>>> x = 3
>>> l = [1, 2, 3, 4, 5]
>>> def makeSublists(lst, length, f):
...     for i in range(len(lst)):
...         if f(lst[i]):
...             yield lst[i:i+length]
...
>>> list(makeSublists(l, x, f))
[[2, 3, 4], [4, 5]]
>>> list(makeSublists(list(range(10)), 5, f))
[[0, 1, 2, 3, 4], [2, 3, 4, 5, 6], [4, 5, 6, 7, 8], [6, 7, 8, 9], [8, 9]]
Using a list comprehension:
>>> l = list(range(1,6))
>>> x = 3
>>> def f(e):
...     return e%2 == 0
...
>>> [l[i:i+x] for i, j in enumerate(l) if f(j)]
[[2, 3, 4], [4, 5]]
