So, I know I can get a random list from a population using the random module,
l = [0, 1, 2, 3, 4, 8 ,9]
print(random.sample(l, 3))
# [1, 3, 2]
But how do I get the list of the unselected items? Do I need to remove them manually from the list, or is there a method to get them too?
Edit: The list l in the example has no repeated items, but when it does, I don't want an item removed more times than it was selected in the sample.
l = [0, 1, 2, 3, 4, 8 ,9]
s1 = set(random.sample(l, 3))
s2 = set(l).difference(s1)
>>> s1
{0, 3, 8}
>>> s2
{1, 2, 4, 9}
Update: same items multiple times
You can shuffle a copy of your list first and then split it in two:
l = [7, 4, 5, 4, 5, 9, 8, 6, 6, 6, 9, 8, 6, 3, 8]
pop = l[:]
random.shuffle(pop)
pop1, pop2 = pop[:3], pop[3:]
>>> pop1
[8, 4, 9]
>>> pop2
[7, 6, 8, 6, 5, 6, 9, 6, 5, 8, 4, 3]
Because your list can contain the same item more than once, you can switch to the approach below:
import random
l = [0, 1, 2, 3, 4, 8 ,9]
random.shuffle(l)
selected = l[:3]
unselected = l[3:]
print(selected)
# [4, 0, 1]
print(unselected)
# [8, 2, 3, 9]
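If the original order of the unselected items matters, one option is to sample positions instead of values. This is only a sketch under that assumption, and the helper name split_sample is made up for illustration:

```python
import random

def split_sample(population, k):
    """Return (selected, unselected); both keep the population's order."""
    # Sample k distinct positions, so duplicate values count as separate items.
    picked = set(random.sample(range(len(population)), k))
    selected = [x for i, x in enumerate(population) if i in picked]
    unselected = [x for i, x in enumerate(population) if i not in picked]
    return selected, unselected

l = [7, 4, 5, 4, 5, 9, 8, 6, 6, 6]
selected, unselected = split_sample(l, 3)
print(selected)
print(unselected)
```

Note that selected comes back in population order here rather than in random order, and l itself is left untouched.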
If you want to keep track of duplicates, you could count the items of each type and compare the population count to the sample count.
If you don't care about the order of items in the population, you could do it like this:
from collections import Counter
import random
population = [1, 1, 2, 2, 9, 7, 9]
sample = random.sample(population, 3)
pop_count = Counter(population)
samp_count = Counter(sample)
unsampled = [
    k
    for k in pop_count
    for i in range(pop_count[k] - samp_count[k])
]
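The same count-and-compare idea can probably be written more compactly with Counter subtraction, which drops non-positive counts, and elements(), which expands the remaining counts back into items. A minimal sketch using the same example data:

```python
from collections import Counter
import random

population = [1, 1, 2, 2, 9, 7, 9]
sample = random.sample(population, 3)

# Subtract the sampled counts from the population counts,
# then expand what is left back into a flat list.
unsampled = list((Counter(population) - Counter(sample)).elements())

print(sample)
print(unsampled)
```

As with the comprehension above, the order of unsampled follows the Counter, not the original population.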
If you care about the order in the population, you could do something like this:
check = sample.copy()
unsampled = []
for val in population:
    if val in check:
        check.remove(val)
    else:
        unsampled.append(val)
Or there's this weird list comprehension (not recommended):
check = sample.copy()
unsampled = [
    x
    for x in population
    if x not in check or check.remove(x)
]
The if clause here uses two tricks:
both parts of the test are falsy if x is in check (x not in check is False, and list.remove() always returns None), and
remove() is only called if the first part fails, i.e., if x is in check.
Basically, if (and only if) x is in check, evaluation falls through to the second condition, which is also falsy (None) but has the side effect of removing one copy of x from check.
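To see the short-circuit behaviour without randomness getting in the way, here is a small trace with a hand-picked sample (the fixed sample is an assumption made purely so the output is reproducible):

```python
population = [1, 1, 2, 2, 9, 7, 9]
sample = [1, 9, 2]  # fixed instead of random.sample(), for a reproducible result

check = sample.copy()
unsampled = [
    x
    for x in population
    # If x is still in check, remove() runs, returns None, and x is skipped.
    if x not in check or check.remove(x)
]

print(unsampled)  # [1, 2, 7, 9]
print(check)      # []
```

Each sampled value is skipped exactly once, so the second 1, 2 and 9 survive.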
You can do it with:
import random
l = [0, 1, 2, 3, 4, 8 ,9]
rand = random.sample(l, 3)
rest = list(set(l) - set(rand))
print(f"initial list: {l}")
print(f"random list: {rand}")
print(f"rest list: {rest}")
Result:
initial list: [0, 1, 2, 3, 4, 8, 9]
random list: [2, 9, 0]
rest list: [8, 1, 3, 4]
This recursion is adapted from http://www.geeksforgeeks.org/print-all-possible-combinations-of-r-elements-in-a-given-array-of-size-n/ and does indeed print all possible unique combinations of arr with length r.
What I want is to save all the combinations in a list so that I can use this algorithm in another program. Why are the values in combArray overwritten during the recursion, and how do I fix this?
def combRecursive(arr, data, start, end, index, r, combArray):
    if index == r:
        combArray.append(data)
        return combArray
    i = start
    while True:
        if i > end or end - i + 1 < r - index:
            break
        data[index] = arr[i]
        combArray = combRecursive(arr, data, i + 1, end, index + 1, r, combArray)
        i += 1
    return combArray

def main():
    arr = [1, 2, 3, 4, 5]
    r = 3
    n = len(arr)
    data = [9999999, 9999999, 9999999]
    combArray = []
    combArray = combRecursive(arr, data, 0, n-1, 0, r, combArray)
    print("All possible unique combination is: ")
    for element in combArray:
        print(element)
Result as of now:
[3, 4, 5]
[3, 4, 5]
[3, 4, 5]
[3, 4, 5]
[3, 4, 5]
[3, 4, 5]
[3, 4, 5]
[3, 4, 5]
[3, 4, 5]
[3, 4, 5]
What I want:
[1, 2, 3]
[1, 2, 4]
[1, 2, 5]
[1, 3, 4]
[1, 3, 5]
[1, 4, 5]
[2, 3, 4]
[2, 3, 5]
[2, 4, 5]
[3, 4, 5]
You initialize data once, and from then on you keep modifying it and appending it to combArray. That means you always append the same list object, so every element of combArray ends up identical. If you want the elements to be distinct lists, you need to create a new list for each one you add to combArray (for example, by appending a copy of data).
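As a sketch of that fix, here is the question's function with a single change: the append stores data[:] (a shallow copy) instead of data itself. Everything else is kept as in the question; the placeholder [None] * r for the scratch buffer is my own choice.

```python
def combRecursive(arr, data, start, end, index, r, combArray):
    if index == r:
        combArray.append(data[:])  # append a copy, not the shared buffer
        return combArray
    i = start
    while True:
        if i > end or end - i + 1 < r - index:
            break
        data[index] = arr[i]
        combArray = combRecursive(arr, data, i + 1, end, index + 1, r, combArray)
        i += 1
    return combArray

arr = [1, 2, 3, 4, 5]
r = 3
result = combRecursive(arr, [None] * r, 0, len(arr) - 1, 0, r, [])
for element in result:
    print(element)
```

If you only need the combinations and not the recursion itself, itertools.combinations(arr, r) from the standard library yields the same combinations as tuples.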
I first want to note that my question is different from what's in this link:
finding and replacing elements in a list (python)
What I want to ask is whether there is some known API or conventional way to achieve such a functionality (If it's not clear, a function/method like my imaginary list_replace() is what I'm looking for):
>>> list = [1, 2, 3]
>>> list_replace(list, 3, [3, 4, 5])
>>> list
[1, 2, 3, 4, 5]
An API that limits the number of replacements would be even better:
>>> list = [1, 2, 3, 3, 3]
>>> list_replace(list, 3, [8, 8], 2)
>>> list
[1, 2, 8, 8, 8, 8, 3]
Another optional improvement would be for the value to replace to be a list itself, instead of a single value:
>>> list = [1, 2, 3, 3, 3]
>>> list_replace(list, [2, 3], [8, 8], 2)
>>> list
[1, 8, 8, 3, 3]
Is there any API that looks at least similar and performs these operations, or should I write it myself?
Try:
def list_replace(ls, val, l_insert, num = 1):
    l_insert_len = len(l_insert)
    indx = 0
    for i in range(num):
        indx = ls.index(val, indx) # raises ValueError if val is not found
        ls = ls[:indx] + l_insert + ls[(indx + 1):]
        indx += l_insert_len
    return ls
This function works for both the first and the second case.
It won't work with your third requirement.
Demo
>>> list = [1, 2, 3]
>>> list_replace(list, 3, [3, 4, 5])
[1, 2, 3, 4, 5]
>>> list = [1, 2, 3, 3, 3]
>>> list_replace(list, 3, [8, 8], 2)
[1, 2, 8, 8, 8, 8, 3]
Note
It returns a new list; The list passed in will not change.
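The examples in the question expect the list to be modified in place. A minimal sketch of that variant, assuming slice assignment is acceptable; the name list_replace_inplace is hypothetical and not part of any library:

```python
def list_replace_inplace(ls, val, l_insert, num=1):
    """Replace the first `num` occurrences of val in ls, mutating ls."""
    indx = 0
    for _ in range(num):
        indx = ls.index(val, indx)    # raises ValueError if val is not found
        ls[indx:indx + 1] = l_insert  # slice assignment changes ls itself
        indx += len(l_insert)         # skip past what was just inserted

nums = [1, 2, 3, 3, 3]
list_replace_inplace(nums, 3, [8, 8], 2)
print(nums)  # [1, 2, 8, 8, 8, 8, 3]
```

Like the version above, it raises ValueError when fewer than num occurrences exist.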
How about this? It handles all three call styles, although when elem is a list only the first element of each match is replaced (see the 23 left over in the last example).
def list_replace(origen,elem,new,cantidad=None):
    n=0
    resul=list()
    len_elem=0
    if isinstance(elem,list):
        len_elem=len(elem)
    for i,x in enumerate(origen):
        if x==elem or elem==origen[i:i+len_elem]:
            if cantidad and n<cantidad:
                resul.extend(new)
                n+=1
                continue
            elif not cantidad:
                resul.extend(new)
                continue
        resul.append(x)
    return resul
>>> list_replace([1,2,3,4,5,3,5,33,23,3],3,[42,42])
[1, 2, 42, 42, 4, 5, 42, 42, 5, 33, 23, 42, 42]
>>> list_replace([1,2,3,4,5,3,5,33,23,3],3,[42,42],2)
[1, 2, 42, 42, 4, 5, 42, 42, 5, 33, 23, 3]
>>> list_replace([1,2,3,4,5,3,5,33,23,3],[33,23],[42,42,42],2)
[1, 2, 3, 4, 5, 3, 5, 42, 42, 42, 23, 3]
Given this isn't hard to write, and not a very common use case, I don't think it will be in the standard library. What would it be named, replace_and_flatten? It's quite hard to explain what that does, and justify the inclusion.
Explicit is also better than implicit, so...
def replace_and_flatten(lst, searched_item, new_list):
    def _replace():
        for item in lst:
            if item == searched_item:
                yield from new_list # element matches, yield all the elements of the new list instead
            else:
                yield item # element doesn't match, yield it as is
    return list(_replace()) # convert the iterable back to a list
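The question also asks for a cap on the number of replacements. One way this could be added is sketched below; the limit parameter is my own assumption and not part of the function above, and the generator is replaced by a plain loop to keep the counter simple:

```python
def replace_and_flatten(lst, searched_item, new_list, limit=None):
    """Replace up to `limit` matches (all of them if None); return a new list."""
    result = []
    replaced = 0
    for item in lst:
        if item == searched_item and (limit is None or replaced < limit):
            result.extend(new_list)  # splice in the replacement items
            replaced += 1
        else:
            result.append(item)      # keep non-matching items as they are
    return result

print(replace_and_flatten([1, 2, 3], 3, [3, 4, 5]))        # [1, 2, 3, 4, 5]
print(replace_and_flatten([1, 2, 3, 3, 3], 3, [8, 8], 2))  # [1, 2, 8, 8, 8, 8, 3]
```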
I developed my own function; you are welcome to use and review it.
Note that, unlike the examples in the question, my function creates and returns a new list. It does not modify the provided list.
Working examples:
list = [1, 2, 3]
l2 = list_replace(list, [3], [3, 4, 5])
print('Changed: {0}'.format(l2))
print('Original: {0}'.format(list))
list = [1, 2, 3, 3, 3]
l2 = list_replace(list, [3], [8, 8], 2)
print('Changed: {0}'.format(l2))
print('Original: {0}'.format(list))
list = [1, 2, 3, 3, 3]
l2 = list_replace(list, [2, 3], [8, 8], 2)
print('Changed: {0}'.format(l2))
print('Original: {0}'.format(list))
I always print the original list as well, so you can see that it is not modified:
Changed: [1, 2, 3, 4, 5]
Original: [1, 2, 3]
Changed: [1, 2, 8, 8, 8, 8, 3]
Original: [1, 2, 3, 3, 3]
Changed: [1, 8, 8, 3, 3]
Original: [1, 2, 3, 3, 3]
Now, the code (tested with Python 2.7 and with Python 3.4):
def list_replace(lst, source_sequence, target_sequence, limit=0):
    if limit < 0:
        raise Exception('A negative replacement limit is not supported')
    source_sequence_len = len(source_sequence)
    target_sequence_len = len(target_sequence)
    original_list_len = len(lst)
    if source_sequence_len > original_list_len:
        return list(lst)
    new_list = []
    i = 0
    replace_counter = 0
    while i < original_list_len:
        suffix_is_long_enough = source_sequence_len <= (original_list_len - i)
        limit_is_satisfied = (limit == 0 or replace_counter < limit)
        if suffix_is_long_enough and limit_is_satisfied:
            if lst[i:i + source_sequence_len] == source_sequence:
                new_list.extend(target_sequence)
                i += source_sequence_len
                replace_counter += 1
                continue
        new_list.append(lst[i])
        i += 1
    return new_list
I developed a function for you (it works for your 3 requirements):
def list_replace(lst,elem,repl,n=0):
    ii=0
    if type(repl) is not list:
        repl = [repl]
    if type(elem) is not list:
        elem = [elem]
    if type(elem) is list:
        length = len(elem)
    else:
        length = 1
    for i in range(len(lst)-(length-1)):
        if ii>=n and n!=0:
            break
        e = lst[i:i+length]
        if e==elem:
            lst[i:i+length] = repl
            if n!=0:
                ii+=1
    return lst
I've tried it with your examples and it works.
Tests run:
print(list_replace([1,2,3], 3, [3, 4, 5]))
print(list_replace([1, 2, 3, 3, 3], 3, [8, 8], 2))
print(list_replace([1, 2, 3, 3, 3], [2, 3], [8, 8], 2))
NOTE: never use list as a variable name. I need the built-in list type for the type(...) is list check.
I have a list [[1, 2, 7], [1, 2, 3], [1, 2, 3, 7], [1, 2, 3, 5, 6, 7]] and I need [1,2,3,7] as the final result (this is a kind of reverse engineering). One approach is to check pairwise intersections:
while(i<dlistlen):
    j=i+1
    while(j<dlistlen):
        il = dlist1[i]
        jl = dlist1[j]
        tmp = list(set(il) & set(jl))
        print tmp
        #print i,j
        j=j+1
    i=i+1
This gives me the following output:
[1, 2]
[1, 2, 7]
[1, 2, 7]
[1, 2, 3]
[1, 2, 3]
[1, 2, 3, 7]
[]
It looks like I am close to getting [1,2,3,7] as my final answer, but I can't figure out how. Please note that the initial list ([[1, 2, 7], [1, 2, 3], [1, 2, 3, 7], [1, 2, 3, 5, 6, 7]]) may contain more items, leading to one more final answer besides [1,2,3,7]. But for now, I need to extract only [1,2,3,7].
Please note that this is not homework; I am creating my own clustering algorithm that fits my needs.
You can use the Counter class to keep track of how often elements appear.
>>> from itertools import chain
>>> from collections import Counter
>>> l = [[1, 2, 7], [1, 2, 3], [1, 2, 3, 7], [1, 2, 3, 5, 6, 7]]
>>> #use chain(*l) to flatten the lists into a single list
>>> c = Counter(chain(*l))
>>> print c
Counter({1: 4, 2: 4, 3: 3, 7: 3, 5: 1, 6: 1})
>>> #sort keys in order of descending frequency
>>> sortedValues = sorted(c.keys(), key=lambda x: c[x], reverse=True)
>>> #show the four most common values
>>> print sortedValues[:4]
[1, 2, 3, 7]
>>> #alternatively, show the values that appear in more than 50% of all lists
>>> print [value for value, freq in c.iteritems() if float(freq) / len(l) > 0.50]
[1, 2, 3, 7]
It looks like you're trying to find the largest intersection of two list elements. This will do that:
from itertools import combinations
# convert all list elements to sets for speed
dlist = [set(x) for x in dlist]
intersections = (x & y for x, y in combinations(dlist, 2))
longest_intersection = max(intersections, key=len)
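Applied to the list from the question, the approach above might look like this. It is a sketch that keeps the original dlist untouched by storing the sets under a separate name (sets is my own naming):

```python
from itertools import combinations

dlist = [[1, 2, 7], [1, 2, 3], [1, 2, 3, 7], [1, 2, 3, 5, 6, 7]]

# Convert once to sets so each pairwise intersection is cheap.
sets = [set(x) for x in dlist]
intersections = (x & y for x, y in combinations(sets, 2))
longest_intersection = max(intersections, key=len)

print(sorted(longest_intersection))  # [1, 2, 3, 7]
```

If several pairs tie for the largest intersection, max() returns the first one it encounters.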
Is there a way to compare each element of a list (e.g. [4, 3, 2, 1, 4, 3, 2, 1, 4]) to all the others and return, for each element, the number of other elements it differs from (for the list above, [6, 7, 7, 7, 6, 7, 7, 7, 6])? I will then need to sum the numbers in this list.
li = [4, 3, 2, 1, 4, 3, 2, 1, 4]
from collections import Counter
c = Counter(li)
print c
length = len(li)
print [length - c[el] for el in li]
Creating c before evaluating [length - c[el] for el in li] is better than calling count(i) for each element i of the list, because count() would repeat the same count several times: every time it encounters a given element, it scans the whole list again.
By the way, another way to write it:
map(lambda x: length-c[x] , li)
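For readers on Python 3, where print is a function, the same Counter idea could be sketched as follows. The variable names counts and diffs are my own:

```python
from collections import Counter

li = [4, 3, 2, 1, 4, 3, 2, 1, 4]
counts = Counter(li)  # a single pass over the list
length = len(li)

# For each element: how many elements of the list differ from it.
diffs = [length - counts[el] for el in li]

print(diffs)       # [6, 7, 7, 7, 6, 7, 7, 7, 6]
print(sum(diffs))  # 60
```

The total can also be computed from the counts alone, as sum(c * (length - c) for c in counts.values()), which avoids building the intermediate list.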
You can get a similar count with the count() method,
subtract it from the total length,
and do it all in one line with a list comprehension.
>>> l = [4, 3, 2, 1, 4, 3, 2, 1, 4]
>>> [ len(l)-l.count(i) for i in l ]
[6, 7, 7, 7, 6, 7, 7, 7, 6]
For Python 2.7:
test = [4, 3, 2, 1, 4, 3, 2, 1, 4]
length = len(test)
print [length - test.count(x) for x in test]
You could just use the sum function, along with a generator expression.
>>> l = [4, 3, 2, 1, 4, 3, 2, 1, 4]
>>> length = len(l)
>>> print sum(length - l.count(i) for i in l)
60
The good thing about a generator expression is that you don't create an actual list in memory, but functions like sum can still iterate over it and produce the desired result. Note, however, that a generator can only be iterated over once; after that it is exhausted.
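A short illustration of that last point, written for Python 3 (the names gen, first and second are only for this example):

```python
l = [4, 3, 2, 1, 4, 3, 2, 1, 4]
length = len(l)

gen = (length - l.count(i) for i in l)
first = sum(gen)   # consumes the generator
second = sum(gen)  # nothing left to iterate over

print(first, second)  # 60 0
```

If the values are needed more than once, build a list instead of a generator.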