Randomly remove 'x' elements from a list - python

I'd like to randomly remove a fraction of elements from a list without changing the order of the list.
Say I had some data and I wanted to remove 1/4 of them:
data = [1,2,3,4,5,6,7,8,9,10]
n = len(data) / 4
I'm thinking I need a loop to run through the data and delete a random element 'n' times? So something like:
for i in xrange(n):
    random = np.randint(1,len(data))
    del data[random]
My question is, is this the most 'pythonic' way of doing this? My list will be ~5000 elements long and I want to do this multiple times with different values of 'n'.
Thanks!

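Deleting elements one at a time is a bad idea, since each deletion from a list is O(n). Instead, build the result in a single pass, something like this:
import random
def delete_rand_items(items, n):
    # pick n distinct positions to drop; a set gives fast membership tests
    to_delete = set(random.sample(range(len(items)), n))
    return [x for i, x in enumerate(items) if i not in to_delete]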
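As a rough usage sketch of the function above (the `data` list and the fractions are made up for illustration), it can be called repeatedly with different values of `n` without touching the original list:

```python
import random

def delete_rand_items(items, n):
    # sample n distinct positions, then keep everything else in order
    to_delete = set(random.sample(range(len(items)), n))
    return [x for i, x in enumerate(items) if i not in to_delete]

data = list(range(1, 11))
for fraction in (0.25, 0.5):
    n = int(len(data) * fraction)       # number of items to drop
    result = delete_rand_items(data, n)
    print(n, result)                    # order of survivors is preserved
```

Because a new list is returned each time, `data` stays intact between runs.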

You can use random.sample like this:
import random
a = [1,2,3,4,5,6,7,8,9,10]
no_elements_to_delete = len(a) // 4
no_elements_to_keep = len(a) - no_elements_to_delete
b = set(random.sample(a, no_elements_to_keep)) # the `if i in b` on the next line would benefit from b being a set for large lists
b = [i for i in a if i in b] # you need this to restore the order
print(len(a)) # 10
print(b) # [1, 2, 3, 4, 5, 8, 9, 10]
print(len(b)) # 8
Three notes on the above:
You are not modifying the original list in place, but you could.
You are not actually deleting elements but rather keeping elements, which amounts to the same thing (you just have to adjust the ratios).
The drawback is the list comprehension that restores the order of the elements.
As @koalo says in the comments, the above will not work properly if the elements in the original list are not unique. I could easily fix that, but then my answer would be identical to the one posted by @JohnColeman. So if that might be the case, just use his instead.
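To make the uniqueness caveat concrete, here is a small sketch comparing sampling by value with sampling by index. The helper names `keep_by_value` and `keep_by_index` and the sample data are hypothetical, chosen only for this illustration:

```python
import random

def keep_by_value(items, n_keep):
    # sample values, then filter by membership (breaks with duplicates)
    chosen = set(random.sample(items, n_keep))
    return [x for x in items if x in chosen]

def keep_by_index(items, n_keep):
    # sample positions instead, so repeated values are handled correctly
    chosen = set(random.sample(range(len(items)), n_keep))
    return [x for i, x in enumerate(items) if i in chosen]

data = [5, 5, 5, 5, 5, 5, 1, 2]
by_value = keep_by_value(data, 4)
by_index = keep_by_index(data, 4)
print(len(by_value))  # more than 4: every 5 passes the membership test
print(len(by_index))  # exactly 4
```

With this data any sample of four values contains at least one 5, so the value-based filter keeps all six of them.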

Is the order meaningful?
If not, you can do something like:
from random import shuffle
shuffle(data)
data = data[:len(data)-n]

I suggest using numpy indexing as in
import numpy as np
data = np.array([1,2,3,4,5,6,7,8,9,10])
n = len(data) // 4  # integer division: the sample size must be an int
indices = sorted(np.random.choice(len(data),len(data)-n,replace=False))
result = data[indices]

I think it will be more convenient this way:
import random
n = round(len(data) *0.3)
for i in range(n):
    data.pop(random.randrange(len(data)))

Related

How to increase list values by the list's average value

I want to increase my values in list by average value of list.
lista1 = [2,3,4,5,6]
dlugoscListy = len(lista1)
sumaListy = sum(lista1)
srednia = sumaListy / dlugoscListy
lista2 = [i for i in lista1 += srednia]
print(lista2)
My code looks like above, but it is not working
The syntax in your list comprehension is wrong. It needs to look like this:
list2 = [element + list_avg for element in my_list]
The expression has to be at the start, not the end.
Also you shouldn't use i in this case, as you are not counting numbers in a range, but using a for _ in loop and dealing with actual values from the list. You should call it "element" or "value" instead. This can easily lead to confusion, especially if you deal with lists which contain numbers, like in your case.
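Applied to the names from the question, the corrected comprehension might look like the sketch below. It assumes Python 3, where `/` is true division and the average is a float:

```python
lista1 = [2, 3, 4, 5, 6]
srednia = sum(lista1) / len(lista1)   # 20 / 5 = 4.0

# expression first, then the loop clause
lista2 = [value + srednia for value in lista1]
print(lista2)  # [6.0, 7.0, 8.0, 9.0, 10.0]
```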
The numpy library can also do it very well (and fast).
import numpy as np
lista1 = np.array([2, 3, 4, 5, 6])
lista2 = lista1 + np.mean(lista1)
print(lista2)

Excluding elements from an old array and creating a new array

I have an array that has 3 different values, and I want to create a new array that has only the values that are smaller than 12 (for example).
import array as arr
numbers = arr.array([10,12, 12, 13])
numbers.remove(12)
numbers.remove(13)
print(numbers)
I don't know how to add them in a new array
There are a few ways to solve this problem. Here's how I would go about it.
import array as arr
numbers = arr.array('i', [10,12,12,13])
new_nums = arr.array('i', [i for i in numbers if i<12])
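You can also use the pop() method, but be careful: this removes items from numbers while iterating over it, which can skip elements, so it is not reliable in general: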
new_nums = arr.array('i', [numbers.pop(i) for i,val in enumerate(numbers) if val<12])
Alternatively, you can just use list comprehension on a Python list, like this:
new_nums = [i for i in numbers if i<12]
Hope this helps!
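A small sketch of how popping from an array while enumerating it can go wrong, compared with a plain filter. The sample values here are made up for the demonstration:

```python
import array as arr

numbers = arr.array('i', [10, 11, 12, 13])
# popping shifts the remaining items left, so the iterator skips 11
popped = [numbers.pop(i) for i, val in enumerate(numbers) if val < 12]
print(popped)  # [10]

source = arr.array('i', [10, 11, 12, 13])
# filtering into a new array leaves the source untouched
filtered = arr.array('i', [v for v in source if v < 12])
print(list(filtered))  # [10, 11]
```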
numbers = [10,12,13,14]
newarr = []
for i in numbers:
    if i<12:
        newarr.append(i)
print(newarr)
Use a Python list (not an array): iterate over the original list, check the required condition, and append the matching items to a new list.
Assumption:
you want only unique elements from the initial list
the elements should be less than 12
Code:
initial_list = [10, 12, 12, 13]
new_list = [i for i in list(set(initial_list)) if i<12]
print(new_list)
Output:
[10]
Explanation:
The above code first creates a set of unique elements from the initial list. Then it selects only the items that meet the condition i<12 using a list comprehension.
N.B.: If you are bound to use the array module, @Praveenkumar's answer is the way to go.
You can do the below if you want to stick with array only (I don't know why you want to do that). array and list in python are different things.
import array as arr
numbers = arr.array('i', [10,11, 12, 13])
print(arr.array('i', filter(lambda x : x < 12, numbers)))

Enumerate list to make a new list of indices?

I'm trying to make a new list of indices by enumerating a previous list. Basically, what I want is:
To enumerate a list of elements to obtain indices for each element. I coded this:
board = ["O","O","O","O","O"]
for index,y in enumerate(board):
    print(index,end=" ")
which gives:
0 1 2 3 4
I now want to make those numbers into a new list, but have no clue how to do that.
Thanks! Sorry for the question, I'm still a beginner and am just trying to get the hang of things.
You should probably just make a range of the right length:
board = ["O","O","O","O","O"]
indices = list(range(len(board)))
print(indices)
> [0, 1, 2, 3, 4]
Use list comprehension:
indices = [index for index, y in enumerate(board)]
If board is always an object that implements the __len__ method, you can also use range:
indices = list(range(len(board)))
If you just want all the numbers you can use this:
indices = list(range(len(board)))
If you pass one number to range, it returns a range object covering the numbers from 0 up to (but excluding) the passed number. After this we turn it into a list with the list function.
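A brief sketch of what `range` gives you before and after the `list` call, assuming Python 3:

```python
board = ["O", "O", "O", "O", "O"]

r = range(len(board))
print(r)             # range(0, 5)
print(list(r))       # [0, 1, 2, 3, 4]
print(r[2], len(r))  # 2 5

indices = list(r)    # a range can be converted more than once
```

The range itself supports indexing and `len`, so the conversion is only needed when an actual list is required.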
You can use list comprehension to do that:
result = [index for index,y in enumerate(board)]
Alternatively you can use the range function (wrapped in list, since range does not return a list in Python 3):
result = list(range(len(board)))
I would just use numpy arange, which creates an array that looks like the one you are looking for:
import numpy as np
enumerated = np.arange(len(board))
The straightforward way is:
board = ["O","O","O","O","O"]
newlist = []
for index,y in enumerate(board):
    newlist.append(index)
A more advanced way using list comprehensions would be:
newlist = [index for index, value in enumerate(board)]

Basic python: how to increase value of item in list [duplicate]

This is such a simple issue that I don't know what I'm doing wrong. Basically I want to iterate through the items in an empty list and increase each one according to some criteria. This is an example of what I'm trying to do:
list1 = []
for i in range(5):
    list1[i] = list1[i] + 2*i
This fails with a list index out of range error and I'm stuck. The expected result (what I'm aiming at) would be a list with values:
[0, 2, 4, 6, 8]
Just to be more clear: I'm not after producing that particular list. The question is about how can I modify items of an empty list in a recursive way. As gnibbler showed below, initializing the list was the answer. Cheers.
Ruby (for example) lets you assign items beyond the end of the list. Python doesn't - you would have to initialise list1 like this
list1 = [0] * 5
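With the list initialised first, the loop from the question runs as intended. A minimal sketch:

```python
list1 = [0] * 5           # five slots now exist, so list1[i] is valid
for i in range(5):
    list1[i] = list1[i] + 2 * i
print(list1)  # [0, 2, 4, 6, 8]
```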
Here the value you are computing depends only on i, so there is no need to read what is already in the list; you can compute it from i directly. So just use a list comprehension:
list1 = [2*i for i in range(5)]
Since you say that it is more complex, just don't use list comprehension, edit your for loop as such:
for i in range(5):
    x = 2*i
    list1[i] = x
This way you can keep doing things until you finally have the outcome you want, store it in a variable, and set it accordingly! You could also do list1.append(x), which I actually prefer because it will work with any list even if it's not in order like a list made with range
Edit: Since you want to be able to manipulate the array like you do, I would suggest using numpy! There is this great thing called vectorize so you can actually apply a function to a 1D array:
import numpy as np
list1 = range(5)
def my_func(x):
    y = x * 2
    return y  # return the result so vectorize has a value to collect
vfunc = np.vectorize(my_func)
vfunc(list1)
>>> array([0, 2, 4, 6, 8])
I would advise only using this for more complex functions, because you can use numpy broadcasting for easy things like multiplying by two.
Your list is empty, so when you try to read an element of the list (right hand side of this line)
list1[i] = list1[i] + 2*i
it doesn't exist, so you get the error message.
You may also wish to consider using numpy. The multiplication operation is overloaded to be performed on each element of the array. Depending on the size of your list and the operations you plan to perform on it, using numpy very well may be the most efficient approach.
Example:
>>> import numpy
>>> 2 * numpy.arange(5)
array([0, 2, 4, 6, 8])
I would instead write
for i in range(5):
    list1.append(2*i)
Yet another way to do this is to use the append method on your list. The reason you're getting an out of range error is because you're saying:
list1 = []
list1.__getitem__(0)
and then trying to manipulate that item, but the item does not exist since you made an empty list.
Proof of concept:
list1 = []
list1[1]
IndexError: list index out of range
We can, however, append new stuff to this list like so:
list1 = []
for i in range(5):
    list1.append(i * 2)

Picking out items from a python list which have specific indexes

I'm sure there's a nice way to do this in Python, but I'm pretty new to the language, so forgive me if this is an easy one!
I have a list, and I'd like to pick out certain values from that list. The values I want to pick out are the ones whose indexes in the list are specified in another list.
For example:
indexes = [2, 4, 5]
main_list = [0, 1, 9, 3, 2, 6, 1, 9, 8]
the output would be:
[9, 2, 6]
(i.e., the elements with indexes 2, 4 and 5 from main_list).
I have a feeling this should be doable using something like list comprehensions, but I can't figure it out (in particular, I can't figure out how to access the index of an item when using a list comprehension).
[main_list[x] for x in indexes]
This will return a list of the objects, using a list comprehension.
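Run against the data from the question, the comprehension gives the expected result. The second call uses a made-up index list to show that order and repeats follow `indexes`:

```python
indexes = [2, 4, 5]
main_list = [0, 1, 9, 3, 2, 6, 1, 9, 8]

picked = [main_list[x] for x in indexes]
print(picked)  # [9, 2, 6]

# the output follows the order of the index list, repeats included
reordered = [main_list[x] for x in [5, 2, 2]]
print(reordered)  # [6, 9, 9]
```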
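def pick(main_list, indexes):  # pick is an illustrative name
    t = []
    for i in indexes:
        t.append(main_list[i])
    return t
map(lambda x:main_list[x],indexes)  # lazy in Python 3; wrap in list(...) to get a list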
If you're good with numpy:
import numpy as np
main_array = np.array(main_list) # converting to numpy array
out_array = main_array.take([2, 4, 5])
out_list = out_array.tolist() # if you want a list specifically
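I think Yuval A's solution is pretty clear and simple. But if you actually want a one-line list comprehension that uses enumerate:
[e for i, e in enumerate(main_list) if i in indexes]
Note that this returns the items in main_list order rather than in the order given by indexes, and it checks every element of main_list.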
As an alternative to a list comprehension, you can use map with list.__getitem__. For large lists you should see better performance:
import random
n = 10**7
L = list(range(n))
idx = random.sample(range(n), int(n/10))
x = [L[x] for x in idx]
y = list(map(L.__getitem__, idx))
assert all(i==j for i, j in zip(x, y))
%timeit [L[x] for x in idx] # 474 ms per loop
%timeit list(map(L.__getitem__, idx)) # 417 ms per loop
For a lazy iterator, you can just use map(L.__getitem__, idx). Note in Python 2.7, map returns a list, so there is no need to pass to list.
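A short sketch of the lazy behaviour in Python 3, using a small made-up list so the values are easy to follow:

```python
L = list(range(10))
idx = [7, 2, 5]

lazy = map(L.__getitem__, idx)  # nothing is looked up yet
first = next(lazy)              # items are fetched one at a time
rest = list(lazy)               # consumes whatever is left
print(first, rest)  # 7 [2, 5]
```

Once consumed, the map object is exhausted, so build a list if the result is needed more than once.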
I have noticed that there are two ways to do this job: with a loop (list comprehension) or by converting to an np.array. I timed both methods, and the result shows that when the dataset is large, [main_list[x] for x in indexes] is about 3 to 5 times faster than np.array.take(). If your code is sensitive to computation time, the highest voted answer is a good choice.
