Separating array elements from each other - python

I'm trying to put every contiguous segment of consecutive numbers into its own list.
For example,
# Input
x=[1,2,3,4,5,8,11,12,13,18]
# Output:
x=[[1,2,3,4,5],[8],[11,12,13],[18]]
The existing code,
x=[1,2,3,4,5,8,11,12,13,18]
temp=[]
firstnumber=0
for i in range(1,len(x)-1,1):
    current=x[i]
    previous=x[i-1]
    if ((current-previous)!=1):
        mm=(x[firstnumber:i-1])
        temp.append(mm)
        firstnumber=x[i]
print(temp)
I only got [[1, 2, 3, 4], []] as a result and I can't figure out why.

I have tried to answer this question changing as little of your code as possible.
x=[1,2,3,4,5,8,11,12,13,18]
temp=[]
firstnumber=0
first_index = 0
for i in range(1, len(x)):
    current=x[i]
    previous=x[i-1]
    if ((current-previous)!=1):
        mm = x[first_index:i]
        temp.append(mm)
        firstnumber = x[i]
        first_index = i
temp.append(x[first_index:])
print(temp) # [[1, 2, 3, 4, 5], [8], [11, 12, 13], [18]]
What I changed:
firstnumber was being used as a slice index, but it actually held an element of the list, so we need first_index = i, the index at the current iteration.
The loop did not cover all the elements of the list. We need to go all the way to the end, so we iterate over range(1, len(x)). The slice end also changes from i-1 to i, because a slice already excludes its end index.
Finally, even when the loop completes, the last sequence is still missing unless we add it after the loop, hence the addition of temp.append(x[first_index:]).
NOTE: This method works with the input you have, but it is not robust for all cases (an empty list, for instance, yields [[]]), nor is it the most efficient way to do this. Your question was why the code did not work as is, so hopefully this answers that.
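One way to make the repaired loop a little more robust is to wrap it in a function and handle the empty list explicitly. This is only a sketch: the name split_runs is made up for illustration, and it still assumes a list of integers.

```python
def split_runs(values):
    """Split a list of integers into runs of consecutive values."""
    if not values:
        return []  # an empty input has no runs at all
    runs = []
    first_index = 0  # index where the current run starts
    for i in range(1, len(values)):
        if values[i] - values[i - 1] != 1:
            runs.append(values[first_index:i])  # close the finished run
            first_index = i
    runs.append(values[first_index:])  # the last run is never closed in the loop
    return runs


print(split_runs([1, 2, 3, 4, 5, 8, 11, 12, 13, 18]))
# [[1, 2, 3, 4, 5], [8], [11, 12, 13], [18]]
print(split_runs([]))  # []
```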

My answer is not meant to repair your code, but rather to show another way to do the described task.
Note that you can use the index -1 to refer to the last element. I would do it the following way:
x=[1,2,3,4,5,8,11,12,13,18]
temp=[x[:1]]
for i in x[1:]:
    if temp[-1][-1]+1!=i: temp.append([])
    temp[-1].append(i)
print(temp)
Output:
[[1, 2, 3, 4, 5], [8], [11, 12, 13], [18]]
Explanation: I first load the first element as a one-element list. Then, for each following element, if it differs from the last-seen element by anything other than 1, I append a new empty list to temp. Whether or not that condition holds, I then add the current element to the last sublist.
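For comparison, the same grouping can also be sketched with itertools.groupby from the standard library. This is a different technique from the loop above, not a rewrite of it: it relies on the fact that value minus position stays constant inside a run of consecutive numbers.

```python
from itertools import groupby

x = [1, 2, 3, 4, 5, 8, 11, 12, 13, 18]

# Within a run of consecutive numbers, value - position stays constant,
# so groupby() starts a new group exactly where the step is not 1.
runs = [
    [value for _, value in group]
    for _, group in groupby(enumerate(x), key=lambda pair: pair[1] - pair[0])
]
print(runs)  # [[1, 2, 3, 4, 5], [8], [11, 12, 13], [18]]
```

Unlike temp=[x[:1]], this returns an empty list for an empty input rather than [[]].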

x=[1,2,3,4,5,8,11,12,13,18]
x.append(x[-1]-2)
temp=[]
firstnumber=0
for i in range(1, len(x)):
    current=x[i]
    previous=x[i-1]
    if ((current-previous)!=1):
        mm=(x[firstnumber:i])
        temp.append(mm)
        firstnumber=i
print(temp)

In the code, the variable firstnumber is, I believe, supposed to contain the index of the first element of any continuous segment of consecutive numbers.
However, when you do firstnumber=x[i] that purpose is lost. Instead you can do, firstnumber = i and then it will work.
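Also, on its own this loop will not append the last consecutive segment, so you have to handle that separately. The code above does it by appending a sentinel value (x[-1]-2) to x before the loop, which forces the final segment to be flushed; note that this modifies x.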

Related

Appending value into list without duplicates

I have a list of integers containing:
intlist = [19,4,2,4]
and I want to append the values from intlist into a list of lists such that:
noDuplist = [[19,0],[4,1],[2,2]]
Here the first item of each pair is the value from intlist and the second item is the index of that value in intlist: 19 is at index 0, 4 is at index 1, and 2 is at index 2. Since there is another 4 and I don't want duplicates, the last 4 should simply be left out.
I tried something like:
noDuplist = []
for i in range(len(intlist)):
    if intlist[i] not in noDuplist:
        noDuplist.append([intlist[i],i])
but I'm still getting
[[19, 0], [4, 1], [2, 2], [4, 3]]
where the [4, 3] shouldn't be there. I would appreciate some help with this.
I assume you want to retain the indices from the original sequence. Thus what you want is something that remembers the index at which each value was first seen in the original sequence.
The problem is in your condition, since
if intlist[i] not in noDuplist:
    # something
would check whether the integer 4 is an element of [[19, 0], [4, 1], [2, 2]], which it isn't: the elements of noDuplist are two-item lists, not integers.
A cleaner way of doing this could be to use a dictionary or a set:
intlist = [19,4,2,4]
seen_so_far, noDuplist = set(), []
for i, v in enumerate(intlist):
    if v not in seen_so_far:
        noDuplist.append([v, i])
        seen_so_far.add(v)
print(noDuplist)
Which gives the output [[19, 0], [4, 1], [2, 2]]
The first thing I'd suggest is not bothering storing the index as well as the value. You'll know the index when extracting elements anyway.
The first approach that comes to my mind (not sure if it's optimal) involves using a dictionary in combination with your list. Whenever you try to insert a value, check if it exists in your dictionary. If it doesn't, add it to the dictionary and your list. If it does, don't add it.
This would result in O(N) complexity.
Edit:
I didn't read your description thoroughly enough. Since you need the index from the original array, simply enter both as a key/value pair into your dictionary instead.
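A minimal sketch of that dictionary idea might look like the following. The name first_index is made up for illustration, and the final list relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 on.

```python
intlist = [19, 4, 2, 4]

first_index = {}  # maps each value to the index where it was first seen
for i, v in enumerate(intlist):
    if v not in first_index:  # average O(1) membership test
        first_index[v] = i

# Turn the key/value pairs back into [value, index] lists.
noDuplist = [[v, i] for v, i in first_index.items()]
print(noDuplist)  # [[19, 0], [4, 1], [2, 2]]
```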

Extract index of Non duplicate elements in python list

I have a list:
input = ['a','b','c','a','b','d','e','d','g','g']
I want the index of every element in the list, skipping duplicates (that is, the index of each value's first occurrence).
output = [0,1,2,5,6,8]
You should iterate over the enumerated list and add each element to a set of "seen" elements and add the index to the output list if the element hasn't already been seen (is not in the "seen" set).
Note that the name input shadows the built-in input() function, so I renamed it input_list.
input_list = ['a','b','c','a','b','d','e','d','g','g']
output = []
seen = set()
for i,e in enumerate(input_list):
    if e not in seen:
        output.append(i)
        seen.add(e)
which gives output as [0, 1, 2, 5, 6, 8].
Why use a set?
You could be thinking, why use a set when you could do something like:
[i for i,e in enumerate(input_list) if input_list.index(e) == i]
which would work because .index returns you the index of the first element in a list with that value, so if you check the index of an element against this, you can assert that it is the first occurrence of that element and filter out those elements which aren't the first occurrences.
However, this is not as efficient as using a set, because list.index requires Python to iterate over the list until it finds the element (or doesn't). This operation is O(n) complexity and since we are calling it for every element in input_list, the whole solution would be O(n^2).
On the other hand, using a set, as in the first solution, yields an O(n) solution, because checking whether an element is in a set is O(1) in the average case. This is due to how sets are implemented: they are hash tables, so each element is stored at a position derived from its hash. To check membership, Python computes the hash of the element and looks at that position rather than iterating over everything. (This is a simplification, but it is the core idea.)
Thus, since each check for membership is O(1), and we do this for each element, we get an O(n) solution which is much better than an O(n^2) solution.
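If you want to see the difference for yourself, a rough timing sketch like the one below can help. The function names and the test data are made up for illustration, and the actual timings will vary from machine to machine, so no numbers are shown here.

```python
import timeit

# Hypothetical test data: 2000 distinct values, each first seen at its own index.
input_list = list(range(2000))


def first_indices_set(items):
    """O(n): one pass, with average O(1) set membership checks."""
    seen, out = set(), []
    for i, e in enumerate(items):
        if e not in seen:
            out.append(i)
            seen.add(e)
    return out


def first_indices_index(items):
    """O(n^2): list.index() rescans the list for every element."""
    return [i for i, e in enumerate(items) if items.index(e) == i]


# Both approaches must agree before comparing their speed.
assert first_indices_set(input_list) == first_indices_index(input_list)

print("set:  ", timeit.timeit(lambda: first_indices_set(input_list), number=5))
print("index:", timeit.timeit(lambda: first_indices_index(input_list), number=5))
```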
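You could do something like this, checking counts (although this is computation-heavy, since count() and the slice each scan the list for every element):
indexes = []
for i, x in enumerate(inputlist):
    if (inputlist.count(x) == 1
            and x not in inputlist[:i]):
        indexes.append(i)
This checks for the following:
if the item appears only once. If so, continue...
if the item hasn't appeared before in the list up till now. If so, add to the results list
Note that because of the first check, this keeps only the items that occur exactly once: for the list in the question it gives [2, 6], not the first occurrence of every value.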
In case you don't mind indexes of the last occurrences of duplicates instead and are using Python 3.6+, here's an alternative solution:
list(dict(map(reversed, enumerate(input))).values())
This returns:
[3, 4, 2, 7, 6, 9]
Here is a one-liner using zip and reversed
>>> input = ['a','b','c','a','b','d','e','d','g','g']
>>> sorted(dict(zip(reversed(input), range(len(input)-1, -1, -1))).values())
[0, 1, 2, 5, 6, 8]
This question is missing a pandas solution. 😉
>>> import pandas as pd
>>> inp = ['a','b','c','a','b','d','e','d','g','g']
>>>
>>> pd.DataFrame(list(enumerate(inp))).groupby(1).first()[0].tolist()
[0, 1, 2, 5, 6, 8]
Yet another version, using a side effect in a list comprehension.
>>> xs=['a','b','c','a','b','d','e','d','g','g']
>>> seen = set()
>>> [i for i, v in enumerate(xs) if v not in seen and not seen.add(v)]
[0, 1, 2, 5, 6, 8]
The list comprehension filters indices of values that have not been seen already.
The trick is that not seen.add(v) is always true because seen.add(v) returns None.
Because of short circuit evaluation, seen.add(v) is performed if and only if v is not in seen, adding new values to seen on the fly.
At the end, seen contains all the values of the input list.
>>> seen
{'a', 'c', 'g', 'b', 'd', 'e'}
Note: it is usually a bad idea to use side effects in list comprehension,
but you might see this trick sometimes.

Rearrange list in-place by modifying the original list, put even-index values at front

I am relatively new to python and I am still trying to learn the basics of the language. I stumbled upon a question which asks you to rearrange the list by modifying the original. What you are supposed to do is move all the even index values to the front (in reverse order) followed by the odd index values.
Example:
l = [0, 1, 2, 3, 4, 5, 6]
l = [6, 4, 2, 0, 1, 3, 5]
My initial approach was to just use the following:
l = l[::-2] + l[1::2]
However, apparently this is considered 'creating a new list' rather than looping through the original list to modify it.
As such, I was hoping to get some ideas or hints as to how I should approach this particular question. I know that I can use a for loop or a while loop to cycle through the elements / index, but I don't know how to do a swap or anything else for that matter.
You can do it by assigning to a list slice instead of a variable:
l[:] = l[::2][::-1] + l[1::2]
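Your expression for the reversed even-index elements, l[::-2], is also only correct when the list has an odd length; for an even-length list it starts at the last (odd) index and picks the odd-index values instead. Use l[::2] to get all the even-index elements, then reverse that with [::-1].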
This is effectively equivalent to:
templ = l[::2][::-1] + l[1::2]
for i in range(len(l)):
    l[i] = templ[i]
The for loop modifies the original list in place.
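To check that slice assignment really changes the existing list rather than creating a new one, you can keep a second name for the same object and compare. This is a small sketch; alias and m are names made up for illustration.

```python
l = [0, 1, 2, 3, 4, 5, 6]
alias = l                      # second name for the same list object

l[:] = l[::2][::-1] + l[1::2]  # slice assignment: mutates the existing object
print(alias)                   # [6, 4, 2, 0, 1, 3, 5]
print(alias is l)              # True

l = l[::-1]                    # plain assignment: rebinds l to a new list
print(alias is l)              # False

# With an even-length list, [::-2] starts at the last (odd) index.
m = [0, 1, 2, 3, 4, 5]
print(m[::-2])                 # [5, 3, 1]
print(m[::2][::-1])            # [4, 2, 0]
```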

Bug In Using Del To Remove Elements In 2D List

So this is part of a bigger dataset, but I simplified it here. Say you have the following code. What I'm trying to do is extract the sub-arrays based on what the middle element is (that's what dataStore is for). Side note: I know about list mutability, and that when I do del row[1] I permanently affect data.
dataStore = []
data = [[1,3,7], [1,0,1],[2,0,2],[9,0,9], [3,1,9]]
print(data)
for index in range(0, 5):
    temp = []
    for row in data:
        if row[1] == index:
            del row[1]
            temp.append(row)
            del data[data.index(row)]
    dataStore.append(temp)
The output is:
Data: [[2, 0, 2]]
DataStore: [[[1, 1], [9, 9]], [[3, 9]], [], [[1, 7]], []]
Now data is supposed to be empty after I'm done, and the bug is that [2, 0, 2] doesn't get deleted because it comes right after something that just got deleted. How do you get around this?
What I've figured out so far: I think the reason is that when you do del data[data.index(row)], everything after it moves up by one, but the iteration still advances to the next position. So I was thinking of using a second, mirror list, but I couldn't figure out the syntax for it.
Do you NEED the data array to be empty when you're done? I'm not sure, but maybe this is what you're looking for:
data = [[1,3,7], [1,0,1], [2,0,2], [9,0,9], [3,1,9]]
dataStore = []
for index in range(0, 5):
    for row in data:
        if row[1] == index:
            dataStore.append([row[0], row[2]])
print(dataStore)
Everything is just stored in the new dataStore array instead of deleting the current data array. If you're iterating over an array, you never want to modify it while doing so. The output should be: [[1, 1], [2, 2], [9, 9], [3, 9], [1, 7]]
Note that this isn't the most efficient way to do this, but you won't notice unless you're working with very large sets of data.
Your feelings are correct. As a rule of thumb you should never modify a list if you are iterating through it.
If I understand your problem correctly, the end result of your code is that you should have an array in DataStore that is every element in data sorted by the middle element with the middle element removed. If that is the case then I would try something a little simpler like so:
dataStore = [[x[0],x[2]] for x in sorted(data,key=lambda i: i[1])]
del data
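If you do want the per-index grouping from the question (one sub-list per value of index) and an empty data list at the end, one possible sketch is to build each group without touching data and to empty it only once the loop is done. It assumes every row has exactly three items, as in the sample.

```python
data = [[1, 3, 7], [1, 0, 1], [2, 0, 2], [9, 0, 9], [3, 1, 9]]
dataStore = []

for index in range(5):
    # Rows whose middle element matches, with that element dropped.
    # data itself is never modified while it is being iterated.
    dataStore.append([[row[0], row[2]] for row in data if row[1] == index])

data.clear()  # empty the original list once grouping is done

print(dataStore)  # [[[1, 1], [2, 2], [9, 9]], [[3, 9]], [], [[1, 7]], []]
print(data)       # []
```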

Why am I getting list index out of range in Python?

I recently asked a question here: How do I find if an int is in array of arrays? and the solution works well. I'm trying to write code which will remove an int from an array if another array doesn't contain it. The loop I'm trying to use is:
for index in range(len(numbers)):
    if not any(numbers[index] in elem[:2] for elem in numbers2):
        numbers.remove(numbers[index])
Say numbers = [1, 2, 4] and numbers2 = [[4,5,6], [2,8,9]]; then after the loop, numbers should be [2, 4]. The above loop, however, keeps producing the error exceptions.IndexError: list index out of range, and I can't see why. Could anyone help with this issue?
The problem is that len(numbers) is only evaluated once, at the start of the loop, so the loop still runs up to the original last index after remove() has shortened the list.
I would rewrite the whole thing like so (operator has to be imported first, and on Python 3 reduce has to be imported from functools):
In [12]: numbers = [1, 2, 4]
In [13]: numbers2 = [[4,5,6], [2,8,9]]
In [15]: n2s = set(reduce(operator.add, (n[:2] for n in numbers2)))
In [17]: [n for n in numbers if n in n2s]
Out[17]: [2, 4]
Create a temporary list and save the positions you want to remove; then, after all the iterating is done, delete the items at those positions. Remember to delete in reverse order so that the remaining index numbers stay valid while deleting.
to_remove = []
for index, number in enumerate(numbers):
    if not any(number in elem[:2] for elem in numbers2):
        to_remove.append(index)
for index in reversed(to_remove):
    del numbers[index]
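To see the failure happen, and one way around it, here is a small sketch. The helper is_wanted and the copy work are names made up for illustration; the filtering step uses a list comprehension with slice assignment instead of removing items inside the loop.

```python
numbers = [1, 2, 4]
numbers2 = [[4, 5, 6], [2, 8, 9]]


def is_wanted(n):
    # True if n appears among the first two items of any sublist.
    return any(n in elem[:2] for elem in numbers2)


# Reproduce the failure on a copy, so the original list stays intact.
work = list(numbers)
try:
    for index in range(len(work)):      # range(0, 3) is fixed here
        if not is_wanted(work[index]):
            work.remove(work[index])    # the list shrinks to 2 items
except IndexError as exc:
    print("IndexError at index", index, "-", exc)

# One way around it: build the filtered result, then assign it in place.
numbers[:] = [n for n in numbers if is_wanted(n)]
print(numbers)  # [2, 4]
```

Note that in the failing loop the value 2 is never even examined: after 1 is removed, 2 moves to index 0, which the loop has already passed.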
