how to split a list every nth item

how to split a list every nth item - python

I am trying to split a list every 5th item, then delete the next two items ('nan'). I have attempted to use List[:5], but that does not seem to work in a loop. The desired output is: [['1','2','3','4','5'],['1','2','3','4','5'],['1','2','3','4','5'],['1','2','3','4','5']]
List = ['1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan']
for i in List:
# split first 5 items
# delete next two items
# Desired output:
# [['1','2','3','4','5'],['1','2','3','4','5'],['1','2','3','4','5'],['1','2','3','4','5']]

There are lots of ways to do this. I recommend stepping by 7 then splicing by 5.
data = ['1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan']
# Step by 7 and keep the first 5
chunks = [data[i:i+5] for i in range(0, len(data), 7)]
print(*chunks, sep='\n')
Output:
['1', '2', '3', '4', '5']
['1', '2', '3', '4', '5']
['1', '2', '3', '4', '5']
['1', '2', '3', '4', '5']
Reference: Split a python list into other “sublists”...

WARNING: make sure the list follows the rules as you said, after every 5 items 2 nan.
This loop will add the first 5 items as a list, and delete the first 7 items.
lst = ['1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan']
output = []
while True:
if len(lst) <= 0:
break
output.append(lst[:5])
del lst[:7]
print(output) # [['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5']]

List=['1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan']
new_list = list()
for k in range(len(List)//7):
new_list.append(List[k*7:k*7+5])
new_list.append(List[-len(List)%7])

Straightforward solution in case if the list doesn’t follow the rules you mentioned but you want to split sequence always between NAN's:
result, temp = [], []
for item in lst:
if item != 'nan':
temp.append(item)
elif temp:
result.append(list(temp))
temp = []

Using itertools.groupby would also support chunks of different lengths:
[list(v) for k, v in groupby(List, key='nan'.__ne__) if k]

I guess there is more pythonic way to do the same but:
result = []
while (len(List) > 5):
result.append(List[0:0+5])
del List[0:0+5]
del List[0:2]
This results: [['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5']]

mainlist=[]
sublist=[]
count=0
for i in List:
if i!="nan" :
if count==4:
# delete next two items
mainlist.append(sublist)
count=0
sublist=[]
else:
# split first 5 items
sublist.append(i)
count+=1

Generally numpy.split(...) will do any kind of custom splitting for you. Some reference:
https://docs.scipy.org/doc/numpy/reference/generated/numpy.split.html
And the code:
import numpy as np
lst = ['1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan']
ind=np.ravel([[i*7+5, (i+1)*7] for i in range(len(lst)//7)])
lst2=np.split(lst, ind)[:-1:2]
print(lst2)
Outputs:
[array(['1', '2', '3', '4', '5'], dtype='<U3'), array(['1', '2', '3', '4', '5'], dtype='<U3'), array(['1', '2', '3', '4', '5'], dtype='<U3'), array(['1', '2', '3', '4', '5'], dtype='<U3')]

I like the splice answers.
Here is my 2 cents.
# changed var name away from var type
myList = ['1','2','3','4','5','nan','nan','1','2','3','4','10','nan','nan','1','2','3','4','15','nan','nan','1','2','3','4','20','nan','nan']
newList = [] # declare new list of lists to create
addItem = [] # declare temp list
myIndex = 0 # declare temp counting variable
for i in myList:
myIndex +=1
if myIndex==6:
nothing = 0 #do nothing
elif myIndex==7: #add sub list to new list and reset variables
if len(addItem)>0:
newList.append(list(addItem))
addItem=[]
myIndex = 0
else:
addItem.append(i)
#output
print(newList)

Related

Find all possible varients of max pair of 2

Given a string of numbers like 123456, I want to find all the possibilities they can be paired in by 2 or by itself. For example, from the string 123456 I would like to get the following:
12 3 4 5 6, 12 34 5 6, 1 23 4 56, etc.
The nearest I was able to come to was this:
strr = list("123456")
x = list("123456")
for i in range(int(len(strr)/2)):
newlist = []
for j in range(i):
newlist.append(x[j])
newlist.append(x[i] + x[i+1])
for j in range(len(x))[i+2:]:
newlist.append(x[j])
x = newlist.copy()
b = x.copy()
for f in range(len(b))[i:]:
if f == i:
print(b)
continue
b[f] = b[f - 1][1] + b[f]
b[f - 1] = b[f - 1][0]
print(b)
This code gives the output:

It's easy to solve this problem with a recursive generator. This is similar to how you solve change-making problems, just here we have only two "coins", either two characters together, or one character at a time. The total change we're trying to make is the length of the input string. The fact that the characters are digits in a numeric string is irrelevant.
def singles_and_pairs(string):
if len(string) <= 1: # base case
yield list(string) # yield either [] or [string] and then quit
return
for result in singles_and_pairs(string[:-1]): # first recursion
result.append(string[-1:])
yield result
for result in singles_and_pairs(string[:-2]): # second recursion
result.append(string[-2:])
yield result
If you plan on running this on large input strings, you might want to add memoization, since the recursive calls recalculate the same results quite often.

Pheew, this one took me some time to get right, but it seems to finally work (edited for prettier ordering):
def max_2_partitions(my_string):
if not my_string:
return [[]]
if len(my_string) == 1:
return [[my_string]]
ret = []
for i in range(len(my_string)):
for l in max_2_partitions(my_string[:i] + my_string[i + 1:]):
li = sorted([my_string[i]]+l, key = lambda x: (len(x),x))
if li not in ret:
ret.append(li)
for j in range(i+1,len(my_string)):
for l in max_2_partitions(my_string[:i]+my_string[i+1:j]+my_string[j+1:]):
li = sorted([my_string[i] + my_string[j]] + l, key = lambda x: (len(x),x))
if li not in ret:
ret.append(li)
return sorted(ret, key=lambda x: (-len(x),x))
Example:
print(max_2_partitions("1234"))
# [['1', '2', '3', '4'], ['1', '2', '34'], ['1', '3', '24'], ['1', '4', '23'], ['2', '3', '14'], ['2', '4', '13'], ['3', '4', '12'], ['12', '34'], ['13', '24'], ['14', '23']]

12 lines of code, full permutations:
You can first create permutations of the string, and then add spacing:
from itertools import permutations
def solution(A):
result = []
def dfs(A,B):
if not B:
result.append(A)
else:
for i in range(1,min(2,len(B))+1):
dfs(A+[B[:i]],B[i:])
for x in permutations(A):
dfs([],''.join(x))
return result
print(f"{solution('123') = }")
# solution('123') = [['1', '2', '3'], ['1', '23'], ['12', '3'], ['1', '3', '2'], ['1', '32'], ['13', '2'], ['2', '1', '3'], ['2', '13'], ['21', '3'], ['2', '3', '1'], ['2', '31'], ['23', '1'], ['3', '1', '2'], ['3', '12'], ['31', '2'], ['3', '2', '1'], ['3', '21'], ['32', '1']]

Remove sublist duplicates including reversed

For example i have following
list = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
I want to match if a sub list has a reversed sub list within same list (i.e. ['1', '2'] = ['2', '1']) , and if True than to remove from the list the mirrored one.
The final list should look like :
list = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5']['2', '6']]
This is what i tried:
for i in range(len(list)):
if list[i] == list[i][::-1]:
print("Match found")
del list[i][::-1]
print(list)
But finally I get the same list as original. I am not sure if my matching condition is correct.

You could iterate over the elements of the list, and use a set to keep track of those that have been seen so far. Using a set is a more convenient way to check for membership, since the operation has a lower complexity, and in that case you'll need to work with tuples, since lists aren't hashable. Then just keep those items if neither the actual tuple or the reversed have been seen (if you just want to ignore those which have a reversed you just need if tuple(reversed(t)) in s):
s = set()
out = []
for i in l:
t = tuple(i)
if t in s or tuple(reversed(t)) in s:
continue
s.add(t)
out.append(i)
print(out)
# [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '6']]

lists = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
for x in lists:
z=x[::-1]
if z in lists:
lists.remove(z)
Explanation: While looping over lists, reverse each element and store in 'z'. Now, if 'z' exists in lists, remove it using remove()
The problem with your solution is you are checking while using index 'i' which means if an element at 'i' is equal to its reverse which can never happen!! hence getting the same results

Approach1:
new_list = []
for l in List:
if l not in new_list and sorted(l) not in new_list:
new_list.append(l)
print(new_list)
Approach2:
You can try like this also:
seen = set()
print([x for x in List if frozenset(x) not in seen and not seen.add(frozenset(x))])
[['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '6']]

my_list = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
my_list = list(set([sorted(l) for l in my_list]))

This is similar to solution by #Mehul Gupta, but I think their solution is traversing the list twice if matched: one for checking and one for removing. Instead, we could
the_list = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
for sub_list in the_list:
try:
idx = the_list.index(sub_list[::-1])
except ValueError:
continue
else:
the_list.pop(idx)
print(the_list)
# [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '6']]
because it is easier to ask for forgiveness than permission.
Note: Removing elements whilst looping is not a good thing but for this specific problem, it does no harm. In fact, it is better because we do not check the mirrored again; we already removed it.

As I have written in a comment, do never use list (or any built-in) as a variable name:
L = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
Have a look at your code:
for i in range(len(L)):
if L[i] == L[i][::-1]:
print("Match found")
del L[i][::-1]
There are two issues. First, you compare L[i] with L[i][::-1], but you want to compare L[i] with L[j][::-1] for any j != i. Second, you try to delete elements of a list during an iteration. If you delete an element, then the list length is decreased and the index of the loop will be out of the bounds of list:
>>> L = [1,2,3]
>>> for i in range(len(L)):
... del L[i]
...
Traceback (most recent call last):
...
IndexError: list assignment index out of range
To fix the first issue, you can iterate twice over the elements: for each element, is there another element that is the reverse of the first? To fix the second issue, you have two options: 1. build a new list; 2. proceed in reverse order, to delete first the last indices.
First version:
new_L = []
for i in range(len(L)):
for j in range(i+1, len(L)):
if L[i] == L[j][::-1]:
print("Match found")
break
else: # no break
new_L.append(L[i])
print(new_L)
Second version:
for i in range(len(L)-1, -1, -1):
for j in range(0, i):
if L[i] == L[j][::-1]:
print("Match found")
del L[i]
print(L)
(For a better time complexity, see #yatu's answer.)
For a one-liner, you can use the functools module:
>>> L = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
>>> import functools
>>> functools.reduce(lambda acc, x: acc if x[::-1] in acc else acc + [x], L, [])
[['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '6']]
The logic is the same as the logic of the first version.

You can try this also:-
l = [['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '1'], ['4', '1'], ['2', '6']]
res = []
for sub_list in l:
if sub_list[::-1] not in res:
res.append(sub_list)
print(res)
Output:-
[['1', '2'], ['1', '3'], ['1', '4'], ['1', '5'], ['2', '6']]

Remove duplicate items from list

I tried following this post but, it doesnt seem to be working for me.
I tried this code:
for bresult in response.css(LIST_SELECTOR):
NAME_SELECTOR = 'h2 a ::attr(href)'
yield {
'name': bresult.css(NAME_SELECTOR).extract_first(),
}
b_result_list.append(bresult.css(NAME_SELECTOR).extract_first())
#set b_result_list to SET to remove dups, then change back to LIST
set(b_result_list)
list(set(b_result_list))
for brl in b_result_list:
print("brl: {}".format(brl))
This prints out:
brl: https://facebook.site.com/users/login
brl: https://facebook.site.com/users
brl: https://facebook.site.com/users/login
When I just need:
brl: https://facebook.site.com/users/login
brl: https://facebook.site.com/users
What am I doing wrong here?
Thank you!

you are discarding the result when you need to save it ... b_result_list never actually changes... so you are just iterating over the original list. instead save the result of the set operation
b_result_list = list(set(b_result_list))
(note that sets do not preserve order)

If you want to maintain order and uniqueify, you can do:
>>> li
['1', '1', '2', '2', '3', '3', '3', '3', '1', '1', '4', '5', '4', '6', '6']
>>> seen=set()
>>> [e for e in li if not (e in seen or seen.add(e))]
['1', '2', '3', '4', '5', '6']
Or, you can use the keys of an OrderedDict:
>>> from collections import OrderedDict
>>> OrderedDict([(k, None) for k in li]).keys()
['1', '2', '3', '4', '5', '6']
But a set alone may substantially change the order of the original list:
>>> list(set(li))
['1', '3', '2', '5', '4', '6']

Python few elements of the list were skipped while removing them in for loop

I am wondering what is the reason why some elements weren't removed in this program. Could someone provide pointers?
Program:
t = ['1', '2', '2', '2', '2', '2', '2', '2', '2', '2', '7', '8', '9', '10']
print len(t)
for x in t:
if x == '2':
print x
t.remove(x)
else:
print 'hello: '+str(x)
print t
Output on my system:
14
hello: 1
2
2
2
2
2
hello: 8
hello: 9
hello: 10
['1', '2', '2', '2', '2', '7', '8', '9', '10']
I am using Python 2.6.2.

Never alter the sequence on which you're iterating.
#cjonhson318's list-comprehension will work fine, or, less efficiently but more closely akin to your code, just loop on a copy of the list while you're altering the list itself:
for x in list(t):
if x == '2':
print x
t.remove(x)
else:
print 'hello: '+str(x)
As you see the only change from your code is looping on list(t) (a copy of t's initial value) rather than on t itself -- this modest change lets you alter t itself within the loop to your heart's contents.

Say something like:
t = [ i for i in t if i != '2' ]
for item in t:
print "Hello "+item

An alternative would be to get functional
from operator import ne
from functools import partial
t = ['1', '2', '2', '2', '2', '2', '2', '2', '2', '2', '7', '8', '9', '10']
for n in filter(partial(ne, '2'), t):
print('hello {}'.format(n))
Use the filter function to create a new list minus the 2 values.
If the use of partial and operator.ne was not to your liking, you could use a lambda
for n in filter(lambda x: x != '2', t):
print('hello {}'.format(n))

How could i refresh a list once an item has been removed from a list within a list in python

This is quite complicated but i would like to be able to refresh a larger list once at item has been taken out of a mini list within the bigger list.
listA = ['1','2','3','4','5','6','6','8','9','5','3','7']
i used the code below to split it into lists of threes
split = [listA[i:(i+3)] for i in range(0, len(listA) - 1, 3)]
print(split)
# [['1','2','3'],['4','5','6'],['6','8','9'],['5','3','7']]
split = [['1','2','3'],['4','5','6'],['6','8','9'],['5','3','7']]
if i deleted #3 from the first list, split will now be
del split[0][-1]
split = [['1','2'],['4','5','6'],['6','8','9'],['5','3','7']]
after #3 has been deleted, i would like to be able to refresh the list so that it looks like;
split = [['1','2','4'],['5','6','6'],['8','9','5'],['3','7']]
thanks in advance

Not sure how big this list is getting, but you would need to flatten it and recalculate it:
>>> listA = ['1','2','3','4','5','6','6','8','9','5','3','7']
>>> split = [listA[i:(i+3)] for i in range(0, len(listA) - 1, 3)]
>>> split
[['1', '2', '3'], ['4', '5', '6'], ['6', '8', '9'], ['5', '3', '7']]
>>> del split[0][-1]
>>> split
[['1', '2'], ['4', '5', '6'], ['6', '8', '9'], ['5', '3', '7']]
>>> listA = sum(split, []) # <- flatten split list back to 1 level
>>> listA
['1', '2', '4', '5', '6', '6', '8', '9', '5', '3', '7']
>>> split = [listA[i:(i+3)] for i in range(0, len(listA) - 1, 3)]
>>> split
[['1', '2', '4'], ['5', '6', '6'], ['8', '9', '5'], ['3', '7']]

Just recreate the single list from your nested lists, then re-split.
You can join the lists, assuming they are only one level deep, with something like:
rejoined = [element for sublist in split for element in sublist]
There are no doubt fancier ways, or single-liners that use itertools or some other library, but don't overthink it. If you're only talking about a few hundred or even a few thousand items this solution is quite good enough.

I need this for turning of cards in the deck in a solitaire game.
You can deal your cards using itertools.groupby() with a good key function:
def group_key(x, n=3, flag=[0], counter=itertools.count(0)):
if next(counter) % n == 0:
flag[0] = flag[0] ^ 1
return flag[0]
^ is a bitwise operator, basically it change the value of the flag from 0 to 1 and viceversa. The flag value is an element of a list because we're doing some kind of memoization.
Example:
>>> deck = ['1', '2', '3', '4', '5', '6', '6', '8', '9', '5', '3', '7']
>>> for k,g in itertools.groupby(deck, key=group_key):
... print(list(g))
['1', '2', '3']
['4', '5', '6']
['6', '8', '9']
['5', '3', '7']
Now let's say you've used card '9' and '8', so your new deck looks like:
>>> deck = ['1', '2', '3', '4', '5', '6', '6', '5', '3', '7']
>>> for k,g in itertools.groupby(deck, key=group_key):
... print(list(g))
['1', '2', '3']
['4', '5', '6']
['6', '5', '3']
['7']

Build an object that contains a list and tracks when the list is altered (probably by controlling write to it), then have the object do it's own split every time the data is altered and save the split list to a member of the object.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

how to split a list every nth item - python

List=['1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan'] new_list = list() for k in range(len(List)//7): new_list.append(List[k7:k7+5]) new_list.append(List[-len(List)%7])

Straightforward solution in case if the list doesn’t follow the rules you mentioned but you want to split sequence always between NAN's: result, temp = [], [] for item in lst: if item != 'nan': temp.append(item) elif temp: result.append(list(temp)) temp = []

Using itertools.groupby would also support chunks of different lengths: [list(v) for k, v in groupby(List, key='nan'.ne) if k]

I guess there is more pythonic way to do the same but: result = [] while (len(List) > 5): result.append(List[0:0+5]) del List[0:0+5] del List[0:2] This results: [['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5']]

mainlist=[] sublist=[] count=0 for i in List: if i!="nan" : if count==4: # delete next two items mainlist.append(sublist) count=0 sublist=[] else: # split first 5 items sublist.append(i) count+=1

Related

Find all possible varients of max pair of 2

Remove sublist duplicates including reversed

Remove duplicate items from list

Python few elements of the list were skipped while removing them in for loop

How could i refresh a list once an item has been removed from a list within a list in python

Categories

Resources

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

how to split a list every nth item - python

List=['1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan'] new_list = list() for k in range(len(List)//7): new_list.append(List[k*7:k*7+5]) new_list.append(List[-len(List)%7])

Straightforward solution in case if the list doesn’t follow the rules you mentioned but you want to split sequence always between NAN's: result, temp = [], [] for item in lst: if item != 'nan': temp.append(item) elif temp: result.append(list(temp)) temp = []

Using itertools.groupby would also support chunks of different lengths: [list(v) for k, v in groupby(List, key='nan'.__ne__) if k]

I guess there is more pythonic way to do the same but: result = [] while (len(List) > 5): result.append(List[0:0+5]) del List[0:0+5] del List[0:2] This results: [['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5'], ['1', '2', '3', '4', '5']]

mainlist=[] sublist=[] count=0 for i in List: if i!="nan" : if count==4: # delete next two items mainlist.append(sublist) count=0 sublist=[] else: # split first 5 items sublist.append(i) count+=1

Related

Find all possible varients of max pair of 2

Remove sublist duplicates including reversed

Remove duplicate items from list

Python few elements of the list were skipped while removing them in for loop

How could i refresh a list once an item has been removed from a list within a list in python

Categories

Resources

List=['1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan','1','2','3','4','5','nan','nan'] new_list = list() for k in range(len(List)//7): new_list.append(List[k7:k7+5]) new_list.append(List[-len(List)%7])

Using itertools.groupby would also support chunks of different lengths: [list(v) for k, v in groupby(List, key='nan'.ne) if k]