How to get particular index number of list items - python

my_list = ['A', 'B', 'C', 'D', 'E', 'B', 'F', 'D', 'C', 'B']
idx = my_list.index('B')
print("index :", idx)
Here I used the .index() method.
for i in my_list:
    print(f"index no. {my_list.index(i)}")
I tried to find the index of each item in my_list.
But it gives the same result for equal values, even though they are located in different places in the list.
if 'B' == my_list[(len(my_list) - 1)]:
    print("True")
if 'B' == my_list[(len(my_list) - 4)]:
    print("True")
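list.index() returns the position of the first match only, which is why repeated values report the same number. A minimal sketch, assuming the goal is to see every position: enumerate pairs each item with its own index. The helper name positions_of is hypothetical and not part of the original post.

```python
my_list = ['A', 'B', 'C', 'D', 'E', 'B', 'F', 'D', 'C', 'B']


def positions_of(items, target):
    """Return every index at which target occurs in items (hypothetical helper)."""
    return [idx for idx, value in enumerate(items) if value == target]


# enumerate yields (index, value) pairs, so duplicates keep their own index
for idx, value in enumerate(my_list):
    print(f"index no. {idx} -> {value}")

print(positions_of(my_list, 'B'))  # [1, 5, 9]
```

Because the index comes from the loop itself rather than from a search, equal values no longer collapse onto the first occurrence.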
I need to refer to particular values by their index (to do something with them).
For example, suppose I need to nest the values of a list.
i.e.:
my_list_2 = ['A', 'B', '2', 'C', '3', 'D', '4', 'E', 'B', '2', 'F', '6', 'D', 'C', '3', 'B']
#             -    ------    ------    ------    -    ------    ------    -    ------    -
I want to nest each letter with the numeric item that directly follows it, and
nest the remaining letters with a '*' mark (as the default), because no numeric item follows them.
So how do I refer to each (string) value and (numeric) value in code in order to nest them?
For my example, the expected result is:
--> my_list_2 = [['A', ''], ['B', '2'], ['C', '3'], ['D', '4'], ['E', ''], ['B', '2'], ['F', '6'], ['D', ''], ['C', '3'], ['B', '']]
This is the code I tried:
def_setter = [
    [my_list_2[i], '*'] if my_list_2[i].isalpha() and my_list_2[i + 1].isalpha() else [my_list_2[i], my_list_2[i + 1]]
    for i in range(0, len(my_list_2) - 1)]
print("Result : ", def_setter)
But it did not give me the expected result.
Could you please help me with this?

There might be a more Pythonic way to reorganize this list. However, with the following loop you can walk through the list and append [letter, value] if the next item is a number, or [letter, ''] if the next item is a letter.
def_setter = []
i = 0
while i < len(my_list_2):
    if i + 1 == len(my_list_2):
        if my_list_2[i].isalpha():
            def_setter.append([my_list_2[i], ''])
        break
    prev, cur = my_list_2[i], my_list_2[i + 1]
    if cur.isalpha():
        def_setter.append([prev, ''])
        i += 1
    else:
        def_setter.append([prev, cur])
        i += 2
print(def_setter)
>>> [['A', ''],
['B', '2'],
['C', '3'],
['D', '4'],
['E', ''],
['B', '2'],
['F', '6'],
['D', ''],
['C', '3'],
['B', '']]
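The same pairing can also be done in a single pass without manual index arithmetic. This is only a sketch under the assumption that "numeric" means str.isdigit() is true and that a number always belongs to the letter directly before it. The function name pair_with_numbers and its default parameter are hypothetical.

```python
def pair_with_numbers(items, default=''):
    """Pair each item with the digit string that directly follows it (sketch)."""
    pairs = []
    for item in items:
        # A digit string fills the open slot of the pair created just before it.
        if item.isdigit() and pairs and pairs[-1][1] == default:
            pairs[-1][1] = item
        else:
            # Anything else starts a new pair with the placeholder value.
            pairs.append([item, default])
    return pairs


my_list_2 = ['A', 'B', '2', 'C', '3', 'D', '4', 'E', 'B', '2',
             'F', '6', 'D', 'C', '3', 'B']
print(pair_with_numbers(my_list_2))
print(pair_with_numbers(my_list_2, default='*'))
```

Passing default='*' produces the '*' placeholder mentioned in the question instead of the empty string.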

Related

Python: sort 2D list according to the properties of sublists

I have a 2D list:
ls = [
['-2,60233106656288100', '2', 'C'],
['-9,60233106656288100', '2', 'E'],
['-4,60233106656288100', '2', 'E'],
['-3,60233106656288100', '2', 'C'],
['-5,60233106656288100', '4', 'T'],
['-0,39019660724115224', '3', 'E'],
['-3,60233106656288100', '2', 'T'],
['-6,01086748514074000', '1', 'Q'],
['-5,02684650459461800', '0', 'X'],
['-1,25228509312138300', 'A', 'N'],
['-0,85517128843547330', '3', 'E'],
['1,837508975733196200', '3', '-', 'E'],
['1,850925075915637700', '5', '-', 'T'],
['1,826767133229081000', '4', '-', 'C'],
['1,845357865328532300', '3', '-', 'E'],
['0,636275318914609100', 'a', 'n', 'N']
]
I want to sort it first so that the shorter sublists are sorted according to the second column and after that according to the third column so that the list stays sorted according to the second column (first row has 0 in the second column, then 1, then five twos etc. but the twos switch places so that I first have two E's and then two C's and then T). After that I want to sort the longer sublists according to the fourth column. The row where I have A should be the last one of the shorter lists and the row where I have a should be the last row. So the output should be as follows:
[
['-5,02684650459461800', '0', 'X'],
['-6,01086748514074000', '1', 'Q'],
['-9,60233106656288100', '2', 'E'],
['-4,60233106656288100', '2', 'E'],
['-3,60233106656288100', '2', 'C'],
['-2,60233106656288100', '2', 'C'],
['-3,60233106656288100', '2', 'T'],
['-0,39019660724115224', '3', 'E'],
['-0,85517128843547330', '3', 'E'],
['-5,60233106656288100', '4', 'T'],
['-1,25228509312138300', 'A', 'N'],
['1,837508975733196200', '3', '-', 'E'],
['1,845357865328532300', '3', '-', 'E'],
['1,826767133229081000', '4', '-', 'C'],
['1,850925075915637700', '5', '-', 'T'],
['0,636275318914609100', 'a', 'n', 'N']
]
I know that I can sort according to the second column as:
ls.sort(key=lambda x:x[1])
But this sorts the whole list and gives:
['-5,02684650459461800', '0', 'X']
['-6,01086748514074000', '1', 'Q']
['-2,60233106656288100', '2', 'C']
['-9,60233106656288100', '2', 'E']
['-4,60233106656288100', '2', 'E']
['-3,60233106656288100', '2', 'C']
['-3,60233106656288100', '2', 'T']
['-0,39019660724115224', '3', 'E']
['-0,85517128843547330', '3', 'E']
['1,837508975733196200', '3', '-', 'E']
['1,845357865328532300', '3', '-', 'E']
['-5,60233106656288100', '4', 'T']
['1,826767133229081000', '4', '-', 'C']
['1,850925075915637700', '5', '-', 'T']
['-1,25228509312138300', 'A', 'N']
['0,636275318914609100', 'a', 'n', 'N']
How can I implement the sorting so that I can choose a certain portion of the list and then sort it and after that sort it again according to other column?
If I understand you correctly, you want to sort the list:
first by the len of the sublists,
then by each of the elements in the sublist, except for the first, using the next element as a tie-breaker in case the previous ones are all equal.
For this, you can use a tuple as the sort key, made of the len and a slice of the sublist starting at the second element (i.e. at index 1):
ls.sort(key=lambda x: (len(x), x[1:]))
Note that this will also use elements after the fourth as further tie-breakers, which might not be wanted. Also this creates temporary (near) copies of all the sublists, which may be prohibitive if the lists are longer, even if all comparisons may be decided after the 3rd or 4th element.
Alternatively, if you only need the first four, or ten, or whatever number of elements, you can create a closed slice and use that to compare:
ls.sort(key=lambda x: (len(x), x[1:4]))
Since out-of-bounds slices are evaluated as empty lists, this works even if the lists have fewer elements than either the start- or end-index.
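A short illustration of the slicing behaviour this relies on, using made-up rows (the names row and rows are only for this sketch):

```python
row = ['-5,02', '0', 'X']

# Slices never raise IndexError; they are clipped to the available elements.
print(row[1:4])   # ['0', 'X']
print(row[5:9])   # []

rows = [
    ['a', '3', '-', 'E'],
    ['b', '2', 'T'],
    ['c', '2', 'C'],
    ['d', '0', 'X'],
]
# Shorter rows come first; within one length, rows compare by columns 2 to 4.
rows.sort(key=lambda x: (len(x), x[1:4]))
print([r[0] for r in rows])   # ['d', 'c', 'b', 'a']
```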
How about:
ls.sort(key=lambda x: (l := len(x), x[1], '' if l < 4 else x[3]))
That sorts by the length of the sublist first, then by the 2nd column, and finally by the 4th column if there is one (using '' when there isn't, which sorts before any non-empty string). Note that the := operator requires Python 3.8 or later.
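For the part of the question about sorting only a portion of the list with its own rule, one option is to split the rows, sort each part with a separate key, and join them again. The sketch below uses made-up rows, and the two keys are assumptions chosen for illustration, not the exact ordering requested in the question.

```python
rows = [
    ['r1', '2', 'C'],
    ['r2', '3', '-', 'E'],
    ['r3', '0', 'X'],
    ['r4', '2', 'A'],
    ['r5', '1', '-', 'T'],
]

# Split by sublist length so each group can get its own sort key.
short_rows = [r for r in rows if len(r) == 3]
long_rows = [r for r in rows if len(r) == 4]

short_rows.sort(key=lambda r: (r[1], r[2]))   # 2nd column, then 3rd
long_rows.sort(key=lambda r: (r[1], r[3]))    # 2nd column, then 4th

result = short_rows + long_rows
print([r[0] for r in result])   # ['r3', 'r4', 'r1', 'r5', 'r2']
```

Since list.sort is stable, the same effect can also be reached with several passes over one list, sorting by the least significant column first.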

Creating a tree diagram from list of lists

I have a 2D list:
#      #                #          ^      #    ^           #         ^
l = [['A', '1', '2'], ['B', 'xx', 'A'], ['C', 'B', 's'], ['D', 'd', 'B']]
and the first element in each list can be treated as an #ID string (in the example: A, B, C, D). Anywhere where the ID's (A, B, C, D) occur in the second dimension's lists I would like to replace it with the content of the actual list. Example: ['B', 'xx', 'A'] should become ['B', 'xx', ['A', '1', '2']] because A is an #ID (first string of list) and it occurs in the second list. Output should be:
n = [['A', '1', '2'], ['B', 'xx', ['A', '1', '2']], ['C', ['B', 'xx', ['A', '1', '2']], 's'],
['D', 'd', ['B', 'xx', ['A', '1', '2']]]]
The problem I am facing is that there can be longer lists and more branches, so it gets complicated. In the end I am trying to build a tree diagram. I was thinking of first calculating the deepest level of branching, but I don't have a solution in mind yet.
l = [['A', '1', '2'], ['B', 'xx', 'A'], ['C', 'B', 's'], ['D', 'd', 'B']]
dic = {i[0]:i for i in l}
for i in l:
    fv = i[0]
    for j, v in enumerate(i):
        if v in dic and j!=0:
            dic[fv][j] = dic[v]
res = [v for i,v in dic.items()]
print(res)
output
[['A', '1', '2'],
['B', 'xx', ['A', '1', '2']],
['C', ['B', 'xx', ['A', '1', '2']], 's'],
['D', 'd', ['B', 'xx', ['A', '1', '2']]]]
Have you tried using a dictionary? If you have the IDs, you can look them up while looping through the list and replace the matching entries. Below is what I had:
l = [['A', '1', '2'], ['B', 'xx', 'A'], ['C', 'B', 's'], ['D', 'd', 'B'], ['E', 'C', 'b']]
dt = {}
for i in l:
    dt[i[0]] = i
for i in range(len(l)):
    for j in range(1, len(l[i])):
        if(l[i][j] in dt):
            l[i][j] = dt.get(l[i][j])
print(l)
Another more succinct version:
d = {item[0]: item for item in l}
for item in l:
    item[1:] = [d.get(element, element) for element in item[1:]]
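The approaches above substitute references to the same list objects, so every occurrence of an ID shares one nested list, and a row that refers back to itself would become self-referential. If independent copies and an explicit error on cycles are preferred, a recursive variant is one option. This is a sketch under those assumptions; the names expand and build are hypothetical.

```python
def expand(rows):
    """Return new nested lists where each ID reference is replaced by its row."""
    by_id = {row[0]: row for row in rows}

    def build(row_id, seen=()):
        # seen holds the IDs on the current path, so a loop is detected early.
        if row_id in seen:
            raise ValueError(f"cyclic reference at {row_id!r}")
        row = by_id[row_id]
        return [row[0]] + [
            build(value, seen + (row_id,)) if value in by_id else value
            for value in row[1:]
        ]

    return [build(row[0]) for row in rows]


l = [['A', '1', '2'], ['B', 'xx', 'A'], ['C', 'B', 's'], ['D', 'd', 'B']]
print(expand(l))
```

The input list is left untouched, which can be convenient if the flat form is still needed after the tree has been built.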

How to join innermost elements of a deep nested list using zip

Suppose that I have the following list of lists containing lists:
samples = [
    # First sample
    [
        # Think 'x' as in input variable in ML
        [
            ['A','E'], # Data
            ['B','F'] # Metadata
        ],
        # Think 'y' as in target variable in ML
        [
            ['C','G'], # Data
            ['D','H'], # Metadata
        ]
    ],
    # Second sample
    [
        [
            ['1'],
            ['2']
        ],
        [
            ['3'],
            ['4']
        ]
    ]
]
The output that I'm after looks like the following:
>>> samples
[
['A','E','1'], # x.data
['B','F','2'], # x.metadata
['C','G','3'], # y.data
['D','H','4'] # y.metadata
]
My question: is there a way to use Python's zip function, and maybe some list comprehensions, to achieve this?
I have searched for some solutions, but for example this and this deal with using zip to address different lists, not inner lists.
A way to achieve this could very well be just a simple iteration over the samples like this:
x,x_len,y,y_len=[],[],[],[]
for sample in samples:
    # extend (not append) so the inner lists are joined rather than nested
    x.extend(sample[0][0])
    x_len.extend(sample[0][1])
    y.extend(sample[1][0])
    y_len.extend(sample[1][1])
samples = [
x,
x_len,
y,
y_len
]
I'm still curious whether there is a way to use zip instead of a for loop over the samples and their nested lists.
Note that the data and metadata can vary in length across samples.
IIUC, one way is to use itertools.chain to flatten the results of zip(samples):
from itertools import chain
new_samples = [
    list(chain.from_iterable(y)) for y in zip(
        *((chain.from_iterable(*x)) for x in zip(samples))
    )
]
print(new_samples)
#[['A', 'E', '1'], ['B', 'F', '2'], ['C', 'G', '3'], ['D', 'H', '4']]
Step by step explanation
1) First call zip on samples:
print(list(zip(samples)))
#[([[['A', 'E'], ['B', 'F']], [['C', 'G'], ['D', 'H']]],),
# ([[['1'], ['2']], [['3'], ['4']]],)]
Notice that in the two lines in the output above, if the elements were flattened, you'd have the structure needed to zip in order to get your final results.
2) Use itertools.chain to flatten (which will be much more efficient than using sum).
print([list(chain.from_iterable(*x)) for x in zip(samples)])
#[[['A', 'E'], ['B', 'F'], ['C', 'G'], ['D', 'H']],
# [['1'], ['2'], ['3'], ['4']]]
3) Now call zip again:
print(list(zip(*((chain.from_iterable(*x)) for x in zip(samples)))))
#[(['A', 'E'], ['1']),
# (['B', 'F'], ['2']),
# (['C', 'G'], ['3']),
# (['D', 'H'], ['4'])]
4) Now you basically have what you want, except the lists are nested. So use itertools.chain again to flatten the final list.
print(
    [
        list(chain.from_iterable(y)) for y in zip(
            *((chain.from_iterable(*x)) for x in zip(samples))
        )
    ]
)
#[['A', 'E', '1'], ['B', 'F', '2'], ['C', 'G', '3'], ['D', 'H', '4']]
Here's another solution. Quite ugly, but it does use zip, even twice!
>>> sum(map(lambda y: list(map(lambda x: sum(x, []), zip(*y))), zip(*samples)), [])
[['A', 'E', '1'], ['B', 'F', '2'], ['C', 'G', '3'], ['D', 'H', '4']]
It is interesting to see how it works, but please don't actually use it; it is both hard to read and algorithmically bad.
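To make the "algorithmically bad" remark concrete: sum(lists, []) builds a brand-new list at every addition, so the total work grows roughly quadratically with the number of pieces, while itertools.chain.from_iterable walks each element once. A small sketch with made-up data showing that both give the same result:

```python
from itertools import chain

nested = [['A', 'E'], ['1'], ['5']]

# sum() evaluates [] + ['A', 'E'], then that result + ['1'], and so on,
# copying the accumulated list each time.
flat_sum = sum(nested, [])

# chain.from_iterable yields the items lazily; list() collects them once.
flat_chain = list(chain.from_iterable(nested))

print(flat_sum)
print(flat_chain)
print(flat_sum == flat_chain)
```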
You could do:
res = [[y for l in x for y in l] for x in zip(*([x for var in sample for x in var] for sample in samples))]
print([list(i) for i in res])
Gives on your example:
[['A', 'E', '1'], ['B', 'F', '2'], ['C', 'G', '3'], ['D', 'H', '4']]
This basically flattens each "sample" to a list and packs those into one big list, then unpacks that into zip, and finally flattens each zipped element into a list.
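The one-liner may be easier to follow when split into the three steps just described. This is the same logic with intermediate names (flat_samples, columns) that are introduced here only for illustration:

```python
samples = [
    [[['A', 'E'], ['B', 'F']], [['C', 'G'], ['D', 'H']]],
    [[['1'], ['2']], [['3'], ['4']]],
]

# 1) Flatten each sample one level: [x.data, x.metadata, y.data, y.metadata]
flat_samples = [[field for var in sample for field in var] for sample in samples]

# 2) zip(*...) groups the same field across all samples
columns = list(zip(*flat_samples))

# 3) Join the per-sample pieces of each field into one list
res = [[item for piece in column for item in piece] for column in columns]
print(res)
```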
That is not the most comfortable data structure to work with. I would advise refactoring the code to keep the data in something other than triply nested lists, but if that is not currently possible, I suggest the following approach:
import itertools
def flatten(iterable):
    yield from itertools.chain.from_iterable(iterable)
result = []
for elements in zip(*map(flatten, samples)):
    result.append(list(flatten(elements)))
For your example it gives:
[['A', 'E', '1'],
['B', 'F', '2'],
['C', 'G', '3'],
['D', 'H', '4']]
Test for more than 2 samples:
samples = [[[['A', 'E'], ['B', 'F']],
[['C', 'G'], ['D', 'H']]],
[[['1'], ['2']],
[['3'], ['4']]],
[[['5'], ['6']],
[['7'], ['8']]]]
gives:
[['A', 'E', '1', '5'],
['B', 'F', '2', '6'],
['C', 'G', '3', '7'],
['D', 'H', '4', '8']]
Explanation:
The flatten generator function simply flattens 1 level of a nested iterable. It is based on itertools.chain.from_iterable function. In map(flatten, samples) we apply this function to each element of samples:
>>> map(flatten, samples)
<map at 0x3c6685fef0> # <-- map object returned, to see result wrap it in `list`:
>>> list(map(flatten, samples))
[<generator object flatten at 0x0000003C67A2F9A8>, # <-- will flatten the 1st sample
<generator object flatten at 0x0000003C67A2FA98>, # <-- ... the 2nd
<generator object flatten at 0x0000003C67A2FB10>] # <-- ... the 3rd and so on if there are more
# We can see what each generator will give by applying `list` on each one of them
>>> list(map(list, map(flatten, samples)))
[[['A', 'E'], ['B', 'F'], ['C', 'G'], ['D', 'H']],
[['1'], ['2'], ['3'], ['4']],
[['5'], ['6'], ['7'], ['8']]]
Next, we can use zip to iterate over the flattened samples. Note that we cannot apply it on map object directly:
>>> list(zip(map(flatten, samples)))
[(<generator object flatten at 0x0000003C66944138>,),
(<generator object flatten at 0x0000003C669441B0>,),
(<generator object flatten at 0x0000003C66944228>,)]
we should unpack it first:
>>> list(zip(*map(flatten, samples)))
[(['A', 'E'], ['1'], ['5']),
(['B', 'F'], ['2'], ['6']),
(['C', 'G'], ['3'], ['7']),
(['D', 'H'], ['4'], ['8'])]
# or in a for loop:
>>> for elements in zip(*map(flatten, samples)):
...     print(elements)
(['A', 'E'], ['1'], ['5'])
(['B', 'F'], ['2'], ['6'])
(['C', 'G'], ['3'], ['7'])
(['D', 'H'], ['4'], ['8'])
Finally, we just have to join all the lists in each elements tuple together. We can use the same flatten function for that:
>>> for elements in zip(*map(flatten, samples)):
...     print(list(flatten(elements)))
['A', 'E', '1', '5']
['B', 'F', '2', '6']
['C', 'G', '3', '7']
['D', 'H', '4', '8']
And you just have to put it all back in a list as shown in the first code sample.

Remove space character in list

How do I go about removing the space (' ') in this list?
list = ['a', 'b', 'c', ' ', '1', '2', '3', ' ', 'd', 'e','f']
As far as I know, pop / remove method works with slices but the space character changes position depending on the input.
A conditional comprehension will do:
lst = ['a', 'b', 'c', ' ', '1', '2', '3', ' ', 'd', 'e','f'] # do not shadow 'list'
lst = [x for x in lst if x != ' ']
If you have to mutate the existing list object and not just rebind the variable, use slice assignment:
lst[:] = [x for x in lst if x != ' ']
In case you want to remove any string that consists solely of whitespace characters, you can utilize str.strip()
lst = [x for x in lst if x.strip()]
Note that rebuilding the list from scratch is often better performance-wise than repeatedly calling del, pop or remove as each of those calls has linear complexity since all the elements after the deletion index need to be shifted in the underlying array.
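The difference between rebinding and slice assignment only shows up when another name refers to the same list. A small sketch with made-up names (original, alias, rebound):

```python
original = ['a', 'b', ' ', '1', ' ', 'd']
alias = original   # a second name for the same list object

# Building a new list leaves the existing object as it was.
rebound = [x for x in original if x != ' ']
print(alias)       # still contains the spaces

# Slice assignment replaces the contents of the existing object.
original[:] = [x for x in original if x != ' ']
print(alias)       # the spaces are gone here too
print(alias is original)
```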
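You can also use the del statement to delete elements from the list in place. Deleting while iterating forward over the same list skips the element after each deletion, so walk the indices from the end instead.
Code :
lst = ['a', 'b', 'c', ' ', '1', '2', '3', ' ', 'd', 'e','f']
# iterate backwards so deletions do not shift the indices still to be visited
for idx in range(len(lst) - 1, -1, -1):
    if lst[idx] == ' ':
        del lst[idx]
print(lst)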
Output :
['a', 'b', 'c', '1', '2', '3', 'd', 'e', 'f']
Below is the functional approach to achieve what you want:
list_input = ['a', 'b', 'c', ' ', '1', '2', '3', ' ', 'd', 'e','f']
print(list(filter(lambda elem: elem != ' ', list_input)))
# Output: ['a', 'b', 'c', '1', '2', '3', 'd', 'e', 'f']
More pythonic list-comprehension approach:
list_input = ['a', 'b', 'c', ' ', '1', '2', '3', ' ', 'd', 'e','f']
print([elem for elem in list_input if elem != ' '])
# Output: ['a', 'b', 'c', '1', '2', '3', 'd', 'e', 'f']
For completeness, itertools offers filterfalse:
from itertools import filterfalse
list(filterfalse(lambda x: x == ' ', lst))
#=> ['a', 'b', 'c', '1', '2', '3', 'd', 'e', 'f']

How to count the number of times a certain pattern in a sublist occurs within a list and then append that count to the sublist?

The challenge is that I want to count the number of times a certain pattern of items occurs in a sub-list at certain indices.
For example, I'd like to count the number of times a unique pattern occurs at index 0 and index 1. 'a' and 'z' occur three times below at index 0 and index 1, while '1' and '2' occur two times. I'm only concerned with the pair that occurs at index 0 and 1; I'd like to know the count of each unique pair and then append that count back to the sub-list.
List = [['a','z','g','g','g'],['a','z','d','d','d'],['a','z','z','z','d'],['1','2','f','f','f'],['1','2','3','f','f'],['1','1','g','g','g']]
Desired_List = [['a','z','g','g','g',3],['a','z','d','d','d',3],['a','z','z','z','d',3],['1','2','f','f','f',2],['1','2','3','f','f',2],['1','1','g','g','g',1]]
Currently, my attempt is this:
from collections import Counter
l1 = Counter(map(lambda x: (x[0] + "|" + x[1]),List))
Deduped_Original_List = map(lambda x: Counter.keys().split("|"),l1)
Counts = map(lambda x: Counter.values(),l1)
for ele_a, ele_b in zip(Deduped_Original_List, Counts):
    ele_a.append(ele_b)
This clearly doesn't work because in the process I lose index 2,3, and 4.
You can use list comprehension with collections.Counter:
from collections import Counter
lst = [['a','z','g','g','g'],['a','z','d','d','d'],['a','z','z','z','d'],['1','2','f','f','f'],['1','2','3','f','f'],['1','1','g','g','g']]
cnt = Counter([tuple(l[:2]) for l in lst])
lst_output = [l + [cnt[tuple(l[:2])]] for l in lst]
print(lst_output)
Output:
[['a', 'z', 'g', 'g', 'g', 3], ['a', 'z', 'd', 'd', 'd', 3], ['a', 'z', 'z', 'z', 'd', 3], ['1', '2', 'f', 'f', 'f', 2], ['1', '2', '3', 'f', 'f', 2], ['1', '1', 'g', 'g', 'g', 1]]
>>> import collections
>>> List = [['a','z','g','g','g'],['a','z','d','d','d'],['a','z','z','z','d'],['1','2','f','f','f'],['1','2','3','f','f'],['1','1','g','g','g']]
>>> patterns = ['az', '12']
>>> answer = collections.defaultdict(int)
>>> for subl in List:
...     for pattern in patterns:
...         if all(a==b for a,b in zip(subl, pattern)):
...             answer[pattern] += 1
...             break
...
>>> for i,subl in enumerate(List):
...     if ''.join(subl[:2]) in answer:
...         List[i].append(answer[''.join(subl[:2])])
...
>>> List
[['a', 'z', 'g', 'g', 'g', 3], ['a', 'z', 'd', 'd', 'd', 3], ['a', 'z', 'z', 'z', 'd', 3], ['1', '2', 'f', 'f', 'f', 2], ['1', '2', '3', 'f', 'f', 2], ['1', '1', 'g', 'g', 'g']]
>>>
I like the Counter approach of YS-L. Here is another approach:
>>> List = [['a','z','g','g','g'], ['a','z','d','d','d'], ['a','z','z','z','d'],['1','2','f','f','f'], ['1','2','3','f','f'], ['1','1','g','g','g']]
>>> d = {}
>>> for i in List:
...     key = i[0] + i[1]
...     if not d.get(key, None): d[key] = 1
...     else: d[key] += 1
...
>>> Desired_List = [li + [d[li[0] + li[1]]] for li in List]
>>> Desired_List
[['a', 'z', 'g', 'g', 'g', 3], ['a', 'z', 'd', 'd', 'd', 3], ['a', 'z', 'z', 'z', 'd', 3], ['1', '2', 'f', 'f', 'f', 2], ['1', '2', '3', 'f', 'f', 2], ['1', '1', 'g', 'g', 'g', 1]]
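Since the question mentions counting "at certain indices", the Counter idea can be generalized to any set of positions. The sketch below assumes tuple keys, which also avoid a subtle issue with concatenated string keys: 'ab' + 'c' and 'a' + 'bc' would both give 'abc'. The function name append_key_counts and its indices parameter are hypothetical.

```python
from collections import Counter


def append_key_counts(rows, indices=(0, 1)):
    """Return copies of rows with the frequency of their key columns appended."""
    def key(row):
        # A tuple keeps the column values separate, unlike string concatenation.
        return tuple(row[i] for i in indices)

    counts = Counter(key(row) for row in rows)
    return [row + [counts[key(row)]] for row in rows]


List = [['a', 'z', 'g', 'g', 'g'], ['a', 'z', 'd', 'd', 'd'],
        ['a', 'z', 'z', 'z', 'd'], ['1', '2', 'f', 'f', 'f'],
        ['1', '2', '3', 'f', 'f'], ['1', '1', 'g', 'g', 'g']]
print(append_key_counts(List))
```

Calling it with, for example, indices=(0, 2) would count pairs taken from the first and third columns instead.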
