Need take indices from list in specific order

Need take indices from list in specific order - python

I have some list A = ['a', 'b', 'c', 'd', 'e', 'f']
I need take indices of elements in this order 0 1 1 2 2 3 3 4 4 5
But my code made this order 0 1 2 3 4 5
A = ['a', 'b', 'c', 'd', 'e', 'f']
for i in A:
print(A.index(i), end=' ')

If you have the desired indices why not try this:
X = [0, 1, 1, 2, 2, 3, 3, 4, 4, 5]
A = ['a', 'b', 'c', 'd', 'e', 'f']
for i in X:
print(A[i], end=' ')

Using list comprehension to extract the values corresponding to the indices.
X = [0, 1, 1, 2, 2, 3, 3, 4, 4, 5]
A = ['a', 'b', 'c', 'd', 'e', 'f']
new_list = [A[x] for x in X]
Update
How to make a flat list out of a list of lists Used to flatten nested list
list_of_list = [[x,x] for x in range(1,len(A))]
new_list = [0]+[item for sublist in list_of_list for item in sublist]

Related

Filter list using duplicates in another list

I have two equal-length lists, a and b:
a = [1, 1, 2, 4, 5, 5, 5, 6, 1]
b = ['a','b','c','d','e','f','g','h', 'i']
I would like to keep only those elements from b, which correspond to an element in a appearing for the first time. Expected result:
result = ['a', 'c', 'd', 'e', 'h']
One way of reaching this result:
result = [each for index, each in enumerate(b) if a[index] not in a[:index]]
# result will be ['a', 'c', 'd', 'e', 'h']
Another way, invoking Pandas:
import pandas as pd
df = pd.DataFrame(dict(a=a,b=b))
result = list(df.b[~df.a.duplicated()])
# result will be ['a', 'c', 'd', 'e', 'h']
Is there a more efficient way of doing this for large a and b?

You could try if this is faster:
firsts = {}
result = [firsts.setdefault(x, y) for x, y in zip(a, b) if x not in firsts]

Convert a list of string to category integer in Python

Given a list of string,
['a', 'a', 'c', 'a', 'a', 'a', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'b', 'b', 'b', 'd', 'b', 'b', 'b']
I would like to convert to an integer-category form
[0, 0, 2, 0, 0, 0, 3, 2, 3, 3, 3, 3, 2, 3, 3, 3, 3, 2, 3, 3, 3, 3, 2, 1, 1, 1, 3, 1, 1, 1]
This can achieve using numpy unique as below
ipt=['a', 'a', 'c', 'a', 'a', 'a', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'b', 'b', 'b', 'd', 'b', 'b', 'b']
_, opt = np.unique(np.array(ipt), return_inverse=True)
But, I curious if there is another alternative without the need to import numpy.

If you are solely interested in finding integer representation of factors, then you can use a dict comprehension along with enumerate to store the mapping, after using set to find unique values:
lst = ['a', 'a', 'c', 'a', 'a', 'a', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'b', 'b', 'b', 'd', 'b', 'b', 'b']
d = {x: i for i, x in enumerate(set(lst))}
lst_new = [d[x] for x in lst]
print(lst_new)
# [3, 3, 0, 3, 3, 3, 2, 0, 2, 2, 2, 2, 0, 2, 2, 2, 2, 0, 2, 2, 2, 2, 0, 1, 1, 1, 2, 1, 1, 1]
This approach can be used for general factors, i.e., the factors do not have to be 'a', 'b' and so on, but can be 'dog', 'bus', etc. One drawback is that it does not care about the order of factors. If you want the representation to preserve order, you can use sorted:
d = {x: i for i, x in enumerate(sorted(set(lst)))}
lst_new = [d[x] for x in lst]
print(lst_new)
# [0, 0, 2, 0, 0, 0, 3, 2, 3, 3, 3, 3, 2, 3, 3, 3, 3, 2, 3, 3, 3, 3, 2, 1, 1, 1, 3, 1, 1, 1]

You could take a note out of the functional programming book:
ipt=['a', 'a', 'c', 'a', 'a', 'a', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'd', 'd', 'd', 'd', 'c', 'b', 'b', 'b', 'd', 'b', 'b', 'b']
opt = list(map(lambda x: ord(x)-97, ipt))
This code iterates through the input array and passes each element through the lambda function, which takes the ascii value of the character, and subtracts 97 (to convert the characters to 0-25).
If each string isn't a single character, then the lambda function may need to be adapted.

You could write a custom function to do the same thing as you are using numpy.unique() for.
def unique(my_list):
''' Takes a list and returns two lists, a list of each unique entry and the index of
each unique entry in the original list
'''
unique_list = []
int_cat = []
for item in my_list:
if item not in unique_list:
unique_list.append(item)
int_cat.append(unique_list.index(item))
return unique_list, int_cat
Or if you wanted your indexing to be ordered.
def unique_ordered(my_list):
''' Takes a list and returns two lists, an ordered list of each unique entry and the
index of each unique entry in the original list
'''
# Unique list
unique_list = []
for item in my_list:
if item not in unique_list:
unique_list.append(item)
# Sorting unique list alphabetically
unique_list.sort()
# Integer category list
int_cat = []
for item in my_list:
int_cat.append(unique_list.index(item))
return unique_list, int_cat
Comparing the computation time for these two vs numpy.unique() for 100,000 iterations of your example list, we get:
numpy = 2.236004s
unique = 0.460719s
unique_ordered = 0.505591s
Showing that either option would be faster than numpty for simple lists. More complicated strings decrease the speed of unique() and unique_ordered much more than numpy.unique(). Doing 10,000 iterations of a random, 100 element list of 20 character strings, we get times of:
numpy = 0.45465s
unique = 1.56963s
unique_ordered = 1.59445s
So if efficiency was important and your list had more complex/a larger variety of strings, it would likely be better to use numpy.unique()

Get right label using indices?

Really stupid question as I am new to python:
If I have labels = ['a', 'b', 'c', 'd'],
and indics = [2, 3, 0, 1]
How should I get the corresponding label using each index so I can get: ['c', 'd', 'a', 'b']?

There are a few alternatives, one, is to use a list comprehension:
labels = ['a', 'b', 'c', 'd']
indices = [2, 3, 0, 1]
result = [labels[i] for i in indices]
print(result)
Output
['c', 'd', 'a', 'b']
Basically iterate over each index and fetch the item at that position. The above is equivalent to the following for loop:
result = []
for i in indices:
result.append(labels[i])
A third option is to use operator.itemgetter:
from operator import itemgetter
labels = ['a', 'b', 'c', 'd']
indices = [2, 3, 0, 1]
result = list(itemgetter(*indices)(labels))
print(result)
Output
['c', 'd', 'a', 'b']

Appending to a list of lists sequentially

I have two list of lists:
my_list = [[1,2,3,4], [5,6,7,8]]
my_list2 = [['a', 'b', 'c'], ['d', 'e', 'f']]
I want my output to look like this:
my_list = [[1,2,3,4,'a','b','c'], [5,6,7,8,'d','e','f']]
I wrote the following code to do this but I end up getting more lists in my result.
my_list = map(list, (zip(my_list, my_list2)))
this produces the result as:
[[[1, 2, 3, 4], ['a', 'b', 'c']], [[5, 6, 7, 8], ['d', 'e', 'f']]]
Is there a way that I can remove the redundant lists.
Thanks

Using zip is the right approach. You just need to add the elements from the tuples zip produces.
>>> my_list = [[1,2,3,4], [5,6,7,8]]
>>> my_list2 = [['a', 'b', 'c'], ['d', 'e', 'f']]
>>> [x+y for x,y in zip(my_list, my_list2)]
[[1, 2, 3, 4, 'a', 'b', 'c'], [5, 6, 7, 8, 'd', 'e', 'f']]

You can use zip in a list comprehension:
my_list = [[1,2,3,4], [5,6,7,8]]
my_list2 = [['a', 'b', 'c'], ['d', 'e', 'f']]
new_list = [i+b for i, b in zip(my_list, my_list2)]

As an alternative you may also use map with sum and lambda function to achieve this (but list comprehension approach as mentioned in other answer is better):
>>> map(lambda x: sum(x, []), zip(my_list, my_list2))
[[1, 2, 3, 4, 'a', 'b', 'c'], [5, 6, 7, 8, 'd', 'e', 'f']]

Repeat each elements based on a list of values

Is there a Python builtin that repeats each element of a list based on the corresponding value in another list? For example A in list x position 0 is repeated 2 times because of the value 2 at position 0 in the list y.
>>> x = ['A', 'B', 'C']
>>> y = [2, 1, 3]
>>> f(x, y)
['A', 'A', 'B', 'C', 'C', 'C']
Or to put it another way, what is the fastest way to achieve this operation?

Just use a simple list comprehension:
>>> x = ['A', 'B', 'C']
>>> y = [2, 1, 3]
>>> [x[i] for i in range(len(x)) for j in range(y[i])]
['A', 'A', 'B', 'C', 'C', 'C']
>>>

One way would be the following
x = ['A', 'B', 'C']
y = [2, 1, 3]
s = []
for a, b in zip(x, y):
s.extend([a] * b)
print(s)
result
['A', 'A', 'B', 'C', 'C', 'C']

from itertools import chain
list(chain(*[[a] * b for a, b in zip(x, y)]))
['A', 'A', 'B', 'C', 'C', 'C']
There is itertools.repeat as well, but that ends up being uglier for this particular case.

Try this
x = ['A', 'B', 'C']
y = [2, 1, 3]
newarray = []
for i in range(0,len(x)):
newarray.extend(x[i] * y[i])
print newarray

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Need take indices from list in specific order - python

I have some list A = ['a', 'b', 'c', 'd', 'e', 'f'] I need take indices of elements in this order 0 1 1 2 2 3 3 4 4 5 But my code made this order 0 1 2 3 4 5 A = ['a', 'b', 'c', 'd', 'e', 'f'] for i in A: print(A.index(i), end=' ')

If you have the desired indices why not try this: X = [0, 1, 1, 2, 2, 3, 3, 4, 4, 5] A = ['a', 'b', 'c', 'd', 'e', 'f'] for i in X: print(A[i], end=' ')

Related

Filter list using duplicates in another list

Convert a list of string to category integer in Python

Get right label using indices?

Appending to a list of lists sequentially

Repeat each elements based on a list of values

Categories

Resources