Restructure lists of lists by iteration in Python

I have a 2D list or list of lists.
Input file is
A 58.76-65.9
B 58.76-65.9
C 58.76-65.9
A 24.8-62.8
I then created a list of lists:
with open("Input.txt", "r") as file:
    raw = [[str(x) for x in line.split()] for line in file]
print(raw)
which returns
[['A', '58.76-65.9'], ['B', '58.76-65.9'], ['C', '58.76-65.9'], ['A', '24.8-62.8']]
My aim is now to create a new list of lists with a different structure. How can I obtain a new list of lists like this?
[['58.76-65.9', 'A', 'B', 'C'], ['A', '24.8-62.8']]
I first tried taking the union of sets, but that creates one large list and I need a list of lists. Therefore my plan is to (1) create a new empty list of lists,
(2) iterate through the original list of lists,
(3) check whether the 2nd element (e.g. 58.76-65.9) already exists in the new list of lists. If it does not, add both elements as a new sublist. If it does, add just the first element (e.g. A) to the existing sublist.
# Defining empty list
matches = []
# Accessing each row in the 2D list
for r in raw:
    if r[1] not in matches[0][]:
        matches.append([r[1], r[0]])
I realize that matches[0][] is not correct, what is the correct way to access it?

Use the grouping idiom:
>>> data = [['A', '58.76-65.9'], ['B', '58.76-65.9'], ['C', '58.76-65.9'], ['A', '24.8-62.8']]
>>> from collections import defaultdict
>>> grouper = defaultdict(list)
>>> for x, y in data:
...     grouper[y].append(x)
...
>>> grouper
defaultdict(<class 'list'>, {'24.8-62.8': ['A'], '58.76-65.9': ['A', 'B', 'C']})
Now, I honestly think the above data-structure is much more practical, but you can easily convert into a list-of-lists if you really want:
>>> [[k] + v for k, v in grouper.items()]
[['24.8-62.8', 'A'], ['58.76-65.9', 'A', 'B', 'C']]
Or even nicer:
>>> [[k, *v] for k, v in grouper.items()]
[['24.8-62.8', 'A'], ['58.76-65.9', 'A', 'B', 'C']]
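The grouping idiom above can be wrapped in a small function. This is only a sketch: the name group_by_second is made up for illustration, and the input is the list from the question rather than the file. On Python 3.7+, where dicts preserve insertion order, the groups come out in the order each value is first seen; older interpreters may order them differently.

```python
from collections import defaultdict


def group_by_second(pairs):
    """Group the first item of each [label, value] pair by its value."""
    groups = defaultdict(list)
    for label, value in pairs:
        groups[value].append(label)
    # Put each key in front of the labels collected for it.
    return [[value, *labels] for value, labels in groups.items()]


raw = [['A', '58.76-65.9'], ['B', '58.76-65.9'],
       ['C', '58.76-65.9'], ['A', '24.8-62.8']]
grouped = group_by_second(raw)
print(grouped)
```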

Just use the dictionary data structure. It does what you want:
# Load data:
my_array = [[1, 10], [2, 10], [3, 20]]
# Result as a dictionary:
result = {}
# Loop over data:
for value, key in my_array:
    if key not in result:
        # Create new list
        result[key] = []
    result[key].append(value)
# If you really need a list of lists as output, do something like:
result_l = [list(elem) for elem in result.items()]
# (in python3) this gives [[10, [1, 2]], [20, [3]]]
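For comparison, the plan described in the question (look for an existing sublist whose first element matches, otherwise start a new one) might be sketched with plain lists as below. It uses a for/else loop in place of the invalid matches[0][] lookup. It scans every existing group for each row, so the dictionary approach is likely the better choice for large inputs.

```python
raw = [['A', '58.76-65.9'], ['B', '58.76-65.9'],
       ['C', '58.76-65.9'], ['A', '24.8-62.8']]

matches = []
for label, value in raw:
    for group in matches:
        if group[0] == value:
            # The value already heads a group: add only the label.
            group.append(label)
            break
    else:
        # No group starts with this value yet: create one.
        matches.append([value, label])

print(matches)
```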

Related

I have a list and numpy array of lists. How to find the index and extract the value from index?

I have a list,
list = ["A","B","C","D","E"]
values = np.array([[1,0,0,1,1],[0,1,0,0,1]])
where values is of type numpy array.
I want my output to look like,
["A","D","E"]
["B","E"]
I want to loop through every row in values and find the indices of the elements equal to 1. Using those indices, I want to look up the names at the same positions in the list and store them as a list inside a DataFrame. This has to be done for every row in values.
Kindly help. Thanks
Try a list comprehension:
l = ["A","B","C","D","E"]
values = ([[1,0,0,1,1],[0,1,0,0,1]])
print([[x for x, y in zip(l, i) if y] for i in values])
Output:
[['A', 'D', 'E'], ['B', 'E']]
Try itertools.compress:
>>> from itertools import compress
>>> import numpy as np
>>> list_ = ["A","B","C","D","E"]
>>> values = np.array([[1,0,0,1,1],[0,1,0,0,1]])
>>> result = [[*compress(list_,val)] for val in values]
>>> print(*result, sep = '\n')
['A', 'D', 'E']
['B', 'E']
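Since the question also asks for the indices themselves, here is a minimal sketch that collects them first and then looks up the names. It uses plain nested lists so that it runs without NumPy; iterating over the rows of a 2D NumPy array should work the same way. The variable names are chosen for illustration.

```python
names = ["A", "B", "C", "D", "E"]
values = [[1, 0, 0, 1, 1], [0, 1, 0, 0, 1]]

# Positions of the non-zero flags in each row.
indices = [[i for i, flag in enumerate(row) if flag] for row in values]

# Names found at those positions.
selected = [[names[i] for i in row_indices] for row_indices in indices]

print(indices)
print(selected)
```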

Find index element in a list of lists and strings

I have a list containing strings and lists. Something like:
l = ['a', 'b', ['c', 'd'], 'e']
I need to find the index of an element I'm looking for in this nested list. For instance, if I need to find c, the function should return 2, and if I'm looking for d, it should return 2 too. Consider that I have to do this for a large number of elements. Before I was simply using
idx = list.index(element)
but this does not work anymore, because of the nested lists. I cannot simply flatten the list, as I then shall use the index in another list with the same shape as this one.
Any suggestion?
This is one approach: iterate over the list.
Ex:
l = ['a', 'b', ['c', 'd'], 'e']
toFind = "c"
toFind1 = "d"
for i, v in enumerate(l):
    if isinstance(v, list):
        if toFind1 in v:
            print(i)
    else:
        if toFind1 == v:
            print(i)
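Because the question mentions a large number of lookups, it may be worth building a lookup table once instead of scanning the list for every element. The sketch below assumes the elements are hashable and that the nesting is only one level deep; build_index is a hypothetical helper name.

```python
def build_index(nested):
    """Map each element to the top-level position it belongs to."""
    lookup = {}
    for i, item in enumerate(nested):
        members = item if isinstance(item, list) else [item]
        for member in members:
            # setdefault keeps the first position, like list.index does.
            lookup.setdefault(member, i)
    return lookup


l = ['a', 'b', ['c', 'd'], 'e']
lookup = build_index(l)
print(lookup['c'], lookup['d'], lookup['e'])
```

Each later lookup is then a single dictionary access rather than a pass over the whole list.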

Extract unique list from nested list in Python

I want to extract the unique sublists from a nested list, see below. I implemented this in two ways. The first one works, but the second one fails. Is new_data empty while the comprehension is being evaluated? And how do I fix it?
data = [
    ['a', 'b'],
    ['a', 'c'],
    ['a', 'b'],
    ['b', 'a']
]
# working
new_data = []
for d in data:
    if d not in new_data:
        new_data.append(d)
print(new_data)
# [['a', 'b'], ['a','c'], ['b','a']]
# Failed to extract unique list
new_data = []
new_data = [d for d in data if d not in new_data]
print(new_data)
# [['a', 'b'], ['a', 'c'], ['a', 'b'], ['b', 'a']]
Just try:
new_data = [list(y) for y in set([tuple(x) for x in data])]
You cannot use set() on a list of lists because lists are not hashable. So you convert the list of lists into a list of tuples, apply set() to remove the duplicates, and then convert the deduplicated tuples back into a list of lists. Note that a set does not preserve the original order.
You could use enumerate to test that there are no copies before the current value, so that only the first instance of each sublist is taken:
new_data = [item for index, item in enumerate(data) if item not in data[:index]]
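As for why the comprehension in the question fails: the right-hand side is evaluated completely before the name new_data is rebound, so during the comprehension new_data still refers to the empty list from the previous line and every item passes the test. An order-preserving alternative, sketched here with a helper set of tuples (the name seen is just for illustration), avoids rescanning the data for every item:

```python
data = [
    ['a', 'b'],
    ['a', 'c'],
    ['a', 'b'],
    ['b', 'a'],
]

seen = set()
new_data = []
for d in data:
    key = tuple(d)  # lists are not hashable, tuples are
    if key not in seen:
        seen.add(key)
        new_data.append(d)

print(new_data)
```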

Replacing an element in a list with multiple elements

I am trying to modify a list of two lists. For each of the two inside lists, I perform some operation and 'split' them into new lists.
Here is a simple example of what I'm trying to do:
[['a', 'b'], ['c', 'd']] --> [['a'], ['b'], ['c', 'd']]
Currently my algorithm passes ['a', 'b'] to a function that determines whether or not it should be split into [['a'], ['b']] (e.g. based on their correlations). The function returns [['a'], ['b']] which tells me that ['a', 'b'] should be split, or returns ['a', 'b'] (the original list) which indicates that it should not be split.
Currently I have something like this:
blist = [['a', 'b'], ['c', 'd']] #big list
slist = [['a'], ['b']] #small list returned by function
nlist = [items for i in xrange(len(blist)) for items in (slist if i==0 else blist[i])]
This produces [['a'], ['b'], 'c', 'd'] as opposed to the desired output [['a'], ['b'], ['c', 'd']] which does not alter the second list in the original blist. I understand why this is happening--my second loop is also applied to blist[1] in this case, but I am not sure how to fix it as I do not understand list comprehension completely.
A 'pythonic' solution is preferred.
Any feedback would be appreciated, thank you!
EDIT: Like the title suggests, I am trying to 'replace' ['a', 'b'] with ['a'], ['b']. So I would like the 'position' to be the same, having ['a'], ['b'] appear in the original list before ['c', 'd']
RESULTS
Thank you Christian, Paul and schwobaseggl for your solutions! They all work :)
Try
... else [blist[i]])]
to create a list of lists.
You can use slice assignment:
>>> l1 = [[1, 2], [3, 4]]
>>> l2 = [[1], [2]]
>>> l1[0:1] = l2
>>> l1
[[1], [2], [3, 4]]
This changes l1, so if you want to keep it make a copy before.
Another way that doesn't change l1 is addition:
>>> l1 = [[1, 2], [3, 4]]
>>> l3 = l2 + l1[1:]
>>> l3
[[1], [2], [3, 4]]
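The addition approach generalizes to any position. A minimal sketch, using the blist and slist names from the question and a hypothetical helper called replace_at:

```python
def replace_at(outer, index, replacement):
    """Return a new list where outer[index] is replaced by the items of replacement."""
    return outer[:index] + replacement + outer[index + 1:]


blist = [['a', 'b'], ['c', 'd']]  # big list
slist = [['a'], ['b']]            # small list returned by the split function

nlist = replace_at(blist, 0, slist)
print(nlist)
print(blist)  # the original list is left unchanged
```

The result is a shallow copy, so the inner lists are shared with the originals.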
You could alter your split function to return structurally adequate lists. Then you can use a comprehension:
def split_or_not(l):
    if condition:  # split
        return [l[:1], l[1:]]
    return [l]  # wrap in extra list
# using map
nlist = [x for sub_l in map(split_or_not, blist) for x in sub_l]
# or nested comprehension
nlist = [x for sub_l in (split_or_not(l) for l in blist) for x in sub_l]
Assuming you have the mentioned function that decides whether to split an item:
def munch(item):
    if item[0] == 'a':  # split
        return [[item[0]], [item[1]]]
    return [item]  # don't split
You can use it in a simple for-loop.
nlist = []
for item in blist:
    nlist.extend(munch(item))
"Pythonic" is whatever is easy to read and understand. Don't use list comprehensions just because you can.

Combining elements in list using python

Given input:
list = [['a']['a', 'c']['d']]
Expected output:
mylist = a,c,d
I tried various approaches, but the error received is TypeError: list indices must be integers, not tuple.
Tried:
1.
k= []
list = [['a']['a', 'c']['d']]
#k=str(list)
for item in list:
    k+=item
print k
2.
print zip(*list)
etc.
I also want to strip the opening and closing brackets from the output.
What you want is to flatten a list.
>>> import itertools
>>> l
[['a'], ['a', 'c'], ['d']]
>>> res = list(itertools.chain.from_iterable(l))
>>> res
['a', 'a', 'c', 'd']
>>> set(res) #for uniqify, but doesn't preserve order
{'a', 'c', 'd'}
Edit: Your problem is that, when defining a list, you should separate the values with commas. So, not:
list = [['a']['a', 'c']['d']]
Use commas:
list = [['a'], ['a', 'c'], ['d']]
Also, using list as a variable name is a bad idea, because it shadows the built-in list type.
And, if you want to use a for loop:
l = [['a'], ['a', 'c'], ['d']]
k = []
for sublist in l:
    for item in sublist:
        if item not in k:  # if you want the list to be unique
            k.append(item)
But using itertools.chain is a better idea and more Pythonic, I think.
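To get the a,c,d form shown in the question while keeping first-seen order, one option is to combine itertools.chain with dict.fromkeys and str.join. This is a sketch that relies on dicts preserving insertion order (Python 3.7+); the names nested, unique and mylist are chosen for illustration.

```python
from itertools import chain

nested = [['a'], ['a', 'c'], ['d']]

# Flatten, then drop duplicates while keeping first-seen order.
unique = list(dict.fromkeys(chain.from_iterable(nested)))

# Join into a plain comma-separated string without brackets.
mylist = ",".join(unique)

print(unique)
print(mylist)
```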
While utdemir's answer does the job efficiently, I think you should read this - start from "11.6. Recursion".
The first example deals with a similar problem, so you'll see how to handle these kinds of problems using the basic tools.
