Manipulating a list of lists multiple times in order - python

I'm trying to alter a list of lists in multiple ways by using a function (as I will have more than one list of lists).
I know how to change something once, but how do I do more than that? I get the error:
AttributeError: 'int' object has no attribute 'insert'
I understand that the error essentially means (whatever I'm trying to use .insert() on is not a list) but I don't quite understand why it's not a list...
See my code below:
This works and gives me the desired output
list_of_list3 = [['a', 1], ['b', 2], ['c', 3]]
list_to_add = ['Z', 'X', 'Y']
for list_position in range(len(list_of_list3)):
    original_list = list_of_list3[list_position]
    element_to_add = list_to_add[list_position]
    original_list.insert(0, element_to_add)
print(list_of_list3)
This will give me what I want:
[['Z', 'a', 1], ['X', 'b', 2], ['Y', 'c', 3]]
However, what I need is a function which does more than one thing at once. I am trying the code below:
def output_function(add_list, list_of_list):
    for list_position in range(len(list_of_list)):
        list_within_list = list_of_list[list_position]
        add_element1 = add_list[list_position]  # The two lists will always have the same length
        list_within_list = list_within_list.pop()  # I want to remove the last element
        list_with_element1 = list_within_list.insert(0, add_element1)  # I then want to add a new element
    list_with_new_list = list_with_element1.insert(0, ['Column1', 'Column2', 'Column3'])  # Then I want to add a new list to the beginning of the list of lists
new_elements = ['A', 'B', 'C']
original_list_list = [['D', 1, 2], ['E', 3, 4], ['F', 5, 6]]
output_function(new_elements, original_list_list)
My desired output is (ultimately will turn this into a pandas df)
[['Column1', 'Column2', 'Column3'], ['A', 'D', 1], ['B','E', 3], ['C', 'F', 5]]
Any help is appreciated. Thanks!

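I believe you are misunderstanding what the methods you are calling return: pop() returns the removed element, not the shortened list, and insert() modifies the list in place and returns None.
Your comments indicate you want to throw away the popped element, but you are actually throwing away the list and keeping the element instead.
The problem is in these 2 lines: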
list_within_list = list_within_list.pop() # I want to remove the last element
list_with_element1 = list_within_list.insert(0, add_element1) # I then want to add a new element
One way to accomplish this:
list_with_element1 = list_within_list[:-1]
list_with_element1.insert(0, add_element1)
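Putting the pieces together, the whole function could look like the sketch below. It is only one possible arrangement, not necessarily what the asker ended up with: it assumes the header row is always ['Column1', 'Column2', 'Column3'] and that a new list should be returned instead of modifying the input in place.

```python
def output_function(add_list, list_of_list):
    """Return a new table: a header row, then every row with its last
    element dropped and the matching element of add_list prepended."""
    header = ['Column1', 'Column2', 'Column3']  # assumed fixed header
    result = [header]
    for add_element, row in zip(add_list, list_of_list):
        trimmed = row[:-1]              # copy of the row without its last element
        trimmed.insert(0, add_element)  # insert() returns None, so do not assign its result
        result.append(trimmed)
    return result


new_elements = ['A', 'B', 'C']
original_list_list = [['D', 1, 2], ['E', 3, 4], ['F', 5, 6]]
table = output_function(new_elements, original_list_list)
print(table)
# [['Column1', 'Column2', 'Column3'], ['A', 'D', 1], ['B', 'E', 3], ['C', 'F', 5]]
```

Because the rows are sliced rather than popped, original_list_list is left untouched and the function can be reused on other lists of lists.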

Related

Joining values of a list inside another list into a string

I'm trying to join letters into a string; the letters are inside lists, which are themselves inside a list. For example, the input looks like [['a', 'b', 'c'], ['d', 'e', 'f']], and I want the result to be 'ad be cf', which is basically combining the elements that sit at the same position in each inner list. I know how to join the elements into a string that looks like 'abcdef', but I don't know what to add in order to return a string that looks like the one above.
Any advice would be appreciated!
string = ''
new_grid = []
for a in grid:
    for b in a:
        string += b
return string
When you want to transpose lists into columns, you typically reach for zip(). For example:
l = [['a', 'b', 'c'], ['d', 'e', 'f']]
# make a list of columns
substrings = ["".join(sub) for sub in zip(*l)]
#['ad', 'be', 'cf']
print(" ".join(substrings))
# alternatively print(*substrings, sep=" ")
# ad be cf
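To make the transposition step visible, here is a small sketch that prints what zip(*grid) produces before joining. The helper name join_columns and its sep parameter are made up for illustration.

```python
grid = [['a', 'b', 'c'], ['d', 'e', 'f']]

# zip(*grid) pairs up the elements that share a position in each inner list
columns = list(zip(*grid))
print(columns)  # [('a', 'd'), ('b', 'e'), ('c', 'f')]


def join_columns(grid, sep=" "):
    """Join each column into a string, then join the columns with sep."""
    return sep.join("".join(column) for column in zip(*grid))


print(join_columns(grid))  # ad be cf
```

Note that zip() stops at the shortest inner list, so this assumes all rows have the same length.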
This works:
my_list = [['a', 'b', 'c'], ['d', 'e', 'f']]
sorted_list = [list(pair) for pair in zip(my_list[0], my_list[1])]
for i in range(3):
    string = ''.join(sorted_list[i])
    print(string, end=" ")
First, we pair each element of the first list with the corresponding element of the second using zip, then we join each pair into a string and print it out.
This solution may not be the most efficient, but it's simple to understand.
Another quick solution without zip could look like this:
my_list = [['a', 'b', 'c'], ['d', 'e', 'f']]
sorted_list = list(map(lambda a, b: a + b, my_list[0], my_list[1]))
print(" ".join(sorted_list))

Extract non-repeated lists from a list of lists based on a criteria on a common element in all lists

Is there any way to extract non-repeated lists from a list of lists based on a criteria on a common element in all lists? For example if I have the following list of list:
list_lists = [['a', [1,2]], ['b', [2,5]], ['c', [1,2]], ['d', [2,5]], ['e', [2,6]]]
Let's assume that my criterion for calling a list unique is its last element. Since [2,6] appears only once, ['e', [2,6]] is the only unique element, and I can say:
list_of_unique = [['e', [2,6]]]
A naive solution that first comes to mind:
list_lists = [['a', [1,2]], ['b', [2,5]], ['c', [1,2]], ['d', [2,5]], ['e', [2,6]]]
counter = {}
for i in range(len(list_lists)):
    last = tuple(list_lists[i][-1])
    if last not in counter:
        counter[last] = 1, i
    else:
        counter[last] = counter[last][0] + 1, i
print([list_lists[i] for c, i in counter.values() if c == 1])
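The same counting idea can be written more compactly with collections.Counter. This is a sketch of an alternative, not a correction of the code above; like that code, it assumes the last element of every inner list is itself a list that can be converted to a tuple.

```python
from collections import Counter

list_lists = [['a', [1, 2]], ['b', [2, 5]], ['c', [1, 2]], ['d', [2, 5]], ['e', [2, 6]]]

# Count how often each last element occurs (as a tuple, since lists are unhashable)
counts = Counter(tuple(item[-1]) for item in list_lists)

# Keep only the inner lists whose last element occurs exactly once
list_of_unique = [item for item in list_lists if counts[tuple(item[-1])] == 1]
print(list_of_unique)  # [['e', [2, 6]]]
```

Filtering the original list in a second pass keeps the results in their original order.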

Combine two lists for web scraping project

I have two lists: one is a basic list, with some being "new line" symbols (\n), and the other is a list of lists.
I would like to combine these, inserting the elements from the second list into the first list where \n appears so that the end result looks like this:
first_list = ['a','b','c',\n, 'd','e','f','g','h',\n]
second_list = [[1,2,3], [4,5,6]]
combine the two lists to get:
combined_list = ['a','b','c',1,2,3,'d','e','f','g','h',4,5,6].
I'm not quite sure why, but all of the \n's in the first list in my example have the same index position. Thus, when I try to loop through both lists to first find the position of the first \n and insert [1,2,3] at that point, it ends up inserting [1,2,3] at all positions where \n appears. I tried to simplify the problem here to make it easier to communicate, but the original problem comes from a web scraping project I am working on to retrieve information from Linkedin, with the elements in these lists being profile attributes for Linkedin users. Perhaps that could help to explain why the \n's all have the same index position?
Any help with how to properly combine these lists in the above way/explanations for why the \n's have the same index position would be greatly appreciated! Please let me know if I can provide any additional details. Thanks.
I know you mentioned there were some indexing issues with the \n values, but hopefully this sets you on the right track. It works for the simplified example data you provided (reformatted so the letters and \n are proper strings rather than variables):
l1 = ['a','b','c','\n','d','e','f','g','h','\n']
l2 = [[1,2,3], [4,5,6]]
l3 = []
n_count = 0
for item in l1:
    if item != '\n':
        l3.append(item)
    else:
        # replace the newline with the next sub-list from l2
        l3.extend(l2[n_count])
        n_count += 1
print(l3)
['a', 'b', 'c', 1, 2, 3, 'd', 'e', 'f', 'g', 'h', 4, 5, 6]
If you can figure out the indexing issue, this might help you with minor modifications.
I assume that List1 and/or List2 can be longer than shown.
The number of lists in List2 must be greater than or equal to the number of '\n's in List1.
List1 = ['a','b','c', '\n', 'd','e','f','g','h', '\n']
List2 = [[1,2,3], [4,5,6]]
# wanted = [a,b,c,1,2,3,d,e,f,g,h,4,5,6]
list3 = []
counter = 0
for val in List1:
    if val == '\n':
        list3.extend(List2[counter])
        counter += 1
    else:
        list3.append(val)
print(list3)
['a', 'b', 'c', 1, 2, 3, 'd', 'e', 'f', 'g', 'h', 4, 5, 6]
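As for why every '\n' seemed to have the same index: the original loop is not shown, so this is only a guess, but list.index() always returns the position of the first match, which would produce exactly that symptom. The sketch below shows the behaviour and a variant of the approach above that avoids indexes altogether by pulling replacements from an iterator. The function name fill_newlines is made up for illustration.

```python
first_list = ['a', 'b', 'c', '\n', 'd', 'e', 'f', 'g', 'h', '\n']
second_list = [[1, 2, 3], [4, 5, 6]]

# list.index() returns the FIRST matching position, however often you call it
print(first_list.index('\n'))  # 3


def fill_newlines(items, replacements):
    """Replace each '\n' in items with the elements of the next replacement list."""
    fillers = iter(replacements)
    combined = []
    for item in items:
        if item == '\n':
            # assumes there is a replacement for every '\n';
            # next() raises StopIteration if replacements run out
            combined.extend(next(fillers))
        else:
            combined.append(item)
    return combined


combined_list = fill_newlines(first_list, second_list)
print(combined_list)
# ['a', 'b', 'c', 1, 2, 3, 'd', 'e', 'f', 'g', 'h', 4, 5, 6]
```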

Pandas multiindex: get level values without duplicates

So I'm sure this is pretty trivial but I'm pretty new to python/pandas.
I want to get a certain column (the names of my measurements) of my MultiIndex as a list, to use it in a for loop later to name and save my plots. I'm pretty confident in getting the data I need from my dataframe, but I can't figure out how to get certain columns from my index.
So actually while writing the question I kind of figured the answer out but it still seems kind of clunky. There has to be a direct command to do this.
That would be my code:
a = df.index.get_level_values('File')
a = a.drop_duplicates()
a = a.values
index.levels
You can access unique elements of each level of your MultiIndex directly:
import pandas as pd
df = pd.DataFrame([['A', 'W', 1], ['B', 'X', 2], ['C', 'Y', 3],
                   ['D', 'X', 4], ['E', 'Y', 5]])
df = df.set_index([0, 1])
a = df.index.levels[1]
print(a)
Index(['W', 'X', 'Y'], dtype='object', name=1)
To understand the information available, see how the Index object is stored internally:
print(df.index)
MultiIndex(levels=[['A', 'B', 'C', 'D', 'E'], ['W', 'X', 'Y']],
           labels=[[0, 1, 2, 3, 4], [0, 1, 2, 1, 2]],
           names=[0, 1])
However, the below methods are more intuitive and better documented.
One point worth noting is you don't have to explicitly extract the NumPy array via the values attribute. You can iterate Index objects directly. In addition, method chaining is possible and encouraged with Pandas.
drop_duplicates / unique
Returns an Index object, with order preserved.
a = df.index.get_level_values(1).drop_duplicates()
# equivalently, df.index.get_level_values(1).unique()
print(a)
Index(['W', 'X', 'Y'], dtype='object', name=1)
set
Returns a set. Useful for O(1) lookup, but result is unordered.
a = set(df.index.get_level_values(1))
print(a)
{'X', 'Y', 'W'}
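If you want a plain Python list that keeps the first-seen order, dict.fromkeys() is one option. The sketch below uses a hard-coded list as a stand-in for list(df.index.get_level_values(1)), so it runs without pandas; the variable names are made up for illustration.

```python
# Stand-in for list(df.index.get_level_values(1)) from the example above
level_values = ['W', 'X', 'Y', 'X', 'Y']

# dict keys are unique and keep insertion order (Python 3.7+)
unique_in_order = list(dict.fromkeys(level_values))
print(unique_in_order)  # ['W', 'X', 'Y']

# The resulting list can be iterated directly, e.g. to name plots
for name in unique_in_order:
    print(f"plot_{name}.png")
```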

Replacing an element in a list with multiple elements

I am trying to modify a list of two lists. For each of the two inside lists, I perform some operation and 'split' them into new lists.
Here is a simple example of what I'm trying to do:
[['a', 'b'], ['c', 'd']] --> [['a'], ['b'], ['c', 'd']]
Currently my algorithm passes ['a', 'b'] to a function that determines whether or not it should be split into [['a'], ['b']] (e.g. based on their correlations). The function returns [['a'], ['b']] which tells me that ['a', 'b'] should be split, or returns ['a', 'b'] (the original list) which indicates that it should not be split.
Currently I have something like this:
blist = [['a', 'b'], ['c', 'd']] #big list
slist = [['a'], ['b']] #small list returned by function
nlist = [items for i in xrange(len(blist)) for items in (slist if i==0 else blist[i])]
This produces [['a'], ['b'], 'c', 'd'] as opposed to the desired output [['a'], ['b'], ['c', 'd']] which does not alter the second list in the original blist. I understand why this is happening--my second loop is also applied to blist[1] in this case, but I am not sure how to fix it as I do not understand list comprehension completely.
A 'pythonic' solution is preferred.
Any feedback would be appreciated, thank you!
EDIT: Like the title suggests, I am trying to 'replace' ['a', 'b'] with ['a'], ['b']. So I would like the 'position' to be the same, having ['a'], ['b'] appear in the original list before ['c', 'd']
RESULTS
Thank you Christian, Paul and schwobaseggl for your solutions! They all work :)
Try
... else [blist[i]])]
to create a list of lists.
You can use slice assignment:
>>> l1 = [[1, 2], [3, 4]]
>>> l2 = [[1], [2]]
>>> l1[0:1] = l2
>>> l1
[[1], [2], [3, 4]]
This changes l1, so if you want to keep the original, make a copy first.
Another way that doesn't change l1 is addition:
>>> l1 = [[1, 2], [3, 4]]
>>> l3 = l2 + l1[1:]
>>> l3
[[1], [2], [3, 4]]
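Both ideas can be combined into a small helper that replaces the element at any position without touching the original list. This is a sketch; the name replace_with_many is made up, and it assumes a non-negative index.

```python
def replace_with_many(items, index, replacements):
    """Return a copy of items with items[index] replaced by all of replacements.

    Assumes index is non-negative.
    """
    result = list(items)                     # shallow copy, the original stays intact
    result[index:index + 1] = replacements   # slice assignment splices the new items in
    return result


blist = [['a', 'b'], ['c', 'd']]  # big list
slist = [['a'], ['b']]            # small list returned by the split function
nlist = replace_with_many(blist, 0, slist)
print(nlist)  # [['a'], ['b'], ['c', 'd']]
print(blist)  # [['a', 'b'], ['c', 'd']]
```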
You could alter your split function to return structurally adequate lists. Then you can use a comprehension:
def split_or_not(l):
    if condition:  # split
        return [l[:1], l[1:]]
    return [l]  # wrap in extra list
# using map
nlist = [x for sub_l in map(split_or_not, blist) for x in sub_l]
# or nested comprehension
nlist = [x for sub_l in (split_or_not(l) for l in blist) for x in sub_l]
Assuming you have the mentioned function that decides whether to split an item:
def munch(item):
    if item[0] == 'a':  # split
        return [[item[0]], [item[1]]]
    return [item]  # don't split
You can use it in a simple for-loop.
nlist = []
for item in blist:
    nlist.extend(munch(item))
"Pythonic" is whatever is easy to read and understand. Don't use list comprehensions just because you can.
