Python list index splitting and manipulation


My question seems simple, but as a Python novice I am finding it too complex to work out, so here's the situation:
I need to take a list such as:
L = [(a, b, c), (d, e, d), (etc, etc, etc), (etc, etc, etc)]
and make each index an individual list so that I may pull elements from each index specifically. The problem is that the list I am actually working with contains hundreds of indices such as the ones above and I cannot make something like:
L_new = list(L['insert specific index here'])
for each one, as that would mean filling up memory with hundreds of lists, one per index of the first list, which seems far too time- and memory-consuming from my point of view. So my question is this: how can I separate those indices and then pull individual parts from them without creating hundreds of individual lists (or at least without needing hundreds of individual lines to create them)?

I might be misreading your question, but I'm inclined to say that you don't actually have to do anything to be able to index your tuples. See my comment, but: L[0][0] will give "a", L[0][1] will give "b", L[2][1] will give "etc" etc...
If you really want a clean way to turn this into a list of lists you could use a list comprehension:
cast = [list(entry) for entry in L]
In response to your comment: if you want to access across dimensions I would suggest list comprehension. For your comment specifically:
crosscut = [entry[0] for entry in L]
In response to comment 2: This is largely a part of a really useful operation called slicing. Specifically to do the referenced operation you would do this:
multiple_index = [entry[0:3] for entry in L]
Depending on your readability preferences there are actually a number of possibilities here:
list_of_lists = []
for sublist in L:
    list_of_lists.append(list(sublist))
Or, as a generator function if you want lazy evaluation (the name lists_from is arbitrary):
def lists_from(L):
    iterator = iter(L)
    for _ in range(len(L)):
        # next(iterator) replaces the Python 2-only iterator.next()
        yield list(next(iterator))
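As a small illustration of the indexing and comprehensions described above, here is a minimal sketch. The sample values are placeholders standing in for the question's data, not the asker's real list.

```python
# Hypothetical sample data standing in for the list in the question.
L = [("a", "b", "c"), ("d", "e", "f"), ("g", "h", "i")]

# Tuples support indexing directly, so no conversion is needed to read values.
first_of_first = L[0][0]
second_of_third = L[2][1]

# One comprehension converts every tuple, however many there are.
as_lists = [list(entry) for entry in L]

# Reading "across" the data: the first element of every tuple.
first_column = [entry[0] for entry in L]

print(first_of_first, second_of_third)
print(as_lists)
print(first_column)
```

Only one extra list is created here (as_lists), regardless of how many tuples L holds.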

What you have there is a list of tuples; access them like a list of lists:
L[3][2]
will get the third element from the fourth tuple in your list L (indices start at 0).

Two ways of using the inner lists:
for index, sublist in enumerate(L):
    # do something with sublist
    pass
or with an iterator:
iterator = iter(L)
sublist = next(iterator)  # <-- yields the first sublist
In both cases, the elements of sublist can be reached via
direct index:
sublist[2]
iteration:
iterator = iter(sublist)
next(iterator)  # <-- yields the first elem of sublist
for elem in sublist:
    # do something with my elem
    pass

Related

How do you convert a list of strings to separate strings in Python 3?

I want to know if you have a list of strings such as:
l = ['ACGAAAG', 'CAGAAGC', 'ACCTGTT']
How do you convert it to:
O = 'ACGAAAG'
P = 'CAGAAGC'
Q = 'ACCTGTT'
Can you do this without knowing the number of items in a list? You have to store them as variables.
(The variables don't matter.)

Welcome to SE!
Structure Known
If you know the structure of the list (that is, how many strings it holds), then you can simply unpack it:
O, P, Q = l
Structure Unknown
Unpack your list using a for loop. Do your work on each string inside the loop. For the below, I am simply printing each one:
for element in l:
    print(element)
Good luck!

If you don't know the number of items beforehand, a list is the right structure to keep the items in.
You can, though, cut off the first few known items and leave the unknown tail as a list:
a, b, *rest = ["ay", "bee", "see", "what", "remains"]
print("%r, %r, rest is %r" % (a, b, rest))

a,b,c = my_list
This will work as long as the number of elements in the list equals the number of variables you want to unpack into. It actually works with any iterable: tuple, list, set, etc.
If the list is longer, you can always access the first 3 elements, if that is what you want:
a = my_list[0]
b = my_list[1]
c = my_list[2]
or in one line
a, b, c = my_list[0], my_list[1], my_list[2]
Even better, with slice notation you can get a sublist holding just the first 3 elements:
a, b, c = my_list[:3]
Those will work as long as the list has at least 3 elements (or as many as the number of variables you want).
You can also use the extended unpacking notation:
a, b, c, *the_rest = my_list
Here the_rest is a list with everything else in the list after the first 3 elements; again, the list needs to be of size 3 or more.
That pretty much covers all the ways to extract a certain number of items.
Now, depending on what you are going to do with those, you may be better off with a regular loop:
for item in my_list:
    # do something with the current item, like printing it
    print(item)
In each iteration, item takes the value of one element of the list, so you can do what you need one item at a time.
If what you want is to take 3 items at a time in each iteration, there are several ways to do it,
for example:
for i in range(3, len(my_list) + 1, 3):  # + 1 so the final group of three is included
    a, b, c = my_list[i-3:i]
    print(a, b, c)
There are more fun constructs, like
it = [iter(my_list)]*3
for a, b, c in zip(*it):
    print(a, b, c)
and others in the itertools module.
But now you said something interesting: "so that every term is assigned to a variable". That is the wrong approach. You don't want an unknown number of variables running around; that gets messy very fast. You work with the list instead. If you want to do some work with each element in it, there are plenty of ways of doing so, like a list comprehension:
my_new_list = [ some_fun(x) for x in my_list ]
or the old way:
my_new_list = []
for x in my_list:
    my_new_list.append(some_fun(x))
Or, if you need to work with more than 1 item at a time, combine that with some of the above.
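To make the "3 items at a time" idea concrete, here is a minimal sketch. The helper name chunks_of_three and the sample data are made up for illustration. It also shows one difference worth knowing: the zip-based idiom drops an incomplete final group, while plain slicing keeps it.

```python
def chunks_of_three(items):
    """Yield consecutive slices of up to three items (hypothetical helper)."""
    for start in range(0, len(items), 3):
        yield items[start:start + 3]


my_list = ["a", "b", "c", "d", "e", "f", "g"]  # made-up sample data

sliced_groups = list(chunks_of_three(my_list))
print(sliced_groups)

# The zip-based idiom: three references to the same iterator.
it = [iter(my_list)] * 3
zipped_groups = list(zip(*it))
print(zipped_groups)
```

Which behaviour is right depends on whether a short trailing group should be processed or ignored.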

I do not know if your use case requires the strings to be stored in different variables. It usually is a bad idea.
But if you do need it, then you can use exec builtin which takes the string representation of a python statement and executes it.
list_of_strings = ['ACGAAAG', 'CAGAAGC', 'ACCTGTT']
# Dynamically generate variable names like the column names in an Excel sheet (A, B, C, ..., Z, AA, AB, ..., AAA, ...).
# In this specific case:
variable_names = ['A', 'B', 'C']
for vn, st in zip(variable_names, list_of_strings):
    exec('{} = "{}"'.format(vn, st))
Test it out: print(A, B, C) will output the three strings, and you can use A, B and C as variables in the rest of the program. (This works at module level; inside a function, exec does not create new local variables in Python 3.)
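Since the answer itself notes that separate variables are usually a bad idea, a dictionary is one possible alternative that keeps the names without exec. This is only a sketch; the names in variable_names are the same illustrative ones used above.

```python
list_of_strings = ["ACGAAAG", "CAGAAGC", "ACCTGTT"]
variable_names = ["A", "B", "C"]  # illustrative names, as in the answer above

# Map each name to its string instead of creating real variables.
named = dict(zip(variable_names, list_of_strings))

print(named["A"])
for name, value in named.items():
    print(name, value)
```

The lookup named["A"] plays the role of the variable A, and the code keeps working when the number of strings is not known in advance.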

Python: replace values of sublist, with values looked up from another sublist without indexing

Description
I have two lists of lists which are derived from CSVs (minimal working example below). The real dataset is too large to do this manually.
mainlist = [["MH75","QF12",0,38], ["JQ59","QR21",105,191], ["JQ61","SQ48",186,284], ["SQ84","QF36",0,123], ["GA55","VA63",80,245], ["MH98","CX12",171,263]]
replacelist = [["MH75","QF12","BA89","QR29"], ["QR21","JQ59","VA51","MH52"], ["GA55","VA63","MH19","CX84"], ["SQ84","QF36","SQ08","JQ65"], ["SQ48","JQ61","QF87","QF63"], ["MH98","CX12","GA34","GA60"]]
mainlist contains a pair of identifiers (mainlist[x][0], mainlist[x][1]), and these are associated with two integers (mainlist[x][2] and mainlist[x][3]).
replacelist is a second list of lists which also contains the same pairs of identifiers (but not in the same order within a pair, or across rows). All sublist pairs are unique. Importantly, replacelist[x][2],replacelist[x][3] corresponds to a replacement for replacelist[x][0],replacelist[x][1], respectively.
I need to create a new third list, newlist which copies mainlist but replaces the identifiers with those from replacelist[x][2],replacelist[x][3]
For example, given:
mainlist[2] is: [JQ61,SQ48,186,284]
The matching pair in replacelist is
replacelist[4]: [SQ48,JQ61,QF87,QF63]
Therefore the expected output is
newlist[2] = [QF87,QF63,186,284]
More clearly put:
if replacelist = [[A, B, C, D]]
A is replaced with C, and B is replaced with D.
but it may appear in mainlist as [[B, A]]
Note newlist row position uses the same as mainlist
Attempt
What has me totally stumped on a simple problem is I feel I can't use basic list comprehension [i for i in replacelist if i in mainlist] as the order within a pair changes, and if I sorted(list) then I lose information about what to replace the lists with. Current solution (with commented blanks):
newlist = []
for k in replacelist:
    for i in mainlist:
        if k[0] and k[1] in i:
            # retrieve mainlist order, then use some kind of indexing to check a series of nested if statements to work out positional replacement.
As you can see, this solution is clearly inefficient and I can't work out the best way to perform the final step in a few lines.
I can add more information if this is not clear

It'll help if you had replacelist as a dict:
mainlist = [["MH75","QF12",0,38], ["JQ59","QR21",105,191], ["JQ61","SQ48",186,284], ["SQ84","QF36",0,123], ["GA55","VA63",80,245], ["MH98","CX12",171,263]]
replacelist = [["MH75","QF12","BA89","QR29"], ["QR21","JQ59","VA51","MH52"], ["GA55","VA63","MH19","CX84"], ["SQ84","QF36","SQ08","JQ65"], ["SQ48","JQ61","QF87","QF63"], ["MH98","CX12","GA34","GA60"]]
# Key: the unordered pair of IDs; value: a dict mapping each old ID to its replacement
replacements = {frozenset(r[:2]): dict(zip(r[:2], r[2:])) for r in replacelist}
newlist = []
for *ids, val1, val2 in mainlist:
    reps = replacements[frozenset(ids)]
    newlist.append([reps[ids[0]], reps[ids[1]], val1, val2])

The first thing to do is transform both lists into dictionaries:
from collections import OrderedDict
maindct = OrderedDict((frozenset(item[:2]), item[2:]) for item in mainlist)
replacedct = {frozenset(item[:2]): item[2:] for item in replacelist}
# Now it is trivial to create a list with the desired output:
output_list = [replacedct[key] + maindct[key] for key in maindct]
The big deal here is that by using a dictionary you eliminate the search time for the indices in the replacement list. With a list, you have to scan the whole list for each item you have, so the running time grows with the square of the list length. With Python dictionaries, the lookup time is constant on average and does not depend on the data length.
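The two answers above appear to differ in one detail: when mainlist lists a pair in the opposite order from replacelist, one approach returns the new identifiers in replacelist order and the other swaps them to follow mainlist. The sketch below runs both readings on the question's data so the difference is visible. The variable names are made up for illustration.

```python
mainlist = [["MH75", "QF12", 0, 38], ["JQ59", "QR21", 105, 191],
            ["JQ61", "SQ48", 186, 284], ["SQ84", "QF36", 0, 123],
            ["GA55", "VA63", 80, 245], ["MH98", "CX12", 171, 263]]
replacelist = [["MH75", "QF12", "BA89", "QR29"], ["QR21", "JQ59", "VA51", "MH52"],
               ["GA55", "VA63", "MH19", "CX84"], ["SQ84", "QF36", "SQ08", "JQ65"],
               ["SQ48", "JQ61", "QF87", "QF63"], ["MH98", "CX12", "GA34", "GA60"]]

# Key each replacement row by its unordered pair of identifiers.
pair_lookup = {frozenset(row[:2]): row for row in replacelist}

in_replacelist_order = []  # new IDs exactly as they appear in replacelist
per_identifier = []        # each old ID swapped for its own replacement
for id1, id2, val1, val2 in mainlist:
    old_a, old_b, new_a, new_b = pair_lookup[frozenset((id1, id2))]
    in_replacelist_order.append([new_a, new_b, val1, val2])
    mapping = {old_a: new_a, old_b: new_b}
    per_identifier.append([mapping[id1], mapping[id2], val1, val2])

print(in_replacelist_order[2])
print(per_identifier[2])
```

The first result matches the expected output stated in the question for mainlist[2]; the second keeps each identifier's position from mainlist. Which one is wanted depends on how the question's "A is replaced with C, and B is replaced with D" is meant to be read.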

How to 'hide' an element in a list when passing list to a function in Python?

Suppose I have a list
myList = [a,b,c,d,e]
And a function
def doSomething(list):
    # Does something to the list
    pass
And I want to call the function iteratively like this:
doSomething([b,c,d,e])
doSomething([a,c,d,e])
doSomething([a,b,d,e])
doSomething([a,b,c,e])
doSomething([a,b,c,d])
The first thing that comes to mind would be something like this:
for x in range(0,len(myList)):
    del myList[x]
    doSomething(myList)
But this doesn't really work, because each time I call del it actually deletes the element. I sort of just want to 'hide' the element each time I call the function. Is there a way to do this?

You can use itertools.combinations for this:
import itertools
for sublist in itertools.combinations([a, b, c, d, e], 4):
    # 4 is the number of elements in each sublist.
    # If you do not know the length of the input list, use len() - 1
    doSomething(sublist)
This will make sublist a tuple. If you need it to be a list, you can call list() on it before passing it to doSomething().
If you care about the order in which the doSomething() calls are done, you will want to reverse the order of iteration so that it begins by removing the first element instead of the last element:
for sublist in reversed(list(itertools.combinations([a, b, c, d, e], 4))):
    doSomething(sublist)
This is less efficient because all of the sublists must be generated up front instead of one at a time. mgilson in the comments suggests reversing the input list and then reversing each sublist, which should be more efficient but the code may be harder to read.
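A runnable sketch of the ordering point made above, using placeholder strings for a to e and a hypothetical do_something that just prints its argument:

```python
import itertools

my_list = ["a", "b", "c", "d", "e"]  # placeholder values


def do_something(items):
    # Hypothetical stand-in for the question's doSomething().
    print(items)


# combinations() emits the subset missing the LAST element first.
subsets = list(itertools.combinations(my_list, len(my_list) - 1))

# Reversing gives the order asked for in the question:
# first without "a", then without "b", and so on.
ordered = [list(sublist) for sublist in reversed(subsets)]
for sublist in ordered:
    do_something(sublist)
```

The original my_list is never modified, which is the "hiding" behaviour the question asks for.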

Normally, looping over indices is a bad idea -- but in this case, it seems that you want to remove elements at a given index (iteratively) so looping over indices actually seems appropriate for once.
You could use list.pop for this purpose, but it turns out that would be an extra O(N) operation for each turn of the loop (once to copy the list, once to remove the i'th element). We can do it differently by removing the element while we're copying...
for i in range(len(lst)):
    new_list = [x for j, x in enumerate(lst) if j != i]
    doSomething(new_list)
Note however that it isn't guaranteed that this will be faster than the naive approach:
for i in range(len(lst)):
    new_list = lst[:]  # lst.copy() in python3.x
    new_list.pop(i)
    doSomething(new_list)
The naive approach has the advantage that any indexing that needs to be done in .pop is pushed to C code, which is generally faster than doing Python comparisons.
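A third variant, not mentioned in the answer, builds each shortened list from two slices. The sketch below wraps it in a generator; the name leave_one_out and the sample data are made up for illustration, and no claim is made about its speed relative to the two loops above.

```python
def leave_one_out(lst):
    """Yield a copy of lst with each index omitted in turn; lst is not modified."""
    for i in range(len(lst)):
        yield lst[:i] + lst[i + 1:]


my_list = ["a", "b", "c", "d", "e"]  # placeholder values

shortened_lists = list(leave_one_out(my_list))
for shortened in shortened_lists:
    print(shortened)
```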

How to iterate a list while deleting items from list using range() function? [duplicate]

This is the most common problem I face while trying to learn programming in python. The problem is, when I try to iterate a list using "range()" function to check if given item in list meets given condition and if yes then to delete it, it will always give "IndexError". So, is there a particular way to do this without using any other intermediate list or "while" statement? Below is an example:
l = range(20)
for i in range(0,len(l)):
    if l[i] == something:
        l.pop(i)

First of all, you never want to iterate over things like that in Python. Iterate over the actual objects, not the indices:
l = range(20)
for i in l:
    ...
The reason for your error was that you were removing an item, so the later indices cease to exist.
Now, you can't modify a list while you are looping over it, but that isn't a problem. The better solution is to use a list comprehension here, to filter out the extra items.
l = range(20)
new_l = [i for i in l if not i == something]
You can also use the filter() builtin, although that tends to be unclear in most situations (and slower where you need lambda).
Also note that in Python 3.x, range() produces a lazy range object, not a list (and not a generator either), so use list(range(20)) if you need an actual list.
It would also be a good idea to use more descriptive variable names - I'll presume here it's for example, but names like i and l are hard to read and make it easier to introduce bugs.
Edit:
If you wish to update the existing list in place, as pointed out in the comments, you can use the slicing syntax to replace each item of the list in turn (l[:] = new_l). That said, I would argue that that case is pretty bad design. You don't want one segment of code to rely on data being updated from another bit of code in that way.
Edit 2:
If, for any reason, you need the indices as you loop over the items, that's what the enumerate() builtin is for.

You can always do this sort of thing with a list comprehension:
newlist=[i for i in oldlist if not condition ]

As others have said, iterate over the list and create a new list with just the items you want to keep.
Use a slice assignment to update the original list in-place.
l[:] = [item for item in l if item != something]
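The difference between slice assignment and plain reassignment can be seen with a second name bound to the same list. This is a minimal sketch; the value of something and the sample list are assumptions for illustration.

```python
something = 3          # hypothetical value to remove
l = list(range(6))     # [0, 1, 2, 3, 4, 5]
alias = l              # a second name for the same list object

# Slice assignment replaces the contents of the existing list object.
l[:] = [item for item in l if item != something]
print(alias)           # alias sees the change

# Plain assignment binds the name l to a brand-new list instead.
l = [item for item in l if item != 0]
print(alias)           # alias still refers to the old object
print(l)
```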

You should look at the problem from the other side: add an element to the list only when it is not equal to "something". With a list comprehension:
l = [i for i in range(20) if i != something]  # xrange(20) in Python 2

You should not use for i in range(0,len(l)):. Use for i, item in enumerate(l): if you need the index, or for item in l: if not.
You should not modify a structure you are iterating over. When you have to, iterate over a copy instead.
Don't name a variable l (it may be mistaken for 1 or I).
If you want to filter a list, do so explicitly: use filter() or a list comprehension.
BTW, in your case, you could also do:
while something in list_: list_.remove(something)
That's not very efficient, though. But depending on context, it might be more readable.

The reason you're getting an IndexError is because you're changing the length of the list as you iterate in the for-loop. Basically, here's the logic...
#-- Build the original list: [0, 1, 2, ..., 19]
l = range(20)
#-- Here, the range function builds ANOTHER list, in this case also [0, 1, 2, ..., 19]
#-- the variable "i" will be bound to each element of this list, so i = 0 (loop), then i = 1 (loop), i = 2, etc.
for i in range(0,len(l)):
    if i == something:
        #-- So, when i is equivalent to something, you "pop" the list, l.
        #-- the length of l is now *19* elements, NOT 20 (you just removed one)
        l.pop(i)
        #-- So...when the list has been shortened to 19 elements...
        #-- we're still iterating, i = 17 (loop), i = 18 (loop), i = 19 *CRASH*
        #-- There is no element at index 19 of l, as l (after you popped out an element) only
        #-- has indices 0, ..., 18, now.
NOTE also that you're making the "pop" decision based on the index of the list, not what's in the indexed cell of the list. This is unusual -- was that your intention? Or did you mean something more like...
if l[i] == something:
    l.pop(i)
Now, in your specific example, (l[i] == i) but this is not a typical pattern.
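The failure described above can be reproduced in a few lines. This sketch assumes Python 3 (hence list(range(20))) and an arbitrary value for something; it catches the exception only to record where it happens.

```python
something = 5            # hypothetical value to delete
l = list(range(20))      # list() is needed in Python 3, where range() is not a list

error_index = None
try:
    for i in range(len(l)):      # range(20) is fixed before the loop starts
        if l[i] == something:
            l.pop(i)             # the list shrinks to 19 items
except IndexError:
    error_index = i              # the first index that no longer exists

print(error_index, len(l))
```

Note a second, quieter problem: after the pop, the element that slid into position 5 is never examined, because the loop moves straight on to index 6.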
Rather than iterating over the list, try the filter function. It's a built-in (like a lot of other list processing functions: e.g. map, sorted, reversed, zip, etc.)
Try this...
#-- Create a function for testing the elements of the list.
def f(x):
    if (x == SOMETHING):
        return False
    else:
        return True
#-- Create the original list.
l = range(20)
#-- Apply the function f to each element of l.
#-- Where f(l[i]) is True, the element l[i] is kept and will be in the new list, m.
#-- Where f(l[i]) is False, the element l[i] is passed over and will NOT appear in m.
m = filter(f, l)
List processing functions go hand-in-hand with "lambda" functions - which, in Python, are brief, anonymous functions. so, we can re-write the above code as...
#-- Create the original list.
l = range(20)
#-- Apply the lambda to each element of l.
#-- Where the lambda returns True, the element l[i] is kept and will be in the new list, m.
#-- Where the lambda returns False, the element l[i] is passed over and will NOT appear in m.
m = filter(lambda x: (x != SOMETHING), l)
Give it a go and see how it works!
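One caveat when trying this on Python 3: filter() returns a lazy iterator rather than a list, so it has to be wrapped in list() to get the "new list, m" described above. A minimal sketch, with an arbitrary value assumed for SOMETHING:

```python
SOMETHING = 7                # hypothetical value to drop
l = list(range(20))

m = filter(lambda x: x != SOMETHING, l)   # a lazy iterator in Python 3
m_list = list(m)                          # materialize it once
print(m_list)

# The iterator is now exhausted; a second pass yields nothing.
leftover = list(m)
print(leftover)
```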

append/extend list in loop

I would like to extend a list while looping over it:
for idx in xrange(len(a_list)):
    item = a_list[idx]
    a_list.extend(fun(item))
(fun is a function that returns a list.)
Question:
Is this already the best way to do it, or is something nicer and more compact possible?
Remarks:
from matplotlib.cbook import flatten
a_list.extend(flatten(fun(item) for item in a_list))
should work but I do not want my code to depend on matplotlib.
for item in a_list:
    a_list.extend(fun(item))
would be nice enough for my taste but seems to cause an infinite loop.
Context:
I have a large number of nodes (in a dict) and some of them are special because they are on the boundary.
'a_list' contains the keys of these special/boundary nodes. Sometimes nodes are added and then every new node that is on the boundary needs to be added to 'a_list'. The new boundary nodes can be determined by the old boundary nodes (expressed here by 'fun') and every boundary node can add several new nodes.

Have you tried list comprehensions? This would work by creating a separate list in memory, then adding it to your original list once the comprehension is complete. Basically it's the same as your second example, but instead of importing a flattening function, it flattens through stacked list comprehensions. [edit Matthias: changed + to +=]
a_list += [x for lst in [fun(item) for item in a_list] for x in lst]
EDIT: To explain what's going on.
So the first thing that will happen is this part in the middle of the above code:
[fun(item) for item in a_list]
This will apply fun to every item in a_list and add it to a new list. Problem is, because fun(item) returns a list, now we have a list of lists. So we run a second (stacked) list comprehension to loop through all the lists in our new list that we just created in the original comprehension:
for lst in [fun(item) for item in a_list]
This will allow us to loop through all the lists in order. So then:
[x for lst in [fun(item) for item in a_list] for x in lst]
This means take every x (that is, every item) in every lst (all the lists we created in our original comprehension) and add it to a new list.
Hope this is clearer. If not, I'm always willing to elaborate further.

Using itertools, it can be written as:
import itertools
a_list += itertools.chain(*map(fun, a_list))  # itertools.imap(fun, a_list) in Python 2.x
or, if you're aiming for code golf:
a_list += sum(map(fun, a_list), [])
Alternatively, just write it out:
new_elements = list(map(fun, a_list))  # list() so a lazy map never sees the items appended below
for ne in new_elements:
    a_list.extend(ne)

As you want to extend the list, but loop only over the original list, you can loop over a copy instead of the original:
for item in a_list[:]:
    a_list.extend(fun(item))
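A runnable sketch of the copy-based loop. The function fun here is a made-up stand-in that derives two new values from each item; the real one from the question is not shown in the post.

```python
def fun(item):
    # Hypothetical stand-in: each boundary node yields two derived nodes.
    return [item * 10, item * 10 + 1]


a_list = [1, 2]

# a_list[:] is a snapshot taken once, so items appended during the loop
# are not visited and the loop terminates.
for item in a_list[:]:
    a_list.extend(fun(item))

print(a_list)
```

Iterating over a_list itself would keep visiting the newly appended items, which is the infinite loop the question describes.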

Using generator
original_list = [1, 2]
original_list.extend((x for x in original_list[:]))
# [1, 2, 1, 2]
