Disclaimer: beginner, self-teaching Python user.
A pretty cool feature of ndarrays is their ability to accept a tuple of integers as indices (e.g. myNDArray[(1,2)] == myNDArray[1][2]). This allows me to leave the indices unspecified as a variable (e.g. indicesTuple ) until a script determines what part of an ndarray to work with, in which case the variable is specified as a tuple of integers and used to access part of an ndarray (e.g. myNDArray[indicesTuple]). The utility in using a variable is that the LENGTH of the tuple can be varied depending on the dimensions of the ndarray.
However, this limits me to working with arrays of numerical values. I tried using lists, but they can't take a tuple as indices (e.g. myList[(1,2)] gives an error). Is there a way to "unwrap" a tuple for list indices as one could for function arguments? Or something far easier or more efficient?
UPDATE: Holy shite I forgot this existed. Basically I eventually learned that you can initialize the ndarray with the argument dtype=object, which allows the ndarray to contain multiple types of Python objects, much like a list. As for accessing a list, as a commenter pointed out, I could use a for-loop to iterate through the variable indicesTuple to access increasingly nested elements of the list. For in-place editing, see the accepted comment, really went the extra mile there.
I'm interpreting your question as:
I have an N-dimensional list, and a tuple containing N values (T1, T2... TN). How can I use the tuple values to access the list? I don't know what N will be ahead of time.
I don't know of a built-in way to do this, but you can write a method that iteratively digs into the list until you reach the innermost value.
def get(seq, indices):
    for index in indices:
        seq = seq[index]
    return seq

seq = [
    [
        ["a","b"],
        ["c","d"]
    ],
    [
        ["e","f"],
        ["g","h"]
    ]
]
indices = [0,1,0]
print(get(seq, indices))
Result:
c
You could also do this in one* line with reduce, although it won't be very clear to the reader what you're trying to accomplish.
print(reduce(lambda s, idx: s[idx], indices, seq))
(*if you're using 3.X, you'll need to import reduce from functools. So, two lines.)
If you want to set values in the N-dimensional list, use get to access the second-deepest level of the list, and assign to that.
def set(seq, indices, value):
    innermost_list = get(seq, indices[:-1])
    innermost_list[indices[-1]] = value
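For Python 3, the getter and setter above could be sketched roughly as follows. The names get_nested and set_nested are made up for this example (they also avoid shadowing the built-in set), and the setter assumes indices holds at least one index.

```python
from functools import reduce
from operator import getitem


def get_nested(seq, indices):
    """Follow each index in turn and return the element it leads to."""
    return reduce(getitem, indices, seq)


def set_nested(seq, indices, value):
    """Assign value at the position given by indices, modifying seq in place."""
    # Split off the last index; everything before it is the path to the
    # innermost container. Assumes indices is not empty.
    *path, last = indices
    get_nested(seq, path)[last] = value


nested = [[["a", "b"], ["c", "d"]], [["e", "f"], ["g", "h"]]]
set_nested(nested, (0, 1, 0), "C")
print(get_nested(nested, (0, 1, 0)))
```

Because indices can be any length, the same two functions work for lists nested to any depth.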
Say you have a list of (i,j) indexes
indexList = [(1,1), (0,1), (1,2)]
And some 2D list you want to index from
l = [[1,2,3],
[4,5,6],
[7,8,9]]
You could get those elements using a list comprehension as follows
>>> [l[i][j] for i,j in indexList]
[5, 2, 6]
Then your indexes can be whatever you want them to be. They will be unpacked in the list comprehension, and used as list indices. For your specific application, we'd have to see where your index variables were coming from, but that's the general idea.
Python doesn't have multidimensional lists, so myList[(1,2)] could only conceivably be considered a shortcut for (myList[1], myList[2]) (which would be pretty convenient sometimes, although you can use import operator; x = operator.itemgetter(1,2)(myList) to accomplish the same).
If your myList looks something like
myList = [ ["foo", "bar", "baz"], ["a", "b", "c"] ]
then myList[(1,2)] won't work (or make sense) because myList is not a two-dimensional list: it's a list that contains references to lists. You use myList[1][2] because the first index myList[1] returns the references to ["a", "b", "c"], to which you apply the second index [2] to get "c".
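A small sketch of the difference, for illustration only (the variable names are arbitrary): chained indexing digs into the inner list, itemgetter picks several top-level items at once, and a tuple index on a plain list raises TypeError.

```python
from operator import itemgetter

my_list = [["foo", "bar", "baz"], ["a", "b", "c"]]

# Chained indexing: first pick the inner list, then the element inside it.
inner = my_list[1]
element = inner[2]

# itemgetter with several arguments returns a tuple of top-level items.
pick_two = itemgetter(0, 1)
both_rows = pick_two(my_list)

# A plain list rejects a tuple used as an index.
try:
    my_list[(1, 2)]
    tuple_index_error = None
except TypeError as exc:
    tuple_index_error = exc
```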
Slightly related, you could use a dictionary to simulate a sparse array precisely by using tuples as keys to a default dict.
import collections
d = collections.defaultdict(str)
d[(1,2)] = "foo"
d[(4,5)] = "bar"
Any other tuple you try to use as a key would return the empty string. It's not a perfect simulation, as you can't access full rows or columns of the array without using something like
row1 = [d[1, x] for x in range(C)] # where C is the number of columns
col3 = [d[x, 3] for x in range(R)] # where R is the number of rows
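One thing to watch for with this approach: reading a missing key with d[key] on a defaultdict inserts that key, so scanning a whole row fills the dictionary with empty strings. A rough sketch that reads with .get() instead, using made-up dimensions ROWS and COLS:

```python
from collections import defaultdict

ROWS, COLS = 3, 4  # hypothetical dimensions, chosen only for this example

sparse = defaultdict(str)
sparse[1, 2] = "foo"
sparse[2, 0] = "bar"

# Use .get() when reading so that missing cells are not silently inserted.
row1 = [sparse.get((1, c), "") for c in range(COLS)]
col0 = [sparse.get((r, 0), "") for r in range(ROWS)]

# Only the two cells that were explicitly assigned are stored.
stored_cells = len(sparse)
```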
Use dictionaries indexed by tuple
>>> width, height = 7, 6
>>> grid = dict(
((x,y),"x={} y={}".format(x,y))
for x in range(width)
for y in range(height))
>>> print grid[3,1]
x=3 y=1
Use lists of lists
>>> width, height = 7, 6
>>> grid = [
["x={} y={}".format(x,y) for x in range(width)]
for y in range(height)]
>>> print grid[1][3]
x=3 y=1
In this case, you could make a getter and setter function:
def get_grid(grid, index):
    x, y = index
    return grid[y][x]

def set_grid(grid, index, value):
    x, y = index
    grid[y][x] = value
You could go a step further and create your own class that contains a list of lists and defines an indexer that takes tuples as indexes and does this same process. It can do slightly more sensible bounds-checking and give better diagnostics than the dictionary, but it takes a bit of setup. I think the dictionary approach is fine for quick exploration.
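A minimal sketch of what such a class might look like, assuming (x, y) index order as in the getter and setter above. The class name Grid and its fill parameter are invented for this example.

```python
class Grid:
    """List-of-lists wrapper indexed by an (x, y) tuple."""

    def __init__(self, width, height, fill=None):
        self.width = width
        self.height = height
        # One inner list per row; each row is a separate list object.
        self._rows = [[fill] * width for _ in range(height)]

    def _check(self, index):
        # Unpack the tuple and reject anything outside the grid,
        # including negative indices.
        x, y = index
        if not (0 <= x < self.width and 0 <= y < self.height):
            raise IndexError(f"{index!r} is outside a {self.width}x{self.height} grid")
        return x, y

    def __getitem__(self, index):
        x, y = self._check(index)
        return self._rows[y][x]

    def __setitem__(self, index, value):
        x, y = self._check(index)
        self._rows[y][x] = value


grid = Grid(7, 6)
grid[3, 1] = "x=3 y=1"
```

With this, grid[3, 1] reads and writes like the dictionary version, while an out-of-range index gives an error message that names the grid size.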
I want to know if you have a list of strings such as:
l = ['ACGAAAG', 'CAGAAGC', 'ACCTGTT']
How do you convert it to:
O = 'ACGAAAG'
P = 'CAGAAGC'
Q = 'ACCTGTT'
Can you do this without knowing the number of items in a list? You have to store them as variables.
(The variables don't matter.)
Welcome to SE!
Structure Known
If you know how many items the list contains, then you might simply unpack it:
O, P, Q = my_list
Structure Unknown
Unpack your list using a for loop. Do your work on each string inside the loop. For the below, I am simply printing each one:
for element in l:
    print(element)
Good luck!
If you don't know the number of items beforehand, a list is the right structure to keep the items in.
You can, though, cut off the first few known items, and leave the unknown tail as a list:
a, b, *rest = ["ay", "bee", "see", "what", "remains"]
print("%r, %r, rest is %r" % (a, b, rest))
a,b,c = my_list
this will work as long as the number of elements in the list is equal to the number of variables you want to unpack into; it actually works with any iterable: tuple, list, set, etc
if the list is longer, you can always access the first 3 elements, if that is what you want
a = my_list[0]
b = my_list[1]
c = my_list[2]
or in one line
a, b, c = my_list[0], my_list[1], my_list[2]
even better, with the slice notation you can get a sublist with the first 3 elements
a, b, c = my_list[:3]
those will work as long as the list has at least 3 elements, or as many as the number of variables you want
you can also use the extended unpacking notation
a, b, c, *the_rest = my_list
the_rest will be a list with everything else in the list other than the first 3 elements, and again the list needs to be of size 3 or more
And that pretty much covers all the ways to extract a certain number of items
Now, depending on what you are going to do with those, you may be better off with a regular loop
for item in my_list:
    # do something with the current item, like printing it
    print(item)
in each iteration, item takes the value of one element of the list, so you can work on one item at a time
if what you want is to take 3 items at a time in each iteration, there are several ways to do it
like for example
for i in range(3, len(my_list) + 1, 3):
    a, b, c = my_list[i-3:i]
    print(a, b, c)
(leftover items at the end are skipped if the length is not a multiple of 3)
there are more fun constructs like
it = [iter(my_list)]*3
for a, b, c in zip(*it):
    print(a, b, c)
and other with the itertools module.
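As one possible itertools version, here is a sketch that uses islice to take items in groups and, unlike the zip trick, keeps a shorter final group. The helper name chunks is made up for this example.

```python
from itertools import islice


def chunks(iterable, size):
    """Yield successive lists of up to `size` items; the last may be shorter."""
    iterator = iter(iterable)
    while True:
        # islice pulls at most `size` items from the shared iterator.
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


groups = list(chunks(["a", "b", "c", "d", "e", "f", "g"], 3))
```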
But now you said something interesting: "so that every term is assigned to a variable". that is the wrong approach: you don't want an unknown number of variables running around, that gets messy very fast. you work with the list instead; if you want to do some work with each element in it, there are plenty of ways of doing it, like a list comprehension
my_new_list = [ some_fun(x) for x in my_list ]
or in the old way
my_new_list = []
for x in my_list:
    my_new_list.append( some_fun(x) )
or if you need to work with more than 1 item at a time, combine that with some of the above
I do not know if your use case requires the strings to be stored in different variables. It usually is a bad idea.
But if you do need it, then you can use exec builtin which takes the string representation of a python statement and executes it.
list_of_strings = ['ACGAAAG', 'CAGAAGC', 'ACCTGTT']
# Dynamically generate variable names equivalent to the column names in an Excel sheet (A, B, C, ..., Z, AA, AB, ..., AAA, ...)
variable_names = ['A', 'B', 'C']  # in this specific case
for vn, st in zip(variable_names, list_of_strings):
    exec('{} = "{}"'.format(vn, st))
Test it out, print(A,B,C) will output the three strings and you can use A,B and C as variables in the rest of the program
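If named access is what matters, a dictionary keyed by those names avoids exec altogether. The following is only a sketch: the generator column_names is a made-up helper that produces spreadsheet-style labels.

```python
from itertools import count, product
from string import ascii_uppercase


def column_names():
    """Yield A, B, ..., Z, AA, AB, ... like spreadsheet column labels."""
    for length in count(1):
        for letters in product(ascii_uppercase, repeat=length):
            yield "".join(letters)


list_of_strings = ["ACGAAAG", "CAGAAGC", "ACCTGTT"]

# zip stops when the shorter input (the list of strings) runs out.
named = dict(zip(column_names(), list_of_strings))
```

Then named["A"] plays the role of the variable A, and the number of strings does not need to be known in advance.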
I've tried using Counter and itertools, but since a list is unhashable, they don't work.
My data looks like this: [ [1,2,3], [2,3,4], [1,2,3] ]
I would like to know that the list [1,2,3] appears twice, but I can't figure out how to do this. I was thinking of just converting each list to a tuple, then hashing with that. Is there a better way?
>>> from collections import Counter
>>> li=[ [1,2,3], [2,3,4], [1,2,3] ]
>>> Counter(str(e) for e in li)
Counter({'[1, 2, 3]': 2, '[2, 3, 4]': 1})
The method that you state also works, as long as there are no nested mutables in each sublist (such as [ [1,2,3], [2,3,4,[11,12]], [1,2,3] ]):
>>> Counter(tuple(e) for e in li)
Counter({(1, 2, 3): 2, (2, 3, 4): 1})
If you do have other unhashable types nested in the sublists, use the str or repr method, since that deals with all nested sublists as well. Or recursively convert all to tuples (more work).
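The recursive conversion could look roughly like this. It is a sketch that assumes the only unhashable things nested inside are lists (not dicts or sets), and freeze is a name invented for this example.

```python
from collections import Counter


def freeze(item):
    """Recursively turn lists into tuples so the result is hashable."""
    if isinstance(item, list):
        return tuple(freeze(sub) for sub in item)
    return item


data = [[1, 2, 3], [2, 3, 4, [11, 12]], [1, 2, 3]]
counts = Counter(freeze(entry) for entry in data)
```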
ll = [ [1,2,3], [2,3,4], [1,2,3] ]
print(len(set(map(tuple, ll))))
Also, if you wanted to count the occurrences of a unique* list:
print(ll.count([1,2,3]))
(*value unique, not reference unique)
I think, using the Counter class on tuples like
Counter(tuple(item) for item in li)
Will be optimal in terms of elegance and "pythoniticity": It's probably the shortest solution, it's perfectly clear what you want to achieve and how it's done, and it uses resp. combines standard methods (and thus avoids reinventing the wheel).
The only performance drawback I can see is, that every element has to be converted to a tuple (in order to be hashable), which more or less means that all elements of all sublists have to be copied once. Also the internal hash function on tuples may be suboptimal if you know that list elements will e.g. always be integers.
In order to improve on performance, you would have to
Implement some kind of hash algorithm working directly on lists (more or less reimplementing the hashing of tuples but for lists)
Somehow reimplement the Counter class in order to use this hash algorithm and provide some suitable output (this class would probably use a dictionary using the hash values as key and a combination of the "original" list and the count as value)
At least the first step would need to be done in C/C++ in order to match the speed of the internal hash function. If you know the type of the list elements you could probably even improve the performance.
As for the Counter class I do not know if its standard implementation is in Python or in C, if the latter is the case you'll probably also have to reimplement it in C in order to achieve the same (or better) performance.
So the question "Is there a better solution" cannot be answered (as always) without knowing your specific requirements.
list = [ [1,2,3], [2,3,4], [1,2,3] ]
repeats = []
unique = 0
for i in list:
    count = 0
    if i not in repeats:
        for i2 in list:
            if i == i2:
                count += 1
    if count > 1:
        repeats.append(i)
    elif count == 1:
        unique += 1
print "Repeated Items"
for r in repeats:
    print r,
print "\nUnique items:", unique
This loops through the list to find repeated sequences, skipping items that have already been detected as repeats, adds them to the repeats list, and counts the number of unique lists.
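For comparison, the same two results (the repeated lists and the number of lists that occur exactly once) can be sketched in Python 3 with Counter, assuming the sublists contain only hashable items:

```python
from collections import Counter

data = [[1, 2, 3], [2, 3, 4], [1, 2, 3]]

# Tuples are hashable, so they can serve as Counter keys.
counts = Counter(map(tuple, data))

repeats = [list(key) for key, n in counts.items() if n > 1]
unique = sum(1 for n in counts.values() if n == 1)
```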
So I have a list of tuples. Each tuple in the list will be the same length, but tuple size will vary based on list. For example, one list could contain tuples of length 4, another could contain tuples of length 5. I want to unpack each individual value of a tuple, and use each value to multiply it by an element in another list. For example(with a list of tuples of length 3):
somelist = [a,b,c]
tuplelist = [(2,3,5),(5,7,5),(9,2,4)]
listMult = []
for x,y,z in tuplelist:
    listMult.append([somelist[0]*x,somelist[1]*y,somelist[2]*z])
The problem with this is that it won't scale if I'm using another list with tuples of a different size.
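If you don't know how many elements each tuple has, unpacking would be a bad idea. In your example, you would instead do the following:
listMult = [[x*y for x, y in zip(tup, somelist)] for tup in tuplelist]
(Use sum(x*y for x, y in zip(tup, somelist)) as the inner expression instead if you want a single total per tuple.)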
In general, you'd try to use iteration, starargs, and other things that operate on an iterable directly instead of unpacking.
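To illustrate the starargs idea, here is a sketch of a function that accepts any number of positional values and pairs them with a list of weights. The function name weighted and the placeholder numbers are made up for this example.

```python
def weighted(weights, *values):
    """Multiply each positional value by the weight in the same position."""
    if len(values) != len(weights):
        raise ValueError("expected one value per weight")
    return [w * v for w, v in zip(weights, values)]


weights = [10, 20, 30]  # placeholder numbers standing in for a, b, c
tuple_list = [(2, 3, 5), (5, 7, 5)]

# The * in the call spreads each tuple into separate arguments,
# whatever its length happens to be.
rows = [weighted(weights, *tup) for tup in tuple_list]
```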
As presented, the question is incompletely specified. But there is an interesting and useful variant of the question, "How do I unpack a fixed number of elements from tuples of an unknown length?".
The answer to that might be useful to you:
tuple_list = [(2,3), (5,7,5), (9,2,4,2)]
pad_tuple = (0, 0, 0)
for t in tuple_list:
    t += pad_tuple # make sure the tuple is sufficiently long
    x, y, z = t[:3] # only extract the first three elements
    print(x,y,z)
My question seems simple, but for a novice to python like myself this is starting to get too complex for me to get, so here's the situation:
I need to take a list such as:
L = [(a, b, c), (d, e, d), (etc, etc, etc), (etc, etc, etc)]
and make each index an individual list so that I may pull elements from each index specifically. The problem is that the list I am actually working with contains hundreds of indices such as the ones above and I cannot make something like:
L_new = list(L['insert specific index here'])
for each one as that would mean filling up the memory with hundreds of lists corresponding to individual indices of the first list and would be far too time and memory consuming from my point of view. So my question is this, how can I separate those indices and then pull individual parts from them without needing to create hundreds of individual lists (at least to the point where I wont need hundreds of individual lines to create them).
I might be misreading your question, but I'm inclined to say that you don't actually have to do anything to be able to index your tuples. See my comment, but: L[0][0] will give "a", L[0][1] will give "b", L[2][1] will give "etc" etc...
If you really want a clean way to turn this into a list of lists you could use a list comprehension:
cast = [list(entry) for entry in L]
In response to your comment: if you want to access across dimensions I would suggest list comprehension. For your comment specifically:
crosscut = [entry[0] for entry in L]
In response to comment 2: This is largely a part of a really useful operation called slicing. Specifically to do the referenced operation you would do this:
multiple_index = [entry[0:3] for entry in L]
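If you need every "column" rather than just one, zip(*L) regroups the tuples by position in a single step. A small sketch with made-up data:

```python
L = [("a", "b", "c"), ("d", "e", "f"), ("g", "h", "i")]

# zip(*L) pairs up the first items, then the second items, and so on.
columns = list(zip(*L))
first_column = columns[0]
```

Here first_column holds the same items as the [entry[0] for entry in L] comprehension, just as a tuple.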
Depending on your readability preferences there are actually a number of possibilities here:
list_of_lists = []
for sublist in L:
    list_of_lists.append(list(sublist))
iterator = iter(L)
for _ in range(len(L)):
    sublist = list(next(iterator))
    # Or, inside a generator function, yield list(next(iterator)) if you want lazy evaluation
What you have there is a list of tuples, access them like a list of lists
L[3][2]
will get the third element from the fourth tuple in your list L (indices start at 0)
Two ways of using inner lists:
for index, sublist in enumerate(L):
    # do something with sublist
    pass
or with an iterator
iterator = iter(L)
sublist = next(iterator) # <-- yields the first sublist
in both cases, sublist elements can be reached via
direct index
sublist[2]
iteration
iterator = iter(sublist)
next(iterator) # <-- yields first elem of sublist
for elem in sublist:
    # do something with my elem
    pass
This is an incredibly simple question (I'm new to Python).
I basically want a data structure like a PHP array -- i.e., I want to initialise it and then just add values into it.
As far as I can tell, this is not possible with Python, so I've got the maximum value I might want to use as an index, but I can't figure out how to create an empty list of a specified length.
Also, is a list the right data structure to use to model what feels like it should just be an array? I tried to use an array, but it seemed unhappy with storing strings.
Edit: Sorry, I didn't explain very clearly what I was looking for. When I add items into the list, I do not want to put them in in sequence, but rather I want to insert them into specified slots in the list.
I.e., I want to be able to do this:
list = []
for row in rows:
    c = list_of_categories.index(row["id"])
    print c
    list[c] = row["name"]
Depending on how you are going to use the list, it may be that you actually want a dictionary. This will work:
d = {}
for row in rows:
    c = list_of_categories.index(row["id"])
    print c
    d[c] = row["name"]
... or more compactly:
d = dict((list_of_categories.index(row['id']), row['name']) for row in rows)
print d
PHP arrays are much more like Python dicts than they are like Python lists. For example, they can have strings for keys.
And confusingly, Python has an array module, which is described as "efficient arrays of numeric values", which is definitely not what you want.
If the number of items you want is known in advance, and you want to access them using integer, 0-based, consecutive indices, you might try this:
n = 3
array = n * [None]
print array
array[2] = 11
array[1] = 47
array[0] = 42
print array
This prints:
[None, None, None]
[42, 47, 11]
Use the list constructor, and append your items, like this:
l = list ()
l.append ("foo")
l.append (3)
print (l)
gives me ['foo', 3], which should be what you want. See the documentation on list and the sequence type documentation.
EDIT Updated
For inserting, use insert, like this:
l = list ()
l.append ("foo")
l.append (3)
l.insert (1, "new")
print (l)
which prints ['foo', 'new', 3]
http://diveintopython3.ep.io/native-datatypes.html#lists
You don't need to create empty lists with a specified length. You just add to them and query about their current length if needed.
What you can't do, without being prepared to catch an exception, is use a nonexistent index, which is probably what you are used to doing in PHP.
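To make that concrete, here is a small sketch: assigning to an index that does not exist yet raises IndexError, while a list pre-filled with None up to the largest expected index accepts the assignment. The value of max_index is an arbitrary choice for this example.

```python
items = []

try:
    items[3] = "name"  # the empty list has no slot 3 yet
except IndexError as exc:
    error = exc

# Pre-sizing the list makes every slot up to the maximum index assignable.
max_index = 5  # arbitrary value for illustration
slots = [None] * (max_index + 1)
slots[3] = "name"
```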
You can use this syntax to create a list with n elements:
lst = [0] * n
But be careful! The list will contain n references to the same object. If this object is mutable and you modify it through one element, the change will show up in every element! In this case you should use:
lst = [some_object() for i in xrange(n)]
Then you can access these elements:
for i in xrange(n):
    lst[i] += 1
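The shared-object pitfall is easiest to see with a list of lists. A short Python 3 sketch (so range instead of xrange):

```python
n = 3

# n references to one and the same inner list
shared = [[]] * n
shared[0].append("x")

# n distinct inner lists, one created per iteration
separate = [[] for _ in range(n)]
separate[0].append("x")
```

After this, every element of shared shows the appended "x", while only the first element of separate does.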
A Python list is comparable to a vector in other languages. It is a resizable array, not a linked list.
Sounds like what you need might be a dictionary rather than an array if you want to insert into specified indices.
dict = {'a': 1, 'b': 2, 'c': 3}
dict['a']
1
I agree with ned that you probably need a dictionary for what you're trying to do. But if you want a list of the category indices, you can do this:
lst = [list_of_categories.index(row["id"]) for row in rows]
Use a dictionary, because what you're really asking for is a structure you can access by arbitrary keys
list = {}
for row in rows:
    c = list_of_categories.index(row["id"])
    print c
    list[c] = row["name"]
Then you can iterate through the known contents with:
for x in list.values():
    print x
Or check if something exists in the "list":
if 3 in list:
    print "it's there"
I'm not sure I understood what you want to do, but it seems that you want a list which is dictionary-like, where the index is the key. Even though I think a dictionary would be a better choice, here's my answer: Got a problem - make an object:
import UserList  # Python 2 module; on Python 3 use collections.UserList instead

class MyList(UserList.UserList):
    NO_ITEM = 'noitem'

    def insertAt(self, item, index):
        length = len(self)
        if index < length:
            self[index] = item
        elif index == length:
            self.append(item)
        else:
            for i in range(0, index-length):
                self.append(self.NO_ITEM)
            self.append(item)
There may be some errors in the Python syntax (I didn't check), but in principle it should work. Of course the else case would also cover the elif case, but I thought it might be a little harder to read that way.
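A rough Python 3 take on the same idea, using collections.UserList. This is a sketch rather than a drop-in replacement: the class name SparseList and the method name insert_at are invented here, and None is used as the placeholder instead of the string 'noitem'.

```python
from collections import UserList


class SparseList(UserList):
    """List that pads itself with a placeholder when assigned past its end."""

    NO_ITEM = None  # placeholder; the class above used the string 'noitem'

    def insert_at(self, item, index):
        # How many placeholder slots are missing before `index`.
        shortfall = index - len(self)
        if shortfall >= 0:
            # Pad up to the target position, then append the item there.
            self.extend([self.NO_ITEM] * shortfall)
            self.append(item)
        else:
            # The slot already exists, so overwrite it.
            self[index] = item


names = SparseList()
names.insert_at("c", 2)
names.insert_at("a", 0)
```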