Is this correct use of flatten? - python

I am attempting to flatten a list using:
wd = ['this' , 'is']
np.asarray(list(map(lambda x : list(x) , wd))).flatten()
which returns:
array([['t', 'h', 'i', 's'], ['i', 's']], dtype=object)
when I'm expecting a char array: ['t','h','i','s','i','s']
Is this correct use of flatten?

No, this isn't a correct use for numpy.ndarray.flatten.
Two-dimensional NumPy arrays have to be rectangular or they will be cast to object arrays (or it will throw an exception). With object arrays flatten won't work correctly (because it won't flatten the "objects") and rectangular is impossible because your words have different lengths.
When dealing with strings (or arrays of strings), NumPy won't flatten them at all, neither when you create the array nor when you try to "flatten" it:
>>> import numpy as np
>>> np.array(['fla', 'tten'])
array(['fla', 'tten'], dtype='<U4')
>>> np.array(['fla', 'tten']).flatten()
array(['fla', 'tten'], dtype='<U4')
Fortunately you can simply use "normal" Python features to flatten iterables, just to mention one example:
>>> wd = ['this' , 'is']
>>> [element for sequence in wd for element in sequence]
['t', 'h', 'i', 's', 'i', 's']
You might want to have a look at the following Q+A for more solutions and explanations:
Making a flat list out of list of lists in Python
Flatten (an irregular) list of lists
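To illustrate why flatten leaves the inner lists alone, here is a minimal sketch. It assumes the object array is built by hand (as far as I know, recent NumPy versions raise an error rather than silently creating an object array from ragged lists); the names ragged, flat and chars are only illustrative:

```python
import numpy as np

wd = ['this', 'is']

# Assumption: fill the object array element by element, because recent
# NumPy versions refuse to infer a shape from ragged nested lists.
ragged = np.empty(len(wd), dtype=object)
for k, word in enumerate(wd):
    ragged[k] = list(word)

# flatten() only collapses array dimensions; the two Python lists stay intact.
flat = ragged.flatten()
print(flat.shape)

# Joining the words first gives a real one-dimensional character array.
chars = np.array(list(''.join(wd)))
print(chars)
```

The flattened object array still has two elements, while the joined version has one element per character.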

With just a list iteration:
[u for i in np.asarray(list(map(lambda x : list(x) , wd))) for u in i]
gives you this:
['t', 'h', 'i', 's', 'i', 's']
Although, as the comments say, you can just use ''.join() for your specific example, this has the advantage of working for numpy arrays and lists of lists:
test = np.array(range(10)).reshape(2,-1)
[u for i in test for u in i]
returns a flat list:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In[8]: from itertools import chain
In[9]: list(chain.from_iterable(['this' , 'is']))
Out[9]: ['t', 'h', 'i', 's', 'i', 's']

Related

How to iterate over position through Numpy- Python

I am wondering if there is a way to iterate over individual positions in a sequence list using NumPy. For example, if I had a list of sequences:
a = ['AGHT','THIS','OWKF']
The function would be able to go through each individual characters in their position. So for the first sequence 'AGHT', it would be broken down into 'A','G','H','T'. The ultimate goal is to create individual grids based on character abundance in each one of these sequences. So far I have only been able to make a loop that goes through each character, but I need this in NumPy:
b = np.array(a)
for c in b:
    for d in c:
        print(d)
I would prefer this in NumPy, but if there are other ways I would like to know as well. Thanks!
list expands a string into a list:
In [406]: a = ['AGHT','THIS','OWKF']
In [407]: [list(item) for item in a]
Out[407]: [['A', 'G', 'H', 'T'], ['T', 'H', 'I', 'S'], ['O', 'W', 'K', 'F']]
You can use join() to join the array into a sequence of characters, then iterate over each character or print it like this:
>>> a = ['AGHT','THIS','OWKF']
>>> print(''.join(a))
AGHTTHISOWKF
Or to turn it into an array of individual characters:
>>> out = ''.join(a)
>>> b = np.array(list(out))
>>> b
array(['A', 'G', 'H', 'T', 'T', 'H', 'I', 'S', 'O', 'W', 'K', 'F'],
      dtype='<U1')
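For the stated goal of counting characters per position, one possible sketch is to build a 2-D character array and look at it column by column. This assumes all sequences have the same length; grid and position_counts are illustrative names, not something from the question:

```python
import numpy as np

a = ['AGHT', 'THIS', 'OWKF']

# One row per sequence, one column per position (assumes equal lengths).
grid = np.array([list(seq) for seq in a])

# For each position, count how often each character occurs in that column.
position_counts = []
for j in range(grid.shape[1]):
    letters, counts = np.unique(grid[:, j], return_counts=True)
    position_counts.append(dict(zip(letters.tolist(), counts.tolist())))

print(grid.shape)
print(position_counts[0])
```

Here grid[:, j] is the column of characters found at position j across all sequences.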

How can I split a list in two unique lists in Python?

Hi, I have a list as follows:
listt = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o']
15 members.
I want to turn it into 3 lists. I used this code and it worked, but I want unique lists; this gives me 3 lists that share members.
import random
listt = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o']
print(random.sample(listt,5))
print(random.sample(listt,5))
print(random.sample(listt,5))
Try this:
from random import shuffle
def randomise():
    listt = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o']
    shuffle(listt)
    return listt[:5], listt[5:10], listt[10:]
print(randomise())
This will print (for example, since it is random):
(['i', 'k', 'c', 'b', 'a'], ['d', 'j', 'h', 'n', 'f'], ['e', 'l', 'o', 'g', 'm'])
If it doesn't matter to you which items go in each list, then you're better off partitioning the list into thirds:
In [23]: L = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o']
In [24]: size = len(L)
In [25]: L[:size//3]
Out[25]: ['a', 'b', 'c', 'd', 'e']
In [26]: L[size//3:2*size//3]
Out[26]: ['f', 'g', 'h', 'i', 'j']
In [27]: L[2*size//3:]
Out[27]: ['k', 'l', 'm', 'n', 'o']
If you want them to have random elements from the original list, you'll just need to shuffle the input first:
random.shuffle(L)
Instead of sampling your list three times, which will always give you three independent results where individual members may be selected for more than a single list, you could just shuffle the list once and then split it in three parts. That way, you get three random subsets that will not share any items:
>>> random.shuffle(listt)
>>> listt[0:5]
['b', 'a', 'f', 'e', 'h']
>>> listt[5:10]
['c', 'm', 'g', 'j', 'o']
>>> listt[10:15]
['d', 'l', 'i', 'n', 'k']
Note that random.shuffle will shuffle the list in place, so the original list is modified. If you don’t want to modify the original list, you should make a copy first.
If your list is larger than the desired result set, then of course you can also sample your list once with the combined result size and then split the result accordingly:
>>> sample = random.sample(listt, 5 * 3)
>>> sample[0:5]
['h', 'm', 'i', 'k', 'd']
>>> sample[5:10]
['a', 'b', 'o', 'j', 'n']
>>> sample[10:15]
['c', 'l', 'f', 'e', 'g']
This solution will also avoid modifying the original list, so you will not need a copy if you want to keep it as it is.
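Putting the two ideas together, a small helper could look like the sketch below. The function name split_random and its parts parameter are hypothetical, and it assumes the list length is divisible by the number of parts:

```python
import random

def split_random(items, parts):
    """Return `parts` disjoint random sublists; `items` is left unchanged."""
    # random.sample with the full length gives a shuffled copy of the list.
    shuffled = random.sample(items, len(items))
    # Assumption: len(items) is divisible by parts, so all chunks are equal.
    size = len(items) // parts
    return [shuffled[k * size:(k + 1) * size] for k in range(parts)]

listt = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o']
groups = split_random(listt, 3)
print(groups)
```

Because every element of the shuffled copy lands in exactly one slice, the resulting lists cannot share members.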
Use [:] for slicing all members out of the list which basically copies everything into a new object. Alternatively just use list(<list>) which copies too:
print(random.sample(listt[:],5))
In case you want to shuffle only once, store the shuffle result into a variable and copy later:
output = random.sample(listt,5)
first = output[:]
second = output[:]
print(first is second, first is output) # False, False
and then the original list can be modified without the first or second being modified.
For nested lists you might want to use copy.deepcopy().

Indexing failure/odd behaviour with array

I have some code that is intended to convert a 3-dimensional list to an array. Technically it works in that I get a 3-dimensional array, but indexing only works when I don't iterate across one of the dimensions, and doesn't work if I do.
Indexing works here:
listTempAllDays = []
for j in listGPSDays:
    listTempDay = []
    for i in listGPSDays[0]:
        arrayDay = np.array(i)
        listTempDay.append(arrayDay)
    arrayTemp = np.array(listTempDay)
    listTempAllDays.append(arrayTemp)
arrayGPSDays = np.array(listTempAllDays)
print(arrayGPSDays[0,0,0])
It doesn't work here:
listTempAllDays = []
for j in listGPSDays:
    listTempDay = []
    for i in j:
        arrayDay = np.array(i)
        listTempDay.append(arrayDay)
    arrayTemp = np.array(listTempDay)
    listTempAllDays.append(arrayTemp)
arrayGPSDays = np.array(listTempAllDays)
print(arrayGPSDays[0,0,0])
The difference between the two pieces of code is in the inner for loop. The first piece of code also works for all elements in listGPSDays (e.g. for i in listGPSDays[1]: etc...).
In the second case the code does run if I remove the final print call or change the final line to print(arrayGPSDays[0][0,0]).
In both cases checking the type at all levels returns <class 'numpy.ndarray'>.
I would like this array indexing to work, if possible - what am I missing?
The following is provided as example data:
Anonymised results from print(arrayGPSDays[0:2,0:2,0:2]), generated using the first piece of code (so that the indexing works! - but also resulting in arrayGPSDays[0] being the same as arrayGPSDays[1]):
[[['1' '2']
  ['3' '4']]
 [['1' '2']
  ['3' '4']]]
numpy's array constructor can handle arbitrarily dimensioned iterables. The only stipulation is that they can't be jagged (i.e. each "row" in each dimension must have the same length).
Here's an example:
In [1]: list_3d = [[['a', 'b', 'c'], ['d', 'e', 'f']], [['g', 'h', 'i'], ['j', 'k', 'l']]]
In [2]: import numpy as np
In [3]: np.array(list_3d)
Out[3]:
array([[['a', 'b', 'c'],
        ['d', 'e', 'f']],
       [['g', 'h', 'i'],
        ['j', 'k', 'l']]], dtype='<U1')
In [4]: array_3d = np.array(list_3d)
In [5]: array_3d[0,0,0]
Out[5]: 'a'
In [6]: array_3d.shape
Out[6]: (2, 2, 3)
If the array is jagged, numpy will "squash" down to the dimension where the jagged-ness happens. Since that explanation is clear as mud, an example might help:
In [20]: jagged_3d = [ [['a', 'b'], ['c', 'd']], [['e', 'f'], ['g', 'h'], ['i', 'j']] ]
In [21]: jagged_arr = np.array(jagged_3d)
In [22]: jagged_arr.shape
Out[22]: (2,)
In [23]: jagged_arr
Out[23]:
array([list([['a', 'b'], ['c', 'd']]),
       list([['e', 'f'], ['g', 'h'], ['i', 'j']])], dtype=object)
The reason the constructor isn't working out of the box is that you have a jagged array. numpy simply does not support jagged arrays, because each numpy array has a well-defined shape representing the length of each dimension. So if the items in a given dimension are different lengths, this abstraction falls apart, and numpy simply doesn't allow it.
HTH.
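A quick way to check for jaggedness before building the array is to compare the lengths at the level you suspect. The sketch below reuses the jagged_3d example; lengths, is_jagged and per_day are illustrative names. Keeping one regular array per day is just one possible workaround, not the only one:

```python
import numpy as np

jagged_3d = [[['a', 'b'], ['c', 'd']],
             [['e', 'f'], ['g', 'h'], ['i', 'j']]]

# Compare the number of rows in each "day"; differing lengths mean jagged.
lengths = [len(day) for day in jagged_3d]
is_jagged = len(set(lengths)) > 1
print(lengths, is_jagged)

# Each day on its own is rectangular, so it converts to a normal 2-D array.
per_day = [np.array(day) for day in jagged_3d]
print(per_day[0][0, 0])
print(per_day[1].shape)
```

This also matches the observation in the question: indexing one day first and then using [0,0] works, because only the outermost level is irregular.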
So Isaac, it seems your code has some syntax misinterpretations.
In your for statement, j represents an ITEM inside the list listGPSDays (I assume it is a list), not the ITEM INDEX inside the list, and you don't need to "get" the range of the list; Python can do it for you. Try:
for j in listGPSDays:
instead of
for j in range(len(listGPSDays)):
Also, try changing this line of code from:
for i in listGPSDays[j]:
to:
for i in j:
I think it will solve your problem, hope it works!

Numpy, 1:M joins on Arrays

I was wondering if there is a way to join NumPy arrays.
Example:
array1 = [[1,c,d], [2,a,b], [3, e,f]]
array2 = [[2,g,g,t], [1,alpha, beta, gamma], [1,t,y,u], [3,dog, cat, fish]]
I need to join these arrays, but the NumPy documentation says that if the records are not unique, the functions will fail or return unknown results.
Does anyone have a sample of how to do a 1:M join instead of a 1:1 join on NumPy arrays? Also, I know my examples aren't in the proper NumPy format, but it's just to give a general idea.
What you are trying to achieve looks more like a new nested list based on your two input arrays.
Treating them as lists:
list1 = [[1,'c','d'], [2,'a','b'], [3, 'e','f']]
list2 = [[2,'g','g','t'], [1,'alpha', 'beta', 'gamma'], [1,'t','y','u'], [3,'dog', 'cat', 'fish']]
You can build your desired result doing:
result = [i+j[1:] for i in list1 for j in list2 if i[0]==j[0]]
Which will look like this:
[[1, 'c', 'd', 'alpha', 'beta', 'gamma'],
[1, 'c', 'd', 't', 'y', 'u'],
[2, 'a', 'b', 'g', 'g', 't'],
[3, 'e', 'f', 'dog', 'cat', 'fish']]

Split words in a nested list into letters

I was wondering how I can split words in a nested list into their individual letters such that
[['ANTT'], ['XSOB']]
becomes
[['A', 'N', 'T', 'T'], ['X', 'S', 'O', 'B']]
Use a list comprehension:
[list(l[0]) for l in mylist]
Your input list simply contains nested lists with 1 element each, so we need to use l[0] on each element. list() on a string creates a list of the individual characters:
>>> mylist = [['ANTT'], ['XSOB']]
>>> [list(l[0]) for l in mylist]
[['A', 'N', 'T', 'T'], ['X', 'S', 'O', 'B']]
If you ever fix your code to produce a straight up list of strings (so without the single-element nested lists), you only need to remove the [0]:
>>> mylist = ['ANTT', 'XSOB']
>>> [list(l) for l in mylist]
[['A', 'N', 'T', 'T'], ['X', 'S', 'O', 'B']]
You could use a functional approach (still I would prefer list comprehensions as in Martijn Pieters' answer):
>>> from operator import itemgetter
>>> delisted = list(map(itemgetter(0), [['ANTT'], ['XSOB']])) # -> ['ANTT', 'XSOB']
>>> splited = list(map(list, delisted)) # -> [['A', 'N', 'T', 'T'], ['X', 'S', 'O', 'B']]
Or, as a one-liner (in Python 3, map returns an iterator, so wrap it in list()):
>>> list(map(list, map(itemgetter(0), [['ANTT'], ['XSOB']])))
[['A', 'N', 'T', 'T'], ['X', 'S', 'O', 'B']]
>>> list(map(lambda s: list(map(list, s))[0], [['ANTT'], ['XSOB']]))
[['A', 'N', 'T', 'T'], ['X', 'S', 'O', 'B']]
All answers given so far include a somewhat suspicious 0 literal. At first glance this seems to be a somewhat artificial addition, since the original task doesn't include any integer literal.
But the input has - of course - some special properties besides being just a list. It's a list of lists with exactly one string literal in it. In other words, a package of individually boxed objects (strings). If we rephrase the original question and the accepted answer, we can make the relation between [x] and [0] more obvious:
package = [['ANTT'], ['XSOB']]
result = [list(box[0]) for box in package] # 0 indicates first (and only) element in the box
