Python: Inserting an element into an array of different size

I have a numpy array like this:
[[[287 26]]
[[286 27]]
[[285 27]]
...
[[290 27]]
[[289 26]]
[[288 26]]]
and I would like to insert an integer and make it an array like
[
[287 26 integer]
[286 27 integer]
.......]
However, since the first array has a different shape from the one I want at the end, simply using the insert() function did not work for me.
Is there a workaround?
Thanks in advance.
EDIT: So the closest I came so far is the following:
outline_poses = []  # final array
for cnt in cnts:  # loop over each element
    outline_poses.append(cnt[0])
    outline_poses.append(SENSOR_MEASURED_HEIGHT)  # append this integer
Output:
[array([287, 26], dtype=int32), 60, array([286, 27], dtype=int32), 60,....]
How can I organize this array and make it look like [287, 26, 60],...?

If I understand your problem right, you could use a list comprehension.
newList = np.array([np.append(x[0],integer) for x in myList])
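The same result can probably also be built without a Python-level loop. The sketch below assumes the input really has shape (N, 1, 2), as the printed array in the question suggests. The names cnts and SENSOR_MEASURED_HEIGHT come from the question's edit; the sample values are made up for illustration.

```python
import numpy as np

# Made-up sample data with the (N, 1, 2) shape shown in the question
cnts = np.array([[[287, 26]], [[286, 27]], [[285, 27]]], dtype=np.int32)
SENSOR_MEASURED_HEIGHT = 60  # placeholder value

# Drop the singleton middle axis: (N, 1, 2) -> (N, 2)
points = cnts.reshape(-1, 2)

# One column holding the same integer for every row: shape (N, 1)
heights = np.full((points.shape[0], 1), SENSOR_MEASURED_HEIGHT, dtype=points.dtype)

# Stack the columns side by side: (N, 2) + (N, 1) -> (N, 3)
outline_poses = np.hstack([points, heights])
print(outline_poses.tolist())
# [[287, 26, 60], [286, 27, 60], [285, 27, 60]]
```

This keeps everything as one NumPy array instead of a list of small arrays, which may matter for large contours.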

This is a three-dimensional list you have right here...
>>> myList = [[[287, 26]],
[[286, 27]],
[[285, 27]],
[[290, 27]],
[[289, 26]],
[[288, 26]]]
...so you'll need to go two levels deep into your list before inserting or appending elements into the innermost lists.
>>> myList[0][0].append(42)
>>> myList[5][0].append(42)
>>> myList
[[[287, 26, 42]],
[[286, 27]],
[[285, 27]],
[[290, 27]],
[[289, 26]],
[[288, 26, 42]]]
What happens when you insert or append elements with shallower depths? 🤔
Appending at Depth 0
>>> myList.append('Catapult')
>>> myList
[[[287, 26, 42]],
[[286, 27]],
[[285, 27]],
[[290, 27]],
[[289, 26]],
[[288, 26, 42]],
'Catapult']
Appending at Depth 1 (starting again from the list without 'Catapult')
>>> myList[0].append('Trebuchet')
>>> myList[3].append('Treebuchet')
>>> myList
[[[287, 26, 42], 'Trebuchet'],
[[286, 27]],
[[285, 27]],
[[290, 27], 'Treebuchet'],
[[289, 26]],
[[288, 26, 42]]]
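If the goal is the flat [x, y, integer] layout from the question, a plain-list version might look like the sketch below. It assumes every middle-level list holds exactly one [x, y] pair, and the value 42 is just a placeholder.

```python
myList = [[[287, 26]], [[286, 27]], [[285, 27]]]
integer = 42  # placeholder value

# Take the single pair out of each wrapper list and build a new
# three-element list, leaving myList itself unchanged
flattened = [wrapper[0] + [integer] for wrapper in myList]
print(flattened)
# [[287, 26, 42], [286, 27, 42], [285, 27, 42]]
```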

If I'm correct, you are trying to insert an integer into all inner lists. You can use NumPy's concatenate function to achieve this.
import numpy as np
integer_to_insert = 6
original_array = np.array([[[290, 27]],[[289, 26]],[[288, 26]]])
# Build an (N, 1, 1) array that repeats the integer once per row. This inserts the same integer everywhere,
# but any array of values can be used here as long as its length equals the length of the original array.
concat_integer = np.array([integer_to_insert] * original_array.shape[0]).reshape(original_array.shape[0], 1, 1)
concatenated = np.concatenate([original_array, concat_integer], 2)
print(concatenated)
# array([[[290, 27, 6]],
# [[289, 26, 6]],
# [[288, 26, 6]]])
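Since the question mentions that insert() did not work, here is a hedged sketch of how np.insert can be applied to the same data. It assumes the value should go at the end of the last axis; passing axis=2 is what stops np.insert from flattening the array first.

```python
import numpy as np

original_array = np.array([[[290, 27]], [[289, 26]], [[288, 26]]])
integer_to_insert = 6

# Insert the scalar at the end of the last axis; it is broadcast to every row
end = original_array.shape[2]
inserted = np.insert(original_array, end, integer_to_insert, axis=2)

print(inserted.shape)
# (3, 1, 3)
print(inserted.reshape(-1, 3).tolist())
# [[290, 27, 6], [289, 26, 6], [288, 26, 6]]
```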

Related

Reading items from .txt in specific order

I'm trying to read items from a .txt file that has the following:
294.nii.gz [[9, 46, 54], [36, 48, 44], [24, 19, 46], [15, 0, 22]]
296.nii.gz [[10, 13, 62], [40, 1, 64], [34, 0, 49], [27, 0, 49]]
312.nii.gz [[0, 27, 57], [25, 25, 63], [0, 42, 38], [0, 11, 21]]
The way I want to extract the data is:
Get the item name: 294.nii.gz
Item's coordinates serially: [9, 46, 54] [36, 48, 44] ...
Get the next item:
N.B. all the items have the same number of 3D coordinates.
So far I can read the data by following codes:
coortxt = os.path.join(coordir, 'coor_downsampled.txt')
with open(coortxt) as f:
    content = f.readlines()
content = [x.strip() for x in content]
for item in content:
    print(item.split(' ')[0])
This only prints the item names:
294.nii.gz
296.nii.gz
312.nii.gz
How do I get the rest of the data in the format I need?
So you have the fun task of converting a string representation of a list to a list.
To do this, you can use the ast library, specifically the ast.literal_eval function.
Disclaimer:
According to documentation:
Warning It is possible to crash the Python interpreter with a sufficiently large/complex string due to stack depth limitations in Python’s AST compiler.
This is NOT the same as using eval. From the docs:
Safely evaluate an expression node or a string containing a Python expression. The string or node provided may only consist of the following Python literal structures: strings, numbers, tuples, lists, dicts, booleans, and None.
This can be used for safely evaluating strings containing Python expressions from untrusted sources without the need to parse the values oneself.
If this is a risk you're willing to accept:
You get the first part of the data with item.split(' ')[0].
Then, you'll use item.split(' ', 1)[1] to get (for example) a string with contents "[[9, 46, 54], [36, 48, 44], [24, 19, 46], [15, 0, 22]]". Splitting only once matters here: item.split(' ')[1:] would return a list of fragments rather than a single string.
A demonstration using ast:
import ast
list_str = "[[9, 46, 54], [36, 48, 44], [24, 19, 46], [15, 0, 22]]"
list_list = ast.literal_eval(list_str)
print(isinstance(list_list, list))
#Outputs True
print(list_list)
#Outputs [[9, 46, 54], [36, 48, 44], [24, 19, 46], [15, 0, 22]]
Tying it together with your code:
import os
import ast
coortxt = os.path.join(coordir, 'coor_downsampled.txt')
with open(coortxt) as f:
    content = f.readlines()
content = [x.strip() for x in content]
for item in content:
    # split only on the first space so the coordinate list stays one string
    name, coords_str = item.split(' ', 1)
    coords = ast.literal_eval(coords_str)
    # name, coords now contain your required data
    # use as needed
Relevant posts:
https://stackoverflow.com/a/10775909/5763413
How to convert string representation of list to a list?
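To see the whole approach end to end without a file, the sketch below parses two sample lines copied from the question and prints each name followed by its coordinates one per line. The parsed list is only there to collect results for illustration.

```python
import ast

# Sample lines copied from the question; in practice they come from the file
lines = [
    "294.nii.gz [[9, 46, 54], [36, 48, 44], [24, 19, 46], [15, 0, 22]]",
    "296.nii.gz [[10, 13, 62], [40, 1, 64], [34, 0, 49], [27, 0, 49]]",
]

parsed = []
for line in lines:
    # maxsplit=1 keeps the whole coordinate list together as one string
    name, coords_str = line.strip().split(' ', 1)
    coords = ast.literal_eval(coords_str)
    parsed.append((name, coords))

for name, coords in parsed:
    print(name)
    for coord in coords:
        print(coord)
```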
Others have suggested using the dynamic evaluator eval in Python (and even ast.literal_eval, which definitely works), but there are still ways to perform this kind of parsing without either.
Given that the formatting of the coordinate list in the coor_downsampled.txt file is very json-esque, we can parse it using the very cool json module instead.
NOTE:
There are sources claiming that json.loads is about 4x faster than eval and almost 7x faster than ast.literal_eval. If speed matters for your use case, I'd recommend the faster option.
Complete example
import os
import json
coortxt = 'coor_downsampled.txt'
with open(coortxt) as f:
    content = f.readlines()
content = [x.strip() for x in content]
for item in content:
    # split the line just like you did in your own example
    split_line = item.split(" ")
    # the "name" is the first element
    name = split_line[0]
    # here's the tricky part.
    coords = json.loads("".join(split_line[1:]))
    print(name)
    print(coords)
Explanation
Let's break down this tricky line coords = json.loads("".join(split_line[1:]))
split_line[1:] will give you everything past the first space, so something like this:
['[[9,', '46,', '54],', '[36,', '48,', '44],', '[24,', '19,', '46],', '[15,', '0,', '22]]']
But by wrapping it with a "".join(), we can turn it into
'[[9,46,54],[36,48,44],[24,19,46],[15,0,22]]' as a string instead.
Once we have it like that, we simply do json.loads() to get the actual list object
[[9, 46, 54], [36, 48, 44], [24, 19, 46], [15, 0, 22]].
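As a variation, the split-and-rejoin step can probably be skipped, because json.loads accepts whitespace between elements. The sketch below uses str.partition to cut each line at the first space and stores the result in a dictionary; coords_by_name is a name invented for this example.

```python
import json

# Sample lines copied from the question
lines = [
    "294.nii.gz [[9, 46, 54], [36, 48, 44], [24, 19, 46], [15, 0, 22]]",
    "312.nii.gz [[0, 27, 57], [25, 25, 63], [0, 42, 38], [0, 11, 21]]",
]

coords_by_name = {}
for line in lines:
    # partition splits at the first space only: (name, " ", rest)
    name, _, coords_str = line.strip().partition(" ")
    coords_by_name[name] = json.loads(coords_str)

print(coords_by_name["312.nii.gz"][0])
# [0, 27, 57]
```

A dictionary keyed by file name makes it easy to look up one item's coordinates later, at the cost of assuming the names are unique.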

How to get duplicate strings of list with indices in Python

I do realize this has already been addressed here (e.g., Removing duplicates in the lists, Accessing the index in 'for' loops?, Append indices to duplicate strings in Python efficiently, and many more). Nevertheless, I hope this question is different.
Pretty much I need to write a program that checks if a list has any duplicates and if it does, returns the duplicate element along with the indices.
The sample list sample_list
sample = """An article is any member of a class of dedicated words that are used with noun phrases to
mark the identifiability of the referents of the noun phrases. The category of articles constitutes a
part of speech. In English, both "the" and "a" are articles, which combine with a noun to form a noun
phrase."""
sample_list = sample.split()
my_list = [x.lower() for x in sample_list]
len(my_list)
output: 55
The common approach to get a unique collection of items is to use a set, which helps here to remove duplicates.
unique_list = list(set(my_list))
len(unique_list)
output: 38
This is what I have tried but honestly, I don't know what to do next...
from functools import partial
def list_duplicates_of(seq, item):
    start_at = -1
    locs = []
    while True:
        try:
            loc = seq.index(item, start_at + 1)
        except ValueError:
            break
        else:
            locs.append(loc)
            start_at = loc
    return locs
dups_in_source = partial(list_duplicates_of, my_list)
for i in my_list:
    print(i, dups_in_source(i))
This returns all the elements with indices and duplicate indices
an [0]
article [1]
.
.
.
form [51]
a [6, 33, 48, 52]
noun [15, 26, 49, 53]
phrase. [54]
Here I want to return only duplicate elements along with their indices like below
of [5, 8, 21, 24, 30, 35]
a [6, 33, 48, 52]
are [12, 43]
with [14, 47]
.
.
.
noun [15, 26, 49, 53]
You could do something along these lines:
from collections import defaultdict
indices = defaultdict(list)
# record every position at which each word occurs
for i, w in enumerate(my_list):
    indices[w].append(i)
# print only the words that occur more than once
for k, v in indices.items():
    if len(v) > 1:
        print(k, v)
of [5, 8, 21, 24, 30, 35]
a [6, 33, 48, 52]
are [12, 43]
with [14, 47]
noun [15, 26, 49, 53]
to [17, 50]
the [19, 22, 25, 28]
This uses collections.defaultdict and enumerate to efficiently collect the indices of each word in a single pass. Keeping only the words that occur more than once is then a simple conditional comprehension or a loop with an if statement.
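The conditional comprehension mentioned above could be sketched as follows. The function name duplicate_indices and the short sample sentence are made up for illustration; the technique is the same defaultdict and enumerate pass.

```python
from collections import defaultdict

def duplicate_indices(items):
    """Map each item that occurs more than once to the list of its positions."""
    positions = defaultdict(list)
    for i, item in enumerate(items):
        positions[item].append(i)
    # keep only the entries with more than one position
    return {item: idxs for item, idxs in positions.items() if len(idxs) > 1}

words = "the cat and the dog and the bird".split()
print(duplicate_indices(words))
# {'the': [0, 3, 6], 'and': [2, 5]}
```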

python: How to interpret indexing of slicing

Here is the code:
a = [0, 11, 22, 33, 44, 55]
a[1:4][1] = 666
print(a)
The output is [0, 11, 22, 33, 44, 55]
So list a is not updated, then what is the effect of that assignment?
[UPDATE]
Thanks @Amadan for the explanation, it makes sense. But I am still puzzled, because the following slice assignment directly updates the list:
a[1:4] = [111, 222, 333]
Intuitively I expected a[1:4][1] to operate on the original list as well, but it does not.
Is my intuition wrong?
a[1:4] creates a new list, whose elements are [11, 22, 33]. Then you replace its #1 element with 666, which results in a list [11, 666, 33]. Then, because this list is not referred to by any variable, it is forgotten and garbage collected.
Note that the result is very different if you have a numpy array instead of the list, since slicing of a numpy array creates a view, not a new array, if at all possible:
import numpy as np
a = np.array([0, 11, 22, 33, 44])
a[1:4][1] = 666
a
# => array([ 0, 11, 666, 33, 44])
Here, a[1:4] is not an independent [11, 22, 33], but a view into the original array, so changing an element of a[1:4] actually changes a.
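For plain lists, the difference between the two statements in the question can be made visible by spelling out the intermediate step. In this sketch, tmp is a name introduced only to hold the temporary list that a[1:4] creates.

```python
a = [0, 11, 22, 33, 44, 55]

# a[1:4][1] = 666 is two steps: read a slice (a new list), then assign into that copy
tmp = a[1:4]
tmp[1] = 666
print(tmp)  # [11, 666, 33]
print(a)    # [0, 11, 22, 33, 44, 55]

# a[1:4] = ... is a single slice assignment performed on a itself
a[1:4] = [111, 222, 333]
print(a)    # [0, 111, 222, 333, 44, 55]

# To change one element, index the original list directly
a[2] = 666
print(a)    # [0, 111, 666, 333, 44, 55]
```

So a slice on the right-hand side, or followed by another index, reads a copy, while a slice as the direct target of an assignment writes into the original list.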
Just another solution to think of, in case you don't know the position of 22 (and want to replace it with 666) or don't care about removing other items from the list.
a = [0, 11, 22, 33, 44, 55]
# Use enumerate to track the position of each item; when the item is 22,
# use that position to replace it with 666 in the original list.
for count, item in enumerate(a):
    if item == 22:
        a[count] = 666
print(a)
Output:
[0, 11, 666, 33, 44, 55]
Hope that helps, cheers!

Converting generators to lists overwrites the values in Python [duplicate]

I have this piece of code in Python 3.5 that implements generators for batching of potentially large buffered data:
import collections

def batched_data(data, size=20):
    batch = []
    for d in data:
        batch.append(d)
        if len(batch) == size:
            yield batch
            batch.clear()
    yield batch

def buffered_data(data, bufminsize=10, bufmaxsize=20):
    diter = iter(data)
    buffer = collections.deque(next(diter) for _ in range(bufmaxsize))
    while buffer:
        yield buffer.popleft()
        if len(buffer) < bufminsize:
            buffer.extend(
                next(diter) for _ in range(bufmaxsize - len(buffer)))

def batched_buffered_data(data, bufsize=100, batch=20):
    yield from batched_data(
        buffered_data(data, bufmaxsize=bufsize,
                      bufminsize=bufsize - batch),
        size=batch)
Now, when I initialize my generator and traverse the batches in a for-loop, everything's fine:
In [351]: gen = batched_buffered_data(range(27), bufsize=10, batch=7)
In [352]: for g in gen:
...: print(g)
...:
[0, 1, 2, 3, 4, 5, 6]
[7, 8, 9, 10, 11, 12, 13]
[14, 15, 16, 17, 18, 19, 20]
[21, 22, 23, 24, 25, 26]
However, when I try to use list comprehension or conversion using list, this happens:
In [353]: gen = batched_buffered_data(range(27), bufsize=10, batch=7)
In [354]: list(gen)
Out[354]:
[[21, 22, 23, 24, 25, 26],
[21, 22, 23, 24, 25, 26],
[21, 22, 23, 24, 25, 26],
[21, 22, 23, 24, 25, 26]]
I am really puzzled about this. There must be some sort of mutable elements involved, but I really don't know what's behind this behavior.
Changing batch.clear() to batch = [] should fix it. The problem is that batch, after clear(), still refers to the one original list, so every yield hands out the same list object. The for loop appears to work because it prints that list as it looks at that moment, before its elements are mutated for the next yield. Rebinding batch to a new list after each yield breaks the link to the previously yielded list, so there's no aliasing going on.
If you're still confused, check out this example using your original code with .clear():
result = []
gen = batched_buffered_data(range(27), bufsize=10, batch=7)
for g in gen:
    result.append(g)
for x in result:
    print(x)
Output:
[21, 22, 23, 24, 25, 26]
[21, 22, 23, 24, 25, 26]
[21, 22, 23, 24, 25, 26]
[21, 22, 23, 24, 25, 26]
You should change batch.clear() to batch = [].
That is because .clear() empties the list in place, so every variable that refers to that list sees it become []. In contrast, batch = [] just creates a new list and assigns it to batch.
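Putting the suggested fix together, a corrected version might look like the sketch below. Two things go beyond the answers and are assumptions of this sketch: the `if batch:` guard, which skips an empty trailing batch, and the use of itertools.islice in buffered_data. The latter is needed on Python 3.7 and later, where a StopIteration escaping from next() inside a generator expression is turned into a RuntimeError (PEP 479).

```python
import collections
import itertools

def batched_data(data, size=20):
    batch = []
    for d in data:
        batch.append(d)
        if len(batch) == size:
            yield batch
            batch = []  # rebind to a fresh list instead of clearing the yielded one
    if batch:  # assumption: do not yield an empty trailing batch
        yield batch

def buffered_data(data, bufminsize=10, bufmaxsize=20):
    diter = iter(data)
    # islice simply stops when the source runs out
    buffer = collections.deque(itertools.islice(diter, bufmaxsize))
    while buffer:
        yield buffer.popleft()
        if len(buffer) < bufminsize:
            buffer.extend(itertools.islice(diter, bufmaxsize - len(buffer)))

def batched_buffered_data(data, bufsize=100, batch=20):
    yield from batched_data(
        buffered_data(data, bufmaxsize=bufsize, bufminsize=bufsize - batch),
        size=batch)

batches = list(batched_buffered_data(range(27), bufsize=10, batch=7))
for b in batches:
    print(b)
# [0, 1, 2, 3, 4, 5, 6]
# [7, 8, 9, 10, 11, 12, 13]
# [14, 15, 16, 17, 18, 19, 20]
# [21, 22, 23, 24, 25, 26]
```

Because each yielded batch is now its own list object, collecting the generator with list() gives the same result as iterating over it.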

2d arrays and how to populate them with one dimensional arrays

This is my code:
def SetUpScores():
    scoreBoard= []
    names = ["George", "Paul", "John", "Ringo", "Bryan"]
    userScores = [17, 19, 23, 25, 35]
    for i in range(0,5):
        scoreBoard.append([])
        for j in range(0,2):
            scoreBoard[i].append(names[i])
            scoreBoard[i][1]= userScores[i]
I'm basically trying to create a two-dimensional array that holds the name and the userScore. I have looked this up a lot, and so far I keep getting the error "list assignment index out of range" or "'list' cannot be called".
If I remove the last line from my code, i.e.:
def SetUpScores():
    scoreBoard= []
    names = ["George", "Paul", "John", "Ringo", "Bryan"]
    userScores = [17, 19, 23, 25, 35]
    for i in range(0,5):
        scoreBoard.append([])
        for j in range(0,2):
            scoreBoard[i].append(names[i])
I get
[['George', 'George'], ['Paul', 'Paul'], ['John', 'John'], ['Ringo', 'Ringo'], ['Bryan', 'Bryan']] without any errors (this is just to test if the array was made).
I would like to make something like:
[['George', 17], ['Paul', 19], ['John', 23], ['Ringo', 25], ['Bryan', 35]]
Any help would be appreciated, thank you!
With the line scoreBoard[i].append(names[i]), you add a single element to the inner list. On the first pass of the inner loop that list therefore holds only one element, so the next line scoreBoard[i][1]= userScores[i] causes the error "list assignment index out of range": index 1 does not exist yet.
The most compact way to do what you want would be
for name, score in zip(names, userScores):
    scoreBoard.append([name, score])
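As a complete function, the zip approach could be sketched like this. The snake_case names and the return statement are choices made for this example; the original function neither returned nor printed the score board.

```python
def set_up_scores():
    names = ["George", "Paul", "John", "Ringo", "Bryan"]
    user_scores = [17, 19, 23, 25, 35]
    # zip pairs the i-th name with the i-th score
    return [[name, score] for name, score in zip(names, user_scores)]

score_board = set_up_scores()
print(score_board)
# [['George', 17], ['Paul', 19], ['John', 23], ['Ringo', 25], ['Bryan', 35]]
```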
names = ["George", "Paul", "John", "Ringo", "Bryan"]
userScores = [17, 19, 23, 25, 35]
L3 = []
# pair each name with the score at the same index
for i in range(0, len(names)):
    L3.append([names[i], userScores[i]])
print(L3)
Output:
[['George', 17], ['Paul', 19], ['John', 23], ['Ringo', 25], ['Bryan', 35]]
