I am working on a Python script that takes live streaming data and appends it to a NumPy array. However, I noticed that if I append to four different arrays one by one, it works. For example:
openBidArray = np.append(openBidArray, bidPrice)
highBidArray = np.append(highBidArray, bidPrice)
lowBidArray = np.append(lowBidArray, bidPrice)
closeBidArray = np.append(closeBidArray, bidPrice)
However, if I do the following, it does not work:
arrays = ["openBidArray", "highBidArray", "lowBidArray", "closeBidArray"]
for array in arrays:
    array = np.append(array, bidPrice)
Any idea why that is?
Do this instead:
arrays = [openBidArray, highBidArray, lowBidArray, closeBidArray]
In other words, your list should be a list of arrays, not a list of strings that coincidentally contain the names of arrays you happen to have defined.
Your next problem is that np.append() returns a copy of the array with the item appended, rather than appending in place. You store this result in array, but array will be assigned the next item from the list on the next iteration, and the modified array will be lost (except for the last one, of course, which will be in array at the end of the loop). So you will want to store each modified array back into the list. To do that, you need to know what slot it came from, which you can get using enumerate().
for i, array in enumerate(arrays):
    arrays[i] = np.append(array, bidPrice)
Now of course this doesn't update your original variables, openBidArray and so on. You could do this after the loop using unpacking:
openBidArray, highBidArray, lowBidArray, closeBidArray = arrays
But at some point it just makes more sense to store the arrays in a list (or a dictionary if you need to access them by name) to begin with and not use the separate variables.
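For illustration, a minimal sketch of the dict-based approach (the key names and the sample bidPrice value here are assumptions, not from the original code):

import numpy as np

bidPrice = 1.2345  # assumed sample value standing in for the live feed
bids = {
    "open": np.array([]),
    "high": np.array([]),
    "low": np.array([]),
    "close": np.array([]),
}
for key in bids:
    bids[key] = np.append(bids[key], bidPrice)  # store the returned copy back into the dict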
N.B. if you used regular Python lists here instead of NumPy arrays, some of these issues would go away. append() on lists is an in-place operation, so you wouldn't have to store the modified array back into the list or unpack to the individual variables. It might be feasible to do all the appending with lists and then convert them to arrays afterward, if you really need NumPy functionality on them.
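A sketch of that list-based alternative, assuming a hypothetical stream_of_prices iterable standing in for the live feed:

import numpy as np

openBid = []
for bidPrice in stream_of_prices:  # hypothetical source of incoming prices
    openBid.append(bidPrice)       # appends in place; no reassignment needed

openBidArray = np.array(openBid)   # convert once, after the stream ends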
In your second example, you have strings, not np.array objects. You are trying to append a number(?) to a string.
The string "openBidArray" doesn't hold any link to an array called openBidArray.
I am trying to replace the value of the first element with the value of the second element, on a NumPy array and on a Python list whose elements are exactly the same, but the results I get are different.
1) Test on a NumPy array:
test = np.array([2, 1])
left = test[:1]
right = test[1:]
test[0] = right[0]
print('left=:', left)
I get: left=: [1]
2) Test on a Python list:
test = [2, 1]
left = test[:1]
right = test[1:]
test[0] = right[0]
print('left=:', left)
I get: left=: [2]
Could anyone explain why the results are different? Thanks in advance.
Slicing (indexing with colons) a NumPy array returns a view into the array, so when you later update the value of test[0], the value of left updates too: left is just a view into the same underlying data.
When you slice a Python list, you get back a copy, so when you update the value of test[0], the value of left doesn't change.
NumPy does this because arrays are often very large, and creating lots of copies could be quite taxing.
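A small demonstration of the difference, using .copy() to force an independent copy:

import numpy as np

test = np.array([2, 1])
view = test[:1]         # a view: shares memory with test
copy = test[:1].copy()  # an independent copy
test[0] = test[1]
print(view)  # [1] -- the view sees the change
print(copy)  # [2] -- the copy does not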
To expand on James Down's explanation of NumPy arrays: you can use .copy() if you really want a COPY and not a VIEW of your array slice. However, once you make a copy, you would have to copy left again after the assignment test[0] = right[0] to pick up the new value.
Also, regarding the list version: you set test[0] = right[0], so if you print(test) after the assignment you will get [1, 1] instead of the original [2, 1]. As James pointed out, left is a copy of the list slice, so it is not updated by the change to the list.
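The same experiment with a plain list, for comparison:

test = [2, 1]
left = test[:1]    # list slicing returns a copy
test[0] = test[1]
print(test)  # [1, 1] -- the list itself changed
print(left)  # [2] -- the sliced-off copy did not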
I want to be able to create a 2d list with a variable size.
test = [[]]
The problem is that the data I want to put inside it is floating point, which makes it incompatible with the append function:
TempData[0] = 1
TempData[1] = 2.32
TempData[2] = 3.65
test.append(float(TempData))
Is there any way around this? I don't really want to declare a huge list, because sometimes the 2D list size may be very big or very small.
It looks like your issue is due to passing an object, TempData, to a list and then changing the contents of that object. A reference to TempData is stored in the list, not the values it contains, so when you alter TempData, it alters every element of the list that refers to it. Instead, try this:
test = []
test.append([1, 2.32, 3.65])
test.append([2.312, 1.231, 1.111])
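To see the aliasing pitfall described above, here is a short sketch reusing one TempData object (values taken from the question):

TempData = [1, 2.32, 3.65]
test = []
test.append(TempData)        # stores a reference to TempData, not a snapshot
TempData[0] = 99             # mutate the same object
print(test)                  # [[99, 2.32, 3.65]] -- the stored row changed too
test.append(list(TempData))  # appending a copy breaks the aliasing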
The Python array module is made specifically for holding numeric values. Here is an example using a list and an array.array:
import array
mylist = []
mylist.append(array.array('f', [1.43, 1.54, 1.24]))
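Note that the 'f' typecode stores single-precision floats, so values come back with float32 rounding; indexing and conversion work as with lists:

print(mylist[0][0])     # ~1.43, subject to single-precision rounding
print(list(mylist[0]))  # convert the array.array back to a plain list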
I want to create dynamic arrays inside a dynamic array because I don't know how many lists it will take to get the actual result. So, using Python 2.x, when I write
Arrays = [[]]
does this mean that there is only one dynamic array inside an array, or can there be more than one when I access it in a for loop like arrays[i]?
If that's not the case, do you know a different method?
You can just define
Arrays = []
It is enough to hold your dynamic arrays.
AnotherArray1 = []
AnotherArray2 = []
Arrays.append(AnotherArray1)
Arrays.append(AnotherArray2)
print Arrays
Hope this solves your problem!
Consider using
Arrays = []
and later, when you are assigning your results, use
Arrays.append([result])
This is assuming that your result comes in slices, but not as an array. No matter your actual return value layout, a variation of the above .append() should do the trick, as it allows you to dynamically extend your array. If your result comes as an array, it would simply be
Arrays.append(result)
and so on
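For instance, a sketch assuming a hypothetical compute_slices() generator that yields one result at a time:

Arrays = []
for result in compute_slices():  # hypothetical source of results
    Arrays.append([result])      # wrap each scalar slice; use Arrays.append(result) if it is already a list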
If your array is going to be sparse, that is, have a lot of empty elements, you can consider a dict with coordinates as keys instead of nested lists:
grid = {}
grid[(3, 5)] = 2.75  # key is the (x, y) coordinate
print(grid)
output: {(3, 5): 2.75}
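Cells that were never set are simply absent, so reads should use dict.get() with a default:

print(grid.get((5, 7), 0))  # 0 -- this cell was never assigned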
I'm concatenating Python dicts within a loop (not shown). I declare a new empty dict (dsst_mean_all) on the first iteration of the loop:
if station_index == 0:
    dsst_mean_all = {}
    for key in dsst_mean:
        dsst_mean_all[key] = []

source = [dsst_mean_all, dsst_mean]
for key in source[0]:
    dsst_mean_all[key] = np.concatenate([d[key] for d in source])
and then, as you can see in the second part of the code above, I concatenate the dict that has been obtained within the loop (dsst_mean) with the large dict that's going to hold all the data (dsst_mean_all).
Now dsst_mean is a dict whose elements are numpy arrays of different types. Mostly they are float32. My question is, how can I retain the datatype during concatenation? My dsst_mean_all dict ends up being float64 numpy arrays for all elements. I need these to match dsst_mean to save memory and reduce file size. Note that dsst_mean for all iterations of the loop has the same structure and elements of the same dtype.
Thanks.
You can define the dtype of your arrays in the list comprehension.
Either hardcoded:
dsst_mean_all[key] = np.concatenate([d[key].astype('float32') for d in source])
Or dynamically, casting to the dtype of the arrays in the incoming dsst_mean dict:
dsst_mean_all[key] = np.concatenate([d[key].astype(dsst_mean[key].dtype) for d in source])
Docs: https://docs.scipy.org/doc/numpy-1.13.0/user/basics.types.html
OK, one way to solve this is to avoid declaring dsst_mean_all as a new empty dict. Initializing each key to an empty Python list is, I think, why everything is being cast to float64: inside np.concatenate the empty list becomes an empty float64 array, and the result is promoted to float64. With an if/else statement, on the first iteration simply set dsst_mean_all to dsst_mean, whilst for all subsequent iterations do the concatenation as shown in my original question.
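A sketch of that structure, assuming a hypothetical stations iterable that yields one dsst_mean dict per iteration:

import numpy as np

dsst_mean_all = {}
for station_index, dsst_mean in enumerate(stations):  # stations is assumed
    if station_index == 0:
        dsst_mean_all = {key: arr.copy() for key, arr in dsst_mean.items()}
    else:
        for key in dsst_mean_all:
            # float32 + float32 stays float32; no empty float64 array is involved
            dsst_mean_all[key] = np.concatenate([dsst_mean_all[key], dsst_mean[key]])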
I have a huge list of tuples from which I want to extract individual columns. I have tried two methods.
Assuming the name of the list is List and I want to extract the jth column.
First one is
column = [item[j] for item in List]
Second one is
newList = zip(*List)
column = newList[j]
However, both methods are too slow, since the length of the list is about 50000 and the length of each tuple is about 100. Is there a faster way to extract the columns from the list?
This is something NumPy does well:
A = np.array(Lst)   # this step may take a while ... maybe you should have Lst as an np.array before you get to this point
sliced = A[:, [j]]  # this should be really quite fast; it keeps the column 2-D, while A[:, j] gives a flat 1-D column
That said,
newList = list(zip(*List))  # the list() call is needed on Python 3, where zip returns an iterator
column = newList[j]
takes less than a second for me with a 50k x 100 list of tuples ... so maybe profile your code and make sure the bottleneck is actually where you think it is.
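A rough benchmarking sketch (sizes taken from the question) to check where the time actually goes:

import timeit
import numpy as np

List = [tuple(range(100))] * 50000  # 50000 tuples, each of length 100
j = 3
print(timeit.timeit(lambda: [item[j] for item in List], number=10))
print(timeit.timeit(lambda: list(zip(*List))[j], number=10))
A = np.array(List)
print(timeit.timeit(lambda: A[:, j], number=10))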