Converting astropy.table.columns to a numpy array - python

I'd like to plot points:
points = np.random.multivariate_normal(mean=(0,0), cov=[[0.4,9],[9,10]],size=int(1e4))
print(points)
[[-2.50584156 2.77190372]
[ 2.68192136 -3.83203819]
...,
[-1.10738221 -1.72058301]
[ 3.75168017 5.6905342 ]]
print(type(points))
<class 'numpy.ndarray'>
data = ascii.read(datafile)
type(data['ra'])
astropy.table.column.Column
type(data['dec'])
astropy.table.column.Column
and then I try:
points = np.array([data['ra']], [data['dec']])
and get a
TypeError: data type not understood
Thoughts?

An astropy Table Column object can be converted to a numpy array using the data attribute:
In [7]: c = Column([1, 2, 3])
In [8]: c.data
Out[8]: array([1, 2, 3])
You can also convert an entire table to a numpy structured array with the as_array() Table method (e.g. data.as_array() in your example).
BTW I think the actual problem is not about astropy Column but your numpy array creation statement. It should probably be:
arr = np.array([data['ra'], data['dec']])
This works with Column objects.

The signature of numpy.array is numpy.array(object, dtype=None, ...), so the second positional argument is interpreted as the dtype.
Hence, when calling np.array([data['ra']], [data['dec']]), [data['ra']] is the object to convert to a numpy array, and [data['dec']] is the data type, which is not understood (as the error says).
It's not actually clear from the question what you are trying to achieve instead - possibly something like
points = np.array([data['ra'], data['dec']])
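Note that np.array([data['ra'], data['dec']]) gives an array of shape (2, N), while the points array in the question has shape (N, 2). A minimal sketch of getting the (N, 2) layout with np.column_stack follows; the ra and dec arrays here are made-up stand-ins for the table columns, and it is an assumption (not tested here) that Column objects behave the same way as plain arrays.

```python
import numpy as np

# Stand-ins for data['ra'] and data['dec'] (made-up values, not real data).
ra = np.array([10.1, 10.2, 10.3, 10.4])
dec = np.array([-1.0, -2.0, -3.0, -4.0])

# One list of two 1-D arrays: shape (2, 4), one row per coordinate.
stacked = np.array([ra, dec])

# column_stack puts each 1-D array in its own column: shape (4, 2),
# one row per point, matching the layout of the random points array.
points = np.column_stack((ra, dec))

print(stacked.shape, points.shape)
```

Transposing the first result (stacked.T) gives the same (N, 2) array.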

Keep in mind, though, that if what you actually want is to plot points, you don't need to convert to arrays. The following will work just fine:
from matplotlib import pyplot as plt
plt.scatter(data['ra'], data['dec'])
With no need to do any conversion to arrays.

Related

How to properly change dtype of numpy array

I have a numpy array that I obtained from pandas dataframe
data_array = df['column_name'].to_numpy()
The resulting array has dtype object, just like the original column, and consists of lists of integer values with shape (2000,). I would like it to be of int32 type. However, when I attempt to use
data_array = data_array.astype(np.int32)
I get the exception
setting an array element with a sequence.
All elements in array are lists with same number of integers (a hundred or so).
The general format is:
[[1,0,1,0],[0,0,0,0],[1,0,0,1]]
Is there something obvious I'm missing? Or is there another, better way, to convert pandas dataframes into numpy arrays of desired type?
Because it seems to me I'm running out of options.
EDIT
I figured it out, although the approach was a bit hacky.
data_array = np.array(df['column_name'].to_list(), np.int32)
I'm still not sure why it was needed. But apparently a two-dimensional list of integers can be turned into a numpy array with the right dtype, whereas an object array whose elements are lists cannot be cast directly.
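A possible explanation, sketched below: an object array of shape (N,) stores one Python list per element, so astype tries to turn each whole list into a single integer. Converting back to a nested list first lets NumPy see both dimensions. The small obj array is a hand-built stand-in for df['column_name'].to_numpy(), not the asker's data.

```python
import numpy as np

rows = [[1, 0, 1, 0], [0, 0, 0, 0], [1, 0, 0, 1]]

# Build a 1-D object array whose elements are Python lists,
# mimicking what to_numpy() returns for a column of lists.
obj = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    obj[i] = row

# tolist() hands back the nested Python lists, which np.array
# can then parse as a regular 2-D integer array.
data_array = np.array(obj.tolist(), dtype=np.int32)

print(obj.shape, data_array.shape, data_array.dtype)
```

This only works when every inner list has the same length, as stated in the question.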

How to create a new numpy array from a calculation of elements within an existing numpyarray

I'm a Python and Numpy newbie...and I'm stuck. I'm trying to create a new numpy array from the log returns of elements in an existing numpy array (i.e. new array = ln(x_t / x_{t-1}) applied to the old array). I was not using a Pandas dataframe because I plan to incorporate the correlations of the returns (i.e. the "new array") into a large Monte Carlo simulation. Open to suggestions if this is not the right path.
This is the closest result I found in a Stack Overflow search, but it is not working:
What is the most efficient way to get log returns in numpy
My guess is that I need to pass in the elements of the existing array but I thought using arrays and functions within Numpy was the whole benefit of moving away from Pandas series and Python base code. Appreciate help and feedback!
code link (I'm new so Stack Overflow won't let me embed images): http://i.stack.imgur.com/wkf56.png
NumPy has a log function that you can apply directly to an array. The return value will be a new array of the same shape. Keep in mind that the input should be an array of positive values with a float dtype.
import numpy
# five random floats in [0, 10); the size argument must be an int
old_array = numpy.random.random(5) * 10.
# note: entries <= 1 give nan or inf in this particular expression
new_array = numpy.log(old_array / (old_array - 1.))
print(type(old_array))
# <class 'numpy.ndarray'>
print(old_array.dtype)
# float64
print(old_array)
# example output: [ 8.56610175 6.40508542 2.00956942 3.33666968 8.90183905]
print(new_array)
# example output: [ 0.12413478 0.16975202 0.68839656 0.35624651 0.11916237]
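If "log returns" is meant in the usual sense of ln(x_t / x_{t-1}), i.e. each element divided by the previous one, a minimal sketch could look like this. The prices array is made up for illustration.

```python
import numpy as np

# Hypothetical price series (illustrative values only).
prices = np.array([100.0, 102.0, 101.0, 105.0])

# ln(x_t / x_{t-1}) equals ln(x_t) - ln(x_{t-1}), so differencing
# the logs gives the log returns; the result is one element shorter.
log_returns = np.diff(np.log(prices))

# The same thing written with explicit slicing.
alt = np.log(prices[1:] / prices[:-1])

print(log_returns)
```

Both forms stay fully vectorized, so no Python loop over the elements is needed.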

How to delete specific terms in a long array (python)?

I have a long array/list of numbers (from a netcdf file), and I want to delete a specific term which appears multiple times in the array. This is what I have:
lon = np.array(ncfile.variables['LONGITUDE'][:])
lon[lon>1000]=float('nan');
lat = np.array(ncfile.variables['LATITUDE'][:])
lat[lat>1000]=float('nan');
What I want to do is to have no values of lon/lat over 1000 (hence the 'nan'); however, I also want all 'nan's deleted from the array, as it messes up my graph.
My question: how do I delete all the 'nan' terms from my array? I know a similar question was asked, but it did not really answer my question.
If you're using numpy for your arrays, you can do
x = x[~numpy.isnan(x)]
Note: NetCDF variables are cast to numpy arrays, so there is no need to include the np.array call during the read in.
>>> lat = ncfile.variables['LATITUDE'][:]
>>> type(lat)
<class 'numpy.ndarray'>
If you want to simply retain the portion of the lat/lon arrays that are less than 1000, you can use numpy where:
lat_new = lat[np.where(lat < 1000.)[0]]
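One thing to watch when the values are plotted as pairs: filtering lon and lat independently can leave them with different lengths or misaligned. A sketch of filtering both with one shared boolean mask, assuming lon and lat are 1-D arrays of equal length (the values below are made up):

```python
import numpy as np

# Made-up coordinates; 9999.0 plays the role of the out-of-range fill value.
lon = np.array([10.0, 9999.0, 30.0, 40.0])
lat = np.array([-5.0, 2.0, 9999.0, 8.0])

# Keep a point only if both of its coordinates are in range.
valid = (lon <= 1000) & (lat <= 1000)

lon_clean = lon[valid]
lat_clean = lat[valid]

print(lon_clean, lat_clean)
```

Because the same mask indexes both arrays, element i of lon_clean still belongs with element i of lat_clean.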

How to combine np string array with float array python

I would like to combine an array full of floats with an array full of strings. Is there a way to do this?
(I am also having trouble rounding my floats, insert is changing them to scientific notation; I am unable to reproduce this with a small example)
A=np.array([[1/3,257/35],[3,4],[5,6]],dtype=float)
B=np.array([7,8,9],dtype=float)
C=np.insert(A,A.shape[1],B,axis=1)
print(np.around(B,decimals=2))
D=np.array(['name1','name2','name3'])
How do I append D onto the end of C in the same way that I appended B onto A (insert D as the last column of C)?
I suspect that there is a type issue between having strings and floats in the same array. It would also answer my questions if there were a way to change a float (or maybe a scientific number, my numbers are displayed as '5.02512563e-02') to a string with about 4 digits (.0502).
I believe concatenate will not work, because the array dimensions are (3,3) and (3,). D is a 1-D array, so D.T is no different than D. Also, when I plug this in I get "ValueError: all the input arrays must have same number of dimensions."
I don't care about accuracy loss due to appending, as this is the last step before I print.
Use dtype=object in your numpy array, as below:
np.array([1, 'a'], dtype=object)
Try making D a numpy array first, then transposing and concatenating with C:
D=np.array([['name1','name2','name3']])
np.concatenate((C, D.T), axis=1)
See the documentation for concatenate for explanation and examples:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.concatenate.html
numpy arrays support only one type of data in the array. Changing the float to str is not a good idea, as it will only result in values very close to the original value.
Try using pandas; it supports a different data type for each column.
import numpy as np
import pandas as pd
np_ar1 = np.array([1.3, 1.4, 1.5])
np_ar2 = np.array(['name1', 'name2', 'name3'])
df1 = pd.DataFrame({'ar1':np_ar1})
df2 = pd.DataFrame({'ar2':np_ar2})
pd.concat([df1.ar1, df2.ar2], axis=0)
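For the formatting part of the question (fixed decimals instead of scientific notation), one option is to turn the floats into strings first and only then attach the names. This is a sketch using plain Python string formatting, with C written out by hand to match the question's values; the result is a string array meant for printing, not for further arithmetic.

```python
import numpy as np

# C as built in the question (A with B inserted as the last column).
C = np.array([[1/3, 257/35, 7.0],
              [3.0, 4.0, 8.0],
              [5.0, 6.0, 9.0]])
D = np.array(['name1', 'name2', 'name3'])

# Format every float with 4 decimal places; the result is a 2-D string array.
C_str = np.array([[f"{value:.4f}" for value in row] for row in C])

# column_stack treats the 1-D array D as one extra column.
table = np.column_stack((C_str, D))

print(table)
```

Since everything is already a string at that point, there is no type conflict when the name column is appended.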

Insert numpy array to an empty numpy array

I am trying to create an empty numpy array and then insert newly created arrays into that one. It is important for me not to shape the first numpy array; it has to be empty, and then I should be able to add new numpy arrays with different sizes into it. Something like the following:
A = numpy.array([])
B = numpy.array([1,2,3])
C = numpy.array([5,6])
A.append(B, axis=0)
A.append(C, axis=0)
and I want A to look like this:
[[1,2,3],[5,6]]
When I do the append command I get the following error:
AttributeError: 'numpy.ndarray' object has no attribute 'append'
Any idea how this can be done?
PS: This is not similar to the questions asked before because I am not trying to concatenate two numpy arrays. I am trying to insert a numpy array to another empty numpy array. I know how to do this using lists but it has to be numpy array.
Thanks
You can't do that with numpy arrays, because a real 2D numpy array is rectangular. For example, np.arange(6).reshape(2,3) returns array([[0, 1, 2],[3, 4, 5]]).
If you really want to do that, try np.array([np.array([1,2,3]), np.array([5,6])], dtype=object), which creates array([array([1, 2, 3]), array([5, 6])], dtype=object). Recent NumPy versions require dtype=object to be given explicitly for such ragged input. But you will lose all the numpy power with misaligned data.
You can do this by converting the arrays to lists:
In [21]: a = list(A)
In [22]: a.append(list(B))
In [24]: a.append(list(C))
In [25]: a
Out[25]: [[1, 2, 3], [5, 6]]
My intuition is that there's a much better solution (either more pythonic or more numpythonic) than this, which might be gleaned from a more complete description of your problem.
Taken from here. Maybe search for existing questions first.
numpy.append(M, a)
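A common pattern for this situation, sketched here under the assumption that the pieces arrive one at a time: collect them in a Python list and convert once at the end, either to a ragged object array or to a single flat array.

```python
import numpy as np

B = np.array([1, 2, 3])
C = np.array([5, 6])

# Collect the pieces in an ordinary list; appending to a list is cheap.
parts = []
parts.append(B)
parts.append(C)

# Option 1: a 1-D object array holding the arrays as separate elements.
ragged = np.empty(len(parts), dtype=object)
for i, part in enumerate(parts):
    ragged[i] = part

# Option 2: one flat array with all the values joined end to end.
flat = np.concatenate(parts)

print(ragged)
print(flat)
```

The object array keeps the [[1,2,3],[5,6]] grouping, while the flat version keeps ordinary vectorized NumPy behaviour.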
