map int over 2D array python - python

To parse
s="1,2,3,4_5,6,7,8"
as [[1,2,3,4],[5,6,7,8]]
I am currently using
import numpy as np
a=np.array([list(map(int,r.split(","))) for r in s.split("_")])
Is there a more pythonic or one-shot inbuilt way of doing this or am I on the right track here?
Python newbie.

Using list-comprehensions:
import numpy as np
s="1,2,3,4_5,6,7,8"
a = np.array([[int(x) for x in r.split(',')] for r in s.split('_')])

You can use np.genfromtxt:
from io import StringIO
import numpy as np
s="1,2,3,4_5,6,7,8"
np.genfromtxt(StringIO(s.replace("_", "\n")), delimiter=",")
array([[1., 2., 3., 4.],
[5., 6., 7., 8.]])
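Note that the call above returns floats. If integers are wanted, as in the question, a minimal sketch is to pass `dtype=int` to the same call (this assumes every field in the string is a valid integer):

```python
from io import StringIO

import numpy as np

s = "1,2,3,4_5,6,7,8"

# Turn the "_" row separator into newlines so genfromtxt sees one row per line,
# and ask for an integer result instead of the default float64.
a = np.genfromtxt(StringIO(s.replace("_", "\n")), delimiter=",", dtype=int)
print(a)
```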

Related

matlab single() function in numpy?

I try to convert matlab code to python/numpy code.
I have this line:
l = single(l)
"l" is an array of arrays and, as the MATLAB documentation says, single() means "Convert to single precision".
How can I do that with numpy?
To convert a two-dimensional numpy array to single-precision, use astype and give it the float32 argument. For example:
>>> import numpy as np
>>> a = np.array([[1.], [2.], [3.]])
>>> a
array([[ 1.],
[ 2.],
[ 3.]])
>>> a = a.astype('float32')
>>> a
array([[ 1.],
[ 2.],
[ 3.]], dtype=float32)
For more about numeric and array data types, see the documentation.
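As a small supplementary sketch: NumPy also exposes `np.single` as an alias for `float32`, which reads closer to the MATLAB original. The array contents here are made up for illustration.

```python
import numpy as np

l = np.array([[1.0, 2.0], [3.0, 4.0]])  # float64 by default
l_single = l.astype(np.single)          # np.single is an alias for np.float32

print(l.dtype, l_single.dtype)
print(l_single.itemsize)                # bytes per element
```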

I don't understand the k-means scipy algorithm

I'm trying to use the scipy kmeans algorithm.
So I have this really simple example:
from numpy import array
from scipy.cluster.vq import vq, kmeans, whiten
features = array([[3,4],[3,5],[4,2],[4,2]])
book = array((features[0],features[2]))
final = kmeans(features,book)
and the result is
final
(array([[3, 4],
[4, 2]]), 0.25)
What I don't understand is that, to me, the centroid coordinates should be the barycentre of all the points belonging to the cluster, so in this example
[3,9/2] and [4,2]
Can anyone explain the result the scipy algorithm is giving?
It looks like it is preserving the data type that you are giving it (int). Try:
features = array([[3., 4.], [3., 5.], [4., 2.], [4., 2.]])
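To see why the dtype matters, here is a NumPy-only sketch that does not call scipy at all. It assumes the cluster assignment from the question (the first two points in one cluster, the last two in the other) and shows that the expected barycentres turn into the reported result once they are stored as integers.

```python
import numpy as np

features = np.array([[3, 4], [3, 5], [4, 2], [4, 2]])
labels = np.array([0, 0, 1, 1])  # assumed assignment of points to clusters

# Barycentre of each cluster, computed in floating point.
centroids = np.array([features[labels == k].mean(axis=0) for k in range(2)])
print(centroids)

# Casting the means back to the integer dtype of the input drops the .5
truncated = centroids.astype(features.dtype)
print(truncated)
```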

Writing multiple separate numpy arrays to a single comma delimited text file

I would like to take multiple numpy arrays and write them to a text file that is comma delimited. Here is the example of my original data and the final data that I am trying to produce:
array([[1., 3., 0., 1.],
[2., 5., 3., 1.]].....
and so forth, for multiple different arrays, each four columns wide. I can get an output txt file using write(), but I can't get the data into the format shown below:
1., 3., 0., 1.
2., 5., 3., 1.
Also, I need to have the 0th column integers and the 1st through 3rd be floating point.
Cheers.
How about this?
import numpy as np
data = np.array([[1., 3., 0., 1.],
                 [2., 5., 3., 1.]])  # further rows elided in the question
with open('output.csv', 'w') as f:
    for x in data:
        f.write('%d,%f,%f,%f\n' % tuple(x))
This outputs
1,3.000000,0.000000,1.000000
2,5.000000,3.000000,1.000000
You can adjust the precision of the floating point output by changing %f to %.2f if you want two decimal places, for instance.
I would recommend using pandas for this:
import numpy
import pandas
data = numpy.array([[1., 3., 0., 1.],
[2., 5., 3., 1.]])
data = pandas.DataFrame(data,columns=['a','b','c','d'])
data['a'] = data['a'].astype(int)
data.to_csv('outfile.csv', index=False, header=False)  # omit the index column and header row to match the requested layout
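Another option, sketched here under the assumption that every array has the same four columns, is `np.savetxt` with one format per column, called once per array on the same file handle. The second array is invented for illustration, and an in-memory buffer stands in for the output file.

```python
import io

import numpy as np

first = np.array([[1., 3., 0., 1.],
                  [2., 5., 3., 1.]])
second = np.array([[7., 0.5, 2., 4.]])  # hypothetical second array

out = io.StringIO()  # replace with open('output.csv', 'w') to write a real file
for block in (first, second):
    # One format per column: integer for column 0, floats for columns 1 to 3.
    np.savetxt(out, block, fmt=["%d", "%f", "%f", "%f"], delimiter=",")

text = out.getvalue()
print(text)
```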

Best way to serialize GeoTIFF in a file to be opened by Python as Numpy array

I have a file in GeoTIFF (geo-referenced TIFF image), which I can load in Python using GDAL and convert into a Numpy array, which my program then processes using the geo-referencing info taken from the file by GDAL.
Since I'd like to remove the GDAL dependency, I plan to serialize the GeoTIFF information to another file format (JSON comes to mind), with the following desirable requirements:
Small file size;
Fast access;
Random-access (slicing) if possible;
Numpy-friendly (doesn't need a fancy class or another module dependency to decode);
Simple/straightforward/"human-readable";
Could be easily used by other scripts in other languages, not cryptic;
JSON would work fine, but I'm concerned that it is neither the smallest nor the fastest format to access. Since the array type is uint16, binary could be an option. Pickle might be too cryptic. CSV would make it difficult to separate the geo-referencing info (corner coordinates and resolution) from the grid values.
Thanks for reading!
I'm not familiar with GeoTIFF information, but for storing flat data I'd highly recommend the hdf5 format, which has a nice set of python bindings called h5py. Here's a quick example, showing how easy it is to work with:
>>> import h5py
>>> import numpy as np
>>> f = h5py.File('data.hdf5', 'w')  # recent h5py versions open read-only by default
>>> a = np.arange(12.0).reshape((4,3))
>>> a
array([[ 0., 1., 2.],
[ 3., 4., 5.],
[ 6., 7., 8.],
[ 9., 10., 11.]])
>>> f.create_dataset('array', data=a)
<HDF5 dataset "array": shape (4, 3), type "<f8">
>>> f['array'].attrs['info'] = 'some data I want to store'
>>> f['array'].attrs['date'] = (6, 21, 2012)
>>> f.close()
>>> f = h5py.File('data.hdf5', 'r')
>>> f['array']
<HDF5 dataset "array": shape (4, 3), type "<f8">
>>> f['array'][()]  # .value was removed in h5py 3.0
array([[ 0., 1., 2.],
[ 3., 4., 5.],
[ 6., 7., 8.],
[ 9., 10., 11.]])
>>> f['array'].attrs['info']
'some data I want to store'
>>> f['array'].attrs['date']
array([ 6, 21, 2012])
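If even h5py is one dependency too many, a NumPy-only alternative is a compressed `.npz` archive holding the grid plus the geo-referencing info as a JSON string. This is only a sketch: the `georef` keys and values are hypothetical placeholders, not real GeoTIFF fields, and an in-memory buffer stands in for the file on disk.

```python
import io
import json

import numpy as np

grid = np.arange(12, dtype=np.uint16).reshape(3, 4)  # stand-in for the raster
georef = {"origin_x": 500000.0, "origin_y": 4200000.0, "pixel_size": 30.0}  # hypothetical

buf = io.BytesIO()  # pass a filename such as 'raster.npz' to write to disk
np.savez_compressed(buf, grid=grid, georef=json.dumps(georef))

buf.seek(0)
with np.load(buf) as archive:
    restored = archive["grid"]
    # The JSON text comes back as a 0-d string array; .item() yields a str.
    restored_georef = json.loads(archive["georef"].item())

print(restored.dtype, restored.shape, restored_georef)
```

The metadata stays human-readable JSON, while the grid is stored in binary. Unlike HDF5, a compressed archive has to be decompressed before slicing.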

How do I convert a numpy array to (and display) an image?

I have created an array thusly:
import numpy as np
data = np.zeros( (512,512,3), dtype=np.uint8)
data[256,256] = [255,0,0]
What I want this to do is display a single red dot in the center of a 512x512 image. (At least to begin with... I think I can figure out the rest from there)
The following should work:
from matplotlib import pyplot as plt
plt.imshow(data, interpolation='nearest')
plt.show()
If you are using Jupyter notebook/lab, use this inline command before importing matplotlib:
%matplotlib inline
A more featureful option is to install ipympl (pip install ipympl) and use
%matplotlib widget
see an example.
You could use PIL to create (and display) an image:
from PIL import Image
import numpy as np
w, h = 512, 512
data = np.zeros((h, w, 3), dtype=np.uint8)
data[0:256, 0:256] = [255, 0, 0] # red patch in upper left
img = Image.fromarray(data, 'RGB')
img.save('my.png')
img.show()
Note: both of the scipy APIs below were first deprecated and then removed.
Shortest path is to use scipy, like this:
# Note: deprecated in SciPy v1.0.0 and removed in v1.3.0
from scipy.misc import toimage
toimage(data).show()
This requires PIL or Pillow to be installed as well.
A similar approach also requiring PIL or Pillow but which may invoke a different viewer is:
# Note: deprecated in SciPy v1.0.0 and removed in v1.3.0
from scipy.misc import imshow
imshow(data)
How to show images stored in numpy array with example (works in Jupyter notebook)
I know there are simpler answers, but this one will give you an understanding of how images are actually drawn from a numpy array.
Load example
from sklearn.datasets import load_digits
digits = load_digits()
digits.images.shape #this will give you (1797, 8, 8). 1797 images, each 8 x 8 in size
Display array of one image
digits.images[0]
array([[ 0., 0., 5., 13., 9., 1., 0., 0.],
[ 0., 0., 13., 15., 10., 15., 5., 0.],
[ 0., 3., 15., 2., 0., 11., 8., 0.],
[ 0., 4., 12., 0., 0., 8., 8., 0.],
[ 0., 5., 8., 0., 0., 9., 8., 0.],
[ 0., 4., 11., 0., 1., 12., 7., 0.],
[ 0., 2., 14., 5., 10., 12., 0., 0.],
[ 0., 0., 6., 13., 10., 0., 0., 0.]])
Create empty 10 x 10 subplots for visualizing 100 images
import matplotlib.pyplot as plt
fig, axes = plt.subplots(10,10, figsize=(8,8))
Plotting 100 images
for i, ax in enumerate(axes.flat):
    ax.imshow(digits.images[i])
What does axes.flat do?
It returns a flat iterator (numpy.flatiter) over the array of axes, so you can loop over every subplot in order and draw on each one.
Example:
import numpy as np
x = np.arange(6).reshape(2,3)
x.flat
for item in x.flat:
    print(item, end=' ')
import numpy as np
from keras.preprocessing.image import array_to_img
img = np.zeros([525,525,3], np.uint8)
b=array_to_img(img)
b
Using pillow's fromarray, for example:
from PIL import Image
import numpy as np
im = np.array(Image.open('image.jpg'))  # load the image file into a numpy array
Image.fromarray(im).show()  # convert the array back to an image and display it
Using pygame, you can open a window, get the surface as an array of pixels, and manipulate as you want from there. You'll need to copy your numpy array into the surface array, however, which will be much slower than doing actual graphics operations on the pygame surfaces themselves.
The Python Imaging Library can display images using Numpy arrays. Take a look at this page for sample code:
Convert Between Numerical Arrays and PIL Image Objects
EDIT: As the note on the bottom of that page says, you should check the latest release notes which make this much simpler:
http://effbot.org/zone/pil-changes-116.htm
Supplement for doing so with matplotlib. I found it handy doing computer vision tasks. Let's say you got data with dtype = int32
from matplotlib import pyplot as plot
import numpy as np
fig = plot.figure()
ax = fig.add_subplot(1, 1, 1)
# make sure your data is in H W C, otherwise you can change it by
# data = data.transpose((_, _, _))
data = np.zeros((512,512,3), dtype=np.int32)
data[256,256] = [255,0,0]
ax.imshow(data.astype(np.uint8))
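One caveat with the cast above: `astype(np.uint8)` wraps integer values outside 0 to 255 instead of saturating them. A minimal sketch of the difference, using made-up sample values, is to clip before casting:

```python
import numpy as np

values = np.array([-5, 0, 128, 300], dtype=np.int32)

wrapped = values.astype(np.uint8)                   # out-of-range values wrap modulo 256
clipped = np.clip(values, 0, 255).astype(np.uint8)  # out-of-range values saturate

print(wrapped)   # [251   0 128  44]
print(clipped)   # [  0   0 128 255]
```

If the int32 data is already known to lie within 0 to 255, the plain cast is fine.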
For example, suppose your image is in an array named 'image'.
All you need to do is
plt.imshow(image)
plt.show()
This will display the array as an image.
Also, don't forget to import pyplot first (import matplotlib.pyplot as plt).
This could be a possible code solution:
from skimage import io
import numpy as np
data=np.random.randn(5,2)
io.imshow(data)
