Use Binary Data Instead of File in Python Numpy - python

I need to read a file into a numpy array. The program only has access to the binary data from the file, and the original file extension if needed. The data the program receives would look something like the "data" shown below.
data = open('file.csv', 'rb').read()
I need to generate an array from this binary data. I do not have permission to write the data to a file, so writing it out and passing the file path to numpy won't work.
Is there some way I can treat the binary data like a file so that I can use the numpy function below?
my_data = genfromtxt(data, delimiter=',')
Thanks.
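One possible approach, as a minimal sketch: wrap the bytes in an in-memory file object with io.BytesIO and pass that to np.genfromtxt, which accepts file-like objects as well as paths. The sample bytes below are made up and stand in for whatever open('file.csv', 'rb').read() would return.

```python
import io

import numpy as np

# Stand-in for: data = open('file.csv', 'rb').read()
# (hypothetical CSV content, used only for illustration)
data = b"1,2,3\n4,5,6\n"

# io.BytesIO exposes the bytes through a file-like interface,
# so nothing has to be written to disk.
my_data = np.genfromtxt(io.BytesIO(data), delimiter=",")

print(my_data.shape)
```

If the data needs a specific text encoding, decoding it first and using io.StringIO(data.decode(...)) is an alternative.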

Related

How to filter out useable data from csv files using python?

Please help me extract the important data from a .csv file using Python. I got the .csv file from 'citrine'.
I want to extract the element names and atomic percentages in the form "Al2.5B0.02C0.025Co14.7Cr16.0Mo3.0Ni57.48Ti5.0W1.25Zr0.03".
ORIGINAL
[{""element"":""Al"",""idealAtomicPercent"":{""value"":""5.4""}},{""element"":""B"",""idealAtomicPercent"":{""value"":""0.02""}},{""element"":""C"",""idealAtomicPercent"":{""value"":""0.13""}},{""element"":""Co"",""idealAtomicPercent"":{""value"":""7.5""}},{""element"":""Cr"",""idealAtomicPercent"":{""value"":""6.1""}},{""element"":""Mo"",""idealAtomicPercent"":{""value"":""2.0""}},{""element"":""Nb"",""idealAtomicPercent"":{""value"":""0.5""}},{""element"":""Ni"",""idealAtomicPercent"":{""value"":""61.0""}},{""element"":""Re"",""idealAtomicPercent"":{""value"":""0.5""}},{""element"":""Ta"",""idealAtomicPercent"":{""value"":""9.0""}},{""element"":""Ti"",""idealAtomicPercent"":{""value"":""1.0""}},{""element"":""W"",""idealAtomicPercent"":{""value"":""5.8""}},{""element"":""Zr"",""idealAtomicPercent"":{""value"":""0.13""}}]
[Images in the original post: original CSV; expected output]
Without knowing the file structure, it is hard to tell.
Try to load the file using:
import csv
with open(file_path) as file:
    reader = csv.DictReader(...)
You will have to figure out the arguments for csv.DictReader, which depend on the file.
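As a rough sketch of the remaining step, assuming each composition cell holds a JSON list like the one shown under ORIGINAL: the csv module turns the doubled quotes back into single quotes, after which json.loads can parse the cell. The helper name composition_to_formula and the two-element sample row are made up for illustration; the real file's column layout is not known here.

```python
import csv
import io
import json


def composition_to_formula(cell):
    """Concatenate element symbols and atomic percentages from one JSON cell."""
    entries = json.loads(cell)
    return "".join(
        entry["element"] + entry["idealAtomicPercent"]["value"] for entry in entries
    )


# Hypothetical one-row, one-column CSV using the same quoting as the sample above.
raw = (
    '"[{""element"":""Al"",""idealAtomicPercent"":{""value"":""5.4""}},'
    '{""element"":""B"",""idealAtomicPercent"":{""value"":""0.02""}}]"\n'
)

rows = list(csv.reader(io.StringIO(raw)))
formula = composition_to_formula(rows[0][0])
print(formula)  # Al5.4B0.02
```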

How to load matlab cell matrix of varied size into python

I need to load a cell array generated in Matlab into Python. Each element of the cell is a 2D matrix, and the matrices vary in size.
I tried both scipy.io.loadmat and mat2py.loadmat, but neither gives the desired result (e.g., a list of numpy arrays). With the former, the resulting data is of object type; the latter gives a list but does not preserve the shape of the array elements in the cell.
In Matlab, save the data as JSON using JSONLab: https://github.com/fangq/jsonlab
or save the data as HDF5 using EasyH5: https://github.com/fangq/easyh5
Then, in Python, import the JSON file using
import json
with open('mydata.json', 'r') as fid:
    data = json.load(fid, strict=False)
or import the HDF5 file using
import h5py
h5file = h5py.File('mydata.h5', 'r')
If the exported JSON file contains JData structures, you need to install pyjdata (https://pypi.org/project/jdata/) via
pip install jdata
and then load the .json file using
import jdata as jd
import numpy as np
newdata = jd.load('mydata.json')
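Separately from the JSON/HDF5 route, the object-typed result mentioned in the question can often be unpacked directly. The sketch below assumes the cell arrives as an object-dtype NumPy array whose entries are the individual matrices, which is how scipy.io.loadmat typically represents a cell array. To stay self-contained it builds a stand-in array by hand rather than loading a real .mat file, and cell_to_list is a made-up helper name.

```python
import numpy as np


def cell_to_list(cell):
    """Flatten an object-dtype array of matrices into a plain list of ndarrays."""
    return [np.asarray(item) for item in np.asarray(cell, dtype=object).ravel()]


# Stand-in for something like mat['mycell'] from scipy.io.loadmat
# (the key name 'mycell' is hypothetical).
cell = np.empty((1, 2), dtype=object)
cell[0, 0] = np.zeros((2, 3))
cell[0, 1] = np.ones((4, 1))

arrays = cell_to_list(cell)
shapes = [a.shape for a in arrays]
print(shapes)  # [(2, 3), (4, 1)]
```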

Reading large array with numpy gives zeros

I have a large binary file (~4GB) written as 4-byte reals. I am trying to read this file using np.fromfile as follows.
data = np.fromfile(filename, dtype=np.single)
Upon inspecting data, I see that all elements are zeros. However, when I read the file in Matlab, I can see that it contains the correct data and not zeros. I tested a smaller file (~2.5GB), and numpy could read that fine.
I finally tried using np.memmap to read the large file (~4GB), as
data = np.memmap(filename, dtype=np.single, mode='r')
and upon inspecting data, I can see that it correctly reads the data.
My question is: why is np.fromfile giving me all zeros in the array? Is there a limit on how much np.fromfile can read?
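The thread does not establish why np.fromfile returns zeros here. As a workaround sketch only, the file can be read in smaller pieces through an open file handle using the count argument of np.fromfile, filling a preallocated array. The function name read_in_chunks and the chunk size are assumptions for illustration, and the demo uses a tiny temporary file rather than a 4GB one.

```python
import os
import tempfile

import numpy as np


def read_in_chunks(filename, dtype=np.single, chunk_items=1_000_000):
    """Read a flat binary file piece by piece into one preallocated array."""
    itemsize = np.dtype(dtype).itemsize
    total = os.path.getsize(filename) // itemsize  # number of whole items in the file
    out = np.empty(total, dtype=dtype)
    with open(filename, "rb") as f:
        pos = 0
        while pos < total:
            chunk = np.fromfile(f, dtype=dtype, count=min(chunk_items, total - pos))
            if chunk.size == 0:  # nothing more could be read; stop early
                break
            out[pos:pos + chunk.size] = chunk
            pos += chunk.size
    return out


# Small demo file standing in for the large one.
tmp_path = os.path.join(tempfile.mkdtemp(), "reals.bin")
expected = np.arange(10, dtype=np.single)
expected.tofile(tmp_path)

data = read_in_chunks(tmp_path, chunk_items=3)
```

Unlike np.memmap, this loads the whole file into memory, so it only helps when the array fits in RAM.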

np.loadtxt ignores the header, how can I save the header data?

I've saved a numpy array using savetxt and given the array a header. When I read the file using loadtxt, the header is ignored and only the data ends up in my new array.
How can I access the header? It has important information that I want to save as a string.
Edit:
np.savetxt(file_name, array, delimiter=",", header='x,y,z, data from monte carlo simulation')
data = np.loadtxt('test', dtype=float, delimiter=',')
I want to get "data from monte carlo simulation" and save it as a string.
To get the header, you can simply read the first line of the file using the .readline() method on your file. In your case it would look something like this:
with open(filename) as f:
    header = f.readline()
# np.savetxt writes the header as a '# ...' comment line; strip() drops the space and newline around the last field
last_col_name = header.split(',')[-1].strip()  # 'data from monte carlo simulation'
Also, if you want a more versatile way of storing data, you can check out the pandas library.
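For completeness, here is a small round-trip sketch that writes an array with a header and then recovers both the header text and the data. The helper load_with_header, the temporary path, and the sample array are made up for illustration; the header string is the one from the question.

```python
import os
import tempfile

import numpy as np


def load_with_header(path, delimiter=","):
    """Return (header_text, data) for a file written by np.savetxt with a header."""
    with open(path) as f:
        # np.savetxt prefixes header lines with '# ' by default.
        header = f.readline().lstrip("#").strip()
    # np.loadtxt skips lines starting with '#', so the header is ignored here.
    data = np.loadtxt(path, dtype=float, delimiter=delimiter)
    return header, data


array = np.arange(8.0).reshape(2, 4)
path = os.path.join(tempfile.mkdtemp(), "test.csv")
np.savetxt(path, array, delimiter=",", header="x,y,z, data from monte carlo simulation")

header, data = load_with_header(path)
description = header.split(",")[-1].strip()
print(description)  # data from monte carlo simulation
```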

Conversion of a .fits file

I've got a .fits file and I want to read the data. Unfortunately, I'm not at all familiar with this format. Is there a way to convert it to a table (a .txt file?) so that I can work with it using pandas? I just found pyfits and read some of the documentation, but it's a bit nebulous to me.
Thanks.
The pyfits getdata function returns an ndarray from the file:
from pyfits import getdata
data = getdata(file_name)
Then you can decide which slice(s) to put into your DataFrame(s).
(These nbviewer slides [1] [2] seem like quite a nice primer.)
