Read h5 files through a NAS - python

I'm reading h5 files remotely. Previously they were all on a server, but more recently I had to store some on a NAS device. When I try to read the ones on the NAS, some of them raise the following error:
HDF5ExtError: HDF5 error back trace
File "C:\ci\hdf5_1593121603621\work\src\H5Dio.c", line 199, in H5Dread
can't read data
File "C:\ci\hdf5_1593121603621\work\src\H5Dio.c", line 603, in H5D__read
can't read data
File "C:\ci\hdf5_1593121603621\work\src\H5Dcontig.c", line 621, in H5D__contig_read
contiguous read failed
File "C:\ci\hdf5_1593121603621\work\src\H5Dselect.c", line 283, in H5D__select_read
read error
File "C:\ci\hdf5_1593121603621\work\src\H5Dselect.c", line 218, in H5D__select_io
read error
File "C:\ci\hdf5_1593121603621\work\src\H5Dcontig.c", line 956, in H5D__contig_readvv
can't perform vectorized sieve buffer read
File "C:\ci\hdf5_1593121603621\work\src\H5VM.c", line 1500, in H5VM_opvv
can't perform operation
File "C:\ci\hdf5_1593121603621\work\src\H5Dcontig.c", line 753, in H5D__contig_readvv_sieve_cb
block read failed
File "C:\ci\hdf5_1593121603621\work\src\H5Fio.c", line 118, in H5F_block_read
read through page buffer failed
File "C:\ci\hdf5_1593121603621\work\src\H5PB.c", line 732, in H5PB_read
read through metadata accumulator failed
File "C:\ci\hdf5_1593121603621\work\src\H5Faccum.c", line 260, in H5F__accum_read
driver read request failed
File "C:\ci\hdf5_1593121603621\work\src\H5FDint.c", line 205, in H5FD_read
driver read request failed
File "C:\ci\hdf5_1593121603621\work\src\H5FDsec2.c", line 725, in H5FD_sec2_read
file read failed: time = Tue May 10 11:37:06 2022
, filename = 'Y:/myFolder\myFile.h5', file descriptor = 4, errno = 22, error message = 'Invalid argument', buf = 0000020F03F14040, total read size = 16560000, bytes this sub-read = 16560000, bytes actually read = 18446744073709551615, offset = 480252764
End of HDF5 error back trace
Problems reading the array data.
I don't really understand the error. It always happens for the same files, yet I can open the file and read the data myself with HDFView. If I put the file on the server, I can read it without problems with the same lines of code (the path is correct in both cases):
import pandas as pd

hdf5store = pd.HDFStore(myPath[fich])
datacopy = hdf5store['my_data']
By the way, the error occurs at the second line of code. Right now I don't have access to the server and can't copy the file locally because I don't have enough disk space. Does anyone know how to fix this so I can keep working through the NAS?
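One direction I'm considering (an untested sketch, which assumes the stored object is a plain fixed-format array whose internal HDF5 path I can find with HDFView) is to bypass HDFStore and read the dataset in small slices with h5py, so that no single read request is as large as the ~16 MB read that fails on the NAS:
import h5py
import numpy as np

# '/my_data/block0_values' is a guess at the internal path pandas used when the
# frame was written; check the real structure in HDFView first.
with h5py.File('Y:/myFolder/myFile.h5', 'r') as f:
    dset = f['/my_data/block0_values']
    step = 10000  # rows per read; keep step * bytes_per_row well under 16 MB
    parts = [dset[i:i + step] for i in range(0, dset.shape[0], step)]
    data = np.concatenate(parts)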

Related

OSError: Unable to open file (File signature not found) h5 file

I am trying to read an h5 file:
import h5py
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Reads the training data file. Only the speaker list is needed here, to encode
# the targets with the scikit-learn LabelEncoder.
dataServer = h5py.File('Librispeech_960_train_list.h5', 'r')
sLabels = dataServer['lSpeaker'][:]
encoder_spk = LabelEncoder()
encoder_spk.fit(sLabels)
num_spk_class = np.unique(sLabels).shape[0]
but I get this error:
File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py\h5f.pyx", line 85, in h5py.h5f.open
OSError: Unable to open file (file signature not found)
I get this OSError and I don't know how to solve it.
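One quick check (assuming the path is right and the file is where you think it is) is to look at the first bytes of the file. A valid HDF5 file normally starts with the 8-byte signature \x89HDF\r\n\x1a\n; an empty, truncated, or non-HDF5 file (for example a failed download) would explain the "file signature not found" message:
with open('Librispeech_960_train_list.h5', 'rb') as f:
    print(f.read(8) == b'\x89HDF\r\n\x1a\n')  # True for a normal HDF5 file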

How can I read Minecraft .mca files in Python so that I can extract individual blocks?

I can't find a way of reading Minecraft world files that I can use in Python.
I've looked around the internet but can find no tutorials, only a few libraries that claim they can do this but never actually work.
from nbt import *
nbtfile = nbt.NBTFile("r.0.0.mca",'rb')
I expected this to work, but instead I got errors about the file not being compressed, or something of the sort.
Full error:
Traceback (most recent call last):
File "C:\Users\rober\Desktop\MinePy\MinecraftWorldReader.py", line 2, in <module>
nbtfile = nbt.NBTFile("r.0.0.mca",'rb')
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\site-packages\nbt\nbt.py", line 628, in __init__
self.parse_file()
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\site-packages\nbt\nbt.py", line 652, in parse_file
type = TAG_Byte(buffer=self.file)
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\site-packages\nbt\nbt.py", line 99, in __init__
self._parse_buffer(buffer)
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\site-packages\nbt\nbt.py", line 105, in _parse_buffer
self.value = self.fmt.unpack(buffer.read(self.fmt.size))[0]
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\gzip.py", line 276, in read
return self._buffer.read(size)
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\_compression.py", line 68, in readinto
data = self.read(len(byte_view))
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\gzip.py", line 463, in read
if not self._read_gzip_header():
File "C:\Users\rober\AppData\Local\Programs\Python\Python36-32\lib\gzip.py", line 411, in _read_gzip_header
raise OSError('Not a gzipped file (%r)' % magic)
OSError: Not a gzipped file (b'\x00\x00')
Use anvil-parser (install it with pip install anvil-parser).
Reading
import anvil
region = anvil.Region.from_file('r.0.0.mca')
# You can also provide the region file name instead of the object
chunk = anvil.Chunk.from_region(region, 0, 0)
# If `section` is not provided, will get it from the y coords
# and assume it's global
block = chunk.get_block(0, 0, 0)
print(block) # <Block(minecraft:air)>
print(block.id) # air
print(block.properties) # {}
https://pypi.org/project/anvil-parser/
According to this page, the .mca file is not purely an NBT file. It begins with an 8 KiB header which contains the offsets of the chunks within the region file itself and the timestamps of their last updates.
I recommend looking at the official announcement and this page for more information.
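For illustration, here is a rough sketch (not a full parser) of how that 8 KiB header can be read directly: the first 4 KiB hold 1024 chunk locations (a 3-byte sector offset plus a 1-byte sector count each), the second 4 KiB hold 1024 last-update timestamps, and each chunk payload starts with a 4-byte length and a 1-byte compression type.
import struct
import zlib

with open('r.0.0.mca', 'rb') as f:
    header = f.read(8192)
    # location entry for the chunk at region-local coordinates (x=0, z=0)
    offset = int.from_bytes(header[0:3], 'big') * 4096  # offset in bytes
    sector_count = header[3]
    if offset:
        f.seek(offset)
        length, compression = struct.unpack('>IB', f.read(5))
        raw = f.read(length - 1)
        if compression == 2:  # 2 = zlib, 1 = gzip
            chunk_nbt = zlib.decompress(raw)  # uncompressed NBT data for that chunk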

Parquet file not accessible to write after first read using PyArrow

I am trying to read a parquet file into a pandas DataFrame, do some manipulation, and write it back to the same file. However, the file does not seem to be writable after the first read within the same function.
It only works if I skip STEP 1 below.
Is there any way to unlock the file?
import pyarrow as pa
import pyarrow.parquet as pq

# STEP 1: read the entire parquet file
pq_file = pq.ParquetFile('\dev\abc.parquet')
exp_df = pq_file.read(nthreads=1, use_pandas_metadata=True).to_pandas()
# STEP 2: change some data in the dataframe
# STEP 3: write the merged dataframe back
pyarrow_table = pa.Table.from_pandas(exp_df)
pq.write_table(pyarrow_table, '\dev\abc.parquet', compression='none')
Error:
File "C:\Python36\lib\site-packages\pyarrow\parquet.py", line 943, in
write_table
**kwargs)
File "C:\Python36\lib\site-packages\pyarrow\parquet.py", line 286, in
__init__
**options)
File "_parquet.pyx", line 832, in pyarrow._parquet.ParquetWriter.__cinit__
File "error.pxi", line 79, in pyarrow.lib.check_status
pyarrow.lib.ArrowIOError: Failed to open local file: \dev\abc.parquet ,
error: Invalid argument
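One workaround that might be worth trying (a sketch, not verified on Windows): read through an explicitly opened file handle so the handle is closed before the file is rewritten. pq.ParquetFile appears to keep the source it was given open, which can leave the path locked for writing on Windows.
import pyarrow as pa
import pyarrow.parquet as pq

path = r'\dev\abc.parquet'
with open(path, 'rb') as f:
    exp_df = pq.ParquetFile(f).read(use_pandas_metadata=True).to_pandas()
# the handle is closed here, so the path should be writable again
# ... change some data in exp_df ...
pq.write_table(pa.Table.from_pandas(exp_df), path, compression='none')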

Getting an EOFError when getting large files with Paramiko

I'm trying to write a quick python script to grab some logs with sftp. My first inclination was to use Pysftp, since it seemed like it made it very simple. It worked great, until it got to a larger file. I got an error while getting any file over about 13 MB. I then decided to try writing what I needed directly in Paramiko, rather than relying on the extra layer of Pysftp. After figuring out how to do that, I ended up getting the exact same error. Here's the Paramiko code, as well as the trace from the error I get. Does anyone have any idea why this would have an issue pulling any largish files? Thanks.
import paramiko

# Create transport and connect
transport = paramiko.Transport((host, 22))
transport.connect(username=username, password=password)
sftp = paramiko.SFTPClient.from_transport(transport)
# List of the log files in c:
files = sftp.listdir('c:/logs')
# Now pull them, logging as you go
for f in files:
    if f[0].lower() == 't' or f[:3].lower() == 'std':
        logger.info('Pulling {0}'.format(f))
        sftp.get('c:/logs/{0}'.format(f), output_dir + '/{0}'.format(f))
# Close the connection
sftp.close()
transport.close()
And here's the error:
No handlers could be found for logger "paramiko.transport"
Traceback (most recent call last):
File "pull_logs.py", line 420, in <module> main()
File "pull_logs.py", line 410, in main
pull_logs(username, host, password, location)
File "pull_logs.py", line 142, in pull_logs
sftp.get('c:/logs/{0}'.format(f), output_dir +'/{0}'.format(f))
File "/Users/me/my_site/site_packages/paramiko/sftp_client.py", line 676, in get
size = self.getfo(remotepath, fl, callback)
File "/Users/me/my_site/site_packages/paramiko/sftp_client.py", line 645, in getfo
data = fr.read(32768)
File "/Users/me/my_site/site_packages/paramiko/file.py", line 153, in read
new_data = self._read(read_size)
File "/Users/me/my_site/site_packages/paramiko/sftp_file.py", line 152, in _read
data = self._read_prefetch(size)
File "/Users/me/my_site/site_packages/paramiko/sftp_file.py", line 132, in _read_prefetch
self.sftp._read_response()
File "/Users/me/my_site/site_packages/paramiko/sftp_client.py", line 721, in _read_response
raise SSHException('Server connection dropped: %s' % (str(e),))
paramiko.SSHException: Server connection dropped:
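One thing that might be worth trying (an untested sketch): skip sftp.get(), which prefetches many read requests at once, and copy the file manually in small chunks; some SFTP servers appear to drop the connection under that prefetch load.
remote_path = 'c:/logs/{0}'.format(f)
local_path = output_dir + '/{0}'.format(f)
remote_file = sftp.open(remote_path, 'rb')
local_file = open(local_path, 'wb')
while True:
    data = remote_file.read(32768)  # one outstanding 32 KB request at a time
    if not data:
        break
    local_file.write(data)
remote_file.close()
local_file.close()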

Cannot open my client

I am trying to open my Tryton client, but it is not working.
The text of the error (from the screenshot) is:
File "./tryton", line 66, in <module>
tryton.client.TrytonClient().run()
File "/home/ghassen/work/tryton/tryton/client.py", line 101, in run
main.sig_login()
File "/home/ghassen/work/tryton/tryton/gui/main.py", line 910, in sig_login
res = DBLogin().run()
File "/home/ghassen/work/tryton/tryton/gui/window/dblogin.py", line 579, in run
if (self.profiles.get(profile_name, sectionname)
File "/usr/lib/python2.7/ConfigParser.py", line 618, in get
raise NoOptionError(option, section)
Tryton has a profile file where it saves the known connections, and from the error it seems that this file is corrupted. You can find it under ~/.config/tryton/x.y/profiles.cfg, where x.y corresponds to your version number.
If you don't have any saved profiles, you can remove this file and the client will recreate it the next time it starts.
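If you prefer to keep the old file around, a small sketch like this renames it instead of deleting it (the x.y directory is a placeholder for your actual version number):
import os

path = os.path.expanduser('~/.config/tryton/x.y/profiles.cfg')  # replace x.y with your version
if os.path.exists(path):
    os.rename(path, path + '.bak')  # the client will recreate profiles.cfg on next start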
