Create a new folder on HSDS server with Python - python

How is a new folder created using the h5pyd module in Python?
For example, I have the domain /home/user/ and I want to create a folder /home/user/data1/.
From the command line I can use the following command:
hstouch /home/user/data1/
What is the equivalent in h5pyd?
See below for a simplified example of what I am trying to do.
import h5pyd
import numpy as np
with h5pyd.File("/home/user/data1/myfile.h5", "w") as f:
    dset = f.create_dataset("mydataset", (100,), dtype='i')
However, because /home/user/data1/ does not exist, I get a 404 error.

You would just do:
h5pyd.Folder("/home/user/data1/", mode="w")
There's a test case for this at:
https://github.com/HDFGroup/h5pyd/blob/master/test/hl/test_folder.py#L178
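Putting the answer together with the code from the question, a minimal sketch might look like the following. The helper names parent_folder and create_file_in_new_folder are hypothetical, introduced only for illustration. The sketch assumes h5pyd is installed and configured for a reachable HSDS endpoint, and that the parent folder does not exist yet; what happens when the folder already exists is not covered here.

```python
import posixpath


def parent_folder(domain_path):
    """Return the folder that contains domain_path, with a trailing slash."""
    return posixpath.dirname(domain_path.rstrip("/")) + "/"


def create_file_in_new_folder(domain_path):
    """Create the parent folder on the server, then a file with one dataset.

    Assumption: h5pyd is configured for a reachable HSDS endpoint and the
    parent folder does not exist yet.
    """
    import h5pyd  # imported here so the path helper works without a server

    # Create the folder first, as in the answer above
    folder = h5pyd.Folder(parent_folder(domain_path), mode="w")
    folder.close()

    # The file can now be created inside the new folder
    with h5pyd.File(domain_path, "w") as f:
        f.create_dataset("mydataset", (100,), dtype="i")
```

With a server available, calling create_file_in_new_folder("/home/user/data1/myfile.h5") would first create /home/user/data1/ and then the file inside it.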

Related

Loading .dat file in python

I wrote a simple script that loads a data file called 'original.dat' from a folder named 'data' on my computer. The code was working well and I was able to view my spectrum plot. This morning I ran the code again, but it crashed with the error "OSError: data/original.dat not found", even though nothing had changed. The file is in fact still in the folder named 'data' and there are no spelling mistakes. Can anyone help me understand why it is now giving me this error? The code was working perfectly the day before.
Here is the code I used:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
OPUSimaginary = np.loadtxt('data/original.dat', delimiter = ',')
[Images: data file position; error: can't find the file; error: suggested code to find file]
A few things you can do to avoid file-not-found issues:
Add a check that the file exists before trying to load it.
Example:
import os
os.path.isfile(fname)  # Returns True if the file is found
Make sure the file permissions are correct and the file can be read:
sudo chmod -R a+rwx <file-name>
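A relative path such as 'data/original.dat' is resolved against the current working directory, so one possible cause (an assumption, not confirmed by the question) is that the notebook was started from a different directory than the day before. The sketch below builds the path from an explicit base directory and reports the working directory when the file is missing. The function name find_data_file and the temporary directory are hypothetical and only serve the demonstration.

```python
import tempfile
from pathlib import Path


def find_data_file(relative_name, base_dir):
    """Return the path to a data file under base_dir, or raise a clear error."""
    path = Path(base_dir) / relative_name
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; current working directory is {Path.cwd()}"
        )
    return path


# Demonstration: a temporary directory stands in for the project folder
with tempfile.TemporaryDirectory() as project_dir:
    data_dir = Path(project_dir) / "data"
    data_dir.mkdir()
    (data_dir / "original.dat").write_text("1.0,2.0\n3.0,4.0\n")

    found = find_data_file("data/original.dat", project_dir)
    found_name = found.name

    try:
        find_data_file("data/missing.dat", project_dir)
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True
```

The returned path can be passed directly to np.loadtxt, which makes the load independent of where the notebook was launched.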

How to access the global environment in R using rpy2 in Python?

I am trying to access a data frame from the R global environment and import it into Python in the PyCharm IDE, but I cannot figure out how to do it.
I tried the following:
Since I don't know how to access the global environment where my target data.frame is stored, I created another R script (myscript.R) in which I saved the data.frame as an rds file and read it back.
save(dfcast, file = "forecast.rds")
my_data <- readRDS(file = "forecast.rds")
However, when I try to read the rds file using the following Python code:
import os
import pandas as pd
import rpy2.robjects as robjects
from rpy2.robjects import pandas2ri
from rpy2.robjects.packages import SignatureTranslatedAnonymousPackage
cwd = os.getcwd()
pandas2ri.activate()
os.chdir('C:/Users/xx/myscript.R')
readRDS = robjects.r['readRDS']
df = readRDS('forecast.rds')
df = pandas2ri.ri2py(df)
df.head()
I get the following error:
Error in gzfile(file, "rb") : cannot open the connection
In addition: Warning message:
In gzfile(file, "rb") :
cannot open compressed file 'forecast.rds', probable reason 'No such file or directory'
Please show me how to deal with this. I just want to access a data.frame from R in Python.
The data.frame is a forecast generated by another R script that takes about 7-8 minutes to run. Instead of running it again from Python, I want it to be processed in R and then import the forecast data frame into Python for further analysis. I am in the middle of building that analysis module, and I don't want the R forecast function to run again and again while I am debugging it. Hence, I want to access the result directly from R.

Read dictionary from file

Background (optional)
I am writing a Python script to analyse Abaqus (finite element software) outputs. This software generates an ".odb" file, which has a proprietary format. However, you can access the data stored inside the database thanks to Python libraries specially developed by Dassault (owner of the Abaqus software). The Python script has to be run by the software in order to access these libraries:
abaqus python myScript.py
However, it is really hard to use new libraries this way, and I cannot make it run the matplotlib library. So I would like to export the array I created to a file, and access it later with another script that does not need to be run using Abaqus.
The Problem
In order to manipulate the data, I am using collections. For example:
coord2s11=defaultdict(list)
This array stores the Z coordinate of a group of nodes and their stress value, at each time step:
coord2s11[time_step][node_number][0]=z_coordinate
coord2s11[time_step][node_number][1]=stress_value
For a given time step, the output would be :
defaultdict(<type 'list'>, {52101: [-61.83229635920749, 0.31428813934326172], 52102: [-51.948098314163417, 0.31094224750995636],[...], 52152: [440.18335942363655, -0.11255115270614624]})
And the glob (for all step time):
defaultdict(<type 'list'>, {0.0: defaultdict(<type 'list'>, {52101: [0.0, 0.0],[...]}), 12.660835266113281: defaultdict(<type 'list'>, {52101: [0.0, 0.0],[...],52152: [497.74876378582229, -0.24295337498188019]})})
It may be visually unpleasant, but it is rather easy to use! I wrote this array to a file using:
with open('node2coord.dat','w') as f:
    f.write(str(glob))
I tried to follow the solution I found in this post, but when I try to read the file and store the value in a new dictionary
import ast
with open('node2coord.dat', 'r') as f:
    s = f.read()
node2coord = ast.literal_eval(s)
I end up with a SyntaxError: invalid syntax, which I guess comes from the defaultdict(<type 'list'> occurrences throughout the array.
Is there a way to get the data stored inside the file, or should I change the way it is written to the file? Ideally I would like to recreate exactly the same array I stored.
The solution by Joel Johnson
Creating a database using shelve. It is an easy and fast method. The following code did the trick for me to create the db :
import os
import shelve
curdir = os.path.dirname(__file__) #defining current directory
d = shelve.open(os.path.join(curdir, 'nameOfTheDataBase')) #creation of the db
d['keyLabel'] = glob # storing the dictionary in "d" under the key 'keyLabel'
d.close() # close the db
The "with" statement did not work for me.
And then to open it again :
import os
import shelve
curdir = os.path.dirname(__file__)
d = shelve.open(os.path.join(curdir, 'nameOfTheDataBase')) #opening the db
newDictionary = d['keyLabel'] #loading the dictionary inside of newDictionary
d.close()
If you ever get an error saying
ImportError: No module named gdbm
Just install the gdbm module. For linux :
sudo apt-get install python-gdbm
More information here
If you have access to shelve (which I think you would because it's part of the standard library) I would highly recommend using that. Using shelve is an easy way to store and load python objects without manually parsing and reconstructing them.
import shelve
with shelve.open('myData') as s:
    s["glob"] = glob
That's it for storing the data. Then when you need to retrieve it...
import shelve
with shelve.open('myData') as s:
    glob = s["glob"]
It's as simple as that.
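As a self-contained illustration of the round trip, here is a minimal sketch using a small stand-in for the nested defaultdict from the question. The variable name glob_data, the sample values, and the temporary directory are assumptions for the demonstration. Note that using shelve.open in a "with" statement requires Python 3.4 or later, which may explain why it did not work under the Python interpreter bundled with Abaqus; on older versions, calling close() explicitly as in the first answer is the alternative.

```python
import os
import shelve
import tempfile
from collections import defaultdict

# Small stand-in for the nested structure: {time_step: {node: [z, stress]}}
inner = defaultdict(list)
inner[52101] = [0.0, 0.0]
inner[52152] = [497.74876378582229, -0.24295337498188019]
glob_data = defaultdict(list)
glob_data[12.660835266113281] = inner

with tempfile.TemporaryDirectory() as tmp_dir:
    db_path = os.path.join(tmp_dir, "myData")

    # Store the whole structure under one key
    with shelve.open(db_path) as s:
        s["glob"] = glob_data

    # Reopen the shelf and load the structure back
    with shelve.open(db_path) as s:
        restored = s["glob"]
```

Because shelve pickles the stored object, the restored value is again a defaultdict with the same nested contents, so no parsing of the printed representation is needed.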

Import csv Python with Spyder

I am trying to import a csv file into Python but it doesn't seem to work unless I use the Import Data icon.
I've never used Python before, so apologies if I am doing something obviously wrong. I use R and I am trying to replicate in Python the same tasks I do in R.
Here is some sample code:
import pandas as pd
import os
Main_Path = "C:/Users/fan0ia/Documents/Python_Files"
Area = "Pricing"
Project = "Elasticity"
Path = os.path.join(Main_Path, Area, Project)
os.chdir(Path)
#Read in the data
Seasons = pd.read_csv("seasons.csv")
Dep_Sec_Key = pd.read_csv("DepSecKey.csv")
These files import without any issues but when I execute the following:
UOM = pd.read_csv("FINAL_UOM.csv")
Nothing shows in the variable explorer panel and I get this in the IPython console:
In [3]: UOM = pd.read_csv("FINAL_UOM.csv")
If I use the Import Data icon and use the wizard selecting DataFrame on the preview tab it works fine.
The same file imports into R with the same kind of command so I don't know what I am doing wrong? Is there any way to see what code was generated by the wizard so I can compare it to mine?
It turns out the data had imported; it just wasn't showing in the variable explorer.
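One way to confirm that a read_csv call succeeded without relying on the variable explorer is to inspect the resulting object in the console. The sketch below is only an illustration: the CSV contents are made up and read from an in-memory string instead of FINAL_UOM.csv, and it assumes pandas is installed.

```python
import io

import pandas as pd

# Hypothetical CSV contents standing in for FINAL_UOM.csv
csv_text = "item,uom\nA,kg\nB,litre\n"
UOM = pd.read_csv(io.StringIO(csv_text))

# If these print without error, the data was imported
print(type(UOM).__name__)
print(UOM.shape)
print(UOM.head())
```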

Error: Line magic function

I'm trying to read a file using python and I keep getting this error
ERROR: Line magic function `%user_vars` not found.
My code is very basic just
names = read_csv('Combined data.csv')
names.head()
I get this any time I try to read or open a file. I tried using this thread for help:
ERROR: Line magic function `%matplotlib` not found
I'm using Enthought Canopy and I have IPython version 2.4.1. I made sure to update using the IPython installation page. I'm not sure what's wrong, because it should be very simple to open and read files. I even get this error when opening text files.
EDIT:
I imported traceback and used
print(traceback.format_exc())
But all I get is None printed. I'm not sure what that means.
It looks like you are using pandas. Try the following (assuming your csv file is in the same directory as your script), and enter it one line at a time if you are using the IPython shell:
import pandas as pd
names = pd.read_csv('Combined data.csv')
names.head()
