Is there a way to find the index of a variable in SPSS Python?
For example, suppose one of the variables in my SPSS dataset is ID.
Usually, I would be able to access the variable with the following code:
varObj = datasetObj.varlist[0]
Assuming that ID is the first column in my dataset.
But what if the variable ID is lost somewhere in the middle of a dataset?
Is there a way for me to find the index value of the variable ID?
Thank you very much for your help.
From the documentation of the Variable Class, you can get a reference to the variable by name or by index:
# Create a Variable object, specifying the variable by name
varObj = datasetObj.varlist['bdate']
# Create a Variable object, specifying the variable by index
varObj = datasetObj.varlist[3]
So in your case:
varObj = datasetObj.varlist['ID']
You can, if needed, get the index of the variable by its name, using the index property:
varIndex = datasetObj.varlist['ID'].index
Note also that you can use the spssaux.VariableDict class to get and set (except for type) all the properties of variables.
Also, all the documentation for the programmability APIs is available under Help > Programmability, and you may find the Programming and Data Management book (PDF) downloadable from the SPSS Community (old) or the new Predictive Analytics website at https://developer.ibm.com/predictiveanalytics/
I am using Python. I would like to create a new column which is the log transformation of column 'lights1992'.
I am using the following code:
log_lights1992 = np.log(lights1992)
I obtain the following error:
I have tried two things: adding 1 to each value, and converting the column 'lights1992' to numeric.
city_join['lights1992'] = pd.to_numeric(city_join['lights1992'])
city_join["lights1992"] = city_join["lights1992"] + 1
However, neither of these two solutions has worked. The variable 'lights1992' is of type float64. Do you know what the problem could be?
Edit:
The variable 'lights1992' comes from running zonal statistics on a raster 'junk1992'; maybe this has an effect.
zs1 = zonal_stats(city_join, junk1992, stats=['mean'], nodata=np.nan)
city_join['lights1992'] = [x['mean'] for x in zs1]
The traceback states:
'DatasetReader' object has no attribute 'log'.
Did you re-assign numpy to something else at some point? I can't find much about 'DatasetReader'; is that a custom class?
EDIT:
I think you would need to pass the whole column, because your edit doesn't show a variable named 'lights1992'.
So instead of:
np.log(lights1992)
can you try passing the DataFrame's column to log?
np.log(city_join['lights1992'])
2ND EDIT:
Since you've reported back that it works I'll dive into the why a little bit.
In your original statement you called the log function and gave it an argument, then you assigned the result to a variable name:
log_lights1992 = np.log(lights1992)
The problem here is that when you give Python text without any quotes, it treats that text as a variable name. (See how you have log_lights1992 on the left of the equals sign? You wanted to assign the result of the operation on the right-hand side to the variable name log_lights1992.) But in this case I don't think lights1992 had any value!
So there were two ways to make it work, either what I said earlier:
Instead of giving it a variable name, you give .log the column of the city_join dataframe directly (that's what city_join["lights1992"] is).
Or
You assign the value of that column to the variable name first then you pass it in to .log, like this:
lights1992 = city_join["lights1992"]
log_lights1992 = np.log(lights1992)
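To make the two options concrete, here is a minimal sketch. The small city_join frame and its values are made up for illustration (they are not your data), and the + 1 is only there so that zero-valued rows don't turn into -inf:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for city_join; the values are invented for illustration.
city_join = pd.DataFrame({"lights1992": [0.0, 1.5, 10.0]})

# Option 1: pass the DataFrame column to np.log directly and store the
# result as a new column. Adding 1 first avoids log(0) = -inf.
city_join["log_lights1992"] = np.log(city_join["lights1992"] + 1)

# Option 2: bind the column to a variable name first, then pass that name.
lights1992 = city_join["lights1992"]
log_lights1992 = np.log(lights1992 + 1)

print(city_join)
```

Both options compute the same values; the first one also keeps the result next to the original column in the DataFrame.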
Hope that clears it up for you!
I am using Python 3.9.6 on Windows 10.
Similar earlier questions at
(1) Creating a dynamic dictionary name
and
(2) How to obtain assembly name dynamically
were different and do not solve my problem.
My data dictionary (dynamically created):
pcm30={'ABB': '-0.92', 'ZYDUSWELL': 2.05}
The dynamically obtained new dictionary name "pCh0109" is stored in the variable z.
I have to create several such dictionaries in order to build a data frame.
Now I want to dynamically (i.e. through programming) change the name of the dictionary from
'pcm30'
to
'pCh0109'.
The digits in the new name of the dictionary ('pCh0109') indicate the time of creation of the particular dictionary.
How can I do it?
I will be grateful for any assistance and help.
I would strongly recommend you don't try this unless you absolutely have to, but here's the simplest approach to do that:
pcm30 = {'ABB': '-0.92', 'ZYDUSWELL': 2.05}
globals()['pCh0109'] = globals().pop('pcm30')
# Your IDE might glare at you here, but it'll work out without errors at runtime
print(pCh0109)
Instead, I suggest trying this approach: use a dictionary if possible. This will turn out much safer for everyone. Example below:
def my_func():
    d = {}
    pcm30 = {'ABB': '-0.92', 'ZYDUSWELL': 2.05}
    d['pCh0109'] = locals().pop('pcm30')
    print(d['pCh0109']['ABB'])
    # -0.92
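The same idea also works without touching locals() or globals() at all. A rough sketch, where registry and build_registry are names made up for illustration and z holds the dynamically obtained name as in the question:

```python
def build_registry():
    """Collect dictionaries under names that are only known at runtime."""
    registry = {}
    pcm30 = {'ABB': '-0.92', 'ZYDUSWELL': 2.05}
    z = 'pCh0109'  # dynamically obtained name, as described in the question
    registry[z] = pcm30  # the runtime name becomes a key, not a variable
    return registry


registry = build_registry()
print(registry['pCh0109']['ABB'])
```

Each new dictionary simply becomes another key in registry, so looping over registry.items() later gives you both the name and the data.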
I have an imports dictionary with:
keys equal to names of new variables I would like to build, for example dataset_1, dataset_2 etc.
values being the pandas DataFrames (the type of each value is pd.DataFrame)
What I would like to achieve is to create len(keys) new variables. The name of each variable would be the name of its key, and the variable would hold the respective pd.DataFrame.
The code below doesn't work, and in any case I have a strong feeling that it's a bad approach and that a 'regular programmer' would do this another way.
for key in imports.keys():
    import_str = '{} = imports.get({})'.format(key, key)
    globalize = 'global {}'.format(key)
    exec(globalize)
    exec(import_str)
Can you please advise how to proceed?
How can I manipulate a DataFrame name within a function so that I can have a new DataFrame with a new name that is derived from the input DataFrame name in return?
Let's say I have this:
def some_func(df):
    # some operations
    return df_copy
Whatever df I pass to this function, it should return the new df named ..._copy; e.g. some_func(my_frame) should return my_frame_copy.
Things that I considered are as follows:
As in string operations;
new_df_name = "{}_copy".format(df) -- I know this will not work since the df refers to an object but it just helps to explain what I am trying to do.
def date_timer(df):
    df_copy = df.copy()
    dates = df_copy.columns[df_copy.columns.str.contains('date')]
    for i in range(len(dates)):
        df_copy[dates[i]] = pd.to_datetime(df_copy[dates[i]].str.replace('T', ' '), errors='coerce')
    return df_copy
Actually, this was the first thing that I tried. If only DataFrame had a "name" attribute that allowed us to manipulate the name, but this is not there either:
df.name
Maybe an f-string or some other kind of string operation could make it happen. If not, it might not be possible to do in Python.
I think this might be related to the variable name assignment rules in Python. In a sense, what I want is to reverse engineer that, but it is probably not possible.
Please advise...
It looks like you're trying to access / dynamically set the global/local namespace of a variable from your program.
Unless your data object belongs to a more structured namespace object, I'd discourage you from dynamically setting names with such a method since a lot can go wrong, as per the docs:
Changes may not affect the values of local and free variables used by the interpreter.
The name attribute of your df is not an ideal solution, since that attribute is not set by default, nor is it particularly common. However, here is a solid SO answer which addresses this.
You might be better off storing your data objects in a dictionary, using dates or something meaningful as keys. Example:
my_data = {}
for my_date in dates:
    df_temp = df.copy(deep=True)  # deep copy ensures no changes are translated to the parent object
    # Modify your df here (not sure what you are trying to do exactly)
    df_temp[my_date] = "foo"
    # Now save that df
    my_data[my_date] = df_temp
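Applied to the "_copy" naming from the question, the dictionary approach could look like the sketch below. The frames dictionary, the start_date column, and the sample values are assumptions made up for illustration; date_timer is the function from the question, only tidied up:

```python
import pandas as pd


def date_timer(df):
    """Return a copy of df with every column whose name contains 'date' parsed as datetime."""
    df_copy = df.copy()
    for col in df_copy.columns[df_copy.columns.str.contains('date')]:
        df_copy[col] = pd.to_datetime(df_copy[col].str.replace('T', ' '), errors='coerce')
    return df_copy


# Hypothetical input: frames are stored under their names instead of in separate variables.
frames = {"my_frame": pd.DataFrame({"start_date": ["2021-01-02T03:04:05", "not a date"]})}

# Derive each new name from the old key; the comprehension is fully built
# before update() runs, so the dictionary is not modified while iterating.
frames.update({f"{name}_copy": date_timer(df) for name, df in frames.items()})

print(sorted(frames))
```

Because the name lives in the key rather than in a variable name, ordinary string formatting is enough to build "my_frame_copy".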
Hope this answers your Q. Feel free to clarify in the comments.
How can I access only the value of the variable?
Example:
When I print forces[0,1], I obtain: gurobi.Var forces[0,1] (value -0.6)
How can I access only the value of the variable, -0.6?
Thanks in advance.
I assume you are using Python?
If forces[0,1] is a Gurobi variable object and a value is available, you can access the value (more precisely, the current solution) with:
forces[0,1].X
This is of course explained in the docs. Look for variable attributes! (.X)
(Above links are for Gurobi's Python-API; look up the corresponding docs for other APIs if needed)