Python in SPSS - KEEP variables - python

I have selected the variables I need based on a string within the variable name. I'm not sure how to keep only these variables from my SPSS file.
begin program.
import spss,spssaux
spssaux.OpenDataFile(r'XXXX.sav')
target_string = 'qb2'
variables = [var for var in spssaux.GetVariableNamesList() if target_string in var]
vars = spssaux.VariableDict().expand(variables)
nvars=len(vars)
for i in range(nvars):
    print vars[i]
spss.Submit(r"""
SAVE OUTFILE='XXXX_reduced.sav'.
ADD FILES FILE=* /KEEP \n %s.
""" %(vars))
end program.
The list of variables that it prints out is correct, but it's falling over when it tries to KEEP them. I'm guessing the errors have something to do with not activating a dataset or not bringing the file in again?

Have you tried reversing the order of the SAVE OUTFILE and ADD FILES commands? I haven't run this in SPSS via Python, but in standard SPSS, your syntax will write the file to disk, and then select the variables for the active version in memory--so if you later access the saved file, it will be the version before you selected variables.
If that doesn't work, can you explain what you mean by falling over trying to KEEP them?

It appears that the problem has been solved, but I would like to point out another solution that can be done without writing any Python code. The extension command SPSSINC SELECT VARIABLES defines a macro based on properties of the variables. This can be used in the ADD FILES command.
SPSSINC SELECT VARIABLES MACRONAME="!selected"
/PROPERTIES PATTERN = ".*qb2".
ADD FILES /FILE=* /KEEP !selected.
The SELECT VARIABLES command is actually implemented in Python. Its selection criteria can also include other metadata such as type and measurement level.

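You'll want to run the ADD FILES FILE command before the SAVE so that your saved file is the "reduced" file.
I think the very last line in your Python program should join the elements of the list vars into one string. Applying %s to a list inserts its Python representation (brackets, quotes and commas) rather than a space-separated variable list. For example: %( " ".join(vars) )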
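Putting the two suggestions above together, here is a minimal sketch of how the syntax string could be built. The helper name build_keep_syntax and the variable names qb2_1 and qb2_2 are made up for illustration; only the string handling is shown, so it runs without SPSS installed.

```python
def build_keep_syntax(variable_names, outfile):
    """Return SPSS syntax that keeps only the named variables, then saves."""
    # Join the list into "var1 var2 ..." instead of formatting the list itself.
    keep_list = " ".join(variable_names)
    # ADD FILES comes first so that SAVE writes the reduced dataset.
    return (
        "ADD FILES FILE=* /KEEP %s.\n"
        "SAVE OUTFILE='%s'.\n" % (keep_list, outfile)
    )


# Hypothetical variable names standing in for the expanded list `vars`.
syntax = build_keep_syntax(["qb2_1", "qb2_2"], "XXXX_reduced.sav")
print(syntax)
```

Inside the begin program block, the resulting string would then be passed to spss.Submit(syntax).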

Related

how to change boolean value thats another file

I'm trying to make something that can detect whether or not a user wants their password stored for the next time they run the program. I'm using a boolean for this so that the files that need the password can check whether storepass is True and then get the password/username from a .env file; if not, they can get the password from the storepasswordquestion file and use that, and it won't be stored when the user closes the program.
I have a file called booleans that im using to store booleans, in it is this:
storepass = False
In a other file called storepasswordquestion i have:
import booleans
username = 'username'
password = 'password'
question = input('Would you like your password to be stored for use next time?') # enter y
if question == 'y':
    booleans.storepass = True
    # store password/username in .env
As I understand it, import booleans loads the booleans file. Then I think booleans.storepass is like a variable but like a copy of the one in booleans? Because when I go to the booleans file again it's still False.
I'm needing to change storepass to True in the booleans file and then save it.
Then I think booleans.storepass is like a variable but like a copy of the one in booleans? Because when I go to the booleans file again it's still False.
That's correct - you can't change the values inside a .py file just by importing it and then setting them from another .py file. The standard way to manipulate files is by using some variation of
with open('booleans.py', 'w') as f:
    f.write('storepass = False')
Personally, I really dislike writing over .py files like this; I usually save as JSON. So "boolean.json" can have just
{"storepass": false}
and then in your python code you can (instead of importing) get it as
import json
with open('boolean.json', 'r') as f:
    boolean = json.load(f)
and set storepass with
# import json
boolean['storepass'] = True  # json.load returns a dict, so use key access
with open('boolean.json', 'w') as f:
    json.dump(boolean, f, indent=4)
    ## NOT f.write(boolean)
and this way, if there are more values in boolean, they'll also be preserved (as long as you don't use the variable for anything else in between...)
You're expecting changing a variable's value to change source code. Luckily, that's not how this works!
You need to make a mental distinction between the source code that the python interpreter reads, and the values of variables.
What you need is indeed, as you say,
I'm needing to change storepass to True in the booleans file and then save it.
So, you would need to open that booleans.py as a text file, not as a Python program, read and interpret it as pairs of keys and values, modify the right one (so, storepass in this case), and write the result back to the file. Then, the next time someone imports it, they would see the different setting.
This is a bad approach you've chosen:
Reading and parsing python files is very hard in general, because, well, Python is a programming language and can do much more than just store settings
The change only takes effect the next time the program is run and the settings are imported anew. This has no effect on other parts of an already running program. So, that's bad.
Things get a lot easier if you do two things:
Get rid of the idea to store settings in a Python source code file. Instead, use one of the multiple ways that python has to read structured data from a file. For your use case, usage of the configparser module might be easiest – a ConfigParser has a read and a write method, with which you can, you guessed it, read and write your config file. There's multiple other modules that Python brings that can do similar things. The json module, for example, is a sensible way to store especially logically hierarchically structured settings. And of course, the place where you store passwords might also be a place where you could store settings – depending on what that is.
The approach of having a single module that you import (your booleans) is a good one, as it ensures that there's a single booleans that is the "source of truth" about these settings. I propose you put the configuration loading into such a single module:
# this is settings.py
import json

SETTINGSFILE = "settings.json"

# when this is loaded the first time, load things from the settings file
try:
    with open(SETTINGSFILE) as sfile:
        _settings = json.load(sfile)
except FileNotFoundError:
    # no settings file yet. We'll just have empty settings
    # Saving these will create a file
    _settings = {}

def __getattr__(settingname):
    try:
        return _settings[settingname]
    except KeyError as e:
        # you could implement "default settings" by checking
        # in a default settings dictionary that you create above
        # but here, we simply say: nay, that setting doesn't yet
        # exist. Which is as good as your code would have done!
        raise AttributeError(*e.args) from e

def save():
    with open(SETTINGSFILE, "w") as sfile:
        json.dump(_settings, sfile)

def update_setting(settingname, value):
    _settings[settingname] = value
    save()
which you can then use like this:
import settings
…
if settings.storepass:
    "Do what you want here"
…
# User said they want to change the setting:
should_we_store = (input("should we store the password? ").upper() == "Y")
# this updates the setting, i.e. it's instantly visible
# in other parts of the program, and also stores the changed config
# to the settings.json file
settings.update_setting("storepass", should_we_store)
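The configparser route mentioned above can be sketched in a similar way. This is only an illustration under assumptions: the file name settings.ini, the section name options, and the helper names load_storepass and save_storepass are all made up, and a temporary directory is used so the sketch does not touch real files.

```python
import configparser
import os
import tempfile

# Hypothetical location; a temporary directory keeps the sketch self-contained.
settings_path = os.path.join(tempfile.mkdtemp(), "settings.ini")


def load_storepass(path):
    """Read the storepass flag, defaulting to False if nothing is stored yet."""
    parser = configparser.ConfigParser()
    parser.read(path)  # a missing file is skipped without raising
    return parser.getboolean("options", "storepass", fallback=False)


def save_storepass(path, value):
    """Write the storepass flag, keeping any other options already in the file."""
    parser = configparser.ConfigParser()
    parser.read(path)
    if not parser.has_section("options"):
        parser.add_section("options")
    parser.set("options", "storepass", "true" if value else "false")
    with open(path, "w") as f:
        parser.write(f)


before = load_storepass(settings_path)   # nothing saved yet
save_storepass(settings_path, True)
after = load_storepass(settings_path)    # read back from disk
print(before, after)
```

The resulting file is plain text that a user can also edit by hand, which is one reason to prefer it over rewriting a .py file.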
You are changing the value at runtime, so within the running program it really does change. If you print the value inside the if block, like this:
booleans.storepass = True
print(booleans.storepass)  # This should print True
But your problem is that you are not actually changing the file. In order to change the file, you should write to it like this (this is the general way to write to a file with Python; from my research it should work in your case as well):
f = open("booleans.py", "w")
f.write("storepass = True\n")
f.close()
In short, what you are doing does work at runtime, but if you want to store the result in a file, this is the way to do it.

define a variable then reference it in another place

I'm trying to write a program where the input and output names are listed at the very beginning.
After it runs through, the output is then generated.
Eg.
### First step. import files and assign names
df1= pd.read_csv(r'df1.csv',low_memory=False)
output file name = final_output
### final step. Output files and name it as 'final_output.csv'
df_final.to_csv('output file name.csv')
What I'm trying to do is define the name of the output file at the very beginning, then reference it at the end, rather than naming it manually at the very end of the program.
Something in SAS would be : Define A = 'output file name'. Reference it using "&A" at the very end.
But how to make it happen in python?
As one of the commenters mentioned, you could include the extension in the filename and then just pass that variable to .to_csv(). If that's not possible, it seems to me like you're looking to use string formatting. You could try this:
import pandas as pd
df1 = pd.read_csv(r'df1.csv', low_memory=False)
output_file_name = 'final_output'  # variable names cannot contain spaces
### final step. Output files and name it as 'final_output.csv'
df_final.to_csv(f'{output_file_name}.csv')
Using f-strings like this is a compact and clear way to do string formatting and concatenation with a variable in-line.
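As a self-contained illustration of the same pattern, the sketch below names the output once at the top and only references it at the end. The rows list is a stand-in for df_final, the csv module stands in for to_csv, and the temporary output directory is an assumption made so the example runs anywhere.

```python
import csv
import os
import tempfile

# --- first step: name everything up front ---
output_dir = tempfile.mkdtemp()      # hypothetical location for this sketch
output_file_name = "final_output"    # defined once, at the top

# ... processing would happen here; these rows stand in for df_final ...
rows = [["id", "value"], [1, "a"], [2, "b"]]

# --- final step: reference the name instead of retyping it ---
output_path = os.path.join(output_dir, f"{output_file_name}.csv")
with open(output_path, "w", newline="") as f:
    csv.writer(f).writerows(rows)

print(output_path)
```

With pandas, the last step would be df_final.to_csv(output_path) instead of the csv writer.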

Batch Scripting-Python

I am new to python, I am creating an application which requires multiple batch files to be generated. There are only few minor differences in every batch files. Is there any way i can create a batch file using code?
Yes, you can
A batch file is just a normal text file that is executable. To make the file executable, you need to:
Set file permissions (on Linux via chmod +x)
Tell the computer how to execute the file; this is done via the file extension on Windows and via the shebang on Unix systems
Now that we have covered the basics, we can look at how to create the file.
Since you say your files are very similar, you should create a template for your file.
This can be done using a multi-line (triple-quoted) string or a separate file; it is important, however, that you replace every position in the file where you want a custom value with {}.
Using this template, you can use str.format and its format mini-language to generate your file.
In Python, this looks roughly like this:
template = """#!/bin/bash\nMy template with {} specification and {} specification"""
spec_1 = "example"
spec_2 = "present"
with open('my_batch_file', 'w') as file:
    file.write(template.format(spec_1, spec_2))
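Since the question is about generating several nearly identical files, the template idea above could be extended with a loop. This is a rough sketch under assumptions: the template text, the field names dataset and mode, the run_<dataset>.bat naming scheme, and the temporary output directory are all invented for illustration.

```python
import os
import tempfile

# Hypothetical Windows batch template; named fields mark the parts that vary.
TEMPLATE = "@echo off\necho Processing {dataset} with option {mode}\n"

# One dict per batch file to generate (made-up values).
jobs = [
    {"dataset": "north", "mode": "fast"},
    {"dataset": "south", "mode": "full"},
]

out_dir = tempfile.mkdtemp()  # stand-in for wherever the files should live
written = []
for job in jobs:
    path = os.path.join(out_dir, "run_{dataset}.bat".format(**job))
    with open(path, "w") as f:
        f.write(TEMPLATE.format(**job))
    written.append(path)

print(written)
```

Named fields make the template easier to read than positional {} once there are more than two or three differences between files.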

Python: storing and using information created in a function

I just began with Python and am having a little difficulty with storing the result of a function in a variable.
I have a small script that does the following:
change to a directory and within that directory:
create a new directory named after the moment it was created (for example 2016200420161636)
What I want it to do additionally:
create a file within that newly created directory
I think that, to create the file in the newly created directory, I need to store the directory name (2016200420161636) and return it to the part of the script that creates the file (so it knows where to write the file).
Can someone please advise?
Are you just looking to save the value before you create the directory?
Something like:
import os
import time
timestamp = time.strftime("%Y%m%d%H%M%S")  # a string, usable as a directory name
if not os.path.exists(timestamp):
    os.makedirs(timestamp)
Now timestamp holds the directory name, and you can use it wherever needed.
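To cover the "return the value" part of the question, the same idea can be wrapped in a function that returns the new path. The function name make_timestamped_dir, the file name notes.txt, and the temporary parent directory are assumptions made for this sketch.

```python
import os
import tempfile
import time


def make_timestamped_dir(parent):
    """Create a directory named after the current time and return its path."""
    name = time.strftime("%Y%m%d%H%M%S")
    path = os.path.join(parent, name)
    os.makedirs(path, exist_ok=True)
    return path


parent = tempfile.mkdtemp()            # stand-in for the directory you change to
new_dir = make_timestamped_dir(parent)

# The returned path tells the rest of the script where to write the file.
file_path = os.path.join(new_dir, "notes.txt")
with open(file_path, "w") as f:
    f.write("created inside the new directory\n")

print(file_path)
```

Returning the full path rather than just the name means the caller does not depend on the current working directory.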

How to permanently update a path variable in a python script

I'd like to know if there's a way to dynamically edit the lines in my scripts depending on user input from an arcpy tool that I have created. The interface will allow the user to pick a file, and then there's a boolean check to permanently change the default directory for later runs of the program.
if you are familiar with arcpy:
def example():
    defaultPath="C:\\database.dbf"
    path=arcpy.GetParameterAsText(0)#returns a string of a directory
    userCheck=arcpy.GetParameterAsText(1)#returns a string "TRUE"/"FALSE"
    if path=="":
        path=defaultPath
The goal here is to change "defaultPath" permanently if "userCheck" is "TRUE". I think it's possible to do that with the use of classes? Or do I need to have an "index" table and refer to its cells (like an Excel sheet)?
Thanks
