I am trying to loop through multiple files and extract a calculated variable from each one under its own name (i.e. max_value[1], max_value[2], ...). I am currently using a dictionary to store each value.
### Create dictionary
max_value = dict()
### Loop through files using glob
for file in glob.glob('files'):
    ### Do calculations using file variables
    calculated_value = 10
    ### Store calculated value in dictionary
    for x in range(1,num_files+1):
        max_value[x] = calculated_value
However, the nested for loop overwrites every previously saved max_value with the calculated_value of the last file. How can I avoid overwriting each max_value in the dictionary with the last file's value?
I think this is what you want:
import glob
### Create dictionary
max_value = dict()
### Loop through files using glob
for i, file in enumerate(glob.glob('files')):
    ### Do calculations using file variables
    calculated_value = 10
    ### Store calculated value in dictionary
    max_value[i] = calculated_value
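Your inner loop writes calculated_value to every key from 1 to num_files on each pass through the outer loop, so you keep overwriting your own data and every entry ends up holding the last file's value.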
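A minimal sketch of the same idea, keyed by file name rather than by a counter so that each result can be traced back to its file. The helper name max_per_file and the assumption that each file holds one number per line are illustrative only; they are not from the question.

```python
import glob
import os


def max_per_file(pattern):
    """Return {file name: largest number in that file} for files matching pattern.

    Assumes each file holds one number per line (an illustrative assumption).
    """
    max_value = {}
    # sorted() gives a predictable order; glob itself does not guarantee one
    for path in sorted(glob.glob(pattern)):
        with open(path) as handle:
            numbers = [float(line) for line in handle if line.strip()]
        if numbers:
            # One assignment per file, so earlier entries are never overwritten
            max_value[os.path.basename(path)] = max(numbers)
    return max_value
```

The key point is that the dictionary is written exactly once per file, inside the loop over files, with a key that is unique to that file.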
My Python script loops over many files in a directory and performs some operations on each file, storing the results for each file in file-specific variables created with the exec() function:
# consider all files within the current directory that have the pdb extension
pdb_list = glob.glob('*.pdb')
# make a list of the files
list=[]
# loop over the list and perform some operation on each file
for pdb in pdb_list:
    # take the file name without its extension
    pdb_name=pdb.rsplit( ".", 1 )[ 0 ]
    # save the name of the file
    list.append(pdb_name)
    # set variable u_{pdb_name}, which will be associated with some function that does something with the corresponding file
    exec(f'u_{pdb_name} = Universe(pdb)')
    exec(f'print("This is %s computed from %s" % (u_{pdb_name}, pdb_name))')
    # plot a graph using matplotlib
    # exec(f'plt.savefig("rmsd_traj_{pdb_name}.png")')
Basically, in my file-looping scripts I tend to use exec(f'...') whenever I need to create a new variable whose name includes part of an existing variable (such as the name of the current file, u_{pdb_name}).
Is it possible to do similar tasks with variable names while avoiding constant use of exec()?
You could try something like this:
lst = []
universes = {}
# loop over the list and perform some operation on each file
for pdb in pdb_list:
    # take the file name without its extension
    pdb_name = pdb.rsplit(".", 1)[0]
    # save the name of the file
    lst.append(pdb_name)
    key = f'u_{pdb_name}'
    universes[key] = Universe(pdb)
    print(f"This is {key} computed from {pdb_name}")
To access some value, just do:
universes[key] # where key is the variable name
If you want to iterate over all keys and values, do:
for key, universe in universes.items():
    print(key)
    print(universe.some_function())
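As a rough sketch of the dictionary approach in a reusable form: the helper build_registry and its load parameter are hypothetical names, and str.upper is only a stand-in loader so the example runs without the Universe class from the question.

```python
from pathlib import Path


def build_registry(paths, load):
    """Map each file's stem (name without its last extension) to load(path)."""
    registry = {}
    for path in paths:
        # Path.stem drops only the last extension: "run.1.pdb" -> "run.1"
        registry[Path(path).stem] = load(path)
    return registry


# Stand-in loader for illustration; in the question this would be Universe
universes = build_registry(["1abc.pdb", "2xyz.pdb"], load=str.upper)
print(universes["1abc"])
```

A side benefit of dictionary keys over exec(): a file name such as 1abc or my-file is not a valid Python identifier, so it could never become a variable name, but it works fine as a key.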
I'm working on cs50's pset6, DNA, and I want to read a csv file that looks like this:
name,AGATC,AATG,TATC
Alice,2,8,3
Bob,4,1,5
Charlie,3,2,5
But the problem is that dictionaries only have a key, and a value, so I don't know how I could structure this. What I currently have is this piece of code:
import sys
with open(argv[1]) as data_file:
    data_reader = csv.DictReader(data_file)
Also, my CSV file has multiple columns and rows, with a header row and a first column holding each person's name. I don't know how to handle this, and I will later need to access individual values such as Alice's count for AATG.
Also, I'm using the module sys to import DictReader and also reader.
You can always try to create the function on your own.
You can use my code here:
def csv_to_dict(csv_file):
    # csv_file is the file's contents as one string, not a file object
    key_list = csv_file[:csv_file.index('\n')].split(',')  # save the keys
    data = {}  # the dictionary for the current row
    info = []  # list of dictionaries
    # for each line after the header
    for line in csv_file[csv_file.index('\n') + 1:].split('\n'):
        if not line:
            continue  # skip empty lines, such as the one after a trailing newline
        count = 0  # index of the current key in key_list
        # for each string between commas
        for value in line.split(','):
            data[key_list[count]] = value  # store the value under the matching header key
            count += 1
        info.append(data)  # after filling the row's dictionary, append it to the list
        data = {}  # start a fresh dictionary for the next row
    print(info)  # print the list of dictionaries
### Be aware that this function prints a list of dictionaries.
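Since the question already uses csv.DictReader, here is a hedged sketch of one way to get per-person lookups from it: a dictionary of dictionaries keyed first by name and then by column header. The function name load_counts is hypothetical, and the sample text is copied from the question in place of the real file.

```python
import csv
import io

# Sample data copied from the question; a real script would open sys.argv[1]
sample = "name,AGATC,AATG,TATC\nAlice,2,8,3\nBob,4,1,5\nCharlie,3,2,5\n"


def load_counts(handle):
    """Return {person: {column header: count}} from an open CSV file object."""
    people = {}
    for row in csv.DictReader(handle):
        name = row.pop("name")  # remove the name so only the counts remain
        people[name] = {key: int(value) for key, value in row.items()}
    return people


people = load_counts(io.StringIO(sample))
print(people["Alice"]["AATG"])  # 8
```

Nesting one dictionary inside another is what gets around the "only a key and a value" concern: the value stored for each person is itself a dictionary.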
I have assigned different files to variables. Now I want to perform some operations by iterating over those variables. For example:
reduced_file1= 'names.xlsx'
reduced_file2= 'surnames.xlsx'
reduced_file3= 'city.xlsx'
reduced_file4= 'birth.xlsx'
The operations I want to iterate over (with a for loop) are:
xls= pd.ExcelFile(reduced_file1)
xls= pd.ExcelFile(reduced_file2)
xls= pd.ExcelFile(reduced_file3)
xls= pd.ExcelFile(reduced_file4)
...and so on
Basically, only the name of the variable changes each time: reduced_file(i)
Thanks
files= ['names.xlsx', 'surnames.xlsx', 'city.xlsx', 'birth.xlsx']
for file in files:
    xls = pd.ExcelFile(file)
You can also change string names by using f-strings:
for i in range(4):
    print(f"this is number {i}")
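Note that a loop which reassigns xls on every pass keeps only the last file once it finishes. A small sketch of combining the two ideas above, where an f-string builds one dictionary key per file so nothing is lost. It stores plain file names so that it runs without the spreadsheets; storing pd.ExcelFile(name) as the value instead is an assumption about what you would want in practice.

```python
files = ['names.xlsx', 'surnames.xlsx', 'city.xlsx', 'birth.xlsx']

# Keys mirror the original variable names: reduced_file1 ... reduced_file4
reduced = {f"reduced_file{i}": name for i, name in enumerate(files, start=1)}

for key, name in reduced.items():
    print(key, "->", name)
```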
Here is the whole code section
for entry in auth_log:
    # timestamp is converted to milliseconds for CEF
    # repr is used to keep '\\' in the domain\username
    extension = {
        'rt=': str(time.ctime(int(entry['timestamp']))),
        'src=': entry['ip'],
        'dhost=': entry['host'],
        'duser=': repr(entry['username']).lstrip("u").strip("'"),
        'outcome=': entry['result'],
        'cs1Label=': 'new_enrollment',
        'cs1=': str(entry['new_enrollment']),
        'cs2Label=': 'factor',
        'cs2=': entry['factor'],
        'cs3Label=': 'integration',
        'cs3=': entry['integration'],
    }
    log_to_cef(entry['eventtype'], entry['eventtype'], **extension)
In line 5 (rt=), I would like to store the timestamp output in a variable that I can use later in the script.
You can access the value directly from the dictionary with extension["rt="].
If you are looking for a way to have a list of all the variables outside of your loop you can use this method.
Before your loop you should make an empty list like this:
extensionRt = []
Then, after extension is created on each pass through the loop, use:
extensionRt.append(extension["rt="])
You can then access the values in this list by index:
extensionRt[YOUR INDEX HERE]
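Put together, the list approach might look like the sketch below. The two auth_log entries are made up for illustration and carry only the field this example needs; the name extension_rt is simply a snake_case spelling of the list above.

```python
import time

# Made-up entries standing in for auth_log; only the field used here
auth_log = [
    {"timestamp": 1500000000},
    {"timestamp": 1500000060},
]

extension_rt = []  # one formatted timestamp per log entry
for entry in auth_log:
    extension = {"rt=": str(time.ctime(int(entry["timestamp"])))}
    extension_rt.append(extension["rt="])

# Outside the loop, every timestamp is still available by position
print(extension_rt[0])
```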
There are as many as 1440 files in one directory to be read with Python. The file names follow this pattern:
HMM_1_1_.csv
HMM_1_2_.csv
HMM_1_3_.csv
HMM_1_4_.csv
...
and for HMM_i_j_.csv, i goes from 1 to 144 and j goes from 1 to 10.
How can I import each of them into a variable named HMM_i_j similar to its original name?
For instance, HMM_140_8_.csv should be imported as variable HMM_140_8.
You can do this with pandas and a dictionary. Here is a script that should do what you want.
To access a specific CSV file afterwards, use its key, e.g. csv['HMM_5_7'].
import pandas as pd
csv = {}
for i in range(1, 145):
    for j in range(1, 11):
        s = 'HMM_{}_{}'.format(i,j)
        csv[s] = pd.read_csv(s+'_.csv')  # file names end in '_.csv'
Or: (shorter)
d = {}
for i in range(1440):
    s = 'HMM_{}_{}'.format(i//10+1,i%10+1)
    d[s] = pd.read_csv(s+'_.csv')  # file names end in '_.csv'
Or a less readable one-liner:
d = {'HMM_{}_{}'.format(i//10+1,i%10+1):
     pd.read_csv('HMM_{}_{}_.csv'.format(i//10+1,i%10+1)) for i in range(1440)}
Instead of putting them in variables with these names, you can create a dictionary where the key is the file name minus '_.csv' and the value is the content of the file.
Here are the steps; I'll let you figure out exactly how to do each one:
Create an empty dictionary
Loop i from 1 to 144 and j from 1 to 10
If the corresponding file exists, read it and put its content in the dictionary at the corresponding key
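The steps above could be sketched roughly as follows. This version uses the standard-library csv module so that it runs without extra dependencies; swapping in pd.read_csv(path) at the same spot is an assumption about what you would prefer. The function name load_hmm_files and its max_i and max_j parameters are hypothetical.

```python
import csv
import os


def load_hmm_files(directory, max_i=144, max_j=10):
    """Return {'HMM_i_j': rows} for every HMM_i_j_.csv found in directory."""
    tables = {}                                   # step 1: empty dictionary
    for i in range(1, max_i + 1):                 # step 2: loop over i and j
        for j in range(1, max_j + 1):
            key = f"HMM_{i}_{j}"
            path = os.path.join(directory, key + "_.csv")
            if not os.path.exists(path):          # step 3: skip missing files
                continue
            with open(path, newline="") as handle:
                tables[key] = list(csv.reader(handle))
    return tables
```

After loading, tables['HMM_140_8'] holds the rows of HMM_140_8_.csv, which plays the role of the variable HMM_140_8 from the question.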