I am trying to write a program to parse a file, break it into sections, and read it into a nested dictionary. I want the output to be something like this:
output = {'section1':{'nested_section1':{'value1':'value2'}}}
I'm trying to do this by building separate dictionaries, then merging them, but I'm running into trouble naming them. I want the dictionaries inside of the others to be named based on the sections of the file they're taken from. But it seems I can't name a dictionary from a variable.
You can name a dictionary entry from a variable. If you have
text = "myKey" # or myNumber or any hashable type
data = dict()
You can do
data[text] = anyValue
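As a rough sketch of how this applies to the nested output in the question, the section names read from the file can be used directly as keys. The rows list and its four-part layout are made up for illustration, since the real file format isn't shown in the question.

```python
# Hypothetical parsed rows: (section, nested section, key, value).
rows = [
    ("section1", "nested_section1", "value1", "value2"),
    ("section1", "nested_section2", "value3", "value4"),
    ("section2", "nested_section1", "value5", "value6"),
]

output = {}
for section, nested, key, value in rows:
    # setdefault returns the existing inner dict, or creates it on first use.
    output.setdefault(section, {}).setdefault(nested, {})[key] = value

print(output["section1"]["nested_section1"])  # {'value1': 'value2'}
```

No dictionary ever needs its own variable name here; the names live only as keys.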
Store all your dictionaries in a single root dictionary.
all_dicts['output'] = {'section1':{'nested_section1':{'value1':'value2'}}}
As you merge dictionaries, remove the children from all_dicts.
all_dicts['someotherdict']['key'] = all_dicts['output']
del all_dicts['output']
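Put together as something runnable, the idea might look like the sketch below. The empty 'someotherdict' entry is an assumption added so the example is self-contained, and pop() is used as a one-step alternative to assigning and then calling del.

```python
# Root dictionary holding every intermediate dictionary by name.
all_dicts = {}
all_dicts["output"] = {"section1": {"nested_section1": {"value1": "value2"}}}
all_dicts["someotherdict"] = {}  # assumed parent, created for the example

# Attach the child under its parent and drop the top-level entry.
# pop() returns the value and removes the key in one step.
all_dicts["someotherdict"]["key"] = all_dicts.pop("output")
```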
So I have a list where each element is associated with one or more variables. If the user wants to read a variable's value, I need to take the corresponding element of the list, perform my operation, and then return it to the user. The list is ~250 elements, where each element is defined as a different variable. The element number and variable do not change.
Do I need some form of lookup table, or equivalent? Does it go in the main code, or can I keep it separate as a config file, i.e. txt file containing: element 1 = variable y
I'm fairly new to Python, so I just want to be pointed in the right direction really.
The data structure you seem to want is a dictionary. Dictionaries let you look up elements by their key: the key can be your "variable" name, and the value associated with the key is the data that name refers to. Dictionaries can be made by assigning {} to a variable or by calling the dict() function on something. I most often do something like:
dictionary = dict(zip(list_of_names , list_of_data))
The lists could be made by reading your txt file and then using:
string.split("your delimiter here")
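Here is a minimal sketch of that, assuming the config file holds one "element N = name" entry per line as in the question. The config_text string stands in for the file contents, and the variable names in it are invented.

```python
# Hypothetical config contents, one "element N = name" entry per line.
config_text = """element 1 = variable_y
element 2 = variable_x
element 3 = variable_z"""

list_of_names = []
list_of_indices = []
for line in config_text.splitlines():
    left, right = line.split("=")
    list_of_names.append(right.strip())
    # "element 1" -> 1
    list_of_indices.append(int(left.split()[1]))

lookup = dict(zip(list_of_names, list_of_indices))

elements = [10.0, 20.0, 30.0]  # stand-in for the ~250-element list
# The config is 1-based, Python lists are 0-based.
value = elements[lookup["variable_y"] - 1]
```

Keeping the mapping in a separate text file works fine; the main code only has to build lookup once at startup.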
I have a python dictionary with keys as dataset names and values as the entire data frames themselves, see the dictionary dict below
[Dictionary of Dataframes ]
One way is to write all the code manually, like below:
csv = dict['csv.pkl']
csv_emp = dict['csv_emp.pkl']
csv_emp_yr= dict['csv_emp_yr.pkl']
emp_wf=dict['emp_wf.pkl']
emp_yr_wf=dict['emp_yr_wf.pkl']
But this will get very inefficient with a larger number of datasets.
Any help on how to get this done in a loop?
I would not recommend this method, but you can try this:
import sys
this = sys.modules[__name__] # this is now your current namespace
for key in dict.keys():
    setattr(this, key, dict[key])
Now you can check the new variables, which have the same names as the keys of the dictionary.
globals() is risky: it gives you whatever the namespace is currently pointing to, and that can change, so modifying the value returned by globals() is not a good idea.
A list can also be used (limited use cases):
dataframes = []
for key in dict.keys():
    dataframes.append(dict[key])
Still this is your choice, both of the above methods have some limitations.
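Given those limitations, another option to consider is leaving the data in the dictionary and looping over it. The sketch below uses plain lists as stand-ins for the DataFrames so it runs without pandas, and the names frames, by_name and row_counts are my own (frames also avoids shadowing the built-in dict).

```python
# Stand-ins for the DataFrames; plain lists keep the example self-contained.
frames = {
    "csv.pkl": [1, 2, 3],
    "csv_emp.pkl": [4, 5],
    "emp_wf.pkl": [6],
}

# Strip the ".pkl" suffix so the keys read like the variable names wanted.
by_name = {key.rsplit(".", 1)[0]: value for key, value in frames.items()}

# Process every dataset in one loop instead of one variable per dataset.
row_counts = {}
for name, frame in by_name.items():
    row_counts[name] = len(frame)
```

Anything you would have done with a variable called csv_emp can then be done with by_name["csv_emp"].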
I've two defaultdicts I eventually want to merge, but first I need to make their keys match. According to some threads I've seen here, I can use pop() to replace keys in a dictionary. But that only updates the existing dictionary, whereas I want to create a new dictionary with the new keys. So something like:
existing_dict_one -> new_dict_one
This is what I have so far:
def split_tabs(x):
    """
    Function to split tab-separated strings, used to break up the keys that are separated by tabs.
    """
    return x.split('\t')
def create_dict(old_dict):
    """
    Function to create a new defaultdict from an existing defaultdict, just with
    different keys.
    """
    new_dict = old_dict.copy() # Create a copy of old_dict to house the new keys, but with the same values.
    for key, value in new_dict.iteritems():
        umi = split_tabs(key)[0] # Change key to be UMI, which is the 0th index of the tab-delimited key.
        # new_key = key.replace(key, umi)
        new_dict[umi] = new_dict.pop(key)
    return new_dict
However, I'm getting the following error
RuntimeError: dictionary changed size during iteration
and I don't know how to fix it. Does anyone know how to correct it? I'd like to use the variable "umi" as the new key.
I'd like to post the variable "key" and dictionary "old_dict" I'm using for testing this code, but it's messy and takes up a lot of space. So here's a pastebin link that contains them instead.
Note that "umi" comes from variable "key" which is separated by tabs. So I split "key" and get the first object as "umi".
Just use a dict comprehension for this:
new_dict = {split_tabs(key)[0]: value for key, value in old_dict.iteritems()}
Trying to modify a dictionary while iterating over it is not a good idea in general.
If you use .items() instead of .iteritems(), you won't have that problem, because in Python 2 that just returns a list that is disconnected from the dictionary. In Python 3 it would be list(new_dict.items()).
Also if there's any possibility that the dictionary values are mutable, you'll have to use copy.deepcopy(old_dict) instead of just old_dict.copy().
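For anyone on Python 3, where .iteritems() no longer exists, the same two approaches might look like the sketch below. The sample keys and values are invented for illustration, since the original test data was only linked.

```python
from collections import defaultdict

def split_tabs(x):
    """Split a tab-separated string into its fields."""
    return x.split("\t")

# Made-up sample data; keys are "UMI<TAB>other fields".
old_dict = defaultdict(list)
old_dict["AAAC\tchr1\t100"].append("read1")
old_dict["GGTT\tchr2\t200"].append("read2")

# Build a new dict rather than mutating one while looping over it.
new_dict = {split_tabs(key)[0]: value for key, value in old_dict.items()}

# If in-place renaming is really needed, loop over a snapshot of the keys.
renamed = dict(old_dict)
for key in list(renamed):
    renamed[split_tabs(key)[0]] = renamed.pop(key)
```

Note that both results are plain dicts rather than defaultdicts, and that if two old keys share the same UMI, the later one overwrites the earlier one.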
I need to read some huge database from HDF5 files and organize it in a nice way to make it easy to read and use.
I saw this post Python List as variable name and I'm trying to make a dictionary of dictionaries.
Basically I have a list of data sets and variables that I need to read from the HDF5 files. As an example I created these two lists:
dataset = [0,1,2,3]
var = ['a','b','c']
Now, there is a legacy "home brewed" read_hdf5(dataset,var) function that reads the data from the HDF5 files and returns the appropriate array.
I can easily read from a specific dataset (say 0) at a time creating a dictionary like this:
data = {}
for type in var:
    data[type] = read_hdf5(0,type)
Which gives me a nice dictionary of all the data for each variable in dataset 0.
Now I want to implement a dictionary of dictionaries so I can access the data like this:
data[dataset][var]
That should return the array of data for the given set and variable.
I tried the following but the only thing that the loop is doing is overwriting the last variable read:
for set in dataset:
    for type in var:
        data[set] = {'set':set, str(type): read_hdf5(set,type)}
Any ideas? Thank you!!!
You have to create a new dict for each set before iterating on vars:
dataset = [0,1,2,3]
var = ['a', 'b', 'c']
data = {}
for set in dataset:
    data[set] = {}
    for type in var:
        data[set][type] = read_hdf5(set, type)
As a side note: set and type are builtin names so you'd better use something else.
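With names that don't shadow the builtins, the same thing can also be written as a nested dict comprehension. This is just a sketch: the read_hdf5 below is a stand-in that returns fake data, since the real legacy function isn't shown, and ds and name are arbitrary replacement names.

```python
def read_hdf5(dataset_id, var_name):
    """Stand-in for the legacy reader; returns fake data for illustration."""
    return [dataset_id, var_name]

dataset = [0, 1, 2, 3]
var = ["a", "b", "c"]

# The outer comprehension builds one inner dict per dataset,
# the inner one builds one entry per variable.
data = {
    ds: {name: read_hdf5(ds, name) for name in var}
    for ds in dataset
}
```

After this, data[2]['b'] holds whatever read_hdf5(2, 'b') returned.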
So, I've created a dictionary that stores first names as a list within that dictionary. New names are added to the dictionary's list via a function. Now, this is where I have hit a snag:
Main Obstacle: The function overwrites the names I have already added. If I add the name "George" to the list via the function, it will store the name "George". But if I then add the name "Alfred" to the dictionary, it overwrites the name "George" and stores only "Alfred".
I am sure you can see how problematic this is for someone who wants to add multiple names to the dictionary's list. The odd thing is that when I type out the exact same code into the interpreter and I individually append names to the dictionary's list, it works fine.
Here is the code:
def add(data, value):
    data['names'] = {}
    data['names']['first'] = []
    data['names']['first'].append(value)
Didn't you ask this question already? (My previous answer)
You are always setting the data['names'] to an empty dictionary before appending value to it.
def add(data, value):
    data.setdefault('names', {}).setdefault('first', []).append(value)
See python docs on dict.setdefault
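A quick check with the two names from the question shows the setdefault version keeping both. The empty starting dictionary is an assumption; the function behaves the same if 'names' already exists.

```python
def add(data, value):
    # Create the nested containers only if they are missing, then append.
    data.setdefault("names", {}).setdefault("first", []).append(value)

data = {}
add(data, "George")
add(data, "Alfred")
print(data)  # {'names': {'first': ['George', 'Alfred']}}
```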