I'm trying to iteratively capture data on a fixed list of items but the data fields/attributes are not fixed at the start. As I iterate over each item new attributes may crop up which need to be added to the database. What is a good way to do this?
For an easy example, suppose the list has three items (people) and these are their attributes:
Person 1: Height=168cm, Eyes=Brown
Person 2: Height=155cm, Occupation=Teacher
Person 3: Age=43, Country=Spain, Occupation=Writer
For Person 1 two variables need to be captured: height and eye color. On the next iteration for Person 2 there is one existing attribute (height) and one new attribute (occupation). Person 3 has three new attributes that need to be added.
The list of items is currently stored in a pandas DataFrame. My only idea so far is to create one field which stores all the attributes for each person in a single dictionary, e.g.:
[
    {"Height": "168cm", "Eyes": "Brown"},
    {"Height": "155cm", "Occupation": "Teacher"},
    {"Age": "43", "Country": "Spain", "Occupation": "Writer"}
]
Is there a better way to store the data which will be easier to query later on? I'm very new to Python. Thanks!
Try using a nested dictionary (a dictionary of dictionaries). It is formatted like this:
dict2d = {
    "Person1": {"Height": "168cm", "Eyes": "Brown"},
    "Person2": {"Height": "155cm", "Occupation": "Teacher"},
    "Person3": {"Age": "43", "Country": "Spain", "Occupation": "Writer"},
}
So in your code, when you encounter a new attribute, you can run something along the lines of
# code following the encounter
# person_name, new_data_name and new_data_value are placeholders for your own variables
if person_name in dict2d:
    # known person: add (or overwrite) one attribute
    dict2d[person_name][new_data_name] = new_data_value
else:
    # new person: start their attribute dictionary
    dict2d[person_name] = {new_data_name: new_data_value}
Using this structure, you can access any attribute of any person, e.g. dict2d["Person1"]["Height"].
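Here is a minimal sketch of how such a nested dictionary could be filled and queried. It assumes each observation arrives as a (person, attribute, value) triple; the `observations` list and the `add_attribute` helper are made up for illustration. `dict.setdefault` creates the inner dictionary the first time a person is seen, so no if/else is needed.

```python
# Hypothetical input: one (person, attribute, value) triple per observation.
observations = [
    ("Person1", "Height", "168cm"),
    ("Person1", "Eyes", "Brown"),
    ("Person2", "Height", "155cm"),
    ("Person2", "Occupation", "Teacher"),
    ("Person3", "Age", "43"),
    ("Person3", "Country", "Spain"),
    ("Person3", "Occupation", "Writer"),
]

dict2d = {}


def add_attribute(data, person, name, value):
    """Store one attribute, creating the person's dictionary if needed."""
    data.setdefault(person, {})[name] = value


for person, name, value in observations:
    add_attribute(dict2d, person, name, value)

# Query: which people have an "Occupation", and what is it?
occupations = {
    person: attrs["Occupation"]
    for person, attrs in dict2d.items()
    if "Occupation" in attrs
}
print(occupations)

# A missing attribute can be read safely with .get(), which returns None.
print(dict2d["Person1"].get("Age"))
```

If the data needs to go back into pandas later, `pandas.DataFrame.from_dict(dict2d, orient="index")` should give one row per person, with missing attributes left as NaN.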
I am getting data from an API and storing it in json format. The data I pull is in a list of dictionaries. I am using Python. My task is to only grab the information from the dictionary that matches the ticker symbol.
This is a shortened version of my data, printed using json.dumps:
[
{
"ticker": "BYDDF.US",
"name": "BYD Co Ltd-H",
"price": 25.635,
"change_1d_prc": 9.927101200686117
},
{
"ticker": "BYDDY.US",
"name": "BYD Co Ltd ADR",
"price": 51.22,
"change_1d_prc": 9.843448423761526
},
{
"ticker": "TSLA.US",
"name": "Tesla Inc",
"price": 194.7,
"change_1d_prc": 7.67018746889343
}
]
The task is to get only the dictionary for ticker = TSLA.US. If possible, I'd like to get only the price associated with this ticker.
I don't know how to reference "ticker" or how to loop through all the entries to get the one I need.
I tried the following, but it says that it's a string, so it doesn't work:
if "ticker" == "TESLA.US":
    print(i)
Try this first (mylist is your list of dictionaries):
for entry in mylist:
    print(entry['ticker'])
Then try this to get what you want:
for entry in mylist:
    if entry['ticker'] == 'TSLA.US':
        print(entry)
This is a solution that I've seen divide the Python community. Some say that it's a feature and "very Pythonic"; others say that it's a bad design choice we're stuck with now, and bad practice. I'm personally not a fan, but it is a way to solve this problem, so do with it what you will. :)
Python for loops don't create a new scope; the loop variable persists even after the loop ends. So this is one solution, assuming that your list of dictionaries is stored as json_dict:
for target_dict in json_dict:
    if target_dict["ticker"] == "TSLA.US":
        break
At this point, target_dict will be the dictionary you want, provided a match was found. If no entry matches, the loop runs to the end and target_dict is simply the last dictionary in the list.
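It is possible to iterate through a list of dictionaries using a for loop. Because return only works inside a function, the loop is wrapped in one here:
def get_price(stocks, ticker="TSLA.US"):
    for stock in stocks:
        if stock["ticker"] == ticker:
            return stock["price"]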
This essentially loops through every item in the list, and for each item (which is a dictionary) looks for the key "ticker" and checks if its value is equal to "TSLA.US". If it is, then it returns the value associated with the "price" key.
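The same lookup can also be written with `next()` and a generator expression, which stops at the first match and lets you choose what to get back when nothing matches. This is a sketch on a trimmed copy of the question's data; the `find_price` name is made up for illustration.

```python
stocks = [
    {"ticker": "BYDDF.US", "name": "BYD Co Ltd-H", "price": 25.635},
    {"ticker": "BYDDY.US", "name": "BYD Co Ltd ADR", "price": 51.22},
    {"ticker": "TSLA.US", "name": "Tesla Inc", "price": 194.7},
]


def find_price(entries, ticker):
    """Return the price of the first entry matching ticker, or None."""
    # next() takes the first item the generator yields, or the default (None).
    match = next((e for e in entries if e["ticker"] == ticker), None)
    return match["price"] if match is not None else None


print(find_price(stocks, "TSLA.US"))
print(find_price(stocks, "AAPL.US"))  # no such ticker in the list
```

Returning None for a missing ticker is a design choice; raising an exception instead may suit code where a missing ticker is a real error.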
I want to create a dictionary with key-value pairs which are filled via a for loop.
The dictionary I want to achieve:
[
{
"timestamp": 123123123,
"image": "image/test1.png"
},
{
"timestamp": 0384030434,
"image": "image/test2.png"
}
]
My code does not work and I'm new to the datatypes of Python.
images_dict = []
for i in images:
    time = Image.open(i)._getexif()[36867]
    images_dict = {"timestamp": time, "image": i}
What am I missing?
First, you seem to be confusing lists and dictionaries in Python. Dictionaries use curly brackets {} and lists use square brackets []. So in your first example, you are describing a list whose elements are dictionaries.
As for your code example, you create an empty list and then iterate over images, which I assume is a list of image paths. On every iteration you rebind the variable images_dict to a new dictionary with two key: value pairs, so the list is thrown away and only the last dictionary survives.
It seems like what you want is this:
images_dict = []
for image in images:
    # 36867 is the EXIF tag for the original capture time
    time = Image.open(image)._getexif()[36867]
    images_dict.append({'timestamp': time, 'image': image})
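To see the difference between rebinding the name and appending to the list without needing Pillow or real image files, here is a small sketch. The `fake_timestamps` lookup is a made-up stand-in for the EXIF call, and its values are invented.

```python
# Hypothetical stand-in for the EXIF lookup, so the sketch runs without Pillow.
fake_timestamps = {
    "image/test1.png": "2021:03:01 10:15:00",
    "image/test2.png": "2021:03:02 11:30:00",
}
images = list(fake_timestamps)

# Rebinding: each pass replaces the previous value, so only the last dict survives.
rebound = []
for path in images:
    rebound = {"timestamp": fake_timestamps[path], "image": path}

# Appending: the list grows by one dictionary per image.
appended = []
for path in images:
    appended.append({"timestamp": fake_timestamps[path], "image": path})

print(type(rebound).__name__, len(appended))
```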
The answer from Tom McLean worked for me; I'm still a little confused by the datatypes of Python.
images_dict.append({"timestamp": time, "image": i})
I have dozens of lines that update values in a nested dictionary like this:
dictionary["parent-key"]["child-key"] = [whatever]
Each line uses a different parent-key, but the child-key is always the same.
Also, the [whatever] part is written differently on each line, so simple recursion isn't an option here. (Although one might suggest building a separate list of values to be assigned, and assigning them to each dictionary entry later on.)
Is there a way to do the same thing more concisely, to avoid the duplicated part of the code?
I'd be happy if it could be written something like this:
update_child_val("parent-key") = [whatever]
By the way, the [whatever] part that I'm assigning will be long and complicated code, so I don't wish to use a function such as this:
def update_child_val(parent_key, child_val):
    dictionary[parent_key]["child-key"] = child_val
update_child_val("parent-key", [whatever])
Specific Use Case:
I'm building an ETL process that converts a database table into CSV, and this is one part of it. I wrote an example below.
single_item_template = {
# Unique values will be assigned in place of `None` later
"name": {
"id": "name",
"name": "Product Name",
"val": None
},
"price": {
"id": "price",
"name": "Product Price (pre-tax)",
"val": None
},
"tax": {
"id": "tax",
"name": "Sales Tax",
"val": 10
},
"another column id": {
"id": "another column id",
"name": "another 'name' for this column",
"val": "another 'val' for this column"
},
..
}
And I have a separate area that assigns values to a copy of the dictionary single_item_template for each row of the source database table.
for table_row in table:
    item = Item(table_row)
The Item class here returns a copy of the dictionary single_item_template with updated values assigned to item[column]["val"]. Each val involves its own processing in a setter function within the class, such as
self._item["name"]["val"] = table_row["prod_name"].replace('_', ' ')
self._item["price"]["val"] = int(table_row["price_0"].replace(',', ''))
..
etcetera, etcetera.
In the above example, self._item can easily be shortened by assigning it to a variable, but I was wondering if I could also save the trailing ["val"].
(..or putting the last logic part in a string and calling eval later, which I really really do not want to do.)
(So basically all I'm saying here is that I'm too lazy to type out ["val"], though I don't mind doing it either. I was still curious whether such a thing exists, in Python or in programming in general.)
While you can't get away from doing the work, you can abstract it away in a couple of different ways.
Let's say you have a mapping of parent IDs to intended value:
values = {
'name': None,
'price': None,
'tax': 10,
'[another column id]': "[another 'val' for this column]"
}
Setting all of these at once is only two lines of code:
for parent, val in values.items():
    dictionary[parent]['val'] = val
Unfortunately there isn't an easy or legible way to transform this into a dict comprehension. You can easily put this into a utility function that will turn it into a one-line call:
def set_children(d, parents, values, child='val'):
    # pair each parent key with its new value and write it to the child key
    for parent, value in zip(parents, values):
        d[parent][child] = value
set_children(dictionary, values.keys(), values.values())
In this case, your values mapping will encode the transformations you want to perform:
values = {
    'name': table_row["prod_name"].replace('_', ' '),
    'price': int(table_row["price_0"].replace(',', '')),
    ...
}
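If the goal is mainly to avoid typing `["val"]` on every line, another option is a thin wrapper whose item assignment writes to the child key. This is only a sketch: the `ChildView` class is made up for illustration and is not part of any library, and the `table_row` values are invented.

```python
class ChildView:
    """Read and write data[parent][child] as view[parent]."""

    def __init__(self, data, child="val"):
        self._data = data
        self._child = child

    def __getitem__(self, parent):
        return self._data[parent][self._child]

    def __setitem__(self, parent, value):
        self._data[parent][self._child] = value


item = {
    "name": {"id": "name", "name": "Product Name", "val": None},
    "price": {"id": "price", "name": "Product Price (pre-tax)", "val": None},
}
table_row = {"prod_name": "blue_widget", "price_0": "1,200"}  # made-up row

val = ChildView(item)
val["name"] = table_row["prod_name"].replace("_", " ")
val["price"] = int(table_row["price_0"].replace(",", ""))

print(item["name"]["val"], item["price"]["val"])
```

The wrapper holds a reference to the original dictionary, so assignments through `val` change `item` directly; each right-hand side can stay as long and as specific as it needs to be.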
I have this JSON file where the number of ids sometimes changes (more ids will be added):
{
"maps": [
{
"id": "blabla1",
"iscategorical": "0"
},
{
"id": "blabla2",
"iscategorical": "0"
},
{
"id": "blabla3",
"iscategorical": "0"
},
{
"id": "blabla4",
"iscategorical": "0"
}
]
}
I have this Python code that has to print all the id values:
import json
data = json.load(open('data.json'))
variable1 = data["maps"][0]["id"]
print(variable1)
variable2 = data["maps"][1]["id"]
print(variable2)
variable3 = data["maps"][2]["id"]
print(variable3)
variable4 = data["maps"][3]["id"]
print(variable4)
I have to use variables, because I want to show the values in a dropdown menu. Is it possible to save the id values in a more efficient way? How do you find out the maximum number of ids in this JSON file (in the example, 4)?
You can get the number of ids (which is the number of elements) by checking the length of data['maps']:
number_of_ids = len(data['maps'])
A clean way to get all the id values is to store them in a list.
You can achieve this in a Pythonic way with a list comprehension:
list_of_ids = [entry['id'] for entry in data['maps']]
With this approach you don't even need to know the number of elements in the original JSON, because you iterate over all of them, essentially in a for-each style.
If the list comprehension troubles you, you can achieve the same thing with an explicit for-each loop:
list_of_ids = []
for entry in data['maps']:
    list_of_ids.append(entry['id'])
Or you can use an index-based for loop, which is where you really need the length:
number_of_ids = len(data['maps'])
list_of_ids = []
for i in range(number_of_ids):
    list_of_ids.append(data['maps'][i]['id'])
This last one is the classic way, but I suggest taking one of the other approaches in order to leverage the advantages Python offers.
Happy coding!
data['maps'] is a simple list, so you can iterate over it as such:
for entry in data['maps']:
    print(entry['id'])
To store them in a variable, you'll need to output them to a list. Storing each of them in a separate variable is not a good idea because, like you said, you don't have a way to know how many there are.
ids = []
for entry in data['maps']:
    ids.append(entry['id'])
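For completeness, here is a self-contained sketch that parses the question's JSON from an inline string (so it runs without data.json) and collects the ids in one list, which could then feed a dropdown. The `raw` variable is introduced only for this example.

```python
import json

# Inline copy of the question's JSON. With a real file you would typically use:
#     with open("data.json") as f:
#         data = json.load(f)
raw = """
{
  "maps": [
    {"id": "blabla1", "iscategorical": "0"},
    {"id": "blabla2", "iscategorical": "0"},
    {"id": "blabla3", "iscategorical": "0"},
    {"id": "blabla4", "iscategorical": "0"}
  ]
}
"""

data = json.loads(raw)

# One list holds however many ids the file contains.
ids = [entry["id"] for entry in data["maps"]]

print(len(ids), ids)
```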
We run an app that is highly dependent on location. So, we have six models: Country, Province, District, Sector, Cell and Village.
What I want is to generate JSON that represents them. What I have tried already is quite long, but since the structure is the same at every level, one chunk of the module should be enough to show my problem.
So, each cell can have multiple villages inside it:
cells=database.bring('SELECT id,name FROM batteries_cell WHERE sector_id=' + str(sectorID))
if cells:
    for cell in cells:
        cellID=cell[0]
        cellName=cell[1]
        cell_pro={'id':cellID,'name':cellName,'villages':{}}
        villages=database.bring('SELECT id,name FROM batteries_village WHERE cell_id=' + str(cellID))
        if villages:
            for village in villages:
                villageID=village[0]
                villageName=village[1]
                village_pro={'id':villageID, 'name':villageName}
                cell_pro['villages'].update(village_pro)
However, update() just stores the last village for each cell. Any idea what I am doing wrong? I have tried several different approaches and keep ending up with the same result.
UPDATE: the needed output is:
[
{
"id": 1,
"name": "Ethiopia",
"villages": [{
"vid": 1,
"vname": "one"
},
{
"vid": 2,
"vname": "village two"
}
]
},
{
"id": 2,
"name": "Sene",
"villages": [{
"vid": 3,
"vname": "third"
},
{
"vid": 4,
"vname": "fourth"
}
]
}
]
The update keeps overwriting the same keys in the cell_pro villages dict. For example, if village_pro is {'id':'1', 'name':'A'}, then cell_pro['villages'].update(village_pro) will set cell_pro['villages']['id'] = '1' and cell_pro['villages']['name'] = 'A'. The next village in the loop will overwrite the id and name with something else.
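The overwriting is easy to reproduce in isolation. This sketch uses two made-up village rows and compares `dict.update()` with appending to a list:

```python
villages = [(1, "one"), (2, "village two")]  # made-up (id, name) rows

# update() merges keys into the same dict, so "id" and "name" are overwritten
# on every pass and only the last village remains.
as_dict = {}
for village_id, village_name in villages:
    as_dict.update({"id": village_id, "name": village_name})

# append() adds a new element each time, so every village is kept.
as_list = []
for village_id, village_name in villages:
    as_list.append({"id": village_id, "name": village_name})

print(as_dict)
print(as_list)
```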
You probably either want to make cell_pro['villages'] into a list or keep it as a dict and add the villages keyed by id:
cell_pro['villages'][villageID] = village_pro
What format do you want the resulting JSON to be? Maybe you just want:
cell_pro['villages'][villageID] = villageName
EDITED FOR DESIRED JSON ADDED TO QUESTION:
In the JSON, the villages are in an array. For that we use a list in Python. Note that cell_pro['villages'] is now a list and we use append() to add to it.
cells=database.bring('SELECT id,name FROM batteries_cell WHERE sector_id=' + str(sectorID))
if cells:
    for cell in cells:
        cellID=cell[0]
        cellName=cell[1]
        cell_pro={'id':cellID,'name':cellName,'villages':[]}
        villages=database.bring('SELECT id,name FROM batteries_village WHERE cell_id=' + str(cellID))
        if villages:
            for village in villages:
                villageID=village[0]
                villageName=village[1]
                village_pro={'vid':villageID, 'vname':villageName}
                cell_pro['villages'].append(village_pro)
TIP: I don't know what database access module you're using but it's generally bad practice to build SQL queries like that because the parameters may not be escaped properly and could lead to SQL injection attacks or errors. Most modules have a way to build query strings with bound parameters that automatically safely escape variables in the query string.
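As an illustration of bound parameters, here is a sketch using Python's built-in sqlite3 module with an in-memory database. It is a stand-in only: the question's `database.bring` helper is unknown, the table layout and rows are invented, and the placeholder style differs between drivers (sqlite3 uses `?`, some others use `%s`).

```python
import sqlite3

# In-memory database with a made-up version of the villages table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE batteries_village (id INTEGER, name TEXT, cell_id INTEGER)"
)
conn.executemany(
    "INSERT INTO batteries_village (id, name, cell_id) VALUES (?, ?, ?)",
    [(1, "one", 10), (2, "village two", 10), (3, "third", 20)],
)

cell_id = 10

# The value is passed separately from the SQL text, so the driver handles
# quoting and the query string never contains user-supplied data.
rows = conn.execute(
    "SELECT id, name FROM batteries_village WHERE cell_id = ? ORDER BY id",
    (cell_id,),
).fetchall()

villages = [{"vid": vid, "vname": vname} for vid, vname in rows]
print(villages)

conn.close()
```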