Python search and replace whilst caching [duplicate] - python

This question already has answers here:
How can I make a dictionary (dict) from separate lists of keys and values?
(21 answers)
Closed 4 months ago.
I'm trying to search and replace using information from two lists, while caching any replacements already made so that the same input always gets the same corresponding value.
For example, I have the following -
names = ["Mark","Steve","Mark","Chrome","192.168.0.1","Mark","Chrome","192.168.0.1","192.168.0.2"]
type = ["user","user","user","process","address","user","process","address","address"]
And I'm hoping to get the following output -
{
    "Mark":"user1",
    "Steve":"user2",
    "Chrome":"process1",
    "192.168.0.1":"address1",
    "192.168.0.2":"address2"
}
So I'm trying to use the type in the second list to determine the corresponding value for each item in the first list.
Hope this makes sense, is this possible? Any help would be appreciated.

I would recommend you use a dictionary personally.
names = {
    "Mark": "user1",
    "Steve": "user2",
    "Chrome": "process1",
    "192.168.0.1": "address1",
    "192.168.0.2": "address2"
}
print(names["Mark"])
With this dictionary you can look up the value for any name directly by its key. It is also a little more readable.

To form a dictionary from said values you can iterate the range and access values with the same index:
output = {names[i]: types[i] for i in range(len(names))}
Also, avoid using type as a variable name, because it shadows Python's built-in type.
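The index-based comprehension can also be written with zip, which pairs the two lists by position. This is a minimal sketch, assuming the second list has been renamed to types and using a shortened copy of the question's data. Note that it maps each name to its plain type and does not add the numbering asked for in the question.

```python
names = ["Mark", "Steve", "Mark", "Chrome", "192.168.0.1"]
types = ["user", "user", "user", "process", "address"]

# zip pairs items by position; a repeated name simply overwrites its earlier entry.
output = dict(zip(names, types))
print(output)
```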

Looks like you're also trying to store / retrieve the count of the types (i.e. "user1", "user2", "address1", etc.). Hence, we need another data structure to keep count of the types already registered in our "hashmap" (a dictionary in Python). In the solution below, we use type_cache for this.
The code should work as is.
from collections import defaultdict
names = ["Mark", "Steve", "Mark", "Chrome", "192.168.0.1", "Mark", "Chrome", "192.168.0.1", "192.168.0.2"]
types = ["user", "user", "user", "process", "address", "user", "process", "address", "address"]
expected = {
"Mark": "user1",
"Steve": "user2",
"Chrome": "process1",
"192.168.0.1": "address1",
"192.168.0.2": "address2"
}
def process_names_and_types(names, types):
    result = {}
    type_cache = defaultdict(int)
    for name, type_ in zip(names, types):
        if name not in result:
            type_cache[type_] += 1
            result[name] = f"{type_}{type_cache[type_]}"
    return result
if __name__ == "__main__":
    actual = process_names_and_types(names, types)
    assert actual == expected, f"Expected: {expected}, Actual: {actual}"

Related

Navigating JSON with variable keys in Python? [duplicate]

This question already has answers here:
Use Variable As Dictionary Key Set
(2 answers)
How to use a dot "." to access members of dictionary?
(36 answers)
Closed last month.
Let's say I have some JSON like this, stored in a variable called data:
{
"print": {
"ams": { "exists": 1},
"fan_speed": 29,
"reports": [
{"name": "foo"},
{"name": "bar"}
]
}
}
Now I have the key I want to look up stored in a variable called key, for example print.fan_speed, print.ams.exists, or print.reports[0].name.
What I want is something like data.get(key). What is the best way to approach this?
The following should work its way into your data, including indexing into lists:
import re
data = {
"print": {
"ams": { "exists": 1},
"fan_speed": 29,
"reports": [
{"name": "foo"},
{"name": "bar"}
]
}
}
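def value_of(data, location):
    for part in location.split("."):
        # Parts written as "name[3]" index into a list after the key lookup.
        match = re.match(r"(.*)\[(\d+)\]$", part)
        if match:
            name, index = match.groups()
            data = data.get(name)[int(index)]
        else:
            data = data.get(part)
        # Compare with None so that falsy values such as 0 are still returned.
        if data is None:
            return None
    return data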
print(value_of(data, "print.ams.exists"))
print(value_of(data, "print.reports[1].name"))
Result:
1
bar
It could do with a little rationalisation as it will return None for a non-existent key, but will error on a bad index - it should do one or the other depending on your requirements but the concept is there.
The concept is to take each '.'-separated element of the string in turn, using it to dig further into the data structure. If the element matches the syntax 'name[index]' according to the regex, the value under name is treated as a list and the element at that index is extracted.
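One way to do the rationalisation mentioned above is to catch lookup failures in one place, so that a missing key and a bad index are treated alike. This is only a sketch: the name value_or_default and its default parameter are illustrative and not part of the original answer.

```python
import re

def value_or_default(data, location, default=None):
    """Follow a dotted path such as 'print.reports[0].name' through nested dicts and lists."""
    for part in location.split("."):
        match = re.match(r"(.*)\[(\d+)\]$", part)
        try:
            if match:
                name, index = match.groups()
                data = data[name][int(index)]
            else:
                data = data[part]
        except (KeyError, IndexError, TypeError):
            # Missing key, out-of-range index, or subscripting a value
            # that is not a dict or list: all yield the default.
            return default
    return data

data = {
    "print": {
        "ams": {"exists": 0},
        "fan_speed": 29,
        "reports": [{"name": "foo"}, {"name": "bar"}],
    }
}
print(value_or_default(data, "print.ams.exists"))
print(value_or_default(data, "print.reports[5].name", "n/a"))
```

Using plain subscripting inside a single try/except means falsy values such as 0 are returned as they are, and the caller decides what a failed lookup should produce.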

Iterate through a nested python dict

I have a JSON file that looks like this:
{
    "returnCode": 200,
    "message": "OK",
    "people": [
        {
            "details": {
                "first": "joe",
                "last": "doe",
                "id": 1234567
            },
            "otheDetails": {
                "employeeNum": "0000111222",
                "res": "USA",
                "address": "123 main street"
            },
            "moreDetails": {
                "family": "yes",
                "siblings": "no",
                "home": "USA"
            }
        },
        {
            "details": {
                "first": "jane",
                "last": "doe",
                "id": 987654321
            },
            "otheDetails": {
                "employeeNum": "222333444",
                "res": "UK",
                "address": "321 nottingham dr"
            },
            "moreDetails": {
                "family": "yes",
                "siblings": "yes",
                "home": "UK"
            }
        }
    ]
}
This shows two entries, but really there are hundreds or more. I do not know the number of entries at the time the code is run.
My goal is to iterate through each entry and get the 'id' under "details". I load the JSON into a python dict named 'data' and am able to get the first 'id' by:
data['people'][0]['details']['id']
I can then get the second 'id' by incrementing the '0' to '1'. I know I can set i = 0 and then increment i, but since I do not know the number of entries, this does not work. Is there a better way?
Less Pythonic than a list comprehension, but a simple for loop will work here.
You can first calculate the number of people in the people list and then loop over the list, pulling out each id at each iteration:
id_list = []
for i in range(len(data['people'])):
    id_list.append(data['people'][i]['details']['id'])
You can use dict.get method in a list comprehension to avoid getting a KeyError on id. This way, you can fill dictionaries without ids with None:
ids = [dct['details'].get('id') for dct in data['people']]
If you still get a KeyError, that probably means some dcts in data['people'] don't have a details key. In that case, it might be better to wrap the lookup in try/except. You may also want to identify which dcts lack the details key; these can be collected in the error_dct list (uncomment the two commented lines below).
ids = []
#error_dct = []
for dct in data['people']:
    try:
        ids.append(dct['details']['id'])
    except KeyError:
        ids.append(None)
        #error_dct.append(dct)
Output:
1234567
987654321
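If some entries might lack the details key altogether, chaining dict.get with an empty-dict default is a compact alternative to try/except. A small sketch with made-up sample data (the entries below are not from the question's file):

```python
data = {
    "people": [
        {"details": {"first": "joe", "id": 1234567}},
        {"details": {"first": "jane"}},   # no 'id'
        {"otheDetails": {"res": "UK"}},   # no 'details'
    ]
}

# .get("details", {}) yields an empty dict when 'details' is absent,
# so the second .get returns None instead of raising.
ids = [person.get("details", {}).get("id") for person in data["people"]]
print(ids)
```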

Assigning python dictionary's nested value without mentioning the immediate key

I have dozens of lines to update values in nested dictionary like this:
dictionary["parent-key"]["child-key"] = [whatever]
Each line uses a different parent-key, but the child-key is always the same.
Also, the [whatever] part is written differently on each line, so simple recursion isn't an option here. (Although one might suggest building a separate list of values to be assigned and assigning them to each dictionary entry later on.)
Is there a way to do the same thing more concisely, avoiding the duplicated part of the code?
I'd be happy if it could be written something like this:
update_child_val("parent-key") = [whatever]
By the way, the [whatever] part that I'm assigning will be long and complicated code, so I don't want to use a function such as this:
def update_child_val(parent_key, child_val):
    dictionary[parent_key]["child-key"] = child_val
update_child_val("parent-key", [whatever])
Specific Use Case:
I'm building an ETL process to convert a database table into CSV, and this is part of it. I wrote a small example below.
single_item_template = {
    # Unique values will be assigned in place of `None` later
    "name": {
        "id": "name",
        "name": "Product Name",
        "val": None
    },
    "price": {
        "id": "price",
        "name": "Product Price (pre-tax)",
        "val": None
    },
    "tax": {
        "id": "tax",
        "name": "Sales Tax",
        "val": 10
    },
    "another column id": {
        "id": "another column id",
        "name": "another 'name' for this column",
        "val": "another 'val' for this column"
    },
    ..
}
I have a separate area that assigns values to a copy of the single_item_template dictionary for each row of the source database table.
for table_row in table:
    item = Item(table_row)
Here the Item class returns a copy of single_item_template with updated values assigned to item[column]["val"]. Each val goes through its own processing in a setter function within the class, such as
self._item["name"]["val"] = table_row["prod_name"].replace('_', ' ')
self._item["price"]["val"] = int(table_row["price_0"].replace(',', ''))
..
etcetera, etcetera.
In the above example, self._item can easily be shortened by assigning it to a variable, but I was wondering if I could also save typing the trailing ["val"].
(..or putting the last logic part in a string and calling eval on it later, which I really, really do not want to do.)
(So basically I'm just being lazy about typing out ["val"], though I don't really mind doing it. I was still curious whether such a thing exists, in Python or in programming in general.)
While you can't get away from doing the work, you can abstract it away in a couple of different ways.
Let's say you have a mapping of parent IDs to intended value:
values = {
'name': None,
'price': None,
'tax': 10,
'[another column id]': "[another 'val' for this column]"
}
Setting all of these at once is only two lines of code:
for parent, val in values.items():
    dictionary[parent]['val'] = val
Unfortunately there isn't an easy or legible way to transform this into a dict comprehension. You can easily put this into a utility function that will turn it into a one-line call:
def set_children(d, parents, values, child='val'):
    # Pair each parent key with its value and assign it to the child slot.
    for parent, value in zip(parents, values):
        d[parent][child] = value
set_children(dictionary, values.keys(), values.values())
In this case, your values mapping will encode the transformations you want to perform:
values = {
'name': table_row["prod_name"].replace('_', ' '),
'price': int(table_row["price_0"].replace(',', '')),
...
}
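Putting the pieces together, a runnable sketch might look like the following. The sample table_row values and the trimmed-down template are invented for illustration; only the keys prod_name and price_0 come from the question. This variant of set_children takes the mapping directly rather than separate keys and values, which is an assumption about what is most convenient, not the answer's exact signature.

```python
import copy

single_item_template = {
    "name": {"id": "name", "name": "Product Name", "val": None},
    "price": {"id": "price", "name": "Product Price (pre-tax)", "val": None},
    "tax": {"id": "tax", "name": "Sales Tax", "val": 10},
}

def set_children(d, values, child="val"):
    """Assign values[parent] to d[parent][child] for every parent in values."""
    for parent, value in values.items():
        d[parent][child] = value

# Hypothetical source row.
table_row = {"prod_name": "blue_widget", "price_0": "1,200"}

item = copy.deepcopy(single_item_template)  # leave the template untouched
set_children(item, {
    "name": table_row["prod_name"].replace("_", " "),
    "price": int(table_row["price_0"].replace(",", "")),
})
print(item["name"]["val"], item["price"]["val"])
```

The per-column transformations still have to be written out once each, but the repeated ["val"] subscript now lives in a single place.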

How to remove same condition for updating dictionary

How can I remove the if condition which is repeating again and again?
if input["custom_fields"].get("billing_notes", None):
    billing_notes.update({"value": input["custom_fields"]["billing_notes"]})
work_order_number = {
    "name": "work_order_number",
    "label": "Work Order Number",
}
if input["custom_fields"].get("work_order_number", None):
    work_order_number.update({"value": input["custom_fields"]["work_order_number"]})
contact_name_for_billing = {
    "name": "contact_name_for_billing",
    "label": "Contact Name For Billing",
}
if input["custom_fields"].get("contact_name_for_billing", None):
    contact_name_for_billing.update({"value": input["custom_fields"]["contact_name_for_billing"]})
In each dictionary the name and label keys are always present, but value should be added only if the user entered a value for that field. The same logic is repeated for every dictionary, so how can I do this without repeating the code?
One way you can perform the above action is to have a dictionary of dictionaries like so,
update_dict = {
    "billing_notes": {...},
    "work_order_number": {...},
    "contact_name_for_billing": {...}
}
And later, you can just loop through them and update. Since the same text is used both as a dictionary key and as a variable name in several places, this structure may actually be beneficial.
for (key, udict) in update_dict.items():
    if input["custom_fields"].get(key, None):
        udict.update({"value": input["custom_fields"][key]})
To my knowledge, there is no plausibly simpler way to do this. Hope you find this answer useful. Do drop questions in the comments.
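As a concrete illustration, here is a self-contained sketch of that loop. The sample payload is invented, the "Billing Notes" label is assumed by analogy with the other two dictionaries, and the variable is named payload rather than input only to avoid shadowing Python's built-in input function.

```python
# Hypothetical request data; an empty string counts as "not entered".
payload = {"custom_fields": {"billing_notes": "Net 30", "work_order_number": ""}}

fields = {
    # The label for billing_notes is an assumption; the question does not show it.
    "billing_notes": {"name": "billing_notes", "label": "Billing Notes"},
    "work_order_number": {"name": "work_order_number", "label": "Work Order Number"},
    "contact_name_for_billing": {
        "name": "contact_name_for_billing",
        "label": "Contact Name For Billing",
    },
}

for key, field in fields.items():
    value = payload["custom_fields"].get(key)
    if value:  # skip missing or empty values, as the original if statements do
        field["value"] = value

print(fields["billing_notes"])
```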

Parse Json file and save specific values [duplicate]

This question already has answers here:
Getting a list of values from a list of dicts
(10 answers)
Closed 5 years ago.
I have this JSON file where the amount of id's sometimes changes (more id's will be added):
{
"maps": [
{
"id": "blabla1",
"iscategorical": "0"
},
{
"id": "blabla2",
"iscategorical": "0"
},
{
"id": "blabla3",
"iscategorical": "0"
},
{
"id": "blabla4",
"iscategorical": "0"
}
]
}
I have this python code that has to print all the values of ids:
import json
data = json.load(open('data.json'))
variable1 = data["maps"][0]["id"]
print(variable1)
variable2 = data["maps"][1]["id"]
print(variable2)
variable3 = data["maps"][2]["id"]
print(variable3)
variable4 = data["maps"][3]["id"]
print(variable4)
I have to use variables because I want to show the values in a dropdown menu. Is it possible to save the id values in a more efficient way? And how can I find out how many ids the JSON file contains (4 in this example)?
You can get the number of ids (which is the number of elements) by checking the length of data['maps']:
number_of_ids = len(data['maps'])
A clean way to get all the id values is storing them in a list.
You can achieve this in a pythonic way like this:
list_of_ids = [map['id'] for map in data['maps']]
Using this approach you don't even need to store the number of elements in the original json, because you iterate through all of them using a foreach approach, essentially.
If the comprehension is unfamiliar, you can achieve the same thing with a plain for-each loop:
list_of_ids = []
for map in data['maps']:
    list_of_ids.append(map['id'])
Or you can do with a classic for loop, and here is where you really need the length:
number_of_ids = len(data['maps'])
list_of_ids = []
for i in range(0,number_of_ids):
    list_of_ids.append(data['maps'][i]['id'])
This last one is the classic way, but I suggest using one of the other approaches to take advantage of what Python offers!
Happy coding!
data['maps'] is a simple list, so you can iterate over it as such:
for map in data['maps']:
    print(map['id'])
To store them in a variable, you'll need to output them to a list. Storing them each in a separate variable is not a good idea, because like you said, you don't have a way to know how many there are.
ids = []
for map in data['maps']:
    ids.append(map['id'])
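For the dropdown use case, the whole thing fits in a few lines. This sketch parses an inline JSON string with json.loads so that it runs without a data.json file, and it uses a shortened copy of the question's data. The loop variable is called entry instead of map to avoid shadowing the built-in map.

```python
import json

raw = '{"maps": [{"id": "blabla1", "iscategorical": "0"}, {"id": "blabla2", "iscategorical": "0"}]}'
data = json.loads(raw)

# One list holds every id, however many entries the file contains.
ids = [entry["id"] for entry in data["maps"]]
print(len(ids), ids)
```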
