How to map and update a Python dictionary with different key-value pairs? - python

I want to transform a dictionary in Python from Dictionary 1 into Dictionary 2, as follows.
transaction = {
"trans_time": "14/07/2015 10:03:20",
"trans_type": "DEBIT",
"description": "239.95 USD DHL.COM TEXAS USA",
}
I want to transform the above dictionary to the following
transaction = {
"trans_time": "14/07/2015 10:03:20",
"trans_type": "DEBIT",
"description": "DHL.COM TEXAS USA",
"amount": 239.95,
"currency": "USD",
"location": "TEXAS USA",
"merchant_name": "DHL"
}
I tried the following but it did not work
dic1 = {
"trans_time": "14/07/2015 10:03:20",
"trans_type": "DEBIT",
"description": "239.95 USD DHL.COM TEXAS USA"
}
print(type(dic1))
copiedDic = dic1.copy()
print("copiedDic = ",copiedDic)
updatekeys = ['amount', 'currency', 'merchant_name', 'location', 'trans_category']
for key in dic1:
    if key == 'description':
        list_words = dic1[key].split(" ")
        newdict = {updatekeys[i]: x for i, x in enumerate(list_words)}
        copiedDic.update(newdict)
print(copiedDic)
I got the following result:
{
'trans_time': '14/07/2015 10:03:20',
'trans_type': 'DEBIT',
'description': '239.95 USD DHL.COM TEXAS USA',
'amount': '239.95',
'currency': 'USD',
'merchant_name': 'DHL.COM',
'location': 'TEXAS',
'trans_category': 'USA'
}
My Intended output should look like this:
transaction = {
"trans_time": "14/07/2015 10:03:20",
"trans_type": "DEBIT",
"description": "DHL.COM TEXAS USA",
"amount": 239.95,
"currency": "USD",
"location": "TEXAS USA",
"merchant_name": "DHL"
}

I think it is easier to split the value into a list of words and pick the fields out by position. Here, the list aaa is created from the string transaction['description']. Where a field spans more than one word (list element), ' '.join is used to turn the slice back into a string. The amount is converted from a string to a float. For merchant_name, the part before the first dot is taken.
transaction = {
"trans_time": "14/07/2015 10:03:20",
"trans_type": "DEBIT",
"description": "239.95 USD DHL.COM TEXAS USA",
}
aaa = transaction['description'].split()
transaction['description'] = ' '.join(aaa[2:])
transaction['amount'] = float(aaa[0])
transaction['currency'] = aaa[1]
transaction['location'] = ' '.join(aaa[3:])
transaction['merchant_name'] = aaa[2].partition('.')[0]
print(transaction)
Output
{
'trans_time': '14/07/2015 10:03:20',
'trans_type': 'DEBIT',
'description': 'DHL.COM TEXAS USA',
'amount': 239.95,
'currency': 'USD',
'location': 'TEXAS USA',
'merchant_name': 'DHL'}

If you want to transform the dictionary in place, you do not need to copy the original.
Just do something like this (note that it splits on every space, so 'location' receives only 'TEXAS' and 'trans_category' receives 'USA'):
new_keys = ['amount', 'currency', 'merchant_name', 'location', 'trans_category']
values = transaction["description"].split(' ')
for idx, key in enumerate(new_keys):
    if key == "amount":
        transaction[key] = float(values[idx])
    else:
        transaction[key] = values[idx]
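A minimal sketch that wraps the positional parsing from the answers above in a reusable function. It assumes the description always has the shape '<amount> <currency> <merchant domain> <location...>', which is inferred from the single example in the question; the function name enrich_transaction is hypothetical.

```python
def enrich_transaction(transaction):
    """Return a new dict with fields parsed out of 'description'.

    Assumes 'description' looks like '<amount> <currency> <merchant> <location...>'.
    """
    amount, currency, merchant, *location = transaction["description"].split()
    enriched = dict(transaction)  # shallow copy, the input is left untouched
    enriched["description"] = " ".join([merchant, *location])
    enriched["amount"] = float(amount)
    enriched["currency"] = currency
    enriched["location"] = " ".join(location)
    enriched["merchant_name"] = merchant.partition(".")[0]
    return enriched


transaction = {
    "trans_time": "14/07/2015 10:03:20",
    "trans_type": "DEBIT",
    "description": "239.95 USD DHL.COM TEXAS USA",
}
result = enrich_transaction(transaction)
print(result)
```

A description with fewer than three words would raise a ValueError at the unpacking step, so real data may need a guard before calling it.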

Related

Output pandas dataframe to json in a particular format

My dataframe is
fname lname city state code
Alice Lee Athens Alabama PXY
Nor Xi Mesa Arizona ABC
The JSON output should be
{
"Employees":{
"Alice Lee":{
"code":"PXY",
"Address":"Athens, Alabama"
},
"Nor Xi":{
"code":"ABC",
"Address":"Mesa, Arizona"
}
}
}
df.to_json() gives no hierarchy to the JSON. Can you please suggest what I am missing? Is there a way to combine columns and give them a key name while writing JSON in pandas?
Thank you.
Try:
names = df[["fname", "lname"]].apply(" ".join, axis=1)
addresses = df[["city", "state"]].apply(", ".join, axis=1)
codes = df["code"]
out = {"Employees": {}}
for n, a, c in zip(names, addresses, codes):
    out["Employees"][n] = {"code": c, "Address": a}
print(out)
Prints:
{
"Employees": {
"Alice Lee": {"code": "PXY", "Address": "Athens, Alabama"},
"Nor Xi": {"code": "ABC", "Address": "Mesa, Arizona"},
}
}
We can populate a new dataframe with columns being "code" and "Address", and index being "full_name" where the latter two are generated from the dataframe's columns with string addition:
new_df = pd.DataFrame({"code": df["code"],
"Address": df["city"] + ", " + df["state"]})
new_df.index = df["fname"] + " " + df["lname"]
which gives
>>> new_df
code Address
Alice Lee PXY Athens, Alabama
Nor Xi ABC Mesa, Arizona
We can now call to_dict with orient="index":
>>> d = new_df.to_dict(orient="index")
>>> d
{"Alice Lee": {"code": "PXY", "Address": "Athens, Alabama"},
"Nor Xi": {"code": "ABC", "Address": "Mesa, Arizona"}}
To match your output, we wrap d with a dictionary:
>>> {"Employees": d}
{
"Employees":{
"Alice Lee":{
"code":"PXY",
"Address":"Athens, Alabama"
},
"Nor Xi":{
"code":"ABC",
"Address":"Mesa, Arizona"
}
}
}
import json
# use a separate name so the json module is not shadowed
records = json.loads(df.to_json(orient='records'))
employees = {}
employees['Employees'] = [{obj['fname']+' '+obj['lname']:{'code':obj['code'], 'Address':obj['city']+', '+obj['state']}} for obj in records]
This outputs -
{
'Employees': [
{
'Alice Lee': {
'code': 'PXY',
'Address': 'Athens, Alabama'
}
},
{
'Nor Xi': {
'code': 'ABC',
'Address': 'Mesa, Arizona'
}
}
]
}
You can solve this using df.iterrows():
employee_dict = {}
for row in df.iterrows():
    # row[0] is the index label, row[1] is the row data for that index
    row_data = row[1]
    employee_name = row_data.fname + ' ' + row_data.lname
    employee_dict[employee_name] = {'code': row_data.code, 'Address':
                                    row_data.city + ', ' + row_data.state}
json_data = {'Employees': employee_dict}
Result:
{'Employees': {'Alice Lee': {'code': 'PXY', 'Address': 'Athens, Alabama'},
'Nor Xi': {'code': 'ABC', 'Address': 'Mesa, Arizona'}}}
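The answers above build a Python dict; to get the JSON text the question asks for, the dict still has to be serialized. A minimal sketch using only the standard library, assuming the rows are available as a list of dicts (for example from df.to_dict('records')); the rows literal below is copied from the question's table.

```python
import json

# Rows as df.to_dict("records") would return them for the question's frame.
rows = [
    {"fname": "Alice", "lname": "Lee", "city": "Athens", "state": "Alabama", "code": "PXY"},
    {"fname": "Nor", "lname": "Xi", "city": "Mesa", "state": "Arizona", "code": "ABC"},
]

# Key each employee by full name and combine city and state into one field.
employees = {
    f"{r['fname']} {r['lname']}": {
        "code": r["code"],
        "Address": f"{r['city']}, {r['state']}",
    }
    for r in rows
}
json_text = json.dumps({"Employees": employees}, indent=2)
print(json_text)
```

Because the full name is used as the key, two employees with the same name would overwrite each other.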

Issue with updating a dictionary in for loop (not unique)

I have an issue with populating a dictionary from the database. It overwrites or merges values that, by my logic, should be unique to each country.
country_list=[]
for countries in item_obj.values('ship_country'):
    if countries['ship_country'] not in country_list:
        country_list.append(countries['ship_country'])
country_dict = {}
for country in country_list:
    country_dict.update({country: []})
cat_dict = { i : {} for i in ProductCategories.objects.filter(user=request.user) }
for country in country_list:
    country_dict[country].append(cat_dict)
# outputs: country_dict = {'Netherlands': [{'Games': {}, 'Bikes': {}, 'Paint': {}}], 'Belgium': [{'Games': {}, 'Bikes': {}, 'Paint': {}}]}
for country, value in country_dict.items():
    for key in value:
        for category in key:
            for channel in channel_list:
                if channel in channels:
                    aggr = item_obj.filter(
                        country=country,
                        category=category,
                        channel=channel).aggregate(Sum('price'))
                    if aggr['price__sum'] != None:
                        key[category] = {channel: round(aggr['price__sum'], 2)}
If I print it out line by line, this gives me exactly what I want to see:
Netherlands
Games
{'price__sum': None}
{}
Bikes
{'Channel #3': Decimal('30.10')}
Paint
{'Channel #1': Decimal('56.70')}
Belgium
Games
{'price__sum': None}
{}
Bikes
{'Channel #1': Decimal('39.70')}
Paint
{'Channel #2': Decimal('13.90')}
But when I print the final dictionary it overwrites the unique Netherlands values...
{
"Netherlands":[
{
"Games":{
},
"Bikes":{
"channel #1": "Decimal(""39.70"")"
},
"Sprays":{
"channel #2": "Decimal(""13.90"")"
}
}
],
"Belgium":[
{
"Games":{
},
"Bikes":{
"channel #1": "Decimal(""39.70"")"
},
"Paint":{
"channel #2": "Decimal(""13.90"")"
}
}
]
}
If I use update() instead of =
if aggr['price__sum'] != None:
    key[category].update({channel: round(aggr['price__sum'], 2)})
it merges the two together like this:
{
"Netherlands":[
{
"Games":{
},
"Bikes":{
"Channel #3":"Decimal(""30.10"")",
"channel #1":"Decimal(""39.70"")"
},
"Paint":{
"Channel #1":"Decimal(""56.70"")",
"channel #2":"Decimal(""13.90"")"
}
}
],
"Belgium":[
{
"Games":{
},
"Bikes":{
"Channel #3":"Decimal(""30.10"")",
"channel #1":"Decimal(""39.70"")"
},
"Paint":{
"Channel #1":"Decimal(""56.70"")",
"channel #2":"Decimal(""13.90"")"
}
}
]
}
It seems that both update() and = ignore the fact that we are in a different country. I have tried a lot and am stuck; I seem to be missing something. Can anyone help me? :-)
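A likely cause, judging only from the code shown: cat_dict is created once and the very same object is appended to every country's list, so Netherlands and Belgium share one set of category dicts, and whatever is written while processing the last country shows up under both. The sketch below reproduces this with plain dicts (no Django; the category names are taken from the question's output) and shows one possible fix, building a fresh dict per country.

```python
categories = ["Games", "Bikes", "Paint"]
countries = ["Netherlands", "Belgium"]

# Shared object: every country points at the same inner dict.
cat_dict = {c: {} for c in categories}
shared = {country: [cat_dict] for country in countries}
shared["Belgium"][0]["Bikes"] = {"Channel #1": 39.70}
print(shared["Netherlands"][0]["Bikes"])  # changed as well

# Possible fix: build a new category dict for each country.
separate = {country: [{c: {} for c in categories}] for country in countries}
separate["Belgium"][0]["Bikes"] = {"Channel #1": 39.70}
print(separate["Netherlands"][0]["Bikes"])  # still empty
```

In the original loop this would mean moving the cat_dict comprehension inside `for country in country_list:`. Using copy.deepcopy(cat_dict) for each country would be another option.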

Accessing a dynamic dictionary key in Python

I have the below python dictionary (dict1) where I can access the value in this way:
dict1['starters']['Chicken Biryani Type']['Please select']
The keys 'Chicken Biryani Type' and 'Please select' are dynamically generated. How would I access them in that scenario? As I understand it, I can't use an index.
Dictionary (Dict1)
{
'starters': {
'product_name': 'Papadam',
'base_price': 3.0,
'Chicken Biryani Type': {
'Please select': [{
'Chicken': 3.0
}, {
'Puree': 4.0
}],
'Please remove': [{
'mayonnaise': 0.0
}, {
'ketchup': 0.0
}]
},
'Papadam Type': {
'Plain or Spicy': [{
'Plain': 3.0
}, {
'Spicy': 5.0
}]
}
}
}
Any help is appreciated.
UPDATED. How the dictionary is created:
dict1 = {}
for cat in category_list:
    cat_dic = {cat: {}}
    dict1.update(cat_dic)
    sliced_df2 = df[df.productCategoryName == cat ]
    product_type = sliced_df2.productTypeName.unique()
    for ptype in product_type:
        sliced_df3 = sliced_df2[sliced_df2.productTypeName == ptype ]
        product_name = sliced_df3.productName.unique()
        product_name_a = {"product_name": product_name[0]}
        base_price = {"base_price": sliced_df3['variant_price'].min()}
        dict1[cat].update(product_name_a)
        dict1[cat].update(base_price)
        product_type_dict = {ptype: {}}
        dict1[cat].update(product_type_dict)
        attribute_names = sliced_df3.attribute_name.unique()
        for item in attribute_names:
            attribute_dict = {item: []}
            dict1[cat][ptype].update(attribute_dict)
            sliced_df4 = sliced_df3[sliced_df3.attribute_name == item ]
            # for item in attribute_values:
            for index, row in sliced_df4.iterrows():
                ab = {row['attribute_value']: row['variant_price']}
                dict1[cat][ptype][item].append(ab)
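Since the nested keys are not known in advance, one option is to iterate with .items() rather than naming them. A sketch, assuming (as in the sample above) that every dict-valued entry under a category is a product type that maps attribute names to lists of one-item {value: price} dicts; the helper name iter_options is hypothetical.

```python
def iter_options(category):
    """Yield (product_type, attribute, value, price) for one category dict."""
    for product_type, attributes in category.items():
        if not isinstance(attributes, dict):
            continue  # skip scalar entries such as product_name and base_price
        for attribute, options in attributes.items():
            for option in options:
                for value, price in option.items():
                    yield product_type, attribute, value, price


dict1 = {
    "starters": {
        "product_name": "Papadam",
        "base_price": 3.0,
        "Chicken Biryani Type": {
            "Please select": [{"Chicken": 3.0}, {"Puree": 4.0}],
            "Please remove": [{"mayonnaise": 0.0}, {"ketchup": 0.0}],
        },
        "Papadam Type": {
            "Plain or Spicy": [{"Plain": 3.0}, {"Spicy": 5.0}],
        },
    }
}

options = list(iter_options(dict1["starters"]))
for row in options:
    print(row)
```

This walks every dynamic key without knowing its name; filtering the yielded tuples then replaces indexing by a fixed key.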

python generator to pandas dataframe

I have a generator being returned from:
data = public_client.get_product_trades(product_id='BTC-USD', limit=10)
How do I turn the data into a pandas dataframe?
The method docstring reads:
"""{"Returns": [{
"time": "2014-11-07T22:19:28.578544Z",
"trade_id": 74,
"price": "10.00000000",
"size": "0.01000000",
"side": "buy"
}, {
"time": "2014-11-07T01:08:43.642366Z",
"trade_id": 73,
"price": "100.00000000",
"size": "0.01000000",
"side": "sell"
}]}"""
I have tried:
df = [x for x in data]
df = pd.DataFrame.from_records(df)
but it does not work; I get the error:
AttributeError: 'str' object has no attribute 'keys'
When I print the items of data, I see a list of dicts, but the end looks strange. Could this be why?
print(list(data))
[{'time': '2020-12-30T13:04:14.385Z', 'trade_id': 116918468, 'price': '27853.82000000', 'size': '0.00171515', 'side': 'sell'},{'time': '2020-12-30T12:31:24.185Z', 'trade_id': 116915675, 'price': '27683.70000000', 'size': '0.01683711', 'side': 'sell'}, 'message']
It looks to be a list of dicts but the end value is a single string 'message'.
Based on the updated question:
df = pd.DataFrame(list(data)[:-1])
Or, more cleanly:
df = pd.DataFrame([x for x in data if isinstance(x, dict)])
print(df)
time trade_id price size side
0 2020-12-30T13:04:14.385Z 116918468 27853.82000000 0.00171515 sell
1 2020-12-30T12:31:24.185Z 116915675 27683.70000000 0.01683711 sell
Oh, and BTW, you'll still need to change those strings into something usable...
So e.g.:
df['time'] = pd.to_datetime(df['time'])
for k in ['price', 'size']:
    df[k] = pd.to_numeric(df[k])
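One thing to watch for with the snippets above: a generator can be consumed only once, so if list(data) has already been printed, a second list(data) comes back empty. A sketch with a stand-in generator (fake_trades is hypothetical and only mimics the shape shown in the question, including the trailing 'message' string), materializing the items once and keeping only the dicts:

```python
def fake_trades():
    """Stand-in for the client call: yields trade dicts, then a stray string."""
    yield {"time": "2020-12-30T13:04:14.385Z", "trade_id": 116918468,
           "price": "27853.82000000", "size": "0.00171515", "side": "sell"}
    yield {"time": "2020-12-30T12:31:24.185Z", "trade_id": 116915675,
           "price": "27683.70000000", "size": "0.01683711", "side": "sell"}
    yield "message"


data = fake_trades()
items = list(data)       # materialize once
leftover = list(data)    # the generator is now exhausted
records = [x for x in items if isinstance(x, dict)]
print(len(items), len(leftover), len(records))
```

The records list can then be passed to pd.DataFrame(records), as in the answer above.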
You could access the values in the dictionary and build a dataframe from it (although not particularly clean):
dict_of_data = [{
"time": "2014-11-07T22:19:28.578544Z",
"trade_id": 74,
"price": "10.00000000",
"size": "0.01000000",
"side": "buy"
}, {
"time": "2014-11-07T01:08:43.642366Z",
"trade_id": 73,
"price": "100.00000000",
"size": "0.01000000",
"side": "sell"
}]
import pandas as pd
list_of_data = [list(dict_of_data[0].values()),list(dict_of_data[1].values())]
pd.DataFrame(list_of_data, columns=list(dict_of_data[0].keys())).set_index('time')
It's straightforward: just use the pd.DataFrame constructor:
#list_of_dicts = [{
# "time": "2014-11-07T22:19:28.578544Z",
# "trade_id": 74,
# "price": "10.00000000",
# "size": "0.01000000",
# "side": "buy"
# }, {
# "time": "2014-11-07T01:08:43.642366Z",
# "trade_id": 73,
# "price": "100.00000000",
# "size": "0.01000000",
# "side": "sell"
#}]
# or, if you take it from 'data' (a generator, so convert it to a list before slicing off the trailing 'message')
list_of_dicts = list(data)[:-1]
df = pd.DataFrame(list_of_dicts)
df
Out[4]:
time trade_id price size side
0 2014-11-07T22:19:28.578544Z 74 10.00000000 0.01000000 buy
1 2014-11-07T01:08:43.642366Z 73 100.00000000 0.01000000 sell
UPDATE
According to the question update, it seems you have JSON data that is still a string:
import json
data = json.loads(data)
data = data['Returns']
pd.DataFrame(data)
time trade_id price size side
0 2014-11-07T22:19:28.578544Z 74 10.00000000 0.01000000 buy
1 2014-11-07T01:08:43.642366Z 73 100.00000000 0.01000000 sell

Define specific json export format of pandas dataframe

I need to export my DF into a specific JSON format, but I'm struggling to format it in the right way.
I'd like to create a subsection with shop_details that show the city and location for the shop if it's known, otherwise it should be left empty.
Code for my DF:
from pandas import DataFrame
Data = {'item_type': ['Iphone','Computer','Computer'],
'purch_price': [1200,700,700],
'sale_price': [1150,'NaN','NaN'],
'city': ['NaN','Los Angeles','San Jose'],
'location': ['NaN','1st street', '2nd street']
}
DF looks like this:
item_type purch_price sale_price city location
0 Iphone 1200 1150 NaN NaN
1 Computer 700 NaN Los Angeles 1st street
2 Computer 700 NaN San Jose 2nd street
The output format should look like below:
[{
"item_type": "Iphone",
"purch_price": "1200",
"sale_price": "1150",
"shop_details": []
},
{
"item_type": "Computer",
"purch_price": "700",
"sale_price": "600",
"shop_details": [{
"city": "Los Angeles",
"location": "1st street"
},
{
"city": "San Jose",
"location": "2nd street"
}
]
}
]
import json
# fillna('') covers real NaN values; replace() covers the literal 'NaN' strings used in the question's Data
df = df.fillna('').replace('NaN', '')
def shop_details(row):
    if row['city'] != '' and row['location'] != '':
        return [{'city': row['city'], 'location': row['location']}]
    else:
        return []
df['shop_details'] = df.apply(shop_details, axis=1)
df = df.drop(['city', 'location'], axis=1)
json.dumps(df.to_dict('records'))
The only problem is that this does not group by item_type, but you should do some of the work ;)
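A possible way to add the missing grouping by item_type, sketched with the standard library on plain records rather than with pandas. It assumes rows shaped like df.to_dict('records') with missing values already replaced by empty strings, and it keeps the first purch_price and sale_price seen for each item_type, which is an assumption because the question does not say how to combine them.

```python
import json

rows = [
    {"item_type": "Iphone", "purch_price": 1200, "sale_price": 1150,
     "city": "", "location": ""},
    {"item_type": "Computer", "purch_price": 700, "sale_price": "",
     "city": "Los Angeles", "location": "1st street"},
    {"item_type": "Computer", "purch_price": 700, "sale_price": "",
     "city": "San Jose", "location": "2nd street"},
]

grouped = {}
for row in rows:
    # The first row of each item_type supplies the price fields (assumption).
    entry = grouped.setdefault(row["item_type"], {
        "item_type": row["item_type"],
        "purch_price": row["purch_price"],
        "sale_price": row["sale_price"],
        "shop_details": [],
    })
    # Only rows with a known city and location contribute a shop entry.
    if row["city"] and row["location"]:
        entry["shop_details"].append(
            {"city": row["city"], "location": row["location"]})

result = list(grouped.values())
print(json.dumps(result, indent=2))
```

Dicts preserve insertion order, so the items appear in the order their item_type is first seen.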
You can do something like the following to export the dataframe to a JSON file:
from pandas import DataFrame
Data = {'item_type': ['Iphone','Computer','Computer'],
'purch_price': [1200,700,700],
'sale_price': [1150,'NaN','NaN'],
'city': ['NaN','Los Angeles','San Jose'],
'location': ['NaN','1st street', '2nd street']
}
df = DataFrame(Data, columns= ['item_type', 'purch_price', 'sale_price', 'city','location' ])
Export = df.to_json ('path where you want to export your json file')
