Create multiple JSON files from CSV by grouping categories - python

Here is a CSV file:
year,product,price
2021,P01,50
2022,P03,60
2021,P02,30
I'm trying to create a JSON file for every year, with the list of products, like this:
{
    "year": "2021",
    "products": {
        "P02": 30,
        "P01": 50
    },
    "processed": "true"
}
Here is my current code:
import json
csv = """2021,P01,50
2022,P03,60
2021,P02,30
"""
response = {}
for line in csv.splitlines():
    fields = line.split(",")
    year, product, price = fields[0], fields[1], fields[2:]
    if year not in response:
        response[year] = {}
    response[year][product] = price
print(json.dumps(response))
This is the result I get:
{
    "2021": {
        "P02": [
            "30"
        ],
        "P01": [
            "50"
        ]
    },
    "2022": {
        "P03": [
            "60"
        ]
    }
}
Could you please help me get the result I'm expecting?
I'm starting to think that I should maybe use a list for this.

If a given product never has more than one price within the same year, you can build an intermediate structure like this:
{
    "2021": {
        "P01": 50,
        "P02": 30
    },
    "2022": {
        "P03": 60
    }
}
To build that structure, and then one payload per year:
import json
csv = """2021,P01,50
2022,P03,60
2021,P02,30
"""
response = {}
for line in csv.splitlines():
    fields = line.split(",")
    # fields[2] is the price string; fields[2:] would be a one-element list
    year, product, price = fields[0], fields[1], int(fields[2])
    year_response = response.get(year, {})
    year_response[product] = price
    response[year] = year_response
# iterate the dictionary and create your custom response
for year, year_response in response.items():
    file_data = {}
    file_data["year"] = year
    file_data["products"] = year_response
    file_data["processed"] = "true"  # a string, as in the desired output
    #TODO: add file_data to file now
If the same product can have different values within the same year, you can simply store a list instead of an integer for each product (for example "P01").
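The remaining TODO (writing one file per year) could be sketched as follows. This is only an illustration under a few assumptions that are not in the question: the CSV has a header row, each file is named `<year>.json`, and the helper names `group_by_year` and `write_year_files` are made up for this example. A temporary directory is used so the sketch does not touch real files.

```python
import csv
import io
import json
import tempfile
from collections import defaultdict
from pathlib import Path

CSV_TEXT = """year,product,price
2021,P01,50
2022,P03,60
2021,P02,30
"""


def group_by_year(csv_text):
    """Return {year: {product: price}} built from CSV text with a header row."""
    grouped = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row["year"]][row["product"]] = int(row["price"])
    return dict(grouped)


def write_year_files(grouped, out_dir):
    """Write one <year>.json file per year (assumed naming) and return the paths."""
    out_dir = Path(out_dir)
    paths = []
    for year, products in grouped.items():
        payload = {"year": year, "products": products, "processed": "true"}
        path = out_dir / f"{year}.json"
        path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
        paths.append(path)
    return paths


grouped = group_by_year(CSV_TEXT)
with tempfile.TemporaryDirectory() as tmp:
    written = write_year_files(grouped, tmp)
    # Read one file back to check what was stored
    loaded_2021 = json.loads((Path(tmp) / "2021.json").read_text(encoding="utf-8"))
    print(loaded_2021)
```

Using `csv.DictReader` instead of `line.split(",")` also copes with quoted fields, which a plain split would break on.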

Related

How to add duplicate columns together after converting from excel to json in python?

I have an Excel file in this format:
Name  Question  Answer
N1    Q1        a1
N2    Q2        a2
N3    Q3        a3
N4    Q4        a4
N3    Q5        a3
Some names are repeated, and their corresponding answers are also the same. I want to convert this into JSON in a format where all the rows with the same name are merged:
[
    {
        "name": "N1",
        "exampleSentences": ["Q1"],
        "defaultReply": {
            "text": ["a1"],
            "type": "text"
        }
    },
    {
        "name": "N2",
        "exampleSentences": ["Q2"],
        "defaultReply": {
            "text": ["a2"],
            "type": "text"
        }
    },
    {
        "name": "N3",
        "exampleSentences": ["Q3", "Q5"],
        "defaultReply": {
            "text": ["a3"],
            "type": "text"
        }
    },
    {
        "name": "N4",
        "exampleSentences": ["Q4"],
        "defaultReply": {
            "text": ["a4"],
            "type": "text"
        }
    }
]
Here is the code that I wrote:
# Import the required python modules
import pandas as pd
import math
import json
import csv
# Define the name of the Excel file
fileName = "FAQ_eng"
# Read the Excel file
df = pd.read_excel("{}.xlsx".format(fileName))
intents = []
intentNames = df["Name"]
# Loop through the list of Names and create a new intent for each row
for index, name in enumerate(intentNames):
    if name is not None:
        exampleSentences = []
        defaultReplies = []
        if df["Question"][index] is not None and df["Question"][index] is not float:
            try:
                exampleSentences = df["Question"][index]
                exampleSentences = [exampleSentences]
                defaultReplies = df["Answer"][index]
                defaultReplies = [defaultReplies]
            except:
                continue
        intents.append({
            "name": name,
            "exampleSentences": exampleSentences,
            "defaultReply": {
                "text": defaultReplies,
                "type": "text"
            }
        })
# Write the list of created intents into a JSON file
with open("{}.json".format(fileName), "w", encoding="utf-8") as outputFile:
    json.dump(intents, outputFile, ensure_ascii=False)
My code adds another JSON object
{
    "name": "N3",
    "exampleSentences": ["Q5"],
    "defaultReply": {
        "text": ["a3"],
        "type": "text"
    }
}
instead of merging Q3 and Q5. What should I do?
The problem is that your code appends a new entry for every row without checking whether the current name has already been seen. You can avoid this by using an initially empty dictionary d that stores key-value pairs of the form d[name] = {"exampleSentences": [question], "text": [answer]}. You can then iterate over df["Name"] like this:
intentNames = df["Name"]
d = {}
# Loop through intentNames and create the dictionary
for index, name in enumerate(intentNames):
    question = df["Question"][index]
    answer = df["Answer"][index]
    if name not in d:
        d[name] = {"exampleSentences": [question], "text": [answer]}
    else:
        d[name]["exampleSentences"].append(question)
Then you can use the created dictionary to create the json file with the expected output like below:
intentNames = df["Name"]
d = {}
# Loop through intentNames and create the dictionary
for index, name in enumerate(intentNames):
    question = df["Question"][index]
    answer = df["Answer"][index]
    if name not in d:
        d[name] = {"exampleSentences": [question], "text": [answer]}
    else:
        d[name]["exampleSentences"].append(question)
# create the JSON array
intents = []
for k, v in d.items():
    intents.append({
        "name": k,
        "exampleSentences": v['exampleSentences'],
        "defaultReply": {
            "text": v['text'],
            "type": "text"
        }
    })
# Write the list of created intents into a JSON file
with open("{}.json".format(fileName), "w", encoding="utf-8") as outputFile:
    json.dump(intents, outputFile, ensure_ascii=False)
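The same merge can be tried without an Excel file. The sketch below is only an illustration: the in-memory `rows` list stands in for the spreadsheet, and the helper name `merge_rows` is hypothetical. It also skips an answer that is already stored for a name, which goes slightly beyond the answer above.

```python
import json

# Hypothetical in-memory rows standing in for the Excel sheet: (Name, Question, Answer)
rows = [
    ("N1", "Q1", "a1"),
    ("N2", "Q2", "a2"),
    ("N3", "Q3", "a3"),
    ("N4", "Q4", "a4"),
    ("N3", "Q5", "a3"),
]


def merge_rows(rows):
    """Group rows by name, collecting every question and each distinct answer."""
    merged = {}
    for name, question, answer in rows:
        entry = merged.setdefault(name, {"exampleSentences": [], "text": []})
        entry["exampleSentences"].append(question)
        if answer not in entry["text"]:
            entry["text"].append(answer)
    # Dictionaries keep insertion order, so names appear in first-seen order
    return [
        {
            "name": name,
            "exampleSentences": entry["exampleSentences"],
            "defaultReply": {"text": entry["text"], "type": "text"},
        }
        for name, entry in merged.items()
    ]


intents = merge_rows(rows)
print(json.dumps(intents[2]))
```

`dict.setdefault` replaces the explicit `if name not in d` branch: it returns the existing entry or creates an empty one on first sight of a name.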

How can I iterate through a dictionary and use context managers in Python?

The dictionary I am trying to iterate through has the following structure:
d = {
"main_key_1": {
"name": "Name1",
"context": "Context1",
"message": "Message1",
"date": "Date1",
"reference": "Reference1"
},
"main_key_2": {
"name": "Name2",
"context": "Context2",
"message": "Message2",
"date": "Date2",
"reference": "Reference2"
}
}
This is the way I tried to iterate:
for item in d.items():
    from_context = f"from {item[1]['context']}"
    with context('given a descriptor'):
        with context(from_context):
            with before.all:
                self.descriptor = item[1]['message']
            with context('that contains a date'):
                with it('recognizes the date'):
                    adapter = MessageToDatetAdapter(self.descriptor)
                    result = adapter.is_a_date()
                    expect(result).to(equal(True))
                with it('extracts the date data'):
                    adapter = MessageToDatetAdapter(self.descriptor)
                    result = adapter.adapt()
                    expect(result['date']).to(equal(item[1]['date']))
                    expect(result['reference']).to(equal(item[1]['reference']))
The first iteration would be something like below:
with context('given a descriptor'):
    with context('from Context1'):
        with before.all:
            self.descriptor = 'Message1'
        with context('that contains a date'):
            with it('recognizes the date'):
                adapter = MessageToDatetAdapter(self.descriptor)
                result = adapter.is_a_date()
                expect(result).to(equal(True))
            with it('extracts the date data'):
                adapter = MessageToDatetAdapter(self.descriptor)
                result = adapter.adapt()
                expect(result['date']).to(equal('Date1'))
                expect(result['reference']).to(equal('Reference1'))
However, it seems like this is not correct. It looks like I cannot iterate through all the dictionary items.

copying data from json response [Python]

I am trying to extract data from the JSON response of a GET request, rebuild the JSON with some values changed (specifically the idter value), and then send the rebuilt JSON in a PUT request.
Below is the target JSON response.
target_json = {
"name": "toggapp",
"ts": [
1234,
3456
],
"gs": [
{
"id": 4491,
"con": "mno"
},
{
"id": 4494,
"con": "hkl"
}
],
"idter": 500,
"datapart": false
}
From the JSON above, I am trying to change the idter value to my custom value, rebuild the JSON data, and send the new JSON data.
Here is what I have tried:
headers = {'Authorization': 'bearer ' + auth_token, 'Content-Type':'application/json', 'Accept':'application/json'}
testid = [7865, 7536, 7789]
requiredbdy = []
for key in testid:
    get_metadata_target = requests.get('https://myapp.com/%s' % key, headers=headers)
    metadata = get_metadata_target.json()
    for key1 in metadata:
        requiredbdy.append(
            {
                "metadata": [{
                    "name": key1['name'],
                    "ts": key1['ts'],
                    "gs": key1['gs'],
                    "idter": 100, #custom value which I want to change
                    "datapart": False
                }]
            }
        )
    send_metadata_newjson = requests.put('https://myapp.com/%s' % key, headers=headers, data=requiredbdy)
    print(send_metadata_newjson.status_code)
Is this approach fine, or how should I proceed to achieve this?
You can use the built-in json module for this, like so:
import json
my_json = """
{
"name": "toggapp",
"ts": [
1234,
3456
],
"gs": [
{
"id": 4491,
"con": "mno"
},
{
"id": 4494,
"con": "hkl"
}
],
"idter": 500,
"datapart": false
}
"""
json_obj = json.loads(my_json)
json_obj['idter'] = 600
print(json.dumps(json_obj))
This prints:
{"name": "toggapp", "ts": [1234, 3456], "gs": [{"id": 4491, "con": "mno"}, {"id": 4494, "con": "hkl"}], "idter": 600, "datapart": false}
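Applied to the question's scenario, the same idea can work on the already-parsed response instead of a JSON string. The sketch below is an assumption-laden illustration: `with_idter` is a hypothetical helper name, and `target_json` is the sample response from the question written as a Python dict (so `false` becomes `False`).

```python
import copy
import json


def with_idter(metadata, new_idter):
    """Return a deep copy of metadata with 'idter' replaced; the input is not modified."""
    updated = copy.deepcopy(metadata)
    updated["idter"] = new_idter
    return updated


# Sample response from the question, as a Python dict
target_json = {
    "name": "toggapp",
    "ts": [1234, 3456],
    "gs": [{"id": 4491, "con": "mno"}, {"id": 4494, "con": "hkl"}],
    "idter": 500,
    "datapart": False,
}

payload = with_idter(target_json, 100)
body = json.dumps(payload)  # JSON text suitable for a request body
print(body)
```

With the requests library, the dict can also be passed directly as `requests.put(url, headers=headers, json=payload)`, where `url` and `headers` are the ones from the question; the `json=` argument serialises the dict for you.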
Here is a small script I used to find entries in some very long and unwieldy JSON documents. It is not pretty and it is badly documented, but it may help in your scenario.
from RecursiveSearch import Retriever
def alter_data(json_data, key, original, newval):
    '''
    Alter *all* values of said keys
    '''
    retr = Retriever(json_data)
    for item_no, item in enumerate(retr.__track__(key)): # i.e. all 'value'
        # Pick parent objects with a last element False in the __track__() result,
        # indicating that `key` is either a dict key or a set element
        if not item[-1]:
            parent = retr.get_parent(key, item_no)
            try:
                if parent[key] == original:
                    parent[key] = newval
            except TypeError:
                # It's a set, this is not the key you're looking for
                pass
if __name__ == '__main__':
    alter_data(notification, key='value',
               original = '********** THIS SHOULD BE UPDATED **********',
               newval = '*UPDATED*')

Join array of objects to string value in Python

I want to join an array of objects into a string in Python. Is there any way to do that?
url = "https://google.com",
search = "thai food",
results = [
{
"restaurant": "Siam Palace",
"rating": "4.5"
},
{
"restaurant": "Bangkok Palace",
"rating": "3.5"
}
]
I want to be able to join these all to form one value.
If I could make it look like:
data = { url = "https://google.com",
{
search = "thai food",
results = [
{
"restaurant": "Siam Palace",
"rating": "4.5"
},
{
"restaurant": "Bangkok Palace",
"rating": "3.5"
}
]
}
}
I am receiving these results from MongoDB and want to join these three values together.
Use the json module:
import json
data = {} # create empty dict
# set the fields
data['url'] = 'https://google.com'
data['search'] = 'thai food'
# set the results
data['results'] = results
# export as string
print(json.dumps(data, indent=4))
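A self-contained version of this, with the `results` list from the question filled in, might look like the sketch below. The flat layout of `data` (three top-level keys) is an assumption, since the desired structure in the question is not valid Python or JSON.

```python
import json

results = [
    {"restaurant": "Siam Palace", "rating": "4.5"},
    {"restaurant": "Bangkok Palace", "rating": "3.5"},
]

# Assumed layout: url, search and results side by side in one dict
data = {"url": "https://google.com", "search": "thai food", "results": results}

as_text = json.dumps(data)          # one string holding all three values
round_tripped = json.loads(as_text) # parse it back to check nothing was lost
print(type(as_text).__name__, round_tripped == data)  # prints: str True
```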

Update Json value in python

I am trying to specify a name for a sheet through the Google spreadsheet API. The name goes in the 'title' key. I have tried the code below, but it adds a new key to the existing JSON instead. Is there a way to reach "title": "" and update its value with the new_date item?
prev_date = datetime.date.today()-datetime.timedelta(1)
new_date = str(prev_date.isoformat())
res = {
"requests": [
{
"addSheet": {
"properties": {
"title": ""
}
}
}
]
}
res['title'] = new_date
print (res)
This is the output:
{'requests': [{'addSheet': {'properties': {'title': ''}}}], 'title': '2016-12-29'}
This is what I would like it to be:
{'requests': [{'addSheet': {'properties': {'title': '2016-12-29'}}}]}
In the structure you posted, the title key that you need to modify is nested more deeply than the path you are using.
You need to make the following change:
import datetime
prev_date = datetime.date.today()-datetime.timedelta(1)
new_date = str(prev_date.isoformat())
res = {
    "requests": [
        {
            "addSheet": {
                "properties": {
                    "title": ""
                }
            }
        }
    ]
}
res['requests'][0]['addSheet']['properties']['title'] = new_date
print (res)
Where:
'requests' is the top-level key, and its value is a list
0 is the index of the first (and only) item in that list
'addSheet' is a key in the dictionary stored at index 0 of the list
'properties' is a key in the 'addSheet' dictionary
'title' is a key in the 'properties' dictionary, and the one you need to update
You are indexing your JSON object incorrectly: you are adding a new key named 'title' at the root of the object, while what you want is to update the value nested inside the array. In your case, the assignment should be res['requests'][0]['addSheet']['properties']['title'] = new_date
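If this kind of nested update comes up often, the path can be walked by a small helper. The sketch below is only an illustration: `set_nested` is a hypothetical name, not part of any library, and it deliberately refuses to create a missing dictionary key so that the mistake from the question raises an error instead of silently adding 'title' at the root.

```python
def set_nested(obj, path, value):
    """Assign value at the location given by path (a list of dict keys and list indexes)."""
    *parents, last = path
    target = obj
    for step in parents:
        target = target[step]  # KeyError/IndexError here means the path is wrong
    if isinstance(target, dict) and last not in target:
        raise KeyError(last)  # refuse to create a new key by accident
    target[last] = value


res = {"requests": [{"addSheet": {"properties": {"title": ""}}}]}
set_nested(res, ["requests", 0, "addSheet", "properties", "title"], "2016-12-29")
print(res)
```

Calling `set_nested(res, ["title"], "2016-12-29")` on the same object raises `KeyError`, because there is no 'title' key at the top level.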
I now realize I can put my variable directly in the dictionary literal:
prev_date = datetime.date.today()-datetime.timedelta(1)
new_date = str(prev_date.isoformat())
req = {
"requests": [
{
"addSheet": {
"properties": {
"title": new_date
}
}
