How to print clean string from JSON list // Dict? - python

I'm having trouble working out how to print clean strings from a JSON file that holds a list of dicts.
I tried the .join and .split methods, but they don't seem to work. Thanks for the help, guys.
My code:
import json
with open("user.json") as f:
    data = json.load(f)
for person in data["person"]:
    print(person)
The JSON file
{
"person": [
{
"name": "Peter",
"Country": "Montreal",
"Gender": "Male"
},
{
"name": "Alex",
"Country": "Laval",
"Gender": "Male"
}
]
}
The print output (Which is not the correct format I want)
{'name': 'Peter', 'Country': 'Montreal', 'Gender': 'Male'}
{'name': 'Alex', 'Country': 'Laval', 'Gender': 'Male'}
I want to have the output print format to be like this:
Name: Peter
Country: Montreal
Gender: Male

If you want to print all the attributes in the person dictionary (with no exceptions) you can use:
for person in data["person"]:
    for k, v in person.items():
        print(k, ':', v)

You can access the values using their keys, as follows:
import json
with open("user.json") as f:
    data = json.load(f)
for person in data["person"]:
    print(f'Name: {person["name"]}')
    print(f'Country: {person["Country"]}')
    print(f'Gender: {person["Gender"]}')
Result:
Name: Peter
Country: Montreal
Gender: Male
Name: Alex
Country: Laval
Gender: Male

With f-strings, which require Python 3.6+:
for person in data["person"]:
    print(f"Name: {person['name']}")
    print(f"Country: {person['Country']}")
    print(f"Gender: {person['Gender']}")
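The answers above either hard-code the three labels or print the keys exactly as stored. A minimal sketch that combines the two ideas is shown here: it loops over whatever keys each person has and capitalizes the first letter of each label. The helper name format_person is hypothetical, and the JSON is inlined as a string so the example runs without user.json.

```python
import json

# Sample data mirroring user.json from the question, inlined for the sketch.
raw = """
{"person": [
  {"name": "Peter", "Country": "Montreal", "Gender": "Male"},
  {"name": "Alex", "Country": "Laval", "Gender": "Male"}
]}
"""


def format_person(person):
    """Return one 'Label: value' line per key (hypothetical helper)."""
    # Upper-case only the first letter, so "name" becomes "Name"
    # while "Country" and "Gender" stay unchanged.
    return "\n".join(
        f"{key[:1].upper()}{key[1:]}: {value}" for key, value in person.items()
    )


data = json.loads(raw)
# A blank line between people keeps the records visually separate.
report = "\n\n".join(format_person(p) for p in data["person"])
print(report)
```

Because the labels come from the keys, a person with extra or missing fields is printed without any code changes.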

Related

Splitting string in multiple variable fields using regex using python

I have a dataframe where each row of a certain column is text from a badly formatted form, in which each field value comes after its field title. An example is:
col
Name: Bob Surname: Ross Title: painter age:34
Surname: Isaac Name: Newton Title: coin checker age: 42
age:20 Title: pilot Name: jack
this is some trash text Name: John Surname: Doe
As the example shows, the fields can be in any order and some of them may be missing.
What I need to do is parse the fields so that the second line becomes something like:
{'Surname': 'Isaac', 'Name': 'Newton', ...}
While I can deal with the 'pythonic part', I believe the parsing should be done with a regex (also because there are thousands of rows), but I have no idea how to design it.
Try:
x = df["col"].str.extractall(r"([^\s:]+):\s*(.+?)\s*(?=[^\s:]+:|\Z)")
x = x.droplevel(level="match").pivot(columns=0, values=1)
print(x.apply(lambda x: x[x.notna()].to_dict(), axis=1).to_list())
Prints:
[
{"Name": "Bob", "Surname": "Ross", "Title": "painter", "age": "34"},
{
"Name": "Newton",
"Surname": "Isaac",
"Title": "coin checker",
"age": "42",
},
{"Name": "jack", "Title": "pilot", "age": "20"},
]
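For readers who want to see what the pattern does without pandas, here is a minimal sketch that applies the same regular expression to plain strings with the standard re module. The helper name parse_fields is hypothetical. The sketch assumes that field values never contain a word followed directly by a colon, since such a word would be taken as the next field title.

```python
import re

# Same pattern as in the answer above: a "title:" token, then the shortest
# value that is followed by the next "title:" token or the end of the string.
FIELD = re.compile(r"([^\s:]+):\s*(.+?)\s*(?=[^\s:]+:|\Z)")


def parse_fields(text):
    """Return {field title: value} for one row (hypothetical helper)."""
    # findall returns (title, value) tuples, which dict() accepts directly.
    return dict(FIELD.findall(text))


rows = [
    "Name: Bob Surname: Ross Title: painter age:34",
    "Surname: Isaac Name: Newton Title: coin checker age: 42",
    "age:20 Title: pilot Name: jack",
    "this is some trash text Name: John Surname: Doe",
]

parsed = [parse_fields(row) for row in rows]
for record in parsed:
    print(record)
```

Text that precedes the first field title, such as the leading words in the last row, is skipped because it never matches the "title:" part of the pattern.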

Output pandas dataframe to json in a particular format

My dataframe is
fname lname city state code
Alice Lee Athens Alabama PXY
Nor Xi Mesa Arizona ABC
The output of json should be
{
"Employees":{
"Alice Lee":{
"code":"PXY",
"Address":"Athens, Alabama"
},
"Nor Xi":{
"code":"ABC",
"Address":"Mesa, Arizona"
}
}
}
df.to_json() gives the JSON no hierarchy. Can you please suggest what I am missing? Is there a way to combine columns and give them a key name while writing JSON in pandas?
Thank you.
Try:
names = df[["fname", "lname"]].apply(" ".join, axis=1)
addresses = df[["city", "state"]].apply(", ".join, axis=1)
codes = df["code"]
out = {"Employees": {}}
for n, a, c in zip(names, addresses, codes):
    out["Employees"][n] = {"code": c, "Address": a}
print(out)
Prints:
{
"Employees": {
"Alice Lee": {"code": "PXY", "Address": "Athens, Alabama"},
"Nor Xi": {"code": "ABC", "Address": "Mesa, Arizona"},
}
}
We can build a new dataframe whose columns are "code" and "Address" and whose index is the full name; the address and the full name are generated from the original dataframe's columns with string concatenation:
new_df = pd.DataFrame({"code": df["code"],
"Address": df["city"] + ", " + df["state"]})
new_df.index = df["fname"] + " " + df["lname"]
which gives
>>> new_df
code Address
Alice Lee PXY Athens, Alabama
Nor Xi ABC Mesa, Arizona
We can now call to_dict with orient="index":
>>> d = new_df.to_dict(orient="index")
>>> d
{"Alice Lee": {"code": "PXY", "Address": "Athens, Alabama"},
"Nor Xi": {"code": "ABC", "Address": "Mesa, Arizona"}}
To match your output, we wrap d in a dictionary:
>>> {"Employees": d}
{
"Employees":{
"Alice Lee":{
"code":"PXY",
"Address":"Athens, Alabama"
},
"Nor Xi":{
"code":"ABC",
"Address":"Mesa, Arizona"
}
}
}
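import json
# Use a separate name for the parsed records so the json module is not shadowed.
records = json.loads(df.to_json(orient='records'))
employees = {}
employees['Employees'] = [{obj['fname']+' '+obj['lname']:{'code':obj['code'], 'Address':obj['city']+', '+obj['state']}} for obj in records]
This outputs the following (note that 'Employees' is a list of single-key dicts here, not the nested dict requested in the question):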
{
'Employees': [
{
'Alice Lee': {
'code': 'PXY',
'Address': 'Athens, Alabama'
}
},
{
'Nor Xi': {
'code': 'ABC',
'Address': 'Mesa, Arizona'
}
}
]
}
You can solve this using df.iterrows():
employee_dict = {}
# iterrows() yields (index, row) pairs; only the row data is needed here
for _, row_data in df.iterrows():
    employee_name = row_data.fname + ' ' + row_data.lname
    employee_dict[employee_name] = {'code': row_data.code,
                                    'Address': row_data.city + ', ' + row_data.state}
json_data = {'Employees': employee_dict}
Result:
{'Employees': {'Alice Lee': {'code': 'PXY', 'Address': 'Athens, Alabama'},
'Nor Xi': {'code': 'ABC', 'Address': 'Mesa, Arizona'}}}
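The answers above stop at a Python dict, while the question asks for JSON. A minimal sketch of the last step follows, under the assumption that the rows are available as a list of dicts, which is what df.to_dict(orient="records") returns. The list is written out by hand here so the example runs without pandas.

```python
import json

# Hypothetical stand-in for df.to_dict(orient="records").
records = [
    {"fname": "Alice", "lname": "Lee", "city": "Athens", "state": "Alabama", "code": "PXY"},
    {"fname": "Nor", "lname": "Xi", "city": "Mesa", "state": "Arizona", "code": "ABC"},
]

# Key each employee by full name and combine city and state into one address.
employees = {
    f"{r['fname']} {r['lname']}": {
        "code": r["code"],
        "Address": f"{r['city']}, {r['state']}",
    }
    for r in records
}

# json.dumps turns the nested dict into JSON text; indent=2 pretty-prints it.
json_text = json.dumps({"Employees": employees}, indent=2)
print(json_text)
```

To write a file instead of a string, json.dump with an open file object does the same job.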

How to access elements in a specific way in json?

I want to learn JSON and I am using Python. I currently have a question about how to access elements. Here would be a generic example of the JSON information:
"data":{
"Bob":{
"name":"Bob",
"age":"30",
"state":"California",
"job":"accountant"
},
"Joe":{
"name":"Bob",
"age":"30",
"state":"Florida",
"job":"engineer"
},
"Tom":{
"name":"Bob",
"age":"25",
"state":"North Dakota",
"job":"manager"
}
}
Now I want to write a for loop that collects a list of all the names whose age is 30. How would I do this? I tried something like this:
array = []
for x in range(0,3):
    if data[x]['age'] is '30'
        array.append(data[x])
but that is definitely wrong. Can somebody teach me how to filter the items in JSON this way?
You can iterate over JSON data the same way as over a Python dict:
import json
with open('data.json') as json_file:
    input_data = json.load(json_file)
data = []
# dict.items() yields (key, value) pairs; here k is the key and v is the value
for k, v in input_data['data'].items():
    if v['age'] == '30':
        data.append(v)
print(data)
Output:
[{'name': 'Bob', 'age': '30', 'state': 'California', 'job': 'accountant'}, {'name': 'Bob', 'age': '30', 'state': 'Florida', 'job': 'engineer'}]
Try this:
data = {
"Bob":{
"name":"Bob",
"age":"30",
"state":"California",
"job":"accountant"
},
"Joe":{
"name":"Bob",
"age":"30",
"state":"Florida",
"job":"engineer"
},
"Tom":{
"name":"Bob",
"age":"25",
"state":"North Dakota",
"job":"manager"
}
}
names = []
for name, information in data.items():
    if information['age'] == '30':
        names.append(name)
print(names)
If you have a dictionary object called data, you can use data.items() to iterate over the keys ('Bob', 'Joe', 'Tom') and values of the dictionary. You could check out the documentation here: https://docs.python.org/3/library/stdtypes.html#dict.items
json_data = {"data":{
"Bob":{
"name":"Bob",
"age":"30",
"state":"California",
"job":"accountant"
},
"Joe":{
"name":"Bob",
"age":"30",
"state":"Florida",
"job":"engineer"
},
"Tom":{
"name":"Bob",
"age":"25",
"state":"North Dakota",
"job":"manager"
}
}
}
names = []
for k, v in json_data['data'].items():
    if v['age'] == '30':
        names.append(k)
print(names)
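The loops above can also be written as a list comprehension. This is a minimal sketch on the same sample data; the variable names people and names_aged_30 are illustrative only. The comparison is against the string '30' because the ages in the sample are stored as strings.

```python
# Sample data copied from the question.
people = {
    "Bob": {"name": "Bob", "age": "30", "state": "California", "job": "accountant"},
    "Joe": {"name": "Bob", "age": "30", "state": "Florida", "job": "engineer"},
    "Tom": {"name": "Bob", "age": "25", "state": "North Dakota", "job": "manager"},
}

# Keep the dictionary key (the person's name) whenever the age matches.
names_aged_30 = [name for name, info in people.items() if info["age"] == "30"]
print(names_aged_30)  # ['Bob', 'Joe']
```

If the ages were stored as numbers, the condition would compare against the integer 30 instead.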

Match regex in python and return key

I have a nested dictionary and I am having trouble matching a regular expression against its values. I need to iterate through the values in the dictionary and return the key whose value the regex matched.
I have a nested dictionary like this:
user_info = {'user1': {'name': 'Aby',
                       'surname': 'Clark',
                       'description': 'Hi contact me by phone +1 548 5455 55 or facebook.com/aby.clark'},
             'user2': {'name': 'Marta',
                       'surname': 'Bishop',
                       'description': 'Nice to meet you text me'},
             'user3': {'name': 'Janice',
                       'surname': 'Valinise',
                       'description': 'You can contact me by phone +1 457 555667'},
             'user4': {'name': 'Helen',
                       'surname': 'Bush',
                       'description': 'You can contact me by phone +1 778 65422'},
             'user5': {'name': 'Janice',
                       'surname': 'Valinise',
                       'description': 'You can contact me by phone +1 457 5342327 or email janval#yahoo.com'}}
So I need to iterate through the values of the dictionary with a regex, find a match, and return the key where the match happened.
The first problem I faced was extracting the values from the nested dictionary, but I solved this with:
for key in user_info.keys():
    for values in user_info[key].values():
        print(values)
This gives back the values from the nested dictionary. So is there a way to run a regex over these values so that it finds a match and returns the key where the match happened?
I tried the following:
for key in user_info.keys():
    for values in user_info.[key].values():
        #this regex match the email
        email = re.compile(r'(^[a-zA-Z0-9_.+-]+#[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$)'.format(pattern), re.IGNORECASE|re.MULTILINE)
        match = re.match(email)
        if match is not None:
            print ("No values.")
        if found:
            return match
Am I doing something wrong? I have been wrestling with this question for a week...
Could you please tell me what's wrong and give me some tips on how to solve this #!4fd... please? Thank you!
P.S. And yes, I didn't find a similar issue on Stack Overflow or Google. I've tried.
Looks like you want to extract the emails from the JSON values while also returning the matched key. Here are 2 solutions. The first one is similar to yours and the second one is generalized to any JSON with arbitrary levels.
Two for loops
import re
user_info = {
"user1": {
"name": "Aby",
"surname": "Clark",
"description": "Hi contact me by phone +1 548 5455 55 or facebook.com/aby.clark"
},
"user2": {
"name": "Marta",
"surname": "Bishop",
"description": "Nice to meet you text me"
},
"user3": {
"name": "Janice",
"surname": "Valinise",
"description": "You can contact me by phone +1 457 555667"
},
"user4": {
"name": "Helen",
"surname": "Bush",
"description": "You can contact me by phone +1 778 65422"
},
"user5": {
"name": "Janice",
"surname": "Valinise",
"description": "You can contact me by phone +1 457 5342327 or email janval#yahoo.com",
}
}
matches = []
for user, info in user_info.items():
    for key, value in info.items():
        # raw string, so that \. reaches the regex engine unchanged
        emails = re.findall(r"([a-zA-Z0-9_.+-]+#[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)", value)
        if emails:
            matches.append((f'{user}.{key}', emails))
print(matches)
# -> [('user5.description', ['janval#yahoo.com'])]
The recursive approach for arbitrary JSON
import re
user_info = {
"user1": {
"name": "Aby",
"surname": "Clark",
"description": "Hi contact me by phone +1 548 5455 55 or janval#yahoo.com",
"friends": [
{
"name": "Aby",
"surname": "Clark",
"description": "Hi contact me by phone +1 548 5455 55 or janval#yahoo.com",
}
]
}
}
def traverse(obj, keys=()):
    # keys is the path of dict keys and list indices leading to obj
    if isinstance(obj, str):
        emails = re.findall(r"([a-zA-Z0-9_.+-]+#[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)", obj)
        return [('.'.join(keys), emails)] if emails else []
    if isinstance(obj, dict):
        return [match for key, value in obj.items() for match in traverse(value, [*keys, key])]
    if isinstance(obj, list):
        return [match for i, value in enumerate(obj) for match in traverse(value, [*keys, str(i)])]
    # numbers, booleans and None cannot contain a match
    return []
print(traverse(user_info, []))
# -> [('user1.description', ['janval#yahoo.com']), ('user1.friends.0.description', ['janval#yahoo.com'])]
You can try using search instead of the match function, like this:
for key in user_info.keys():
    for values in user_info[key].values():
        email = re.search(r'([a-zA-Z0-9_.+-]+#[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+)+', values)
        if email is not None:
            print(key)
This code prints every key whose inner values contain a match.
Notice that the code you tried never uses values at all.
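Since the question asks for the key to be returned rather than printed, the search-based loop can be wrapped in a function. This is a minimal sketch; the function name keys_with_match and the shortened sample dictionary are hypothetical. The pattern keeps the '#' used in the sample data above.

```python
import re


def keys_with_match(users, pattern):
    """Return the outer keys whose inner string values match pattern (hypothetical helper)."""
    compiled = re.compile(pattern)
    return [
        key
        for key, info in users.items()
        # re.search scans the whole string, whereas re.match only tries position 0.
        if any(isinstance(value, str) and compiled.search(value) for value in info.values())
    ]


# Shortened sample in the shape of user_info from the question.
sample = {
    "user2": {"name": "Marta", "surname": "Bishop",
              "description": "Nice to meet you text me"},
    "user5": {"name": "Janice", "surname": "Valinise",
              "description": "You can contact me by phone +1 457 5342327 or email janval#yahoo.com"},
}

EMAIL = r"[a-zA-Z0-9_.+-]+#[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+"
print(keys_with_match(sample, EMAIL))  # ['user5']
```

Returning a list covers the case where several users match, and an empty list signals that nothing matched.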

How to write multiple json files in python?

First of all, I would like to ask: what is the typical format of a JSON file with multiple objects?
Is it a list of objects like: [ {...}, {...} ... ]?
Second, I tried to store multiple dicts in a single JSON file using Python.
I have two dicts:
test_data = {'profile_img': 'https://fmdataba.com/images/p/4592.png',
'name': 'Son Heung-Min ',
'birth_date': '8/7/1992',
'nation': 'South Korea KOR',
'position': 'M (R), AM (RL), ST (C)',
'foot': 'Either'}
and
test_data2 = {'profile_img': 'https://fmdataba.com/images/p/1103.png',
'name': 'Marc-André ter Stegen ',
'birth_date': '30/4/1992',
'nation': 'Germany',
'position': 'GK',
'foot': 'Either'}
then I did
with open('data.json', 'w') as json_file:
    json.dump(test_data, json_file, indent=2)
    json.dump(test_data2, json_file, indent=2)
Of course, I would have iterated a list of dicts to store multiple dicts, but I just did this for now to test if the format is correct. The result .json file looks like
data.json
{
"profile_img": "https://fmdataba.com/images/p/4592.png",
"name": "Son Heung-Min ",
"birth_date": "8/7/1992",
"nation": "South Korea KOR",
"position": "M (R), AM (RL), ST (C)",
"foot": "Either"
}{
"profile_img": "https://fmdataba.com/images/p/1103.png",
"name": "Marc-Andr\u00e9 ter Stegen ",
"birth_date": "30/4/1992",
"nation": "Germany",
"position": "GK",
"foot": "Either"
}
It seems pretty weird because there is no comma between the two objects.
What is the typical way of doing this?
You need to create an object that holds all your data in a list first, then dump that list to the file:
test_data_list = [test_data, test_data2]
with open('data.json', 'w') as json_file:
    json.dump(test_data_list, json_file)
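A single top-level list is a valid JSON document, so the file can be read back with one json.load call. Below is a minimal round-trip sketch with shortened versions of the two dicts. The temporary path and the variable name players are assumptions made for the example, so that it does not overwrite a real data.json.

```python
import json
import os
import tempfile

# Shortened versions of test_data and test_data2 from the question.
players = [
    {"name": "Son Heung-Min", "nation": "South Korea KOR", "position": "M (R), AM (RL), ST (C)"},
    {"name": "Marc-André ter Stegen", "nation": "Germany", "position": "GK"},
]

# Write into a fresh temporary directory to avoid touching existing files.
path = os.path.join(tempfile.mkdtemp(), "data.json")

with open(path, "w", encoding="utf-8") as json_file:
    # ensure_ascii=False keeps "é" readable instead of writing \u00e9.
    json.dump(players, json_file, indent=2, ensure_ascii=False)

with open(path, encoding="utf-8") as json_file:
    loaded = json.load(json_file)

print(loaded == players)  # True
```

By contrast, the file produced by two consecutive json.dump calls holds two top-level values, which json.load rejects.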
