Convert lines of string to a dictionary

Convert lines of string to a dictionary - python

I have an initial code like this:
record = "Jane,Doe,25/02/2002;
James,Poe,19/03/1998;
Max,Soe,16/12/2001
..."
I need to make it into a dictionary and its output should be something like this:
{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'}
{'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'}
...
Each line should have an incrementing key starting from 1.
I currently have no idea to approach this issue as I am still a student with no prior experience.
I have seen people use this for strings containing key-value pairs but my string does not contain those:
mydict = dict((k.strip(), v.strip()) for k,v in
(item.split('-') for item in record.split(',')))

Use split:
In [220]: ans = []
In [221]: record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
In [223]: l = record.split(';')
In [227]: for i in l:
...: l1 = i.split(',')
...: d = {'First Name': l1[0], 'Last Name': l1[1], 'Birthday': l1[2]}
...: ans.append(d)
...:
In [228]: ans
Out[228]:
[{'First Name': 'Jane', 'Last Name': 'Doe', 'Birthday': '25/02/2002'},
{'First Name': 'James', 'Last Name': 'Poe', 'Birthday': '19/03/1998'},
{'First Name': 'Max', 'Last Name': 'Soe', 'Birthday': '16/12/2001'}]

To make the required dictionary for a single line, you can use split to chop up the line where there are commas (','), to get the values for the dictionary, and hard-code the keys. E.g.
line = "Jane,Doe,25/02/2002"
values = line.split(",")
d = {"First Name": values[0], "Last Name": values[1], "Birthday": values[2]}
Now to repeat that for each line in the record, a list of all the lines is needed. Again, you can use split in this case to chop up the input where there are semicolons (';'). E.g.
record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
lines = record.split(";")
Now you can iterate the solution for one line over this lines list, collecting the results into another list.
results = []
for line in lines:
values = line.split(",")
results.append({"First Name": values[0], "Last Name": values[1], "Birthday": values[2]})
The incremental key requirement you mention seems strange, because you could just keep them in a list, where the index in the list is effectively the key. But of course, if you really need the indexed-dictionary thing, you can use a dictionary comprehension to do that.
results = {i + 1: results[i] for i in range(len(results))}
Finally, the whole thing might be made more concise (and nicer IMO) by using a combination of list and dictionary comprehensions, as well as a list of your expected keys.
record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
keys = ["First Name", "Last Name", "Birthday"]
results = [dict(zip(keys, line.split(","))) for line in record.split(";")]
With the optional indexed-dictionary thingy:
results = {i + 1: results[i] for i in range(len(results))}

This should work for your case:
lines = [line.replace('\n','').replace('.','').strip() for line in record.split(';')]
desired_dict = {}
for i, line in enumerate(lines):
words = line.split(',')
desired_dict[i] = {
'First name':words[0],
'Last name':words[1],
'Birthday':words[2]
}

The .split() method is useful. First split the strings separated by ; and split each of the new strings by ,.
record = """Jane,Doe,25/02/2002;
James,Poe,19/03/1998;
Max,Soe,16/12/2001"""
out = []
for rec in record.split(';'):
lst = rec.strip().split(',')
dict_new = {}
dict_new['First Name'] = lst[0]
dict_new['Last Name'] = lst[1]
dict_new['Birthday'] = lst[2]
out.append(dict_new)
print(out)

other answers are already quite clear, just want to add on that, you can do it in one line (which is much less readable, not recommended, but it is arguably fancier). it also takes possible spaces into account with strip(), you can remove them if you don't want them. this gives you a list of dicts you need
record_dict = [{'First name': val[0].strip(), 'Last name': val[1].strip(), 'Birthday': val[2].strip()} for val in (rec.strip().split(',') for rec in record.strip().split(';'))]

I think you are looking for :
record = """Jane,Doe,25/02/2002;
James,Poe,19/03/1998;
Max,Soe,16/12/2001"""
num = 0
out = dict()
for v in record.split(";"):
v = v.strip().split(",")
num += 1
out[num] = {'First name':v[0],'Last name':v[1], 'Birthday':v[2]}
print(out)
prints:
{1: {'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'},
2: {'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'},
3: {'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}}

# raw string data
record = 'Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001'
# list of lists
list_of_lists = [x.split(',') for x in record.split(';')]
# list of dicts
list_of_dicts = []
for x in list_of_lists:
# assemble into dict
d = {'First name': x[0],
'Last name': x[1],
'Birthday': x[2]}
# append to list
list_of_dicts.append(d)
output:
[{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'},
{'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'},
{'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}]

Here is a step by step Pythonic way to achieve that:
>>> from pprint import pprint # just to have a fancy print
>>> columns = ['First name', 'Last name', 'Birthday']
>>> records = '''Jane,Doe,25/02/2002
... James,Poe,19/03/1998
... Max,Soe,16/12/2001'''
>>> records = records.split()
>>> pprint(records)
['Jane,Doe,25/02/2002',
'James,Poe,19/03/1998',
'Max,Soe,16/12/2001']
>>> records = [_.split(',') for _ in records]
>>> pprint(records)
[['Jane', 'Doe', '25/02/2002'],
['James', 'Poe', '19/03/1998'],
['Max', 'Soe', '16/12/2001']]
>>> records = [dict(zip(columns, _)) for _ in records]
>>> pprint(records)
[{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'},
{'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'},
{'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}]
If you have all records in one line, delimited by a ; signal, then you can do this:
>>> from pprint import pprint # just to have a fancy print
>>> columns = ['First name', 'Last name', 'Birthday']
>>> records = 'Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001'
>>> records = records.split(';')
>>> pprint(records)
['Jane,Doe,25/02/2002',
'James,Poe,19/03/1998',
'Max,Soe,16/12/2001']
>>> records = [_.split(',') for _ in records]
>>> pprint(records)
[['Jane', 'Doe', '25/02/2002'],
['James', 'Poe', '19/03/1998'],
['Max', 'Soe', '16/12/2001']]
>>> records = [dict(zip(columns, _)) for _ in records]
>>> pprint(records)
[{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'},
{'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'},
{'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}]
And finally you can put it all together in one line:
>>> from pprint import pprint # just to have a fancy print
>>> columns = ['First name', 'Last name', 'Birthday']
>>> records = 'Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001'
>>> # All tasks in one line now
>>> records = [dict(zip(columns, _)) for _ in [_.split(',') for _ in records.split(';')]]
>>> pprint(records)
[{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'},
{'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'},
{'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}]

list comprehensions make it easy.
record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
list_of_records = [item.split(',') for item in record.split(';')]
dict_of_records = [{'first_name':line[0], 'last_name':line[1], 'Birthday':line[2]} for line in list_of_records]
print(dict_of_records)
Output:
[{'first_name': 'Jane', 'last_name': 'Doe', 'Birthday': '25/02/2002'}, {'first_name': 'James', 'last_name': 'Poe', 'Birthday': '19/03/1998'}, {'first_name': 'Max', 'last_name': 'Soe', 'Birthday': '16/12/2001'}]

You can do it without writing any loops using sub() method of re and json:
import re
import json
record = "Jane,Doe,25/02/2002;James,Poe,19/03/1998;Max,Soe,16/12/2001"
sub_record = re.sub(r'\b;?([a-zA-Z]+),([a-zA-Z]+),(\d\d/\d\d/\d\d\d\d)',r',{"First name": "\1", "Last name": "\2", "Birthday": "\3"}',record)
mydict = json.loads('['+sub_record[1:]+']')
print(mydict)
Output:
[{'First name': 'Jane', 'Last name': 'Doe', 'Birthday': '25/02/2002'}, {'First name': 'James', 'Last name': 'Poe', 'Birthday': '19/03/1998'}, {'First name': 'Max', 'Last name': 'Soe', 'Birthday': '16/12/2001'}]

With some regex:
import re
[re.match(r'(?P<First_name>\w+),(?P<Last_name>\w+),(?P<Birthday>.+)', r).groupdict() for r in record.split(';')]
The underscores in First_name and Last_name are inevitable unfortunately.

Related

How to remove the duplicate dictionaries from the list of dictionaries where dictionary contains an another dictionary? [duplicate]

This question already has answers here:
Remove duplicate dict in list in Python
(16 answers)
Closed 8 months ago.
I have a bit complex list of dictionaries which looks like
[
{'Name': 'Something XYZ', 'Address': 'Random Address', 'Customer Number': '-', 'User Info': [{'Registration Number': '17002', 'First Name': 'John', 'Middle Name': '', 'Last Name': 'Denver'}, {'Registration Number': '27417', 'First Name': 'Robert', 'Middle Name': '', 'Last Name': 'Patson'}]},
{'Name': 'Something XYZ', 'Address': 'Random Address', 'Customer Number': '-', 'User Info': [{'Registration Number': '27417', 'First Name': 'Robert', 'Middle Name': '', 'Last Name': 'Patson'}, {'Registration Number': '17002', 'First Name': 'John', 'Middle Name': '', 'Last Name': 'Denver'}]}
]
Expected is below
[
{'Name': 'Something XYZ', 'Address': 'Random Address', 'Customer Number': '-', 'User Info': [{'Registration Number': '17002', 'First Name': 'John', 'Middle Name': '', 'Last Name': 'Denver'}, {'Registration Number': '27417', 'First Name': 'Robert', 'Middle Name': '', 'Last Name': 'Patson'}]},
]
I want to remove the duplicate dictionaries in this list but I don't know how to deal with User Info because the order of the items might be different. A duplicate case would be where all the dictionary items are exactly the same and in the case of User Info order doesn't matter.

I think the best way is to make a hash of User Info by sum the hash values of it's elements (sum will tolerate position change).
def deepHash(value):
if type(value) == list:
return sum([deepHash(x) for x in value])
if type(value) == dict:
return sum([deepHash(x) * deepHash(y) for x, y in value.items()])
return hash(str(value))
and you can simply check the hash of you inputs:
assert deepHash({"a": [1,2,3], "c": "d"}) == deepHash({"c": "d", "a": [3,2,1]})

Is there a way to extract the selected value in a nested Dictionary using a for loop?

Using this dictionary, is there a way I can only extract the Name, Last Name, and Age of the boys?
myDict = {'boy1': {'Name': 'JM', 'Last Name':'Delgado', 'Middle Name':'Goneza', 'Age':'21',
'Birthday':'8/22/2001', 'Gender':'Male'},
'boy2': {'Name': 'Ralph', 'Last Name':'Tubongbanua', 'Middle Name':'Castro',
'Age':'21', 'Birthday':'9/5/2001', 'Gender':'Male'},}
for required in myDict.values():
print (required ['Name', 'Last Name', 'Age'])
The output is:
JM
Ralph
What I have in mind is
JM Delgado 21
Ralph Tubongbanua 21

You have to extract the keys one by one:
myDict = {'boy1': {'Name': 'JM', 'Last Name':'Delgado', 'Middle Name':'Goneza', 'Age':'21',
'Birthday':'8/22/2001', 'Gender':'Male'},
'boy2': {'Name': 'Ralph', 'Last Name':'Tubongbanua', 'Middle Name':'Castro',
'Age':'21', 'Birthday':'9/5/2001', 'Gender':'Male'},}
for required in myDict.values():
print (required['Name'], required['Last Name'],required['Age'])

this could be a solution:
myDict = {'boy1': {'Name': 'JM', 'Last Name':'Delgado', 'Middle, Name':'Goneza', 'Age':'21', 'Birthday':'8/22/2001', 'Gender':'Male'},
'boy2': {'Name': 'Ralph', 'Last Name':'Tubongbanua', 'Middle Name':'Castro',
'Age':'21', 'Birthday':'9/5/2001', 'Gender':'Male'},}
for required in myDict.values():
print(required ['Name'], required['Last Name'], required['Age'])
When printing multiple values separated with commas, a space will automatically appear between them.

python else part in if runs many times , how to solve

here is my code for showing search record and showing inform usr if found nothing.
Problem: else part runs as many times as outer loop.
entries = [{'First Name': 'Sher', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '2989484'},
{'First Name': 'Ali', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '398439'},
{'First Name': 'Talha', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '3343434'}]
search = input("type your search: ")
print(search)
for person in entries:
# print(person)
if person["Last Name"] == search:
print("Here are the records found for your search")
for e in person:
print(e, ":", person[e])
else:
print("There is no record found as you search Keyword")

thats because each iteration you are checking only 1 person, and if you didn't find what you looked for, you are printing that it does not exist.
this is actually an undesired behavior.
a better solution would be to simply look in the set of values you need:
...
search = input("type your search: ")
founds = [entry for entry in entries if entry["Last Name"] == search)] ## filtering only records that match what we need using list comprehension
if founds:
for found in founds:
* print info *
else:
print("There is no record found as you search Keyword")

First, check if the Last Name that the user enters is present in the dictionaries. If yes, then loop through them and print the respective records. Else, display no records found. Here is how you do it:
entries = [{'First Name': 'Sher', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '2989484'},
{'First Name': 'Ali', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '398439'},
{'First Name': 'Talha', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '3343434'}]
search = input("type your search: ")
print(search)
if search in [person['Last Name'] for person in entries]:
for person in entries:
if person["Last Name"] == search:
print("Here are the records found for your search")
for e in person:
print(e, ":", person[e])
else:
print("There is no record found as you search Keyword")
Output:
type your search: >? Khan
Khan
Here are the records found for your search
First Name : Sher
Last Name : Khan
Age : 22
Telephone : 2989484
Here are the records found for your search
First Name : Ali
Last Name : Khan
Age : 22
Telephone : 398439
Here are the records found for your search
First Name : Talha
Last Name : Khan
Age : 22
Telephone : 3343434
type your search: >? Jones
Jones
There is no record found as you search Keyword

Try like this (Use a boolean Found variable)
entries = [{'First Name': 'Sher', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '2989484'},
{'First Name': 'Ali', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '398439'},
{'First Name': 'Talha', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '3343434'},
{'First Name': 'Talha', 'Last Name': 'Jones', 'Age': '22', 'Telephone': '3343434'}]
search = input("type your search: ")
found = False
print(search)
for person in entries:
if person["Last Name"] == search:
found = True
print("Here are the records found for your search")
for e in person:
print(e, ":", person[e])
if not found:
print("There is no record found as you search Keyword")

This might not be the best way of doing it but this would work I think
entries = [{'First Name': 'Sher', 'Last Name': 'Khan', 'Age': '22', 'Telephone':
'2989484'},
{'First Name': 'Ali', 'Last Name': 'Khan', 'Age': '22', 'Telephone': '398439'},
{'First Name': 'Talha', 'Last Name': 'Khan', 'Age': '22', 'Telephone':
'3343434'}]
inEntries = False
search = input("type your search: ")
print(search)
for person in entries:
# print(person)
if person["Last Name"] == search:
inEntries = True
print("Here are the records found for your search")
for e in person:
print(e, ":", person[e])
if not inEntries:
print("There is no record found as you search Keyword")

How to merge the multiple lists into a dict with Python?

I'm using Python to translate a txt file into JSON. However, when I was iterating the lines from txt, the result is containing multiple lists there, and I failed to merge the list into a dict with the function zip(). Can anyone help me figure it out? I've been stuck here for a couple of hours. Thanks.
with open(path, encoding='utf-8-sig') as f:
seq = re.compile("[|]")
for line_num, line in enumerate(f.readlines()):
result = seq.split(line.strip("\n"))
print(result)
This is the output:
['Delivery', 'Customer Name', 'Shipment Priority', 'Creation Date', 'Customer Number', 'Batch Name', 'Release Date']
['69328624', 'Zhidi Feng', 'Standard Priority', '13-OCT-20', '432579', '19657423', '13-OCT-20 00:01:07']
['69328677', 'Zhengguo Huang', 'Standard Priority', '13-OCT-20', '429085', '19657425', '13-OCT-20 00:01:34']

Something like that ?
>>> lists = [["12", "Abc", "def"], ["34", "Ghi", "jkl"]]
>>> fields = ["id", "lastname", "firstname"]
>>> dicts = []
>>> for l in lists:
... d = {}
... for i in range(3):
... d[fields[i]] = l[i]
... dicts.append(d)
>>> dicts
[{'id': '12', 'lastname': 'Abc', 'firstname': 'def'},
{'id': '34', 'lastname': 'Ghi', 'firstname': 'jkl'}]
Edit: included in your existing code:
dicts = []
for line_num, line in enumerate(f.readlines()):
result = seq.split(line.strip("\n"))
if line_num = 0:
keys = result
else:
d = {}
for i, key in enumerate(keys):
d[key] = result[i]
dicts.append(d)
(I didn't test this since I don't have the file you're using)

you can use zip for making a dictionary like this. (I separate keys and values in two lists.)
with open(path, encoding='utf-8-sig') as f:
seq = re.compile("[|]")
lines = f.readlines()
keys = lines[0] # stting keys
dict_list = []
for line_num, line in enumerate(lines[1:]):
result = seq.split(line.strip("\n"))
dict_list.append(dict(zip(keys, result))) # making a dict and append it to list

>>> total
[['Delivery', 'Customer Name', 'Shipment Priority', 'Creation Date', 'Customer Number', 'Batch Name', 'Release Date'], ['69328624', 'Zhidi Feng', 'Standard Priority', '13-OCT-20', '432579', '19657423', '13-OCT-20 00:01:07'], ['69328677', 'Zhengguo Huang', 'Standard Priority', '13-OCT-20', '429085', '19657425', '13-OCT-20 00:01:34']]
>>> keys = total[0]
>>>
>>> values = total[1:]
>>> wanted = [ dict(zip(keys, value)) for value in values]
>>> wanted
[{'Delivery': '69328624', 'Customer Name': 'Zhidi Feng', 'Shipment Priority': 'Standard Priority', 'Creation Date': '13-OCT-20', 'Customer Number': '432579', 'Batch Name': '19657423', 'Release Date': '13-OCT-20 00:01:07'}, {'Delivery': '69328677', 'Customer Name': 'Zhengguo Huang', 'Shipment Priority': 'Standard Priority', 'Creation Date': '13-OCT-20', 'Customer Number': '429085', 'Batch Name': '19657425', 'Release Date': '13-OCT-20 00:01:34'}]

Dictionaries within a list

suppose I have the following list of dictionaries:
database = [{'Job title': 'painter', 'Email address': 'xxx#yyy.com', 'Last name': 'Wright', 'First name': 'James', 'Company': 'Swift'},
{'Job title': 'plumber', 'Email address': 'xxx#yyy.com', 'Last name': 'Bright', 'First name': 'James', 'Company': 'ABD Plumbing'},
{'Job title': 'brick layer', 'Email address': 'xxx#yyy.com', 'Last name': 'Smith', 'First name': 'John', 'Company': 'Bricky brick'}]
I'm entering the following code so I can print information about a person given their first name (I will be changing this, to search for last name, company, job title etc, using a variable):
print(next(item for item in database if item['First name'] == 'James'))
The issue arises as I have two First name's which are equal, namely James. How do I adjust the code so that it prints out information about all the James's in the database?

Remove the next().
print([item for item in database if item['First name'] == 'James'])

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Convert lines of string to a dictionary - python

This should work for your case: lines = [line.replace('\n','').replace('.','').strip() for line in record.split(';')] desired_dict = {} for i, line in enumerate(lines): words = line.split(',') desired_dict[i] = { 'First name':words[0], 'Last name':words[1], 'Birthday':words[2] }

With some regex: import re [re.match(r'(?P<First_name>\w+),(?P<Last_name>\w+),(?P<Birthday>.+)', r).groupdict() for r in record.split(';')] The underscores in First_name and Last_name are inevitable unfortunately.

Related

How to remove the duplicate dictionaries from the list of dictionaries where dictionary contains an another dictionary? [duplicate]

Is there a way to extract the selected value in a nested Dictionary using a for loop?

python else part in if runs many times , how to solve

How to merge the multiple lists into a dict with Python?

Dictionaries within a list

Categories

Resources