Converting a list to a dictionary in Python - python

I'm trying to convert a list that is in the form of a dictionary to an actual dictionary.
This is for a web scraping tool. I've tried removing the single quotes ('') and casting the result to a dictionary, but I am new to programming and I think my logic is off in some way.
My list is of the form
['"name":"jack"', '"address":"1234 College Ave"']
I am trying to convert this general form to a dictionary of the form
{"name":"jack", "address":"1234 College Ave"}

You can join the items into a JSON string and then parse it with json.loads.
>>> import json
>>> data = ['"name":"jack"', '"address":"1234 College Ave"']
>>> json.loads('{' + ', '.join(data) + '}')
{'name': 'jack', 'address': '1234 College Ave'}

l = ['"name":"jack"', '"address":"1234 College Ave"']
# Split each item on the first colon only, then strip the surrounding double quotes.
d = {key[1:-1]: value[1:-1] for key, value in (elem.split(":", 1) for elem in l)}
print(d)

One way to tackle this is to fix each individual string before passing it to json.loads.
import json
inp = ['"name":"jack"', '"address":"1234 College Ave"']
result = {}
for item in inp:
    # Wrap each fragment in braces so it parses as a one-key JSON object.
    result.update(json.loads("{" + item + "}"))
print(result)
{'name': 'jack', 'address': '1234 College Ave'}
However, ideally you should be getting data in a better format and not have to rely on manipulating the data before being able to use it. Fix this problem "upstream" if you can.
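If the fragments have to be handled as they arrive, the join-and-parse idea from the answers above can be wrapped in a small helper. This is only a minimal sketch: pairs_to_dict is a hypothetical name, and it assumes every list item is a well-formed "key":value JSON fragment.

```python
import json

def pairs_to_dict(items):
    """Join '"key":value' fragments into one JSON object and parse it.

    Hypothetical helper: raises ValueError if the joined text is not valid JSON.
    """
    text = "{" + ",".join(items) + "}"
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"fragments do not form valid JSON: {exc}") from exc

data = ['"name":"jack"', '"address":"1234 College Ave"']
result = pairs_to_dict(data)
print(result)
```

Because the whole object is parsed by json.loads, values that themselves contain colons or commas are handled without any manual splitting.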

Related

How to read particular key value data from dictionary using python [duplicate]

I have a file from which I need to extract the particular dictionary value
Data is in below format in file:
{'name': 'xyz', 'age': 14, 'country': 'india'}
My code:
var = 'country'
with open('abc.txt', 'r') as fw:
    first_line = fw.readline()
dictvalue = first_line[var]
print(dictvalue)
But this does not fetch the value india; it throws the error: string indices must be integers
That happens because first_line = fw.readline() returns a string, not a dict. You can convert the string to a dict using the ast module:
import ast
var = 'country'
with open('abc.txt', 'r') as fw:
    first_line = fw.readline()
dictvalue = ast.literal_eval(first_line)[var]
print(dictvalue)
Also, you would need to fix the formatting of your file, because india should be within single quotes:
{'name': 'xyz','age': 14,'country': 'india'}
Output:
india
Convert a String representation of a Dictionary to a dictionary
In this line of code,
first_line=fw.readline()
first_line is read as a string, i.e. "{'name':'xyz','age':14,'country':'india'}"
Solution 1:
You can make use of eval.
mydict = eval(first_line)
print(mydict[var])
#'india'
This works, but you should avoid the eval and exec functions, because they execute whatever code the string contains and are therefore considered "dangerous" in Python.
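To illustrate the difference, here is a small sketch comparing how ast.literal_eval treats a plain dict literal and a string that contains a function call. The "malicious" string is a made-up example, not something from the original file.

```python
import ast

line = "{'name': 'xyz', 'age': 14, 'country': 'india'}"

# literal_eval only accepts Python literals (dicts, lists, strings, numbers, ...).
record = ast.literal_eval(line)
print(record["country"])

# Hypothetical hostile input: eval() would execute this call, literal_eval refuses it.
malicious = "__import__('os').getcwd()"
try:
    ast.literal_eval(malicious)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```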
Solution 2 (Recommended):
Use the json module to read/write dict objects.
import json
data = {'name':'xyz','age':14,'country':'india'}
# Save the dict to 'abc.txt'. Alternatively, use 'abc.json' to save it as a JSON file.
with open('abc.txt', 'w') as f:
    json.dump(data, f)
# Read it back; the with blocks make sure each file is closed.
with open('abc.txt', 'r') as f:
    read_data = json.load(f)
print(read_data)
#{'name': 'xyz', 'age': 14, 'country': 'india'}

Exclude part of string with Regex in python with web scraping

I'm trying to scrape some data from an e-commerce website for a personal project. I'm trying to build a nested list of strings from the html but am having an issue with one part of the html.
Each list item appears as the following:
<div class="impressions" data-impressions=\'{"id":"01920","name":"Sleepy","price":12.95,"brand":"Lush","category":"Bubble Bar","variant":"7 oz.","quantity":1,"list":"/bath/bubble-bars/sleepy/9999901920.html","dimension11":"","dimension12":"Naked,Self Preserving,Vegan","dimension13":1,"dimension14":1,"dimension15":true}\'></div>
What I have now is a regex that extracts all the items in the data-impressions attribute and splits them at the commas:
list_return = [re.findall('\{([^{]+[^}\'></div>])', i) for i in bathshower_impressions]
list_return = [re.split(',', list_return[i][0]) for i in range(0, len(list_return))]
This gives me a list of lists of lists, where each innermost list will become a key:value pair in a dictionary. For the example above, here is what the second-level item would be:
[['"id"', '"01920"'],
['"name"', '"Sleepy"'],
['"price"', '12.95'],
['"brand"', '"Lush"'],
['"category"', '"Bubble Bar"'],
['"variant"', '"7 oz."'],
['"quantity"', '1'],
['"list"', '"/bath/bubble-bars/sleepy/9999901920.html"'],
['"dimension11"', '""'],
['"dimension12"', '"Naked'],
['Self Preserving'],
['Vegan"'],
['"dimension13"', '1'],
['"dimension14"', '1'],
['"dimension15"', 'true']]
My problem is with dimension12: I can't figure out how to keep that value from being split at the commas, so that the list would appear as:
['"dimension12"', '"Naked,Self Preserving,Vegan"']
Any help is appreciated, thanks.
I'd like to suggest a slightly different approach. That attribute value looks like JSON, so why not use the json module? That way, you have a ready-made data structure that you can modify further.
import json
from bs4 import BeautifulSoup
html_list = [
"""<div class="impressions" data-impressions=\'{"id":"01920","name":"Sleepy","price":12.95,"brand":"Lush","category":"Bubble Bar","variant":"7 oz.","quantity":1,"list":"/bath/bubble-bars/sleepy/9999901920.html","dimension11":"","dimension12":"Naked,Self Preserving,Vegan","dimension13":1,"dimension14":1,"dimension15":true}\'></div>""",
]
data_structures = []
for html_item in html_list:
    soup = BeautifulSoup(html_item, "html.parser").find("div", {"class": "impressions"})
    data_structures.append(json.loads(soup["data-impressions"]))
print(data_structures)
This outputs a list of dictionaries:
[{'id': '01920', 'name': 'Sleepy', 'price': 12.95, 'brand': 'Lush', 'category': 'Bubble Bar', 'variant': '7 oz.', 'quantity': 1, 'list': '/bath/bubble-bars/sleepy/9999901920.html', 'dimension11': '', 'dimension12': 'Naked,Self Preserving,Vegan', 'dimension13': 1, 'dimension14': 1, 'dimension15': True}]
To access the desired key, just do this:
for data_item in data_structures:
    print(data_item["dimension12"])
Prints: Naked,Self Preserving,Vegan
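If BeautifulSoup is not available, the same idea can be sketched with the standard library's html.parser module. ImpressionsParser is a hypothetical helper name, and the HTML below is a shortened version of the element from the question. Once the attribute is parsed as JSON, dimension12 stays one string and can be split on commas separately if needed.

```python
import json
from html.parser import HTMLParser

class ImpressionsParser(HTMLParser):
    """Hypothetical helper: collects the parsed data-impressions value of every div."""

    def __init__(self):
        super().__init__()
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        for name, value in attrs:
            if name == "data-impressions" and value:
                # The attribute value is JSON, so let the json module parse it.
                self.items.append(json.loads(value))

# Shortened sample of the element shown in the question.
html = (
    "<div class=\"impressions\" data-impressions='"
    '{"id":"01920","name":"Sleepy","price":12.95,'
    '"dimension12":"Naked,Self Preserving,Vegan","dimension15":true}'
    "'></div>"
)

parser = ImpressionsParser()
parser.feed(html)
record = parser.items[0]

# Split the comma-separated value only after JSON parsing is done.
tags = record["dimension12"].split(",")
print(tags)
```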

How to convert a dictionary with multiple values for a single key in multiple rows to a single row with multiple values per key

Here is my requirement. The following data is present in a JSON file:
{"[a]":" text1","[b]":" text2","[a]":" text3","[c]":" text4","[c]":" Text5"}.
The final output should be like
{"[a]":[" text1","text3"],"[b]":" text2","[c]":[" text4"," Text5"]}.
I tried below code:
data_in= ["[a]"," text1","[b]"," text2","[a]"," text3","[c]"," text4","[c]"," text5"]
data_pairs = zip(data_in[::2],data_in[1::2])
data_dict = {}
for x in data_pairs:
    data_dict.setdefault(x[0],[]).append(x[1])
print(data_dict)
But the input it takes is in the form of a list rather than a dictionary.
Please advise.
Or is there a way to convert my original dictionary into a list with multiple values, since a dictionary will only keep unique keys? Please include the code as well, as I am very new to Python and still learning it. TIA.
Keys are unique within a dictionary while values may not be.
You can try
>>> l = ["[a]"," text1","[b]"," text2","[a]"," text3","[c]"," text4","[c]"," text5"]
>>> dict_data = {}
>>> for i in range(0, len(l), 2):
...     # Create an empty list the first time each key is seen.
...     if l[i] not in dict_data:
...         dict_data[l[i]] = []
...
>>> for i in range(1, len(l), 2):
...     dict_data[l[i-1]].append(l[i])
...
>>> print(dict_data)
{'[a]': [' text1', ' text3'], '[b]': [' text2'], '[c]': [' text4', ' text5']}
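Since the question says the data actually lives in a JSON file with repeated keys, another option is to group the values while parsing. The sketch below uses the object_pairs_hook parameter of json.loads, which receives every key/value pair including duplicates. group_duplicates is a hypothetical helper name, and unlike the requested output it wraps every value in a list, even when a key occurs only once.

```python
import json
from collections import defaultdict

def group_duplicates(pairs):
    """Hypothetical hook: collect the values of repeated keys into lists."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)

raw = '{"[a]":" text1","[b]":" text2","[a]":" text3","[c]":" text4","[c]":" Text5"}'

# Without the hook, json.loads would silently keep only the last value per key.
result = json.loads(raw, object_pairs_hook=group_duplicates)
print(result)
```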

Printing json data in string format

I have many json fields in my model. I want to print them in the string format.
The code I am using is :
data = []
detail = details.objects.filter(Id=item['Id'])
for i in compliance:
    data.append("Name" + ":" + str(i.Name) + " , " + "Details" + ":" + str(i.Details))
print(data)
The output I am getting is :
Name:ABC, Details:{u'Status': u'True', u'Remarks': u'No Remark'}
The expected output is:
Name:ABC, Details:Status:True,Remarks:No Remark
Any help will be appreciated.
Check if your data is of type dict
If not print as you are doing now
If yes then send dictionary to another function which does as below
def print_dict(d):
    return ",".join([key+":"+str(d[key]) for key in d])
You can do it this way, assuming compliance is a dict / JSON object.
Save the dict keys in a list
Iterate over that list and build a concatenated list
Code would look like this:
keyorder = ['Name', 'Status', 'Remarks']
res = []
for key in keyorder:
    res.append(key + ':' + compliance[key])
', '.join(res)
'Name:ABC, Status:True, Remarks:No remarks'
As @chkri suggested, first check whether your data is a dict. If it is, you can try this one-line solution:
data = {'Name':'ABC', 'Details':{u'Status': u'True', u'Remarks': u'No Remark'}}
print({k: v for k, v in data.items()})
output:
{'Name': 'ABC', 'Details': {'Remarks': 'No Remark', 'Status': 'True'}}
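To get the exact string the question asks for, the dict check from the first answer can be applied recursively. This is a rough sketch in Python 3, with flatten_value as a hypothetical helper name and a hand-written record standing in for the model instance.

```python
def flatten_value(value):
    """Hypothetical helper: render dicts as key:value pairs, anything else via str()."""
    if isinstance(value, dict):
        return ",".join(f"{key}:{flatten_value(item)}" for key, item in value.items())
    return str(value)

# Stand-in for one model row; the real data would come from the query.
record = {"Name": "ABC", "Details": {"Status": "True", "Remarks": "No Remark"}}

line = ", ".join(f"{key}:{flatten_value(value)}" for key, value in record.items())
print(line)
# Name:ABC, Details:Status:True,Remarks:No Remark
```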

Most elegant way to format multi-line strings in Python

I have a multiline string, where I want to change certain parts of it with my own variables. I don't really like piecing together the same text using + operator. Is there a better alternative to this?
For example (internal quotes are necessary):
line = """Hi my name is "{0}".
I am from "{1}".
You must be "{2}"."""
I want to be able to use this multiple times to form a larger string, which will look like this:
Hi my name is "Joan".
I am from "USA".
You must be "Victor".
Hi my name is "Victor".
I am from "Russia".
You must be "Joan".
Is there a way to do something like:
txt = ""
for ...:
    txt += line.format(name, country, otherName)
>>> info = [['ian','NYC','dan'],['dan','NYC','ian']]
>>> for each in info:
...     line.format(*each)
...
'Hi my name is "ian".\nI am from "NYC".\nYou must be "dan".'
'Hi my name is "dan".\nI am from "NYC".\nYou must be "ian".'
The star operator will unpack the list into the format method.
In addition to a list, you can also use a dictionary. This is useful if you have many variables to keep track of at once.
text = """\
Hi my name is "{person_name}"
I am from "{location}"
You must be "{person_met}"\
"""
person = {'person_name': 'Joan', 'location': 'USA', 'person_met': 'Victor'}
print(text.format(**person))
Note that I typed the text differently because it makes the lines easier to align. You have to add a '\' right after the opening """ and right before the closing """, though.
Now if you have several dictionaries in a list you can easily do
people = [{'person_name': 'Joan', 'location': 'USA', 'person_met': 'Victor'},
{'person_name': 'Victor', 'location': 'Russia', 'person_met': 'Joan'}]
alltext = ""
for person in people:
    alltext += text.format(**person)
or using list comprehensions
alltext = [text.format(**person) for person in people]
line = """Hi my name is "{0}".
I am from "{1}".
You must be "{2}"."""
tus = (("Joan","USA","Victor"),
("Victor","Russia","Joan"))
lf = line.format # <=== bind the method once for direct access
print('\n\n'.join(lf(*tu) for tu in tus))
result
Hi my name is "Joan".
I am from "USA".
You must be "Victor".
Hi my name is "Victor".
I am from "Russia".
You must be "Joan".
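Putting the pieces from these answers together, here is a minimal Python 3 sketch that builds the larger string from the question with named fields and str.join. The field names name, country and other are assumptions chosen for illustration.

```python
# Template with named placeholders; the internal double quotes are kept as required.
template = (
    'Hi my name is "{name}".\n'
    'I am from "{country}".\n'
    'You must be "{other}".'
)

people = [
    {"name": "Joan", "country": "USA", "other": "Victor"},
    {"name": "Victor", "country": "Russia", "other": "Joan"},
]

# Join the formatted blocks with a single newline, avoiding repeated += on a string.
txt = "\n".join(template.format(**person) for person in people)
print(txt)
```

Joining with "\n" reproduces the layout shown in the question; use "\n\n" instead if a blank line between the blocks is wanted.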
