How to read particular key value data from dictionary using python [duplicate] - python

This question already has answers here:
Convert a String representation of a Dictionary to a dictionary
(11 answers)
Closed 1 year ago.
I have a file from which I need to extract the particular dictionary value
Data is in below format in file:
{'name': 'xyz', 'age': 14, 'country': 'india'}
My code:
var = 'country'
with open('abc.txt', 'r') as fw:
    first_line = fw.readline()
    dictvalue = first_line[var]
print(dictvalue)
But this is not fetching the value india; it throws: TypeError: string indices must be integers

Because first_line = fw.readline() returns a string, not a dict. You can convert the string to a dict using the ast module:
import ast

var = 'country'
with open('abc.txt', 'r') as fw:
    first_line = fw.readline()
    dictvalue = ast.literal_eval(first_line)[var]
print(dictvalue)
Also you would need to format your file so that it is a valid Python literal: every string value, including india, must be within single quotes:
{'name': 'xyz','age': 14,'country': 'india'}
Output:
india
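If you control the file format, a line stored as valid JSON (double quotes instead of single quotes) can be parsed with the built-in json module instead — a minimal sketch, assuming the first line of the file is a JSON object:

```python
import json

# Assumption: the line read from the file is valid JSON (double-quoted keys/values)
line = '{"name": "xyz", "age": 14, "country": "india"}'
record = json.loads(line)
print(record['country'])  # india
```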

In this line of code,
first_line = fw.readline()
first_line is read as a string, i.e. "{'name': 'xyz', 'age': 14, 'country': 'india'}".
Solution 1:
You can make use of eval.
mydict = eval(first_line)
print(mydict[var])
#'india'
This works, but you should avoid the eval and exec functions, because they are considered "dangerous" in Python: they will execute arbitrary code embedded in the string. You can refer to this for more on the topic.
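To illustrate the difference: ast.literal_eval only accepts Python literals, so an expression that eval would happily execute gets rejected instead. A small sketch:

```python
import ast

# A plain literal parses fine
safe = ast.literal_eval("{'country': 'india'}")
print(safe['country'])  # india

# A function call is not a literal, so literal_eval refuses to evaluate it
try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError:
    print("rejected: not a literal")
```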
Solution 2 (Recommended):
Use the json module to read/write dict objects.
import json

data = {'name': 'xyz', 'age': 14, 'country': 'india'}

# Save the dict as 'abc.txt'; alternatively use 'abc.json' to save as a JSON file.
with open('abc.txt', 'w') as f:
    json.dump(data, f)

with open('abc.txt', 'r') as f:
    read_data = json.load(f)
print(read_data)
# {'name': 'xyz', 'age': 14, 'country': 'india'}

Related

Is it possible to convert unquoted JSON to either CSV or appropriate JSON format using pyspark

I am using pyspark, in which I am extracting the required string from log files; it is a JSON-like string, but without the quotes. Below is an example:
{PlatformVersion=123,PlatformClient=html,namespace=NAT}
I want to convert it to either CSV or JSON, as I want to further store it in a relational DB using data pipelines. Is there a way to convert such a string to CSV or JSON?
The code below does the job.
Steps:
remove the curly braces
split by ,
split by =
populate the dict with each key and value
result = {}
log_line = '{PlatformVersion=123,PlatformClient=html,namespace=NAT}'
log_line = log_line[1:-1]          # remove the curly braces
parts = log_line.split(',')
for part in parts:
    k, v = part.split('=')
    result[k] = v
print(result)
output
{'PlatformVersion': '123', 'PlatformClient': 'html', 'namespace': 'NAT'}
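Once the line is a dict, serializing it onward to either target the question mentions is straightforward with the stdlib csv and json modules — a sketch, reusing the parsed result from above:

```python
import csv
import io
import json

result = {'PlatformVersion': '123', 'PlatformClient': 'html', 'namespace': 'NAT'}

# JSON: a single call
json_str = json.dumps(result)
print(json_str)

# CSV: header row from the keys, one data row from the values
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(result))
writer.writeheader()
writer.writerow(result)
print(buf.getvalue())
```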

Finding multiple occurrences of a word in a string in Python

So I have a string that contains the data below
https://myanimelist.net/animelist/domis1/load.json?status=2&offset=0.
I want to find all 'anime_id' values and put them into a list (only the numbers).
I tried find('anime_id'), but that can't handle multiple occurrences in the string.
Here is an example of how to extract anime_id from a JSON file called test.json, using the built-in json module:
import json

with open('test.json') as f:
    data = json.load(f)

# Create a generator that searches for anime_id
gen = (i['anime_id'] for i in data)
# If needed, iterate over the generator to create a list
gen_list = list(gen)
# Print the list on the console
print(gen_list)
Your string is in JSON format; you can parse it with the built-in json module.
import json

data = json.loads(your_string)
for d in data:
    print(d["anime_id"])
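If for some reason the payload cannot be parsed as JSON, a regular expression can pull all the numbers out directly — a hedged sketch, assuming the ids appear as "anime_id":&lt;digits&gt; in the text (the payload below is hypothetical):

```python
import re

# Hypothetical fragment of the JSON payload
payload = '[{"anime_id":21,"score":9},{"anime_id":199,"score":8}]'
ids = [int(m) for m in re.findall(r'"anime_id"\s*:\s*(\d+)', payload)]
print(ids)  # [21, 199]
```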

How could I read a dictionary that contains a function from a text file?

I want to read a dictionary from a text file. The dictionary looks like {'key': [1, ord('#')]}. I read about eval() and literal_eval(), but neither of the two will work due to ord().
I also tried json.loads and json.dumps, but with no positive results.
Which other way could I use to do this?
So, assuming you read the text file in with open as a string (and not with json.loads), you could do some simple regex searching for what is between the parentheses of ord, e.g. ord('#') -> #.
This is a minimal solution that reads everything from the file as a single string, then finds all instances of ord and places the integer representations in an output list called ord_. For testing, this example used a text file myfile.txt with the following in it:
{"key": [1, "ord('#')"],
"key2": [1, "ord('K')"]}
import re

with open(r"myfile.txt") as f:
    json_ = "".join([line.rstrip("\n") for line in f])

rgx = re.compile(r"ord\(([^\)]+)\)")
rgd = rgx.findall(json_)
ord_ = [ord(str_.replace(r"'", "")) for str_ in rgd]
json.dump() and json.load() will not work because ord() is not JSON-serializable (meaning a function call cannot be a JSON value).
Yes, eval is really bad practice; I would never recommend it to anyone for any use.
The best way I can think of to solve this is to use conditions and an extra list.
import json

# data.json = {"key": [1, ["ord", "#"]]}  # first element is the function name, second is the arg
with open("data.json") as f:
    data = json.load(f)

# data['key'][1][0] is "ord"
if data['key'][1][0] == "ord":
    res = ord(data['key'][1][1])
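If more functions than ord need to be supported, the if chain can be generalized with a dispatch table mapping a whitelist of names to the real callables — a sketch under the same data layout (function name first, argument second):

```python
import json

# Whitelist mapping allowed function names to the real callables
ALLOWED = {'ord': ord, 'len': len}

raw = '{"key": [1, ["ord", "#"]]}'  # valid JSON needs double quotes
data = json.loads(raw)

name, arg = data['key'][1]
if name in ALLOWED:
    res = ALLOWED[name](arg)
    print(res)  # 35, since ord('#') == 35
```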

How to print the value after a keyword in Python?

I have the following string in Python:
b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'
I want to print the characters next to the keyword "name", so that my output is
waqas
Note that waqas can be any name, so I want to print whatever follows the keyword name, using string operations or regex.
First you need to decode the value, since the b prefix marks a bytes literal. Then use literal_eval to make the dictionary, and you can access the value by key:
>>> s = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'
>>> import ast
>>> ast.literal_eval(s.decode())['name']
'waqas'
It is likely you should be reading your data into your program in a different manner than you are doing now.
If I assume your data is inside a JSON file, try something like the following, using the built-in json module:
import json

with open(filename) as fp:
    data = json.load(fp)
print(data['name'])
If you want a more algorithmic way to extract the value of name:
s = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a",\
"persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],\
"name":"waqas"}'
s = s.decode("utf-8")
key = '"name":"'
start = s.find(key) + len(key)
stop = s.find('"', start + 1)
extracted_string = s[start : stop]
print(extracted_string)
output
waqas
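Since the question also mentions regex, the same value can be captured with a single pattern — a sketch assuming the "name":"..." layout shown above:

```python
import re

s = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'
m = re.search(r'"name"\s*:\s*"([^"]*)"', s.decode('utf-8'))
if m:
    print(m.group(1))  # waqas
```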
You can convert the string into a dictionary with json.loads()
import json
mystring = b'{"personId":"65a83de6-b512-4410-81d2-ada57f18112a","persistedFaceIds":["792b31df-403f-4378-911b-8c06c06be8fa"],"name":"waqas"}'
mydict = json.loads(mystring)
print(mydict["name"])
# output 'waqas'
The b prefix means the value is a bytes object; it is not part of the string itself, so you cannot remove it by slicing (x[1:] would only drop the first character, {). Decode the bytes to get a proper JSON string (in Python 3.6+ json.loads also accepts bytes directly). Suppose you have a variable x:
import json

x = x.decode('utf-8')
mydict = json.loads(x)  # convert the JSON string into a dictionary
print(mydict["name"])

Trying to write a list of dictionaries to csv in Python, running into encoding issues

So I am running into an encoding problem stemming from writing dictionaries to csv in Python.
Here is an example code:
import csv

some_list = ['jalape\xc3\xb1o']
with open('test_encode_output.csv', 'wb') as csvfile:
    output_file = csv.writer(csvfile)
    for item in some_list:
        output_file.writerow([item])
This works perfectly fine and gives me a csv file with "jalapeño" written in it.
However, when I create a list of dictionaries with values that contain such UTF-8 characters...
import csv

some_list = [{'main': ['4 dried ancho chile peppers, stems, veins and seeds removed']},
             {'main': ['2 jalape\xc3\xb1o peppers, seeded and chopped', '1 dash salt']}]
with open('test_encode_output.csv', 'wb') as csvfile:
    output_file = csv.writer(csvfile)
    for item in some_list:
        output_file.writerow([item])
I just get a csv file with 2 rows with the following entries:
{'main': ['4 dried ancho chile peppers, stems, veins and seeds removed']}
{'main': ['2 jalape\xc3\xb1o peppers, seeded and chopped', '1 dash salt']}
I know my strings are in the right encoding, but because the row items aren't strings, csv.writer writes their representations as-is. This is frustrating. I searched for similar questions on here and people have mentioned csv.DictWriter, but that wouldn't really work for me because my dictionaries don't all have the single key 'main'. Some have other keys like 'toppings', 'crust', etc. On top of that, I'm still doing more work on them: the eventual output should have the ingredients formatted as amount, unit, ingredient, so I will end up with a list of dictionaries like
[{'main': {'amount': ['4'], 'unit': [''],
           'ingredient': ['dried ancho chile peppers']}},
 {'topping': {'amount': ['1'], 'unit': ['pump'],
              'ingredient': ['cool whip']},
  'filling': {'amount': ['2'], 'unit': ['cups'],
              'ingredient': ['strawberry jam']}}]
Seriously, any help would be greatly appreciated, else I'd have to use a find and replace in LibreOffice to fix all those \x** UTF-8 encodings.
Thank you!
You are writing dictionaries to the CSV file, while .writerow() expects a list of individual values, each of which is turned into a string on writing.
Don't write dictionaries, these are turned into string representations, as you've discovered.
You need to determine how the keys and / or values of each dictionary are to be turned into columns, where each column is a single primitive value.
If, for example, you only want to write the main key (if present) then do so:
with open('test_encode_output.csv', 'wb') as csvfile:
    output_file = csv.writer(csvfile)
    for item in some_list:
        if 'main' in item:
            output_file.writerow(item['main'])
where it is assumed that the value associated with the 'main' key is always a list of values.
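If the dictionaries have varying keys ('main', 'toppings', 'crust', …), one option is to collect the union of all keys first and let csv.DictWriter pad the missing columns — a Python 3 sketch (text-mode open, so no 'wb'), with hypothetical sample data and flattened string values:

```python
import csv

# Hypothetical sample rows with varying keys
some_list = [
    {'main': '4 dried ancho chile peppers'},
    {'main': '2 jalape\u00f1o peppers', 'topping': '1 pump cool whip'},
]

# Union of keys across all rows, in first-seen order
fieldnames = []
for row in some_list:
    for key in row:
        if key not in fieldnames:
            fieldnames.append(key)

with open('recipes.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames, restval='')
    writer.writeheader()
    writer.writerows(some_list)
```

restval='' fills the cells for rows that lack one of the collected keys.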
If you wanted to persist dictionaries with Unicode values, then you are using the wrong tool. CSV is a flat data format, just rows and primitive columns. Use a tool that can preserve the right amount of information instead.
For dictionaries with string keys, lists, numbers and unicode text, you can use JSON, or you can use pickle if more complex and custom data types are involved. When using JSON, you do want to either decode from byte strings to Python Unicode values, or always use UTF-8-encoded byte strings, or state how the json library should handle string encoding for you with the encoding keyword:
import json

with open('data.json', 'w') as jsonfile:
    json.dump(some_list, jsonfile, encoding='utf8')
because JSON strings are always unicode values. The default for encoding is utf8 but I added it here for clarity.
Loading the data again:
with open('data.json', 'r') as jsonfile:
    some_list = json.load(jsonfile)
Note that this will return unicode strings, not strings encoded to UTF8.
The pickle module works much the same way, but the data format is not human-readable:
import pickle

# store
with open('data.pickle', 'wb') as pfile:
    pickle.dump(some_list, pfile)

# load
with open('data.pickle', 'rb') as pfile:
    some_list = pickle.load(pfile)
pickle will return your data exactly as you stored it. Byte strings remain byte strings, unicode values would be restored as unicode.
As you can see in your output, you wrote a dictionary, so if you want the list inside it to be processed, you have to write this:
import csv

some_list = [{'main': ['4 dried ancho chile peppers, stems, veins', '\xc2\xa0\xc2\xa0\xc2\xa0 and seeds removed']},
             {'main': ['2 jalape\xc3\xb1o peppers, seeded and chopped', '1 dash salt']}]
with open('test_encode_output.csv', 'wb') as csvfile:
    output_file = csv.writer(csvfile)
    for item in some_list:
        output_file.writerow(item['main'])  # so instead of [item], we use item['main']
I understand that this is possibly not the code you want, as it limits you to calling every key main, but at least the data gets processed now.
You might want to formulate what you want to do a bit better, as now it is not really clear (at least to me). For example, do you want a csv file that gives you main in the first cell and then 4 dried ...
