I have a text file that looks like this
{'tableName': 'customer', 'type': 'VIEW'}
{'tableName': 'supplier', 'type': 'TABLE'}
{'tableName': 'owner', 'type': 'VIEW'}
I want to read it into a python program that stores it into a list of dictonaries like this
expectedOutput=[{'tableName': 'customer', 'type': 'VIEW'},{'tableName': 'supplier', 'type': 'TABLE'},{'tableName': 'owner', 'type': 'VIEW'}]
But the output I get is a list of strings
output = ["{'tableName': 'customer', 'type': 'VIEW'}",
"{'tableName': 'supplier', 'type': 'TABLE'}",
"{'tableName': 'owner', 'type': 'VIEW'}"]
The code I run is
my_file3 = open("textFiles/python.txt", "r")
data3 = my_file3.read()
output = data3.split("\n")
Can someone show me how do I store the entries inside the list as dicts and not strings.
Thank you
You can use eval but it can be dangerous (only do this if you trust the file):
my_file3 = open("textFiles/python.txt") # specifying 'r' is unnecessary
data3 = my_file3.read()
output = [eval(line) for line in data3.splitlines()] # use splitlines() rather than split('\n')
If the file contains something like __import__('shutil').rmtree('/') it could be very dangerous. Read the documentation for eval here
If you don't fully trust the file, use ast.literal_eval:
import ast
my_file3 = open("textFiles/python.txt")
data3 = my_file3.read()
output = [ast.literal_eval(line) for line in data3.splitlines()]
This removes the risk - if the file contains something like an import, it will raise a ValueError: malformed node or string. Read the documentation for ast.literal_eval here
Output:
[{'tableName': 'customer', 'type': 'VIEW'},
{'tableName': 'supplier', 'type': 'TABLE'},
{'tableName': 'owner', 'type': 'VIEW'}]
You can use the json module
import json
my_file3 = open("textFiles/python.txt", "r")
data3 = my_file3.read()
output = json.loads(str(data3.splitlines()))
print(output)
As Thonnu warned, eval is quite dangerous
I am trying to run this code where data of a dictionary is saved in a separate csv file.
Here is the dict:
body = {
'dont-ask-for-email': 0,
'action': 'submit_user_review',
'post_id': 76196,
'email': email_random(),
'subscribe': 1,
'previous_hosting_id': prev_hosting_comp_random(),
'fb_token': '',
'title': review_title_random(),
'summary': summary_random(),
'score_pricing': star_random(),
'score_userfriendly': star_random(),
'score_support': star_random(),
'score_features': star_random(),
'hosting_type': hosting_type_random(),
'author': name_random(),
'social_link': '',
'site': '',
'screenshot[image][]': '',
'screenshot[description][]': '',
'user_data_process_agreement': 1,
'user_email_popup': '',
'subscribe_popup': 1,
'email_asked': 1
}
Now this is the code to write in a CSV file and finally save it:
columns = []
rows = []
chunks = body.split('}')
for chunk in chunks:
row = []
if len(chunk)>1:
entry = chunk.replace('{','').strip().split(',')
for e in entry:
item = e.strip().split(':')
if len(item)==2:
row.append(item[1])
if chunks.index(chunk)==0:
columns.append(item[0])
rows.append(row)
df = pd.DataFrame(rows, columns = columns)
df.head()
df.to_csv ('r3edata.csv', index = False, header = True)
but this is the error I get:
Traceback (most recent call last):
File "codeOffshoreupdated.py", line 125, in <module>
chunks = body.split('}')
AttributeError: 'dict' object has no attribute 'split'
I know that dict has no attribute named split but how do I fix it?
Edit:
format of the CSV I want:
dont-ask-for-email, action, post_id, email, subscribe, previous_hosting_id, fb_token, title, summary, score_pricing, score_userfriendly, score_support, score_features, hosting_type,author, social_link, site, screenshot[image][],screenshot[description][],user_data_process_agreement,user_email_popup,subscribe_popup,email_asked
0,'submit_user_review',76196,email_random(),1,prev_hosting_comp_random(),,review_title_random(),summary_random(),star_random(),star_random(),star_random(),star_random(),hosting_type_random(),name_random(),,,,,1,,1,1
Note: all these functions mentioned are return values
Edit2:
I am picking emails from the email_random() function like this:
def email_random():
with open('emaillist.txt') as emails:
read_emails = csv.reader(emails, delimiter = '\n')
return random.choice(list(read_emails))[0]
and the emaillist.txt is like this:
xyz#gmail.com
xya#gmail.com
xyb#gmail.com
xyc#gmail.com
xyd#gmail.com
other functions are also picking the data from the files like this too.
Since body is a dictionary, you don't have to a any manual parsing to get it into a CSV format.
If you want the function calls (like email_random()) to be written into the CSV as such, you need to wrap them into quotes (as I have done below). If you want them to resolve as function calls and write the results, you can keep them as they are.
import csv
def email_random():
return "john#example.com"
body = {
'dont-ask-for-email': 0,
'action': 'submit_user_review',
'post_id': 76196,
'email': email_random(),
'subscribe': 1,
'previous_hosting_id': "prev_hosting_comp_random()",
'fb_token': '',
'title': "review_title_random()",
'summary': "summary_random()",
'score_pricing': "star_random()",
'score_userfriendly': "star_random()",
'score_support': "star_random()",
'score_features': "star_random()",
'hosting_type': "hosting_type_random()",
'author': "name_random()",
'social_link': '',
'site': '',
'screenshot[image][]': '',
'screenshot[description][]': '',
'user_data_process_agreement': 1,
'user_email_popup': '',
'subscribe_popup': 1,
'email_asked': 1
}
with open('example.csv', 'w') as fhandle:
writer = csv.writer(fhandle)
items = body.items()
writer.writerow([key for key, value in items])
writer.writerow([value for key, value in items])
What we do here is:
with open('example.csv', 'w') as fhandle:
this opens a new file (named example.csv) with writing permissions ('w') and stores the reference into variable fhandle. If using with is not familiar to you, you can learn more about them from this PEP.
body.items() will return an iterable of tuples (this is done to guarantee dictionary items are returned in the same order). The output of this will look like [('dont-ask-for-email', 0), ('action', 'submit_user_review'), ...].
We can then write first all the keys using a list comprehension and to the next row, we write all the values.
This results in
dont-ask-for-email,action,post_id,email,subscribe,previous_hosting_id,fb_token,title,summary,score_pricing,score_userfriendly,score_support,score_features,hosting_type,author,social_link,site,screenshot[image][],screenshot[description][],user_data_process_agreement,user_email_popup,subscribe_popup,email_asked
0,submit_user_review,76196,john#example.com,1,prev_hosting_comp_random(),,review_title_random(),summary_random(),star_random(),star_random(),star_random(),star_random(),hosting_type_random(),name_random(),,,,,1,,1,1
I've got a json file with 30-ish, blocks of "dicts" where every block has and ID, like this:
{
"ID": "23926695",
"webpage_url": "https://.com",
"logo_url": null,
"headline": "aewafs",
"application_deadline": "2020-03-31T23:59:59",
}
Since my script pulls information in the same way from an API more than once, I would like to append new "blocks" to the json file only if the ID doesn't already exist in the JSON file.
I've got something like this so far:
import os
check_empty = os.stat('pbdb.json').st_size
if check_empty == 0:
with open('pbdb.json', 'w') as f:
f.write('[\n]') # Writes '[' then linebreaks with '\n' and writes ']'
output = json.load(open("pbdb.json"))
for i in jobs:
output.append({
'ID': job_id,
'Title': jobtitle,
'Employer' : company,
'Employment type' : emptype,
'Fulltime' : tid,
'Deadline' : deadline,
'Link' : webpage
})
with open('pbdb.json', 'w') as job_data_file:
json.dump(output, job_data_file)
but I would like to only do the "output.append" part if the ID doesn't exist in the Json file.
I am not able to complete the code you provided but I added an example to show how you can achieve the none duplicate list of jobs(hopefully it helps):
# suppose `data` is you input data with duplicate ids included
data = [{'id': 1, 'name': 'john'}, {'id': 1, 'name': 'mary'}, {'id': 2, 'name': 'george'}]
# using dictionary comprehension you can eliminate the duplicates and finally get the results by calling the `values` method on dict.
noduplicate = list({itm['id']:itm for itm in data}.values())
with open('pbdb.json', 'w') as job_data_file:
json.dump(noduplicate, job_data_file)
I'll just go with a database guys, thank you for your time, we can close this thread now
I am a fairly new dev and trying to parse "id" values from this file. Running into the issue below.
My python code:
import ast
from pathlib import Path
file = Path.home() /'AppData'/'Roaming'/'user-preferences-prod'
with open(file, 'r') as f:
contents = f.read()
ids = ast.literal_eval(contents)
profileids = []
for data in ids:
test= data.get('id')
profileids.append(test)
print(profileids))
This returns the error: ValueError: malformed node or string: <_ast.Name object at 0x0000023D8DA4D2E8> at ids = ast.literal_eval(contents)
A snippet of the content in my file of interest:
{"settings":{"defaults":{"value1":,"value2":,"value3":null,"value4":null,"proxyid":null,"sites":{},"sizes":[],"value5":false},"value6":true,"value11":,"user":{"value9":"","value8": ,"value7":"","value10":""},"webhook":"},'profiles':[{'billing': {'address1': '', 'address2': '', 'city': '', 'country': 'United States', 'firstName': '', 'lastName': '', 'phone': '', 'postalCode': '', 'province': '', 'usesBillingInformation': False}, 'createdAt': 123231231213212, 'id': '23123123123213, 'name': ''
I need this code to be looped as there are multiple id values that I am interested in and need them all to be entered into a list.Hopefully I explained it all. the file type is "file" according to windows, I just view its contents with notepad.
It appears to me that you have a file with a string representation of a dict (dictionary). So, what you need to do is:
string_of_dict →ast.literal_eval()→ dict
Open file and read in the text into a string variable. Currently I think this string is going into ids.
Then convert the string representation of dict into a dict using ast library as shown below. Reference
import ast
string_of_dict = "{'muffin' : 'lolz', 'foo' : 'kitty'}"
ast.literal_eval(string_of_dict)
Output:
{'muffin': 'lolz', 'foo': 'kitty'}
Solution
Something like this should most likely work. You may have to tweak it a little bit.
import ast
with open(file, 'r') as f:
contents = f.read()
ids = ast.literal_eval(contents)
profileids = []
for data in ids:
test= data.get('id')
profileids.append(test)
print(profileids)
I am struggling with figuring out the best way to loop through a function. The output of this API is a Graph Connection and I am a-little out of my element. I really need to obtain ID's from an api output and have them in a dict or some sort of form that I can pass to another API call.
**** It is important to note that the original output is a graph connection.... print(type(api_response) does show it as a list however, if I do a print(type(api_response[0])) it returns a
This is the original output from the api call:
[{'_from': None, 'to': {'id': '5c9941fcdd2eeb6a6787916e', 'type': 'user'}}, {'_from': None, 'to': {'id': '5cc9055fcc5781152ca6eeb8', 'type': 'user'}}, {'_from': None, 'to': {'id': '5d1cf102c94c052cf1bfb3cc', 'type': 'user'}}]
This is the code that I have up to this point.....
api_response = api_instance.graph_user_group_members_list(group_id, content_type, accept,limit=limit, skip=skip, x_org_id=x_org_id)
def extract_id(result):
result = str(result).split(' ')
for i, r in enumerate(result):
if 'id' in r:
id = (result[i+1].translate(str.maketrans('', '', string.punctuation)))
print( id )
return id
extract_id(api_response)
def extract_id(result):
result = str(result).split(' ')
for i, r in enumerate(result):
if 'id' in r:
id = (result[i+8].translate(str.maketrans('', '', string.punctuation)))
print( id )
return id
extract_id(api_response)
def extract_id(result):
result = str(result).split(' ')
for i, r in enumerate(result):
if 'id' in r:
id = (result[i+15].translate(str.maketrans('', '', string.punctuation)))
print( id )
return id
extract_id(api_response)
I have been able to use a function to extract the ID's but I am doing so through a string. I am in need of a scalable solution that I can use to pass these ID's along to another API call.
I have tried to use a for loop but because it is 1 string and i+1 defines the id's position, it is redundant and just outputs 1 of the id's multiple times.
I am receiving the correct output using each of these functions however, it is not scalable..... and just is not a solution. Please help guide me......
So to solve the response as a string issue I would suggest using python's builtin json module. Specifically, the method .loads() can convert a string to a dict or list of dicts. From there you can iterate over the list or dict and check if the key is equal to 'id'. Here's an example based on what you said the response would look like.
import json
s = "[{'_from': None, 'to': {'id': '5c9941fcdd2eeb6a6787916e', 'type': 'user'}}, {'_from': None, 'to': {'id': '5cc9055fcc5781152ca6eeb8', 'type': 'user'}}, {'_from': None, 'to': {'id': '5d1cf102c94c052cf1bfb3cc', 'type': 'user'}}]"
# json uses double quotes and null; there is probably a better way to do this though
s = s.replace("\'", '\"').replace('None', 'null')
response = json.loads(s) # list of dicts
for d in response:
for key, value in d['to'].items():
if key == 'id':
print(value) # or whatever else you want to do
# 5c9941fcdd2eeb6a6787916e
# 5cc9055fcc5781152ca6eeb8
# 5d1cf102c94c052cf1bfb3cc