Hey guys, so I am trying to read a JSON file and write a specific item of it into a list. But the file uses single quotes, so I get this error:
simplejson.errors.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)
I tried to convert the JSON file from single quotes to double quotes, but it didn't work (I also looked at the other Stack Overflow questions about this, but they didn't help either). I tried str.replace, json.dumps, and so on, and each attempt failed in a different way. My code is this:
messages = []
with open("commitsJson.json", "r", encoding="utf8") as json_file:
    data = json.load(json_file)
    for p in data['items']:
        messages.append(p['message'])
        authors.write(p['message'] + "\r\n")
        print(p['message'])
So the expected result is to read the JSON file and write specific items of it into a file, a list, etc.
EDIT:
Sample of json file:
{'total_count': 3, 'incomplete_results': False, 'items': [{'url': 'https://gits-20.bkf.sda.eu/api/v3/repos/repo/name/commits/2189312903jsadada',
'sha': '2131932103812jdskfsl', 'node_id': 'asl;dkas;ldjasldasio1203',
'html_url': 'https://gits-20.bkf.sda.eu/api/v3/repos/repo/name/commits/2189312903jsadada',
'comments_url': 'https://gits-20.bkf.sda.eu/api/v3/repos/repo/name/commits/2189312903jsadada',
'commit': {'url': 'https://gits-20.bkf.sda.eu/api/v3/repos/repo/name/commits/2189312903jsadada', 'message': 'Initial commit 1'
Something like that. Basically a GitHub API response, but with single quotes instead of double quotes.
The desired output would be to get the 'message' items of the whole JSON file into another file, like:
Initial commit 1
Initial commit 2
Initial commit 3
Initial commit 4
Initial commit 5
Initial commit 6
Initial commit 7
....
The problem is that JSON expects double quotes to surround strings and property names. Since the file's contents are actually a valid Python literal, you can parse them with ast.literal_eval instead:
commitJson.json:
{
'total_count': 3, 'incomplete_results': False, 'items': [{'url': 'https://gits-20.bkf.sda.eu/api/v3/repos/repo/name/commits/2189312903jsadada',
'sha': '2131932103812jdskfsl', 'node_id': 'asl;dkas;ldjasldasio1203',
'html_url': 'https://gits-20.bkf.sda.eu/api/v3/repos/repo/name/commits/2189312903jsadada',
'comments_url': 'https://gits-20.bkf.sda.eu/api/v3/repos/repo/name/commits/2189312903jsadada',
'commit': {'url': 'https://gits-20.bkf.sda.eu/api/v3/repos/repo/name/commits/2189312903jsadada', 'message': 'Initial commit 1'}}]
}
Hence:
import ast

with open("commitJson.json", "r", encoding="utf8") as json_file:
    data = ast.literal_eval(json_file.read())

for elem in data['items']:
    for e in elem['commit']:
        if 'message' in e:
            print(elem['commit'][e])
OUTPUT:
Initial commit 1
Shorter-version:
print([elem['commit'][e] for elem in data['items'] for e in elem['commit'] if 'message' in e])
OUTPUT:
['Initial commit 1']
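A related tip: once the data is parsed with ast.literal_eval, it can be re-serialized with json.dumps so the file is valid JSON from then on. A minimal sketch, using a trimmed-down stand-in for the real file contents:

```python
import ast
import json

# The broken file content is a Python literal (single quotes), not JSON;
# this string is a trimmed-down stand-in for the real file's contents.
raw = "{'total_count': 3, 'items': [{'commit': {'message': 'Initial commit 1'}}]}"

# ast.literal_eval safely parses Python literals such as this dict
data = ast.literal_eval(raw)

# json.dumps re-serializes it as standard, double-quoted JSON,
# which json.load/json.loads will accept from now on
fixed = json.dumps(data)
```

Writing `fixed` back over the original file would let the question's json.load code work unchanged.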
I am trying to run this code, which saves the data of a dictionary into a separate CSV file.
Here is the dict:
body = {
    'dont-ask-for-email': 0,
    'action': 'submit_user_review',
    'post_id': 76196,
    'email': email_random(),
    'subscribe': 1,
    'previous_hosting_id': prev_hosting_comp_random(),
    'fb_token': '',
    'title': review_title_random(),
    'summary': summary_random(),
    'score_pricing': star_random(),
    'score_userfriendly': star_random(),
    'score_support': star_random(),
    'score_features': star_random(),
    'hosting_type': hosting_type_random(),
    'author': name_random(),
    'social_link': '',
    'site': '',
    'screenshot[image][]': '',
    'screenshot[description][]': '',
    'user_data_process_agreement': 1,
    'user_email_popup': '',
    'subscribe_popup': 1,
    'email_asked': 1
}
Now this is the code that writes to a CSV file and finally saves it:
columns = []
rows = []
chunks = body.split('}')
for chunk in chunks:
    row = []
    if len(chunk) > 1:
        entry = chunk.replace('{', '').strip().split(',')
        for e in entry:
            item = e.strip().split(':')
            if len(item) == 2:
                row.append(item[1])
                if chunks.index(chunk) == 0:
                    columns.append(item[0])
        rows.append(row)
df = pd.DataFrame(rows, columns=columns)
df.head()
df.to_csv('r3edata.csv', index=False, header=True)
but this is the error I get:
Traceback (most recent call last):
File "codeOffshoreupdated.py", line 125, in <module>
chunks = body.split('}')
AttributeError: 'dict' object has no attribute 'split'
I know that dict has no attribute named split but how do I fix it?
Edit:
format of the CSV I want:
dont-ask-for-email, action, post_id, email, subscribe, previous_hosting_id, fb_token, title, summary, score_pricing, score_userfriendly, score_support, score_features, hosting_type,author, social_link, site, screenshot[image][],screenshot[description][],user_data_process_agreement,user_email_popup,subscribe_popup,email_asked
0,'submit_user_review',76196,email_random(),1,prev_hosting_comp_random(),,review_title_random(),summary_random(),star_random(),star_random(),star_random(),star_random(),hosting_type_random(),name_random(),,,,,1,,1,1
Note: all the functions mentioned return values.
Edit2:
I am picking emails from the email_random() function like this:
def email_random():
    with open('emaillist.txt') as emails:
        read_emails = csv.reader(emails, delimiter='\n')
        return random.choice(list(read_emails))[0]
and the emaillist.txt is like this:
xyz#gmail.com
xya#gmail.com
xyb#gmail.com
xyc#gmail.com
xyd#gmail.com
The other functions also pick their data from files like this.
Since body is a dictionary, you don't have to do any manual parsing to get it into CSV format.
If you want the function calls (like email_random()) to be written into the CSV as such, you need to wrap them in quotes (as I have done below). If you want them to resolve as function calls and write their results, you can keep them as they are.
import csv

def email_random():
    return "john#example.com"

body = {
    'dont-ask-for-email': 0,
    'action': 'submit_user_review',
    'post_id': 76196,
    'email': email_random(),
    'subscribe': 1,
    'previous_hosting_id': "prev_hosting_comp_random()",
    'fb_token': '',
    'title': "review_title_random()",
    'summary': "summary_random()",
    'score_pricing': "star_random()",
    'score_userfriendly': "star_random()",
    'score_support': "star_random()",
    'score_features': "star_random()",
    'hosting_type': "hosting_type_random()",
    'author': "name_random()",
    'social_link': '',
    'site': '',
    'screenshot[image][]': '',
    'screenshot[description][]': '',
    'user_data_process_agreement': 1,
    'user_email_popup': '',
    'subscribe_popup': 1,
    'email_asked': 1
}

with open('example.csv', 'w') as fhandle:
    writer = csv.writer(fhandle)
    items = body.items()
    writer.writerow([key for key, value in items])
    writer.writerow([value for key, value in items])
What we do here is:
with open('example.csv', 'w') as fhandle:
this opens a new file (named example.csv) with write permission ('w') and stores the reference in the variable fhandle. If the with statement is not familiar to you, you can learn more about it from PEP 343.
body.items() returns an iterable of tuples, which guarantees that keys and values are traversed in the same order. Its output looks like [('dont-ask-for-email', 0), ('action', 'submit_user_review'), ...].
We then write all the keys to the first row using a list comprehension, and all the values to the next row.
This results in
dont-ask-for-email,action,post_id,email,subscribe,previous_hosting_id,fb_token,title,summary,score_pricing,score_userfriendly,score_support,score_features,hosting_type,author,social_link,site,screenshot[image][],screenshot[description][],user_data_process_agreement,user_email_popup,subscribe_popup,email_asked
0,submit_user_review,76196,john#example.com,1,prev_hosting_comp_random(),,review_title_random(),summary_random(),star_random(),star_random(),star_random(),star_random(),hosting_type_random(),name_random(),,,,,1,,1,1
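The same result can also be had with csv.DictWriter, which writes the header row for you. A minimal sketch with a trimmed-down body, writing to an in-memory buffer here so it stays self-contained (swap in a real file handle in practice):

```python
import csv
import io

# Trimmed-down body for brevity; the full dict works the same way
body = {'dont-ask-for-email': 0, 'action': 'submit_user_review', 'post_id': 76196}

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=body.keys())
writer.writeheader()   # first row: the keys
writer.writerow(body)  # second row: the values
csv_text = buffer.getvalue()
```

DictWriter maps each dict key to its column, so rows stay aligned with the header even if more rows are added later.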
I've got a JSON file with 30-ish blocks of "dicts", where every block has an ID, like this:
{
    "ID": "23926695",
    "webpage_url": "https://.com",
    "logo_url": null,
    "headline": "aewafs",
    "application_deadline": "2020-03-31T23:59:59"
}
Since my script pulls information in the same way from an API more than once, I would like to append new "blocks" to the json file only if the ID doesn't already exist in the JSON file.
I've got something like this so far:
import os
import json

check_empty = os.stat('pbdb.json').st_size
if check_empty == 0:
    with open('pbdb.json', 'w') as f:
        f.write('[\n]')  # write an empty JSON list

output = json.load(open("pbdb.json"))
for i in jobs:
    output.append({
        'ID': job_id,
        'Title': jobtitle,
        'Employer': company,
        'Employment type': emptype,
        'Fulltime': tid,
        'Deadline': deadline,
        'Link': webpage
    })

with open('pbdb.json', 'w') as job_data_file:
    json.dump(output, job_data_file)
but I would like to only do the "output.append" part if the ID doesn't exist in the Json file.
I am not able to complete the code you provided, but I added an example to show how you can build the list of jobs with no duplicates (hopefully it helps):
import json

# suppose `data` is your input data, duplicate ids included
data = [{'id': 1, 'name': 'john'}, {'id': 1, 'name': 'mary'}, {'id': 2, 'name': 'george'}]

# a dictionary comprehension keyed on 'id' eliminates the duplicates;
# calling `values()` on the dict then yields the deduplicated items
noduplicate = list({itm['id']: itm for itm in data}.values())

with open('pbdb.json', 'w') as job_data_file:
    json.dump(noduplicate, job_data_file)
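Applied to the original loop, the same idea can skip IDs that are already in the file while appending. A sketch, assuming output is the list loaded from pbdb.json and new_jobs stands in for the API results:

```python
# `output` plays the role of json.load(open('pbdb.json')) from the question;
# `new_jobs` is a stand-in for the records pulled from the API.
output = [{'ID': '1', 'Title': 'old job'}]
new_jobs = [{'ID': '1', 'Title': 'old job'}, {'ID': '2', 'Title': 'new job'}]

existing_ids = {entry['ID'] for entry in output}  # set makes membership checks O(1)
for job in new_jobs:
    if job['ID'] not in existing_ids:
        output.append(job)          # only unseen IDs get appended
        existing_ids.add(job['ID'])
```

Afterwards, output can be dumped back to pbdb.json exactly as in the question's last two lines.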
I'll just go with a database, guys. Thank you for your time, we can close this thread now.
I am using the Python API of JDownloader, myjdapi.
With device.linkgrabber.query_links() I get the following object:
{'enabled': True, 'name': 'EQJ_X8gUcAMQX13.jpg', 'packageUUID': 1581524887390, 'uuid': 1581524890696, 'url': 'https://pbs.twimg.com/media/x.jpg?name=orig', 'availability': 'ONLINE'}
Now I want to move it to the download list with this function:
device.linkgrabber.move_to_downloadlist('1581524890696', '1581524887390')
The move_to_downloadlist function (from the GitHub repo) is:
def move_to_downloadlist(self, link_ids, package_ids):
    """
    Moves packages and/or links to download list.

    :param package_ids: Package UUID's.
    :type: list of strings.
    :param link_ids: Link UUID's.
    """
    params = [link_ids, package_ids]
    resp = self.device.action(self.url + "/moveToDownloadlist", params)
    return resp
But I always get json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0).
The official API says it's a 200 error, and the reason can be anything.
How can I fix that?
The parameter names are link_ids and package_ids, both plural. That is a good indication that lists are expected here, not single values.
Try this:
device.linkgrabber.move_to_downloadlist(['1581524890696'], ['1581524887390'])
I am trying to substitute a value using safe_substitute. Before that, I convert an array using json.dumps and then substitute. Once the substitution is done, I call json.loads and pass the result as a parameter to another utility. While doing this, I get an error from json.loads. Below is the code:
account_id={'ABC123', user_id='testing'}
var1 = {'account':account_id, 'user':user_id}
response = json.dumps(var1)
payload = Template.(test_template).safe_substitute(var1=var1)
output = json.loads(payload)
I get an error when it comes to loads:
Expecting "," delimiter: line 1 column 448 (char 447)
It seems to be a syntax error. Try it like below:
import json

account_id = 'ABC123'
user_id = 'testing'
var1 = {'account': account_id, 'user': user_id}

response = json.dumps(var1)
print(response)
# out: '{"account": "ABC123", "user": "testing"}'

output = json.loads(response)
print(output)
# out: {'user': 'testing', 'account': 'ABC123'}
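If the original intent was to push the dict through Template.safe_substitute, the key point is to substitute the json.dumps output (valid JSON text), not the repr of the Python dict. A sketch with a made-up test_template, since the real one isn't shown in the question:

```python
import json
from string import Template

account_id = 'ABC123'
user_id = 'testing'
var1 = {'account': account_id, 'user': user_id}

# A hypothetical template with a $var1 placeholder; the question's
# actual test_template is not shown.
test_template = '{"payload": $var1}'

# Substitute the JSON text so the substituted result is still valid JSON;
# substituting str(var1) would inject single quotes and break json.loads.
payload = Template(test_template).safe_substitute(var1=json.dumps(var1))
output = json.loads(payload)
```

Substituting the raw dict (single quotes) is exactly what produces "Expecting ',' delimiter"-style errors on the subsequent loads call.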
I have a large dataset in my SQL database with rows such as:
("Successfully confirmed payment - {'PAYMENTINFO_0_TRANSACTIONTYPE': ['expresscheckout'], 'ACK': ['Success'], 'PAYMENTINFO_0_PAYMENTTYPE': ['instant'], 'PAYMENTINFO_0_RECEIPTID': ['1037-5147-8706-9322'], 'PAYMENTINFO_0_REASONCODE': ['None'], 'SHIPPINGOPTIONISDEFAULT': ['false'], 'INSURANCEOPTIONSELECTED': ['false'], 'CORRELATIONID': ['1917b2c0e5a51'], 'PAYMENTINFO_0_TAXAMT': ['0.00'], 'PAYMENTINFO_0_TRANSACTIONID': ['3U4531424V959583R'], 'PAYMENTINFO_0_ACK': ['Success'], 'PAYMENTINFO_0_PENDINGREASON': ['authorization'], 'PAYMENTINFO_0_AMT': ['245.40'], 'PAYMENTINFO_0_PROTECTIONELIGIBILITY': ['Eligible'], 'PAYMENTINFO_0_ERRORCODE': ['0'], 'TOKEN': ['EC-82295469MY6979044'], 'VERSION': ['95.0'], 'SUCCESSPAGEREDIRECTREQUESTED': ['true'], 'BUILD': ['7507921'], 'PAYMENTINFO_0_CURRENCYCODE': ['GBP'], 'TIMESTAMP': ['2013-08-29T09:15:59Z'], 'PAYMENTINFO_0_SECUREMERCHANTACCOUNTID': ['XFQALBN3EBE8S'], 'PAYMENTINFO_0_PROTECTIONELIGIBILITYTYPE': ['ItemNotReceivedEligible,UnauthorizedPaymentEligible'], 'PAYMENTINFO_0_ORDERTIME': ['2013-08-29T09:15:59Z'], 'PAYMENTINFO_0_PAYMENTSTATUS': ['Pending']}", 1L, datetime.datetime(2013, 8, 29, 11, 15, 59))
I use the following regex to pull the data from the first item of the list that is within curly brackets:
paypal_meta_re = re.compile(r"""\{(.*)\}""").findall
This works as expected, but when I try to remove the square brackets from the dictionary values, I get an error.
here is my code:
paypal_meta = get_paypal(order_id)
paypal_msg_re = paypal_meta_re(paypal_meta[0])
print type(paypal_msg_re), len(paypal_msg_re)
paypal_str = ''.join(map(str, paypal_msg_re))
print paypal_str, type(paypal_str)
paypal = ast.literal_eval(paypal_str)
paypal_dict = {}
for k, v in paypal.items():
    paypal_dict[k] = str(v[0])
if paypal_dict:
    namespace['payment_gateway'] = {'paypal': paypal_dict}
and here is the traceback:
Traceback (most recent call last):
File "users.py", line 383, in <module>
orders = get_orders(user_id, mongo_user_id, address_book_list)
File "users.py", line 290, in get_orders
paypal = ast.literal_eval(paypal_str)
File "/usr/local/Cellar/python/2.7.2/lib/python2.7/ast.py", line 49, in literal_eval
node_or_string = parse(node_or_string, mode='eval')
File "/usr/local/Cellar/python/2.7.2/lib/python2.7/ast.py", line 37, in parse
return compile(source, filename, mode, PyCF_ONLY_AST)
File "<unknown>", line 1
'PAYMENTINFO_0_TRANSACTIONTYPE': ['expresscheckout'], 'ACK': ['Success'], 'PAYMENTINFO_0_PAYMENTTYPE': ['instant'], 'PAYMENTINFO_0_RECEIPTID': ['2954-8480-1689-8177'], 'PAYMENTINFO_0_REASONCODE': ['None'], 'SHIPPINGOPTIONISDEFAULT': ['false'], 'INSURANCEOPTIONSELECTED': ['false'], 'CORRELATIONID': ['5f22a1dddd174'], 'PAYMENTINFO_0_TAXAMT': ['0.00'], 'PAYMENTINFO_0_TRANSACTIONID': ['36H74806W7716762Y'], 'PAYMENTINFO_0_ACK': ['Success'], 'PAYMENTINFO_0_PENDINGREASON': ['authorization'], 'PAYMENTINFO_0_AMT': ['86.76'], 'PAYMENTINFO_0_PROTECTIONELIGIBILITY': ['PartiallyEligible'], 'PAYMENTINFO_0_ERRORCODE': ['0'], 'TOKEN': ['EC-6B957889FK3149915'], 'VERSION': ['95.0'], 'SUCCESSPAGEREDIRECTREQUESTED': ['true'], 'BUILD': ['6680107'], 'PAYMENTINFO_0_CURRENCYCODE': ['GBP'], 'TIMESTAMP': ['2013-07-02T13:02:50Z'], 'PAYMENTINFO_0_SECUREMERCHANTACCOUNTID': ['XFQALBN3EBE8S'], 'PAYMENTINFO_0_PROTECTIONELIGIBILITYTYPE': ['ItemNotReceivedEligible'], 'PAYMENTINFO_0_ORDERTIME': ['2013-07-02T13:02:49Z'], 'PAYMENTINFO_0_PAYMENTSTATUS': ['Pending']
^
SyntaxError: invalid syntax
Whereas if I split the string, using
msg, paypal_msg = paypal_meta[0].split(' - ')
paypal = ast.literal_eval(paypal_msg)
paypal_dict = {}
for k, v in paypal.items():
    paypal_dict[k] = str(v[0])
if paypal_dict:
    namespace['payment_gateway'] = {'paypal': paypal_dict}
    insert = orders_dbs.save(namespace)
    return insert
this works, but I can't use it, because some of the records don't split cleanly and the result is not accurate.
Basically, I want to take the items in the curly brackets and remove the square brackets from the values and then create a new dictionary from that.
You need to include the curly braces in the capture group; your code omits them:
r"""({.*})"""
Note that the parentheses are now around the {...}.
Alternatively, if there is always a message and one dash before the dictionary, you can use str.partition() to split that off:
paypal_msg = paypal_meta[0].partition(' - ')[-1]
or limit your splitting with str.split() to just once:
paypal_msg = paypal_meta[0].split(' - ', 1)[-1]
Try to avoid putting Python structures like that into the database; store JSON in a separate column instead, rather than a string dump of the object.
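Putting the corrected regex and the unwrapping step together, a sketch with an abbreviated stand-in for the stored string (the question's code is Python 2, but this sketch runs on either version):

```python
import ast
import re

# Abbreviated stand-in for the stored string: a message, a dash, then
# a Python-literal dict whose values are one-element lists.
raw = "Successfully confirmed payment - {'ACK': ['Success'], 'PAYMENTINFO_0_AMT': ['245.40']}"

# Capture the braces as part of the match so literal_eval sees a full dict
match = re.search(r"({.*})", raw)
paypal = ast.literal_eval(match.group(1))

# Unwrap the one-element lists to get plain string values
paypal_dict = {k: str(v[0]) for k, v in paypal.items()}
```

The only change from the question's code is capturing the braces; once literal_eval receives a complete dict literal, unwrapping the values works as intended.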