Json overwriting itself in Python - python

I am trying my hand at backend work and am stuck parsing Twitter's streaming API. I want to create a JSON file with the timestamp, name, tweet, and screen name. That part seems to work, but when I write to the file, each new entry overwrites the existing one instead of being added after it. I read here that outfile.write('\n') worked for some people. I tried it, and the new entry still overwrites the previous one.
with open('text2', 'w') as outfile:
    json.dump({'time': time.time(), 'screenName': screenName, 'text': text, 'name': name}, outfile, indent = 4, sort_keys=True)
    outfile.write('\n')

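When you open the file, use 'a' (append) instead of 'w' (write). Mode 'w' truncates the file every time it is opened, which is why only the last entry survives.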
https://docs.python.org/2/library/functions.html#open
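As a minimal sketch of the append approach: the example below writes one JSON object per line (often called JSON Lines) so that the file can be read back record by record. The helper names append_record and read_records, the file name tweets.jsonl, and the sample values are all made up for illustration, and indent=4 is dropped on purpose so that each record stays on a single line.

```python
import json
import os
import tempfile
import time


def append_record(path, record):
    """Append one JSON object as a single line at the end of the file."""
    with open(path, "a", encoding="utf-8") as outfile:
        json.dump(record, outfile, sort_keys=True)
        outfile.write("\n")


def read_records(path):
    """Parse every non-empty line as its own JSON object."""
    with open(path, encoding="utf-8") as infile:
        return [json.loads(line) for line in infile if line.strip()]


# Demo with placeholder values written to a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "tweets.jsonl")
append_record(demo_path, {"time": time.time(), "screenName": "alice",
                          "text": "first", "name": "Alice"})
append_record(demo_path, {"time": time.time(), "screenName": "bob",
                          "text": "second", "name": "Bob"})
records = read_records(demo_path)
print(len(records))
```

Note that a file built this way is a sequence of JSON documents rather than one JSON document, so it has to be read line by line instead of with a single json.load call.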

Related

How i add another list in json python

my json file :
{
"ali":{"name":"ali","age":23,"email":"his email"},
"joe":{"name":"joe","age":55,"email":"his email"}
}
And my code
name=input("name:")
age=input("age: ")
email=input("email:")
list={}
list[name]={"name":name,"age":age,"email":email}
data=json.dumps(list)
with open ('info.json','a') as f:
    f.write(data)
I need a way to append another entry (another name) to the JSON file.
Any ideas?
To update an existing JSON file you need to read the entire file, make adjustments, and write the whole lot back again:
import json

with open('info.json', 'r') as f:
    data = json.load(f)
name = input("name:")
age = input("age: ")
email = input("email:")
data[name] = {"name": name, "age": age, "email": email}
with open('info.json', 'w') as f:
    json.dump(data, f)
By the way, there are no lists involved here, just nested dictionaries.
Also, if the user enters a duplicate name, then this code will overwrite the one in the file with updated data. This may, or may not, be what you want.
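If overwriting a duplicate name is not what you want, one option is to check for the key before writing. The sketch below is only an illustration: the function add_entry, its overwrite flag, and the sample records are hypothetical names and values, not part of the original code.

```python
import json
import os
import tempfile


def add_entry(path, name, record, overwrite=False):
    """Add record under name. Return False if name exists and overwrite is off."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    if name in data and not overwrite:
        return False  # leave the file untouched
    data[name] = record
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)
    return True


# Demo on a temporary copy of the file layout from the question.
demo_path = os.path.join(tempfile.mkdtemp(), "info.json")
with open(demo_path, "w", encoding="utf-8") as f:
    json.dump({"ali": {"name": "ali", "age": 23, "email": "his email"}}, f)

added_new = add_entry(demo_path, "joe", {"name": "joe", "age": 55, "email": "his email"})
added_dup = add_entry(demo_path, "ali", {"name": "ali", "age": 99, "email": "other"})

with open(demo_path, encoding="utf-8") as f:
    final = json.load(f)
print(added_new, added_dup, sorted(final))
```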

how to modify use my function to retrieve tweets

Hi guys, I am working on a personal project in which I search for tweets containing specific keywords. I collected about 100 recent tweets for each of the keywords and saved them to the variables x1_tweets, x2_tweets and x3_tweets. The data is basically a list of dictionaries, and the fields look like this:
['created_at', 'id', 'id_str', 'text', 'truncated', 'entities', 'metadata', 'source', 'in_reply_to_status_id', 'in_reply_to_status_id_str', 'in_reply_to_user_id', 'in_reply_to_user_id_str', 'in_reply_to_screen_name', 'user', 'geo', 'coordinates', 'place', 'contributors', 'is_quote_status', 'retweet_count', 'favorite_count', 'favorited', 'retweeted', 'lang']
I then wanted to save the tweets (just the text) from each of the variables to a JSON file. For that I defined a function that saves a list of dictionaries to a JSON file, with obj being the list of dictionaries and filename being the name I want to save it as:
def save_to_json(obj, filename):
    with open(filename, 'w') as fp:
        json.dump(obj, fp, indent=4, sort_keys=True)
In order to get only the tweet text I wrote the following code:
for i, tweet in enumerate(x1_tweets):
    save_to_json(tweet['text'],'bat')
However, I have had no success so far. Can anyone please point me in the right direction? Thanks in advance!
Edit: I am using TwitterAPI.
The first thing to do is change the function as follows:
def save_to_json(obj, filename):
    with open(filename, 'a') as fp:
        json.dump(obj, fp, indent=4, sort_keys=True)
You need to change the mode in which the file is opened, for the following reason:
w: Opens in write-only mode. The pointer is placed at the beginning of the file and this will overwrite any existing file with the same name. It will create a new file if one with the same name doesn't exist.
a: Opens a file for appending new information to it. The pointer is placed at the end of the file. A new file is created if one with the same name doesn't exist.
Also, sort_keys has no effect here because you are passing a string, not a dict. Similarly, indent=4 has no effect on a string.
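The difference between the two modes can be checked with a small experiment. This is a sketch using temporary files and two made-up tweet texts. It also shows a caveat of the append approach: several json.dump outputs written back to back do not form a single valid JSON document.

```python
import json
import os
import tempfile

tmp_dir = tempfile.mkdtemp()
w_path = os.path.join(tmp_dir, "w_mode.json")
a_path = os.path.join(tmp_dir, "a_mode.json")

for text in ["first tweet", "second tweet"]:
    with open(w_path, "w") as fp:   # truncates the file on every open
        json.dump(text, fp)
    with open(a_path, "a") as fp:   # keeps existing content, writes at the end
        json.dump(text, fp)

with open(w_path) as fp:
    w_content = fp.read()
with open(a_path) as fp:
    a_content = fp.read()

# The appended file holds two JSON strings back to back, which json.loads
# rejects as a whole document.
try:
    json.loads(a_content)
    a_is_valid_json = True
except json.JSONDecodeError:
    a_is_valid_json = False

print(w_content)
print(a_content)
print(a_is_valid_json)
```

Because of that caveat, collecting everything into one object and writing it once is usually the easier route when the file has to be loaded again later.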
If you need an index alongside the tweet text, you can use the code below:
tweets = {}
for i, tweet in enumerate(x1_tweets):
    tweets[i] = tweet['text']
save_to_json(tweets,'bat.json')
The above code builds a dict that maps each index to its tweet text and writes it to the file once, after all tweets have been processed.

Load JSON file into a dictionary and not string or list

I have created a JSON file after scraping data online with the following simplified code:
for item in range(items_to_scrape):
    az_text = []
    for n in range(first_web_page, last_web_page):
        reviews_html = requests.get(page_link)
        tree = fromstring(reviews_html.text)
        page_link = base_url + str(n)
        review_text_tags = tree.xpath(xpath_1)
        for r_text in review_text_tags:
            review_text = r_text.text
            az_text.append(review_text)
    az_reviews = {}
    az_reviews[item] = az_text
    with open('data.json', 'w') as outfile:
        json.dump(az_reviews , outfile)
There might be a better way to create a JSON file with the first key equal to the item number and the second key equal to the list of reviews for that item. However, I am currently stuck at opening the JSON file to see which items have already been scraped.
The structure of the JSON file looks like this:
{
    "asin": "0439785960",
    "reviews": [
        "Don’t miss this one!",
        "Came in great condition, one of my favorites in the HP series!",
        "Don’t know how these books are so good and I’ve never read them until now. Whether you’ve watched the movies or not, read these books"
    ]
}
The unsuccessful attempt that seems to be closer to the solution is the following:
import json
from pprint import pprint
json_data = open('data.json', 'r').read()
json1_file = json.loads(json_data)
print(type(json1_file))
print(json1_file["asin"])
It returns a string that exactly replicates the output of the print() call I used during scraping to check what the JSON file was going to look like, but I can't access the asins or reviews using json1_file["asin"] or json1_file["reviews"], since what is read from the file is a string and not a dictionary:
TypeError: string indices must be integers
Using the json.load() function I still print the right content, but I cannot figure out how to access the dictionary-like object from the JSON file to iterate through keys and values.
The following code prints the content of the file, but raises an error (AttributeError: '_io.TextIOWrapper' object has no attribute 'items') when I try to iterate through keys and values:
with open('data.json', 'r') as content:
    print(json.load(content))
    for key, value in content.items():
        print(key, value)
What is wrong with the code above and what should be adjusted to load the file into a dictionary?
On "string indices must be integers":
You're writing out the data as a string, not a dictionary. Remove the dumps call and only use dump:
with open('data.json', 'w') as outfile:
    json.dump(az_reviews, outfile, indent=2, ensure_ascii=False)
On "what should be adjusted to load the file into a dictionary?":
Once the file contains a JSON object rather than a string, nothing needs to change, except that instead of calling read() and then json.loads you can call json.load on the file directly. Also note that items() has to be called on the object returned by json.load, not on the file handle.
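To illustrate the difference, here is a small sketch that writes the same dictionary twice: once directly with json.dump, and once double-encoded by passing the output of json.dumps to json.dump. The double encoding is an assumption about how the string ended up in the file, since the scraping code shown in the question does not include that step. The file names and the sample review are placeholders.

```python
import json
import os
import tempfile

reviews = {"asin": "0439785960", "reviews": ["Great book"]}
tmp_dir = tempfile.mkdtemp()
good_path = os.path.join(tmp_dir, "good.json")
double_path = os.path.join(tmp_dir, "double.json")

with open(good_path, "w") as f:
    json.dump(reviews, f)               # dictionary stored as a JSON object
with open(double_path, "w") as f:
    json.dump(json.dumps(reviews), f)   # string stored as a JSON string

with open(good_path) as f:
    loaded_good = json.load(f)          # comes back as a dict
with open(double_path) as f:
    loaded_double = json.load(f)        # comes back as a str

# Iterate over the parsed object, not over the file handle.
for key, value in loaded_good.items():
    print(key, value)

print(type(loaded_good).__name__, type(loaded_double).__name__)
```

A double-encoded file can still be rescued by decoding twice, for example json.loads(loaded_double), but fixing the writer is the cleaner option.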
Another problem seems to be that you're overwriting the file on every loop iteration.
Instead, you probably want to collect the data in the loop and write it to a single file afterwards:
data = {}
for item in range(items_to_scrape):
    pass # add to data
# put all data in one file
with open('data.json', 'w') as f:
    json.dump(data, f)
In this scenario, I suggest that you store the asin as a key, with the reviews as values
asin = "123456" # some scraped value
data[asin] = reviews
Or write a unique file for each scrape, which you then must loop over to read them all.
for item in range(items_to_scrape):
    data = {}
    # add to data
    with open('data{}.json'.format(item), 'w') as f:
        json.dump(data, f)
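Reading the per-item files back could look like the sketch below. It assumes each file stores its own item number under an "item" key, which is a layout chosen only for this illustration, and it uses a temporary directory and made-up review text.

```python
import glob
import json
import os
import tempfile

out_dir = tempfile.mkdtemp()

# Write one file per item, mirroring the 'data{}.json' naming used above.
for item in range(3):
    data = {"item": item, "reviews": ["review for item {}".format(item)]}
    with open(os.path.join(out_dir, "data{}.json".format(item)), "w") as f:
        json.dump(data, f)

# Loop over all matching files and merge them into one dictionary.
combined = {}
for path in sorted(glob.glob(os.path.join(out_dir, "data*.json"))):
    with open(path) as f:
        loaded = json.load(f)
    combined[loaded["item"]] = loaded["reviews"]

print(sorted(combined))
```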

Add one more attribute to JSON file

I have a generated JSON file whose entries look like this (there are many of them, each with a unique ID):
{
    "id": 1,
    "name": "name",
    "dep": "dep",
    "Title": "Title",
    "email": "email"
}
I'm trying to "append" a new field, but I get an error:
with open('data.json', 'w') as file:
    json.dump(write_list, file)
    file.close()
with open('data.json', 'w') as json_file:
    entry = {'parentId': random.randrange(0, 487, 2)}
    json_file.append(entry, json_file)
    json_file.close()
Is there some way to add one more "key: value" pair to it after generating?
There are two issues:
1. You are using json.dump to write the list to the file, but you're not using json.load to re-create the Python data structure.
2. You're opening the file with the w mode in the second open call, which truncates it.
Try breaking each step out on its own, keeping the code that mutates the data separate from the code that writes to disk.
with open('data.json', 'w') as file:
    json.dump(write_list, file)
    #file.close() # manually closing files is unnecessary,
    # when using context managers
with open('data.json', 'r') as json_file:
    write_list = json.load(json_file)
entry = {'parentId': random.randrange(0, 487, 2)}
write_list.append(entry)
with open('data.json', 'w') as json_file:
    json.dump(write_list, json_file)
The steps to do what you want are as follows:
1. Parse the entire JSON file into a Python data structure.
2. Add the entry to that data structure.
3. Serialize the data structure back to JSON.
4. Write the JSON back to the file.
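The steps above can be sketched as one function. The name add_field_to_all and its make_value parameter are hypothetical, and the sketch assumes the file holds a list of dictionaries, as in the question. Unlike write_list.append(entry), which adds one extra element to the list, this variant sets the new key on every existing dictionary.

```python
import json
import os
import random
import tempfile


def add_field_to_all(path, key, make_value):
    """Set key on every record in a JSON file that holds a list of dicts."""
    with open(path, "r", encoding="utf-8") as f:
        records = json.load(f)          # step 1: parse the whole file
    for record in records:
        record[key] = make_value()      # step 2: modify the data in memory
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f)           # steps 3 and 4: serialize and write back
    return records


# Demo with two placeholder records in a temporary file.
demo_path = os.path.join(tempfile.mkdtemp(), "data.json")
with open(demo_path, "w", encoding="utf-8") as f:
    json.dump([{"id": 1, "name": "name"}, {"id": 2, "name": "name"}], f)

updated = add_field_to_all(demo_path, "parentId",
                           lambda: random.randrange(0, 487, 2))

with open(demo_path, encoding="utf-8") as f:
    on_disk = json.load(f)
print(len(on_disk))
```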
After implementing this with Tim McNamara's advice, I also found a neater way to add the new key to every JSON dict in the file:
for randomID in write_list:
    randomID['parentId'] = random.randrange(0, 487, 2)
with open('data.json', 'w') as file:
    json.dump(write_list, file)

Converting dictionary as Json and append to a file

The scenario is that I need to convert a dictionary object to JSON and write it to a file. A new dictionary object is passed in on every write_to_file() call, and I have to append its JSON to the file. Here is the code:
def write_to_file(self, dict=None):
    f = open("/Users/xyz/Desktop/file.json", "w+")
    if json.load(f)!= None:
        data = json.load(f)
        data.update(dict)
        f = open("/Users/xyz/Desktop/file.json", "w+")
        f.write(json.dumps(data))
    else:
        f = open("/Users/xyz/Desktop/file.json", "w+")
        f.write(json.dumps(dict))
I get the error "No JSON object could be decoded" and the JSON is not written to the file. Can anyone help?
This looks overly complex and highly buggy. Opening the file several times in w+ mode and reading it twice won't get you anywhere; it will only create an empty file that json won't be able to read.
I would test whether the file exists and, if so, read it (otherwise start from an empty dict).
The default None argument makes no sense: you have to pass a dictionary or the update method won't work. That said, we can skip the update if the object is "falsy".
Don't use dict as a variable name, since it shadows the built-in type.
In the end, overwrite the file with a new version of your data (w+ and r+ should be reserved for fixed-size/binary files, not text/json/xml files).
Like this:
import json
import os

def write_to_file(self, new_data=None):
    # define filename to avoid copy/paste
    filename = "/Users/xyz/Desktop/file.json"
    data = {} # in case the file doesn't exist yet
    if os.path.exists(filename):
        with open(filename) as f:
            data = json.load(f)
    # update data with new_data if non-None/empty
    if new_data:
        data.update(new_data)
    # write the updated dictionary, create file if
    # didn't exist
    with open(filename,"w") as f:
        json.dump(data,f)
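The truncation problem described in the answer can be reproduced in isolation. This is a sketch that uses a temporary file in place of the path from the question; it shows that opening an existing file in w+ mode empties it before anything can be read, so json.load fails.

```python
import json
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "file.json")
with open(path, "w") as f:
    json.dump({"a": 1}, f)
size_before = os.path.getsize(path)

# "w+" truncates the file at open time, so there is nothing left to parse.
with open(path, "w+") as f:
    try:
        json.load(f)
        load_failed = False
    except ValueError:   # json.JSONDecodeError is a subclass of ValueError
        load_failed = True

size_after = os.path.getsize(path)
print(size_before, size_after, load_failed)
```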
