I have 3 json files as below:
test1.json:
{"item":"book1","price":"10.00","location":"library"}
test2.json:
{"item":"book2","price":"15.00","location":"store"}
test3.json:
{"item":"book3","price":"9.50","location":"store"}
I have this code:
import json
import glob
result = ''
for f in glob.glob("*.json"):
    with open(f, "r") as infile:
        result += infile.read()
with open("m1.json", "w") as outfile:
    outfile.write(result)
I get the following output:
{"item":"book1","price":"10.00","location":"library"}
{"item":"book2","price":"15.00","location":"store"}
{"item":"book3","price":"9.50","location":"store"}
Is it possible to get each file as a new line separated by a comma like below?
{"item":"book1","price":"10.00","location":"library"}, <-- line 1
{"item":"book2","price":"15.00","location":"store"}, <-- line 2
{"item":"book3","price":"9.50","location":"store"} <-- line 3
As others commented, your expected result is invalid JSON.
But if you really want that format, str.join() is more convenient:
jsons = []
for f in glob.glob("*.json"):
    with open(f, "r") as infile:
        jsons.append(infile.read())
result = ',\n'.join(jsons)
infile.read() gives you the raw string; d = json.loads(infile.read()) parses it into a real Python object (a dict).
To write a valid combined JSON file (a list of dicts), append the parsed dicts to the list instead of the raw strings, then write s = json.dumps(jsons) to the file.
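Putting those pieces together, one possible sketch that parses each file and writes a single valid JSON array, with each object on its own line as in the desired output (the helper name merge_json_files is mine, not from the question):

```python
import json

def merge_json_files(paths, out_path):
    """Merge several one-object JSON files into one valid JSON array,
    keeping each object on its own line."""
    merged = []
    for path in paths:
        with open(path, "r") as infile:
            merged.append(json.load(infile))  # parse, don't just concatenate
    with open(out_path, "w") as outfile:
        outfile.write("[\n")
        # compact separators keep each object looking like the input files
        outfile.write(",\n".join(json.dumps(obj, separators=(",", ":"))
                                 for obj in merged))
        outfile.write("\n]")
```

Note that if you call it with glob.glob("*.json"), the pattern will also match the output file on a second run, so a narrower pattern such as "test*.json" is safer.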
I am trying to write a python script to convert rows in a file to json output, where each line contains a json blob.
My code so far is:
with open("/Users/me/tmp/events.txt") as f:
    content = f.readlines()

# strip to remove newlines
lines = [x.strip() for x in content]

i = 1
for line in lines:
    filename = "input" + str(i) + ".json"
    i += 1
    f = open(filename, "w")
    f.write(line)
    f.close()
However, I am running into an issue where if I have an entry in the file that is quoted, for example:
client:"mac"
This will be output as:
"client:""mac"""
Using a second strip on writing to file will give:
client:""mac
But I want to see:
client:"mac"
Is there any way to force Python to read text in the format ' "something" ' without appending extra quotes around it?
Instead of creating an auxiliary list to strip the newlines from content, just open the input and output files at the same time. Write to the output file as you iterate through the lines of the input, stripping whatever you deem necessary. Try something like this (text mode, since str.strip('"') would fail on bytes):
with open('events.txt', 'r') as infile, open('input1.json', 'w') as outfile:
    for line in infile:
        line = line.rstrip('\n').strip('"')
        outfile.write(line + '\n')
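If you still want one output file per input line, as in the original loop, a minimal sketch under the assumption that only the trailing newline should be removed and the line is otherwise written verbatim (the helper name is illustrative):

```python
def split_lines_to_files(in_path, prefix="input"):
    """Write each line of in_path to its own numbered .json file, verbatim."""
    names = []
    with open(in_path, "r") as infile:
        for i, line in enumerate(infile, start=1):
            name = "{}{}.json".format(prefix, i)
            with open(name, "w") as outfile:
                # strip only the trailing newline; inner quotes are kept as-is
                outfile.write(line.rstrip("\n"))
            names.append(name)
    return names
```

Plain file.write() never adds quotes of its own, so the doubled quotes in the original output most likely came from viewing or re-writing the files through a CSV-aware tool.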
Hello, I have issues with replacing numbers and words in a txt file. I have a dict written to a txt file like this, for example: abc|aaa|bbb|ccc.
I want to replace the last value with a new one from input and then save it to the same txt file without changing the first part, like: abc|aaa|bbb|ddd.
Currently all matching values get replaced; I can't make it replace only the specific one.
Sorry, I forgot to include the code; here it is:
inputQuantity = input("Quantity: ")
f = open("file.txt", "r")
f1 = f.read()
f2 = f1.replace(book["quantity"], inputQuantity)
f = open("file.txt", "w")
f.write(f2)
f.close()
book["quantity"] = inputQuantity
If your "dict" is always separated with | you can use this:
with open('file.txt', 'r') as f:
    text = f.read().split('|')

# now you have a list ['abc', 'aaa', 'bbb', 'ccc']
# and you can change 'ccc' like you would change any list item
text[3] = 'ddd'
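To complete the round trip, a sketch that also writes the modified list back with the same separator (the helper name replace_last_field is my own):

```python
def replace_last_field(path, new_value, sep="|"):
    """Replace the last sep-delimited field in the file and write it back."""
    with open(path, "r") as f:
        parts = f.read().strip().split(sep)
    parts[-1] = new_value          # change only the last field
    with open(path, "w") as f:
        f.write(sep.join(parts))   # first part of the "dict" is untouched
```

Unlike str.replace(), this cannot accidentally change an identical value that happens to appear earlier in the line.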
If you want to save a "Python dict" dct = {key: value} you can use a module like json, which is part of the standard library.
Saving and loading a dict would look like this:
import json

dct = {'key': 'value'}

with open('file.json', 'w') as f:
    json.dump(dct, f)

with open('file.json', 'r') as f:
    dct = json.load(f)

print(type(dct))
>>> <class 'dict'>
How do I convert a text file separated by newlines to a CSV file?
Text File Sample
/themes/modern/user_style.php?user_colors[bg_color]="</style><script></script>
?<meta http-equiv=set-cookie content="testpokn=7494">
/_37040/
/clr_cream/
reg_req/
trackir3pro-1/
selector_10274/
javascript/orders.html
perlutil/
/carte_ameriques2/
/javascript/count.conf
/glow_chairs/
I want to convert this as-is to a CSV file: 1 column, multiple rows. Each line of the text file should be a row in the CSV file.
Python
import csv
import os
import urllib.parse

def loadFile(name):
    directory = os.getcwd()
    filepath = directory + "/" + name
    data = open(filepath, 'r').readlines()
    result = []
    for d in data:
        d = str(urllib.parse.unquote(d))
        result.append(d)
    return result

def main():
    data = loadFile('code.txt')
    with open('new.csv', 'w', newline='') as fp:
        a = csv.writer(fp)
        a.writerows(data)

main()
My problem is that it adds a comma after each character.
writerows expects a list of lists or tuples, like [('bla',), ('bla',)]:
def main():
    data = loadFile('code.txt')
    with open('new.csv', 'w', newline='') as fp:
        a = csv.writer(fp)
        a.writerows([(r,) for r in data])
The wrong output comes from your usage of writerows. It is supposed to receive an iterable (of rows) of iterables (of columns). Since you give it a list of strings, it takes each string as an iterable of characters, hence the output.
A simple fix is to have loadFile return a list of 1-tuples of strings:
def loadFile(name):
    directory = os.getcwd()
    filepath = directory + "/" + name
    data = open(filepath, 'r').readlines()
    result = []
    for d in data:
        d = str(urllib.parse.unquote(d))
        result.append((d,))  # result now contains 1-tuples of strings
    return result
But anyway, as you process each line separately, it would be much more memory-friendly to use this general program structure:
open input file and output csv (with open(...) as ...)
loop over input file by lines (for line in ...:)
compute what the line should become
write it to the output file
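That structure could be sketched like this, reusing the file names from the question (code.txt and new.csv are taken from the original code):

```python
import csv
import urllib.parse

def convert(in_path, out_path):
    """Stream in_path line by line into a one-column CSV at out_path."""
    with open(in_path, "r") as infile, \
         open(out_path, "w", newline="") as outfile:
        writer = csv.writer(outfile)
        for line in infile:
            # compute what the line should become
            decoded = urllib.parse.unquote(line.rstrip("\n"))
            # a one-element list produces exactly one column per row
            writer.writerow([decoded])
```

Because each line is written as soon as it is read, memory use stays constant no matter how large the input file is.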
I am trying to search a large group of text files (160K) for a specific string that changes for each file. I have a text file that has every file in the directory with the string value I want to search. Basically I want to use python to create a new text file that gives the file name, the string, and a 1 if the string is present and a 0 if it is not.
The approach I am using so far is to create a dictionary from a text file. From there I am stuck. Here is what I figure in pseudo-code:
**assign dictionary**
d = {}
with open('file.txt') as f:
    d = dict(x.rstrip().split(None, 1) for x in f)
**loop through directory**
for filename in os.listdir(os.getcwd()):
    ***here is where I get lost***
    match file name to dictionary
    look for string
    write filename, string, 1 if found
    write filename, string, 0 if not found
Thank you. It needs to be somewhat efficient since it's a large amount of text to go through.
Here is what I ended up with:
import os

d = {}
with open('ibes.txt') as f:
    d = dict(x.rstrip().split(None, 1) for x in f)

for filename in os.listdir(os.getcwd()):
    string = d.get(filename, "!##$%^&*")
    if string in open(filename, 'r').read():
        with open("ibes_in.txt", 'a') as out:
            out.write("{} {} {}\n".format(filename, string, 1))
    else:
        with open("ibes_in.txt", 'a') as out:
            out.write("{} {} {}\n".format(filename, string, 0))
As I understand your question, the dictionary relates file names to strings
d = {
    "file1.txt": "widget",
    "file2.txt": "sprocket",  # etc
}
If each file is not too large you can read each file into memory:
for filename in os.listdir(os.getcwd()):
    string = d[filename]
    if string in open(filename, 'r').read():
        print(filename, string, "1")
    else:
        print(filename, string, "0")
This example uses print, but you could write to a file instead. Open the output file before the loop (outfile = open("outfile.txt", 'w')) and, instead of printing, use:
outfile.write("{} {} {}\n".format(filename, string, 1))
On the other hand, if each file is too large to fit easily into memory, you could use an mmap as described in Search for string in txt file Python
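A minimal mmap-based membership check might look like this (assuming the files are non-empty, since mmap cannot map an empty file; the helper name is mine):

```python
import mmap

def contains(path, needle):
    """Check whether needle occurs in the file without reading it all at once."""
    with open(path, "rb") as f:
        # map the file read-only; the OS pages it in on demand
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm.find(needle.encode()) != -1
```

The kernel only pages in the parts of the file that the search actually touches, so even very large files do not need to fit in memory.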
I'm trying to parse JSON-formatted data with the json.load() method, but it gives me an error. I tried different approaches, like reading line by line and converting into a dictionary or a list, but nothing works. I also tried the solution mentioned in loading-and-parsing-a-json, but it gives me the same error.
import json

data = []
with open('output.txt', 'r') as f:
    for line in f:
        data.append(json.loads(line))
Error:
ValueError: Extra data: line 1 column 71221 - line 1 column 6783824 (char 71220 - 6783823)
Please find the output.txt in the below URL
Content- output.txt
I wrote up the following which will break up your file into one JSON string per line and then go back through it and do what you originally intended. There's certainly room for optimization here, but at least it works as you expected now.
import json
import re

PATTERN = '{"statuses"'

with open('output.txt', 'r') as f:
    file_as_str = f.read()

# Insert a newline before every occurrence of the pattern except the first.
# (Seeking and writing in place would overwrite existing bytes and corrupt
# the data, so rewrite the whole file instead.)
fixed = re.sub(re.escape(PATTERN), '\n' + PATTERN, file_as_str).lstrip('\n')

with open('output.txt', 'w') as f:
    f.write(fixed)

data = []
with open('output.txt', 'r') as f:
    for line in f:
        data.append(json.loads(line))
Your alleged JSON file is not a properly formatted JSON file. JSON files must contain exactly one object (a list, a mapping, a number, a string, etc). Your file appears to contain a number of JSON objects in sequence, but not in the correct format for a list.
Your program's JSON parser correctly returns an error condition when presented with this non-JSON data.
Here is a program that will interpret your file:
import json

# Idea and some code stolen from https://gist.github.com/sampsyo/920215
data = []
with open('output.txt') as f:
    s = f.read()

decoder = json.JSONDecoder()
while s.strip():
    s = s.lstrip()  # raw_decode does not tolerate leading whitespace
    datum, index = decoder.raw_decode(s)
    data.append(datum)
    s = s[index:]

print(len(data))