Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 5 years ago.
import json
with open('twit/example.json', encoding='utf8') as json_data:
    for line in json_data:
        try:
            dataText = json.loads(line)
        except ValueError:
            continue
        for a in dataText:
            print(a["user"]["location"])
The result is: TypeError: string indices must be integers
Update: the answer below works for printing
print(dataText["user"]["location"])
but now I want this one to work:
print(a["user"]["location"])
If your JSON file is in the normal format (a single JSON document), use this instead:
with open('twit/example.json', encoding='utf8') as json_data:
    dataText = json.load(json_data)
print(dataText["user"]["location"])
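The way your code is currently written makes me think you have multiple JSON objects in a single file, separated by newlines (a layout often called JSON Lines). That is not how a standard JSON file is formatted. In that layout each parsed line is already a dict, so iterating over dataText yields its keys, which are strings, and a["user"] raises the error you see.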
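If the file really is newline-delimited, a minimal sketch along these lines may help. The helper name iter_locations and the sample records are made up for illustration; only the user/location keys come from the question.

```python
import json

def iter_locations(lines):
    """Yield record["user"]["location"] for each line that parses as a JSON object."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except ValueError:
            continue  # skip malformed lines, as the question's code does
        # Index the parsed dict directly; do not loop over it, because
        # iterating a dict yields its keys (plain strings).
        if isinstance(record, dict) and isinstance(record.get("user"), dict):
            yield record["user"].get("location")

# Hypothetical sample standing in for the lines of twit/example.json
sample = [
    '{"user": {"location": "Paris"}}',
    'not json',
    '{"user": {"location": null}}',
]
locations = list(iter_locations(sample))
print(locations)
```

With a real file, the open file object can be passed in place of sample, since iterating a file yields its lines.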
Closed 1 year ago.
In the 'id_path' column of my CSV file (loaded as a DataFrame), I want to remove everything in each path before the image file name, for example:
./input/skin-cancer-malignant-vs-benign/data/test/benign/454.jpg
./input/skin-cancer-malignant-vs-benign/data/test/benign/90.jpg
./input/skin-cancer-malignant-vs-benign/data/test/benign/147.jpg
./input/skin-cancer-malignant-vs-benign/data/test/malignant/771.jpg
./input/skin-cancer-malignant-vs-benign/data/test/malignant/208.jpg
./input/skin-cancer-malignant-vs-benign/data/test/malignant/1383.jpg
./input/skin-cancer-malignant-vs-benign/data/test/malignant/1354.jpg
The output should be:
454.jpg
90.jpg
147.jpg
771.jpg
208.jpg
1383.jpg
1354.jpg
rsplit() splits the string starting from the right, and the second argument, 1, tells Python to stop after the first split.
txt = "./input/skin-cancer-malignant-vs-benign/data/test/benign/454.jpg"
x = txt.rsplit("/",1)
#your answer
print(x[1])
On your DataFrame, you could do something like:
train_df['id_path'] = train_df['id_path'].apply(lambda x: x.rsplit('/',1)[1])
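One caveat with indexing the split result at [1]: a value with no "/" in it produces a one-element list, so [1] raises IndexError. The sketch below shows the same idea on a plain list using [-1], and, as an alternative not taken from the answers here, posixpath.basename from the standard library. The third sample path is invented to show the edge case.

```python
import posixpath

paths = [
    "./input/skin-cancer-malignant-vs-benign/data/test/benign/454.jpg",
    "./input/skin-cancer-malignant-vs-benign/data/test/malignant/771.jpg",
    "no_directory.jpg",  # hypothetical value without any "/"
]

# rsplit with maxsplit=1 returns one or two pieces; [-1] is always the last
# piece, so a path without "/" comes back unchanged instead of failing.
by_rsplit = [p.rsplit("/", 1)[-1] for p in paths]

# posixpath.basename gives the same result for "/"-separated paths.
by_basename = [posixpath.basename(p) for p in paths]

print(by_rsplit)
```

On a DataFrame column the same change would be x.rsplit('/', 1)[-1] inside the lambda.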
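Using str.replace (pass regex=True, because recent pandas versions treat the pattern as a literal string by default):
df["filename"] = df["path"].str.replace(r'^.*/', '', regex=True)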
We could also use str.extract here:
df["filename"] = df["path"].str.extract(r'([^/]+\.\S+$)')
Closed 4 years ago.
I'm building a physics simulator as a school project and need to store a list of shapes in a file so that it can be read back in when the program is loaded again. How do I store the list and restore it to its original state, given that some of the list items are themselves tuples or lists?
This doesn't work:
with open(filename, 'w') as f:
    f.write("\n".join(objs))
I expected to be able to write to a file but errors keep springing up as I can't write tuples.
Why not write objs to file as a string?
objs = (1, 2, 3, 4, 5)
with open('filename', 'w') as f:
    f.write(str(objs))
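Writing the string is only half of the round trip; the question also asks how to get the list back. One possible sketch, assuming the shapes consist only of Python literals (numbers, strings, tuples, lists), reads the text back with ast.literal_eval. The shape data and the temporary file path are made up for illustration.

```python
import ast
import os
import tempfile

# Hypothetical shape list mixing tuples and nested lists
objs = [("circle", (10, 20), 5), ["polygon", [(0, 0), (4, 0), (4, 3)]], 9.81]

path = os.path.join(tempfile.mkdtemp(), "shapes.txt")

# Save: repr() produces text that is valid Python literal syntax.
with open(path, "w") as f:
    f.write(repr(objs))

# Load: literal_eval parses literals only and does not execute arbitrary code.
with open(path) as f:
    restored = ast.literal_eval(f.read())

print(restored == objs)
```

Unlike a JSON round trip, this keeps tuples as tuples. It will not work for custom class instances; those would need something like pickle or a hand-written conversion.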
Closed 4 years ago.
How can I save the result from this code to .csv format?
import re
import csv
text = open('example.txt').read()
pattern = r'([0-9]+)[:]([0-9]+)[:](.*)'
regex = re.compile(pattern)
for match in regex.finditer(text):
    result = ("{},{}".format(match.group(2),match.group(3)))
If I understood your question correctly, you can generate the CSV as follows:
import re
text = open('example.txt').read()
pattern = r'([0-9]+)[:]([0-9]+)[:](.*)'
regex = re.compile(pattern)
with open('csv_file.csv', 'w') as csv_file:
    # Add header row with two columns
    csv_file.write('{},{}\n'.format('Id', 'Tile'))
    for match in regex.finditer(text):
        result = ("{},{}".format(match.group(2),match.group(3)))
        csv_file.write('{}\n'.format(result))
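Formatting the row by hand breaks if the third group itself contains a comma, since the field is not quoted. A sketch of the same loop using the standard csv module, which handles quoting, could look like this. The sample text is invented, and an in-memory buffer stands in for the output file so the result can be inspected.

```python
import csv
import io
import re

# Hypothetical input standing in for the contents of example.txt
text = "12:34:hello, world\n56:78:plain title\n"
regex = re.compile(r'([0-9]+):([0-9]+):(.*)')

# io.StringIO stands in for open('csv_file.csv', 'w', newline='')
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(['Id', 'Tile'])  # same header names as the answer above
for match in regex.finditer(text):
    # csv.writer quotes fields that contain commas or quotes
    writer.writerow([match.group(2), match.group(3)])

print(buffer.getvalue())
```

When writing to a real file, opening it with newline='' is what the csv module documentation recommends, so that line endings are not translated twice.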
Closed 7 years ago.
I need to parse a bunch of unformatted text similar to the sample below.
those|DT|O considered|VBN|O anarchists|NNS|O at|IN|O best|JJS|O share|NN|O a|DT|O certain|JJ|O family|NN|O resemblance|NN|O .|.|O "|RQU|O
I need to use a regular expression to parse the data into a format like this:
The DT I-MISC
certain JJ O
in IN O
the DT B
pound NN I
with open('outfile.txt', 'w') as outfile, open('infile.txt', 'r') as infile:
    for token in infile.read().split():
        outfile.write(token.replace('|', ' ') + '\n')
You basically just want to split on whitespace and then replace each | with a space, correct? That seems to be what you're looking for.
EDIT:
Code now writes to file.
EDIT 2:
Code now reads from a file
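Since the question asks specifically for a regular expression, here is a sketch of an equivalent approach with re.findall. It assumes every token has exactly three |-separated fields, as in the sample; the names token_re and rows are illustrative.

```python
import re

# Shortened sample taken from the question's input line
line = 'those|DT|O considered|VBN|O anarchists|NNS|O .|.|O "|RQU|O'

# Three runs of characters that are neither "|" nor whitespace,
# separated by literal "|" characters.
token_re = re.compile(r'([^|\s]+)\|([^|\s]+)\|([^|\s]+)')

# findall returns one (word, tag, label) tuple per token.
rows = [' '.join(groups) for groups in token_re.findall(line)]
print('\n'.join(rows))
```

Compared with split and replace, the capture groups give direct access to the word, tag, and label if they need to be processed separately.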