I'm using Python to locate and change different parts of a JSON file. I have a list with 2 columns: I want to look for the string in the first column, find it in the JSON file, and then replace it with the string from the second column.
Does anyone have any idea how to do this? It's been driving me mad.
for row in new_list:
    if json_str == new_list[row][0]:
        json_str.replace(new_list[row][0], new_list[row][1])
I tried using .replace() as above, but it raises "TypeError: list indices must be integers or slices, not list".
The way that I've managed to print off all the data works, but it isn't referencing anything yet, so if anyone has any ideas, feel free to lend a hand. Thanks.
import json

# I import a json file and a text file...
# with open('json file', 'r', encoding="utf8") as jsonData:
#     data = json.load(jsonData)

jsonData = {"employees": [
    {"firstName": "a"},
    {"firstName": "b"},
    {"firstName": "c"}
]}

# text_file = open('text file', 'r', encoding="utf8")
# list = text_file.readlines()
# jsonString = str(data)

# The text file contains lots of data like 'a|A', 'b|B', 'c|C'
# so column 1 is lower and column 2 is upper
list = ['a|A', 'b|B', 'c|C']

def print_all():
    for value in list:
        new_list = value.split("|")
        print("%s" % value.split("|"))
        # This prints column 1 and 2
        if new_list[0] == 'some value':
            print(new_list[1])
            # This prints off the 'replaced' value

print_all()
Edit for the comment: this should be able to run... I think.
Without more context it's hard to say for sure, but it sounds as though what you want is to use
for row in range(len(new_list)):
instead of
for row in new_list:
If your JSON file is small enough, just read it into memory and then .replace() in a loop.
# UNTESTED
with open('json.txt') as json_file:
    json_str = json_file.read()

for was, will_be in new_list:
    json_str = json_str.replace(was, will_be)

with open('new-json.txt', 'w') as json_file:
    json_file.write(json_str)
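One caveat: the loop above assumes new_list already holds (old, new) pairs. If it instead comes straight from readlines() on a file of 'a|A'-style lines, each line needs splitting first. A minimal sketch (the sample data here is made up; quoting the search term keeps a bare 'a' from also rewriting key names such as "firstName"):

```python
# Sketch: build (old, new) pairs from pipe-separated lines, then
# replace the quoted values in the raw JSON text. Sample data is
# made up for illustration.
pairs_text = "a|A\nb|B\nc|C"  # stands in for text_file.read()
new_list = [line.split("|") for line in pairs_text.splitlines()]

json_str = '{"employees": [{"firstName": "a"}, {"firstName": "b"}]}'
for was, will_be in new_list:
    # Search for the quoted value so substrings inside keys survive.
    json_str = json_str.replace(f'"{was}"', f'"{will_be}"')
print(json_str)  # {"employees": [{"firstName": "A"}, {"firstName": "B"}]}
```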
I am trying to concatenate a string to send a message via python-telegram.
My plan is for the function to be modular.
It first imports lines from a .txt file, and based on how many lines there are it creates two arrays, array1[] and array2[]: array1 receives the values of the list as strings, and array2 receives user-generated information to complement what is stored at the same position, as a way to identify the differences from array1[pos]. To put it in a way:
while k < len(list):
    array2[k] = str(input(array1[k] + ": "))
    k += 1
I want to create a single string to send as a single message, so that the whole list goes into the same string:
string1 = array1[pos] + ": " + array2[pos] + "\n"
I tried using a while loop to compare the lengths, but I kept overwriting my own string again and again.
It looks like what you're after is one list that comes directly from your text file. There are lots of ways to do that, but you most likely don't want to build the list by index position; just append items to your list.
The accepted answer on this post has a good reference, which is basically the following:
import csv

with open('filename.csv', 'r') as fd:
    reader = csv.reader(fd)
    for row in reader:
        pass  # do something with row
Which, in your case would mean something like this:
import csv

actual_text_list = []
with open('filename.csv', 'r') as fd:
    reader = csv.reader(fd)
    for row in reader:
        actual_text_list.append(row)

user_input_list = []
for actual_text in actual_text_list:
    the_users_input = input(f'What is your response to {actual_text}? ')
    user_input_list.append(the_users_input)
This creates two lists, one with the actual text, and the other with the other's input. Which I think is what you're trying to do.
Another way: if the list in your text file will not have duplicates, you could consider using a dict, which is just a dictionary, a key-value data store. You would make the key the actual_text from the file, and the value the user_input. (Note that a dict key must be hashable, so you'd use the cell's string rather than the whole row list.) As yet another technique, you could make a list of lists.
import csv

actual_text_list = []
with open('filename.csv', 'r') as fd:
    reader = csv.reader(fd)
    for row in reader:
        actual_text_list.append(row[0])  # use the cell string: a list can't be a dict key

dictionary = dict()
for actual_text in actual_text_list:
    the_users_input = input(f'What is your response to {actual_text}? ')
    dictionary[actual_text] = the_users_input
Then you could use that data like this:
for actual_text, user_input in dictionary.items():
    print(f'In response to {actual_text}, you specified {user_input}.')
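To finish the job of sending everything as one Telegram message, the collected pairs can then be joined into a single string. A sketch, assuming dictionary holds the prompt/response pairs built above (the sample values are made up):

```python
# Sketch: collapse prompt/response pairs into one multi-line string.
# This sample dictionary stands in for the one built from the text
# file and user input above.
dictionary = {"Name": "Alice", "City": "Paris"}

message = "\n".join(f"{prompt}: {response}"
                    for prompt, response in dictionary.items())
print(message)
# Name: Alice
# City: Paris
```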
list_of_strings_from_txt = ["A","B","C"]
modified_list = [f"{w}: {input(f'{w}:')}" for w in list_of_strings_from_txt]
I guess? maybe?
I'm a beginner in Python, and I tried to find a solution by googling, but I couldn't find the one I wanted.
What I'm trying to do with Python is pre-processing of data: find keywords and get all rows that include a keyword from a large CSV file.
Somehow the nested loop goes through just once and then doesn't run on the second pass.
The code shown below is the part of my code that finds keywords in the CSV file and writes them to a text file.
import csv
import json

def main():
    # Calling file (directory should be changed)
    data_file = 'dataset.json'

    # Loading data.json file
    with open(data_file, 'r') as fp:
        data = json.load(fp)

    # Make the list of keys
    key_list = list(data.keys())
    # print(key_list)

    preprocess_txt = open("test_11.txt", "w+", -1, "utf-8")
    support_fact = 0
    for i, k in enumerate(key_list):
        count = 1
        # read csv, and split the line on ","
        with open("my_csvfile.csv", 'r', encoding='utf-8') as csvfile:
            reader = csv.reader(csvfile)
            # The number of q_id is 2
            # This is the part where the nested for loop doesn't work!
            if len(data[k]['Qids']) == 2:
                print("Number 2")
                for m in range(len(data[k]['Qids'])):
                    print(len(data[k]['Qids']))
                    q_id = [data[k]['Qids'][m]]
                    print(q_id)
                    for row in reader:  # ---> this nested for loop doesn't run after the first pass!
                        if all([x in row for x in q_id]):
                            print("YES!!!")
                            preprocess_txt.write("%d %s %s %s\n" % (count, row[0], row[1], row[2]))
                            count += 1
For the details of the above code:
First, it extracts all the keys from the data.json file and puts them into a list (key_list).
Second, I used all([x in row for x in q_id]) to check whether each row contains a keyword (q_id).
However, as I commented in the code, when data[k]['Qids'] has length 2, it prints YES!!! correctly on the first pass, but doesn't print YES!!! on the second pass; it never enters the for row in reader loop, even though the CSV file contains the keyword.
The printed output is shown in the figure below.
What did I do wrong, or what should I add to make the code work?
Can anybody help me out?
Thanks for looking!
For the sake of example, let's say I have a CSV file which looks like this:
foods.csv
beef,stew,apple,sauce
apple,pie,potato,salami
tomato,cherry,pie,bacon
And the following code, which is meant to simulate the structure of your current code:
def main():
    import csv

    keywords = ["apple", "pie"]

    with open("foods.csv", "r") as file:
        reader = csv.reader(file)
        for keyword in keywords:
            for row in reader:
                if keyword in row:
                    print(f"{keyword} was in {row}")
    print("Done")

main()
The desired result is that, for every keyword in my list of keywords, if that keyword exists in one of the lines in my CSV file, I will print a string to the screen - indicating in which row the keyword has occurred.
However, here is the actual output:
apple was in ['beef', 'stew', 'apple', 'sauce']
apple was in ['apple', 'pie', 'potato', 'salami']
Done
>>>
It was able to find both instances of the keyword apple in the file, but it didn't find pie! So, what gives?
The problem
The file handle (in your case csvfile) yields its contents once, and then they are consumed. Our reader object wraps around the file-handle and consumes its contents until they are exhausted, at which point there will be no rows left to read from the file (the internal file pointer has advanced to the end), and the inner for-loop will not execute a second time.
The solution
Either move the internal file pointer back to the beginning using seek after each iteration of the outer for-loop, or read the contents of the file once into a list (or similar collection) and then iterate over the list instead:
Updated code:
def main():
    import csv

    keywords = ["apple", "pie"]

    with open("foods.csv", "r") as file:
        contents = list(csv.reader(file))

    for keyword in keywords:
        for row in contents:
            if keyword in row:
                print(f"{keyword} was in {row}")
    print("Done")

main()
New output:
apple was in ['beef', 'stew', 'apple', 'sauce']
apple was in ['apple', 'pie', 'potato', 'salami']
pie was in ['apple', 'pie', 'potato', 'salami']
pie was in ['tomato', 'cherry', 'pie', 'bacon']
Done
>>>
I believe that your reader variable contains only the first line of your csv file, thus for row in reader executes only once.
Try:
with open("my_csvfile.csv", 'r', newline='', encoding='utf-8') as csvfile:
newline='' is the new argument introduced above (note that keyword arguments must come after the positional mode argument).
Reference: https://docs.python.org/3/library/csv.html#id3
Quote: "If csvfile is a file object, it should be opened with newline=''."
Here's my code:
import pickle

f = open("cities.txt", 'wb')
pickle.dump(city_list, f)
f.close()
I know normally to print a list vertically, into new lines, you do this inside a print statement: print(*city_list, sep='\n'). I want to know if there's a way to do this when creating the pickle file, so that when you open it, you see a vertical list, without having to do anything else. For example, when I open the file:
fh = open("cities.txt", 'rb')
x = pickle.load(fh)
print(x)
I want the output to be a vertical list without me having to add a sep='\n' to the print statement.
Once you have loaded your pickled data, it has been converted to a regular Python list already. What you are asking is then: How can I print the items of a list, one item per line?
The answer is simply to do this:
for item in x:
    print(item)
If instead you want the output file to be more easily readable by a human, you should encode your data in a format other than what Python's pickling module offers.
Using CSV:
import csv

city_list = [
    ('Montreal', 'Canada'),
    ('Belmopan', 'Belize'),
    ('Monaco', 'Monaco'),
]

with open('cities.txt', 'w', newline='') as file:
    writer = csv.writer(file)
    for city, country in city_list:
        writer.writerow([city, country])
This will result in cities.txt containing the following:
Montreal,Canada
Belmopan,Belize
Monaco,Monaco
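Another human-readable option (not from the original answer) is JSON with indentation; it loads straight back into Python without any custom parsing, though JSON turns tuples into lists:

```python
import json

city_list = [
    ('Montreal', 'Canada'),
    ('Belmopan', 'Belize'),
    ('Monaco', 'Monaco'),
]

# Write an indented, human-readable JSON file.
with open('cities.json', 'w') as file:
    json.dump(city_list, file, indent=2)

# Reading it back needs no custom parsing; tuples come back as lists.
with open('cities.json') as file:
    loaded = json.load(file)
print(loaded)  # [['Montreal', 'Canada'], ['Belmopan', 'Belize'], ['Monaco', 'Monaco']]
```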
I have a text file with about 20 entries. They look like this:
~
England
Link: http://imgur.com/foobar.jpg
Capital: London
~
Iceland
Link: http://imgur.com/foobar2.jpg
Capital: Reykjavik
...
etc.
I would like to take these entries and turn them into a CSV.
There is a '~' separating each entry. I'm scratching my head trying to figure out how to go through the file line by line and create the CSV values for each country. Can anyone give me a clue on how to go about this?
Use the libraries, Luke :)
I'm assuming your data is well formatted. Most real-world data isn't. So, here goes a solution.
>>> content.split('~')
['\nEngland\nLink: http://imgur.com/foobar.jpg\nCapital: London\n', '\nIceland\nLink: http://imgur.com/foobar2.jpg\nCapital: Reykjavik\n', '\nEngland\nLink: http://imgur.com/foobar.jpg\nCapital: London\n', '\nIceland\nLink: http://imgur.com/foobar2.jpg\nCapital: Reykjavik\n']
For writing the CSV, Python has standard library functions.
>>> import csv
>>> csvfile = open('foo.csv', 'w', newline='')
>>> fieldnames = ['Country', 'Link', 'Capital']
>>> writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
>>> for entry in entries:
...     cols = entry.strip().splitlines()
...     writer.writerow({'Country': cols[0], 'Link': cols[1].split(': ')[1], 'Capital': cols[2].split(': ')[1]})
...
If your data is more semi structured or badly formatted, consider using a library like PyParsing.
Edit:
Second column contains URLs, so we need to handle the splits well.
>>> cols[1]
'Link: http://imgur.com/foobar2.jpg'
>>> cols[1].split(':')[1]
' http'
>>> cols[1].split(': ')[1]
'http://imgur.com/foobar2.jpg'
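A related safeguard: split with maxsplit=1, or use str.partition, so that only the first ': ' is treated as the separator:

```python
line = 'Link: http://imgur.com/foobar2.jpg'

# Split at most once, on the first ': ' only.
label, value = line.split(': ', 1)
print(value)  # http://imgur.com/foobar2.jpg

# str.partition is similar, and never raises if the separator is missing.
before, sep, url = line.partition(': ')
print(url)    # http://imgur.com/foobar2.jpg
```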
The way that I would do it is to use the open() function with the syntax:
f = open('NameOfFile.extensionType', 'a+')
where "a+" is append mode: the file will not be overwritten, new data is appended to the end, and the file is created if it does not exist. You could also open the file in plain read mode with "r", but then you would lose the ability to write. The "+" after a letter opens the file for both reading and writing.
After that I would use a for loop like this:
data = []
tmp = []
for line in f:
    line = line.strip()  # strip() returns a copy, so reassign; this removes the trailing newline
    if line == '~':
        data.append(tmp)
        tmp = []
        continue
    else:
        tmp.append(line)
Now you have all of the data stored in a list, but you could also reformat it as a class object using a slightly different algorithm.
I have never edited CSV files using Python, but I believe you can use a loop like this to add the data:
f2 = open('CSVfileName.csv', 'w')  # can change "w" for other needs, e.g. "a+"
for entry in data:
    for subentry in entry:
        f2.write(str(subentry) + '\n')  # use '\n' to create a new line
From my knowledge of CSV that loop would create a single column of all of the data. At the end remember to close the files in order to save the changes:
f.close()
f2.close()
You could combine the two loops into one in order to save space, but for the sake of explanation I have not.
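If you want one comma-separated row per country rather than a single column, the csv module handles the commas and quoting for you. A sketch, assuming data is the list of per-entry line lists built by the loop above (the sample values are made up):

```python
import csv

# `data` stands in for the list of per-entry line lists built by the
# parsing loop above; these sample values are made up.
data = [
    ['England', 'Link: http://imgur.com/foobar.jpg', 'Capital: London'],
    ['Iceland', 'Link: http://imgur.com/foobar2.jpg', 'Capital: Reykjavik'],
]

with open('CSVfileName.csv', 'w', newline='') as f2:
    writer = csv.writer(f2)
    writer.writerow(['Country', 'Link', 'Capital'])  # header row
    for entry in data:
        country = entry[0]
        link = entry[1].split(': ', 1)[1]     # keep the URL's own colons intact
        capital = entry[2].split(': ', 1)[1]
        writer.writerow([country, link, capital])
```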
I'm trying to remove some substrings from a string in a csv file.
import csv
import string

input_file = open('in.csv', 'r')
output_file = open('out.csv', 'w')

data = csv.reader(input_file)
writer = csv.writer(output_file, quoting=csv.QUOTE_ALL)  # dialect='excel')

specials = ("i'm", "hello", "bye")

for line in data:
    line = str(line)
    new_line = str.replace(line, specials, '')
    writer.writerow(new_line.split(','))

input_file.close()
output_file.close()
So for this example:
hello. I'm obviously over the moon. If I am being honest I didn't think I'd get picked, so to get picked is obviously a big thing. bye.
I'd want the output to be:
obviously over the moon. If I am being honest I didn't think I'd get picked, so to get picked is obviously a big thing.
This, however, only works when I'm searching for a single word, so that specials = "I'm" for example. Do I need to add my words to a list or an array?
It looks like you aren't iterating through specials at all, so only a single value can ever be handled; .replace() takes one string, not a tuple. Try this:
specials = ["i'm", "hello", "bye"]

for line in data:
    new_line = str(line)
    for word in specials:
        new_line = str.replace(new_line, word, '')
    writer.writerow(new_line.split(','))
It seems like you're already splitting the input via the csv.reader, but then you're throwing away all that goodness by turning the split line back into a string. It's best not to do this, but to keep working with the lists that are yielded from the csv reader. So, it becomes something like this:
for row in data:
    new_row = []  # a place to hold the processed row data
    # look at each field in the row
    for field in row:
        # remove all the special words
        new_field = field
        for s in specials:
            new_field = new_field.replace(s, '')
        # add the sanitized field to the new "processed" row
        new_row.append(new_field)
    # after all fields are processed, write it with the csv writer
    writer.writerow(new_row)