Replacing text in a file with csv file - python

I want to replace text in a text file with names from a CSV file. For example, I have a text file that says:
Dear [name here],
Hello . . .. etc etc
And a CSV file with two columns, the first name in the first and the last name in the second:
Joe Smith
Rachel Cool
How can I read in the CSV file and replace the text [name here] in the file with each name? After opening the text and CSV files, I have this so far, which puts the first and last names in the variable names:
for row in f.readlines():
    names = row.split(",")[0], row.split(",")[1]
After that I tried doing something like this, but it isn't working:
for row in textfile.readlines():
    print row.replace("[name here]", names)
After running it, I get a TypeError: expected a string or other character buffer error. I assume this is because the variable names isn't defined in the second for loop.
But how can I read both files and replace just the [name here] in the text file?

Answering your question
From this line,
names = row.split(",")[0], row.split(",")[1]
names is a tuple of two strings. But then, in this line
row.replace("[name here]", names)
you're trying to use it as a single string.
If you want to write both first and family names, you can join them as suggested in another answer, but why split them in the first place? You could just read them in the first loop without splitting (which also handles people who have three names, like Herbert George Wells):
names = row
in the first loop, then
row.replace("[name here]", names)
in the second loop.
If you split the names because you want to use them separately, then it is unclear to me what you want to achieve.
Please include tracebacks in questions
The hint here is in the traceback:
TypeError: expected a string or other character buffer error
replace expects a string or something that acts like a string. You're feeding it a tuple. Generally, when asking a question here, it is good practice to copy the full traceback so that we know where the error occurs exactly.
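A minimal sketch of that mismatch, assuming Python 3 and a hypothetical sample row "Joe,Smith" standing in for one line of the CSV file. It shows that str.replace rejects a tuple and accepts the same parts once they are joined into one string:

```python
# Hypothetical sample row, standing in for one line of the CSV file.
row = "Joe,Smith\n"

# Splitting gives two separate strings, collected here in a tuple.
parts = row.strip().split(",")
names = parts[0], parts[1]

line = "Dear [name here],"

try:
    line.replace("[name here]", names)  # a tuple is not a string
    failed = False
except TypeError:
    failed = True

# Joining the parts gives str.replace the single string it expects.
fixed = line.replace("[name here]", " ".join(names))
print(failed, fixed)
```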
Next issue
Now, you're going to face another issue: in the first loop, you overwrite names at each iteration.
You should use a list to store all the names, then produce one copy of the text per name:
names = []
for row in f.readlines():
    names.append(row.strip())
template = textfile.read()
for name in names:
    print(template.replace("[name here]", name))
(Or maybe you did this correctly in your code but stripped it away when generating a simplified example for your question.)

The replace() method expects a string for the second parameter and you've passed it a tuple, hence the TypeError.
You can take the two names in the names tuple and combine them with a space in the middle using the join method. Try this:
print(row.replace("[name here]", ' '.join(names)))
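Putting the pieces from this thread together, here is one possible sketch, assuming Python 3. The function name fill_template and the in-memory sample files are hypothetical; with real files you would pass an opened CSV file instead of the io.StringIO object. It uses csv.reader instead of split(",") and produces one filled-in copy of the text per CSV row:

```python
import csv
import io


def fill_template(template, csv_file, placeholder="[name here]"):
    """Return one filled-in copy of the template per CSV row."""
    letters = []
    for row in csv.reader(csv_file):
        if not row:  # skip blank lines
            continue
        # Join all columns, so two-part and three-part names both work.
        full_name = " ".join(part.strip() for part in row)
        letters.append(template.replace(placeholder, full_name))
    return letters


# In-memory stand-ins for the real files (assumed contents).
template_text = "Dear [name here],\nHello ...\n"
names_csv = io.StringIO("Joe,Smith\nRachel,Cool\n")

letters = fill_template(template_text, names_csv)
for letter in letters:
    print(letter)
```

Reading the template once and looping over the names avoids re-reading the text file for every person.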

Related

Is there a way to search for the contents in a dictionary in another dictionary in Python?

I'm writing a program that searches through a file containing names and addresses and, based on user input, stores certain names from the file in a dictionary. The program then reads another file, containing names and salaries, and stores the entire file in a second dictionary. That part seems to work fine, but I need the program to look up the names from the first dictionary in the second dictionary. I am struggling to figure that part out; I have done a lot of research but have found nothing that solves my problem.
Your second loop can check whether name is in name_dict before adding the element to salary_dict.
for sal in sal_file:
    (name, salary) = sal.strip().split('|')
    if name in name_dict:
        salary_dict[name] = salary
Then you can just write everything in salary_dict to the new file. You could even do that in the second loop, instead of creating the dictionary.
You should use strip() to remove the newline from the line before splitting it. And there's no need to use str(name), since name is already a string.
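A self-contained sketch of that filtering step, assuming Python 3. The "name|salary" line format is inferred from the split('|') call above, and the function name filter_salaries and the sample data are hypothetical:

```python
import io


def filter_salaries(sal_file, name_dict):
    """Keep only the salary lines whose name is a key of name_dict."""
    salary_dict = {}
    for line in sal_file:
        line = line.strip()  # drop the trailing newline
        if not line:
            continue
        name, salary = line.split("|")
        if name in name_dict:  # dictionary membership tests the keys
            salary_dict[name] = salary
    return salary_dict


# Hypothetical data standing in for the two files.
name_dict = {"Ann Lee": "1 Main St", "Bob Ray": "2 Oak Ave"}
sal_file = io.StringIO("Ann Lee|50000\nCid Poe|42000\nBob Ray|61000\n")

salary_dict = filter_salaries(sal_file, name_dict)
print(salary_dict)
```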

python: create variable from loop that reads text file and extracts single column

I'm new to python and was hoping for some assistance please.
I have a small script that reads a text file and prints only the first column using a for loop:
list = open("/etc/jbstorelist")
for column in list:
    print(column.split()[0])
But I would like to take all the lines printed in the for loop and create one single variable for it.
In other words, the text file /etc/jbstorelist has 3 columns and basically I want a list with only the first column, in the form of a single variable.
Any guidance would be appreciated. Thank you.
Since you're new to Python, you may want to come back and reference this answer later.
# Don't override Python builtins (i.e. don't use `list` as a variable name).
list_ = []
# Use the with statement when opening a file; it will automatically close it
# for you when you exit the block.
with open("/etc/jbstorelist") as filestream:
    # When you loop over a file you're not looping over the columns, you're
    # looping over the rows, or lines.
    for line in filestream:
        # There is a side effect here you may not be aware of: calling `.split()`
        # with no arguments splits on any amount of whitespace. If you only
        # want to split on a single space character, you can pass `.split()`
        # a <space> character, like so: `.split(' ')`.
        list_.append(line.split()[0])
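The same loop can also be written as a list comprehension. This is a sketch, assuming Python 3; the helper name first_column and the sample contents are hypothetical, and io.StringIO stands in for the opened /etc/jbstorelist file. Blank lines are skipped so that split()[0] cannot fail on an empty list:

```python
import io


def first_column(lines):
    """Return the first whitespace-separated field of every non-blank line."""
    return [line.split()[0] for line in lines if line.strip()]


# Stand-in for open("/etc/jbstorelist"); the contents are made up.
sample = io.StringIO(
    "store1 10.0.0.1 active\n"
    "store2 10.0.0.2 idle\n"
    "\n"
    "store3 10.0.0.3 active\n"
)

stores = first_column(sample)
print(stores)
```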

Splitting a line into a list (with some difficulty)

I've been working on an assignment and, after hours of searching, asking classmates and looking on here, I can't work this out.
I have a text file containing 100000+ lines of data in the format:
"Words" \t "float" \t "float"
and I'm trying to write a function that lets me look up a line and pull out one piece of information. It works fine when I write it outside a function, but I cannot seem to put it into a function:
FileList = Mammal.readlines()
Name,Latitude,Longitude = FileList[int].split("\t")
print (Name)
def LineToList(int):
    FileList = Text.readlines()
    A,B,C = FileList[int].split("\t")
LineToList(0)
print (A)
I receive this error.
IndexError: list index out of range
I've tried swapping out int for a letter, a ratio and adding lines to return values for A, B and C then printing them out but each time it fails.
Your code reads the entire contents of an open file with readlines(), then processes only one line. If your function looks exactly as you show, it will succeed the first time you use it (as long as it's used on a freshly opened file); but on the second call there will be nothing more for readlines() to read, and you'll get back an empty list.
Here's a simpler way to convert an entire file:
lines = mammal.readlines()
values = [row.rstrip("\n").split("\t") for row in lines]
You then have all your values in one list of triples.
Also, note the capitalization. Python style uses names that start with a capital for user-defined types, not for ordinary variables.
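One way to put this into functions is sketched below, assuming Python 3. The names parse_lines and line_to_list and the sample data are hypothetical. The file is read once, and the lookup function returns its values, because names assigned inside a function (such as A, B and C in the question) are local and not visible to the caller:

```python
import io


def parse_lines(text_file):
    """Read the file once and split each line into (name, lat, lon)."""
    records = []
    for line in text_file:
        line = line.rstrip("\n")
        if not line:
            continue
        name, latitude, longitude = line.split("\t")
        records.append((name, float(latitude), float(longitude)))
    return records


def line_to_list(records, index):
    """Return the record at index so the caller can use the values."""
    return records[index]


# In-memory stand-in for the opened data file (made-up contents).
mammal = io.StringIO("Red fox\t51.5\t-0.12\nGrey wolf\t45.0\t7.65\n")

records = parse_lines(mammal)
name, latitude, longitude = line_to_list(records, 0)
print(name)

# A second lookup works too, because the file is not read again.
again = line_to_list(records, 1)
```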

When I write in csv how do I separate columns in Python

My code is
import pymysql
conn=pymysql.connect(host=.................)
curs=conn.cursor()
import csv
f=open('./kospilist.csv','r')
data=f.readlines()
data_kp=[]
for i in data:
    data_kp.append(i[:-1])
c = csv.writer(open("./test_b.csv","wb"))
def exportFunc():
    result=[]
    for i in range(0,len(data_kp)):
        xp="select date from " + data_kp[i] + " where price is null"
        curs.execute(xp)
        result= curs.fetchall()
        for row in result:
            c.writerow(data_kp[i])
            c.writerow(row)
        c.writerow('\n')
exportFunc()
data_kp holds the table names.
The table names are strings like this (ex: a000010).
I collect the table names from there.
Then I execute the query and get the result.
The actual output of my code is ..
My expectation is
(not 3 columns.. there are 2000 tables)
I thought my code was close to the answer, but it's not working.
My work is almost done, but I couldn't finish this part.
I have googled for almost 10 hours.
I don't know how to fix it. Please help.
I think something is wrong with this part:
for row in result:
    c.writerow(data_kp[i])
    c.writerow(row)
The csvwriter.writerow method writes one row to your output CSV file. This means that once you have called writerow, the line is written and you can't come back to it. When you write the code:
for row in result:
    c.writerow(data_kp[i])
    c.writerow(row)
You are saying:
"For each result, write a line containing data_kp[i], then write a
line containing row."
This way, everything will be written vertically, alternating between data_kp[i] and row.
What is surprising is that this is not what we see in your actual output. I think that you've changed something, to something like this:
c.writerow(data_kp[i])
for row in result:
    c.writerow(row)
But this has not entirely solved your issue, obviously: the names of the tables are not correctly displayed (one character in each column) and they are not side by side. So you have 2 problems here:
1. Get the table name in one cell instead of split across cells
First, let's take a look at the documentation about the csvwriter:
A row must be an iterable of strings or numbers for Writer objects
But your data_kp[i] is a string, not an iterable of strings, so this can't work as intended. Why don't you get an error, then? Because in Python a string is itself an iterable of one-character strings. Try it yourself:
for char in "abcde":
    print(char)
And now you have probably understood what to do in order to make things work:
# Give an iterable containing only data_kp[i]
c.writerow([data_kp[i]])
Your table name is now displayed in a single cell! But we still have another problem...
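The difference is easy to check without touching a real file. This sketch assumes Python 3 and writes to an in-memory buffer; the table name a000010 is the example from the question:

```python
import csv
import io

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")

table_name = "a000010"
writer.writerow(table_name)    # a string: written one character per cell
writer.writerow([table_name])  # a one-element list: written as a single cell

output = buffer.getvalue()
print(output)
```

The first output line has seven cells (one per character) and the second has one.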
2. Get the table names displayed side by side
Here, the problem is in the logic of your code. You are looping over your table names, writing lines containing them, and expecting them to end up side by side on top of columns of dates!
Your code needs a little rethinking, because csvwriter is made for writing lines, not columns. We'll use the zip_longest function from the itertools module. You may ask why I don't use Python's built-in zip function: the columns are not guaranteed to be of equal length, and zip stops as soon as it reaches the end of the shortest list!
import itertools
# Python 3 (where zip_longest lives): open in text mode with newline=""
c = csv.writer(open("./test_b.csv", "w", newline=""))
# each entry of this list will contain a column for your csv file
data_columns = []
def exportFunc():
    for i in range(0, len(data_kp)):
        xp = "select date from " + data_kp[i] + " where price is null"
        curs.execute(xp)
        result = curs.fetchall()
        # each column starts with the name of the table; fetchall() returns
        # one-element tuples, so keep only the date itself
        data_columns.append([data_kp[i]] + [row[0] for row in result])
    # the * operator unpacks the list into arguments for zip_longest
    zipped_columns = itertools.zip_longest(*data_columns, fillvalue=" ")
    c.writerows(zipped_columns)
Note:
The code provided here has not been tested and may contain bugs. Nevertheless, you should be able to fix it (using the documentation I pointed to) and make it work! Good luck :)
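The transposition step can be tried without a database. In this sketch (Python 3), the results dictionary is made-up data standing in for the query results, and the output goes to an in-memory buffer rather than test_b.csv:

```python
import csv
import io
from itertools import zip_longest

# Hypothetical query results: table name -> dates where price is null.
results = {
    "a000010": ["2015-01-02", "2015-01-05", "2015-01-06"],
    "a000020": ["2015-02-10"],
}

# Each column starts with its table name, followed by its dates.
data_columns = [[name] + dates for name, dates in results.items()]

# Transpose columns into rows, padding the shorter columns with empty cells.
rows = list(zip_longest(*data_columns, fillvalue=""))

buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
output = buffer.getvalue()
print(output)
```

With the built-in zip, only the first two rows would be produced, because the second column has a single date.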

python: merge two csv files

I have a problem while I'm doing my assignment with python.
I'm new to python so I am a complete beginner.
Question: How can I merge two files below?
s555555,7
s333333,10
s666666,9
s111111,10
s999999,9
and
s111111,,,,,
s222222,,,,,
s333333,,,,,
s444444,,,,,
s555555,,,,,
s666666,,,,,
s777777,,,,,
After merging, it should look something like:
s111111,10,,,,
s222222,,,,,
s333333,10,,,,
s444444,,,,,
s555555,7,,,,
s666666,9,,,,
s777777,,,,,
s999999,9,,,,
Thanks for reading and any help would be appreciated!!!
Here are the steps you can follow for one approach to the problem. I'll be using FileA, FileB and Result as the filenames.
One way to approach the problem is to give each position in a line (each field between the , separators) a number to reference it by. You read the lines from FileA, and you know that after the first , you need to put the matching value from FileB to build the line that you will write out to Result.
1. Open FileA. Ideally you should use the with statement, because it will automatically close the file when it's done. Or you can use the normal open() call, but make sure you close the file after you are done.
2. Loop through each line of FileA and add it to a list. (Hint: you should use split().) Why a list? It makes it easier to refer to items by index, as that's our plan.
3. Repeat steps 1 and 2 for FileB, but store it in a different list variable.
4. Now the next part is to loop through the list of lines from FileA and match them with the list from FileB, to create each new line that you will write to the Result file. You can do this many ways, but a simple way is:
   a. First create an empty list that will store your results (final_lines = []).
   b. Loop through the list that has the lines for FileA in a for loop.
   c. Keep in mind that not every line from FileA will have a corresponding line in FileB. For every first "bit" in FileA's list, find the corresponding line in FileB's list, and then get the next item by using index(). If you are keen, you will have realized that the first item is always at 0 and the next one is always at 1, so why not simply hard-code the values? If you look at the assignment, there are multiple ,s, so it could be that at some point you have a fourth or fifth "column" that needs to be added. Teachers love to check for this stuff.
   d. Use append() to add the items in the right order to final_lines.
5. Now that you have the list of lines ready, the last part is simple:
   a. Open a new file (use with or open).
   b. Loop through final_lines.
   c. Write each line out to the file (make sure you don't forget the end-of-line character).
   d. Close the file.
If you have any specific questions - please ask.
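The steps above can be sketched roughly as follows, assuming Python 3. This is only one possible reading: it keys the rows on the id in a dict rather than searching two lists, the function name merge_rows and the roster/scores naming are hypothetical, and the two files from the question are replaced by in-memory stand-ins:

```python
import csv
import io


def merge_rows(roster_file, scores_file, column=1):
    """Copy each score into `column` of the roster row with the same id."""
    merged = {}
    width = column + 1
    for row in csv.reader(roster_file):
        if row:
            # Pad short rows so that `column` always exists.
            row = row + [""] * (column + 1 - len(row))
            merged[row[0]] = row
            width = max(width, len(row))
    for row in csv.reader(scores_file):
        if len(row) < 2:
            continue
        key, score = row[0], row[1]
        # Ids missing from the roster get a fresh, padded row.
        target = merged.setdefault(key, [key] + [""] * (width - 1))
        target[column] = score
    return [merged[key] for key in sorted(merged)]


# The two files from the question, as in-memory stand-ins.
scores = io.StringIO("s555555,7\ns333333,10\ns666666,9\ns111111,10\ns999999,9\n")
roster = io.StringIO(
    "s111111,,,,,\ns222222,,,,,\ns333333,,,,,\ns444444,,,,,\n"
    "s555555,,,,,\ns666666,,,,,\ns777777,,,,,\n"
)

result = io.StringIO()
csv.writer(result, lineterminator="\n").writerows(merge_rows(roster, scores))
merged_text = result.getvalue()
print(merged_text)
```

The column parameter is there for the case raised in the steps, where a later assignment fills a different column.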
Not related to Python, but on Linux:
sort -k1 c1.csv > sorted1
sort -k1 c2.csv > sorted2
join -t , -1 1 -2 1 -a 1 -a 2 sorted1 sorted2
Result:
s111111,10,,,,,
s222222,,,,,
s333333,10,,,,,
s444444,,,,,
s555555,7,,,,,
s666666,9,,,,,
s777777,,,,,
s999999,9
Make a dict using the first element as a primary key, and then merge the rows?
Something like this:
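import csv

mydict = {}
with open('file1.csv', newline='') as f1:
    for row in csv.reader(f1):
        mydict[row[0]] = row[1:]
with open('file2.csv', newline='') as f2:
    for row in csv.reader(f2):
        # extend() modifies the list in place and returns None, so don't
        # assign its result; setdefault covers keys found only in file2
        mydict.setdefault(row[0], []).extend(row[1:])
with open('out.txt', 'w', newline='') as out:
    fout = csv.writer(out)
    # sorted() keeps the output ordered by key
    for k, v in sorted(mydict.items()):
        fout.writerow([k] + v)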
