This question already has answers here:
Integers from excel files become floats?
(6 answers)
Closed 4 years ago.
I wrote some code and am trying to refactor it to cut out a few steps, but I can't seem to find an answer for this. I'm reading an Excel file, renaming a bunch of columns, and dropping the columns I don't need. My end goal is to write the Excel file out as a tab-delimited text file. I have accomplished all this, but in a very hacky way. I have a function convertToText() that reads the Excel file and turns it into a txt file. However, every single integer in the file gets a .0 appended to the end.
Ex.
Excel value 1234321
Txt File = 1234321.0
I'm just doing a simple read and write using pandas, openpyxl and xlrd.
def convertToText():
    with open(os.path.join(outFile, 'target2.txt'), 'wb') as myTxtfile:
        wr = csv.writer(myTxtfile, delimiter="\t")
        myfile = xlrd.open_workbook(outFile + fileName)
        mysheet = myfile.sheet_by_index(0)
        for rownum in xrange(mysheet.nrows):
            wr.writerow(mysheet.row_values(rownum))
I had to write a second function just to do a find and replace on the .0, and I'm trying to cut that step out of the process. If anyone has any ideas on how to do this in the above function, it would be greatly appreciated!
Is this the same thing as what you mean?
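So I think your code should become the following, although I have no test data so I cannot try it. xlrd returns numeric cells as floats, so only whole-number floats are converted; text cells are left alone, because int() would raise a ValueError on them:
def convertToText():
    with open(os.path.join(outFile, 'target2.txt'), 'wb') as myTxtfile:
        wr = csv.writer(myTxtfile, delimiter="\t")
        myfile = xlrd.open_workbook(outFile + fileName)
        mysheet = myfile.sheet_by_index(0)
        for rownum in xrange(mysheet.nrows):
            # Convert only whole-number floats; leave text and other cells as they are
            wr.writerow([int(i) if isinstance(i, float) and i.is_integer() else i
                         for i in mysheet.row_values(rownum)])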
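For Python 3, the same idea can be sketched without the spreadsheet itself. The helper name normalize_cell and the sample rows are hypothetical; the rows stand in for what mysheet.row_values(rownum) would return, on the assumption that numeric cells arrive as floats.

```python
import csv
import io


def normalize_cell(value):
    """Return an int for whole-number floats; leave everything else unchanged."""
    if isinstance(value, float) and value.is_integer():
        return int(value)
    return value


def rows_to_tsv(rows):
    """Write rows as tab-delimited text and return the result as a string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for row in rows:
        writer.writerow([normalize_cell(cell) for cell in row])
    return buffer.getvalue()


# Sample rows standing in for mysheet.row_values(rownum) (made-up data).
sample_rows = [["id", "price"], [1234321.0, 9.5]]
tsv_text = rows_to_tsv(sample_rows)
```

Checking is_integer() first means genuine decimals such as 9.5 keep their fractional part, and strings pass through untouched.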
This question already has answers here:
Iterating on a file doesn't work the second time [duplicate]
(4 answers)
Closed 1 year ago.
I was refactoring some code for my program and I have a mistake somewhere in the process. I am reading and writing .csv files.
In the beginning of my program I iterate through a .csv file in order to find which data from the file I need.
with open(csvPath, mode='r') as inputFile:
    csvReader = csv.reader(inputFile)
    potentialVals = []
    paramVals = {}
    for row in csvReader:
        if row[3] == "Parameter":
            continue
        # Increment values in dict
        if row[3] not in paramVals:
            paramVals[row[3]] = 1
        else:
            paramVals[row[3]] += 1
This iterates and works fine; the for loop gets me every row in the .csv file. I then perform some calculations and later iterate through the same .csv file again to select data to write to a new .csv file. My problem is here: when I iterate through it a second time, it only gives me the first row of the .csv file, and nothing else.
# Write all of the information to our new csv file
with open(outputPath, mode='w') as outputFile:
    csvWriter = csv.writer(outputFile, delimiter=',', quotechar='"', quoting=csv.QUOTE_ALL)
    inputFile.seek(0)
    rowNum = 0
    for row in csvReader:
        print(row)
Where the print statement is, it only prints the first line of the .csv file, and then exits the for loop. I'm not really sure what is causing this. I thought it might have been the
inputFile.seek(0)
But even if I opened a second reader, the problem persisted. This for loop was working before I refactored it. All the other code is the same except the for loop I'm having trouble with; here is what it used to look like:
Edit: So I thought maybe it was a variable instance error, so I tried renaming my variables instead of reusing them, and the issue persisted. Going to try a new file instance now.
Edit 2: Okay so this is interesting, when I look at the line_num value for my reader object (when I open a new one instead of using .seek) it does output 1, so I am at the beginning of my file. And when I look at the len(list(csvReader)) it is 229703, which shows that the .csv is fully there, so still not sure why it won't do anything besides the first row of the .csv
Edit 3: Just as a hail mary attempt, I tried creating a deep copy of the .csv file and iterating through that, but same results. I also tried just doing an entire separate .csv file and I also got the same issue of only getting 1 row. I guess that eliminates that it's a file issue, the information is there but there is something preventing it from reading it.
Edit 4: Here is where I'm currently at with the same issue. I might just have to rewrite this method completely haha but I'm going to lunch so I won't be able to actively respond now. Thank you for the help so far though!
# TODO: BUG HERE
with open(csvPath, mode='r') as inputFile2:
    csvReader2 = csv.reader(inputFile2)
    ...
    for row2 in csvReader2:
        print("CSV Line Num: " + str(csvReader2.line_num))
        print("CSV Index: " + str(rowNum))
        print("CSV Length: " + str(len(list(csvReader2))))
        print("CSV Row: " + str(row2))
Also, in case it helps, here is csvPath:
nameOfInput = input("Please enter the file you'd like to convert: ")
csvPath = os.path.dirname(os.path.realpath(nameOfInput))
csvPath = os.path.join(csvPath, nameOfInput)
If you read the documentation carefully, it says csv reader is just a parser and all the heavy lifting is done by the underlying file object.
In your case, you are trying to read from a closed file in the second iteration and that is why it isn't working.
For the csv reader to work, you'll need an underlying object that supports the iterator protocol and returns a string each time its __next__() method is called; file objects and list objects are both suitable.
Link to the documentation: https://docs.python.org/3/library/csv.html
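Separately from the closed-file explanation, the Edit 4 snippet has a second cause worth ruling out: list(csvReader2) inside the loop reads every remaining row, so the loop has nothing left after its first pass. A minimal sketch with in-memory data (the sample text is made up) shows the effect, along with one way around it:

```python
import csv
import io

sample = "a,1\nb,2\nc,3\n"

# Calling list() on the reader inside the loop consumes the remaining rows.
reader = csv.reader(io.StringIO(sample))
seen_with_list = []
for row in reader:
    seen_with_list.append(row)
    remaining = len(list(reader))  # drains the reader, so the loop ends here

# Reading the rows into a list once allows any number of passes.
rows = list(csv.reader(io.StringIO(sample)))
first_pass = [row[0] for row in rows]
second_pass = [row[0] for row in rows]
```

Holding all rows in a list costs memory on a file with a couple of hundred thousand lines; reopening the file and creating a fresh reader for the second pass is the alternative.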
This question already has answers here:
How to append a new row to an old CSV file in Python?
(8 answers)
Closed 1 year ago.
I'm using selenium and beautifulsoup to iterate through a number of webpages and sort out the results. I have that working, however I want to export the results to a CSV using this block of code:
with open('finallist.csv', mode='w') as final_list:
    stock_writer = csv.writer(final_list, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    stock_writer.writerow([ticker, element.get_text()])
The only issue is, with the result being multiple different things, this code as it stands just replaces the first line of the CSV every time a new result comes in. Is there any way I can have it write to a new line each time?
Per the Python documentation for the open() function, you can pass the 'a' mode to the open() function. Doing so will append any text to the end of the file, if the file already exists.
with open('finallist.csv', mode='a') as final_list:
    ...
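A fuller sketch of the append approach, assuming Python 3. The function name append_result and the tickers are made up for illustration, and the demo writes to a temporary directory rather than the real finallist.csv.

```python
import csv
import os
import tempfile


def append_result(path, ticker, text):
    """Append one row to the CSV at path, creating the file if needed."""
    # newline='' is what the csv docs recommend for files passed to csv.writer.
    with open(path, mode="a", newline="") as final_list:
        stock_writer = csv.writer(final_list, quoting=csv.QUOTE_MINIMAL)
        stock_writer.writerow([ticker, text])


# Demo in a temporary directory with made-up tickers.
demo_path = os.path.join(tempfile.mkdtemp(), "finallist.csv")
append_result(demo_path, "AAA", "first result")
append_result(demo_path, "BBB", "second result")

with open(demo_path, newline="") as handle:
    saved_rows = list(csv.reader(handle))
```

Append mode keeps rows from earlier runs too. If each run should start with a fresh file, opening it once in 'w' mode before the scraping loop and calling writerow() inside the loop is the other option.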
This question already has answers here:
Why does my text file keep overwriting the data on it?
(3 answers)
Closed 2 years ago.
So, I wrote this Python program, and whenever I run it, the file contains "This is an update" and only one of the quotes I entered. Any help? Program below.
file_name = "my_quote.txt"
new_file = open(file_name, "w")
new_file.close()

def update_file(file_name, quote):
    new_file = open(file_name, "w")
    new_file.write("This is an update\n")
    new_file.write(quote)
    new_file.write("\n\n")
    new_file.close()

for index in range(1, 3):
    quote = input("Enter your favorite quote:")
    update_file(file_name, quote)

new_file = open(file_name, "r")
print(new_file.read())
new_file.close()
You're opening your file in "w" mode, which truncates the file every time it is opened.
Use "a" (append mode) instead, which adds new content at the end.
You are writing over the current file because you are not opening it in append mode.
If you change the open call inside update_file to this instead:
new_file = open(file_name, 'a')
you will append the text instead of writing over the file.
Whenever you re-open a file in "w" mode, everything previously in the file is removed before you write. Since every call to update_file re-opens the file and writes to it, only the data written during the last call is kept; all previous data was overwritten.
I think you want to use append mode when writing data to the file. See here for a list of all the modes, and their function.
Hope it helps!
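A minimal sketch of the program with append mode applied. It uses fixed strings in place of input() and a temporary directory so it can run unattended; both are assumptions made for the demo.

```python
import os
import tempfile


def update_file(file_name, quote):
    """Append a header line and the quote, keeping earlier content."""
    with open(file_name, "a") as quote_file:
        quote_file.write("This is an update\n")
        quote_file.write(quote)
        quote_file.write("\n\n")


# Demo with fixed strings in place of input() so it runs unattended.
file_name = os.path.join(tempfile.mkdtemp(), "my_quote.txt")
open(file_name, "w").close()  # start from an empty file, as in the question
for quote in ["first quote", "second quote"]:
    update_file(file_name, quote)

with open(file_name) as quote_file:
    contents = quote_file.read()
```

The initial open in "w" mode is kept on purpose: it empties the file once per run, while the appends inside update_file accumulate.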
How can I tell Python to open a CSV file, and merge all columns per line, into new lines in a new TXT file?
To explain:
I'm trying to download a bunch of member profiles from a website, for a research project. To do this, I want to write a list of all the URLs in a TXT file.
The URLs are akin to this: website.com-name-country-title-id.html
I have written a script that takes all these bits of information for each member and saves them in columns (name/country/title/id), in a CSV file, like this:
mark japan rookie married
john sweden expert single
suzy germany rookie married
etc...
Now I want to open this CSV and write a TXT file with lines like these:
www.website.com/mark-japan-rookie-married.html
www.website.com/john-sweden-expert-single.html
www.website.com/suzy-germany-rookie-married.html
etc...
Here's the code I have so far. As you can probably tell I barely know what I'm doing so help will be greatly appreciated!!!
import csv
x = "http://website.com/"
y = ".html"
csvFile=csv.DictReader(open("NameCountryTitleId.csv")) #This file is stored on my computer
file = open("urls.txt", "wb")
for row in csvFile:
    strArgument=str(row['name'])+"-"+str(row['country'])+"-"+str(row['title'])+"-"+str(row['id'])
    try:
        file.write(x + strArgument + y)
    except:
        print(strArgument)
file.close()
I don't get any error messages after running this, but the TXT file is completely empty.
Rather than using a DictReader, use a regular reader to make it easier to join the row:
import csv

url_format = "http://website.com/{}.html"
csv_file = 'NameCountryTitleId.csv'
urls_file = 'urls.txt'

with open(csv_file, 'rb') as infh, open(urls_file, 'w') as outfh:
    reader = csv.reader(infh)
    for row in reader:
        url = url_format.format('-'.join(row))
        outfh.write(url + '\n')
The with statement ensures the files are closed properly again when the code completes.
Further changes I made:
In Python 2, open CSV files in binary mode. The csv module handles line endings itself, because correctly quoted column data can contain embedded newlines.
Regular text files should be opened in text mode still though.
When writing lines to a file, do remember to add a newline character to delineate lines.
Using a string format (str.format()) is far more flexible than using string concatenations.
str.join() lets you join a sequence of strings together with a separator.
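The code above targets Python 2. A Python 3 sketch of the same approach follows, where the csv docs call for text mode with newline='' rather than binary mode. The function name write_urls is made up, and the demo creates its own small CSV in a temporary directory from two of the sample rows in the question.

```python
import csv
import os
import tempfile

URL_FORMAT = "http://website.com/{}.html"


def write_urls(csv_path, urls_path):
    """Join each CSV row with '-' and write one URL per line."""
    # In Python 3 the csv module expects text mode with newline=''.
    with open(csv_path, newline="") as infh, open(urls_path, "w") as outfh:
        for row in csv.reader(infh):
            outfh.write(URL_FORMAT.format("-".join(row)) + "\n")


# Demo with two of the sample rows from the question.
workdir = tempfile.mkdtemp()
csv_path = os.path.join(workdir, "NameCountryTitleId.csv")
urls_path = os.path.join(workdir, "urls.txt")
with open(csv_path, "w", newline="") as handle:
    csv.writer(handle).writerows([["mark", "japan", "rookie", "married"],
                                  ["john", "sweden", "expert", "single"]])
write_urls(csv_path, urls_path)
with open(urls_path) as handle:
    urls = handle.read().splitlines()
```

If the CSV starts with a header row, as the DictReader in the question implies, that row would be turned into a URL as well; calling next() on the reader once before the loop skips it.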
It's actually quite simple: you are working with strings, yet the file you are writing to is opened in bytes mode, so every single write fails and the string is printed to the screen instead. Try changing this line:
file = open("urls.txt", "wb")
to this:
file = open("urls.txt", "w")
EDIT:
I stand corrected. However, I would like to point out that without newlines or some other separator between them, the URLs will be hard to use later on. If you put a newline after each URL, they would be easy to recover.
I'm trying to adjust a script that previously took in a CSV file where the columns were at the start of a file, however now the CSV it reads has changed so that there is a load of spiel before the column headers are given.
Is there a way using DictReader (or even any other method) to skip down to where the columns are (line 15) and use these?
Currently I'm using the below code, but it will always take the first line in the file.
f = open(fileName)
reader = csv.DictReader(f)
lineU = 0
for underlyer in reader:
    lineU = lineU + 1
    if(lineU == 6):
        #start the code
Appreciate any help given.
Try reading the lines that come before the header from f first, before passing it to the DictReader. With the column names on line 15, that means the first 14 lines.
The csv reader simply iterates over the file object, so you can consume those lines with f.readline() before you start using the reader, and the reader will never see them.
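A minimal sketch of that approach, using itertools.islice to discard the preamble instead of repeated readline() calls. The function name dict_rows_after_preamble and the sample text are made up; for the file in the question, header_line would be 15.

```python
import csv
import io
from itertools import islice


def dict_rows_after_preamble(handle, header_line):
    """Skip lines before header_line (1-based), then parse with DictReader."""
    # Consume the preamble so DictReader sees the header as its first line.
    for _ in islice(handle, header_line - 1):
        pass
    return list(csv.DictReader(handle))


# Made-up sample: two lines of preamble, header on line 3.
sample = "report generated today\nsome notes\nname,qty\nabc,1\nxyz,2\n"
records = dict_rows_after_preamble(io.StringIO(sample), header_line=3)
```

The same function works with a real file object from open(fileName, newline=''), since DictReader only needs an iterable of lines.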