Update: I'm now seeing the CSV writer correctly handling the port numbers, even with dialect='excel'. Not sure what I was seeing earlier, but it clearly was NOT doing so before, so I have to consider it suspect until proven otherwise. At any rate, I'm open to ideas to try...
I'm using CiscoConfParse to parse multiple files and writing a CSV file with the information split into separate cells. My problem is that port numbers look fine if they start with '0', i.e. '0/1', but otherwise the port number does not come through correctly. For instance, '1/1' comes out as 'Jan-00' when using dialect='excel' as below.
Here is my code at this point:
import os
import re
from ciscoconfparse import CiscoConfParse
import csv

def main():
    path = "K:\\Temp work\\120\\New\\Configs\\Working\\"  # insert (\\) the path to the directory of interest
    for path, dirs, files in os.walk(os.path.abspath(path)):
        for f in os.listdir(path):
            file_path = os.path.join(path, f)
            out_file = 'PC.csv'  # change to the name of the file to be created
            fo = open(out_file, "ab")
            fWriter = csv.writer(fo, dialect='excel')
            fWriter.writerow([f])
            with open(file_path, "r"):
                parse = CiscoConfParse(file_path)
                vlanList = parse.find_blocks("(?i)^interface [Pp]")
                for line in vlanList:
                    lod = re.split(r'\s*', line)
                    writer = csv.writer(fo, delimiter=",", dialect='excel')
                    writer.writerow(lod)
            fo.close()

if __name__ == '__main__':
    main()
When I change the code to include the following...
csv.register_dialect('singlequote', quotechar="'", quoting=csv.QUOTE_ALL)
...and change to dialect='singlequote', the CSV comes out correctly when viewed in a text editor. I then import it into Excel, and while everything is there correctly, it has the single quotes of course.
Unfortunately, I don't want the single quotes at this point. I've selected all, formatted the cells as text, then done a replace on the quote character with 'nothing'; the single quotes leave, but Excel then changes the fields back to dates and I'm back where I started. I could of course remove the quotes while still in the CSV (text) file, but when that's imported into Excel - even if I import manually and specify that the fields are text - Excel will still override my formatting and make them dates. (Note: I have already tried turning off AutoCorrect in Excel.)
Any suggestions?
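One workaround that is sometimes used for this (not from the thread; a Python 3 sketch with hypothetical sample rows standing in for the parsed config lines): emit each cell as an Excel text formula, ="...", so Excel displays the literal value instead of coercing '1/1' into a date. The trade-off is that any non-Excel consumer of the CSV will then see the formula wrapper.

```python
import csv

# Hypothetical sample rows standing in for the parsed interface data.
rows = [["interface", "Port-channel1", "1/1"],
        ["interface", "Port-channel2", "0/1"]]

with open("PC.csv", "w", newline="") as fo:
    writer = csv.writer(fo, dialect="excel")
    for row in rows:
        # Wrap each cell in an ="..." formula; Excel evaluates the
        # formula and shows the literal text rather than guessing a date.
        writer.writerow(['="{0}"'.format(cell) for cell in row])
```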
I generated a CSV via Excel, and when printing the key names I get some weird characters attached to the first key, like so:
dict_keys(['ï»¿row1', 'row2'])
import csv

path = 'C:\\Users\\asdf\\Desktop\\file.csv'

with open(path, 'r') as file:
    reader = csv.DictReader(file)
    for row in reader:
        print(row.keys())
However, if I just create the CSV in the IDE, everything works fine and no strange characters are printed. How can I read the Excel CSV in and chop off the strange characters?
with open(path, 'r', encoding='utf-8-sig') as file:
This worked - the utf-8-sig encoding strips the byte order mark (BOM) that Excel prepends to UTF-8 files, which is what those strange characters were.
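For completeness, here is a self-contained sketch of that fix (the file name is an assumption): write a CSV with a BOM the way Excel does, then read it back with utf-8-sig.

```python
import csv

path = "file.csv"

# Simulate Excel's output: UTF-8 with a byte order mark (BOM) up front.
with open(path, "w", encoding="utf-8-sig", newline="") as f:
    csv.writer(f).writerows([["row1", "row2"], ["a", "b"]])

# Reading with utf-8-sig strips the BOM, so the first key comes back clean.
with open(path, "r", encoding="utf-8-sig", newline="") as f:
    reader = csv.DictReader(f)
    fieldnames = reader.fieldnames
```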
I have a folder with CSV-formatted documents with a .arw extension. The files are named 1.arw, 2.arw, 3.arw ... etc.
I would like to write code that reads all the files, checks for and replaces the forward slash / with a dash -, and finally creates new files with the replaced character.
The code I wrote is as follows:
for i in range(1,6):
    my_file = open("/path/"+str(i)+".arw", "r+")
    str = my_file.read()
    if "/" not in str:
        print("There is no forwardslash")
    else:
        str_new = str.replace("/","-")
        print(str_new)
        f = open("/path/new"+str(i)+".arw", "w")
        f.write(str_new)
    my_file.close()
But I get an error saying:
'str' object is not callable.
How can I make it work for all the files in a folder? Apparently my for loop does not work.
The actual error is that you are replacing the built-in str with your own variable of the same name, then trying to use the built-in str() after that.
Simply renaming the variable fixes the immediate problem, but you really want to refactor the code to avoid reading the entire file into memory.
import logging
import os

for i in range(1, 6):
    seen_slash = False
    input_filename = "/path/" + str(i) + ".arw"
    output_filename = "/path/new" + str(i) + ".arw"
    with open(input_filename, "r") as infile, open(output_filename, "w") as outfile:
        for line in infile:
            if not seen_slash and "/" in line:
                seen_slash = True
            line_new = line.replace("/", "-")
            print(line_new.rstrip('\n'))  # don't duplicate the newline
            outfile.write(line_new)
    if not seen_slash:
        logging.warning("{0}: No slash found".format(input_filename))
        os.unlink(output_filename)
Using logging instead of print for error messages helps because you keep standard output (the print output) separate from the diagnostics (the logging output). Notice also how the diagnostic message includes the name of the file we found the problem in.
Going back and deleting the output filename when you have examined the entire input file and not found any slashes is a mild wart, but should typically be more efficient.
This is how I would do it:
for i in range(1, 6):
    with open(str(i) + '.arw', 'r') as f:
        data = f.readlines()
    # str.replace returns a new string, so collect the results
    data = [element.replace('/', '-') for element in data]
    with open(str(i) + '.arw', 'w') as f:
        for element in data:
            f.write(element)
This is assuming from your post that you know how many files you have (range(1, 6) covers 1.arw through 5.arw).
If you don't know how many files you have, you can use the os module to find the files in the directory.
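A minimal sketch of that directory-scanning variant, using the stdlib glob module (the directory path and the new-file naming are assumptions):

```python
import glob
import os

def dashify_arw_files(directory):
    """Replace '/' with '-' in every .arw file, writing a new<name>.arw copy."""
    for input_filename in glob.glob(os.path.join(directory, "*.arw")):
        base = os.path.basename(input_filename)
        output_filename = os.path.join(directory, "new" + base)
        with open(input_filename) as infile, open(output_filename, "w") as outfile:
            outfile.write(infile.read().replace("/", "-"))
```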
I have a CSV sheet that I read like this:
with open(csvFilePath, 'rU') as csvFile:
    reader = csv.reader(csvFile, delimiter='|')
    numberOfMovies = 0
    for row in reader:
        title = row[1:2][0]
As you see, I am taking the value of the title.
Then I surf the internet for some info about that value and write it to a file; the writing is like this:
def writeRDFToFile(rdf, fileName):
    f = open("movies/" + fileName + '.ttl', 'a')
    try:
        #rdf = rdf.encode('UTF-8')
        f.write(rdf)  # python will convert \n to os.linesep
    except:
        print "exception happened for movie " + movieTitle
    f.close()
In that function, I am writing the rdf variable to a file.
As you can see, there is a commented-out line.
If the value of rdf contains Unicode characters and that line is not commented out, the code doesn't write anything to the file.
However, if that line stays commented out, the code does write to the file.
You could say: leave that line commented out and everything will be fine, but that is not correct, because I have another Java process (a Fuseki server) that reads the file, and if the file contains Unicode characters it throws an error.
So I need to fix the file myself; I need to encode that data to UTF-8.
Help please.
The normal csv library can have difficulty writing unicode to files. I suggest you use the unicodecsv library instead of the csv library. It supports writing unicode to CSVs.
Practically speaking, just write:
import unicodecsv as csv
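As a side note for anyone on Python 3 (not what this question uses): the stdlib csv module handles Unicode natively there, so the equivalent fix is just an explicit encoding on open(); the file name below is an assumption.

```python
import csv

# In Python 3 the csv module accepts str rows directly; only the
# file's encoding needs to be chosen at open() time.
with open("movies.csv", "w", encoding="utf-8", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Amélie", "2001"])  # non-ASCII text is fine
```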
I have a python list as such:
[['a','b','c'],['d','e','f'],['g','h','i']]
I am trying to get it into a csv format so I can load it into excel:
a,b,c
d,e,f
g,h,i
Using this, I am trying to write the array to a CSV file:
with open('tables.csv', 'w') as f:
    f.write(each_table)
However, it prints out this:
[
[
'
a
'
,
...
...
So then I tried putting it into an array (again) and then printing it.
each_table_array=[each_table]
with open('tables.csv','w') as f:
f.write(each_table_array)
Now when I open up the CSV file, it's a bunch of unknown characters, and when I load it into Excel, I get a character for every cell.
Not too sure if it's me using the csv library wrong, or the array portion.
I just figured out that the table I am pulling data from has another table within one of its cells; this expands out and messes up the whole formatting.
You need to use the csv library for your job:
import csv

each_table = [['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']]

with open('tables.csv', 'w') as csvfile:
    writer = csv.writer(csvfile)
    for row in each_table:
        writer.writerow(row)
As a more flexible and Pythonic way, use the csv module for dealing with CSV files. Note that in Python 3 you need to pass newline='' to your open() call. Then you can use csv.writer to open your CSV file for writing:
import csv

with open('file_name.csv', 'w', newline='') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=',')
    spamwriter.writerows(main_list)
From the csv module documentation: If newline='' is not specified, newlines embedded inside quoted fields will not be interpreted correctly, and on platforms that use \r\n line endings on write, an extra \r will be added. It should always be safe to specify newline='', since the csv module does its own (universal) newline handling.
How can I tell Python to open a CSV file, and merge all columns per line, into new lines in a new TXT file?
To explain:
I'm trying to download a bunch of member profiles from a website, for a research project. To do this, I want to write a list of all the URLs in a TXT file.
The URLs are akin to this: website.com-name-country-title-id.html
I have written a script that takes all these bits of information for each member and saves them in columns (name/country/title/id), in a CSV file, like this:
mark japan rookie married
john sweden expert single
suzy germany rookie married
etc...
Now I want to open this CSV and write a TXT file with lines like these:
www.website.com/mark-japan-rookie-married.html
www.website.com/john-sweden-expert-single.html
www.website.com/suzy-germany-rookie-married.html
etc...
Here's the code I have so far. As you can probably tell I barely know what I'm doing so help will be greatly appreciated!!!
import csv

x = "http://website.com/"
y = ".html"

csvFile = csv.DictReader(open("NameCountryTitleId.csv"))  # This file is stored on my computer
file = open("urls.txt", "wb")
for row in csvFile:
    strArgument = str(row['name']) + "-" + str(row['country']) + "-" + str(row['title']) + "-" + str(row['id'])
    try:
        file.write(x + strArgument + y)
    except:
        print(strArgument)
file.close()
I don't get any error messages after running this, but the TXT file is completely empty.
Rather than using a DictReader, use a regular reader to make it easier to join the row:
import csv

url_format = "http://website.com/{}.html"
csv_file = 'NameCountryTitleId.csv'
urls_file = 'urls.txt'

with open(csv_file, 'rb') as infh, open(urls_file, 'w') as outfh:
    reader = csv.reader(infh)
    for row in reader:
        url = url_format.format('-'.join(row))
        outfh.write(url + '\n')
The with statement ensures the files are closed properly again when the code completes.
Further changes I made:
In Python 2, open CSV files in binary mode; the csv module handles line endings itself, because correctly quoted column data can have embedded newlines in it.
Regular text files should still be opened in text mode, though.
When writing lines to a file, do remember to add a newline character to delineate lines.
Using a string format (str.format()) is far more flexible than using string concatenations.
str.join() lets you join a sequence of strings together with a separator.
It's actually quite simple: you are working with strings, yet the file you are opening for writing is opened in bytes mode, so every single write fails and prints to the screen instead. Try changing this line:
file = open("urls.txt", "wb")
to this:
file = open("urls.txt", "w")
EDIT:
I stand corrected; however, I would like to point out that in the absence of newlines or some other separator, how do you intend to use the URLs later on? If you put newlines between each URL, they would be easy to recover.
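A quick sketch of that recovery step (the URLs and file name are illustrative): write one URL per line, and splitlines() hands the list straight back.

```python
urls = ["http://website.com/mark-japan-rookie-married.html",
        "http://website.com/john-sweden-expert-single.html"]

# Write one URL per line so each newline acts as a separator.
with open("urls.txt", "w") as f:
    for url in urls:
        f.write(url + "\n")

# Later, the newlines make the URLs trivial to recover.
with open("urls.txt") as f:
    recovered = f.read().splitlines()
```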