I stored data in a CSV file with Python. Now I need to read it back with Python, but there are some issues with it. There is a
";;;;;;"
string at the end of every line.
Here is the code that I used for writing the data to the CSV:
file = open("products.csv", "a")
writer = csv.writer(file, delimiter=",", quotechar='"', quoting=csv.QUOTE_ALL)
writer.writerow(data)
And I am trying to read it with this code:
with open("products.csv", "r", newline="") as in_file, open("3.csv", "w", newline='') as to_file:
reader = csv.reader(in_file, delimiter="," ,doublequote=True)
for row in reader:
print(row)
Of course, I am not reading it just to print it; I need to remove duplicated lines and turn it into a readable CSV.
I've tried the following to fetch the strings and edit them, and it works for the other fields, but not for the semicolons. I can't understand why I can't edit those semicolons.
for row in reader:
    try:
        print(row)
        rowList = row[0].split(",")
        for index, field in enumerate(rowList):
            if '"' in field:
                field = field.replace('"', "")
            elif ";;;;;;" in rowList[index]:
                field = field.replace(";;;;;;", "")
            rowList[index] = field
        print(rowList)
Here is the output of the code above:
['Product Name', 'Product Description', 'SKU', 'Regular Price', 'Sale Price', 'Images;;;;;;']
Can anybody help me?
I realized that I used 'elif' there. I changed it to a plain 'if' and that solved it. Thanks for the help. But I still don't know why those semicolons were added there in the first place.
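For anyone hitting the same thing: because every field was written with QUOTE_ALL, each field still contains quote characters after the split, so the elif branch for the semicolons never runs on the last field. A minimal sketch of the corrected inner loop:

for index, field in enumerate(rowList):
    if '"' in field:
        field = field.replace('"', "")
    if ";;;;;;" in field:  # plain 'if', so both replacements can run on the same field
        field = field.replace(";;;;;;", "")
    rowList[index] = field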
I am programming a Discord bot that lets users send an embed to a channel. The embed is split into multiple parts, which I want to save to a CSV file because I want to add features that require the data to be saved.
The problem is that when a user executes the command, the first line in the CSV gets overwritten with the new content of the command/embed.
I tried using a different mode to write to the next line. 'w' for writing is the one I am experiencing the problem with. The mode 'a' almost works, but it also adds the field names every time.
The CSV Code:
with open('homework.csv', 'a', newline='') as file:
    fieldnames = ['user', 'fach', 'aufgabe', 'abgabedatum']
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerow({'user': str(message.author), 'fach': subject, 'aufgabe': task, 'abgabedatum': date})
The CSV output using mode 'a':
user,fach,aufgabe,abgabedatum
user,fach,aufgabe,abgabedatum
Akorian#0187,test,ja mddoiddn ,01.03.2021
user,fach,aufgabe,abgabedatum
Akorian#0187,testddd,ja mddoiddn ,01.03.2021
Try this:
import csv
import os

if os.path.isfile('homework.csv'):
    with open('homework.csv', 'a', newline='') as file:
        fieldnames = ['user', 'fach', 'aufgabe', 'abgabedatum']
        w = csv.DictWriter(file, fieldnames=fieldnames)
        w.writerow({'user': str(message.author), 'fach': subject, 'aufgabe': task, 'abgabedatum': date})
else:
    with open('homework.csv', 'w', newline='') as file:
        fieldnames = ['user', 'fach', 'aufgabe', 'abgabedatum']
        w = csv.DictWriter(file, fieldnames=fieldnames)
        w.writeheader()
        w.writerow({'user': str(message.author), 'fach': subject, 'aufgabe': task, 'abgabedatum': date})
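A slightly more compact variant (just a sketch, reusing the same message/subject/task/date variables from your command handler) opens the file in 'a' mode once and writes the header only when the file is missing or still empty, so the writerow call is not duplicated:

import csv
import os

fieldnames = ['user', 'fach', 'aufgabe', 'abgabedatum']
# write the header only if the file does not exist yet or is still empty
need_header = not os.path.isfile('homework.csv') or os.path.getsize('homework.csv') == 0

with open('homework.csv', 'a', newline='') as file:
    w = csv.DictWriter(file, fieldnames=fieldnames)
    if need_header:
        w.writeheader()
    w.writerow({'user': str(message.author), 'fach': subject, 'aufgabe': task, 'abgabedatum': date})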
I am trying to edit rows in my CSV document and I keep flip-flopping between two errors: "a bytes-like object is required" and "iterator should return strings, not bytes".
Running Python 3.
I have tried changing the mode from "rb" to "r" as well as putting plain string literals in the writer.writerow loop.
The CSV file is definitely comma separated, not tab separated.
I am following this youtube tutorial: https://www.youtube.com/watch?v=pOJ1KNTlpzE&t=75s (1:40)
temp_file = NamedTemporaryFile(delete=False)

with open('clientlist.csv','rb') as csvfile, temp_file:
    reader = csv.DictReader(csvfile)
    fieldnames = ['Account Name','Account Number','Date Last Checked']
    writer = csv.DictWriter(temp_file, fieldnames=fieldnames)
    writer.writeheader()
    print(temp_file.name)
    for row in reader:
        writer.writerow({
            'Account Name': row['Account Name'],
            'Account Number': row['Account Number'],
            'Date Last Checked': row['Date Last Checked'],
        })
    #shutil.move(temp_file.name, client_list)
The expected result is that when I open the temp_file there is data in it. Then, from what I read, shutil should copy it over. Right now the temp_file is blank.
Any ideas? Or would it be easier to start from scratch and use numpy or pandas? I saw this video on that: https://www.youtube.com/watch?v=pbjGo3oj0PM&list=PLulVrUACBIGX8JT7vpoHVQLYqgOKeunb6&index=16&t=0s
According to the NamedTemporaryFile documentation, named temporary files are opened in w+b mode by default - i.e. binary.
Since you are reading and writing csv files, it makes no sense (to me) to operate in binary mode, so rather open the input file in r mode, and ask for a temporary file in w mode:
import csv
import tempfile

temp_file = tempfile.NamedTemporaryFile(mode='w', delete=False)  # note the mode argument

with open('clientlist.csv','r') as csvfile, temp_file:  # note the mode argument
    reader = csv.DictReader(csvfile)
    fieldnames = ['Account Name','Account Number','Date Last Checked']
    writer = csv.DictWriter(temp_file, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        writer.writerow({
            'Account Name': row['Account Name'],
            'Account Number': row['Account Number'],
            'Date Last Checked': row['Date Last Checked'],
        })
That seems to behave for me.
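To finish the replace-in-place step your commented-out shutil.move hints at, something like this should work once the with block has closed both files (a sketch; I am assuming client_list is the path to 'clientlist.csv'):

import shutil

# overwrite the original file with the freshly written temporary file
shutil.move(temp_file.name, 'clientlist.csv')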
Here they recommend defining the encoding:
Python 3.1.3 Win 7: csv writerow Error "must be bytes or buffer, not str"
Nevertheless, why don't you open the temporary file with open?
Like,
temp_file = open("new_file.csv", 'wb');
I have created a function that fetches price, rating, etc after it hits an API:
def is_priced(business_id):
    try:
        priced_ind = get_business(API_KEY, business_id)
        priced_ind1 = priced_ind['price']
    except:
        priced_ind1 = 'None'
    return priced_ind1

priced_ind = is_priced(b_id)
print(priced_ind)
Similarly for the rating:
def is_rated(business_id):
    try:
        rated_ind = get_business(API_KEY, business_id)
        rated_ind1 = rated_ind['rating']
    except:
        rated_ind1 = 'None'
    return rated_ind1
However, I want my function to loop through the business names I have in my CSV file, collect all this data, and export it to a new CSV file with these two values next to each business name.
The CSV file has info on the name of the business along with its address, city, state, zip and country.
Eg:
Name address city state zip country
XYZ(The) 5* WE 223899th St. New York NY 19921 US
My output:
Querying https://api.xyz.com/v3/businesses/matches ...
True
Querying https://api.xyz.com/v3/businesses/matches ...
4.0
Querying https://api.xyz.com/v3/businesses/matches ...
$$
Querying https://api.xyz.com/v3/businesses/matches ...
Querying https://api.xyz.com/v3/businesses/matches ...
The real issue is that my output only returns the business id in the CSV, and the rating etc., as you can see, is just printed to the console. How do I set up a loop so that the info I need for every business ends up in a single CSV?
The csv module is useful for this sort of thing e.g.
import csv

with open('f.csv', 'r') as csvfile:
    reader = csv.reader(csvfile, delimiter=',', quotechar='"')
    with open('tmp.csv', 'w') as output:
        writer = csv.writer(output)
        for row in reader:
            business_id = row[0]
            row.append(get_price_index(business_id))
            row.append(get_rate_index(business_id))
            writer.writerow(row)
You can read the business names from the CSV file, iterate over them using a for loop, hit the API and store the results, and write to a new CSV file.
import csv

data = []
with open('businesses.csv') as fp:
    # skip header line
    header = next(fp)
    reader = csv.reader(fp)
    for row in reader:
        b_name = row[0]
        # not sure how you get the business ID:
        b_id = get_business_id(b_name)
        p = is_priced(b_id)
        r = is_rated(b_id)
        data.append((b_name, p, r))

# write out the results
with open('business_data.csv', 'w') as fp:
    writer = csv.writer(fp)
    writer.writerow(['name', 'price', 'rating'])
    for row in data:
        writer.writerow(row)
You can do this easily using pandas:
import pandas as pd

df = pd.read_csv('your_csv.csv', usecols=['business_name'])  # since you only need the name
# each row (with its business_name) is passed to your functions
df['price'] = df.apply(is_priced, axis=1)
df['rating'] = df.apply(is_rated, axis=1)
df.to_csv('result.csv', index=False)
All you have to do in your functions is:
def is_priced(row):
    business_name = row['business_name']
    business_id = ??
    ...
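For completeness, a rough sketch of what the adapted function could look like, reusing the try/except pattern from your original is_priced; get_business_id is a hypothetical helper, since how you map a business name to an ID is not shown in the question:

def is_priced(row):
    business_name = row['business_name']
    business_id = get_business_id(business_name)  # hypothetical helper, not shown in the question
    try:
        return get_business(API_KEY, business_id)['price']
    except Exception:
        return 'None'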
I'm trying to read a CSV file and create a new CSV file with the contents of the old one, using Python. My problem is that all entries end up in the first column, and I can't find a way to write the information into separate columns. Here is my code:
import csv
from itertools import zip_longest

fieldnamesOrdered = ['First Name', 'Last Name', 'Email', 'Phone Number',
                     'Street Address', 'City', 'State', 'HubSpot Owner',
                     'Lifecyle Stage', 'Lead Status', 'Favorite Color']
listOne = []
listTwo = []

with open('Contac.csv', 'r', encoding = 'utf-8') as inputFile, open('result.csv', 'w', encoding = 'utf-8') as outputFile:
    reader = csv.DictReader(inputFile)
    writer = csv.writer(outputFile, delimiter = 't')
    for row in reader:
        listOne.append(row['First Name'])
        listTwo.append(row['Last Name'])
    dataLists = [listOne, listTwo]
    export_data = zip_longest(*dataLists, fillvalue='')
    writer.writerow(fieldnamesOrdered)
    writer.writerows(export_data)

inputFile.close()
outputFile.close()
Thank you very much for your answers
writer = csv.writer(outputFile, delimiter = 't')
Aren't those entries in the first column also interspersed with strange, unsolicited 't' characters? That delimiter is the literal letter 't', not a tab, so the writer uses 't' as the field separator.
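If tab-separated output was the intent, the fix is to pass an actual tab character; otherwise just drop the delimiter argument to get a normal comma-separated file. A sketch:

writer = csv.writer(outputFile, delimiter='\t')  # an actual tab character, not the letter 't'
# or, for regular comma-separated output:
writer = csv.writer(outputFile)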
So I am collecting data and this data is saved into CSV files; however, for presentation purposes I want to reorder the columns in each respective CSV file based on its related "order".
I was using this question (write CSV columns out in a different order in Python) as a guide but I'm not sure why I'm getting the error
writeindices = [name2index[name] for name in writenames]
KeyError: % Processor Time
when I run it. Note this error doesn't seem to be limited to just the string '% Processor Time'.
Where am I going wrong?
Here is my code:
CPU_order=["%"+" Processor Time", "%"+" User Time", "Other"]
Memory_order=["Available Bytes", "Pages/sec", "Pages Output/sec", "Pages Input/sec", "Page Faults/sec"]

def reorder_csv(path,title,input_file):
    if title == 'CPU':
        order=CPU_order
    elif title == 'Memory':
        order=Memory_order
    output_file=path+'/'+title+'_reorder'+'.csv'
    writenames = order
    reader = csv.reader(input_file)
    writer = csv.writer(open(output_file, 'wb'))
    readnames = reader.next()
    name2index = dict((name, index) for index, name in enumerate(readnames))
    writeindices = [name2index[name] for name in writenames]
    reorderfunc = operator.itemgetter(*writeindices)
    writer.writerow(writenames)
    for row in reader:
        writer.writerow(reorderfunc(row))
Here is a sample of what the input CSV file looks like:
,CPU\% User Time,CPU\% Processor Time,CPU\Other
05/23/2016 06:01:51.552,0,0,0
05/23/2016 06:02:01.567,0.038940741537158409,0.62259056657940626,0.077882481554869071
05/23/2016 06:02:11.566,0.03900149141703179,0.77956981074955856,0
05/23/2016 06:02:21.566,0,0,0
05/23/2016 06:02:31.566,0,1.1695867249963632,0
Your code works. It is your data which does not have a column named "% Processor Time". Here is the sample data I used:
Other,% User Time,% Processor Time
o1,u1,p1
o2,u2,p2
And here is the code which I call:
reorder_csv('.', 'CPU', open('data.csv'))
With these settings, everything works fine. Please check your data.
Update
Now that I see your data, it looks like you have column names such as "CPU\% Processor Time" and want to translate them to "% Processor Time" before writing out. All you need to do is create your name2index this way:
name2index = dict((name.replace('CPU\\', ''), index) for index, name in enumerate(readnames))
The difference here is that instead of name, you have name.replace('CPU\\', ''), which gets rid of the CPU\ part.
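To illustrate with the header row from your sample data, the modified comprehension produces keys without the CPU\ prefix (a quick sketch):

readnames = ['', 'CPU\\% User Time', 'CPU\\% Processor Time', 'CPU\\Other']
name2index = dict((name.replace('CPU\\', ''), index) for index, name in enumerate(readnames))
# name2index is now {'': 0, '% User Time': 1, '% Processor Time': 2, 'Other': 3}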
Update 2
I reworked your code to use csv.DictReader and csv.DictWriter. I also assume that "CPU\% Privileged Time" will be transformed into "Other". If that is not the case, you can fix it in the transformer dictionary.
import csv
import os

def rename_columns(row):
    """ Take a row (dictionary) of data and return a new row with columns renamed """
    transformer = {
        'CPU\\% User Time': '% User Time',
        'CPU\\% Processor Time': '% Processor Time',
        'CPU\\% Privileged Time': 'Other',
    }
    new_row = {transformer.get(k, k): v for k, v in row.items()}
    return new_row

def reorder_csv(path, title, input_file):
    header = dict(
        CPU=["% Processor Time", "% User Time", "Other"],
        Memory=["Available Bytes", "Pages/sec", "Pages Output/sec", "Pages Input/sec", "Page Faults/sec"],
    )
    reader = csv.DictReader(input_file)
    output_filename = os.path.join(path, '{}_reorder2.csv'.format(title))
    with open(output_filename, 'wb') as outfile:
        # Create a new writer where each row is a dictionary.
        # If the row contains extra keys, ignore them
        writer = csv.DictWriter(outfile, header[title], extrasaction='ignore')
        writer.writeheader()
        for row in reader:
            # Each row is a dictionary, not list
            print row
            row = rename_columns(row)
            print row
            print
            writer.writerow(row)