So, currently I have a CSV file with 6 rows and 1 column (a header and 5 numbers). I want to be able to do a conversion, say from centimeters to inches, and save it in a new CSV with a new header.
I am new to coding in general, and so far I have only been able to import the CSV, read it, and print it (using print row). I was wondering how I could do the conversion: since the numbers are saved in the CSV, would I have to convert them to float and then somehow write them to a new CSV? I only have 5 numbers while I figure out the correct code, but I want to be able to use it for a lot of numbers, not just 5. Therefore, I could not write print row[1] or something like that, as that would take too long.
I wasn't sure where the computation would be placed either. Help please! Also, this isn't homework or the like; I'm just doing this for fun.
This is the code I currently have:
import csv
with open('test.csv', 'rb') as f:
    reader = csv.reader(f)
    next(reader, None)  # I did this to skip the header I labelled Centimeters
    with open('test1.csv', 'wb') as o:
        writer = csv.writer(o)
        for row in reader:
f.close()
o.close()
I guess I don't know how to convert the numbers in the rows to float and then output the values. I just want to be able to multiply the number in each row by 0.393701 so that in the new CSV the header is labelled Inches with the converted values beneath it in the rows.
If the numbers are one per row, it's not really a CSV file; it's just a text file, in which case you don't need to use the csv reading system from Python's libraries to read it (although that library will read the file just fine as well).
Basically, your program will look like this (this isn't real Python code; it's your job to come up with that!):
with either the CSV module or regular file operations:
    open your file
with either the CSV module or regular file operations:
    open an output file with a different name
for each line in the input file:
    convert the value you read with float()
    transform the value
    write the value to the output file
Is that enough to get you started? (This is actually more lines than the final Python program will need since you can combine some of the lines easily into one).
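A minimal sketch of those steps might look like the following; the file names, the "Inches" header label, and the sample input written at the top are my own choices for demonstration, while the 0.393701 factor comes from the question:

```python
import csv

# Create a small sample input like the one described (header + numbers).
with open('test.csv', 'w', newline='') as f:
    f.write('Centimeters\n10\n25.4\n100\n')

with open('test.csv', newline='') as f, open('test1.csv', 'w', newline='') as o:
    reader = csv.reader(f)
    writer = csv.writer(o)
    next(reader, None)              # skip the "Centimeters" header
    writer.writerow(['Inches'])     # write the new header
    for row in reader:
        # convert each value to float and scale it
        writer.writerow([float(row[0]) * 0.393701])
```

Note that opening both files in one with statement means they are closed automatically, so no explicit close() calls are needed.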
I have a CSV file that doesn't delimit. [Screenshot of the CSV file.]
This means that all the data stays in row[0] and does not divide into 6 columns. Does anybody know how to solve this issue?
import csv

n = 1048576
id = [] * n
a = [] * n
date = [] * n
b = [] * n
c = [] * n

with open('C:\\Users\\andsc\\data_1.csv') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    line_count = 0
    for row in csv_reader:
        id[line_count] = row[0]
        a[line_count] = row[1]
        date[line_count] = row[2]
        b[line_count] = row[3]
        c[line_count] = row[4]
        line_count += 1
You appear to be using a non-US version of Excel. In locales where the comma is used as a decimal separator, Excel expects the semicolon as the column delimiter:
csv_reader = csv.reader(csv_file, delimiter=';')
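As an aside, if you are ever unsure which delimiter a file uses, the standard library's csv.Sniffer can often detect it from a sample; the sample string below is illustrative, not from the question's file:

```python
import csv

# Let Sniffer choose between ',' and ';' from a small sample.
sample = 'id;name;value\n1;foo;2,5\n'
dialect = csv.Sniffer().sniff(sample, delimiters=',;')
rows = list(csv.reader(sample.splitlines(), dialect))
# rows[0] == ['id', 'name', 'value']
```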
Firstly, don't do this:
id=[]*n
a=[]*n
...etc...
What you are trying to do is emulate a fixed-length array. That won't work. As you will see if you do this at the command prompt:
>>> [] * 9
[]
This is because the * really is a multiply, and just as [1] * 3 gives [1, 1, 1] (three repetitions of the list [1]) doing [] * 9 gives 9 repetitions of the empty list, which is just as empty as one repetition.
Instead create empty lists:
id=[]
a=[]
...etc...
Then, in your loop, do not index into these lists, append() new values to them instead:
id.append(row[0])
a.append(row[1])
...etc...
That means you don't need to keep track of line_count, and even if you do need it, use the reader's provided line_num attribute (note that it is an attribute, not a method).
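Putting that advice together, a sketch of the corrected loop might look like this. I renamed id to ids to avoid shadowing the built-in, used the ';' delimiter identified in the other answer, and wrote a two-line stand-in file for data_1.csv so the example is self-contained:

```python
import csv

# A two-row stand-in for data_1.csv, for demonstration only.
with open('data_1.csv', 'w', newline='') as f:
    f.write('1;x;2020-01-01;y;z\n2;p;2020-01-02;q;r\n')

ids, a, dates, b, c = [], [], [], [], []
with open('data_1.csv') as csv_file:
    for row in csv.reader(csv_file, delimiter=';'):
        ids.append(row[0])      # append() grows the list as needed,
        a.append(row[1])        # so no pre-sized arrays and no line_count
        dates.append(row[2])
        b.append(row[3])
        c.append(row[4])
```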
Using Excel screenshots to look at a CSV is often misleading. It is clear that your version of Excel expects the delimiter of the CSV to be a semicolon not a comma, which is why the data is all in one column. To be 100% sure of what is in the file, open it in a text editor like Notepad or Notepad++. That avoids Excel's aggressive type coercion, which changes anything that looks like a date, or a hexadecimal string, into a number. And above all do not save the CSV back from Excel and assume the file still to be as expected.
It is clear that the code you presented will not run. It will get an IndexError the first time through the loop. You have to fix the code before it will run, and when you do that you will see that Python really does respect the comma as delimiter.
But opening the input file in Excel has given you a mistaken idea of where the problem is. You are quite right to say that comma is clearly the intended delimiter in the file. But when you open a CSV in Excel, Excel uses your system decimal and delimiter settings, which for European installations of Windows and MacOS are usually , and ;.
Excel is not bright enough to figure out on its own that those settings are inappropriate for a given file; it needs help from you. You can change Excel's File | Open behaviour by altering your system settings, but if you change the delimiter to , you will have to change the decimal point to . (for every single application, not just Excel) and it is unlikely you would want to do that.
The workaround is to set it manually for a particular file, by importing the CSV instead of simply opening it. On the Data tab select From Text/CSV and Excel will then try to guess the settings from the first 2000 rows. If it guesses wrong you have the opportunity to fix it.
But getting Excel to display the file as you expect has nothing to do with the way Python is reading it.
I'm doing some measurements in the lab and want to transform them into some nice Python plots. The problem is the way the software exports CSV files, as I can't find a way to properly read the numbers. It looks like this:
-10;-0,0000026
-8;-0,00000139
-6;-0,000000546
-4;-0,000000112
-2;-5,11E-09
0,0000048;6,21E-09
2;0,000000318
4;0,00000304
6;0,0000129
8;0,0000724
10;0,000268
Separation by ';' is fine, but I need every ',' to become '.'.
Ideally I would like Python to be able to read numbers such as 6.21E-09 as well, but I should be able to fix that in Excel...
My main issue: change every ',' to '.' so Python can read the values as floats.
The simplest way would be to treat each line as a string and use the .replace() method. For example:
txt = "0,0000048;6,21E-09"
txt = txt.replace(',', '.')
You could also change the delimiter when reading the CSV file (I don't know how you are reading the file); depending on the library, you could pass ';' as the delimiter. CSV stands for comma-separated values and, as the name implies, it normally separates columns with ','.
You can do whatever you want in Python, for example:
import csv

with open('path_to_csv_file', 'r') as csv_file:
    data = list(csv.reader(csv_file, delimiter=';'))

# use float() for both columns, since the first column also contains
# comma-decimal values such as 0,0000048
data = [(float(row[0].replace(',', '.')), float(row[1].replace(',', '.')))
        for row in data]

with open('path_to_csv_file', 'w') as csv_file:
    writer = csv.writer(csv_file, delimiter=';')
    writer.writerows(data)
You could also consider a regex: match every ',' in the text, then replace each match with '.'.
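A minimal sketch of that regex idea, using a sample line from the question. The lookaround pattern replaces a comma only when it sits between two digits, so the ';' column separator is left alone:

```python
import re

line = '-2;-5,11E-09'
# Replace ',' only between digits, leaving the ';' separator intact.
fixed = re.sub(r'(?<=\d),(?=\d)', '.', line)
# fixed == '-2;-5.11E-09'
value = float(fixed.split(';')[1])
```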
I want to generate a log file in which I have to print two lists for about 50 input files. So, there are approximately 100 lists reported in the log file. I tried using pickle.dump, but it adds some strange characters in the beginning of each value. Also, it writes each value in a different line and the enclosing brackets are also not shown.
Here is a sample output from a test code.
import pickle
x=[1,2,3,4]
fp=open('log.csv','w')
pickle.dump(x,fp)
fp.close()
output:
I want my log file to report:
list 1 is: [1,2,3,4]
If you want your log file to be readable, you are approaching it the wrong way by using pickle, which "implements binary protocols", i.e. its output is not human-readable.
To get what you want, replace the line
pickle.dump(x,fp)
with
fp.write('list 1 is: ')
fp.write(str(x))
This requires minimal change in the rest of your code. However, good practice would change your code to a better style.
pickle is for storing objects in a form which you could use to recreate the original object. If all you want to do is create a log message, the builtin __str__ method is sufficient.
x = [1, 2, 3, 4]
with open('log.csv', 'w') as fp:
    print('list 1 is: {}'.format(x), file=fp)
Python's pickle is used to serialize objects, which is basically a way that an object and its hierarchy can be stored on your computer for use later.
If your goal is to write data to a csv, then read the csv file and output what you read inside of it, then read below.
Writing To A CSV File (there are great tutorials available online if you need more info)
import csv

my_list = [1, 2, 3, 4]  # renamed from "list" to avoid shadowing the built-in
myFile = open('yourFile.csv', 'w', newline='')
writer = csv.writer(myFile)
writer.writerow(my_list)
myFile.close()
The function writerow() will write each element of an iterable (each element of the list, in your case) to its own column of a single row. You can run through each one of your lists and write it to its own row in this way. If you want to write multiple rows at once, check out the method writerows().
Your data is flushed to disk once you close the file (or leave a with block).
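For instance, writerows() sketched with a hypothetical list of rows (the file name follows the example above):

```python
import csv

# Each inner list becomes one row in the output file.
rows = [[1, 2, 3, 4], [5, 6, 7, 8]]
with open('yourFile.csv', 'w', newline='') as my_file:
    csv.writer(my_file).writerows(rows)
```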
Reading A CSV File
import csv

with open('example.csv', newline='') as File:
    reader = csv.reader(File)
    for row in reader:
        print(row)
This will run through all the rows in your csv file and will print it to the console.
I'm trying to read in a CSV file with many rows and columns; I would like to print one row, in a particular format, to a text file, and do some hashing on the values. So far, I have been able to read in the file, parse through it using DictReader, find the row I want using an if statement, and then print the keys and values. I cannot figure out how to format it the way I want in the end (Key = Value \n), and I cannot figure out how to write to a file (much less in the format I want) using the value of 'row' obtained below. I've been trying for days and have made a little progress, but I cannot get it to work. Here is what I got to work (with much detail left out of results):
import csv

with open("C:\path_to_script\filename_Brief.csv") as infh:
    reader = csv.DictReader(infh)
    for row in reader:
        if row['ALIAS'] == 'Y4K':
            print(row)
Resulting output:
{'Full_Name': 'Jack Flash', 'PHONE_NO': '555 555-1212', 'ALIAS': 'Y4K'}
I'd like to ask the user to input the alias and then use that to determine the row to print. I've done a ton of research but am new-ish to Python, so I am asking for help! I've used pyexcel and xlrd/xlwt, and even thought I'd try pandas, but that was too much to learn. I also got it to format the way I wanted in one test, but then I could not get the row selection to work; in other words, it prints all the records rather than the row I want. I have 30 Firefox tabs open trying to find an answer! Thanks in advance!
The following may at least be close to what you want (I think):
import csv

with open(r'C:\path_to_script\filename_Brief.csv') as infh, \
        open('new_file.txt', 'wt') as outfh:
    reader = csv.DictReader(infh)
    for row in reader:
        if row['ALIAS'] == 'Y4K':
            outfh.write('Full_Name = {Full_Name}\n'
                        'PHONE_NO = {PHONE_NO}\n'
                        'ALIAS = {ALIAS}\n'.format(**row))
This would write 3 lines formatted like this into the output file for every matching row:
Full_Name = Jack Flash
PHONE_NO = 555 555-1212
ALIAS = Y4K
BTW, the **row notation means basically "take all the entries in the specified dictionary and turn them into keyword arguments for this function call". The {keyword} syntax in the format string refers to any keyword arguments that will be passed to the str.format() method.
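A quick demonstration of that unpacking, using the sample record from the question:

```python
# ** unpacks the dict into keyword arguments, so {ALIAS} in the
# format string picks up row['ALIAS'], and so on.
row = {'Full_Name': 'Jack Flash', 'PHONE_NO': '555 555-1212', 'ALIAS': 'Y4K'}
text = 'Full_Name = {Full_Name}\nALIAS = {ALIAS}'.format(**row)
# text == 'Full_Name = Jack Flash\nALIAS = Y4K'
```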
This is maybe a very basic question, but let's suppose one has a csv file which looks as follows:
a,a,a,a
b,b,b,b
c,c,c,c
d,d,d,d
e,e,e,e
And I am interested in deleting row[1] and row[3] and writing a new file that does not contain those rows. What would be the best way to do this? As the csv module is already loaded in my code, I'd like to know how to do it within that scheme. I'd be glad if somebody could help me with this.
Since each row is on a separate line (assuming there are no newlines within the data items of the rows themselves), you can do this by simply copying the file line-by-line and skipping any you don't want kept. Since I'm unsure whether you number rows starting from zero or one, I've added a symbolic constant at the beginning to control it. You could, of course, hardcode it, as well as ROWS_TO_DELETE, directly into the code.
Regardless, this approach will be faster than using, for example, the csv module, because it avoids all the unnecessary parsing and reformatting of the data that the module would otherwise do.
FIRST_ROW_NUM = 1  # or 0
ROWS_TO_DELETE = {1, 3}

with open('infile.csv', 'rt') as infile, open('outfile.csv', 'wt') as outfile:
    outfile.writelines(row for row_num, row in enumerate(infile, FIRST_ROW_NUM)
                           if row_num not in ROWS_TO_DELETE)
Resulting output file's contents:
b,b,b,b
d,d,d,d
e,e,e,e
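Since the question mentions the csv module is already loaded, here is the same row-skipping sketched with csv.reader and csv.writer. It is slower, as noted above, but it keeps everything inside the csv machinery; the sample input written first just reproduces the data from the question:

```python
import csv

ROWS_TO_DELETE = {1, 3}  # 1-based row numbers, matching the output shown above

# Recreate the sample input from the question, for demonstration.
with open('infile.csv', 'w', newline='') as f:
    f.write('a,a,a,a\nb,b,b,b\nc,c,c,c\nd,d,d,d\ne,e,e,e\n')

with open('infile.csv', newline='') as infile, \
        open('outfile.csv', 'w', newline='') as outfile:
    writer = csv.writer(outfile)
    for row_num, row in enumerate(csv.reader(infile), 1):
        if row_num not in ROWS_TO_DELETE:
            writer.writerow(row)
```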