I wrote a function to convert a CSV file.
The goal is to take a CSV file like this:
,features,corr_dropped,var_dropped,uv_dropped
0,AghEnt,False,False,False
and convert it to another CSV file that looks like this:
   features  corr_dropped  var_dropped  uv_dropped
0  AghEnt    False         False        False
I wrote a function for that, but it does not work: the output is the same as the input file.
Function:
import os
import pandas as pd

def convert_file():
    input_file = "../input.csv"
    output_file = os.path.splitext(input_file)[0] + "_converted.csv"
    df = pd.read_table(input_file, sep=',')
    df.to_csv(output_file, index=False, header=True, sep=',')
You could use
df = pd.read_csv(input_file)
This works with your data, although there is not much difference. The only thing that changes is that the empty header cell before the first delimiter now contains Unnamed: 0.
Is that what you wanted? (I am still not entirely sure what you are trying to achieve, as you are importing a CSV and exporting the same data as a CSV without really doing anything with it. The output example you showed is just a formatted version of your initial data, but formatting is not something CSV can do.)
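If the aim is only to see the data as an aligned table, a minimal sketch along these lines may help. It assumes the unnamed first column is the row index, and it reads the sample from an in-memory string instead of ../input.csv so that it runs on its own.

```python
import io

import pandas as pd

# The sample rows from the question, held in memory instead of a file.
raw = (
    ",features,corr_dropped,var_dropped,uv_dropped\n"
    "0,AghEnt,False,False,False\n"
)

# index_col=0 treats the unnamed first column as the row index,
# so no "Unnamed: 0" column appears.
df = pd.read_csv(io.StringIO(raw), index_col=0)

# to_string() renders an aligned text table; this is display formatting,
# not something that can be stored in the CSV itself.
print(df.to_string())
```

Writing df back out with to_csv would still produce plain comma-separated text, since the alignment exists only in the printed representation.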
I'm reading data from my serial port, and I can store this data in a .csv file. The problem is that I want to write my data to a second or third column.
With this code, all the data is stored in the first column:
import csv

# serialInst is the already-open serial port object
file = open('test.csv', 'w', encoding="utf-8", newline="")
writer = csv.writer(file)
while True:
    if serialInst.in_waiting:
        packet = serialInst.readline()
        packet = [str(packet.decode().rstrip())]  # decode and strip the trailing \r\n
        writer.writerow(packet)
Output of the code (.csv file):
Column A | Column B
Data 1   |
Data 2   |
Data 3   |
Data 4   |
Example of the desired output (.csv file):
Column A | Column B
Data1    | data 2
Data3    | Data 4
I've not used csv.writer before, but a quick read of the docs seems to indicate that it writes one row at a time, whereas you are getting data one cell/value at a time.
In your code example, you already have a file handle. Instead of writing one row at a time, you want to write one cell at a time. You'll need some extra variables to keep track of when to start a new line.
file = open('test.csv', 'w', encoding="utf-8", newline="")
ncols = 2  # 2 columns total in this example, but it's easy to imagine you might want more one day
col = 0  # use the Python convention of zero-based lists/arrays
while True:
    if serialInst.in_waiting:
        packet = serialInst.readline()
        packet = packet.decode().rstrip()  # decode to str and strip the trailing \r\n
        if col == ncols - 1:
            # last column: leave out the comma and add a newline \n
            file.write(packet + '\n')
            col = 0  # reset col to the first position
        else:
            file.write(packet + ',')
            col = col + 1
In this code, we're using the write method of a file object instead of the csv module. See the Python documentation on reading and writing files for how to read from and write to files directly.
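If you would rather keep csv.writer (so that values containing commas or quotes are escaped for you), one option is to collect values into a row and write the row once it is full. The sketch below is only an illustration: write_in_columns is a hypothetical helper name, and a plain list stands in for the values that would come from serialInst.readline().

```python
import csv
import io


def write_in_columns(values, out, ncols=2):
    """Write an iterable of single values to `out` as CSV rows of `ncols` cells."""
    writer = csv.writer(out)
    row = []
    for value in values:
        row.append(value)
        if len(row) == ncols:
            writer.writerow(row)  # the row is full, so flush it
            row = []
    if row:
        writer.writerow(row)  # write a final, partially filled row


# A list stands in for decoded serial packets (an assumption for this example).
buffer = io.StringIO()
write_in_columns(["Data 1", "Data 2", "Data 3", "Data 4"], buffer)
print(buffer.getvalue())
```

With a real serial port, the same idea applies inside the while loop: append each decoded packet to row and call writer.writerow(row) when it reaches ncols entries.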
I have a CSV file and I want to:
1. Import the CSV as a Dataframe
2. Read in a row at a time
3. Copy the VALUES of each cell to a separate string
4. Print the strings
5. Go to the next row and repeat steps 3-4 until done.
My code kind of works: it reads in and prints the first 2 rows, but there are 6 in my CSV file.
I tried adding an index field, but that didn't help much; 3 lines printed instead of 6.
Here is what my CSV file looks like (the extra line breaks are only there for readability and are not in my file):
00C525B70C246049E4.dwg,011021a.dwg
00CD5B2301DF204DCC.dwg,010636e.dwg
00F70B6C0B1EF04B54.dwg,005159v.dwg
0A02B9F7087BF040D5.dwg,003552n.dwg
0A1EE7CC078B404C64.dwg,020526c.dwg
0A1F67D201CCD04F81.doc,X1771-a.doc
import pandas

colnames = ['infocard', 'file_name']
data = pandas.read_csv('E:/test_Files_To_Rename.csv', names=colnames)
for i, elem in enumerate(data, 0):
    sfile = data.loc[i, "infocard"]
    dst = data.loc[i, "file_name"]
    print(sfile + ' to ' + dst)
Once I get the program to print the two different file names, I want to replace the print statement with:
os.rename(sfile, dst)
so I can rename the files. I am testing with 6 files, but my database has 50,000 files, which is why I want to use a script.
This is what is displayed:
00C525B70C246049E4.dwg to 011021a.dwg
00CD5B2301DF204DCC.dwg to 010636e.dwg
Any ideas?
Thanks!
I used the following code to iterate through the .csv spreadsheet:
import pandas as pd

df = pd.read_csv('/home/stephen/Desktop/data.csv')
for i in range(len(df)):
    sfile = df.values[i][0]
    dst = df.values[i][1]
    print(sfile + ' to ' + dst)
I got the following output:
00C525B70C246049E4.dwg to 011021a.dwg
00CD5B2301DF204DCC.dwg to 010636e.dwg
00F70B6C0B1EF04B54.dwg to 005159v.dwg
0A02B9F7087BF040D5.dwg to 003552n.dwg
0A1EE7CC078B404C64.dwg to 020526c.dwg
0A1F67D201CCD04F81.doc to X1771-a.doc
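A likely reason the loop in the question stopped after two rows: iterating over a DataFrame directly (which is what enumerate(data, 0) does) yields its column labels, not its rows, and there are only two columns. The sketch below shows this on three of the sample rows, read from an in-memory string as a stand-in for the real file, and uses itertuples as one way to walk the rows.

```python
import io

import pandas as pd

# Three of the sample rows from the question, held in memory.
raw = (
    "00C525B70C246049E4.dwg,011021a.dwg\n"
    "00CD5B2301DF204DCC.dwg,010636e.dwg\n"
    "00F70B6C0B1EF04B54.dwg,005159v.dwg\n"
)
data = pd.read_csv(io.StringIO(raw), names=["infocard", "file_name"])

# Iterating the DataFrame itself gives the column labels, one per column.
labels = list(data)

# itertuples() yields one named tuple per row instead.
pairs = [(row.infocard, row.file_name) for row in data.itertuples(index=False)]
for sfile, dst in pairs:
    print(sfile + " to " + dst)
```

Once the printed pairs look right, the print call could be swapped for os.rename(sfile, dst) as described in the question.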
This is the spreadsheet that I used:
I am new to both Python and Stack Overflow.
I extract a few columns from a CSV file into an interim CSV file and clean up the data to remove the NaN entries. Once I have extracted them, I end up with the two CSV files below.
Main CSV File:
Sort,Parent 1,Parent 2,Parent 3,Parent 4,Parent 5,Name,Parent 6
1,John,,,Ned,,Dave
2,Sam,Mike,,,,Ken
3,,,Pete,,,Steve
4,,Kerry,,Rachel,,Rog
5,,,Laura,Mitchell,,Kim
Extracted CSV:
Name,ParentNum
Dave,Parent 4
Ken,Parent 2
Steve,Parent 3
Rog,Parent 4
Kim,Parent 4
What I am trying to accomplish is to recurse through the main CSV using the name and parent number. But if I write a for loop, it prints empty rows, because it looks up every row for the first value. What is the best approach instead of a for loop? I tried using a dictionary reader to read the CSV but could not get far. Any help will be appreciated.
CODE:
import xlrd
import csv
import pandas as pd

print('Opening and Reading the msl sheet from the xlsx file')
with xlrd.open_workbook('msl.xlsx') as wb:
    sh = wb.sheet_by_index(2)
    print("The sheet name is :", sh.name)
    with open('msl.csv', 'w', newline="") as f:
        c = csv.writer(f)
        print('Writing to the CSV file')
        for r in range(sh.nrows):
            c.writerow(sh.row_values(r))

df1 = pd.read_csv('msl.csv', index_col='Sort')
# first_row is not defined in this snippet; it presumably holds the column headers
with open('dirty-processing.csv', 'w', newline="") as tbl_writer1:
    c2 = csv.writer(tbl_writer1)
    c2.writerow(['Name', 'Parent'])
    for list_item in first_row:
        for item in df1[list_item].unique():
            row_content = [item, list_item]
            c2.writerow(row_content)
Expected Result:
Input Main CSV:
In the above CSV, I would like to grab the unique values from each column into a separate file or some other data structure, and also capture the header of the column they are taken from.
Example:
Negarnaviricota,Phylum
Haploviricotina,Subphylum
...
and so on.
The next thing I would like to do is get each value's parent, which is where I am stuck. Also, as you can see, not all columns have data, so I want to get the last non-blank column. Everything up to this point is accomplished using the above code. The sample output should look like the example below.
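As an illustration of the "last non-blank column" step, here is a minimal pandas sketch run on the small Main CSV sample shown earlier. It is one possible approach and not a full solution: last_parent is a hypothetical helper name, and the data is read from an in-memory string so the example stands alone.

```python
import io

import pandas as pd

# The Main CSV sample from the question, held in memory.
main_csv = """Sort,Parent 1,Parent 2,Parent 3,Parent 4,Parent 5,Name,Parent 6
1,John,,,Ned,,Dave
2,Sam,Mike,,,,Ken
3,,,Pete,,,Steve
4,,Kerry,,Rachel,,Rog
5,,,Laura,Mitchell,,Kim
"""
df = pd.read_csv(io.StringIO(main_csv), index_col="Sort")
parent_cols = [c for c in df.columns if c.startswith("Parent")]


def last_parent(row):
    """Return the label and value of the last non-blank parent cell in a row."""
    col = row.last_valid_index()  # None if every parent cell is blank
    value = row[col] if col is not None else None
    return pd.Series({"ParentNum": col, "Parent": value})


# Apply the helper row by row and put the result next to the Name column.
result = pd.concat([df["Name"], df[parent_cols].apply(last_parent, axis=1)], axis=1)
print(result.to_string())
```

Because this works on whole rows at once, no explicit lookup loop over the extracted file is needed; the ParentNum column reproduces the Extracted CSV shown in the question.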
I'm new to Python and I'm facing a small challenge that I haven't been able to figure out so far.
I receive a CSV file with around 30-40 columns and 5-50 rows, with various details in each cell. The 1st row of the CSV holds the title of each column, and the item values start from the 2nd row.
What I want is a Python script that reads the CSV file and does the following every time:
Add a row after the first actual item row (that is, after the 2nd row of the file, because the 1st row holds the titles). This new 3rd row should contain the same information as the row above it, with one difference only: in the column "item_subtotal" I want to add the value from the column "discount_total".
All the rows below should remain as they are, and the modified CSV should be saved as a new file with the word "edited" added to the file name.
I could really use some help. So far I have only managed to open the CSV file with the Python script I'm developing; I have not been able to copy the contents of the row above into the newly created row and replace that specific value.
Looking forward to any help.
Thank you.
Here I'm attaching the CSV, with some values changed for privacy reasons.
order_id,order_number,date,status,shipping_total,shipping_tax_total,fee_total,fee_tax_total,tax_total,discount_total,order_total,refunded_total,order_currency,payment_method,shipping_method,customer_id,billing_first_name,billing_last_name,billing_company,billing_email,billing_phone,billing_address_1,billing_address_2,billing_postcode,billing_city,billing_state,billing_country,shipping_first_name,shipping_last_name,shipping_address_1,shipping_address_2,shipping_postcode,shipping_city,shipping_state,shipping_country,shipping_company,customer_note,item_id,item_product_id,item_name,item_sku,item_quantity,item_subtotal,item_subtotal_tax,item_total,item_total_tax,item_refunded,item_refunded_qty,item_meta,shipping_items,fee_items,tax_items,coupon_items,order_notes,download_permissions_granted,admin_custom_order_field:customer_type_5
15001_TEST_2,,"2017-10-09 18:53:12",processing,0,0.00,0.00,0.00,5.36,7.06,33.60,0.00,EUR,PayoneCw_PayPal,"0,00",0,name,surname,,name.surname#gmail.com,0123456789,"address 1",,41541_TEST,location,,DE,name,surname,address,01245212,14521,location,,DE,,,1328,302,"product title",103,1,35.29,6.71,28.24,5.36,0.00,0,,"id:1329|method_id:free_shipping:3|method_title:0,00|total:0.00",,id:1330|rate_id:1|code:DE-MWST-1|title:MwSt|total:5.36|compound:,"id:1331|code:#getgreengent|amount:7.06|description:Launchcoupon for friends","text string",1,
You can also use pandas to manipulate the data from the CSV like this:
import os
import pandas
Read the CSV file into a pandas DataFrame:
df = pandas.read_csv(filename)
Copy the first row of data (index 0, because the header row is not counted) and add the discount total to the item subtotal:
new_row = df.loc[0].copy()
new_row['item_subtotal'] += new_row['discount_total']
Concatenate the first row with the new row and then everything after that:
df = pandas.concat([df.loc[:0], new_row.to_frame().T, df.loc[1:]], ignore_index=True)
Change the filename and write out the new CSV file:
root, ext = os.path.splitext(filename)
filename = root + 'edited' + ext
df.to_csv(filename, index=False)
I hope this helps! Pandas is great for cleanly handling massive amounts of data, but may be overkill for what you are trying to do. Then again, maybe not. It would help to see an example data file.
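For a quick check of the idea without touching real files, a minimal sketch on a cut-down table might look like this. The three columns are taken from the sample header, while the order ids and the second row are made-up values for illustration only. It uses iloc[[0]], which keeps the copied row as a one-row DataFrame so the numeric columns keep their types.

```python
import io

import pandas as pd

# Cut-down stand-in for the real export; the values here are illustrative.
raw = (
    "order_id,discount_total,item_subtotal\n"
    "A,7.06,35.29\n"
    "B,0.00,10.00\n"
)
df = pd.read_csv(io.StringIO(raw))

# Copy the first item row as a one-row DataFrame and add the discount to it.
new_row = df.iloc[[0]].copy()
new_row["item_subtotal"] += new_row["discount_total"]

# First row, then the modified copy, then all remaining rows unchanged.
result = pd.concat([df.iloc[:1], new_row, df.iloc[1:]], ignore_index=True)
print(result.to_string())
```

The same three steps should carry over to the full file, since the other columns are simply copied along with the row.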
The first step is to turn that .csv into something that is a little easier to work with. Fortunately, Python has the csv module, which makes it easy to turn your .csv file into a much nicer list of lists. The code below gives you a way both to turn your .csv into a list of lists and to turn the modified data back into a .csv file.
import csv
import copy

def csv2list(ifile):
    """
    ifile = the path of the csv to be converted into a list of lists
    """
    olist = []
    # newline='' lets the csv module handle line endings itself
    with open(ifile, newline='') as f:
        for line in csv.reader(f, dialect='excel'):
            olist.append(line)  # and update the outer list
    return olist
#------------------------------------------------------------------------------
def list2csv(ilist, ofile):
    """
    ilist = the list of lists to be converted
    ofile = the output path for your csv file
    """
    with open(ofile, 'w', newline='') as csvfile:
        # same dialect as the reader, so quoted fields round-trip unchanged
        csvwriter = csv.writer(csvfile, dialect='excel')
        csvwriter.writerows(ilist)
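Now you can simply copy ilist[1] (the first item row) and change the appropriate element to reflect your summed value. The csv module reads every cell as a string, so convert to float before adding:
n = ilist[0].index('item_subtotal')   # column to change
m = ilist[0].index('discount_total')  # column to add
listTemp = copy.deepcopy(ilist[1])
listTemp[n] = '%.2f' % (float(listTemp[n]) + float(listTemp[m]))  # two decimals, as in the sample data
ilist.insert(2, listTemp)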
As for how to change the file name, just use:
import os
newFileName = os.path.splitext(oldFileName)[0] + "edited" + os.path.splitext(oldFileName)[1]
Hopefully this will help you out!
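Putting the pieces together, a self-contained sketch of the csv-module approach could look like the following. It is an illustration under a few assumptions: duplicate_first_item is a hypothetical helper name, the data is a cut-down in-memory stand-in for the real export, and Decimal is used instead of float so that 35.29 + 7.06 comes out as exactly 42.35.

```python
import csv
import io
from decimal import Decimal


def duplicate_first_item(rows, target="item_subtotal", addend="discount_total"):
    """Return rows with a copy of the first item row inserted after it,
    where the copy's `target` cell has the `addend` cell added to it."""
    header, first = rows[0], rows[1]
    t, a = header.index(target), header.index(addend)
    new_row = list(first)
    new_row[t] = str(Decimal(first[t]) + Decimal(first[a]))
    return rows[:2] + [new_row] + rows[2:]


# Cut-down stand-in for the real export, including one quoted field.
raw = (
    "order_id,discount_total,item_subtotal,item_name\n"
    'A,7.06,35.29,"product, large"\n'
    "B,0.00,10.00,other\n"
)
rows = list(csv.reader(io.StringIO(raw)))
edited = duplicate_first_item(rows)

# Write the result back out as CSV text (to memory here, not to a file).
out = io.StringIO()
csv.writer(out).writerows(edited)
print(out.getvalue())
```

For real files, csv2list and list2csv from above would take the place of the in-memory reader and writer.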