Python Export Dictionary to CSV

I'm new to Python and I've been trying to create a CSV file and save each result in a new row. The results consist of several rows, and each one should be captured in the CSV. However, my CSV file separates each letter into a new row. I also need to add a new key/value for the filename, but I don't know how to get the image filename (the input is images). I searched for similar cases and recommended solutions but I'm still stumped. Thanks in advance.
with open('glaresss_result.csv','wt') as f:
    f.write(",".join(res1.keys()) + "\n")
    for imgpath in glob.glob(os.path.join(TARGET_DIR, "*.png")):
        res1,res = send_request_qcglare(imgpath)
        for row in zip(*res1.values()):
            f.write(",".join(str(n) for n in row) + "\n")
    f.close()
The dictionary res1, printed during the iteration, looks like this:
{'glare': 'Passed', 'brightness': 'Passed'}
The results should be like this (got 3 rows):
glare brightness
Passed Passed
Passed Passed
Passed Passed
But the current output looks like this:

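A few things I changed:
'w' is enough, since 't' (text mode) is the default.
There is no need to close the file when using a context manager.
There is no need for zip and str(n) for n in row. zip(*res1.values()) walks through the strings character by character, which is why each letter ended up in its own row. Just join the 2 values of the dictionary.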
UPDATED
with open('glaresss_result.csv','w') as f:
    header_written = False
    for imgpath in glob.glob(os.path.join(TARGET_DIR, "*.png")):
        res1,res = send_request_qcglare(imgpath)
        if not header_written: # res1 is only available once the first request has returned
            f.write(",".join([*res1] + ['filename']) + "\n") # replace filename with whatever column name you want
            header_written = True
        f.write(",".join([*res1.values()] + [imgpath]) + "\n") # imgpath (which needs to be a string) is the last value of each row; use os.path.basename(imgpath) for the bare file name

If you plan to do more with the data, you might want to check out the pandas library.
For your use case, see for example DataFrame.from_records:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.from_records.html
It provides a lot of out-of-the-box functionality to read, transform and write data.
import pandas as pd
results = []
for imgpath in glob.glob(os.path.join(TARGET_DIR, "*.png")):
    res1,res = send_request_qcglare(imgpath)
    results.append(res1) # collect one dictionary per image
df = pd.DataFrame.from_records(results)
df.to_csv("glaresss_result.csv", index=False)
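If pandas feels like too much for this, the standard library's csv.DictWriter can write dictionaries directly and takes care of quoting. The following is only a sketch under assumptions: fake_quality_check is a hypothetical stand-in for send_request_qcglare that returns just the result dictionary, and export_results is a made-up helper name.

```python
import csv
import glob
import os


def fake_quality_check(imgpath):
    """Hypothetical stand-in for send_request_qcglare; returns a flat dict."""
    return {"glare": "Passed", "brightness": "Passed"}


def export_results(target_dir, out_path, check=fake_quality_check):
    """Write one CSV row per PNG in target_dir, plus a 'filename' column."""
    writer = None
    with open(out_path, "w", newline="") as f:
        for imgpath in sorted(glob.glob(os.path.join(target_dir, "*.png"))):
            row = dict(check(imgpath))
            # basename drops the directory part and keeps only the file name
            row["filename"] = os.path.basename(imgpath)
            if writer is None:
                # Take the column names from the first result.
                writer = csv.DictWriter(f, fieldnames=list(row))
                writer.writeheader()
            writer.writerow(row)
```

With the real function, something like export_results(TARGET_DIR, 'glaresss_result.csv', check=lambda p: send_request_qcglare(p)[0]) should work, assuming the first returned element is the flat dictionary shown in the question.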

Related

Python entire XML file to list and then into dataframe, missing most of the file

My final goal is to take each XML file and load its raw content into Snowflake; this is what I have so far. For some reason, when I convert the list to a DataFrame, the DataFrame only takes a couple of items from the list for each file, not the entire 5000 rows in the XML.
My list data grabs all the contents from multiple files; in the list you can see the following:
Each list item generates a NumPy array, and it looks like it is splitting up the elements.
import glob
from datetime import datetime
import numpy as np
import pandas as pd
dated = datetime.today().strftime('%Y-%m-%d')
source_dir = r'C:\Users\jSmith\.spyder-py3\SampleXML'
table_name = 'LV_XML'
file_list = glob.glob(source_dir + '/*.XML')
data = []
for file_path in file_list:
    data.append(
        np.genfromtxt(file_path,dtype='str',delimiter='|',encoding='utf-8')) # delimiter used to make sure it is not splitting based on spaces, might be the issue?
df = pd.DataFrame(list(zip(data)),
                  columns=['SRC_XML'])
df['SRC_XML']=df['SRC_XML'].astype(str)
df = df.replace(',','', regex=True)
df["TPR_AS_OF_DT"] = dated
The data frame has the following in each column:
Solution via Dave, with a small tweak:
for file_path in file_list:
    with open(file_path,'r') as afile:
        content = ''
        for aline in afile:
            content += aline.replace('\n',' ') # changed to replace for my needs
    data.append(content)
This puts each file's data into a single string, ready to be inserted into the Snowflake table as one value for future queries.
Perhaps replace the file reading with this:
for file_path in file_list:
    with open(file_path,'r') as afile:
        content = ''
        for aline in afile:
            content += aline.strip('\n')
    data.append(content)
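The same idea can be written without the explicit line loop. This is a minimal sketch, not the original poster's code: load_files_as_strings is a made-up helper name, and the SRC_XML key simply mirrors the column name used in the question.

```python
from pathlib import Path


def load_files_as_strings(source_dir, pattern="*.XML"):
    """Return one record per file, holding the whole file as a single string."""
    records = []
    for path in sorted(Path(source_dir).glob(pattern)):
        text = path.read_text(encoding="utf-8")
        # Collapse line breaks so each file becomes one single-line string.
        records.append({"SRC_XML": " ".join(text.splitlines())})
    return records
```

Passing the returned list to pd.DataFrame(records) should then give one row per file, since each file is a single string rather than an array of lines.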

How to read a CSV to pandas and get the value of one cell

I have a CSV file and I want to:
1. Import the CSV as a Dataframe
2. Read in a row at a time
3. Copy the VALUES of each cell to a separate string
4. Print the strings
5. Go to the next row and repeat steps 3-4 until done.
My code kind of works: it reads in and prints the first 2 rows, but there are 6 in my CSV file.
I tried adding an index field but that didn't help much; 3 lines printed instead of 6.
Here is what my CSV file looks like (the extra line breaks are only there for readability; they are not in my file):
00C525B70C246049E4.dwg,011021a.dwg
00CD5B2301DF204DCC.dwg,010636e.dwg
00F70B6C0B1EF04B54.dwg,005159v.dwg
0A02B9F7087BF040D5.dwg,003552n.dwg
0A1EE7CC078B404C64.dwg,020526c.dwg
0A1F67D201CCD04F81.doc,X1771-a.doc
import pandas
colnames = ['infocard','file_name']
data = pandas.read_csv('E:/test_Files_To_Rename.csv', names=colnames)
for i, elem in enumerate(data,0):
    sfile = data.loc[i,"infocard"]
    dst = data.loc[i,"file_name"]
    print( sfile +' to ' + dst )
Once I get the program to print the two different file names I want to replace the print statement with:
os.rename(sfile, dst)
so I can rename the files. I am testing with 6 files, my database has 50,000 files which is why I want to use a script.
This is what is displayed:
00C525B70C246049E4.dwg to 011021a.dwg
00CD5B2301DF204DCC.dwg to 010636e.dwg
Any ideas?
Thanks!
I used the following code to iterate through the .csv spreadsheet:
import pandas as pd
df = pd.read_csv('/home/stephen/Desktop/data.csv') # add header=None if the file has no header row
for i in range(len(df)):
    sfile = df.values[i][0]
    dst = df.values[i][1]
    print(sfile + ' to ' + dst)
I got the following output:
00C525B70C246049E4.dwg to 011021a.dwg
00CD5B2301DF204DCC.dwg to 010636e.dwg
00F70B6C0B1EF04B54.dwg to 005159v.dwg
0A02B9F7087BF040D5.dwg to 003552n.dwg
0A1EE7CC078B404C64.dwg to 020526c.dwg
0A1F67D201CCD04F81.doc to X1771-a.doc
This is the spreadsheet that I used:
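For context on why the original loop stopped early: iterating over a DataFrame (which is what enumerate(data) does) yields the column labels, not the rows, so with two columns the loop body runs only twice. If pandas is not otherwise needed, the pairs can also be read with the standard csv module. The sketch below assumes a header-less two-column file like the one in the question; read_rename_pairs is a hypothetical helper name.

```python
import csv


def read_rename_pairs(csv_path):
    """Return (source, destination) tuples, one per row of a header-less CSV."""
    pairs = []
    with open(csv_path, newline="") as f:
        for row in csv.reader(f):
            if len(row) < 2:
                continue  # skip blank or malformed lines
            pairs.append((row[0].strip(), row[1].strip()))
    return pairs
```

A loop such as for sfile, dst in read_rename_pairs('E:/test_Files_To_Rename.csv'): print(sfile + ' to ' + dst) would then visit every row; the print can later be swapped for os.rename(sfile, dst). Staying with pandas, for row in data.itertuples(index=False) iterates over rows in a similar way.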

Write within-patient data from single observation to multiple text files

I am trying to write individual column data within a dataframe in which each row represents one patient's data. I have a loop function that takes one patient's 'id' to generate 25 'id'.txt files - one for each patient. I now want to loop through the df, pick up individual data points (e.g. the 'fio2' value for patient with id=6) and append it to that patient's .txt file.
Here is the problem I need some guidance with: when I run the for loops (I've tried multiple variations), the values of all 25 patients are appended to every individual patient's text file.
The df/data look like this
My basic code that create/write to the text files is:
for i in data['id']:
    filename = str(i) + '.txt'
    f = open(filename, 'a+')
    f.write('{}\n'.format('-----------------------------------------------'))
    f.write(datetime.datetime.now().strftime("%d.%m.%y"))
    f.write('{}\n'.format(''))
    f.write('{}\n'.format('Updated summary of patient data'))
    f.close()
I believe (probably incorrectly) that I need a nested loop. How would I modify this code to do what I need done?
You could try something like this:
import pandas as pd
d = {
    'id': range(10),
    'name': list('abcdefghij')
}
df = pd.DataFrame(d)
print(df.head(2))
def search_id_and_return_field(id,return_field_name):
    return df.loc[df.id==id][return_field_name].values[0]
required_ids = [1,5]
for id in required_ids:
    print(search_id_and_return_field(id=id,return_field_name='name'))
    break
In your code, it would fit in somewhere like so:
import datetime
for i in required_ids:
    filename = str(i) + '.txt'
    with open(filename, 'a+') as f: # the file is closed automatically
        f.write('{}\n'.format('-----------------------------------------------'))
        f.write('{}\n'.format(datetime.datetime.now().strftime("%d.%m.%y"))) # newline keeps the date on its own line
        f.write('{}\n'.format(search_id_and_return_field(id=i,return_field_name="fio2"))) # Change your fieldname to be returned here
        f.write('{}\n'.format('Updated summary of patient data'))
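An alternative that avoids a lookup per id is to walk the data one row at a time, so each file only ever sees its own row and no nested loop over the whole column is needed. This is a rough sketch under assumptions: rows are plain dictionaries with an 'id' key, and append_patient_summaries, fields and out_dir are made-up names.

```python
import os
from datetime import datetime


def append_patient_summaries(rows, fields, out_dir="."):
    """Append each patient's own values to <id>.txt; rows are dicts with an 'id' key."""
    for row in rows:
        path = os.path.join(out_dir, "{}.txt".format(row["id"]))
        with open(path, "a") as f:
            f.write("-" * 47 + "\n")
            f.write(datetime.now().strftime("%d.%m.%y") + "\n")
            f.write("Updated summary of patient data\n")
            for name in fields:
                # Only this row's value is written, never the whole column.
                f.write("{}: {}\n".format(name, row[name]))
```

With a DataFrame, the rows could come from data.to_dict('records'), for example append_patient_summaries(data.to_dict('records'), ['fio2']).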

how to edit a csv in python and add one row after the 2nd row that will have the same values in all columns except 1

I'm new to Python and I'm facing a small challenge that I haven't been able to figure out so far.
I receive a CSV file with around 30-40 columns and 5-50 rows with various details in each cell. The 1st row of the CSV holds the title of each column, and the item values start in the 2nd row.
What I want is a Python script that reads the CSV file and does the following every time:
Add a row after the first item row (literally after the 2nd row, because the 1st row is titles). That new 3rd row should contain the same information as the row above, with one difference only: in the column "item_subtotal" I want to add the value from the column "discount total".
All the rows below should remain as they are, and the modified CSV should be saved as a new file with the word "edited" added to the file name.
I could really use some help, because so far I've only managed to open the CSV file with the Python script I'm developing; I haven't been able to copy the contents of the row above into the newly created row and replace that specific value.
Looking forward to any help.
Thank you
Here I'm attaching the CSV with some values changed for privacy reasons.
order_id,order_number,date,status,shipping_total,shipping_tax_total,fee_total,fee_tax_total,tax_total,discount_total,order_total,refunded_total,order_currency,payment_method,shipping_method,customer_id,billing_first_name,billing_last_name,billing_company,billing_email,billing_phone,billing_address_1,billing_address_2,billing_postcode,billing_city,billing_state,billing_country,shipping_first_name,shipping_last_name,shipping_address_1,shipping_address_2,shipping_postcode,shipping_city,shipping_state,shipping_country,shipping_company,customer_note,item_id,item_product_id,item_name,item_sku,item_quantity,item_subtotal,item_subtotal_tax,item_total,item_total_tax,item_refunded,item_refunded_qty,item_meta,shipping_items,fee_items,tax_items,coupon_items,order_notes,download_permissions_granted,admin_custom_order_field:customer_type_5
15001_TEST_2,,"2017-10-09 18:53:12",processing,0,0.00,0.00,0.00,5.36,7.06,33.60,0.00,EUR,PayoneCw_PayPal,"0,00",0,name,surname,,name.surname#gmail.com,0123456789,"address 1",,41541_TEST,location,,DE,name,surname,address,01245212,14521,location,,DE,,,1328,302,"product title",103,1,35.29,6.71,28.24,5.36,0.00,0,,"id:1329|method_id:free_shipping:3|method_title:0,00|total:0.00",,id:1330|rate_id:1|code:DE-MWST-1|title:MwSt|total:5.36|compound:,"id:1331|code:#getgreengent|amount:7.06|description:Launchcoupon for friends","text string",1,
You can also use pandas to manipulate the data from the csv like this:
import os
import pandas
Read the csv file into a pandas dataframe:
df = pandas.read_csv(filename)
Copy the first row of data (index 0, since the title row becomes the column names) and add the discount total to the item subtotal:
new_row = df.loc[0].copy()
new_row['item_subtotal'] += new_row['discount_total']
Concatenate the first data row with the new row and then everything after that:
df = pandas.concat([df.loc[:0], new_row.to_frame().T, df.loc[1:]], ignore_index=True)
Change the filename and write out the new csv file (str.strip('.csv') would remove characters, not the extension, so use os.path.splitext):
root, ext = os.path.splitext(filename)
df.to_csv(root + 'edited' + ext, index=False)
I hope this helps! Pandas is great for cleanly handling massive amounts of data, but may be overkill for what you are trying to do. Then again, maybe not. It would help to see an example data file.
The first step is to turn that .csv into something that is a little easier to work with. Fortunately, python has the 'csv' module which makes it easy to turn your .csv file into a much nicer list of lists. The below will give you a way to both turn your .csv into a list of lists and turn the modified data back into a .csv file.
import csv
import copy
def csv2list(ifile):
    """
    ifile = the path of the csv to be converted into a list of lists
    """
    # newline='' lets the csv module handle line endings itself
    with open(ifile, newline='') as f:
        return list(csv.reader(f, dialect='excel'))
#------------------------------------------------------------------------------
def list2csv(ilist,ofile):
    """
    ilist = the list of lists to be converted
    ofile = the output path for your csv file
    """
    with open(ofile, 'w', newline='') as csvfile:
        # same dialect as the reader, so quoted fields round-trip
        csvwriter = csv.writer(csvfile, dialect='excel')
        csvwriter.writerows(ilist)
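Now you can simply copy ilist[1] (the first data row) and change the appropriate element to reflect your summed value. The csv module returns every cell as a string, so convert to float before adding; n and n-x stand for the column indices of the two values:
listTemp = copy.deepcopy(ilist[1])
listTemp[n] = str(float(listTemp[n]) + float(listTemp[n-x]))
ilist.insert(2,listTemp)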
As for how to change the file name, just use:
import os
newFileName = os.path.splitext(oldFileName)[0] + "edited" + os.path.splitext(oldFileName)[1]
Hopefully this will help you out!
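Putting the pieces together, the whole task could look roughly like the sketch below. It assumes the column names item_subtotal and discount_total from the sample header and reads "add" as a numeric sum; duplicate_first_item_row is a made-up helper name. Decimal is used instead of float so that a sum like 35.29 + 7.06 is written back without binary floating-point noise.

```python
import csv
import os
from decimal import Decimal


def duplicate_first_item_row(in_path):
    """Copy the first data row, add discount_total to its item_subtotal,
    and save the result next to the input as '<name>edited<ext>'."""
    with open(in_path, newline="") as f:
        rows = list(csv.reader(f))
    header, first = rows[0], rows[1]
    sub = header.index("item_subtotal")
    disc = header.index("discount_total")
    new_row = list(first)
    # Cells are strings, so convert before adding and back afterwards.
    new_row[sub] = str(Decimal(first[sub]) + Decimal(first[disc]))
    rows.insert(2, new_row)  # directly after the first data row
    root, ext = os.path.splitext(in_path)
    out_path = root + "edited" + ext
    with open(out_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return out_path
```

All other rows are written back unchanged, although the csv writer may quote fields slightly differently from the original file.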

How can I write to an existing csv file from a dictionary to a specific column?

I have a dictionary I created from a csv file and would like to use this dict to update the values in a specific column of a different csv file called sheet2.csv.
Sheet2.csv has many columns with different headers and I need to only update the column PartNumber based on my key value pairs in my dict.
My question is how would I use the keys in dict to search through sheet2.csv and update/write to only the column PartNumber with the appropriate value?
I am new to python so I hope this is not too confusing and any help is appreciated!
This is the code I used to create the dict:
import csv
a = open('sheet1.csv', 'rU')
csvReader = csv.DictReader(a)
dict = {}
for line in csvReader:
    dict[line["ReferenceID"]] = line["PartNumber"]
print(dict)
dict = {'R150': 'PN000123', 'R331': 'PN000873', 'C774': 'PN000064', 'L7896': 'PN000447', 'R0640': 'PN000878', 'R454': 'PN000333'}
To make things even more confusing, I also need to make sure that already existing rows in sheet2 remain unchanged. For example, if there is a row with ReferenceID as R1234 and PartNumber as PN000000, it should stay untouched. So I would need to skip rows which are not in my dict.
Link to sample CSVs:
http://dropbox.com/s/zkagunnm0xgroy5/Sheet1.csv
http://dropbox.com/s/amb7vr48mdc94v6/Sheet2.csv
EDIT: Let me rephrase my question and provide a better example csvfile.
Let's say I have a Dict = {'R150': 'PN000123', 'R331': 'PN000873', 'C774': 'PN000064', 'L7896': 'PN000447', 'R0640': 'PN000878', 'R454': 'PN000333'}.
I need to fill in this csv file: https://www.dropbox.com/s/c95mlitjrvyppef/sheet.csv
Specifically, I need to fill in the PartNumber column using the keys of the dict I created. So I need to iterate through column ReferenceID and compare that value to my keys in dict. If there is a match I need to fill in the corresponding PartNumber cell with that value.... I'm sorry if this is all confusing!
The code below should do the trick. It first builds a dictionary just like your code and then reads Sheet2.csv row by row, updating the part number where the reference ID matches. The output goes to temp.csv, which you can compare with the initial Sheet2.csv. If you want to overwrite Sheet2.csv with the contents of temp.csv, simply uncomment the line with shutil.move.
Note that the sample files you provided do not contain any updateable data, so Sheet2.csv and temp.csv will be identical. I tested this with a slightly modified Sheet1.csv where I made sure that it actually contains a reference ID used by Sheet2.csv.
import csv
import shutil
def createReferenceIdToPartNumberMap(csvToReadPath):
    result = {}
    print('read part numbers to update from', csvToReadPath)
    with open(csvToReadPath, newline='') as csvInFile:
        csvReader = csv.DictReader(csvInFile)
        for row in csvReader:
            result[row['ReferenceID']] = row['PartNumber']
    return result
def updatePartNumbers(csvToUpdatePath, referenceIdToPartNumberMap):
    tempCsvPath = 'temp.csv'
    print('update part numbers in', csvToUpdatePath)
    with open(csvToUpdatePath, newline='') as csvInFile:
        csvReader = csv.reader(csvInFile)
        # Figure out which columns contain the reference ID and part number.
        titleRow = next(csvReader)
        referenceIdColumn = titleRow.index('ReferenceID')
        partNumberColumn = titleRow.index('PartNumber')
        # Write temporary CSV file with updated part numbers.
        with open(tempCsvPath, 'w', newline='') as tempCsvFile:
            csvWriter = csv.writer(tempCsvFile)
            csvWriter.writerow(titleRow)
            for row in csvReader:
                # Check if there is an updated part number.
                referenceId = row[referenceIdColumn]
                newPartNumber = referenceIdToPartNumberMap.get(referenceId)
                # If so, update the row just read accordingly.
                if newPartNumber is not None:
                    row[partNumberColumn] = newPartNumber
                    print(' update part number for %s to %s' % (referenceId, newPartNumber))
                csvWriter.writerow(row)
    # TODO: Move the temporary CSV file over the initial CSV file.
    # shutil.move(tempCsvPath, csvToUpdatePath)
if __name__ == '__main__':
    referenceIdToPartNumberMap = createReferenceIdToPartNumberMap('Sheet1.csv')
    updatePartNumbers('Sheet2.csv', referenceIdToPartNumberMap)
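For comparison, the update step can also be expressed with csv.DictReader and csv.DictWriter, which address cells by column name instead of by index. This is only a sketch of that variant: update_part_numbers and its parameter names are hypothetical, and it assumes the file has ReferenceID and PartNumber columns as described in the question.

```python
import csv


def update_part_numbers(in_path, out_path, part_numbers):
    """Copy in_path to out_path, replacing PartNumber wherever the row's
    ReferenceID is a key of part_numbers. Returns the number of updated rows."""
    updated = 0
    with open(in_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            ref = row["ReferenceID"]
            if ref in part_numbers:
                row["PartNumber"] = part_numbers[ref]
                updated += 1
            writer.writerow(row)  # rows without a match pass through unchanged
    return updated
```

Writing to a separate output path keeps the original file intact until the result has been checked, the same precaution the temp.csv approach takes.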
