openpyxl: Fetch value from excel and store in key value pair

I have written a Python script that fetches the cell values and displays them as a list, row by row.
Here is my script:
import openpyxl
book = openpyxl.load_workbook(excel_file_name)
active = book.get_sheet_by_name(excel_file_name)
def iter_rows(active):
    for row in active.iter_rows():
        yield [cell.value for cell in row]
res = list(iter_rows(active))
for new in res:
    print new
Output for the above script:
[state, country, code]
[abc, xyz, 0][def, lmn, 0]
I want the output in the format below:
[state:abc, country:xyz, code:0][state:def, country:lmn, code:0]
Please note: I want to do this using openpyxl.

I think this should do exactly what you want.
import openpyxl
book = openpyxl.load_workbook("Book1.xlsx")
active = book.get_sheet_by_name("Sheet1")
res = list(active)
final = []
for x in range(1, len(res)):
    partFinal = {}
    partFinal[res[0][0].value] = res[x][0].value
    partFinal[res[0][1].value] = res[x][1].value
    partFinal[res[0][2].value] = res[x][2].value
    final.append(partFinal)
for x in final:
    print x
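
The answer above hard-codes three columns. A minimal sketch of the same idea for any number of columns is to zip the header row with each data row. It is written for Python 3, and the names rows_to_dicts and sample_rows are made up for illustration. The rows here are plain tuples standing in for the sheet; with openpyxl 2.6 or later they could presumably come from sheet.iter_rows(values_only=True) instead.

```python
def rows_to_dicts(rows):
    """Pair each data row with the header row (assumed to be the first row)."""
    rows = iter(rows)
    header = next(rows, None)
    if header is None:
        return []  # empty sheet: nothing to pair
    return [dict(zip(header, row)) for row in rows]


# Stand-in for the sheet contents shown in the question.
# With openpyxl >= 2.6 the rows could come from
# sheet.iter_rows(values_only=True), which yields tuples of cell values.
sample_rows = [
    ("state", "country", "code"),
    ("abc", "xyz", 0),
    ("def", "lmn", 0),
]

records = rows_to_dicts(sample_rows)
for record in records:
    print(record)
```

Each record is an ordinary dict, so record["state"] gives the state for that row.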

Related

CSV: how to add columns to list and find best match (closest value) from another list?

I have code that reads a CSV file with 3 columns: Zone, Offnet calls, and Traffic.
I need code that builds a list from each row (for example, line 3 would give [421, 30167]) and finds the best match (closest value) among the lists defined in the code:
tp_usp15 = [10, 200]
tp_usp23 = [15, 250]
tp_usp27 = [20, 300]
list_usp = [tp_usp15,tp_usp23, tp_usp27]
tp_bsnspls_s = [1,30]
tp_bsnspls_steel = [13,250]
tp_bsnspls_chrome = [18,350]
list_bsnspls = [tp_bsnspls_s,tp_bsnspls_steel,tp_bsnspls_chrome]
tp_bsnsrshn10 = [10,200]
tp_bsnsrshn15 = [15,300]
tp_bsnsrshn20 = [20,400]
list_bsnsrshn = [tp_bsnsrshn10,tp_bsnsrshn15,tp_bsnsrshn20]
common_list = list_usp + list_bsnspls + list_bsnsrshn
So I need code that compares each row of the CSV to this list. I already have code that finds the closest match, but it works on inputs provided by the user.
client_traffic = int(input("Enter the expected monthly traffic: "))
client_offnet = int(input("Enter monthly offnet calls: "))
# the original used the undefined name client_payment here
list_client = [client_traffic, client_offnet]
from functools import partial
def distance_squared(x, y):
    return (x[0] - y[0])**2 + (x[1] - y[1])**2
best_match_overall = min(common_list, key=partial(distance_squared, list_client))
name_best_match_overall = [k for k,v in locals().items() if v == best_match_overall][0]
How can I update this code to work with every row of the CSV file? Please help.
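
One possible way to run the same closest-match lookup for every row is sketched below, in Python 3. It rests on several assumptions that are not in the question: the CSV has a header row with exactly the names Zone, Offnet calls and Traffic; each row is turned into [Offnet calls, Traffic] in the same order as the numbers in the plan lists; and the plans are kept in a dict (plans, best_plan and match_rows are hypothetical names) instead of being looked up through locals(). The sample data is invented.

```python
import csv
import io

# Plan values copied from the question, keyed by name so the name of the
# best match can be read directly instead of searching locals().
plans = {
    "tp_usp15": [10, 200],
    "tp_usp23": [15, 250],
    "tp_usp27": [20, 300],
    "tp_bsnspls_s": [1, 30],
    "tp_bsnspls_steel": [13, 250],
    "tp_bsnspls_chrome": [18, 350],
    "tp_bsnsrshn10": [10, 200],
    "tp_bsnsrshn15": [15, 300],
    "tp_bsnsrshn20": [20, 400],
}


def distance_squared(x, y):
    return (x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2


def best_plan(point, plans):
    """Return the name of the plan closest to point (first one wins on ties)."""
    return min(plans, key=lambda name: distance_squared(plans[name], point))


def match_rows(csv_file, plans):
    """Return (zone, best plan name) for every row of an open CSV file."""
    results = []
    for row in csv.DictReader(csv_file):
        # Assumed column order: [Offnet calls, Traffic]
        point = [int(row["Offnet calls"]), int(row["Traffic"])]
        results.append((row["Zone"], best_plan(point, plans)))
    return results


# Invented sample data; a real file would be opened with open(path, newline="").
sample = io.StringIO("Zone,Offnet calls,Traffic\nA,12,240\nB,19,390\n")
matches = match_rows(sample, plans)
print(matches)
```

Note that tp_usp15 and tp_bsnsrshn10 hold the same values, so a lookup by value (as with locals()) cannot tell them apart; keying by name at least makes the tie-breaking explicit.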

UnboundLocalError in python while returning list variable

I am creating a Python script to load data from a CSV file. I want to use that data to train a neural network, so I want it formatted as a list of tuples (x, y), where x and y are numpy arrays containing the input and output. But when I return that list (td in the following code), I get this error: 'UnboundLocalError: local variable 'td' referenced before assignment'.
Moderators: there are many questions about this error on Stack Overflow. I have read them and still couldn't find a solution, so I am posting this.
import csv
import numpy as np
def load_data():
    # loading the file
    with open('train.csv','rb') as csvfile:
        reader = csv.DictReader(csvfile,delimiter=',')
        for row in reader:
            # these if statements are to check if any of the field in csv file
            # is empty or not
            if(row['start_date'] != ""):
                a = True
            if(row['sold'] != ""):
                b = True
            if(row['euribor_rate'] != ""):
                c = True
            if(row['libor_rate'] != ""):
                d = True
            if(row['bought'] != ""):
                e = True
            if(row['creation_date'] != ""):
                f = True
            if(row['sell_date'] != ""):
                g = True
            if(row['return'] != ""):
                h = True
            if(a and b and c and d and e and f and g and h):
                # if any of the fields is empty then go to next row
                pass
            else:
                # now grab the fields
                mrow = {'sd':row['start_date'],'s':row['sold'],'er':row['euribor_rate'],'lr':row['libor_rate'],'b':row['bought'],'cd':row['creation_date'],'sd':row['sell_date'],'r':row['return']}
                # this will change the data type of fields to float
                int_dict = dict((k,float(v)) for k,v in mrow.iteritems())
                # save an input data field in x
                x = np.array([int_dict['s']])
                # save the output data field in y
                y = np.array([int_dict['r']])
                # put them in tuple
                tuple = (x,y)
                # make a list
                td = []
                # append them to list
                td.append(tuple)
    # return the list
    return td
Most of the answers suggest declaring td = [] outside the function, then using 'global td' and td.append(). That didn't work either: it no longer raised the error, but it returned an empty list.

You're probably never entering the else branch inside the loop, so td is never assigned. To avoid the error, define td before the loop, so it always exists:
def load_data():
    with open('train.csv','rb') as csvfile:
        reader = csv.DictReader(csvfile,delimiter=',')
        td = []
        for row in reader:
            ...
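
A fuller sketch of that fix, written for Python 3, might look like the following. It is an illustration under assumptions, not the asker's final code: the list is created once before the loop, rows with any empty field are skipped with a single any() check, and each (x, y) pair holds plain floats (wrapping them in np.array as in the question would be a small change). The REQUIRED list, the file-object parameter and the sample data are introduced here for the example.

```python
import csv
import io

# Field names taken from the question.
REQUIRED = ["start_date", "sold", "euribor_rate", "libor_rate",
            "bought", "creation_date", "sell_date", "return"]


def load_data(csvfile):
    """Return a list of (sold, return) float pairs from an open CSV file."""
    td = []  # created before the loop, so it exists even if no row qualifies
    for row in csv.DictReader(csvfile):
        # Skip the row if any required field is missing or empty.
        if any(not row.get(name) for name in REQUIRED):
            continue
        x = float(row["sold"])    # input value
        y = float(row["return"])  # output value
        td.append((x, y))
    return td


# Invented sample: the second row has an empty 'sold' field and is skipped.
sample = io.StringIO(
    "start_date,sold,euribor_rate,libor_rate,bought,creation_date,sell_date,return\n"
    "1,10.5,0.1,0.2,3,4,5,0.7\n"
    "1,,0.1,0.2,3,4,5,0.9\n"
)
training_data = load_data(sample)
print(training_data)
```

Because td is appended to rather than re-created, every qualifying row ends up in the result, not just the last one.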

reading across the rows and columns

I have the following text, which I am able to read using my code:
1;6;7.1023;13;7.4583;15;7.8140;45;6;7.1023;13;7.4583;15;7.8140;45
2;6;19.1023;13;19.4583;15;19.8140;45;6;19.1023;13;19.4583;15;19.8140;45
4;6;19.1023;13;19.4583;15;19.8140;45;6;19.1023;13;19.4583;15;19.8140;45
...
20; ...
I wrote the following code:
my_val = []
row=20
col=15
fr = open("%s" %filename,"r")
for i in range(0,row):
    for j in range (0,col):
        a = fr.readline().split(";")
        my_val = my_val + [float(a[2])]
print my_val
This gives me the value at position a[2] (e.g. in the first row: 7.1023) on every line from row 1 to 20.
What I want is to capture the values a[2]..a[4],a[6] together from rows 1, 4, 7, ... (i.e. every third row) and store them in my_val.
Any ideas how I can extend the above code to do this?

If I understand correctly what you want it to do, then I think this code will do it:
my_val = []
row=20
col=15
f = open("%s" %filename,"r")
lines = f.readlines()
for i in range(1,row,3):
    a = lines[i].split(";")
    for j in range(2,col,2):
        my_val.append(float(a[j]))
print my_val
f.close()
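
As a hedged variation in Python 3, the row step and the column positions can be made explicit parameters. One point worth checking against the real file: list indices are 0-based, so rows 1, 4, 7 of the file are lines[0], lines[3], lines[6]. The function name pick_values, its defaults and the sample lines are assumptions made for this sketch.

```python
def pick_values(lines, row_step=3, columns=(2, 4, 6)):
    """Collect the given columns from every row_step-th line, starting at the first."""
    values = []
    for line in lines[::row_step]:
        fields = line.strip().split(";")
        values.extend(float(fields[c]) for c in columns)
    return values


# Shortened sample lines in the same ';'-separated layout as the question.
sample_lines = [
    "1;6;7.1023;13;7.4583;15;7.8140;45",
    "2;6;19.1023;13;19.4583;15;19.8140;45",
    "3;6;20.5;13;21.5;15;22.5;45",
    "4;6;30.25;13;31.25;15;32.25;45",
]

my_val = pick_values(sample_lines)
print(my_val)
```

With a real file, the lines could be read with `with open(filename) as f: lines = f.readlines()`, which also closes the file automatically.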

Writing a list to excel

I am trying to do the following:
For each entry in Col A, if that entry recurs in the same Col A, then add together all of its values in Col E.
Then write only those (added) values from Col E into another Excel sheet. Each Col A entry should have all the Col E values corresponding to it.
However, I can create that output sheet for the last row only.
Here is the code that I've written,
#! /usr/bin/env python
from xlrd import open_workbook
from tempfile import TemporaryFile
from xlwt import Workbook
wb = open_workbook('/Users/dem/Documents/test.xlsx')
wk = wb.sheet_by_index(0)
for i in range(wk.nrows):
    a = str(wk.cell(i,0).value)
    b = []
    e = []
    for j in range(wk.nrows):
        c = str(wk.cell(j,0).value)
        d = str(wk.cell(j,4).value)
        if a == c:
            b.append(d)
    print b
    e.append(b)
book = Workbook()
sheet1 = book.add_sheet('sheet1')
n = 0
for n, item in enumerate(e):
    sheet1.write(n,0,item)
    n +=1
book.save('/Users/dem/Documents/res.xls')
book.save(TemporaryFile())
Erred resulting sheet(mine):

Comments in the code.
#! /usr/bin/env python
from xlrd import open_workbook
from tempfile import TemporaryFile
from xlwt import Workbook
import copy
wb = open_workbook('C:\\Temp\\test.xls')
wk = wb.sheet_by_index(0)
# you need to put e=[] outside the loop, otherwise it is reset to an empty list on every iteration
# e is used to store the final result
e = []
# f stores the Col A values already handled, so each value is recorded only once
f = []
for i in range(wk.nrows):
    b = []
    temp = None
    a = str(wk.cell(i,0).value)
    # here we only record each value once
    if a in f:
        continue
    # start from i+1 to avoid double counting
    for j in range(i+1, wk.nrows):
        c = str(wk.cell(j,0).value)
        if a == c:
            # these operations are placed here so they run only when needed
            d = str(wk.cell(j,4).value)
            k = str(wk.cell(i,4).value)
            f.append(a)
            # record all the values in Col E
            b.append(k)
            b.append(d)
            # you need to use deepcopy here in order to get an accurate value
            temp = copy.deepcopy(b)
    # in your case, row 5 has no duplicate, so temp for row 5 will be None; avoid adding None to the final result
    if temp:
        e.append(temp)
book = Workbook()
sheet1 = book.add_sheet('sheet1')
n = 0
for n, item in enumerate(e):
    sheet1.write(n,0,item)
    # you don't need n+=1 here, since enumerate increases n itself
book.save('C:\\Temp\\res.xls')
book.save(TemporaryFile())
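
Since the question asks to add together the Col E values for each repeated Col A entry, a dictionary keyed by the Col A value is another way to approach it. The sketch below (Python 3) works on plain row tuples so that it can run on its own; sum_by_key and sample_rows are hypothetical, and with xlrd the rows could presumably be built with wk.row_values(i) for each row index.

```python
def sum_by_key(rows, key_col=0, value_col=4):
    """Sum the values in value_col for each distinct entry in key_col."""
    totals = {}
    for row in rows:
        key = row[key_col]
        totals[key] = totals.get(key, 0) + row[value_col]
    return totals


# Invented rows: column 0 plays the role of Col A, column 4 of Col E.
sample_rows = [
    ("x", None, None, None, 1.0),
    ("y", None, None, None, 2.0),
    ("x", None, None, None, 3.5),
]

totals = sum_by_key(sample_rows)
print(totals)
```

Each key then maps to a single number, which can be written to one cell per row of the output sheet.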

I think you should consider using csv.writer with dialect='excel'. There is an example of its usage in the documentation. I think this is the simplest way to work with Excel if you don't need extensive functionality, as in your case.
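
A minimal Python 3 sketch of that suggestion follows. It writes to an in-memory buffer so it can run anywhere; the totals dict and the column titles are invented for the example. For a real file, the csv documentation recommends opening it with newline="" (for example open("res.csv", "w", newline="")).

```python
import csv
import io

# Hypothetical data: one total per key.
totals = {"x": 4.5, "y": 2.0}

buffer = io.StringIO()
writer = csv.writer(buffer, dialect="excel")  # 'excel' is also the default dialect
writer.writerow(["key", "total"])             # header row
for key, total in totals.items():
    writer.writerow([key, total])

csv_text = buffer.getvalue()
print(csv_text)
```

The result is a comma-separated text file that Excel can open, not a native .xls workbook.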

Printing to a .csv file from a Random List

When I create a random List of numbers like so:
columns = 10
rows = 10
for x in range(rows):
    a_list = []
    for i in range(columns):
        a_list.append(str(random.randint(1000000,99999999)))
    values = ",".join(str(i) for i in a_list)
    print values
then all is well.
But when I attempt to send the output to a file, like so:
sys.stdout = open('random_num.csv', 'w')
for i in a_list:
    print ", ".join(map(str, a_list))
only the last row is output, 10 times. How do I write the entire list to a .csv file?

In your first example, you're creating a new list for every row. (By the way, you don't need to convert them to strs twice).
In your second example, you print the last list you had created previously. Move the output into the first loop:
import random
columns = 10
rows = 10
with open("random_num.csv", "w") as outfile:
    for x in range(rows):
        a_list = [random.randint(1000000,99999999) for i in range(columns)]
        values = ",".join(str(i) for i in a_list)
        print values
        outfile.write(values + "\n")

Tim's answer works well, but I think you are trying to print to terminal and the file in different places.
So with minimal modifications to your code, you can use a new variable all_list
import random
import sys
all_list = []
columns = 10
rows = 10
for x in range(rows):
    a_list = []
    for i in range(columns):
        a_list.append(str(random.randint(1000000,99999999)))
    values = ",".join(str(i) for i in a_list)
    print values
    all_list.append(a_list)
sys.stdout = open('random_num.csv', 'w')
for a_list in all_list:
    print ", ".join(map(str, a_list))

The csv module takes care of a bunch of the crap needed for dealing with csv files.
As you can see below, you don't need to worry about conversion to strings or adding line-endings.
import csv
import random
columns = 10
rows = 10
with open("random_num.csv", "wb") as outfile:
    writer = csv.writer(outfile)
    for x in range(rows):
        a_list = [random.randint(1000000,99999999) for i in range(columns)]
        writer.writerow(a_list)
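
The snippets in this thread are Python 2 (print statements, "wb" mode). A rough Python 3 equivalent is sketched below; the function name write_random_rows and its rng parameter are made up for the example, and the demo writes to an in-memory buffer. For a real file in Python 3, the csv documentation recommends text mode with newline="", e.g. open("random_num.csv", "w", newline="").

```python
import csv
import io
import random


def write_random_rows(outfile, rows=10, columns=10, rng=random):
    """Write rows x columns random integers to outfile as CSV."""
    writer = csv.writer(outfile)
    for _ in range(rows):
        writer.writerow([rng.randint(1000000, 99999999) for _ in range(columns)])


# In-memory demo with a seeded generator so repeated runs give the same numbers.
buffer = io.StringIO()
write_random_rows(buffer, rng=random.Random(0))
lines = buffer.getvalue().splitlines()
print(len(lines), "rows written")
```

Each row is written as soon as it is generated, so no list of all rows has to be kept around.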
