I had been using openpyxl for all my Excel projects, but now I have to work with .xls files, so I was forced to switch libraries. I chose pyexcel because it seemed fairly easy and well documented. I have already gone through hell creating hundreds of variables, because there is no .index property or anything similar.
What I want to do now is read a column in the reference file, e.g. the "Quantity" column, and get a value from it, e.g. 12, then check the same column in the other file and, if the value there is not 12, set it to 12. Easy. But I cannot find anything in the documentation about changing a single cell value. Can you help me?
Maybe I am missing something, but wouldn't this be the simplest approach?
import pyexcel as pe
column_name = 'Quantity'
value_to_find = 12
# Find the row that holds the value in the reference file
sheets1 = pe.get_book(file_name='Sheet1.xls')
sheets1[0].name_columns_by_row(0)
row = sheets1[0].column[column_name].index(value_to_find)
# Check the same row and column in the second file
sheets2 = pe.get_book(file_name='Sheet2.xls')
sheets2[0].name_columns_by_row(0)
if sheets2[0][row, column_name] != value_to_find:
    sheets2[0][row, column_name] = value_to_find
EDIT
Strange, you can only assign values if you use cell_address indexing, must be some bug. Add this function:
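def index_to_letter(n):
    # Convert a zero-based column index to an Excel column letter (0 -> 'A', 26 -> 'AA')
    alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    result = []
    n += 1  # switch to one-based numbering, which makes the base-26 arithmetic simpler
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        result.insert(0, alphabet[remainder])
    return ''.join(result)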
And modify the last part:
sheets2[0].name_columns_by_row(0)
col_letter = index_to_letter(sheets2[0].colnames.index(column_name))
cel_address = col_letter+str(row+1)
if sheets2[0][cel_address] != value_to_find:
    sheets2[0][cel_address] = value_to_find
EDIT 2
It looks like assignment fails only when you use the column name directly, so a workaround would be to find the index of column_name:
sheets2[0].name_columns_by_row(0)
col_index = sheets2[0].colnames.index(column_name)
if sheets2[0][row, col_index] != value_to_find:
    sheets2[0][row, col_index] = value_to_find
Excel has two ways to reference a cell: the cell name ("A1") and the cell vector (row, column).
The pyexcel tutorial states that it supports both methods. caiohamamura's method builds the cell name, but you don't need to do that if the cells are in the same location in each file: you can use the vector.
Once you have the cell, assigning a value to a single cell is simple - you assign the value. Example:
import pyexcel
sheet = pyexcel.get_sheet(file_name="Units.xls")
print(sheet[3,2]) # this gives me "cloud, underwater"
sheet[3,2] = "cloud, underwater, special"
sheet.save_as("Units1.xls")
Note that all I had to do was "sheet[3,2] =".
This is not explicitly stated but is hinted at in the pyexcel documentation where it states that to update a whole column you do:
sheet.column["Column 2"] = [11, 12, 13]
i.e. replace a list by assigning a new list. Same logic applies to a single cell - just assign a new value.
Bonus - the [Row, column] method gets around cell locations in columns greater than 26 (i.e. 'AA' and above).
Caveat - make sure your comparison is like-for-like, i.e. that an int is compared with an int and not with a str. Python does not convert between the two implicitly (12 == "12" is False), and a value read from a cell may come back as text, especially if you are using Python 2 and Unicode is involved.
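The like-for-like caveat can be sketched in plain Python. The helper name cells_match is hypothetical and not part of pyexcel; it assumes the expected value is a number and that the cell may hold an int, a float, or the same number stored as text.

```python
def cells_match(cell_value, expected):
    """Return True if cell_value equals the number `expected`, whether the
    cell holds a number or the same number stored as text."""
    try:
        return float(cell_value) == expected
    except (TypeError, ValueError):
        # Empty cells (None, "") or non-numeric text cannot match a number
        return False

print(12 == "12")              # False: Python does not convert str to int here
print(cells_match("12", 12))   # True
print(cells_match(12.0, 12))   # True
print(cells_match("", 12))     # False
```

With a helper like this, the check before the assignment becomes `if not cells_match(sheet[row, col], 12): sheet[row, col] = 12`.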
Related
I am taking a user input of "components", splitting it into a list, and comparing those components to a list of available components generated from column A of a Google sheet. Then I am attempting to return the cell value from column G corresponding to the column A index, and to repeat this for all input values.
So far I am getting the first value just fine, but I'm obviously missing something to get it to cycle back to the remaining user input components. I tried some things using itertools but wasn't able to get the results I wanted. I have a feeling I will facepalm when I discover the solution to this through here or on my own.
mix = select.split(',')  # splits the user input string into separate elements
ws = s.worksheet("Details")  # opens table in google sheet
c_list = ws.col_values(1)  # sets column A to a list
modifier = [""] * len(mix)  # sets size of list based on user input
list = str(c_list).lower()
for i in range(len(mix)):
    if str(mix[i]).lower() in str(c_list).lower():
        for j in range(len(c_list)):
            if str(mix[i]).lower() == str(c_list[j]).lower():
                modifier[i] = ws.cell(j+1,7).value  # get value of cell from Column G corresponding to Column A for component name
print(mix)
print(modifier)
You are overcomplicating the code by writing C-like code.
I have replaced all the loops you had with a single, simpler loop, and left comments above each line of code to explain what it does.
# Here we use .lower() to lower case all the values in select
# before splitting them and adding them to the list "mix"
mix = select.lower().split(",")
ws = s.worksheet("Details")
# Here we used a list comprehension to create a list of the "column A"
# values but all in lower case
c_list = [cell.lower() for cell in ws.col_values(1)]
modifier = [""] * len(mix)
# Here we loop through every item in mix, but also keep a count of iterations
# we have made, which we will use later to add the "column G" element to the
# corresponding location in the list "modifier"
for i, value in enumerate(mix):
    # Here we check if the value exists in the c_list
    if value in c_list:
        # If we find the value in the c_list, we get the index of the value in c_list
        index = c_list.index(value)
        # Here we add the value of column G that has an index of "index + 1" to
        # the modifier list at the same location of the value in list "mix"
        modifier[i] = ws.cell(index + 1, 7).value
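If the sheet is large, one cell request per component can be avoided by reading both columns once and building a lookup table. The following is only a sketch: plain lists stand in for ws.col_values(1) and ws.col_values(7), the sample data and the name lookup_modifiers are made up for illustration, and stripping whitespace around each component is an assumption about the input.

```python
def lookup_modifiers(select, col_a, col_g):
    # Map each lower-cased name in column A to the value on the same row of
    # column G. zip stops at the shorter list, so names without a column G
    # value are simply left out of the table.
    table = {}
    for name, value in zip(col_a, col_g):
        # setdefault keeps the first occurrence, matching list.index()
        table.setdefault(name.strip().lower(), value)
    mix = [part.strip().lower() for part in select.split(",")]
    # Unknown components get an empty string, like the initial modifier list
    return [table.get(part, "") for part in mix]

col_a = ["Component", "Iron", "Copper", "Tin"]   # stand-in for ws.col_values(1)
col_g = ["Modifier", "1.5", "0.8", "2.0"]        # stand-in for ws.col_values(7)
print(lookup_modifiers("iron, Tin,gold", col_a, col_g))  # ['1.5', '2.0', '']
```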
I am trying to figure out the most efficient way of finding similar values for a specific cell in a specified column (not all columns) of an Excel .xlsx document. The code I have currently assumes all of the strings are unsorted. However, the file I am using and the files I will be using all have strings sorted from A to Z. So instead of doing a linear search, I wonder what other search algorithm I could use (e.g. binary search), and how I could fix my code.
So far I have created a function, find(). Before the function runs, the program takes a value from the user's input and uses it as the sheet name; I print out all available sheet names in the Excel document to help the user. I created an empty list, results, to store the results. A for loop iterates through column A only, because I want to search a single chosen column. The variable start is the coordinate of the current cell in column A (e.g. A1 or A400) and changes on each iteration. The variable next is compared with start; it is effectively start + 1, but since I can't add 1 to a string I concatenate and cast so that the loop covers A1 to A100, or however many cells are in column A. My function getVal() is called with two parameters: the coordinate of the cell and the worksheet we are working from. The values returned by getVal() are passed to my function similar(), which calls SequenceMatcher() from difflib and returns how similar two strings are as a percentage, e.g. similar("hello", "helloo") returns about 90. If the two strings are more than 40 percent similar, the match is appended to the results list.
from difflib import SequenceMatcher
# wb is assumed to be an openpyxl workbook loaded earlier (not shown in the question)
def setSheet(ws):
    sheet = wb[ws]
    return sheet
def getVal(coordinate, worksheet):
    value = worksheet[coordinate].value
    return value
def similar(first, second):
    percent = SequenceMatcher(None, first, second).ratio() * 100
    return percent
def find():
    column = "A"
    print("\n")
    print("These are all available sheets: ", wb.sheetnames)
    print("\n")
    name = input("What sheet are we working out of> ")
    results = []
    ws = setSheet(name)
    for i in range(1, ws.max_row):
        temp = str(column + str(i))
        x = ws[temp]
        start = x.coordinate  # x is already a cell, so read its coordinate directly
        y = str(column + str(i + 1))
        next = ws[y].coordinate
        if(similar(getVal(start,ws), getVal(next,ws)) > 40):
            results.append(getVal(start, ws))  # getVal needs the worksheet as well
    return results
This is some nasty looking code so I do apologize in advance. The expected results should just be a list of strings that are "similar".
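The neighbour comparison described above can be sketched without openpyxl, using a plain list as a stand-in for the values of column A. The function name find_similar_neighbours and the sample data are made up for illustration; only SequenceMatcher and the 40 percent threshold come from the question.

```python
from difflib import SequenceMatcher

def similar(first, second):
    # Similarity of two strings as a percentage between 0 and 100
    return SequenceMatcher(None, first, second).ratio() * 100

def find_similar_neighbours(values, threshold=40):
    # Compare each value with the one directly below it and keep the
    # upper value of every pair that scores above the threshold
    return [a for a, b in zip(values, values[1:]) if similar(a, b) > threshold]

column_a = ["apple", "apples", "banana", "cherry", "cherries"]
print(find_similar_neighbours(column_a))  # ['apple', 'cherry']
```

Pairing the list with a shifted copy of itself via zip avoids building coordinate strings such as "A" + str(i + 1) by hand.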
I have started to use the gspread library and already have a sheet to which I'd like to append after the last row that has data in it. I'll retrieve the values between A1 and maxrows, loop through them, and check whether they are empty. However, I am unable to add a variable to the second line here. Perhaps I am just not escaping it correctly? I bet this is very simple:
maxrows = "A" + str(worksheet.row_count)
cell_list = worksheet.range('A1:A%s') % (maxrows)
Your variable maxrows is already in the form "An": the concatenation already contains the letter and the number.
But you are adding an extra A to it in worksheet.range('A1:A%s').
Also, you are not using string interpolation with % correctly: in your code the % is applied to the result of the range() call, not to the range string.
It should have been one of these:
maxrows = "A" + str(worksheet.row_count)
worksheet.range('A1:%s' % maxrows)
or
worksheet.range('A1:A%d' % worksheet.row_count)
(among other possible solutions)
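The string handling can be checked on its own, without a worksheet. In this small sketch the helper name column_a_range and the row count of 1000 are made up for illustration.

```python
def column_a_range(row_count):
    # Build the A1-notation range covering rows 1..row_count of column A
    return 'A1:A%d' % row_count

row_count = 1000                  # stand-in for worksheet.row_count
maxrows = "A" + str(row_count)

print(column_a_range(row_count))  # A1:A1000
print('A1:%s' % maxrows)          # A1:A1000
print('A1:A%s' % maxrows)         # A1:AA1000, the doubled "A" from the question
```

In each case the % operator is applied to the string before it is passed to worksheet.range().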
I want to convert the row and column indices into an Excel alphanumeric cell reference like 'A1'. I'm using python and openpyxl, and I suspect there's a utility somewhere in that package that does this, but I haven't found anything after some searching.
I wrote the following, which works, but I'd rather use something that's part of the openpyxl package if it's available.
def xlref(row,column):
    """
    xlref - Simple conversion of row, column to an excel string format
    >>> xlref(0,0)
    'A1'
    >>> xlref(0,26)
    'AA1'
    """
    def columns(column):
        # ascii_uppercase exists on both Python 2 and 3 (plain "uppercase" is Python 2 only)
        from string import ascii_uppercase as uppercase
        if column > 26**3:
            raise Exception("xlref only supports columns < 26^3")
        c2chars = [''] + list(uppercase)
        c2,c1 = divmod(column,26)
        c3,c2 = divmod(c2,26)
        return "%s%s%s" % (c2chars[c3],c2chars[c2],uppercase[c1])
    return "%s%d" % (columns(column),row+1)
Does anyone know a better way to do this?
Here's the full new xlref using openpyxl.utils.get_column_letter from @Rick's answer:
from openpyxl.utils import get_column_letter
def xlref(row, column, zero_indexed=True):
    if zero_indexed:
        row += 1
        column += 1
    return get_column_letter(column) + str(row)
Now
>>> xlref(0, 0)
'A1'
>>> xlref(100, 100)
'CW101'
Looks like openpyxl.utils.get_column_letter does the same function as my columns function above, and is no doubt a little more hardened than mine is. Thanks for reading!
Older question, but maybe helpful: when using XlsxWriter, one can use xl_rowcol_to_cell() like this:
from xlsxwriter.utility import xl_rowcol_to_cell
cell = xl_rowcol_to_cell(1, 2) # C2
See Working with Cell Notation.
In a MySQL table using MyISAM I have an integer value, e.g. 011. When I query it in Python, it prints 11, dropping the leading 0. It should print the exact value stored in the DB, i.e. 011 instead of 11. Any help?
Your column is an int, so MySQLdb gives you an integer value back in the query result. However, I think you should be able to write a MySQLdb result set wrapper (or maybe find one someone else already wrote) that inspects the flags set on the columns of the result set and casts to a string appropriately.
Look at cursor.description and cursor.description_flags as well as PEP-249. I think (i.e. I have not actually tried it) something along the lines of the following should get you started:
import MySQLdb
from MySQLdb.constants import FIELD_TYPE, FLAG
def get_result_set_with_db_specified_formatting(cursor):
    integer_field_types = (FIELD_TYPE.TINY,
                           FIELD_TYPE.SHORT,
                           FIELD_TYPE.LONG,
                           FIELD_TYPE.LONGLONG,
                           FIELD_TYPE.INT24)
    # fetchall() returns tuples, which cannot be modified in place, so copy them to lists
    rows = [list(row) for row in cursor.fetchall()]
    for row in rows:
        for index, value in enumerate(row):
            value = str(value)
            if (cursor.description[index][1] in integer_field_types
                    and cursor.description_flags[index] & FLAG.ZEROFILL):
                # Left-pad with zeros up to the display size reported for the column
                if len(value) < cursor.description[index][2]:
                    value = ('0' * (cursor.description[index][2] - len(value))) + value
            row[index] = value
    return rows
Maybe simple zero filling is OK in this case?
>>> print(str(11).zfill(3))
011
As I understand it, the leading zero is an extra part of the number. If its length is not constant, you need to change the column's data type in the DB to VARCHAR.
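For a fixed width, the padding can be done in several equivalent ways with the standard library alone. In this sketch the helper name zero_pad and the width of 3 are assumptions for illustration; the real width depends on how the column is defined.

```python
def zero_pad(value, width):
    # Left-pad an integer with zeros to the given width, returning a string
    return str(value).zfill(width)

value = 11
width = 3  # assumed display width of the column

print(zero_pad(value, width))             # 011
print('%0*d' % (width, value))            # 011
print('{:0{w}d}'.format(value, w=width))  # 011
```

Values that are already wider than the requested width are returned unchanged.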