Python "index out of range" error in a loop - python

Hello, I hope everything is fine.
I'm dealing with this issue: list index out of range.
Error message:
c:\Users.....\Documents\t.py:41: FutureWarning: As the xlwt package is no longer maintained, the xlwt engine will be removed in a future version of pandas. This is the only engine in pandas that supports writing in the xls format. Install openpyxl and write to an xlsx file instead. You can set the option io.excel.xls.writer to 'xlwt' to silence this warning. While this option is deprecated and will also raise a warning, it can be globally set and the warning suppressed.
read_file.to_excel(planilhaxls, index = None, header=True)
The goal: I need a loop that takes a specific row from a worksheet such as sheet_1.csv, the corresponding row from sheet_2.csv, and the same row from a third sheet, and stores them as 3 columns in sheet_output.csv.
Issue: I get an "index out of range" error and I don't know how to fix it.
Question: Is there another way to do this?
The code is below:
(Please ignore the Portuguese in the code.)
import xlrd as ex
import pyautogui as pag
import os
import pyperclip as pc
import pandas as pd
import pygetwindow as pgw
import openpyxl
#Inputs
numerolam = int(input('Escolha o número da lamina: '))
amostra = input('Escoha a amostra: (X, Y, W ou Z): ')
milimetro_inicial = int(input("Escolha o milimetro inicial: "))
milimetro_final = int(input("Escolha o milimetro final: "))
tipo = input("Escolha o tipo - B para Branco & E para Espelho: ")
linha = int(input("Escolha a linha da planilha: "))
# Code conversion
if tipo == 'B':
    tipo2 = 'BRA'
else:
    tipo2 = 'ESP'
# xlsx file
#planilhaxlsx = f'A{numerolam}{amostra}{milimetro_inicial}{tipo2}.xlsx'
#planilhaxls = f'A{numerolam}{amostra}{milimetro_inicial}{tipo2}.xls'
#planilhacsv = f'A{numerolam}{amostra}{milimetro_inicial}{tipo2}.csv'
#planilhacsv_ = f'A{numerolam}{amostra}{milimetro_final}{tipo2}.csv'
#arquivoorigin = f'A{numerolam}{amostra}{milimetro_inicial}{tipo2}.opj'
# Folder
pasta = f'L{numerolam}{amostra}'
while milimetro_inicial < milimetro_final:
    planilhaxlsx = f'A{numerolam}{amostra}{milimetro_inicial}{tipo2}.xlsx'
    planilhaxls = f'A{numerolam}{amostra}{milimetro_inicial}{tipo2}.xls'
    planilhacsv = f'A{numerolam}{amostra}{milimetro_inicial}{tipo2}.csv'
    planilhacsv_ = f'A{numerolam}{amostra}{milimetro_final}{tipo2}.csv'
    arquivoorigin = f'A{numerolam}{amostra}{milimetro_inicial}{tipo2}.opj'
    # Convert the .csv file to .xls and .xlsx
    read_file = pd.read_csv(planilhacsv)
    read_file.to_excel(planilhaxls, index = None, header=True)
    #read_file.to_excel(planilhaxlsx, index = None, header=True)
    # Open the .xls file with xlrd (Excel file).
    book = ex.open_workbook(planilhaxls)
    sh = book.sheet_by_index(0)
    # Variable declarations.
    coluna_inicial = 16 # Q - starts at 0
    valor = []
    index = 0
    # Loop that stores the row's values from columns Q-Z in valor, indices 0..(len-1)
    while coluna_inicial < 25:
        # ERROR ON THE LINE BELOW
        temp = sh.cell_value(linha, coluna_inicial)
        valor.append(temp) # Append the value
        print(index)
        print(valor[index])
        index +=1
        coluna_inicial += 1
    # Open the output workbook
    wb = openpyxl.Workbook()
    ws = wb.active
    # Start the write loop
    colunas = ['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z']
    idx_colunas = 0
    contador_loop = colunas[idx_colunas]
    linha_loop = 1
    index_out = 0
    s = f'{contador_loop}{linha_loop}'
    print(s)
    while linha_loop < len(valor):
        valor[index_out] = "{}".format(valor[index_out])
        ws[s].value = valor[index_out]
        print(valor[index_out] + ' feito')
        linha_loop += 1
        idx_colunas += 1
        index_out += 1
    # Save the output workbook
    wb.save("teste.xlsx")
    milimetro_inicial += 1

Your problem is on this line:
temp = sh.cell_value(linha, coluna_inicial)
It takes two indices, linha and coluna_inicial. 'linha' is a fixed value entered by the user, so the more likely culprit is 'coluna_inicial', which increases by 1 on each iteration:
coluna_inicial += 1
The loop continues while 'coluna_inicial' is less than 25. I suggest you check the number of columns in the sheet 'sh' using
sh.ncols
either for debugging or as the upper bound of your loop. Column indices start at 0, so if sh.ncols is less than 25 you will get the index error as soon as 'coluna_inicial' reaches sh.ncols. It is also worth checking that 'linha' is below sh.nrows.
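As a rough sketch of that bounds check: the helper below clamps the column range to the sheet's real width and rejects a row that does not exist. The names read_row_slice and FakeSheet are made up for illustration; FakeSheet only imitates the nrows, ncols and cell_value members of an xlrd sheet so the snippet runs without an .xls file.

```python
class FakeSheet:
    """Hypothetical stand-in exposing the xlrd sheet members used below."""

    def __init__(self, rows):
        self._rows = rows
        self.nrows = len(rows)
        self.ncols = max((len(r) for r in rows), default=0)

    def cell_value(self, rowx, colx):
        return self._rows[rowx][colx]


def read_row_slice(sheet, row, first_col, stop_col):
    """Read one row from first_col up to stop_col (exclusive), clamped to the sheet."""
    if not 0 <= row < sheet.nrows:
        raise IndexError(f"row {row} is outside 0..{sheet.nrows - 1}")
    # Never ask for a column the sheet does not have.
    stop = min(stop_col, sheet.ncols)
    return [sheet.cell_value(row, col) for col in range(first_col, stop)]


# A sheet with 21 columns, like 'Sheet1' in the verbose output further down.
sheet = FakeSheet([list(range(21)), list(range(100, 121))])
values = read_row_slice(sheet, 1, 16, 25)
print(values)  # [116, 117, 118, 119, 120]
```

With a real workbook you would pass the object returned by book.sheet_by_index(0) instead of the stand-in.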
<---------------Additional Information---------------->
Since this is an xls file there is no need for delimiter settings; your code as it stands should open it correctly. However, the xls workbook to be opened is determined by the values the user enters at the start, which presumably means there are several in the directory to choose from. Are you sure you are checking the same xls file that your code is opening? Also, if there is more than one sheet in the workbook(s), are you opening the correct sheet?
You can print the workbook name to be sure which one is being opened. Also, if you add verbosity to the open_workbook call (level 2 should be high enough), it will print details of the available sheets to the console when it opens the book, including the number of rows and columns in each.
print(planilhaxls)
book = ex.open_workbook(planilhaxls, verbosity=2)
sh = book.sheet_by_index(0)
print(sh.name)
E.g.
BOF: op=0x0809 vers=0x0600 stream=0x0010 buildid=14420 buildyr=1997 -> BIFF80
sheet 0('Sheet1') DIMENSIONS: ncols=21 nrows=21614
BOF: op=0x0809 vers=0x0600 stream=0x0010 buildid=14420 buildyr=1997 -> BIFF80
sheet 1('Sheet2') DIMENSIONS: ncols=13 nrows=13
The print(sh.name) call shown above prints the name of the sheet that 'sh' is assigned to.

Related

Filter Pandas dataframe with user input

I'm trying to write a function that takes inputs for several variables, applies them as filters, and returns the filtered dataframe. Each input will only ever receive a single value, chosen by the user from a few options; if an input is empty, that filter must return all the data.
I haven't added the user input yet because I wanted to test the function first. However, the function always returns an empty dataframe and I can't work out why. Here is the code I have so far:
The real dataframe comes from an Excel file, so here is a sample that fits:
import pandas as pd
df = pd.DataFrame({"FarolAging":["Vermelho","Verde","Amarelo"],"Dias Pendentes":["20 dias","40 dias","60 dias"],"Produto":["Prod1","Prod1","Prod2"],
                   "Officer":["Alexandre Denardi","Alexandre Denardi","Lucas Fernandes"],"Analista":["Guilherme De Oliveira Moura","Leonardo Silva","Julio Cesar"],
                   "Coord":["Anna Claudia","Bruno","Bruno"]})
FarolAging1 = ['Vermelho']
DiasPendentes = []
Produto = []
Officer = []
def func(FarolAging1,DiasPendentes,Produto,Officer):
    if len(Officer) <1:
        Officer = df['Officer'].unique()
    if len(FarolAging1) <1:
        FarolAging1 = df['FarolAging'].unique()
    if len(DiasPendentes) <1:
        DiasPendentes = df['Dias Pendentes'].unique()
    if len(Produto) <1:
        Produto = df['Produto'].unique()
    dados2 = df.loc[df['FarolAging'].isin([FarolAging1]) & (df['Dias Pendentes'].isin([DiasPendentes])) & (df['Produto'].isin([Produto])) & (df['Officer'].isin([Officer]))]
    print(dados2)
func(FarolAging1, DiasPendentes, Produto, Officer)
You have to remove the square brackets in isin because you already have lists:
def func(FarolAging1,DiasPendentes,Produto,Officer):
    if len(Officer) <1:
        Officer = df['Officer'].unique()
    if len(FarolAging1) <1:
        FarolAging1 = df['FarolAging'].unique()
    if len(DiasPendentes) <1:
        DiasPendentes = df['Dias Pendentes'].unique()
    if len(Produto) <1:
        Produto = df['Produto'].unique()
    # Transform .isin([...]) into .isin(...)
    dados2 = (df.loc[df['FarolAging'].isin(FarolAging1)
                     & (df['Dias Pendentes'].isin(DiasPendentes))
                     & (df['Produto'].isin(Produto))
                     & (df['Officer'].isin(Officer))])
    print(dados2)
    return dados2 # don't forget to return something
Output:
>>> func(FarolAging1, DiasPendentes, Produto, Officer)
FarolAging Dias Pendentes Produto Officer Analista Coord
0 Vermelho 20 dias Prod1 Alexandre Denardi Guilherme De Oliveira Moura Anna Claudia
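The same "empty input means no filter" rule can also be written without replacing empty lists by unique(). The sketch below is only one possible variant: filter_frame and its dict-of-lists argument are hypothetical names, and the sample frame is a cut-down copy of the one in the question.

```python
import pandas as pd


def filter_frame(frame, selections):
    """Keep rows whose values are in every non-empty selection list."""
    mask = pd.Series(True, index=frame.index)
    for column, allowed in selections.items():
        # An empty list means "do not filter on this column".
        if len(allowed) > 0:
            mask &= frame[column].isin(allowed)
    return frame[mask]


sample = pd.DataFrame({
    "FarolAging": ["Vermelho", "Verde", "Amarelo"],
    "Dias Pendentes": ["20 dias", "40 dias", "60 dias"],
    "Produto": ["Prod1", "Prod1", "Prod2"],
})

result = filter_frame(sample, {"FarolAging": ["Vermelho"], "Produto": []})
print(result)
```

Passing the selections as a dict keeps column names with spaces, such as "Dias Pendentes", usable as keys.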

Formula Cell Wont Update Openpyxl

I fill the cells I need, then I set the total formula. It works in one column, with plain numbers, but in the column with times (hh:mm:ss) the total cell is not updated. If I manually change a cell, it is then included in the total. I don't know why this happens.
Excel sheet:
The total cell of column G (Qt_horas) has the formula, but it is not applied via openpyxl.
The code:
df = pd.read_excel('Status Report Coagril 20-04-2022.xlsx', 'Acompanhamento Solics. Projeto')
planilha = load_workbook('Status Report Coagril 20-04-2022.xlsx', data_only=True)
ws = planilha['Acompanhamento Solics. Projeto']
start = 17
for i, nr_solic in enumerate(sols_filhas):
    query_sol = f"select nr_solicitacao, ds_assunto, st_solic, ds_status from solicservico_v2 a, statussolic b where a.nr_solicitacao = {nr_solic} and a.st_solic = b.cd_status"
    query = db.query(query_sol)
    # assign the variables that came from the database query
    nr_solicitacao = query['nr_solicitacao'].values[0]
    ds_assunto = query['ds_assunto'].values[0]
    ds_status = query['ds_status'].values[0]
    horas_realizadas = formata_horas(db.query(f'SELECT retorna_min_realizados_solic({nr_solicitacao})/60 minutos FROM DUAL')['minutos'].values[0])
    # check whether any of the cells being filled was wrongly merged with column E or F, and if so undo the merge
    e_merged = f'B{start+i}:E{start+i}'
    f_merged = f'B{start+i}:F{start+i}'
    lista = str(ws.merged_cells).split(' ')
    if e_merged in lista:
        ws.unmerge_cells(e_merged)
    if f_merged in lista:
        ws.unmerge_cells(f_merged)
    # insert a new row, unless it is the first record, because there is already a blank row
    # if i != 0:
    # ws.insert_rows(start+i)
    ws.insert_rows(start+i)
    # preenche_linha(ds_assunto, 'B', start+1, 'left', 'no_right', ws)
    # preenche_linha(None, 'C', start+1, 'left', 'no_left', ws)
    # preenche_linha(nr_solicitacao, 'D', start+1, 'center', 'total', ws)
    # preenche_linha(ds_status, 'E', start+1, 'center', 'total', ws)
    # preenche_linha(1, 'F', start+1, 'center', 'total', ws)
    # preenche_linha(horas_realizadas, 'G', start+1, 'center', 'total', ws)
    # preenche_linha('Maxicon', 'H', start+1, 'center', 'total', ws)
    ws[f'B{start+i}'] = ds_assunto
    ws[f'B{start+i}'].alignment = Alignment(horizontal='left')
    ws[f'B{start+i}'].border = aplica_borda_sem_direita()
    ws[f'C{start+i}'].border = aplica_borda_sem_esquerda()
    ws[f'D{start+i}'] = nr_solicitacao
    ws[f'D{start+i}'].alignment = Alignment(horizontal='center')
    ws[f'D{start+i}'].border = aplica_borda_total()
    ws[f'E{start+i}'] = ds_status
    ws[f'E{start+i}'].alignment = Alignment(horizontal='center')
    ws[f'E{start+i}'].border = aplica_borda_total()
    ws[f'F{start+i}'] = 1
    ws[f'F{start+i}'].alignment = Alignment(horizontal='center')
    ws[f'F{start+i}'].border = aplica_borda_total()
    ws[f'G{start+i}'].number_format = 'h:mm:ss'
    ws[f'G{start+i}'] = horas_realizadas
    ws[f'G{start+i}'].border = aplica_borda_total()
    ws[f'G{start+i}'].alignment = Alignment(horizontal='center')
    ws[f'H{start+i}'] = 'Maxicon'
    ws[f'H{start+i}'].alignment = Alignment(horizontal='center')
    ws[f'H{start+i}'].border = aplica_borda_total()
last_sol = len(sols_filhas) + start - 1
ws[f'F{last_sol+1}'] = f'=SUM(F{start}:F{last_sol})'
ws[f'G{last_sol+1}'].number_format = 'h:mm:ss'
ws[f'G{last_sol+1}'] = f'=SUM(G{start}:G{last_sol})'
ws[f'F{last_sol+1}'].alignment = Alignment(horizontal='center')
ws[f'G{last_sol+1}'].alignment = Alignment(horizontal='center')
planilha.save('teste.xlsx')
[1]: https://i.stack.imgur.com/e2xOV.png
I would say that even though you are setting the cell format to 'h:mm:ss', Excel does not recognise the cells as time values until a cell is updated. However, the cell format you're using will also produce an incorrect result. E.g. values that do not fit into 'normal time' are wrapped to 24-hour time, because the format assumes the highest value is 23:59:59. Anything larger, like 197:06:00 and 29:32:00, is going to be displayed as 5:06:00 and 5:32:00 respectively.
You should use the format
'[h]:mm:ss'
(where 'h' is in square brackets) to allow hours above 24.
Even with a different format you may still have the same issue summing the values with
'=SUM(GX:GY)'
If so, and you want or need to do the sum in Excel, it might work better to use the longhand form, like
'=SUM(GA+GB+GC+GD+...)'
Otherwise use Python to convert the values, e.g. change everything to seconds and sum as numerical data (Excel can then display it in the required format), or just do the summing in Python and write all the values to Excel.
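A minimal sketch of the "sum in Python" route, assuming the durations are available as 'h:mm:ss' strings (what formata_horas actually returns is not shown in the question, so that is a guess). The helper names to_seconds and to_hms are hypothetical.

```python
def to_seconds(text):
    """Convert an 'h:mm:ss' string, with hours allowed above 24, to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total_seconds):
    """Format a number of seconds as 'h:mm:ss' without wrapping at 24 hours."""
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


durations = ["197:06:00", "29:32:00"]
total = sum(to_seconds(d) for d in durations)
print(to_hms(total))   # 226:38:00
# Excel stores durations as fractions of a day, so this is the number
# that a cell formatted as '[h]:mm:ss' would need to hold.
print(total / 86400)
```

The modulo-24 wrap described above is easy to confirm the same way: 197 % 24 and 29 % 24 both give 5.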

How to set a value for an empty list

I am starting to learn to program using BeautifulSoup. What I want to achieve with this code is to save prices from different pages. To achieve this I store the prices of each page in a list and all those lists in a list. The problem is some pages do not save the prices so there are some lists that are completely empty. What I am looking for is that those empty lists are assigned the elements of the "ListaR" so that later I do not have problems. Here's my code:
from bs4 import BeautifulSoup
import requests
import pandas as pd
from decimal import Decimal
from typing import List
AppID = ['495570', '540190', '607210', '575780', '338840', '585830', '637330', '514360', '575760', '530540', '361890', '543170', '346500', '555930', '575700', '595780', '362400', '562360', '745670', '763360', '689360', '363610', '575770', '467310', '380560']
ListaPrecios = list()
ListaUrl = list() # <------- LIST
Blanco = [""]
ListaR = ["$0.00 USD", "$0.00 USD"]
for x in AppID: # <--------- For each AppID...
    #STR#
    url = "https://steamcommunity.com/market/search?category_753_Game%5B%5D=tag_app_"+x+"&category_753_cardborder%5B%5D=tag_cardborder_0&category_753_item_class%5B%5D=tag_item_class_2#p1_price_asc" # <------ Uses the AppID to build its market link
    ListaUrl += [url] # <---------- ADDS EACH LINK TO A LIST
PageCromos = [requests.get(x) for x in ListaUrl]
SoupCromos = [BeautifulSoup(x.content, "html.parser") for x in PageCromos]
PrecioCromos = [x.find_all("span", {"data-price": True}) for x in SoupCromos] # <--------- STORES LISTS INSIDE LISTS WITH THE MARKUP
min_CromoList = []
for item in PrecioCromos:
    CromoList = [float(i.text.strip('USD$')) for i in item]
    min_CromoList.append(min(CromoList)) # <---------------- List with the minimum card price of each game
print(min_CromoList)
Output:
ValueError: min() arg is an empty sequence
You can change this line
min_CromoList.append(min(CromoList))
to:
if not CromoList: # this will evaluate to True if the list is empty
    min_CromoList.append(min(ListaR))
else:
    min_CromoList.append(min(CromoList))
A neat feature of python is that empty lists evaluate to False and non-empty lists evaluate to True.
Since min(ListaR) will always evaluate to '$0.00 USD' it is probably neater to write this as:
if not CromoList:
    min_CromoList.append('$0.00 USD')
else:
    min_CromoList.append(min(CromoList))
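Another option, shown here only as a sketch, is the default argument of the built-in min(), which is returned when the sequence is empty. Using 0.0 instead of the string '$0.00 USD' is my own choice, made so that the result list holds floats throughout; price_lists is made-up sample data standing in for the scraped prices.

```python
# Hypothetical sample: one inner list per game, some of them empty.
price_lists = [[0.05, 0.03], [], [0.10]]

# min() falls back to the default instead of raising ValueError on an empty list.
lowest = [min(prices, default=0.0) for prices in price_lists]
print(lowest)  # [0.03, 0.0, 0.1]
```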

Generating a table with docx from a dataframe in python

Hello,
I'm currently working on a project in which I have to generate some output with the docx library in Python. I want to know how to generate a docx table from a dataframe so that the output contains all the columns and rows of the dataframe I've created. Here is my code, but it's not working correctly because I can't get the final output:
table = doc.add_table(rows = len(detalle_operaciones_total1), cols=5)
table.style = 'Table Grid'
table.rows[0].cells[0].text = 'Nombre'
table.rows[0].cells[1].text = 'Operacion Nro'
table.rows[0].cells[2].text = 'Producto'
table.rows[0].cells[3].text = 'Monto en moneda de origen'
table.rows[0].cells[4].text = 'Monto en moneda local'
for y in range(1, len(detalle_operaciones_total1)):
    Nombre = str(detalle_operaciones_total1.iloc[y,0])
    Operacion = str(detalle_operaciones_total1.iloc[y,1])
    Producto = str(detalle_operaciones_total1.iloc[y,2])
    Monto_en_MO = str(detalle_operaciones_total1.iloc[y,3])
    Monto_en_ML = str(detalle_operaciones_total1.iloc[y,4])
    table.rows[y].cells[0].text = Nombre
    table.rows[y].cells[1].text = Operacion
    table.rows[y].cells[2].text = Producto
    table.rows[y].cells[3].text = Monto_en_MO
    table.rows[y].cells[4].text = Monto_en_ML

Parsing a simple text file in python

I have the following text file, taken from a CSV file. The file is too long to be shown properly here, so here is how its lines are laid out:
The file has 5 lines:
The 1st one starts with ETIQUETAS
The 2nd one starts with RECURSOS
The 3rd one starts with DATOS CLIENTE Y PIEZA
The 4th one starts with Numero Referencia
The 5th and last one starts with BRIDA Al.
ETIQUETAS:;;;;;;;;;START;;;;;;;;;;;;;;;;;;;;;END;;
RECURSOS:;;;;;;;;;0;0;0;0;0;0;0;0;0;1;0;0;0;0;0;0;1;1;1;0;1;0;;Nota: 0
equivale a infinito, para decir que no existen recursos usar un numero
negativo DATOS CLIENTE Y PIEZA;;;;PLAZOS Y PROCESOS;;;;;;;;;;hoja
de ruta;MU;;;;;;;;;;;;;;;;; Numero Referencia;Descripcion
Referencia;Nombre Cliente;Codigo Cliente;PLAZO DE
ENTREGA;piezas;PROCESO;MATERIAL;stock;PROVEEDOR;tiempo ida
pulidor;pzas dia;TPO;tiempo vuelta pulidor;TIEMPO RECEPCION;CONTROL
CALIDAD DE ENTRADA;TIEMPO CONTROL CALIDAD DE ENTRADA;ALMACEN A (ANTES
DE ENTRAR
MAQUINA);GRANALLA;TPO;LIMPIADO;TPO;BRILLADO;TPO;;CARGA;MAQUINA;SOLTAR;control;EMPAQUETADO;ALMACENB;TIEMPO;
BRIDA Al;BRIDA Al;AEROGRAFICAS AHE,
S.A.;394;;;niquelado;aluminio;;;;matriz;;;5min;NO;;3dias;;;;;;;;1;1;1;;1;4D;;
I want to do two things:
1. Count the fields between START and END on the first line, both inclusive, and save the count as TOTAL_NUMBERS. This means that for START;;END it has to count 3: the START itself, the empty field between the two ;; and the END itself. In the example file, START;;;;;;;;;;;;;;;;;;;;;END, it has to count 22.
What I've tried so far:
f = open("lt.csv", 'r')
array = []
for line in f:
    if 'START' in line:
        for i in line.split(";"):
            array.append(i)
i = 0
while i < len(array):
    if i == 'START':
        # START COUNTING, I DONT KNOW HOW TO CONTINUE
    i = i + 1
2. Go through the file until the word PROVEEDOR appears, and save that word and the fields that follow it, TOTAL_NUMBERS entries in total (22 in the example), in an array.
This means it has to save:
final array = ['PROVEEDOR', 'tiempo ida pulidor', 'pzas dia', 'TPO', 'tiempo vuelta pulidor', 'TIEMPO RECEPCION', 'CONTROL CALIDAD DE ENTRADA', 'TIEMPO CONTROL CALIDAD DE ENTRADA', 'ALMACEN A (ANTES DE ENTRAR MAQUINA)', 'GRANALLA', 'TPO', 'LIMPIADO', 'TPO','BRILLADO','TPO','','CARGA', 'MAQUINA', 'SOLTAR', 'control', 'EMPAQUETADO', 'ALMACENB']
Thanks in advance.
I am assuming the file is split into two lines; the first line with START and END and then a long line which needs to be parsed. This should work:
with open('somefile.txt') as f:
    first_row = next(f).strip().split(';')
    TOTAL_NUMBER = len(first_row[first_row.index('START'):first_row.index('END')+1])
    bits = ''.join(line.rstrip() for line in f).split(';')
    final_array = bits[bits.index('PROVEEDOR'):bits.index('PROVEEDOR')+TOTAL_NUMBER]
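To try the same idea without a file on disk, here is a small self-contained sketch. It keeps the answer's assumption that everything after the first line can be joined into one long record; the function name parse_fields and the shortened sample text are invented for the example.

```python
import io


def parse_fields(handle):
    """Return (count from START to END inclusive, fields starting at PROVEEDOR)."""
    first_row = next(handle).strip().split(";")
    start, end = first_row.index("START"), first_row.index("END")
    total = end - start + 1
    # Join the remaining lines into one record, then split it into fields.
    bits = "".join(line.rstrip("\n") for line in handle).split(";")
    anchor = bits.index("PROVEEDOR")
    return total, bits[anchor:anchor + total]


# Shortened, made-up sample in the same shape as the real file.
sample = io.StringIO(
    "ETIQUETAS:;;START;;END;;\n"
    "Numero Referencia;stock;PROVEEDOR;tiempo ida pulidor;pzas dia;TPO\n"
)
total, fields = parse_fields(sample)
print(total)   # 3
print(fields)  # ['PROVEEDOR', 'tiempo ida pulidor', 'pzas dia']
```

If a field really is wrapped across two physical lines in the file, joining with an empty string glues its two halves together, so the file layout should be checked first.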
