Python help, reading and writing to a txt file - python

I have posted the relevant part of my code below. Before it there are only load functions, which I am pretty sure have no errors.
I am receiving this error:
IndexError: list index out of range (at the line namestaj["Naziv"] = deon[1])
Does anyone see anything out of order?
#load furniture from a txt file
def ucitajNamestaj():
    listaNamestaja = open("namestaj.txt", "r").readlines()
    namestaj = []
    for red in listaNamestaja:
        namestaj.append(stringToNamestaj(red))
    return namestaj
#String to Furniture, dictionary
def stringToNamestaj(red):
    namestaj = {}
    deon = red.strip().split("|")
    namestaj["Sifra"] = deon[0]
    namestaj["Naziv"] = deon[1]
    namestaj["Boja"] = deon[2]
    namestaj["Kolicina"] = int(deon[3])
    namestaj["Cena"] = float(deon[4])
    namestaj["Kategorija"] = deon[5]
    namestaj["Dostupan"] = deon[6]
    return namestaj

A couple of things first: always try to provide an MCVE (minimal, complete, verifiable example) and make sure you use SO's code formatting properly, otherwise your question is unreadable.
Now, what's probably happening is that your file has some empty lines and you're not skipping them. An empty line splits into a single empty string, so deon[1] does not exist. Try this:
def ucitajNamestaj():
    listaNamestaja = open("namestaj.txt", "r").readlines()
    namestaj = []
    for red in listaNamestaja:
        if red.strip() == "":
            continue
        namestaj.append(stringToNamestaj(red))
    return namestaj
def stringToNamestaj(red):
    namestaj = {}
    deon = red.strip().split("|")
    namestaj["Sifra"] = deon[0]
    namestaj["Naziv"] = deon[1]
    namestaj["Boja"] = deon[2]
    namestaj["Kolicina"] = int(deon[3])
    namestaj["Cena"] = float(deon[4])
    namestaj["Kategorija"] = deon[5]
    namestaj["Dostupan"] = deon[6]
    return namestaj
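If the file can also contain lines with too few fields (not only blank ones), a slightly more defensive variant may help locate the bad line. The following is only a sketch: the function name parse_lines, the FIELDS list, and the sample data are assumptions made for illustration, and it reads from an in-memory buffer instead of namestaj.txt.

```python
import io

# Field names taken from the question; the order is assumed to match the file.
FIELDS = ["Sifra", "Naziv", "Boja", "Kolicina", "Cena", "Kategorija", "Dostupan"]

def parse_lines(lines):
    """Parse pipe-separated lines, skipping blanks and reporting bad rows."""
    items = []
    for line_number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # blank line: nothing to parse
        parts = text.split("|")
        if len(parts) != len(FIELDS):
            raise ValueError("line {}: expected {} fields, got {}".format(
                line_number, len(FIELDS), len(parts)))
        item = dict(zip(FIELDS, parts))
        item["Kolicina"] = int(item["Kolicina"])
        item["Cena"] = float(item["Cena"])
        items.append(item)
    return items

# Hypothetical sample data with a blank line in the middle.
sample = io.StringIO(
    "1|Sto|braon|3|99.5|Kuhinja|True\n"
    "\n"
    "2|Stolica|crna|10|25.0|Kuhinja|True\n"
)
items = parse_lines(sample)
```

With a real file, passing the open file object to parse_lines should work the same way, since it only iterates over lines.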

Related

Issue with reading from txt using csv

I'm working on a project and I'm stuck: when I read the columns nrwolek, nrwolbiz and nrwolpr from the file, instead of getting [1E, 2E, 3E, 4E, 5E], [1B,2B,3B,4B,5B], [1P, 2P, 3P] I get nrwolek = 1E, nrwolbiz = 2E, nrwolpr = 3E. It seems that it doesn't read a whole list but only single elements from it. Is there a way to correct this? Or would it be better to solve this with JSON? The reading code:
import csv
from lot import DatabaseofLoty, Lot
def read_from_csv(path):
    loty = []
    with open(path,"r") as file_handle:
        reader = csv.DictReader(file_handle)
        for row in reader:
            numer_lotu = row["numer_lotu"]
            id_samolotu = row["id_samolotu"]
            czas_lotu = row['czas_lotu']
            trasa = row['trasa']
            wolne_miejscaek = row['wolne_miejscaek']
            wolne_miejscabiz = row['wolne_miejscabiz']
            wolne_miejscapr = row['wolne_miejscapr']
            bramka = row['bramka']
            cenaek = row['cenaek']
            cenabiz = row['cenabiz']
            cenapr = row['cenapr']
            nrwolek = row['nrwolek']
            nrwolbiz = row['nrwolbiz']
            nrwolpr = row['nrwolpr']
            lot = Lot(numer_lotu, id_samolotu, czas_lotu, trasa,
                      wolne_miejscaek,
                      wolne_miejscabiz, wolne_miejscapr, bramka,
                      cenaek, cenabiz, cenapr)
            loty.append(lot)
    database = DatabaseofLoty(loty)
    return database
print(read_from_csv("loty.txt"))
Text file:
numer_lotu,id_samolotu,czas_lotu,trasa,wolne_miejscaek,wolne_miejscabiz,wolne_miejscapr,bramka,cenaek,cenabiz,cenapr,nrwolek,nrwolbiz,nrwolpr
1,3,3:52,Amsterdam-Berlin,129,92,192,8,52,68,75, [1E, 2E, 3E, 4E, 5E], [1B,2B,3B,4B,5B], [1P, 2P, 3P]
2,3,3:52,Tokio-Berlin,129,92,192,8,580,720,1234
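The commas inside the unquoted brackets are treated as ordinary field separators, which is why each column receives only one element of the list. One possible way around this, sketched below under the assumption that the file format can be changed, is to quote the list columns and use a different separator inside them. The ";" separator, the helper names, and the shortened sample are illustrative choices, not part of the original file.

```python
import csv
import io

# Hypothetical sample: list columns are quoted and use ";" internally,
# so the comma-based parser keeps each list in a single field.
SAMPLE = (
    "numer_lotu,trasa,nrwolek,nrwolbiz\n"
    '1,Amsterdam-Berlin,"1E;2E;3E","1B;2B"\n'
    "2,Tokio-Berlin,,\n"
)

def parse_seat_list(field):
    """Turn '1E;2E;3E' into ['1E', '2E', '3E']; empty or missing gives []."""
    if not field:
        return []
    return [seat.strip() for seat in field.split(";") if seat.strip()]

def read_rows(handle):
    """Read rows as dicts and convert the seat-list columns to Python lists."""
    rows = []
    for row in csv.DictReader(handle):
        row["nrwolek"] = parse_seat_list(row.get("nrwolek"))
        row["nrwolbiz"] = parse_seat_list(row.get("nrwolbiz"))
        rows.append(row)
    return rows

rows = read_rows(io.StringIO(SAMPLE))
```

Storing each record as JSON, as the question suggests, would be another reasonable option, since JSON represents nested lists directly.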

empty files in python

I am trying to create a file with all the magnetic information in my lists. I've used this code before as well, but for some reason the file it produces is empty. I'm not sure why.
Here is my code:
magnetosheath_Bx = JSS_Bx[12339:13795]
magnetosheath_By = JSS_By[12339:13795]
magnetosheath_Bz = JSS_Bz[12339:13795]
magnetosheath_B = JSS_Bmag[12339:13795]
magnetosheath_time = epochtime_magdata[12339:13795]
magnetosheath_r = new_RJSE[12339:13795]
Magnetosheath_data = zip(magnetosheath_Bx, magnetosheath_By, magnetosheath_Bz, magnetosheath_B, magnetosheath_time, magnetosheath_r)
filenew= open('Ulysses_Magnetoseath.txt' , 'w')
filenew.write("hello")
for magnetic_data in Magnetosheath_data:
    filenew.write('{} {} {} {} {} {}\n'.format(magnetic_data[0], magnetic_data[1], magnetic_data[2], magnetic_data[3],magnetic_data[4],magnetic_data[5] ))
filenew.write("hello")
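One likely cause, assuming the file is inspected while the interpreter session is still running, is that the file is never closed, so the buffered writes have not reached the disk yet. A minimal sketch of the same write loop using a with-block is shown below. The short lists and the temporary output path are stand-ins, because the real data (JSS_Bx and the others) is not shown in the question.

```python
import os
import tempfile

# Stand-in data; the real lists are not part of the question.
bx = [1.0, 2.0, 3.0]
by = [0.1, 0.2, 0.3]
times = [10, 20, 30]

# Hypothetical output location, used only for this demonstration.
out_path = os.path.join(tempfile.mkdtemp(), "magnetosheath_demo.txt")

# Leaving the with-block closes the file, which flushes buffered writes to disk.
with open(out_path, "w") as out_file:
    for row in zip(bx, by, times):
        out_file.write("{} {} {}\n".format(*row))

# Read the file back to confirm that the rows were written.
with open(out_path) as check:
    lines = check.read().splitlines()
```

Calling filenew.close() after the last write in the original code should have the same effect.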

How should I Execute this Python Script in powershell

I've solved the problem. It was related to my %PATH%.
I have a script which works with one argument. In PowerShell I tried the command below:
.\dsrf2csv.py C:\Python27\a\DSR_testdata.tsv.gz
Here is the relevant part of the script:
def __init__(self, dsrf2csv_arg):
    self.dsrf_filename = dsrf2csv_arg
    dsrf_path, filename = os.path.split(self.dsrf_filename)
    self.report_outfilename = os.path.join(dsrf_path, filename.replace('DSR', 'Report').replace('tsv', 'csv'))
    self.summary_outfilename = os.path.join(dsrf_path, filename.replace('DSR', 'Summary').replace('tsv.gz', 'csv'))
But when I try to run this script, nothing happens. How should I run this script with a file (for example testdata.tsv.gz)?
Note: the script and the file are in the same location.
Full script:
import argparse
import atexit
import collections
import csv
import gzip
import os
SKIP_ROWS = ['HEAD', '#HEAD', '#SY02', '#SY03', '#AS01', '#MW01', '#RU01',
             '#SU03', '#LI01', '#FOOT']
REPORT_HEAD = ['Asset_ID', 'Asset_Title', 'Asset_Artist', 'Asset_ISRC',
               'MW_Asset_ID', 'MW_Title', 'MW_ISWC', 'MW_Custom_ID',
               'MW_Writers', 'Views', 'Owner_name', 'Ownership_Claim',
               'Gross_Revenue', 'Amount_Payable', 'Video_IDs', 'Video_views']
SUMMARY_HEAD = ['SummaryRecordId', 'DistributionChannel',
                'DistributionChannelDPID', 'CommercialModel', 'UseType',
                'Territory', 'ServiceDescription', 'Usages', 'Users',
                'Currency', 'NetRevenue', 'RightsController',
                'RightsControllerPartyId', 'AllocatedUsages', 'AmountPayable',
                'AllocatedNetRevenue']
class DsrfConverter(object):
    """Converts DSRF 3.0 to YouTube CSV."""
    def __init__(self, dsrf2csv_arg):
        """ Creating output file names """
        self.dsrf_filename = dsrf2csv_arg
        dsrf_path, filename = os.path.split(self.dsrf_filename)
        print(dsrf_filename)
        input("Press Enter to continue...")
        self.report_outfilename = os.path.join(dsrf_path, filename.replace(
            'DSR', 'Report').replace('tsv', 'csv'))
        self.summary_outfilename = os.path.join(dsrf_path, filename.replace(
            'DSR', 'Summary').replace('tsv.gz', 'csv'))
    def parse_blocks(self, reader):
        """Generator for parsing all the blocks from the file.
        Args:
          reader: the handler of the input file
        Yields:
          block_lines: A full block as a list of rows.
        """
        block_lines = []
        current_block = None
        for line in reader:
            if line[0] in SKIP_ROWS:
                continue
            # Exit condition
            if line[0] == 'FOOT':
                yield block_lines
                raise StopIteration()
            line_block_number = int(line[1])
            if current_block is None:
                # Initialize
                current_block = line_block_number
            if line_block_number > current_block:
                # End of block, yield and build a new one
                yield block_lines
                block_lines = []
                current_block = line_block_number
            block_lines.append(line)
        # Also return last block
        yield block_lines
    def process_single_block(self, block):
        """Handles a single block in the DSR report.
        Args:
          block: Block as a list of lines.
        Returns:
          (summary_rows, report_row) tuple.
        """
        views = 0
        gross_revenue = 0
        summary_rows = []
        owners_data = {}
        # Create an ordered dictionary with a key for every column.
        report_row_dict = collections.OrderedDict(
            [(column_name.lower(), '') for column_name in REPORT_HEAD])
        for line in block:
            if line[0] == 'SY02': # Save the financial Summary
                summary_rows.append(line[1:])
                continue
            if line[0] == 'AS01': # Sound Recording information
                report_row_dict['asset_id'] = line[3]
                report_row_dict['asset_title'] = line[5]
                report_row_dict['asset_artist'] = line[7]
                report_row_dict['asset_isrc'] = line[4]
            if line[0] == 'MW01': # Composition information
                report_row_dict['mw_asset_id'] = line[2]
                report_row_dict['mw_title'] = line[4]
                report_row_dict['mw_iswc'] = line[3]
                report_row_dict['mw_writers'] = line[6]
            if line[0] == 'RU01': # Video level information
                report_row_dict['video_ids'] = line[3]
                report_row_dict['video_views'] = line[4]
            if line[0] == 'SU03': # Usage data of Sound Recording Asset
                # Summing up views and revenues for each sub-period
                views += int(line[5])
                gross_revenue += float(line[6])
                report_row_dict['views'] = views
                report_row_dict['gross_revenue'] = gross_revenue
            if line[0] == 'LI01': # Ownership information
                # if we already have parsed a LI01 line with that owner
                if line[3] in owners_data:
                    # keep only the latest ownership
                    owners_data[line[3]]['ownership'] = line[6]
                    owners_data[line[3]]['amount_payable'] += float(line[9])
                else:
                    # need to create the entry for that owner
                    data_dict = {'custom_id': line[5],
                                 'ownership': line[6],
                                 'amount_payable': float(line[9])}
                    owners_data[line[3]] = data_dict
        # get rid of owners which do not have an ownership or an amount payable
        owners_to_write = [o for o in owners_data
                           if (owners_data[o]['ownership'] > 0
                               and owners_data[o]['amount_payable'] > 0)]
        report_row_dict['owner_name'] = '|'.join(owners_to_write)
        report_row_dict['mw_custom_id'] = '|'.join([owners_data[o]
                                                    ['custom_id']
                                                    for o in owners_to_write])
        report_row_dict['ownership_claim'] = '|'.join([owners_data[o]
                                                       ['ownership']
                                                       for o in owners_to_write])
        report_row_dict['amount_payable'] = '|'.join([str(owners_data[o]
                                                          ['amount_payable'])
                                                      for o in owners_to_write])
        # Sanity check. The number of values must match the number of columns.
        assert len(report_row_dict) == len(REPORT_HEAD), 'Row is wrong size :/'
        return summary_rows, report_row_dict
    def run(self):
        finished = False
        def removeFiles():
            if not finished:
                os.unlink(self.report_outfilename)
                os.unlink(self.summary_outfilename)
        atexit.register(removeFiles)
        with gzip.open(self.dsrf_filename, 'rb') as dsr_file, gzip.open(
                self.report_outfilename, 'wb') as report_file, open(
                self.summary_outfilename, 'wb') as summary_file:
            dsr_reader = csv.reader(dsr_file, delimiter='\t')
            report_writer = csv.writer(report_file)
            summary_writer = csv.writer(summary_file)
            report_writer.writerow(REPORT_HEAD)
            summary_writer.writerow(SUMMARY_HEAD)
            for block in self.parse_blocks(dsr_reader):
                summary_rows, report_row = self.process_single_block(block)
                report_writer.writerow(report_row.values())
                summary_writer.writerows(summary_rows)
        finished = True
if __name__ == '__main__':
    arg_parser = argparse.ArgumentParser(
        description='Converts DDEX DSRF UGC profile reports to Standard CSV.')
    required_args = arg_parser.add_argument_group('Required arguments')
    required_args.add_argument('dsrf2csv_arg', type=str)
    args = arg_parser.parse_args()
    dsrf_converter = DsrfConverter(args.dsrf2csv_arg)
    dsrf_converter.run()
In general, executing a Python script in PowerShell like this, .\script.py, has two requirements:
Add the path to the Python binaries to your %PATH%: $env:Path = $env:Path + ";C:\Path\to\python\binaries\"
Add the extension .py to the PATHEXT environment variable: $env:PATHEXT += ";.PY"
The latter only applies to the current PowerShell session. If you want it in all future PowerShell sessions, add this line to your PowerShell profile (e.g. notepad $profile).
In your case there is also an issue with the Python script you are trying to execute. def __init__(self) is a constructor for a class, like:
class Foo:
    def __init__(self):
        print "foo"
Did you give us your complete script?
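For reference, the filename handling in the question's constructor can be checked in isolation. The sketch below mirrors its replace() chain; the function name output_names is made up for illustration, and ntpath is used instead of os.path only so that the Windows path is split the same way on any operating system.

```python
import ntpath  # Windows path rules on any OS, so the demo behaves the same everywhere

def output_names(dsrf_filename):
    """Mirror the replace() chain from the question's __init__."""
    dsrf_path, filename = ntpath.split(dsrf_filename)
    report = ntpath.join(
        dsrf_path, filename.replace("DSR", "Report").replace("tsv", "csv"))
    summary = ntpath.join(
        dsrf_path, filename.replace("DSR", "Summary").replace("tsv.gz", "csv"))
    return report, summary

# The argument from the PowerShell command in the question.
report_name, summary_name = output_names(r"C:\Python27\a\DSR_testdata.tsv.gz")
```

Note that the report name keeps its .gz suffix while the summary name does not, which matches how run() opens the report with gzip.open and the summary with plain open.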

Having trouble parsing a .CSV file into a dict

I've done some simple .csv parsing in python but have a new file structure that's giving me trouble. The input file is from a spreadsheet converted into a .CSV file. Here is an example of the input:
Layout
Each set can have many layouts, and each layout can have many layers. Each layer has only one layer and name.
Here is the code I am using to parse it in. I suspect it's a logic/flow control problem because I've parsed things in before, just not this deep. The first header row is skipped via code. Any help appreciated!
import csv
import pprint
def import_layouts_schema(layouts_schema_file_name = 'C:\\layouts\\LAYOUT1.csv'):
    class set_template:
        def __init__(self):
            self.set_name =''
            self.layout_name =''
            self.layer_name =''
            self.obj_name =''
    def check_layout(st, row, layouts_schema):
        c=0
        if st.layout_name == '':
            st.layer_name = row[c+2]
            st.obj_name = row[c+3]
            layer = {st.layer_name : st.obj_name}
            layout = {st.layout_name : layer}
            layouts_schema.update({st.set_name : layout})
        else:
            st.layout_name = row[c+1]
            st.layer_name = row[c+2]
            st.obj_name = row[c+3]
            layer = {st.layer_name : st.obj_name}
            layout = {st.layout_name : layer}
            layouts_schema.update({st.set_name : layout})
        return layouts_schema
    def layouts_schema_parsing(obj_list_raw1): #, location_categories, image_schema, set_location):
        #------ init -----------------------------------
        skipfirst = True
        c = 0
        firstrow = True
        layouts_schema = {}
        end_flag = ''
        st = set_template()
        #---------- start parsing here -----------------
        print('Now parsing layouts schema list')
        for row in obj_list_raw1:
            #print ('This is the row: ', row)
            if skipfirst==True:
                skipfirst=False
                continue
            if row[c] != '':
                st.set_name = row[c]
                st.layout_name = row[c+1]
                st.layer_name = row[c+2]
                st.obj_name = row[c+3]
                print('FOUND A NEW SET. SET details below:')
                print('Set name:', st.set_name, 'Layout name:', st.layout_name, 'Layer name:', st.layer_name, 'Object name:', st.obj_name)
                if firstrow == True:
                    print('First row of layouts import!')
                    layer = {st.layer_name : st.obj_name}
                    layout = {st.layout_name : layer}
                    layouts_schema = {st.set_name : layout}
                    firstrow = False
                    check_layout(st, row, layouts_schema)
                    continue
                elif firstrow == False:
                    print('Not the first row of layout import')
                    layer = {st.layer_name : st.obj_name}
                    layout = {st.layout_name : layer}
                    layouts_schema.update({st.set_name : layout})
                    check_layout(st, row, layouts_schema)
        return layouts_schema
    #begin subroutine main
    layouts_schema_file_name ='C:\\Users\\jason\\Documents\\RAY\\layout_schemas\\ANIBOT_LAYOUTS_SCHEMA.csv'
    full_path_to_file = layouts_schema_file_name
    print('============ Importing LAYOUTS schema from: ', full_path_to_file , ' ==============')
    openfile = open(full_path_to_file)
    reader_ob = csv.reader(openfile)
    layout_list_raw1 = list(reader_ob)
    layouts_schema = layouts_schema_parsing(layout_list_raw1)
    print('=========== End of layouts schema import =========')
    return layouts_schema
layouts_schema = import_layouts_schema()
Feel free to throw away any part that doesn't work. I suspect I've gotten inside my own head a little bit here. A for loop or another while loop may do the trick. Ultimately I just want to parse the file into a dict with the same key structure shown, i.e. the final dict's first entry would look like:
{'RESTAURANT': {'RR_FACING1': {'BACKDROP': 'restaurant1'}}}
And the rest follows from there. Ultimately I am going to use this key structure and the dict for other purposes. I just can't get the parsing down!
Wow, that's a lot of code!
Maybe try something simpler:
with open('file.csv') as f:
    # strip the trailing newline so the last key and value stay clean
    keys = f.readline().rstrip('\n').split(';') # assuming ";" is your csv fields separator
    for line in f:
        vals = line.rstrip('\n').split(';')
        d = dict(zip(keys, vals))
        print(d)
Then either make a better data file (without blanks), or have the parser remember the previous values.
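The second option, a parser that remembers the previous values, could look roughly like the sketch below. The column names, the comma separator, and the sample rows are assumptions, since the original input is only described and not shown; blank SET or LAYOUT cells are taken to mean "same as the row above".

```python
import csv
import io

# Hypothetical sample mirroring the described layout.
SAMPLE = (
    "SET,LAYOUT,LAYER,OBJECT\n"
    "RESTAURANT,RR_FACING1,BACKDROP,restaurant1\n"
    ",,TABLE,table1\n"
    ",RR_FACING2,BACKDROP,restaurant2\n"
    "KITCHEN,K_FACING1,BACKDROP,kitchen1\n"
)

def parse_layouts(handle):
    """Build {set: {layout: {layer: object}}}, carrying blank cells forward."""
    schema = {}
    current_set = ""
    current_layout = ""
    reader = csv.reader(handle)
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) < 4:
            continue  # ignore short or empty rows
        set_name, layout_name, layer_name, obj_name = (
            cell.strip() for cell in row[:4])
        if set_name:
            current_set = set_name
        if layout_name:
            current_layout = layout_name
        # setdefault creates the nested dicts on first use and reuses them after
        layers = schema.setdefault(current_set, {}).setdefault(current_layout, {})
        layers[layer_name] = obj_name
    return schema

schema = parse_layouts(io.StringIO(SAMPLE))
```

Using setdefault rather than update keeps earlier layouts and layers of the same set instead of overwriting them, which update({set_name: layout}) would do.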
While I agree with @AK47 that the Code Review site may be a better fit, I have received so much help from SO that I'll try to give back a little. IMHO you are overthinking the problem. Below is an approach that should get you going in the right direction and doesn't even require converting from Excel to CSV (I like the xlrd module, it's very easy to use). If you already have a CSV, just swap out the loop in the process_sheet() function. Basically, I store the last value seen for "SET" and "LAYOUT", and whenever a cell holds a new, non-empty value, I update it. Hope that helps. And yes, you should think about a better data structure (redundancy is not always bad if it lets you avoid empty cells :-) ).
import xlrd
def process_sheet(sheet : xlrd.sheet.Sheet):
    curr_set = ''
    curr_layout = ''
    for rownum in range(1, sheet.nrows):
        row = sheet.row(rownum)
        set_val = row[0].value.strip()
        layout_val = row[1].value.strip()
        if set_val != '' and set_val != curr_set:
            curr_set = set_val
        if layout_val != '' and layout_val != curr_layout:
            curr_layout = layout_val
        result = {curr_set: {curr_layout: {row[2].value: row[3].value}}}
        print(repr(result))
def main():
    # open a workbook (adapt your filename)
    # then get the first sheet (index 0)
    # and call the process function
    wbook = xlrd.open_workbook('/tmp/test.xlsx')
    sheet = wbook.sheet_by_index(0)
    process_sheet(sheet)
if __name__ == '__main__':
    main()

Working with Python and pickle to save complex data in object

I am experimenting with Python to write a script for a program that is scriptable in Python, and I need to save an object (with custom classes and arrays inside) to a file so that I can read it back afterwards (so that I don't have to rebuild the object every time, which takes hours).
I was reading in many forums that the easiest way to do that is to use pickle, but I am making a mistake in some place and I don't understand where...
Now, the code would be:
First I define this class:
class Issue_class:
    Title_ID = None
    Publisher_ID = None
    Imprint_ID = None
    Volume = None
    Format = None
    Color = None
    Original = None
    Rating = None
    Issue_Date_Month = None
    Issue_Date_Year = None
    Reprint = None
    Pages = None
    Issue_Title = None
    Number = None
    Number_str = None
    Synopsis = None
    Characters_ID = None
    Groups_ID = None
    Writer_ID = None
    Inker_ID = None
    Colorist_ID = None
    Letterer_ID = None
    CoverArtist_ID = None
    Penciller_ID = None
    Editor_ID = None
    Alternatives_ID = None
    Reprints_ID = None
    Story_ID = None
    Multi = None
    Multistories = None
then I define a list/array for this class:
Issuesdata = []
then during a loop I fill and append these to the list:
Issuedata = Issue_class()
Issuedata.Color = "unknown"
Issuedata.Tagline = "none"
Issuedata.Synopsis = "none"
Issuedata.Format = "none"
Issuedata.Publisher_ID = "none"
Issuedata.Imprint_ID = -1
Issuedata.Title_ID = -1
Issuedata.Volume = "none"
Issuedata.Number = -1
Issuedata.Number_str = "none"
Issuedata.Issue_Title = "none"
Issuedata.Rating = -1
Issuedata.Pages = -1
Issuedata.Issue_Date_Year = 0
Issuedata.Issue_Date_Month = 0
Issuedata.Original = True
Issuedata.Reprint = False
Issuedata.Multi= True
Issuedata.Letterer_ID = []
Issuedata.Characters_ID = []
Issuedata.Story_ID = []
Issuedata.Groups_ID = []
Issuedata.Writer_ID = []
Issuedata.Penciller_ID = []
Issuedata.Alternatives_ID = []
Issuedata.Reprints_ID = []
Issuedata.Inker_ID = []
Issuedata.Colorist_ID = []
Issuedata.Editor_ID = []
Issuedata.CoverArtist_ID = []
Issuedata.Multistories = []
Then I work with the data inside the object, and when it is complete, I append it to the list:
Issuesdata.append(Issuedata)
After that I print some info inside one of the objects in the list to be sure everything is ok:
print Issuesdata[3].Title_ID
print Issuesdata[3].Publisher_ID
print Issuesdata[3].Imprint_ID
print Issuesdata[3].Volume
print Issuesdata[3].Format
etc...
And everything is ok, the printed data is perfect
Now, I try to save the list to a file with:
filehandler = open("data.dat","wb")
pickle.dump(Issuesdata,filehandler)
filehandler.close()
This creates the file with info inside, but when I try to read it with:
file = open("data.dat",'rb')
Issuesdat = pickle.load(file)
file.close()
The Python console tells me "'module' object has no attribute 'Issue_class'"
The first thing I thought was that I was reading the file wrong... But then I open the saved file with notepad and inside it it was full of "wrong data", like name of files or name of classes outside the code... which makes me suspect I am dumping the data wrong in the file...
Am I using pickle wrong?
OK, I found the problem. It seems the class of your object has to be defined in the main module for pickle to see it. I had it defined in the module I was working in, the one that calls the pickle commands.
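Pickle stores instances by reference to their class (module and class name), not the class definition itself, so the class must be importable under the same name when the file is loaded. One way to sidestep that lookup, offered here as a sketch rather than the fix described above, is to pickle plain dictionaries of attributes. The Issue class below is a cut-down, hypothetical stand-in for Issue_class, and the data is pickled to bytes in memory instead of data.dat.

```python
import pickle

class Issue:
    """Cut-down, hypothetical stand-in for Issue_class from the question."""
    def __init__(self, title_id=-1, volume="none"):
        self.Title_ID = title_id
        self.Volume = volume
        self.Writer_ID = []

issues = [Issue(1, "v1"), Issue(2, "v2")]
issues[0].Writer_ID.append(42)

# vars(obj) returns the instance's attribute dict; pickling plain dicts stores
# no reference to the class, so loading does not need to import it.
payload = pickle.dumps([vars(issue) for issue in issues])

# Rebuild the objects from the stored attribute dicts.
restored = []
for attrs in pickle.loads(payload):
    issue = Issue()
    issue.__dict__.update(attrs)
    restored.append(issue)
```

The trade-off is that the loading code has to rebuild the objects itself, but the saved file no longer depends on where the class is defined.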
Try using the pandas library, which has simple functions for this:
DataFrame.to_pickle(file-path) to save a pandas DataFrame as a pickle.
pandas.read_pickle(file-path) to read a pickle file.
See the pandas reference for to_pickle and read_pickle.
