I'm a beginner. I wrote some code that writes certain information to a .txt file in table format, and after finishing it I wondered whether I could simplify this big block of code. So is it better to put everything inside one function, or to separate them? Or is there a better way to do it?
def adicionar(quantidade, animal, reprodu, tempera, estrutura, classe):
    arquivo = open('Banco_de_Dados_Animais.txt', 'a', encoding='utf8')
    arquivo.write(f'\n {quantidade}')
    arquivo.write('|'.rjust(7 - len(quantidade)))
    arquivo.write(f'{animal}'.rjust(len(animal) + 8))
    arquivo.write('|'.rjust(15 - len(animal)))
    arquivo.write(f'{reprodu}'.rjust(len(reprodu) + 4))
Or this way, for example:
def add_quantidade(quantidade):
    arquivo = open('Animais_Banco_de_Dados.txt', 'a', encoding='utf8')
    arquivo.write(f'\n {quantidade}')
    arquivo.write(''.rjust(7 - len(quantidade)))
    arquivo.close()

def add_animal(animal):
    arquivo = open('Animais_Banco_de_Dados.txt', 'a', encoding='utf8')
    arquivo.write(f' {animal}'.rjust(len(animal) + 8))
    arquivo.write('|'.rjust(15 - len(animal)))
    arquivo.close()
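Neither extreme is required; a common middle ground is one small helper that formats a whole row plus one function that touches the file. The sketch below is only an illustration of that idea, with made-up column widths and a with-statement so the file is always closed:

    def formatar_linha(quantidade, animal, reprodu):
        # Fixed column widths are an assumption here; adjust them to the real table layout.
        return f'{quantidade:<7}|{animal:<15}|{reprodu:<10}'

    def adicionar(quantidade, animal, reprodu):
        # The with-statement closes the file even if a write fails.
        with open('Banco_de_Dados_Animais.txt', 'a', encoding='utf8') as arquivo:
            arquivo.write('\n' + formatar_linha(quantidade, animal, reprodu))

This keeps the column layout in one place while each function still has a single job.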
I have written a piece of code that compares data from two CSVs and writes the final output to a new CSV. The problem is that, except for the header, nothing else is being written to the CSV. Below is my code:
import csv

data_3B = open('3B_processed.csv', 'r')
reader_3B = csv.DictReader(data_3B)
data_2A = open('2A_processed.csv', 'r')
reader_2A = csv.DictReader(data_2A)

l_3B_2A = [["taxable_entity_id", "return_period", "3B", "2A"]]

for row_3B in reader_3B:
    for row_2A in reader_2A:
        if row_3B["taxable_entity_id"] == row_2A["taxable_entity_id"] and row_3B["return_period"] == row_2A["return_period"]:
            l_3B_2A.append([row_3B["taxable_entity_id"], row_3B["return_period"], row_3B["total"], row_2A["total"]])

with open("3Bvs2A_new.csv", "w") as csv_file:
    writer = csv.writer(csv_file)
    writer.writerows(l_3B_2A)
csv_file.close()
How do I solve this?
Edit:
2A_processed.csv sample:
taxable_entity_id,return_period,total
2d9cc638-5ed0-410f-9a76-422e32f34779,072019,0
2d9cc638-5ed0-410f-9a76-422e32f34779,062019,0
2d9cc638-5ed0-410f-9a76-422e32f34779,082019,0
e5091f99-e725-44bc-b018-0843953a8771,082019,0
e5091f99-e725-44bc-b018-0843953a8771,052019,41711.5
920da7ba-19c7-45ce-ba59-3aa19a6cb7f0,032019,2862.94
410ecd0f-ea0f-4a36-8fa6-9488ba3c095b,082018,48253.9
3B_processed.csv sample:
taxable_entity_id,return_period,total
1e5ccfbc-a03e-429e-b79a-68041b69dfb0,072017,0.0
1e5ccfbc-a03e-429e-b79a-68041b69dfb0,082017,0.0
1e5ccfbc-a03e-429e-b79a-68041b69dfb0,092017,0.0
f7d52d1f-00a5-440d-9e76-cb7fbf1afde3,122017,0.0
1b9afebb-495d-4516-96bd-1e21138268b7,072017,146500.0
1b9afebb-495d-4516-96bd-1e21138268b7,082017,251710.0
The csv.DictReader objects in your code can only read through the file once, because they are reading from file objects (created with open). Therefore, the second and subsequent times through the outer loop, the inner loop does not run, because there are no more row_2A values in reader_2A - the reader is at the end of the file after the first time.
The simplest fix is to read each file into a list first. We can make a helper function to handle this, and also ensure the files are closed properly:
def lines_of_csv(filename):
    with open(filename) as source:
        return list(csv.DictReader(source))

reader_3B = lines_of_csv('3B_processed.csv')
reader_2A = lines_of_csv('2A_processed.csv')
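With both files loaded into lists, the rest of the original code can stay essentially the same, because a list (unlike a file-backed reader) can be iterated over as many times as needed. A minimal sketch of the remaining steps, reusing the names from the question (newline='' is the csv module's recommendation when writing on Python 3):

    import csv  # already imported in the original code

    l_3B_2A = [["taxable_entity_id", "return_period", "3B", "2A"]]

    for row_3B in reader_3B:
        for row_2A in reader_2A:  # now a list, so it starts from the beginning on every outer pass
            if (row_3B["taxable_entity_id"] == row_2A["taxable_entity_id"]
                    and row_3B["return_period"] == row_2A["return_period"]):
                l_3B_2A.append([row_3B["taxable_entity_id"], row_3B["return_period"],
                                row_3B["total"], row_2A["total"]])

    with open("3Bvs2A_new.csv", "w", newline="") as csv_file:
        csv.writer(csv_file).writerows(l_3B_2A)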
I put your code into a file test.py and created test files to simulate your csvs.
$ python3 ./test.py
$ cat ./3Bvs2A_new.csv
taxable_entity_id,return_period,3B,2A
1,2,3,2
$ cat ./3B_processed.csv
total,taxable_entity_id,return_period,3B,2A
3,1,2,3,4
3,4,3,2,1
$ cat ./2A_processed.csv
taxable_entity_id,return_period,2A,3B,total
1,2,3,4,2
4,3,2,1,2
So, as you can see, the order of the columns doesn't matter, since they are accessed by name through the DictReader, and if the first row is a match your code works. But there are no rows left in the second CSV file after processing the first row from the first file. I suggest building a dictionary keyed by (taxable_entity_id, return_period) tuples: process the first CSV file by adding the totals to the dict, then run through the second one and look them up.
row_lookup = {}
for row in first_csv:
    row_lookup[(row['taxable_entity_id'], row['return_period'])] = row['total']

for row in second_csv:
    if (row['taxable_entity_id'], row['return_period']) in row_lookup.keys():
        newRow = [row['taxable_entity_id'], row['return_period'], row['total'],
                  row_lookup[(row['taxable_entity_id'], row['return_period'])]]
Of course that only works if pairs of taxable_entity_ids and return_periods are always unique... Hard to say exactly what you should do without knowing the exact nature of your task and full format of your csvs.
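Completing that sketch: collect the matched rows into a list and write them out. This assumes first_csv holds the rows of 3B_processed.csv and second_csv the rows of 2A_processed.csv (each already read into a list of dicts, e.g. with csv.DictReader), and that the key pairs really are unique in the first file:

    import csv

    out_rows = [["taxable_entity_id", "return_period", "3B", "2A"]]
    for row in second_csv:
        key = (row['taxable_entity_id'], row['return_period'])
        if key in row_lookup:
            # The total from the first file comes from the lookup, the total from the second file from this row.
            out_rows.append([key[0], key[1], row_lookup[key], row['total']])

    with open('3Bvs2A_new.csv', 'w', newline='') as csv_file:
        csv.writer(csv_file).writerows(out_rows)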
You can do this with pandas if the data frames are equal-sized, like this:

import pandas as pd

reader_3B = pd.read_csv('3B_processed.csv')
reader_2A = pd.read_csv('2A_processed.csv')
l_3B_2A = reader_3B[(reader_3B["taxable_entity_id"] == reader_2A["taxable_entity_id"]) & (reader_3B["return_period"] == reader_2A["return_period"])]
l_3B_2A.to_csv('3Bvs2A_new.csv')
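If the two frames are not aligned row for row, the more usual pandas approach is a merge on the key columns; here is a hedged sketch (the column renaming is only to match the header used in the question):

    import pandas as pd

    df_3B = pd.read_csv('3B_processed.csv', dtype=str)  # dtype=str keeps the leading zeros in return_period
    df_2A = pd.read_csv('2A_processed.csv', dtype=str)

    # Inner join on the key columns keeps only the (id, period) pairs present in both files.
    merged = df_3B.merge(df_2A, on=["taxable_entity_id", "return_period"], suffixes=("_3B", "_2A"))
    merged = merged.rename(columns={"total_3B": "3B", "total_2A": "2A"})
    merged.to_csv('3Bvs2A_new.csv', index=False)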
I have a program that reads in a input text file (DNA.txt) of a DNA sequence, and then translates the DNA sequence (saved as a string) into various amino acid SLC codes using this function:
def translate(uppercase_dna_sequence, codon_list):
    slc = ""
    for i in range(0, len(uppercase_dna_sequence), 3):
        codon = uppercase_dna_sequence[i:i+3]
        if codon in codon_list:
            slc = slc + codon_list[codon]
        else:
            slc = slc + "X"
    return slc
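For reference, translate expects codon_list to be a dictionary mapping three-letter codons to single-letter (SLC) codes. The entries below are a tiny hypothetical subset just to show the call, not the real table from the program:

    codon_list = {"ATG": "M", "TGG": "W", "TAA": "*"}  # hypothetical subset
    print translate("ATGTGGTAAXYZ", codon_list)        # prints MW*X ("XYZ" is not in the table)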
I then have a function that creates two output text files called:
normalDNA.txt and mutatedDNA.txt
Each of these files has one long DNA sequence.
I now want to write a function that reads both of these files as input files and uses the "translate" function mentioned above to translate the DNA sequences contained in them, just like I did with the original DNA.txt file mentioned at the top of this explanation. (So I assume I am trying to inherit the other function's properties into this one.) I have this code:
def txtTranslate(translate):
    with open('normalDNA.txt') as inputfile:
        normalDNA_input = inputfile.read()
    print normalDNA_input
    with open('mutatedDNA.txt') as inputfile:
        mutatedDNA_input = inputfile.read()
    print mutatedDNA_input
    return txtTranslate
The program runs when I call it with:
print txtTranslate(translate)
But it prints:
<function txtTranslate at 0x103bf39b0>
I want the second function (txtTranslate) to read in the external text files, and then have the first function translate the inputs and print out the result to the user.
I have my full code available on request, but I think I'm missing something small, hopefully! Or should I put everything into classes with OOP?
I'm new to linking two functions, so please excuse the lack of knowledge in the second function.
This doesn't have anything to do with inheritance. If you want txtTranslate to execute translate, you have to actually call it. Try:
def txtTranslate():
    with open('normalDNA.txt') as inputfile:
        normalDNA_input = inputfile.read()
    print normalDNA_input
    with open('mutatedDNA.txt') as inputfile:
        mutatedDNA_input = inputfile.read()
    print mutatedDNA_input

    # todo: get codon_list from somewhere
    print translate(normalDNA_input, codon_list)
    print translate(mutatedDNA_input, codon_list)

txtTranslate()
I want the program to read a text file and put every word in a list. Here's the code that I've written:
class Teater(object):
    def __init__(self, namn, AntalPlatser, VuxenBiljettpris,
                 PensionärBiljettpris, BarnBiljettpris,
                 vuxen=0, pansionär=0, barn=0):
        self.namn = namn
        self.AntalPlatser = AntalPlatser
        self.VuxenBiljettpris = VuxenBiljettpris
        self.PensionärBiljettpris = PensionärBiljettpris
        self.BarnBiljettpris = BarnBiljettpris
        self.vuxen = vuxen
        self.barn = barn
        self.pansionär = pansionär

def teaterLista():
    infil = open("teatrar.txt", "r", encoding="utf8")
    lista = []
    lista = lista[4:]
    for rad in lista:
        splitList = rad.split("/")
        namn = splitlist[0]
        AntalPlatser = splitlist[1]
        VuxenBiljettpris = splitlist[2]
        PensionärBiljettpris = splitlist[3]
        BarnBiljettpris = splitlist[4]
        nyTeater = Teater(
            namn,
            AntalPlatser,
            VuxenBiljettpris,
            PensionärBiljettpris,
            BarnBiljettpris)
        lista.append(nyTeater)
    return lista
and my text file looks like this:
Såda Teaterbiljetter
TeaternsNamn/Antal platser i salongen/Vuxenbiljettpris/Pensionärbiljettpris/Barnbiljettpris
-------------------------------------------------------------------------------------
SodraTeatern/414/330/260/200
Dramaten/770/390/350/100
ChinaTeatern/1230/395/300/250
I don't want the first 4 rows of the text file to be printed. But when I type läsFil() in the Python shell I only get this: []
There are many problems with your code, but most notably you don't ever read from the file infil. Instead you create an empty list and take a slice on that list, so it is still empty:
lista=[]
lista = lista[4:]
Anyway, you should have a look at the CSV module, which can handle for you the reading of delimited text files (whether separated by commas or slashes).
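For example, a minimal sketch using the csv module with '/' as the delimiter; the number of header lines to skip and the field order follow the sample file shown in the question, so adjust them if the real file differs:

    import csv

    def teaterLista():
        lista = []
        with open("teatrar.txt", "r", encoding="utf8") as infil:
            rader = list(csv.reader(infil, delimiter="/"))
        for rad in rader[3:]:  # skip the title, header and separator lines (use 4 if there really are four)
            namn, AntalPlatser, VuxenBiljettpris, PensionärBiljettpris, BarnBiljettpris = rad[:5]
            lista.append(Teater(namn, AntalPlatser, VuxenBiljettpris,
                                PensionärBiljettpris, BarnBiljettpris))
        return lista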
This part of the code looks very suspicious:
infil = open("teatrar.txt", "r", encoding="utf8") # not used after.
lista=[]
lista = lista[4:] # does nothing since lista is empty
for rad in lista:
You probably mean:
infil = open("teatrar.txt", "r", encoding="utf8")
lista=[]
for rad in infil:
Then
namn=splitlist[0]
AntalPlatser=splitlist[1]
VuxenBiljettpris=splitlist[2]
PensionärBiljettpris=splitlist[3]
BarnBiljettpris=splitlist[4]
Could be better written as
namn, AntalPlatser, VuxenBiljettpris, PensionärBiljettpris, BarnBiljettpris = splitlist[:5]
I'm trying to remove the brackets and commas using .join. It works in other places in my program but not here. This is the code:
def load():
    fileName = raw_input("Please enter the name of the save file to load. Please don't enter '.txt'.")
    return open(fileName + ".txt", "r")
fileToLoad = load()
fileData = fileToLoad.readlines()
code = (fileData[4])
splitcode = "".join(code)
print code
print splitcode
and the two outputs I'm getting are both:
['Y', 'G', 'R']
['Y', 'G', 'R']
I thought that the second output should be:
YGR
Thanks for the help!
It appears that code is the literal string "['Y', 'G', 'R']" and not a list as would be required for join to work as you expect. The simplest way around this is to first convert code into a list by calling ast.literal_eval on it, or, if you can be absolutely sure the contents of the file contain nothing malicious or malformed, eval.
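For instance, a short sketch of the literal_eval route, assuming the line read from the file looks exactly like the printed output above:

    import ast

    code = "['Y', 'G', 'R']\n"        # what fileData[4] apparently contains
    letters = ast.literal_eval(code)  # parses the string into a real list: ['Y', 'G', 'R']
    print ''.join(letters)            # YGR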
Rather than doing something as dangerous as eval, you could try turning it into valid JSON and then loading that.

import json

code = code.replace("'", '"')
listified = json.loads(code)
joined = ''.join(listified)
I assumed sorting a CSV file on multiple text/numeric fields using Python would be a problem that was already solved. But I can't find any example code anywhere, except for specific code focusing on sorting date fields.
How would one go about sorting a relatively large CSV file (tens of thousands of lines) on multiple fields, in order?
Python code samples would be appreciated.
Python's sort works in-memory only; however, tens of thousands of lines should fit in memory easily on a modern machine. So:
import csv
import operator

def sortcsvbymanyfields(csvfilename, themanyfieldscolumnnumbers):
    with open(csvfilename, 'rb') as f:
        readit = csv.reader(f)
        thedata = list(readit)
    thedata.sort(key=operator.itemgetter(*themanyfieldscolumnnumbers))
    with open(csvfilename, 'wb') as f:
        writeit = csv.writer(f)
        writeit.writerows(thedata)
Here's Alex's answer, reworked to support column data types:
import csv
import operator

def sort_csv(csv_filename, types, sort_key_columns):
    """sort (and rewrite) a csv file.
    types: data types (conversion functions) for each column in the file
    sort_key_columns: column numbers of columns to sort by"""
    data = []
    with open(csv_filename, 'rb') as f:
        for row in csv.reader(f):
            data.append(convert(types, row))
    data.sort(key=operator.itemgetter(*sort_key_columns))
    with open(csv_filename, 'wb') as f:
        csv.writer(f).writerows(data)
Edit:
I did a stupid. I was playing with various things in IDLE and wrote a convert function a couple of days ago. I forgot I'd written it, and I haven't closed IDLE in a good long while - so when I wrote the above, I thought convert was a built-in function. Sadly no.
Here's my implementation, though John Machin's is nicer:
def convert(types, values):
    return [t(v) for t, v in zip(types, values)]
Usage:
import datetime

def date(s):
    return datetime.datetime.strptime(s, '%m/%d/%y')
>>> convert((int, date, str), ('1', '2/15/09', 'z'))
[1, datetime.datetime(2009, 2, 15, 0, 0), 'z']
Here's the convert() that's missing from Robert's fix of Alex's answer:
>>> def convert(convert_funcs, seq):
... return [
... item if func is None else func(item)
... for func, item in zip(convert_funcs, seq)
... ]
...
>>> convert(
... (None, float, lambda x: x.strip().lower()),
... [" text ", "123.45", " TEXT "]
... )
[' text ', 123.45, 'text']
>>>
I've changed the name of the first arg to highlight that the per-column functions can do whatever you need, not merely type coercion. None is used to indicate no conversion.
You bring up 3 issues:
file size
csv data
sorting on multiple fields
Here is a solution for the third part. You can handle csv data in a more sophisticated way.
>>> data = 'a,b,c\nb,b,a\nb,c,a\n'
>>> lines = [e.split(',') for e in data.strip().split('\n')]
>>> lines
[['a', 'b', 'c'], ['b', 'b', 'a'], ['b', 'c', 'a']]
>>> def f(e):
... field_order = [2,1]
... return [e[i] for i in field_order]
...
>>> sorted(lines, key=f)
[['b', 'b', 'a'], ['b', 'c', 'a'], ['a', 'b', 'c']]
Edited to use a list comprehension; a generator does not work as I had expected it to.