Python reading rows from csv, operating and organizing rows of numbers - python

I am a geographer, not a programmer; I have heard of some programming concepts but I am very much a newbie :-)
I need to read six rows of environmental data, 1000 lines at the most each time.
Each row holds numbers of up to two digits (0 to 99); it is a summer issue, so there are only positive numbers.
Once I have read them, I need to display the numbers 0 to 99 vertically, each with its number of occurrences in the readings, for each of the six rows:
0 = 230.....0 = 3........0 = 230......0 = 123......0 = 223......0 = 334
1 = 67......1 = 657......1 = 627......1 = 767......1 = 467......1 = 337
2 = 762.....2 = 328......2 = 987......2 = 326......2 = 32.......2 = 123
.
.
99 = 3.....99 = 34.......99 = 1.......99 = 89......99 = 78......99 = 123
If I can get this far I will feel great. Once I learn how to do this and I can look at the data I can decide what makes sense to run next; excel, graphs, statistics, statistics in R, get the numbers into a matrix to manipulate from there, etc. First time so I am figuring this out as I go.
Any help will be much appreciated,
Adolfo
I am working in the research for the restoration of Quebrada Verde watershed in Valparaiso, Chile.

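from array import array
import sys

if len(sys.argv) > 1:
    # One counter per possible value, 0 to 99.
    count = array('H', [0] * 100)
    try:
        # Expects one number per line.
        with open(sys.argv[1], 'r') as file:
            for line in file:
                count[int(line)] += 1
    except OSError:
        # open() raises an exception on failure instead of returning a false value.
        print('unable to open the file')
    else:
        for a in range(100):
            print(a, count[a], sep='\t')
else:
    print('usage: python', sys.argv[0], ' file')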
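The script above counts a single value per line. For six values per line, a minimal sketch using `csv` and `collections.Counter` might look like the following. It assumes comma-separated integers with six columns; the function name `count_columns` and the sample data are made up for illustration and are not from the original post.

```python
import csv
import io
from collections import Counter


def count_columns(lines, n_columns=6):
    """Return one Counter per column, counting how often each value occurs."""
    counts = [Counter() for _ in range(n_columns)]
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        for column, cell in enumerate(row[:n_columns]):
            counts[column][int(cell)] += 1
    return counts


# Hypothetical sample standing in for the real file (assumed comma-separated).
sample = io.StringIO("0,1,2,99,5,5\n0,1,3,99,5,6\n7,1,2,98,5,5\n")
counts = count_columns(sample)

# One line per value 0..99, with one tab-separated "value = occurrences" per column.
for value in range(100):
    print("\t".join("{} = {}".format(value, c[value]) for c in counts))
```

With a real file, the same function could be called as `count_columns(open(sys.argv[1], newline=''))`, since it only needs an iterable of lines.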

Related

Data Deletion in File Using Python

As a new Python programmer, I decided to create a student database management system. I got stuck while deleting data from the file. I planned to apply the steps below to delete a record, but how should I implement them? I have written some code, but it is not working.
The algorithm:
STEP 1: Create an additional file, open the current file in reading mode, and open the new file in writing mode
STEP 2: Read and copy the data to the newly created file, except for the line we want to delete
STEP 3: Close both files, remove the old file, and rename the newly created file to the name of the deleted file
While implementing it I got stuck, because the rest of the file does not stay the same.
Here is the code I wrote:
def delete():
    rollno = int(input('\n Enter The Roll number : '))
    f = open('BCAstudents3.txt','r')
    f1 = open('temp.txt','a+')
    for line in f:
        fo = line.split()
        if fo:
            if fo[3] != rollno:
                f1.write(str(str(fo).replace('[','').replace(']','').replace("'","").replace(",","")))
    f.close()
    f1.close()
    os.remove('BCAstudents3.txt')
    os.rename('temp.txt','BCAstudents3.txt')
The data in the original file looks like this:
Roll Number = 1 Name : Alex Section = C Optimisation Technique = 99 Maths III = 99 Operating System = 99 Software Engneering = 99 Computer Graphics = 99
Roll Number = 2 Name : Shay Section = C Optimisation Technique = 99 Maths III = 99 Operating System = 99 Software Engneering = 99 Computer Graphics = 99
and the result after the deletion is this:
Roll Number = 1 Name : Alex Section = C Optimisation Technique = 99 Maths III = 99 Operating System = 99 Software Engneering = 99 Computer Graphics = 99Roll Number = 2 Name : Shay Section = C Optimisation Technique = 99 Maths III = 99 Operating System = 99 Software Engneering = 99 Computer Graphics = 99
I also want to put a comma after the end of the data, but I have no idea how to do this.
I modified your code and it should work how you wanted. A couple of things to consider:
Your original text file seems to indicate that there are line breaks for each Roll Number. I assumed that with my answer.
Because you are reading a text file, there are no integers so fo[3] would not ever match rollno if you are converting the input to an int.
I wasn't sure exactly where you wanted the comma: after each line, or just at the very end?
I wasn't sure if you wanted new lines for each Roll Number.
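import os

def delete():
    # Keep the input as a string: values read from a text file are strings too.
    rollno = input('\n Enter The Roll number : ')
    f = open('BCAstudents3.txt','r')
    # 'w' starts with an empty temp file even if an old one was left behind.
    f1 = open('temp.txt','w')
    for line in f:
        fo = line.split()
        if fo:
            if fo[3] != rollno:
                # Comma after each record, then a line break so records stay separate.
                newline = " ".join(fo) + ",\n"
                #print(newline)
                f1.write(newline)
    f.close()
    f1.close()
    os.remove('BCAstudents3.txt')
    os.rename('temp.txt','BCAstudents3.txt')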
I made your program a little simpler.
Hopefully you can use it:
def delete():
    line = input("Line you want to delete: ")
    line = int(line)
    line -= 1  # lines are counted from 1, list indexes from 0
    # Read everything first and close the file before reopening it for writing.
    with open("file.txt","r") as file:
        data = file.readlines()
    del data[line]
    with open("file.txt","w") as file:
        for line in data:
            file.write(line)
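The copy-and-rename steps from the question can also be sketched with `with` blocks and `os.replace`. This is only an illustration under assumptions: one record per line, with the roll number as the fourth whitespace-separated token as in the sample data. The function name `delete_roll_number` and the demo file are hypothetical.

```python
import os
import tempfile


def delete_roll_number(path, rollno):
    """Rewrite `path` without the record whose roll number equals `rollno`."""
    temp_path = path + ".tmp"
    with open(path, "r") as src, open(temp_path, "w") as dst:
        for line in src:
            fields = line.split()
            # Assumed layout: "Roll Number = 1 Name : ...", so fields[3] is the number.
            if len(fields) > 3 and fields[3] == str(rollno):
                continue  # drop this record
            dst.write(line)  # keep the line, including its newline
    os.replace(temp_path, path)  # swap the new file in for the original


# Demo on a throwaway file with made-up records.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "students.txt")
with open(demo_path, "w") as f:
    f.write("Roll Number = 1 Name : Alex Section = C\n")
    f.write("Roll Number = 2 Name : Shay Section = C\n")

delete_roll_number(demo_path, 1)
with open(demo_path) as f:
    remaining = f.read()
print(remaining)
```

Writing each kept line unchanged preserves the original line breaks, which is what went missing in the question's output.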

Python - Formatting a print to align a specific column

I am trying to format a print statement to align a specific column.
Currently my output is:
0 - Rusty Bucket (40L bucket - quite rusty) = $0.00
1 - Golf Cart (Tesla powered 250 turbo) = $195.00*
2 - Thermomix (TM-31) = $25.50*
3 - AeroPress (Great coffee maker) = $5.00
4 - Guitar (JTV-59) = $12.95
The output I am looking for is:
0 - Rusty Bucket (40L bucket - quite rusty) = $0.00
1 - Golf Cart (Tesla powered 250 turbo)     = $195.00*
2 - Thermomix (TM-31)                       = $25.50*
3 - AeroPress (Great coffee maker)          = $5.00
4 - Guitar (JTV-59)                         = $12.95
Here is the code I am currently using for the print:
def list_items():
    count = 0
    print("All items on file (* indicates item is currently out):")
    for splitline in all_lines:
        in_out = splitline[3]
        dollar_sign = "= $"
        daily_price = "{0:.2f}".format(float(splitline[2]))
        if in_out == "out":
            in_out = str("*")
        else:
            in_out = str("")
        print(count, "- {} ({}) {}{}{}".format(splitline[0], splitline[1], dollar_sign, daily_price, in_out))
        count += 1
I have tried using formatting such as:
print(count, "- {:>5} ({:>5}) {:>5}{}{}".format(splitline[0], splitline[1], dollar_sign, daily_price, in_out))
but have never been able to get just the one column to align. Any help or suggestions would be greatly appreciated! I am also using python 3.x
To note I am using tuples to contain all the information, with all_lines being the master list, as it were. The information is being read from a csv originally. Apologies for the horrible naming conventions, trying to work on functionality first.
Sorry if this has been answered elsewhere; I have tried looking.
EDIT: Here is the code I'm using for my all_lines:
import csv
open_file = open('items.csv', 'r+')
all_lines = []
for line in open_file:
    splitline = line.strip().split(',')
    all_lines.append((splitline[0], splitline[1], splitline[2], splitline[3]))
And here is the csv file information:
Rusty Bucket,40L bucket - quite rusty,0.0,in
Golf Cart,Tesla powered 250 turbo,195.0,out
Thermomix,TM-31,25.5,out
AeroPress,Great coffee maker,5.0,in
Guitar,JTV-59,12.95,in
You should look at str.ljust(width[, fillchar]):
>>> '(TM-31)'.ljust(15)
'(TM-31)        ' # padded to width 15
Then extract the variable-length {} ({}) part and pad it to the necessary width.
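Applied to the question's data, the `ljust` idea could be sketched as follows. The width is taken from the longest "name (description)" label rather than hard-coded; the function name `format_items` is hypothetical, and the tuples are copied from the first three rows of the CSV shown in the question.

```python
def format_items(items):
    """Return display lines with the "= $" column aligned."""
    labels = ["{} ({})".format(name, desc) for name, desc, _, _ in items]
    width = max(len(label) for label in labels)  # widest label sets the column
    lines = []
    for index, (label, item) in enumerate(zip(labels, items)):
        price = "{:.2f}".format(float(item[2]))
        marker = "*" if item[3] == "out" else ""
        lines.append("{} - {} = ${}{}".format(index, label.ljust(width), price, marker))
    return lines


all_lines = [
    ("Rusty Bucket", "40L bucket - quite rusty", "0.0", "in"),
    ("Golf Cart", "Tesla powered 250 turbo", "195.0", "out"),
    ("Thermomix", "TM-31", "25.5", "out"),
]
for line in format_items(all_lines):
    print(line)
```

Padding the combined label, instead of the name and description separately, keeps exactly one aligned column, which is what the question asks for.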
What you are looking for may be:
[EDIT] I have now also included a tabulate version since you may gain flexibility with this.
import csv
from tabulate import tabulate

open_file = open('items.csv', 'r+')
all_lines = []
for line in open_file:
    splitline = line.strip().split(',')
    all_lines.append((splitline[0], splitline[1], splitline[2], splitline[3]))
#print(all_lines)
count = 0
new_lines=[]
for splitline in all_lines:
    in_out = splitline[3]
    dollar_sign = "= $"
    daily_price = "{0:.2f}".format(float(splitline[2]))
    if in_out == "out":
        in_out = str("*")
    else:
        in_out = str("")
    str2='('+splitline[1]+')'
    # print() calls, since the question uses Python 3.x
    print(count, "- {:<30} {:<30} {}{:<30} {:<10}".format(splitline[0], str2, dollar_sign, daily_price, in_out))
    new_lines.append([splitline[0], str2, dollar_sign, daily_price, in_out])
    count += 1
print(tabulate(new_lines, tablefmt="plain"))
print()
print(tabulate(new_lines, tablefmt="plain", numalign="left"))
I do not like the idea of controlling printing format myself.
In this case, I would leverage a tabulating library such as: Tabulate.
The two key points are:
keep data in a table (e.g. list in a list)
select proper printing format with tablefmt param.

python csv delimiter doesn't work properly

I am trying to write Python code to extract DVDL values from the input. Here is the truncated input.
A V E R A G E S O V E R 50000 S T E P S
NSTEP = 50000 TIME(PS) = 300.000 TEMP(K) = 300.05 PRESS = -70.0
Etot = -89575.9555 EKtot = 23331.1725 EPtot = -112907.1281
BOND = 759.8213 ANGLE = 2120.6039 DIHED = 4231.4019
1-4 NB = 940.8403 1-4 EEL = 12588.1950 VDWAALS = 13690.9435
EELEC = -147238.9339 EHBOND = 0.0000 RESTRAINT = 0.0000
DV/DL = 13.0462
EKCMT = 10212.3016 VIRIAL = 10891.5181 VOLUME = 416404.8626
Density = 0.9411
Ewald error estimate: 0.6036E-04
R M S F L U C T U A T I O N S
NSTEP = 50000 TIME(PS) = 300.000 TEMP(K) = 1.49 PRESS = 129.9
Etot = 727.7890 EKtot = 115.7534 EPtot = 718.8344
BOND = 23.1328 ANGLE = 36.1180 DIHED = 19.9971
1-4 NB = 12.7636 1-4 EEL = 37.3848 VDWAALS = 145.7213
EELEC = 739.4128 EHBOND = 0.0000 RESTRAINT = 0.0000
DV/DL = 3.7510
EKCMT = 76.6138 VIRIAL = 1195.5824 VOLUME = 43181.7604
Density = 0.0891
Ewald error estimate: 0.4462E-04
Here is the script. Basically we have a lot of DVDL values in the input (not in the truncated input above) and we only want the last two. So we read all of them into a list and keep only the last two. Finally, we write those two DVDL values into a csv file. The desired output is
13.0462, 3.7510
However, the following script (Python 2.7) produces the output below. Could any guru enlighten me? Thanks.
13.0462""3.7510""
Here is the script:
import os
import csv
DVDL=[]
filename="input.out"
file=open(filename,'r')
with open("out.csv",'wb') as outfile: # define output name
    line=file.readlines()
    for a in line:
        if ' DV/DL =' in a:
            DVDL.append(line[line.index(a)].split(' ')[1]) # Extract DVDL number
    print DVDL[-2:] # We only need the last two DVDL
    yeeha="".join(str(a) for a in DVDL[-2:])
    print yeeha
    writer = csv.writer(outfile, delimiter=',',lineterminator='\n')#Output the list into a csv file called "outfile"
    writer.writerows(yeeha)
As the commenter who proposed an approach has not had the chance to outline some code for this, here's how I'd suggest doing it (edited to allow optionally signed floating point numbers with optional exponents, as suggested by an answer to Python regular expression that matches floating point numbers):
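import re,sys
# Raw string, so the backslashes reach the regex engine unchanged.
pat = re.compile(r"DV/DL += +([+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?)")
values = []
for line in open("input.out","r"):
    m = pat.search(line)
    if m:
        values.append(m.group(1))
# The with block closes the file, so the data is flushed to disk.
with open("out.csv","w") as outfile:
    outfile.write(",".join(values[-2:]))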
Having run this script:
$ cat out.csv
13.0462,3.7510
I haven't used the csv module in this case because it isn't really necessary for a simple output file like this. However, adding the following lines to the script will use csv to write the same data into out1.csv:
import csv
writer = csv.writer(open("out1.csv","w"))
writer.writerow(values[-2:])
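A small sketch of why the argument type matters here: `writerow` treats its argument as one row of fields, and a plain string is itself a sequence of characters. The example uses Python 3 syntax and writes to an in-memory buffer instead of a file; the variable names are illustrative only.

```python
import csv
import io

values = ["13.0462", "3.7510"]  # the last two DV/DL values from the sample input

# writerow with a list: each element becomes one field of a single row.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerow(values)
one_row = buffer.getvalue()
print(repr(one_row))

# writerow with a plain string: each character becomes its own field.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerow("3.75")
split_chars = buffer.getvalue()
print(repr(split_chars))
```

Joining the values into one string before handing them to the csv writer, as the question's script does, therefore loses the field boundaries; passing the list itself keeps them.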

Sympy refuses to calculate values

I'm trying to implement a markdown-like language to do math with. The basic idea is to have a file where you can write down your math, then have a python-script do the calculations and spit out tex.
However, I'm facing the problem that Sympy refuses to spit out values; it only gives me back the equation. Much weirder is the fact that it DOES spit out values in an alternate test script that is essentially the same code.
This is the working code:
import sympy as sp
m = sp.symbols('m')
kg = sp.symbols('kg')
s = sp.symbols('s')
g = sp.sympify(9.80665*m/s**2)
mass = sp.sympify(0.2*kg)
acc = sp.sympify(g)
F = sp.sympify(mass*acc)
print F
Output:
1.96133*kg*m/s**2
This is the non-working code:
import re
import sympy as sp
print 'import sympy as sp'
#read units
mymunits = 'units.mymu'
with open(mymunits) as mymu:
    mymuinput = mymu.readlines()
for lines in mymuinput:
    lines = re.sub('\s+','',lines).split()
    if lines != []:
        if lines[0][0] != '#':
            unit = lines[0].split('#')[0]
            globals()[unit] = sp.symbols(unit)
            print unit+' = sp.symbols(\''+unit+'\')'
#read constants
mymconstants = 'constants.mymc'
with open(mymconstants) as mymc:
    mymcinput = mymc.readlines()
for lines in mymcinput:
    lines = re.sub('\s+','',lines).split()
    if lines != []:
        if lines[0][0] != '#':
            constant = lines[0].split('#')[0].split(':=')
            globals()[constant[0]] = sp.sympify(constant[1])
            print constant[0]+' = sp.sympify('+constant[1]+')'
#read file
mymfile = 'test.mym'
with open(mymfile) as mym:
    myminput = mym.readlines()
#create equations by removing spaces and splitting lines
for line in myminput:
    line = line.replace(' ','').strip().split(';')
    for eqstr in line:
        if eqstr != '':
            eq = re.split(':=',eqstr)
            globals()[eq[0]] = sp.sympify(eq[1])
            print eq[0]+' = sp.sympify('+eq[1]+')'
print 'print F'
print F
It outputs this:
acc*mass
It SHOULD output a value, just like the test-script.
The same script also outputs the code that is used in the test-script. The only difference is, that in the not-working script, I try to generate the code from an input-file, which looks like that:
mass := 0.2*kg ; acc := g
F := mass*acc
as well as files for units:
#SI
m #length
kg #mass
s #time
and constants:
#constants
g:=9.80665*m/s**2 #standard gravity
The whole code is also to be found on github.
What I don't get is why the one version works, while the other doesn't. Any ideas are welcomed.
Thank you.
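Based on Evert's comment, I came up with this solution. When sympify receives a string such as 'mass*acc', it parses the names into new symbols instead of using the values already stored under those names, whereas eval evaluates the expression with the existing variables first.
Change:
sp.sympify(eq[1])
to:
sp.sympify(eval(eq[1]))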

How do I convert integers into high-resolution times in Python? Or how do I keep Python from dropping zeros?

Currently, I'm using this to calculate the time between two messages and listing the times if they are above 20 seconds.
def time_deltas(infile):
    entries = (line.split() for line in open(INFILE, "r"))
    ts = {}
    for e in entries:
        if " ".join(e[2:5]) == "OuchMsg out: [O]":
            ts[e[8]] = e[0]
        elif " ".join(e[2:5]) == "OuchMsg in: [A]":
            in_ts, ref_id = e[0], e[7]
            out_ts = ts.pop(ref_id, None)
            yield (float(out_ts),ref_id[1:-1],(float(in_ts)*10000 - float(out_ts)*10000))
            n = (float(in_ts)*10000 - float(out_ts)*10000)
            if n> 20:
                print float(out_ts),ref_id[1:-1], n
INFILE = 'C:/Users/klee/Documents/text.txt'
import csv
with open('output_file1.csv', 'w') as f:
    csv.writer(f).writerows(time_deltas(INFILE))
However, there are two major errors. First, Python drops the leading zero when the time is before 10, e.g. 0900. Second, it drops trailing zeros, which makes the time difference inaccurate.
It looks like:
130203.08766
when it should be:
130203.087660
You are yielding floats, so the csv writer turns those floats into strings as it pleases.
If you want your output values to be a certain format, yield a string in that format.
Perhaps something like this?
print "%04.0f" % (900) # prints 0900
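As a sketch of yielding strings rather than floats, the example below formats each timestamp to a fixed width before it reaches the csv writer. It uses Python 3 syntax; the 13-character width with six decimals is an assumption based on the sample value in the question, and the helper name `format_timestamp` and the reference ids are made up.

```python
import csv
import io


def format_timestamp(value):
    """Format a numeric timestamp with six decimals, zero-padded to 13 characters."""
    return "{:013.6f}".format(float(value))


rows = [
    (format_timestamp("130203.08766"), "ref1"),  # trailing zero is kept
    (format_timestamp("090000.5"), "ref2"),      # leading zero is kept
]

# The csv writer leaves strings as they are, so the padding survives.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(rows)
output = buffer.getvalue()
print(output)
```

The arithmetic can still be done on floats; only the values handed to the writer need to be formatted strings.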
