import csv
import datetime
with open('soundTransit1_remote_rawMeasurements_15m.txt','r') as infile, open('soundTransit1.txt','w') as outfile:
    inr = csv.reader(infile,delimiter='\t')
    #ouw = csv.writer(outfile,delimiter=' ')
    for row in inr:
        d = datetime.datetime.strptime(row[0],'%Y-%m-%d %H:%M:%S')
        s = 1
        p = int(row[5])
        nr = [format(s,'02')+format(d.year,'04')+format(d.month,'02')+format(d.day,'02')+format(d.hour,'02')+format(d.minute,'02')+format(int(p*0.2),'04')]
        outfile.writelines(nr+'/n')
Using the above script, I have read in a .txt file and reformatted it as 'nr' so it looks like this:
['012015072314000000']
['012015072313450000']
['012015072313300000']
['012015072313150000']
['012015072313000000']
['012015072312450000']
['012015072312300000']
['012015072312150000']
..etc.
I now need to write it to my new .txt file, but Python will not let me write 'nr' with a line break after each entry, I think because the data is in strings. I get this error:
TypeError: can only concatenate list (not "str") to list
Is there another way to do this?
You are trying to combine a list with a string, which cannot work. Simply don't create a list in nr.
import csv
import datetime
with open('soundTransit1_remote_rawMeasurements_15m.txt','r') as infile, open('soundTransit1.txt','w') as outfile:
    inr = csv.reader(infile,delimiter='\t')
    #ouw = csv.writer(outfile,delimiter=' ')
    for row in inr:
        d = datetime.datetime.strptime(row[0],'%Y-%m-%d %H:%M:%S')
        s = 1
        p = int(row[5])
        nr = "{:02d}{:%Y%m%d%H%M}{:04d}\n".format(s,d,int(p*0.2))
        outfile.write(nr)
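To see the difference in isolation, here is a small sketch. The timestamp is the one behind the first output row in the question; the variable names are only for illustration.

```python
import datetime

d = datetime.datetime(2015, 7, 23, 14, 0)  # sample timestamp from the question's output
as_list = ['01' + format(d.year, '04')]    # a one-element list, like the original nr
as_str = '01' + format(d.year, '04')       # a plain string

try:
    as_list + '\n'                         # list + str is not defined
    error_message = None
except TypeError as exc:
    error_message = str(exc)

line = as_str + '\n'                       # str + str works
print(error_message)
print(repr(line))
```

The first print shows the same TypeError text as in the question; the second shows a string that outfile.write() accepts.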
There is no need to put your string into a list; just use outfile.write() here and build a string without a list:
nr = format(s,'02') + format(d.year,'04') + format(d.month, '02') + format(d.day, '02') + format(d.hour, '02') + format(d.minute, '02') + format(int(p*0.2), '04')
outfile.write(nr + '\n')
Rather than use 7 separate format() calls, use str.format():
nr = '{:02}{:%Y%m%d%H%M}{:04}\n'.format(s, d, int(p * 0.2))
outfile.write(nr)
Note that I formatted the datetime object with one formatting operation, and I included the newline into the string format.
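As a quick check of that format string, this sketch feeds it the timestamp from the first row of the question's output. The value p = 0 is an assumption, chosen because that row ends in 0000.

```python
import datetime

d = datetime.datetime(2015, 7, 23, 14, 0)
p = 0  # assumed raw value from column 5

# A datetime accepts strftime codes inside the format spec.
nr = '{:02}{:%Y%m%d%H%M}{:04}\n'.format(1, d, int(p * 0.2))
print(repr(nr))  # '012015072314000000\n'
```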
You appear to have hard-coded the s value; you may as well put that into the format directly:
nr = '01{:%Y%m%d%H%M}{:04}\n'.format(d, int(p * 0.2))
outfile.write(nr)
Together, that updates your script to:
import csv
import datetime

with open('soundTransit1_remote_rawMeasurements_15m.txt', 'r') as infile,\
        open('soundTransit1.txt','w') as outfile:
    inr = csv.reader(infile, delimiter='\t')
    for row in inr:
        d = datetime.datetime.strptime(row[0], '%Y-%m-%d %H:%M:%S')
        p = int(int(row[5]) * 0.2)
        nr = '01{:%Y%m%d%H%M}{:04}\n'.format(d, p)
        outfile.write(nr)
Take into account that the csv module works better if you follow the guidelines about opening files; in Python 2 you need to open the file in binary mode ('rb'), in Python 3 you need to set the newline parameter to ''. That way the module can control newlines correctly and supports including newlines in column values.
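For Python 3, opening the input with newline='' could look roughly like this. The sample row and the temporary file are made up for the demonstration; only columns 0 and 5 matter to the script.

```python
import csv
import datetime
import os
import tempfile

# Hypothetical one-row, tab-separated sample.
sample = '2015-07-23 14:00:00\ta\tb\tc\td\t7\n'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.txt')
    with open(path, 'w', newline='') as f:
        f.write(sample)

    # Python 3: newline='' leaves line-ending handling to the csv module.
    with open(path, 'r', newline='') as infile:
        rows = list(csv.reader(infile, delimiter='\t'))

d = datetime.datetime.strptime(rows[0][0], '%Y-%m-%d %H:%M:%S')
line = '01{:%Y%m%d%H%M}{:04}\n'.format(d, int(int(rows[0][5]) * 0.2))
print(repr(line))
```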
Help me fix this code. My script sorts the coordinates in a list into even and odd and only works when the list is in decimal format, but I need it to work with a list in hexadecimal (HEX) format.
I don't know the Python language well, but I think I need to add a function like hex(str).
Here is an example of such a list, List.txt:
(0x52DF625,0x47A406E)
(0x3555F30,0x3323041)
(0x326A573,0x5A5E578)
(0x48F8EF7,0x98A4EF3)
(0x578FE62,0x331DF3E)
(0x3520CAD,0x1719BBB)
(0x506FC9F,0x40CF4A6)
Code:
with open('List.txt') as fin,\
        open('Save+even.txt', 'a') as foutch,\
        open('Save-odd.txt', 'a') as foutnch:
    data = [line.strip() for line in fin]
    nch = [foutnch.write(str(i) + '\n')
           for i in data if int(i[1:-1].split(',')[1]) % 2]
    ch = [foutch.write(str(i) + '\n')
          for i in data if int(i[1:-1].split(',')[1]) % 2 != 1]
This may work for you (I used StringIO instead of real files, but added a comment on how you could use it with real files):
from io import StringIO

in_file = StringIO("""(0x52DF625,0x47A406E)
(0x3555F30,0x3323041)
(0x326A573,0x5A5E578)
(0x48F8EF7,0x98A4EF3)
(0x578FE62,0x331DF3E)
(0x3520CAD,0x1719BBB)
(0x506FC9F,0x40CF4A6)
""")
even_file = StringIO()
odd_file = StringIO()
# with open("List.txt") as in_file, open("Save-even.txt", "w") as even_file, open("Save-odd.txt", "w") as odd_file:
for line in in_file:
    x_str, y_str = line.strip()[1:-1].split(",")
    x, y = int(x_str, 0), int(y_str, 0)
    if y & 1:  # y is odd
        odd_file.write(line)
    else:
        even_file.write(line)
print("odd")
print(odd_file.getvalue())
print("even")
print(even_file.getvalue())
it outputs:
odd
(0x3555F30,0x3323041)
(0x48F8EF7,0x98A4EF3)
(0x3520CAD,0x1719BBB)
even
(0x52DF625,0x47A406E)
(0x326A573,0x5A5E578)
(0x578FE62,0x331DF3E)
(0x506FC9F,0x40CF4A6)
The trick is to use base 0 when converting a hex string to int: int(x_str, 0). With base 0, int() infers the base from the prefix, so the 0x strings are read as hexadecimal.
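A small sketch of the base-0 conversion on its own. The helper name parse_pair is made up for this example.

```python
def parse_pair(line):
    """Split a line like '(0x52DF625,0x47A406E)' into two ints."""
    x_str, y_str = line.strip()[1:-1].split(',')
    # Base 0 tells int() to infer the base from the prefix (0x, 0o, 0b or none).
    return int(x_str, 0), int(y_str, 0)

x, y = parse_pair('(0x3555F30,0x3323041)')
print(x == 0x3555F30, y & 1)  # the second value ends in 1, so it is odd
print(int('255', 0), int('0xFF', 0), int('FF', 16))
```

Because the base is inferred per value, the same helper also accepts a decimal list such as (10,7) unchanged.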
I have a log file and am trying to print the data between two dates.
2020-01-31T20:12:38.1234Z, asdasdasdasdasdasd,...\n
2020-01-31T20:12:39.1234Z, abcdef,...\n
2020-01-31T20:12:40.1234Z, ghikjl,...\n
2020-01-31T20:12:41.1234Z, mnopqrstuv,...\n
2020-01-31T20:12:42.1234Z, wxyzdsasad,...\n
This is the sample log file and I want to print the lines between 2020-01-31T20:12:39 up to 2020-01-31T20:12:41.
So far I have managed to find and print the line with the starting date. I have passed the starting date as start.
linenum = 0
with open("logfile.log") as myFile:
    for line in myFile:
        linenum += 1
        if line.find(start) != -1:
            print("Line " + str(linenum) + ": " + line.rstrip('\n'))
But how do I keep printing until the end date?
Not an answer in Python, but with sed in the shell:
sed -n '/2020-01-31T20:12:39/,/2020-01-31T20:12:41/p' file.log
Output:
2020-01-31T20:12:39.1234Z, abcdef,...\n
2020-01-31T20:12:40.1234Z, ghikjl,...\n
2020-01-31T20:12:41.1234Z, mnopqrstuv,...\n
Since the time string is already structured nicely in your file, you can just do a simple string comparison between the times you're interested in without converting the string to a datetime object.
Use the csv module to read in the file, using the default comma delimiter, and then the filter() function to filter between two dates.
import csv
reader = csv.reader(open("logfile.log"))
filtered = filter(lambda p: p[0].split('.')[0] >= '2020-01-31T20:12:39' and p[0].split('.')[0] <= '2020-01-31T20:12:41', reader)
for l in filtered:
    print(','.join(l))
Edit:
I used split() to remove the fractional part of the time string in the string comparison, since you're interested in times to the nearest second, e.g. 2020-01-31T20:12:39.
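The same string comparison can be sketched without the csv module. It relies on all timestamps sharing one fixed-width format, so that alphabetical order equals chronological order. The function name lines_between is made up for this example, and the sample lines are the ones from the question.

```python
def lines_between(lines, start, end):
    """Yield lines whose timestamp (to the second) lies in [start, end]."""
    for line in lines:
        stamp = line.split(',', 1)[0].split('.')[0]  # drop the fractional seconds
        if start <= stamp <= end:
            yield line

sample = [
    '2020-01-31T20:12:38.1234Z, asdasdasdasdasdasd,...',
    '2020-01-31T20:12:39.1234Z, abcdef,...',
    '2020-01-31T20:12:40.1234Z, ghikjl,...',
    '2020-01-31T20:12:41.1234Z, mnopqrstuv,...',
    '2020-01-31T20:12:42.1234Z, wxyzdsasad,...',
]
selected = list(lines_between(sample, '2020-01-31T20:12:39', '2020-01-31T20:12:41'))
for line in selected:
    print(line)
```

Passing an open file object instead of the list works the same way, since the function only iterates over lines.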
If you want it in Python:
import time
from datetime import datetime as dt

def to_timestamp(date, forma='%Y-%m-%dT%H:%M:%S'):
    return time.mktime(dt.strptime(date, forma).timetuple())

startdate = '2020-01-31T20:12:39'
enddate = '2020-01-31T20:12:41'
start = to_timestamp(startdate)
end = to_timestamp(enddate)
logs = {}
with open("logfile.log") as f:
    for line in f:
        date = line.split(', ')[0].split('.')[0]
        logline = line.split(', ')[1].strip('\n')
        # compare the line's own timestamp against both bounds
        if start <= to_timestamp(date) <= end:
            logs[date] = logline
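A variation on the same idea is to compare datetime objects directly and skip the conversion to epoch seconds. This is only a sketch: the format string is an assumption based on the sample lines, and parse_stamp is a made-up helper name.

```python
from datetime import datetime

STAMP_FORMAT = '%Y-%m-%dT%H:%M:%S.%fZ'  # assumed layout of the first column

def parse_stamp(line):
    """Parse the leading timestamp of a log line, truncated to whole seconds."""
    stamp = line.split(',', 1)[0]
    return datetime.strptime(stamp, STAMP_FORMAT).replace(microsecond=0)

start = datetime(2020, 1, 31, 20, 12, 39)
end = datetime(2020, 1, 31, 20, 12, 41)
sample = [
    '2020-01-31T20:12:38.1234Z, asdasdasdasdasdasd,...',
    '2020-01-31T20:12:39.1234Z, abcdef,...',
    '2020-01-31T20:12:42.1234Z, wxyzdsasad,...',
]
kept = [line for line in sample if start <= parse_stamp(line) <= end]
print(kept)
```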
I have created a script which generates a number of random passwords (see below):
import string
import secrets
import datetime
now = datetime.datetime.now()
T = now.strftime('%Y_%m_%d')
entities = ['AA','BB','CC','DD','EE','FF','GG','HH']
masterpass = ('MasterPass' + '_' + T + '.csv')
f= open(masterpass,"w+")
def random_secure_string(stringLength):
    secureStrMain = ''.join((secrets.choice(string.ascii_lowercase + string.ascii_uppercase + string.digits + ('!'+'?'+'"'+'('+')'+'$'+'%'+'#'+'#'+'/'+':'+';'+'['+']'+'#')) for i in range(stringLength)))
    return secureStrMain
def random_secure_string_lower(stringLength):
    secureStrLower = ''.join((secrets.choice(string.ascii_lowercase)) for i in range(stringLength))
    return secureStrLower
def random_secure_string_upper(stringLength):
    secureStrUpper = ''.join((secrets.choice(string.ascii_uppercase)) for i in range(stringLength))
    return secureStrUpper
def random_secure_string_digit(stringLength):
    secureStrDigit = ''.join((secrets.choice(string.digits)) for i in range(stringLength))
    return secureStrDigit
def random_secure_string_char(stringLength):
    secureStrChar = ''.join((secrets.choice('!'+'?'+'"'+'('+')'+'$'+'%'+'#'+'#'+'/'+':'+';'+'['+']'+'#')) for i in range(stringLength))
    return secureStrChar
for x in entities:
    f.write(x + ',' + random_secure_string(6) + random_secure_string_lower(1) + random_secure_string_upper(1) + random_secure_string_digit(1) + random_secure_string_char(1) + ',' + T + "\n")
f.close()
I use pandas to get the code to import a list, so normally it is for 200-250 entities, not just the 8 in the example.
Every so often, a row looks as if the comma delimiter was not read (see row 6 of attached photo).
In all the cases I have had of this (multiple run throughs), it looks like the 10th character is a comma, the 4 before (characters 6-9) are as stated in the script, but then instead of generating 6 initial characters (from random_secure_string(6)), it is generating 5. Could this be causing the issue? If so, how do I fix this?
Thank you in advance
This is a wild guess, because the content of the csv file as text would be required to be sure.
A csv is a Comma Separated Values text file. That means that it is a plain text file where fields are delimited with a separator, normally the comma (,). In order to allow text fields to contain commas or even new lines, they can be enclosed in quotes (normally ") or special characters can be escaped, normally with \.
That means that if a line contains abcdefg\,2020_05 the comma will not be interpreted as a separator.
How to fix:
CSV is a simple format, but with many corner cases. The rule is to avoid reading or writing it by hand. Just use the standard library csv module here:
...
import csv
...
with open(masterpass,"w+", newline='') as f:
    wr = csv.writer(f)
    for x in entities:
        wr.writerow([x, random_secure_string(6) + random_secure_string_lower(1) + random_secure_string_upper(1) + random_secure_string_digit(1) + random_secure_string_char(1), T])
The writer will take care of special characters and ensure that appropriate quoting or escaping is used.
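One scenario that would be consistent with the symptom, although it remains a guess without the actual file, is a password that starts with a double quote, since " is in the character set. The sketch below uses a made-up row to compare a hand-joined line with one produced by csv.writer.

```python
import csv
import io

# Hypothetical row: the password starts with a double quote and contains a comma.
row = ['FF', '"ab,de1aB2!', '2020_05_01']

# Written with csv.writer, the field is quoted and the text reads back intact.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
written = buffer.getvalue()
read_back = next(csv.reader(io.StringIO(written)))

# Joined by hand, the leading quote opens a quoted field and the commas
# after it are no longer treated as delimiters.
by_hand = ','.join(row) + '\n'
naive = next(csv.reader(io.StringIO(by_hand)))

print(written)
print(read_back)
print(naive)
```

Only the csv.writer version round-trips to the original three fields.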
I have a set of .txt files where the date of creation features in the filename, e.g. 'ab1900906cde.txt'. I want to tag each line within the file with that date. Helpfully, each line begins with a common string which I'll refer to as 'xyz'.
I've written code which captures the date as I intended (confirmed by printing the date tag in line 10), but re.sub does not work to introduce the date tag into the output file. The code is:
mypath = 'users/mypath/'
myfiles = glob.glob(mypath + '*.txt')
for fin in myfiles:
    fn = os.path.basename(fin)
    fout = os.path.join(mypath + 'date_tagged_' + 'fn')
    yy = (fn[2:4])
    mm = (fn[4:6])
    dd = (fn[6:8])
    datetag = ("".join(['<Date: 20',yy,'-',mm,'-',dd,'>']))
    print(datetag)
    s = open(fin, encoding='utf8').read()
    s = re.sub('xyz', (datetag), s)
    with open(fout, 'w', encoding='utf8') as f:
        f.write(s)
Any suggestions? Thanks in advance.
Try this.
s = re.sub('xyz', str(datetag), s)
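Adding a little explanation to the answer:
The parentheses in datetag = ("".join(['<Date: 20',yy,'-',mm,'-',dd,'>'])) are plain grouping parentheses. They create neither a generator expression nor a tuple, so datetag is already a str and wrapping it in str() leaves it unchanged. You can remove the surrounding parentheses, which are unnecessary, e.g. datetag = "".join(['<Date: 20',yy,'-',mm,'-',dd,'>']). Something else worth checking is the output path: 'fn' in quotes is the literal string 'fn', not the variable fn, so every input file is written to the same file, date_tagged_fn.
Similarly, some of your other variable assignments have unnecessary parentheses: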
yy = (fn[2:4])
mm = (fn[4:6])
dd = (fn[6:8])
Can instead be
yy = fn[2:4]
mm = fn[4:6]
dd = fn[6:8]
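A minimal sketch of the tagging step on an in-memory string. It assumes a file name with a yymmdd date at positions 2 to 7 (the name below is hypothetical) and that the output name should come from the variable fn and not the literal 'fn'. The helper names are made up for illustration.

```python
import os
import re

def date_tag(filename):
    """Build '<Date: 20yy-mm-dd>' from characters 2 to 7 of the file name."""
    yy, mm, dd = filename[2:4], filename[4:6], filename[6:8]
    return ''.join(['<Date: 20', yy, '-', mm, '-', dd, '>'])

def output_path(folder, filename):
    """Prefix the original name, using the variable rather than the string 'fn'."""
    return os.path.join(folder, 'date_tagged_' + filename)

fn = 'ab190906cde.txt'  # hypothetical file name
text = 'xyz first line\nxyz second line\n'
tagged = re.sub('xyz', date_tag(fn), text)

print(output_path('users/mypath', fn))
print(tagged)
```

As in the question, re.sub replaces the 'xyz' marker with the tag; to keep the marker as well, the replacement could be date_tag(fn) + 'xyz'.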
Currently, I'm using this to calculate the time between two messages and listing the times if they are above 20 seconds.
def time_deltas(infile):
    entries = (line.split() for line in open(INFILE, "r"))
    ts = {}
    for e in entries:
        if " ".join(e[2:5]) == "OuchMsg out: [O]":
            ts[e[8]] = e[0]
        elif " ".join(e[2:5]) == "OuchMsg in: [A]":
            in_ts, ref_id = e[0], e[7]
            out_ts = ts.pop(ref_id, None)
            yield (float(out_ts),ref_id[1:-1],(float(in_ts)*10000 - float(out_ts)*10000))
            n = (float(in_ts)*10000 - float(out_ts)*10000)
            if n> 20:
                print float(out_ts),ref_id[1:-1], n
INFILE = 'C:/Users/klee/Documents/text.txt'
import csv
with open('output_file1.csv', 'w') as f:
    csv.writer(f).writerows(time_deltas(INFILE))
However, there are two major problems. First, Python drops the leading zero when the time is before 10, i.e. 0900. Second, it drops trailing zeros, which makes the time difference inaccurate.
It looks like:
130203.08766
when it should be:
130203.087660
You are yielding floats, so the csv writer turns those floats into strings as it pleases.
If you want your output values to be a certain format, yield a string in that format.
Perhaps something like this?
print "%04.0f" % (900) # prints 0900
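Applied to the rows being written, that could look roughly like the sketch below (Python 3 syntax). The helper format_row, the sample values and the field width of 13 are assumptions for an hhmmss.ffffff timestamp, not something taken from the log file.

```python
import csv
import io

def format_row(out_ts, ref_id, delta):
    """Return the row as strings so csv.writer cannot reformat the numbers."""
    # Width 13 with 6 decimals keeps both the leading and the trailing zeros
    # of an hhmmss.ffffff value; the width is an assumption about the log format.
    return ['{:013.6f}'.format(out_ts), ref_id, '{:.1f}'.format(delta)]

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(format_row(130203.08766, 'A1', 25.0))
writer.writerow(format_row(90203.08766, 'A2', 31.5))
output = buffer.getvalue()
print(output)
```

If the exact digits of the log matter, another option is to yield the original timestamp string e[0] unchanged and convert to float only for the subtraction.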