I get a .csv file with values in it, and one of the columns contains durations in the format hh:mm:ss, for example 06:42:13 (6 hours, 42 minutes and 13 seconds). Now I want to compare this time with a given time, for example 00:00:00, because I have to handle the information in that row differently.
time is the value I got out of the .csv file
if time == 00:00:00:
    do something
else:
    do something different
That's what I want, but it obviously doesn't work the way I did it. I thought Python stored the time as a string, but when I compared it like this:
if time == "00:00:00":
it didn't work either.
That's how I get the values out of the .csv file:
import csv

import_list = []
with open("input.csv", "r") as csvfile:
    inputreader = csv.reader(csvfile, delimiter=';')
    for row in inputreader:
        import_list.append(row)
The .csv file looks like this:
Name; Duration; Tests; Warnings; Errors
Test1; 06:42:13; 2000; 2; 1
Test2; 00:00:00; 0; 0; 0
and so on.
Try it like this:

if time == " 00:00:00":
    ...

You have a leading space at the beginning of the value: the fields in your file are separated by "; ", so every value after the first starts with a space.
Alternatively, you can change your code to strip each field as it is read:
import csv

import_list = []
with open("input.csv", "r") as csvfile:
    inputreader = csv.reader(csvfile, delimiter=';')
    for row in inputreader:
        import_list.append([item.strip() for item in row])
Do this instead:
if time.strip() == "00:00:00":
    do something
else:
    do something different
Instead of doing string comparisons, use the built-in datetime library to create datetime objects. Use datetime.strptime to convert the date string.
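A minimal sketch of that idea, assuming the value comes out of the CSV as a string such as "06:42:13":

from datetime import datetime

value = " 06:42:13"  # hypothetical value read from the CSV (note the possible leading space)
duration = datetime.strptime(value.strip(), "%H:%M:%S").time()

if duration == datetime.strptime("00:00:00", "%H:%M:%S").time():
    # handle the zero-duration row
    pass
else:
    # handle rows with a real duration
    pass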
I have a CSV file like this:
2021-08-09 15:50:44 38962 part-00000-6baa0883-5212-49f7-9ba2-63a352211fdd-c000.snappy.parquet
2021-08-09 16:50:44 38962 part-00000-6baa0883-5212-49f7-9ba2-63a352211fdd-c000.snappy.parquet
I'd like to extract all the timestamps into one list so that I can perform the evaluation below (i.e. check whether check_timestamps_updated is true). The problem is also taking the date into account, not just the time. What's the most efficient way of combining the two separate columns (date and time) from the csv reader object so that they can be compared with control_time?
from datetime import datetime as dt
control_time = str(str(dt.now()))
reader = csv.reader(results, delimiter=" ")
time_column = list(zip(*reader))[1]
check_timestamps_updated = all(i >= control_time for i in time_column)
As far as I understand, what you want to do can be implemented as below:
import csv
from datetime import datetime as dt

check_timestamps_updated = True
control_time = dt.now().timestamp()

with open('example.csv', newline='\n') as f:
    reader = csv.reader(f, delimiter=" ")
    for line in reader:
        # combine the date and time columns into one datetime before comparing
        timestamp = dt.strptime(f'{line[0]} {line[1]}', '%Y-%m-%d %H:%M:%S').timestamp()
        if timestamp < control_time:  # mirrors the all(i >= control_time) check from your attempt
            check_timestamps_updated = False

print(check_timestamps_updated)
You asked for the most efficient way to merge the two columns, but I think it depends on what you mean by efficiency. If the CSV file is very large and memory could become a problem, the implementation above works without issue because it processes one line at a time. If you mean speed, it is still a reasonable approach.
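If you prefer something closer to your original all(...) expression, a sketch in the same spirit (still assuming the example.csv layout above) could be:

import csv
from datetime import datetime as dt

control_time = dt.now().timestamp()

with open('example.csv', newline='') as f:
    reader = csv.reader(f, delimiter=" ")
    # all() short-circuits as soon as one timestamp is older than control_time
    check_timestamps_updated = all(
        dt.strptime(f'{line[0]} {line[1]}', '%Y-%m-%d %H:%M:%S').timestamp() >= control_time
        for line in reader
    )

print(check_timestamps_updated)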
I'm trying to copy headers from one CSV file to another with Python, and it's splitting the headers into individual characters, one character per column. I'm not sure why.
I've read through StackOverflow but couldn't find a question/solution to this problem.
first.csv file data
Date,Data
1/2/2019,a
12/1/2018,b
11/3/2018,c
Python Code
import csv
from datetime import datetime, timedelta

date_ = datetime.strftime(datetime.now(), '%Y_%m_%d')

with open('first.csv', 'r') as full_file, open('second.csv' + '_' + date_ + '.csv', 'w') as past_10_days:
    writer = csv.writer(past_10_days)
    writer.writerow(next(full_file))  # copy headers over from original file
    for row in csv.reader(full_file):  # run through remaining rows
        if datetime.strptime(row[0], '%m/%d/%Y') > datetime.now() - timedelta(days=10):  # write rows where timestamp is greater than today - 10
            writer.writerow(row)
Result I get:
D,a,t,e,D,a,t,a
1/2/2019,a
I'd like the result to just be
Date,Data
1/2/2019,a
Am I just missing setting an option? This is Python 3+
Thanks!
Change
writer.writerow(next(full_file))
To
writer.writerow(next(csv.reader(full_file)))
Your code is reading full_file as a plain text file, not as a CSV, so next(full_file) returns the whole header line as a single string and writerow writes it out character by character.
Ideally, as roganjosh pointed out, you should simply define the reader once, so the code should look like this:
reader = csv.reader(full_file)
writer.writerow(next(reader))
for row in reader:
    if datetime.strptime(row[0], '%m/%d/%Y') > datetime.now() - timedelta(days=10):
        writer.writerow(row)
I have one CSV file, and I want to extract the first column of it. My CSV file is like this:
Device ID;SysName;Entry address(es);IPv4 address;Platform;Interface;Port ID (outgoing port);Holdtime
PE1-PCS-RANCAGUA;;;192.168.203.153;cisco CISCO7606 Capabilities Router Switch IGMP;TenGigE0/5/0/1;TenGigabitEthernet3/3;128 sec
P2-CORE-VALPO.cisco.com;P2-CORE-VALPO.cisco.com;;200.72.146.220;cisco CRS Capabilities Router;TenGigE0/5/0/0;TenGigE0/5/0/4;128 sec
PE2-CONCE;;;172.31.232.42;Cisco 7204VXR Capabilities Router;GigabitEthernet0/0/0/14;GigabitEthernet0/3;153 sec
P1-CORE-CRS-CNT.entel.cl;P1-CORE-CRS-CNT.entel.cl;;200.72.146.49;cisco CRS Capabilities Router;TenGigE0/5/0/0;TenGigE0/1/0/6;164 sec
For that purpose I use the following code that I saw here:
import csv

makes = []
with open('csvoutput/topologia.csv', 'rb') as f:
    reader = csv.reader(f)
    # next(reader)  # Ignore first row
    for row in reader:
        makes.append(row[0])

print makes
Then, for each value in the first column, I want to substitute that value into a text file and save the result as a new file.
Original textfile:
PLANNED.IMPACTO_ID = IMPACTO.ID AND
PLANNED.ESTADOS_ID = ESTADOS_PLANNED.ID AND
TP_CLASIFICACION.ID = TP_DATA.ID_TP_CLASIFICACION AND
TP_DATA.PLANNED_ID = PLANNED.ID AND
PLANNED.FECHA_FIN >= CURDATE() - INTERVAL 1 DAY AND
PLANNED.DESCRIPCION LIKE '%P1-CORE-CHILLAN%';
Expected output:
PLANNED.IMPACTO_ID = IMPACTO.ID AND
PLANNED.ESTADOS_ID = ESTADOS_PLANNED.ID AND
TP_CLASIFICACION.ID = TP_DATA.ID_TP_CLASIFICACION AND
TP_DATA.PLANNED_ID = PLANNED.ID AND
PLANNED.FECHA_FIN >= CURDATE() - INTERVAL 1 DAY AND
PLANNED.DESCRIPCION LIKE 'FIRST_COLUMN_VALUE';
And so on for every value in the first column, and save it as a separate file.
How can I do this? Thank you very much for your help.
You could just read the file, apply changes, and write the file back again. There is no efficient way to edit a file (inserting characters is not efficiently possible), you can only rewrite it.
If your file is going to be big, you should not keep the whole table in memory.
import csv

makes = []
with open('csvoutput/topologia.csv', 'rb') as f:
    reader = csv.reader(f)
    for row in reader:
        makes.append(row)

# Apply changes in makes

with open('csvoutput/topologia.csv', 'wb') as f:
    writer = csv.writer(f)
    writer.writerows(makes)
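If the goal is to produce one output file per value of the first column, a rough sketch along these lines might help; template.txt and the query_*.txt names are assumptions, and the placeholder text must match what is actually in your text file:

import csv

with open('template.txt') as f:  # the original text file shown above
    template = f.read()

with open('csvoutput/topologia.csv') as f:
    reader = csv.reader(f, delimiter=';')
    next(reader)  # skip the header row
    for row in reader:
        device_id = row[0]
        # write one file per Device ID, swapping the first-column value into the LIKE clause
        with open('query_{}.txt'.format(device_id), 'w') as out:
            out.write(template.replace('%P1-CORE-CHILLAN%', device_id))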
I'm trying to write each row of a CSV to a JSON file (this will then be posted and looped back through, so overwriting the JSON file is not a big deal here). I have code which seems to do this well enough, but I also need some of the data to be floats/integers rather than strings.
I have a method which works for this in other places, but cannot manage to get the two to agree with each other.
Could anyone point me in the right direction to be able to format the csv data before sending it out as a json? Below is the code for when headers are left in, though I also have a tweaked version which just has raw data in the csv and uses fieldnames for the headers instead.
import csv
import json

input_file = 'Test3.csv'
output_file_template = 'Test.json'

with open(input_file, 'r', encoding='utf8') as csvfile:
    reader = csv.DictReader(csvfile, delimiter=',')
    rows = list(reader)

for i in range(len(rows)):
    out = json.dumps(rows[1*i:1*(i+1)])
    with open(output_file_template.format(i), 'w') as f:
        f.write(out)
Data is in a format like this:
OrderType OrderStatus OrderDateTime SettlementDate MarketId OrderRoute
Sale Executed 18/11/2016 23/11/2016 1 None
Sale Executed 18/11/2016 23/11/2016 1 None
Sale Executed 18/11/2016 23/11/2016 1 None
With row[4] producing the KeyError.
In your loop, if the float/int data is consistently in the same spot, you can simply cast the values.
for i, row in enumerate(rows):
    row[0] = int(row[0])    # this column stores ints
    row[1] = float(row[1])  # this column stores floats
    out = json.dumps([row])
    with open(output_file_template.format(i), 'w') as f:
        f.write(out)
I don't know if columns 0 and 1 hold ints and floats, but you can change that as necessary.
Update:
It appears row is an OrderedDict, so you'll just need to use the key instead of an index:
row['MarketId'] = int(row['MarketId'])
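Putting the update together with the original loop, a sketch might look like this; it assumes MarketId is the only field that needs casting and that the output file name template contains a {} placeholder:

import csv
import json

input_file = 'Test3.csv'
output_file_template = 'Test{}.json'  # assumed template with a placeholder for the row index

with open(input_file, 'r', encoding='utf8') as csvfile:
    rows = list(csv.DictReader(csvfile, delimiter=','))

for i, row in enumerate(rows):
    row['MarketId'] = int(row['MarketId'])  # cast by key, since row behaves like a dict
    with open(output_file_template.format(i), 'w') as f:
        json.dump([row], f)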
I need to get a string from a CSV file. I know that I can use Python, but I've been looking for hours and still can't get it. This is what the CSV looks like:
DATE|CUST|PHONE|EMAIL|NAME|CLASS|QTY|AMOUNT|ID|TRX_ID|BOOKING CODE|PIN
01-02-2013 09:04:16|sdasd|43543|csdfd|Voucher Regular|REGULAR|1|2250000|G001T001|0062013000149|32143000341|MV1011302JSGUCFOM
01-02-2013 09:04:16|sdasd|43543|csdfd|Voucher Regular|REGULAR|2|1200000|G001T001|0062013000149|32143000341|MV4011302CBWDQYOU&MV4011302PVSEVAPJ
01-02-2013 11:01:13|ge|||Voucher Regular|REGULAR|1|600000|G001T001|20000027000005|32143000355|MV4011302UHKMJEEM
The string that I want to get is the PIN column (the last one); but each PIN cell can contain multiple PINs, separated by '&'.
Thanks for the help, been looking at solving this for hours.
Split on | and get the last entry:
pin = line.split('|')[-1]
Or more fancy:
import csv

with open('bookings.csv', 'rb') as csvfile:
    bookings = csv.reader(csvfile, delimiter='|')
    for values in bookings:
        print(values[-1])
When dealing with a csv file, just use the csv module:
import csv
from itertools import chain

with open('path/to/your/file.csv', 'rb') as csvfile:
    tmp = (r['PIN'].split('&') for r in csv.DictReader(csvfile, delimiter='|'))
    pins = list(chain.from_iterable(tmp))

for pin in pins:
    print pin
Iterate through each line and use something like this:
'hello|world|and|noel'.split('|')[-1]
to get the last element
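For instance, a rough sketch that also splits multi-PIN cells on '&' (the file name is an assumption):

with open('bookings.csv') as f:
    next(f)  # skip the header line
    for line in f:
        # the last '|'-separated field is the PIN column; it may hold several PINs joined by '&'
        for pin in line.strip().split('|')[-1].split('&'):
            print(pin)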