String Replace for Multiple Lines In A CSV

Below is a snippet from a CSV file. Column 1 is the product number, column 2 is the stock level, column 3 is the target level, and column 4 is the distance from target (target minus stock level).
34512340,0,95,95
12395675,3,95,92
56756777,70,95,25
90673412,2,95,93
When the stock level gets to 5 or below, I want to have the stock levels updated from python when a user requests it.
I am currently using this piece of code which I have adapted from just updating one line in the CSV. It isn't working though. The first line is written back to the file as 34512340,0,95,95 and the rest of the file is deleted.
choice = input("\nTo update the stock levels of the above products, type 1. To cancel, enter anything else.")
if choice == '1':
    with open('stockcontrol.csv',newline='') as f:
        for line in f:
            data = line.split(",")
            productcode = int(data[0])
            target = int(data[2])
            stocklevel = int(data[1])
            if stocklevel <= 5:
                target = str(target)
                import sys
                import csv
                data=[]
                newval= target
                newtlevel = "0"
                f=open("stockcontrol.csv")
                reader=csv.DictReader(f,fieldnames=['code','level', 'target', 'distancefromtarget'])
                for line in reader:
                    line['level']= newval
                    line['distancefromtarget']= newtlevel
                    data.append('%s,%s,%s,%s'%(line['code'],line['level'],line['target'],line['distancefromtarget']))
                f.close()
                f=open("stockcontrol.csv","w")
                f.write("\n".join(data))
                f.close()
    print("The stock levels were updated successfully")
else:
    print("Goodbye")
Here is the code I had for changing one line in the CSV file, which works:
with open('stockcontrol.csv',newline='') as f:
    for line in f:
        if code in line:
            data = line.split(",")
            target = (data[2])
            newlevel = stocklevel - quantity
            updatetarget = int(target) - int(newlevel)
            stocklevel = str(stocklevel)
            newlevel = str(newlevel)
            updatetarget = str(updatetarget)
            import sys
            import csv
            data=[]
            code = code
            newval= newlevel
            newtlevel = updatetarget
            f=open("stockcontrol.csv")
            reader=csv.DictReader(f,fieldnames=['code','level', 'target', 'distancefromtarget'])
            for line in reader:
                if line['code'] == code:
                    line['level']= newval
                    line['distancefromtarget']= newtlevel
                data.append('%s,%s,%s,%s'%(line['code'],line['level'],line['target'],line['distancefromtarget']))
            f.close()
            f=open("stockcontrol.csv","w")
            f.write("\n".join(data))
            f.close()
What can I change to make the code work? I basically want the program to loop through each line of the CSV file, and if the stock level (column 2) is equal to or less than 5, update the stock level to the target number in column 3, and then set the number in column 4 to zero.
Thanks,

The code below reads each line and checks the value of column 2. If it is less than or equal to 5, column 2 is set to the value of column 3 and the last column is set to 0; otherwise all columns are left unchanged.
import csv

# Read every row, topping up any product whose stock level is 5 or below.
data = []
with open("stockcontrol.csv", newline="") as f:
    reader = csv.DictReader(f, fieldnames=['code', 'level', 'target', 'distancefromtarget'])
    for line in reader:
        if int(line['level']) <= 5:
            line['level'] = line['target']
            line['distancefromtarget'] = 0
        data.append("%s,%s,%s,%s" % (line['code'], line['level'], line['target'], line['distancefromtarget']))

# Write all rows back, changed or not.
with open("stockcontrol.csv", "w") as f:
    f.write("\n".join(data))
As for the issues in your code:
You first read the file without the csv module, getting the values of each column by splitting the line. You then read the same values again with the DictReader class of the csv module. That second loop has no condition on the stock level, so it assigns the new level to every row instead of only to the rows at or below 5.
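The same update can also be sketched with csv.reader and csv.writer, so that the module handles both parsing and line endings. This is only a minimal sketch: the function name restock and its threshold parameter are made up for illustration, and it assumes every non-empty row has the four columns shown in the question.

```python
import csv

def restock(path, threshold=5):
    """Reset low-stock rows to their target level; return how many rows changed.

    `restock` and `threshold` are hypothetical names, not part of the question.
    """
    # Read the whole file into memory first, so it can be safely rewritten.
    with open(path, newline="") as f:
        rows = list(csv.reader(f))

    updated = 0
    for row in rows:
        if not row:
            continue  # skip blank lines
        if int(row[1]) <= threshold:
            row[1] = row[2]  # stock level becomes the target level
            row[3] = "0"     # distance from target is now zero
            updated += 1

    # Write every row back, changed or not.
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return updated
```

Calling restock("stockcontrol.csv") would rewrite the file in place and report how many products were topped up.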

Related

Compare 2 csv files and check for first 2 columns, if it matches ask the user to decide to override or not and then proceed to next row

I have two CSV files, each with a few rows of three columns. I want to compare the first two columns of each row across the two files. Where they match, the user should be asked whether to overwrite the row in the first CSV file with the values from the second CSV file; if the answer is no, the operation should abort.
The first time I run the Python code, it should simply update the first CSV file with the new values from the second one. On subsequent runs it has to check whether the first two columns match and ask the user whether to override the values, since by then the first CSV file already contains the rows copied over on the earlier run.
My code:
import csv
import sys
def csv_file_copy():
    csv_file = input("Enter the CSV file needs to be updated ")
    csv_file_cp = input("Enter the csv file from where the data needs to be copied ")
    csvfile = open(csv_file_cp, 'r',encoding="utf-8-sig")
    reader = csv.reader(csvfile)
    csv_file_orig = open(csv_file, 'r',encoding="utf-8-sig")
    reader2 = csv.reader(csv_file_orig)
    res = []
    for row in reader:
        print("This is row", row)
        for row2 in reader2:
            print("This is row2", row2)
            if (row2[0] == row[0] and row2[1] == row[1]):
                user_input = input("Store type and store number already exists in the csv file, continue? y/n ").lower()
                if user_input == "y":
                    res.append(row)
                elif user_input == "n":
                    print("Aborting operation")
                    sys.exit(1)
            else:
                res.append(row2)
                res.append(row)
                continue
    print (reader)
    with open(csv_file, 'w') as csv_file1:
        writer = csv.writer(csv_file1, delimiter=',')
        for row in res:
            writer.writerow(row)
csv_file_copy()
When the code is run a second time against the same two files, the inner for loop runs only once, so only one value is matched even though about 10 values match. That doesn't work for me.

If the csv_file_orig is not too big (or your available memory too low) then you may store the whole contents into a list.
Instead of
reader2 = csv.reader(csv_file_orig)
You'll use
csv_file_orig_lines = list(csv.reader(csv_file_orig))
Afterwards you may iterate through csv_file_orig_lines list as many times as you want.
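The reason this helps is that a csv.reader is a one-shot iterator: after the first full pass it is exhausted, so the inner loop has nothing left to read on later passes. A small sketch of the effect follows, using in-memory sample data invented for illustration (the store values and the names first_pass, second_pass and clashes are assumptions, not taken from the question).

```python
import csv
import io

# Made-up sample data standing in for the two CSV files.
original = io.StringIO("A,1,old\nB,2,old\n")
incoming = io.StringIO("A,1,new\nC,3,new\n")

# A csv.reader can be consumed only once.
reader = csv.reader(original)
first_pass = list(reader)    # all rows
second_pass = list(reader)   # empty: the reader is already exhausted

# Materialising the rows once lets them be scanned as often as needed.
incoming_rows = list(csv.reader(incoming))
existing_keys = {(row[0], row[1]) for row in first_pass}

# Rows of the second file whose first two columns already exist in the first.
clashes = [row for row in incoming_rows if (row[0], row[1]) in existing_keys]
print(clashes)
```

Building a set of (column 1, column 2) keys is an optional refinement: it avoids the nested loop entirely, at the cost of holding the keys in memory.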

Add one column to a text file

I have multiple txt files, each with 6 columns. I want to add just one column as the last column, so that each file ends up with at most 7 columns, and running the script again should not add another one:
At the beginning each file has six columns:
637.39 718.53 155.23 -0.51369 -0.18539 0.057838 3.209840789730089
636.56 720 155.57 -0.51566 -0.18487 0.056735 3.3520643559939938
635.72 721.52 155.95 -0.51933 -0.18496 0.056504 3.4997850701290125
What I want is to add a new column of zeros only if the current number of columns is 6. After that, running the script again should not add another column (7 columns is the total, the last one being zeros):
637.39 718.53 155.23 -0.51369 -0.18539 0.057838 3.209840789730089 0
636.56 720 155.57 -0.51566 -0.18487 0.056735 3.3520643559939938 0
635.72 721.52 155.95 -0.51933 -0.18496 0.056504 3.4997850701290125 0
My code works and adds one additional column each time I run the script, but I want it to add the column only once, when the number of columns is 6. Here (a) gives me the number of columns; if the condition is fulfilled, a new column should be added:
from glob import glob
import numpy as np

new_column = [0] * 20

def get_new_line(t):
    l, c = t
    return '{} {}\n'.format(l.rstrip(), c)

def writecolumn(filepath):
    # Load data from file
    with open(filepath) as datafile:
        lines = datafile.readlines()
    a = np.loadtxt(lines, dtype='str').shape[1]
    print(a)
    # if a == 6:  (here is the problem)
    n, r = divmod(len(lines), len(new_column))
    column = new_column * n + new_column[:r]
    new_lines = list(map(get_new_line, zip(lines, column)))
    with open(filepath, "w") as f:
        f.writelines(new_lines)

if __name__ == "__main__":
    filepaths = glob("/home/experiment/*.txt")
    for path in filepaths:
        writecolumn(path)
When I enable the check (if a == 6) and move the content inside the if statement, I get an error. Without moving the content inside the if, everything works, but a column is still added each time I run the script.
Any help is appreciated.
To test the code, create one or two txt files containing six columns of random numbers.

This could be an indentation problem, i.e. in the block below the if. Everything that builds and writes the new lines has to be indented under it.
This works:
def writecolumn(filepath):
    # Load data from file
    with open(filepath) as datafile:
        lines = datafile.readlines()
    a = np.loadtxt(lines, dtype='str').shape[1]
    print(a)
    if int(a) == 6:
        n, r = divmod(len(lines), len(new_column))
        column = new_column * n + new_column[:r]
        new_lines = list(map(get_new_line, zip(lines, column)))
        with open(filepath, "w") as f:
            f.writelines(new_lines)

Use pandas to read your text file:
import pandas as pd
df = pd.read_csv("whitespace.csv", header=None, delimiter=" ")
Add one or more columns as needed:
df['somecolname'] = 0
Save the DataFrame with no header and no index:
df.to_csv("whitespace.csv", sep=" ", header=False, index=False)
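For comparison, the "add the column only once" rule can also be sketched with the standard library alone. This is a minimal sketch, not the asker's code: the function name add_zero_column and the expected_columns parameter are made up for illustration, and it assumes whitespace-separated values. Blank lines are dropped when the file is rewritten.

```python
def add_zero_column(path, expected_columns=6):
    """Append a trailing 0 to every line, but only while the file still has
    `expected_columns` whitespace-separated fields per line.

    Returns True if the file was rewritten, False if it was left untouched.
    (`add_zero_column` and `expected_columns` are hypothetical names.)
    """
    with open(path) as f:
        lines = [line.rstrip("\n") for line in f if line.strip()]

    # Already extended, empty, or an unexpected layout: do nothing.
    if not lines or any(len(line.split()) != expected_columns for line in lines):
        return False

    with open(path, "w") as f:
        f.writelines(f"{line} 0\n" for line in lines)
    return True
```

Because the column count is checked before anything is written, running the function a second time on the same file leaves it unchanged.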

How to count lines in a text file with specified values?

I'm working with a .csv file that lists Timestamps in one column and Wind Speeds in the second column. I need to read through this .csv file and calculate the percent of time where wind speed was above 2m/s. Here's what I have so far.
txtFile = r"C:\Data.csv"
line = o_txtFile.readline()[:-1]
while line:
line = oTextfile.readline()
for line in txtFile:
line = line.split(",")[:-1]
How do I get a count of the lines where the 2nd element in the line is greater than 2?
CSV File Sample

You will probably have to adjust your CSV slightly, depending on the option you choose: for options 1 and 2 you will definitely want to remove all header rows, whereas for option 3 you keep only the middle one, i.e. the one that starts with TIMESTAMP.
You have three options:
Option 1: Vanilla Python
count = 0
with open('data.csv', 'r') as file:
    for line in file:
        # Wind speeds are decimals, so parse with float rather than int
        value = float(line.split(',')[1])
        if value > 2:
            count += 1
# Now you have the value in the ``count`` variable
Option 2: CSV module
Here I use Python's csv module (you could use DictReader as well, but I'll let you look that up yourself).
import csv
count = 0
with open('data.csv', 'r', newline='') as file:
    reader = csv.reader(file, delimiter=',')
    for row in reader:
        if float(row[1]) > 2:
            count += 1
# Now you have the value in the ``count`` variable
Option 3: Pandas
Pandas is a library that a lot of people use for data analysis. Doing what you want to do would look like:
import pandas as pd
df = pd.read_csv('data.csv')
# Count the rows whose wind speed is above 2 m/s
count = len(df[df['WindSpd_ms'] > 2])
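Since the question asks for a percentage rather than a raw count, the counting loop can be extended to track the total as well. The sketch below is one possible way to do that; the function name percent_above and its parameters are invented for illustration, and it assumes the wind speed sits in the second column. Rows whose value is not a number (for example header rows) are skipped instead of having to be removed from the file.

```python
import csv

def percent_above(lines, threshold=2.0, column=1):
    """Return the percentage of data rows whose value in `column` exceeds
    `threshold`. `lines` is any iterable of CSV text lines, such as an open
    file. (`percent_above` and its parameters are hypothetical names.)
    """
    total = above = 0
    for row in csv.reader(lines):
        if len(row) <= column:
            continue  # blank or short line
        try:
            speed = float(row[column])
        except ValueError:
            continue  # non-numeric cell, e.g. a header row
        total += 1
        if speed > threshold:
            above += 1
    return 100.0 * above / total if total else 0.0
```

With a file this would be used as percent_above(open_file), where open_file comes from open('data.csv', newline='').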

You can read the file line by line and split each line that is not empty.
Count the lines read and how many of them are above 10 m/s, then calculate the percentage:
# create data file for processing with random data
import random
random.seed(42)
with open("data.txt","w") as f:
    f.write("header\n")
    f.write("header\n")
    f.write("header\n")
    f.write("header\n")
    for sp in random.choices(range(10),k=200):
        f.write(f"some date,{sp+3.5}, data,data,data\n")
# open/read/calculate percentage of data that has 10m/s speeds
days = 0
speedGreater10 = 0
with open("data.txt","r") as f:
    for _ in range(4):
        next(f) # ignore first 4 rows containing headers
    for line in f:
        if line.strip(): # not empty
            _ , speed, *p = line.split(",")
            # _ and *p are ignored (they take 'some date' + [data,data,data])
            days += 1
            if float(speed) > 10:
                speedGreater10 += 1
# the .1% format multiplies the fraction by 100 before printing
print(f"{days} datapoints, of which {speedGreater10} "+
      f"got more than 10m/s: {speedGreater10/days:.1%}")
Output:
200 datapoints, of which 55 got more than 10m/s: 27.5%
Datafile:
header
header
header
header
some date,9.5, data,data,data
some date,3.5, data,data,data
some date,5.5, data,data,data
some date,5.5, data,data,data
some date,10.5, data,data,data
[... some more ...]
some date,8.5, data,data,data
some date,3.5, data,data,data
some date,12.5, data,data,data
some date,11.5, data,data,data

Remove multiple lines from csv

This is my code so far. I have many lines in a CSV that I would like to keep, but one header line repeats throughout the file, and I only want to keep it where it is the third line.
This is the line I'd like omitted whenever it is not the third row:
Curriculum Name,,Organization Employee Number,Employee Department,Employee Name,Employee Email,Employee Status,Date Assigned,Completion Date,Completion Status,Manager Name,Manager Email
It appears every 10 lines or so, but I want it removed wherever it is not the first occurrence (which is always the third row).
import csv, sys, os
#Read the CSV file and skipping the first 130 lines based on mylist
scanReport = open('Audit.csv', 'r')
scanReader = csv.reader(scanReport)
#search row's in csv - print out list
for file in glob.glob(r'C:\sans\Audit.csv'):
lineNumber = 0
str - "Curriculum Name"
with open('first.csv', 'rb') as inp, open('first_edit.csv', 'wb') as out:
writer = csv.writer(out)
for row in csv.writer(inp):
if row[2] != " 0":
writer.writerow(row)

You want something like this in that loop:
for index, row in enumerate(csv.reader(inp)):
    if (index != 3) or (index == 3 and row[2] != " 0"):
        writer.writerow(row)
I am not familiar with the csv module, so I kept the rest of your code, assuming it is correct (I don't think you need that module for what you are doing, though). Note that the rows have to be read with csv.reader; csv.writer cannot be iterated.
enumerate is a built-in that yields (index, item) pairs. It counts from 0, so the third row has index 2.
EDIT:
To check if it's that line:
def IsThatLine(row):
    return row[0] == "Curriculum Name" and row[1] == "" and row[2] == "Organization Employee Number" and ....
Then the if can become:
if (index != 3) or (index == 3 and not IsThatLine(row)):
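Another way to read the requirement is: keep the header where it first appears (the third row) and drop every later repeat of it. A minimal sketch of that interpretation follows. The function name drop_repeated_headers and the header_index parameter are made up for illustration, and comparing whole rows for equality is an assumption about how the repeats look in the file.

```python
import csv

def drop_repeated_headers(src_path, dst_path, header_index=2):
    """Copy src_path to dst_path, keeping the row at `header_index`
    (0-based, so 2 is the third row) and dropping later identical rows.

    Returns the number of rows dropped.
    (`drop_repeated_headers` and `header_index` are hypothetical names.)
    """
    dropped = 0
    header = None
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for index, row in enumerate(csv.reader(src)):
            if index == header_index:
                header = row          # remember the header, and keep it
            elif header is not None and row == header:
                dropped += 1          # a repeat of the header: skip it
                continue
            writer.writerow(row)
    return dropped
```

For the files named in the question this would be called as drop_repeated_headers('first.csv', 'first_edit.csv').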

Could you please be more specific in your question?
Would you like to remove any line containing the following description?
Curriculum Name,,Organization Employee Number,Employee Department,Employee Name,Employee Email,Employee Status,Date Assigned,Completion Date,Completion Status,Manager Name,Manager Email
Or would you like to remove only the third line (row) of this csv file?

Get number of rows from .csv file

I am writing a Python module where I read a .csv file with 2 columns and a random amount of rows. I then go through these rows until column 1 > x. At this point I need the data from the current row and the previous row to do some calculations.
Currently, I am using 'for i in range(rows)', but each CSV file will have a different number of rows, so this won't work.
The code can be seen below:
rows = 73
for i in range(rows):
    c_level = Strapping_Table[Tank_Number][i,0] # Current level
    c_volume = Strapping_Table[Tank_Number][i,1] # Current volume
    if c_level > level:
        p_level = Strapping_Table[Tank_Number][i-1,0] # Previous level
        p_volume = Strapping_Table[Tank_Number][i-1,1] # Previous volume
        x = level - p_level # Intermediate values
        if x < 0:
            x = 0
        y = c_level - p_level
        z = c_volume - p_volume
        volume = p_volume + ((x / y) * z)
        return volume
When playing around with arrays, I used:
for row in Tank_data:
    print row[c] # print column c
    time.sleep(1)
This goes through all the rows, but I cannot access the previous rows data with this method.
I have thought about storing previous row and current row in every loop, but before I do this I was wondering if there is a simple way to get the amount of rows in a csv.

Store the previous line:
with open("myfile.txt", "r") as file:
    previous_line = next(file)
    for line in file:
        print(previous_line, line)
        previous_line = line
Or you can wrap it in a generator:
def prev_curr(file_name):
    with open(file_name, "r") as file:
        previous_line = next(file)
        for line in file:
            yield previous_line, line
            previous_line = line
# usage
for prev, curr in prev_curr("myfile"):
    do_your_thing()

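You should use enumerate. Start at the second row so that i-1 always refers to the previous row (at i = 0, index -1 would silently wrap around to the last row):
for i, row in enumerate(tank_data[1:], start=1):
    print(row[c], tank_data[i-1][c])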

Since the size of each row in the CSV is unknown until it has been read, you'll have to do an initial pass through the file if you want to find the number of rows, e.g.:
numberOfRows = sum(1 for row in file)
However, that means your code will read the CSV twice, which you may not want if the file is very big. In that case the simple option of storing the previous row in a variable on each iteration may be the best one.
An alternative route is to read the file into e.g. a pandas DataFrame (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html) and analyse it from there,
but again this could be slow if your CSV is too big.
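Putting the "remember the previous row" idea together with the interpolation from the question, the row count is not needed at all. The sketch below assumes the table is available as an iterable of (level, volume) pairs sorted by ascending level. The function name interpolate_volume is made up for illustration, and returning None when the level lies beyond the last row is a choice made here, not something stated in the question.

```python
def interpolate_volume(table, level):
    """Linearly interpolate the volume for `level`.

    `table` is an iterable of (level, volume) pairs in ascending level order.
    Returns None if no row has a level greater than `level`.
    (`interpolate_volume` is a hypothetical name.)
    """
    rows = iter(table)
    try:
        prev_level, prev_volume = next(rows)
    except StopIteration:
        return None  # empty table

    for cur_level, cur_volume in rows:
        if cur_level > level:
            # Same arithmetic as in the question: clamp x at zero, then scale.
            x = max(level - prev_level, 0)
            y = cur_level - prev_level
            z = cur_volume - prev_volume
            return prev_volume + (x / y) * z
        # Remember this row as "previous" for the next iteration.
        prev_level, prev_volume = cur_level, cur_volume
    return None
```

Because only two rows are held at a time, this works for files of any length and can be fed directly from a csv.reader after converting each field to float.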
