Selecting a certain cell in a CSV - Python

My company recently purchased a machine and I'm trying to find a way to store its data in our database, but first I need to clean up the CSV or select certain cells to write into a new CSV. I'm currently using Python 3.9XX.
I need to extract the following items from this file: serial number (highlighted yellow), start time, end time, pass steps, fail steps, and test results.
If I can manage to select one cell I will try to do the rest on my own, but I'm currently stuck trying to select the serial number and then write it into a new CSV.
DATA FROM CSV
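import csv
# Read the source CSV into a list of rows so individual cells can be indexed later
csvFile = r"C:\Users\Hunter\Documents\Programing\Python\Measu Dev\11.csv"
with open(csvFile, 'rt', newline='') as f:
    rows = list(csv.reader(f))
Headers = ['SerialNo','PartNo','Startime','Endtime','TabPassed','TabFailed','TestResult']
SerialNo = []  # still empty: no cell from rows has been stored here yet
# Write the header row, then the (currently empty) serial number row
with open('Processed.csv', 'w', encoding='utf-8', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(Headers)
    writer.writerow(SerialNo)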
RESULT
This is my end result. I want to be able to store the serial number under its header 'SerialNo', but nothing seems to work on my end. I'm pretty new to this, so any help will be appreciated.
Thank you, guys.
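One way to approach the "select one cell" step is to load the file into a list of rows and index it by row and column. The sketch below is only an illustration: the function names and the default position `serial_pos=(1, 1)` are hypothetical, because the real location of the serial number is only visible in the screenshot, so it has to be adjusted to the actual file.

```python
import csv

HEADERS = ['SerialNo', 'PartNo', 'Startime', 'Endtime',
           'TabPassed', 'TabFailed', 'TestResult']

def extract_cell(rows, row_index, col_index):
    """Return the stripped text of one cell, or '' if that position does not exist."""
    try:
        return rows[row_index][col_index].strip()
    except IndexError:
        return ''

def write_summary(src_path, dst_path, serial_pos=(1, 1)):
    """Copy the serial number cell into a new CSV under the 'SerialNo' header."""
    # serial_pos is a hypothetical (row, column) location, both counted from 0
    with open(src_path, newline='') as f:
        rows = list(csv.reader(f))
    record = {'SerialNo': extract_cell(rows, *serial_pos)}
    with open(dst_path, 'w', encoding='utf-8', newline='') as f:
        # Columns that are missing from record are written as empty strings
        writer = csv.DictWriter(f, fieldnames=HEADERS, restval='')
        writer.writeheader()
        writer.writerow(record)
    return record
```

Using `csv.DictWriter` keeps each value tied to its header, so the other fields (start time, end time, and so on) can be added to `record` one at a time as their positions are worked out.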

Related

Python: Read in multiple csv files from a directory into a dictionary without using Pandas

I have a directory containing multiple CSVs that I would like to read into a single dictionary. The dictionary would use the original file names as keys and the contents of the CSVs as values. I don't want to use pandas because I am new to Python and want to understand these tasks first before pulling out the big guns. I would like to use DictReader for the task. Here is the code I have so far. It works fine for one file at a time. Help is greatly appreciated.
import csv
def read_lines():
    data = []
    with open('vari_late_low_scores.csv', newline='') as stream:
        reader = csv.reader(stream, delimiter=',', skipinitialspace=True)
        for row in reader:
            data.append(row)
    return data
Thank you!
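A minimal sketch of the multi-file version, assuming every file in the directory ends in `.csv` and has a header row that `DictReader` can use for its keys. The function name `read_csv_dir` is made up for this example.

```python
import csv
from pathlib import Path

def read_csv_dir(directory):
    """Map each CSV file name in `directory` to a list of row dictionaries."""
    data = {}
    for path in sorted(Path(directory).glob('*.csv')):
        with open(path, newline='') as stream:
            # DictReader takes its keys from the first row of each file
            reader = csv.DictReader(stream, skipinitialspace=True)
            data[path.name] = list(reader)
    return data
```

Calling `read_csv_dir('.')` would then return something shaped like `{'vari_late_low_scores.csv': [...], ...}`, with one list of rows per file.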

How to update columns of a CSV file if row exists, else how to append to same CSV, using temporary file

I am stuck at trying to build a database using a CSV file.
I am using input of symbols (stock market tickers), and I am able to generate website links for each symbol, corresponding to the company's website.
I would like to save that database to a CSV file named BiotechDatabase.csv
The Database look
Every time I input a new symbol in Python, I would like to verify the first column of the CSV file to see if the symbol exists. If it does, I need to overwrite the Web column to make sure it is updated.
If the symbol does not exist, a row will need to be appended containing the symbol and the Web.
Since I need to expand the columns to add more information in the future, I need to use DictWriter as some columns might have missing information and need to be skipped.
I have been able to update information for a symbol if the symbol is in the database using the code below:
import csv
import shutil
from tempfile import NamedTemporaryFile
# Replace the symbol below with any stock symbol I want to get the website for
symbol = 'PAVM'
# web(symbol) generates the website I need for PAVM (http://www.pavmed.com), which I convert to a string below
web(symbol)
filename = 'BiotechDatabase.csv'
tempfile = NamedTemporaryFile('w', newline='', delete=False)
fields = ['symbol','Web']
# I was able to replace any symbol row using the code below:
with open(filename, 'r', newline='') as csvfile, tempfile:
    reader = csv.DictReader(csvfile, fieldnames=fields)
    writer = csv.DictWriter(tempfile, fieldnames=fields)
    for row in reader:
        if row['symbol'] == symbol:
            print('adding row', row['symbol'])
            row['symbol'], row['Web'] = symbol, str(web(symbol))
        row = {'symbol': row['symbol'], 'Web': row['Web']}
        writer.writerow(row)
shutil.move(tempfile.name, filename)
If the symbol I entered in Python doesn't exist however in the CSV file, how can I append a new row in the CSV file at the bottom of the list, without messing with the header, and while still using a temporary file?
Since the tempfile I defined above uses mode 'w', do I need to create another temporary file that allows mode 'a' in order to append rows?
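A second temporary file should not be necessary: one option is to remember whether the symbol was seen while copying rows, and write the new row to the same temporary file once the loop finishes. The sketch below assumes the CSV starts with a `symbol,Web` header row. `upsert_symbol` is a hypothetical name, and it takes the URL as an argument instead of calling the asker's own `web()` function.

```python
import csv
import os
import shutil
from tempfile import NamedTemporaryFile

def upsert_symbol(filename, symbol, url, fields=('symbol', 'Web')):
    """Update the row for `symbol`, or append a new row if none matches.

    Returns True if an existing row was updated, False if a row was appended.
    """
    found = False
    # Create the temp file next to the target so the final move stays on one filesystem
    target_dir = os.path.dirname(os.path.abspath(filename))
    with open(filename, 'r', newline='') as src, \
         NamedTemporaryFile('w', newline='', delete=False, dir=target_dir) as tmp:
        reader = csv.DictReader(src)  # the first row is consumed as the header
        writer = csv.DictWriter(tmp, fieldnames=list(fields),
                                restval='', extrasaction='ignore')
        writer.writeheader()
        for row in reader:
            if row['symbol'] == symbol:
                row['Web'] = url
                found = True
            writer.writerow(row)
        if not found:
            # Still in 'w' mode: the new row simply follows the copied rows
            writer.writerow({'symbol': symbol, 'Web': url})
    shutil.move(tmp.name, filename)
    return found
```

Because `restval=''` fills in missing columns, extra entries can later be added to `fields` without changing the loop.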
You can simplify your code dramatically using the pandas Python library.
Note: I do not know what the raw data looks like, so you might need to do some tweaking to get it to work. Please feel free to ask me more about the solution in the comments.
import pandas as pd
symbol = 'PAVM'
filename = 'BiotechDatabase.csv'
fields = ['symbol', 'Web']
# Read the CSV, replacing its header row with the names in fields
# and using the symbol column as the index
df = pd.read_csv(filename, names=fields, header=0, index_col='symbol')
# Update the Web value if the symbol exists; otherwise this adds a new row
df.loc[symbol, 'Web'] = web(symbol)
# Save back to filename, overwriting it - be careful!
df.to_csv(filename)
There might be some faster ways to do that but this one is very elegant.

Output of terminal to a csv with separate columns in python

My code is as follows:
import csv
with open('Remarks_Drug.csv', newline='', encoding='utf-8') as myFile:
    reader = csv.reader(myFile)
    for row in reader:
        product = row[0].lower()
        filename = row[1]
        product_patterns = ', '.join([i.split("+")[0].strip() for i in product.split(",")])
        print(product_patterns, filename)
which outputs as below: (where film-coated tab should be one column and the filename should be another column)
film-coated tablet RECD outcome AUBAGIO IAIN-21 AoR.txt
solution for injection 093 Acceptance NO Safety profil.txt
I want to output this to a CSV file with one column for product_patterns and another for filename. I wrote the code below, but only the last row gets written. Can anyone please help me with the looping here? The code I wrote is:
with open('drug_output.csv', 'a') as csvfile:
    fieldnames = ['product_patterns', 'filename']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writerow({'product_patterns': product_patterns, 'filename': filename})
Depending on the environment you can use, a dedicated library might be a more practical way to solve your problem.
The pandas package in particular seems useful in your case.
Then you can load the csv using:
import pandas as pd
df=pd.read_csv(file_path)
After doing the necessary manipulations, you can save it again with
df.to_csv(file_path)
This will save you a lot of issues that typically occur when parsing line by line, and it should also increase performance a bit. Pandas is a pretty good package to learn anyway if you need to do some data manipulation.
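Staying with the csv module, the "only the last row" symptom is consistent with the write happening once, after the reading loop has already finished, so that only the final values of `product_patterns` and `filename` are left. A sketch that opens both files together and writes inside the loop is shown here. The function name `extract_patterns` is hypothetical, and unlike the original it opens the output in `'w'` mode and writes a header row.

```python
import csv

def extract_patterns(in_path, out_path):
    """Write one (product_patterns, filename) row for every input row."""
    with open(in_path, newline='', encoding='utf-8') as src, \
         open(out_path, 'w', newline='', encoding='utf-8') as dst:
        reader = csv.reader(src)
        writer = csv.DictWriter(dst, fieldnames=['product_patterns', 'filename'])
        writer.writeheader()
        for row in reader:
            product = row[0].lower()
            filename = row[1]
            # Keep the text before any '+' in each comma-separated part
            product_patterns = ', '.join(
                part.split('+')[0].strip() for part in product.split(','))
            # Writing inside the loop emits a row for every input line
            writer.writerow({'product_patterns': product_patterns,
                             'filename': filename})
```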

Compare List against CSV file

I have an RSS feed I want to grab data from, manipulate, and then save to a CSV file. The feed's refresh interval varies widely, from 1 minute to several hours, and it only holds 100 items at a time. So to capture everything, I'm looking to have my script run every minute. The problem is that if the script runs before the feed updates, I will be grabbing past data, which leads to duplicate data being added to the CSV.
I tried looking at using examples mentioned here but it kept erroring out.
Data Flow:
RSS Feed --> Python Script --> CSV file
Sample data and code below:
Sample Data from CSV:
gandcrab,acad5fc7ebe8c6979d98cb8537e3a247,18bb2c3b82649314dfd45a379058869804954276,bf0ac94c6ae6f1ecfcccc049ae2373bfc659b2efb2e48e824e2e78fb43b6ebef,54,C
Sample Data from list:
zeus,186e84c5fd7da7331a62f1f13b1f4608,3c34aee767859fd75eb0c8c701716cbfd5655437,05c8e4f01ec8d4e6f4595db93bbcc0f85386c9f1b82b5833d983c9092640573a,49,C
Code for comparing:
if trends_f.is_file():
    with open('trendsv3.csv', 'r+', newline='') as csv_file:
        h_reader = csv.reader(csv_file)
        next(h_reader)  # skip reading header of csv
        # should I load the csv into a list, then compare it with diff() against the other list?
        # or is there an easier, faster, more efficient way?
I would recommend downloading everything into a CSV and then deduplicating in batches (e.g. nightly) to generate a new "clean" CSV for whatever you're working on.
To dedup, load the data with the pandas library and then use the drop_duplicates function on the data.
http://pandas.pydata.org/pandas-docs/version/0.17/generated/pandas.DataFrame.drop_duplicates.html
Adding the ID from the feed seemed to make things the easiest to check against. Thanks to @blhsing for mentioning that. I ended up reading the IDs from the CSV into a list and checking the new data's IDs against that. There may be a faster, more efficient way, but this works for me.
Code to check csv before saving to it:
csv_list = []  # IDs already saved in the CSV
if trends_f.is_file():
    with open('trendsv3.csv', 'r', newline='') as csv_file:
        h_reader = csv.reader(csv_file, delimiter=',')
        next(h_reader, None)  # skip the header row
        for row in h_reader:
            csv_list.append(row[6])
    # the with block closes the file, so no explicit close() is needed
    with open('trendsv3.csv', 'a', newline='') as csv_file:
        h_writer = csv.writer(csv_file)
        for entry in data_list:
            if entry[6].strip() not in csv_list:
                print(entry[6], ' is not in the list, saving ', entry[6], ' to the list')
                h_writer.writerow(entry)
            else:
                print(entry[6], ' is in the list')
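On the question of a faster way: a possible refinement is to keep the saved IDs in a set, since membership tests on a set do not slow down as the file grows the way scanning a list does. The sketch below assumes, as in the code above, that the ID sits at index 6 of every row. The function name `append_new_entries` and the optional `header` parameter are additions made for this example.

```python
import csv
from pathlib import Path

ID_COLUMN = 6  # position of the feed ID, as in the code above

def append_new_entries(csv_path, entries, id_column=ID_COLUMN, header=None):
    """Append only entries whose ID is not yet in the CSV; return the ones added."""
    path = Path(csv_path)
    is_new = not path.is_file()
    seen = set()
    if not is_new:
        with open(path, newline='') as f:
            reader = csv.reader(f)
            next(reader, None)  # skip the header row
            seen = {row[id_column].strip() for row in reader
                    if len(row) > id_column}
    added = []
    with open(path, 'a', newline='') as f:
        writer = csv.writer(f)
        if is_new and header:
            writer.writerow(header)  # only written when the file is created
        for entry in entries:
            entry_id = entry[id_column].strip()
            if entry_id not in seen:
                writer.writerow(entry)
                seen.add(entry_id)  # also catches duplicates within one batch
                added.append(entry)
    return added
```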

Writing a new column to CSV

I am attempting to automatically update a list of Reddit subscriber counts for popular crypto subreddits.
I am calling a number from the PRAW API. I am successfully feeding the API with the cryptocurrency I want via a CSV, and then printing it to the terminal.
I am having trouble then taking that printed array and writing it to a new CSV.
I am very new to Python, coding, and Stack Overflow, so any help or suggestions would be appreciated! I'm on version 3.7.1, on Windows, and using IDLE.
import csv
with open('Reddit.csv') as csvinput:
    with open('Reddit2.csv', 'w') as csvoutput:
        writer = csv.writer(csvoutput, lineterminator='\n')
        reader = csv.reader(csvinput)
        all = []
        rows = next(reader)
        rows.append('Test')
        all.append(rows)
        for rows in reader:
            subreddit = reddit.subreddit(rows[0])
            print(subreddit.subscribers)
            rows.append(rows[0])
            all.append(subreddit.subscribers)
        writer.writerows(all)
Expecting to see the original list from the 'Reddit' csv in the first column, and a new column for the subscriber count in the second column.
I think the problem could be here:
for rows in reader:
    subreddit = reddit.subreddit(rows[0])
    print(subreddit.subscribers)
    rows.append(rows[0])
    all.append(subreddit.subscribers)
Does it work better if you try this?
for row in reader:
    subreddit = reddit.subreddit(row[0])
    row.append(str(subreddit.subscribers))
    all.append(row)
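The same idea can be sketched without collecting everything in a list first, by writing each extended row as soon as it is read. To keep the example self-contained, the lookup is passed in as a function instead of calling PRAW directly. The names `add_column` and `value_for` are hypothetical.

```python
import csv

def add_column(in_path, out_path, column_name, value_for):
    """Copy a CSV, appending one new column computed from each row's first cell.

    Returns the number of data rows written.
    """
    count = 0
    with open(in_path, newline='') as src, \
         open(out_path, 'w', newline='') as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)
        if header is None:
            return count  # empty input: nothing to copy
        writer.writerow(header + [column_name])
        for row in reader:
            if not row:
                continue  # skip blank lines
            # Append the looked-up value to the row itself, not to the output list
            writer.writerow(row + [str(value_for(row[0]))])
            count += 1
    return count
```

With the `reddit` object from the question, a call might look like `add_column('Reddit.csv', 'Reddit2.csv', 'Subscribers', lambda name: reddit.subreddit(name).subscribers)`.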
