Basically, I'm trying to repeat the same formula, but I need to store variables every second. What I did was put all my variables in an Excel file and have a reader go through the list. When I try to use the new variable, I can only use one value at a time, not the whole list.
What I would like to do is print y1 = 1, y2 = 2, y3 = 3.
Below is an example:
CSV file:
column1, column2, column3
apple, 1, appleweight
orange, 2, orangeweight
banana, 3, bananaweight
import csv
import time

with open(r"C:\Users\Admin\Desktop\Untitled.csv", newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        x = row['column1']
        y = row['column2']
        z = row['column3']
        x = method.get_value(y)
        z = x.get_name()
        print(y)
        time.sleep(1)
The above code will print:
1
2
3
I would like to print
y1 = 1
y2 = 2
y3 = 3
You can add a counter to the iteration over the rows, and use it to print what you want.
import csv
import time

with open(r"C:\Users\Admin\Desktop\Untitled.csv", newline='') as csvfile:
    reader = csv.DictReader(csvfile)
    for i, row in enumerate(reader, 1):
        y = row['column2']
        print(f'y{i} = {y}')
        time.sleep(1)
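If the goal is also to keep the values around after the loop (rather than just printing them), numbered variables are usually better modeled as a dict. A minimal sketch of that idea, using io.StringIO to stand in for the CSV file from the question:

```python
import csv
import io

# Sketch: rather than creating numbered variables (y1, y2, ...), collect the
# values in a dict keyed by the counter so they remain usable after the loop.
# io.StringIO stands in for the Untitled.csv file from the question.
data = io.StringIO(
    "column1,column2,column3\n"
    "apple,1,appleweight\n"
    "orange,2,orangeweight\n"
    "banana,3,bananaweight\n"
)
values = {}
for i, row in enumerate(csv.DictReader(data), 1):
    values[f'y{i}'] = row['column2']

for name, value in values.items():
    print(f'{name} = {value}')  # y1 = 1, y2 = 2, y3 = 3
```

The dict keeps every row's value addressable by name, which is what numbered variables are usually trying to achieve.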
I have 2 files: fileA has 1 row and fileB has 2 rows.
fileA (1 row):
*****s**e**********************************************q*
fileB (2 rows):
Row 1 is the subject
Row 2 is the query
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
I need to produce an output file where, if the fileA string contains an s or * at some index, the subject character at the corresponding position is written to the output file; if there is a q or e, the query character is written instead.
Output:
AAAAAAAABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABA
My code:
ff = open("filea.txt")
gg = open("fileb.txt")
file_as_list = ff.readline()
file_as_last = gg.readlines()
query = file_as_last[0]
subject = file_as_last[1]
for i in file_as_list:
    z = -1
    while z <= len(file_as_list):
        if i == "*":
            f = open('output.txt', 'a+', encoding='utf-8')
            f.write(subject[z])
            z += 1
        elif i == "s":
            f = open('output.txt', 'a+', encoding='utf-8')
            f.write(subject[z])
            z += 1
        elif i == "e":
            f = open('output.txt', 'a+', encoding='utf-8')
            f.write(query[z])
            z += 1
        elif i == "q":
            f = open('output.txt', 'a+', encoding='utf-8')
            f.write(query[z])
            z += 1
    break
The thing works more or less, but not properly: the loop only ever takes the first branch and produces an output that is just a copy of the subject.
with open is used, so all the files will be closed automatically
convert each string into a list, then use .strip() to remove \n and \r
load the lists into a pandas.DataFrame
use pandas.DataFrame.apply with axis=1 for row-wise operations
use np.where to return the correct value
write out to a list and convert it into a str
write to the output.txt file
Code:
import pandas as pd
import numpy as np

with open('fileA.txt', 'r') as filA:
    with open('fileB.txt', 'r') as filB:
        with open('output.txt', 'w', newline='\n') as output:
            fil_a = filA.readline()
            fil_b = filB.readlines()
            sub = [x for x in fil_b[0].strip()]
            que = [x for x in fil_b[1].strip()]
            line = [x for x in fil_a.strip()]
            df = pd.DataFrame({'A': line, 'sub': sub, 'que': que})
            df['out'] = df.apply(lambda x: str(np.where(x[0] in ['*', 's'], x[1], x[2])), axis=1)
            out = df.out.to_list()
            out = ''.join(x for x in out)
            output.write(out)
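The pandas route works, but since the question guarantees that the fileA string, the subject, and the query all have the same length, the same result can also be reached with a plain zip over the three strings. A minimal pure-Python sketch (the merge helper and the short strings here are illustrative, not the real files):

```python
# Sketch: walk the three equal-length strings in lockstep and pick the
# subject or query character depending on the marker character in fileA.
def merge(markers, subject, query):
    out = []
    for m, s, q in zip(markers, subject, query):
        # '*' or 's' -> take the subject character; 'e' or 'q' -> the query one
        out.append(s if m in '*s' else q)
    return ''.join(out)

markers = '*****s**e*q*'
subject = 'AAAAAAAAAAAA'
query   = 'BBBBBBBBBBBB'
print(merge(markers, subject, query))  # AAAAAAAABABA
```

For the real files you would read the marker line from fileA, the subject and query lines from fileB, strip the trailing newlines, and write the merged string to output.txt once, after the loop.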
I have files with hundreds of thousands of rows of data, but they have no columns.
I am trying to go through every file, read it row by row, and store the rows in a list; after that I want to assign the values to columns. But here I am confused about what to do, because there are around 60 values in every row, plus some extra columns with assigned values that should be added to every row.
Code so far:
import re
import glob

filenames = glob.glob("/home/ashfaque/Desktop/filetocsvsample/inputfiles/*.txt")
columns = []
with open("/home/ashfaque/Downloads/coulmn names.txt", encoding="ISO-8859-1") as f:
    file_data = f.read()
lines = file_data.splitlines()
for l in lines:
    columns.append(l.rstrip())
total = {}
for name in filenames:
    modified_data = []
    with open(name, encoding="ISO-8859-1") as f:
        file_data = f.read()
    lines = file_data.splitlines()
    for l in lines:
        if len(l) >= 1:
            modified_data.append(re.split(': |,', l))
    rows = []
    i = len(modified_data)
    x = 0
    while i > 60:
        r = lines[x:x+59]
        x = x + 60
        i = i - 60
        rows.append(r)
    z = len(modified_data)
    while z >= 60:
        z = z - 60
    if z > 1:
        last_columns = modified_data[-z:]
        x = []
        for l in last_columns:
            if len(l) > 1:
                del l[0]
                x.append(l)
            elif len(l) == 1:
                x.append(l)
        for row in rows:
            for vl in x:
                row.append(vl)
    for r in rows:
        for i in range(0, len(r)):
            if len(r) >= 60:
                total.setdefault(columns[i], []).append(r[i])
In another script I have separated both the rows with 60 values and the last 5 to 15 columns that should be added to each row, but again I am confused about how to bind all the data together.
The data should look like this after binding:
outputdata.xlsx
Data input file:
inputdata.txt
What am I missing here? Any tool?
I believe that your issue can be resolved by taking the input file and turning it into a CSV file which you can then import into whatever program you like.
I wrote a small generator that reads a file one line at a time and returns a row after a certain number of lines, in this case 60. In that generator, you can make whatever modifications to the data you need.
Then with each generated row, I write it directly to the csv. This should keep the memory requirements for this process pretty low.
I didn't understand what you were doing with the regex split, but it would be simple enough to add it to the generator.
import csv

OUTPUT_FILE = "/home/ashfaque/Desktop/File handling/outputfile.csv"
INPUT_FILE = "/home/ashfaque/Desktop/File handling/inputfile.txt"

# This is a generator that will pull only num number of items into
# memory at a time, before it yields the row.
def get_rows(path, num):
    row = []
    with open(path, "r", encoding="ISO-8859-1") as f:
        for n, l in enumerate(f):
            # apply whatever transformations that you need to here.
            row.append(l.rstrip())
            if (n + 1) % num == 0:
                # if rows need padding then do it here.
                yield row
                row = []

with open(OUTPUT_FILE, "w") as output:
    csv_writer = csv.writer(output)
    for r in get_rows(INPUT_FILE, 60):
        csv_writer.writerow(r)
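One caveat with a generator structured like the one above: if the file's line count is not an exact multiple of num, the trailing lines never reach a yield and are silently dropped. A hedged variant (the name get_rows_with_remainder is mine) that also emits the final partial row:

```python
import os
import tempfile

# Variant of the generator idea above: same chunking, but any leftover
# lines at the end of the file are yielded as a final, shorter row.
def get_rows_with_remainder(path, num):
    row = []
    with open(path, "r", encoding="ISO-8859-1") as f:
        for l in f:
            row.append(l.rstrip())
            if len(row) == num:
                yield row
                row = []
    if row:  # leftover lines -> emit them as a final partial row
        yield row

# Tiny demo on a 7-line temp file, chunked in threes.
with tempfile.NamedTemporaryFile("w", delete=False, suffix=".txt") as tf:
    tf.write("\n".join(str(i) for i in range(7)))
    demo_path = tf.name
chunks = list(get_rows_with_remainder(demo_path, 3))
print(chunks)  # [['0', '1', '2'], ['3', '4', '5'], ['6']]
os.unlink(demo_path)
```

Whether to keep, pad, or discard the partial row depends on whether the extra trailing columns in the input files are meaningful.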
I have code that is basically doing this:
row1 = []
count = 0
writer = csv.writer(myFile)
row = []
for j in range(0, 2):
    for i in range(0, 4):
        row1.append(i + count)
    count = count + 1
    print(row1)
    writer.writerows(row1)
    row1[:] = []
I'm creating some lists and I want to map each value to a column, like this.
The error "iterable expected" shows up. How can I do that?
#roganjosh is right: what you need in order to write one row at a time is writerow:
import csv

myFile = open("aaa.csv", "w", newline="")
row1 = []
count = 0
writer = csv.writer(myFile)
row = []
for j in range(0, 2):
    for i in range(0, 4):
        row1.append(i + count)
    count = count + 1
    print(row1)
    writer.writerow(row1)
    row1[:] = []
myFile.close()  # Don't forget to close your file
You probably need to call the method .writerow() instead of the plural .writerows(), because you write a single line to the file on each call.
The plural method writes multiple lines to the file at once.
Or you could also restructure your code like this to write all the lines at the end:
import csv

row_list = []
for j in range(2):
    row = [j+i for i in range(4)]
    row_list.append(row)
# Equivalent as a single comprehension:
# row_list = [
#     [j+i for i in range(4)]
#     for j in range(2)]

with open('filename.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerows(row_list)
It's much simpler and easier to manipulate tabular data in pandas -- is there a reason you don't want to use pandas?
import pandas as pd

df = pd.DataFrame()
for i in range(4):
    df[i] = range(i, i+4)
# Any other data wrangling
df.to_csv("file.csv")
The script needs to read input from a text/csv file, but as soon as I try to implement that functionality, everything breaks.
Here is my code:
from collections import defaultdict
#from csv import reader

data = """Lions 3, Snakes 3
Tarantulas 1, FC Awesome 0
Lions 1, FC Awesome 1
Tarantulas 3, Snakes 1
Lions 4, Grouches 0"""

# with open('sample_input.csv') as data:
#     csv = reader(data)
#     list_csv = [line.rstrip('\n') for line in data]

data_list = data.splitlines()

def splitter(row):
    left_team, right_team = row.split(',')
    return {
        'left': left_team[:-2].strip(),
        'left_score': int(left_team[-2:].strip()),
        'right': right_team[:-2].strip(),
        'right_score': int(right_team[-2:].strip())
    }

data_dicts = [splitter(row) for row in data_list]

team_scores = defaultdict(int)
for game in data_dicts:
    if game['left_score'] == game['right_score']:
        team_scores[game['left']] += 1
        team_scores[game['right']] += 1
    elif game['left_score'] > game['right_score']:
        team_scores[game['left']] += 3
    else:
        team_scores[game['right']] += 3
print(team_scores)

teams_sorted = sorted(team_scores.items(), key=lambda team: team[1], reverse=True)
# for line in teams_sorted:
#     print(line)
Also, the expected output I need is:
1. Tarantulas, 6 pts
2. Lions, 5 pts
3. FC Awesome, 1 pt
3. Snakes, 1 pt
4. Grouches, 0 pts
I just can't seem to figure out how to get to this step. I checked most parts of my code with print statements, and the dictionary seems to be working correctly, but it is not printing the last team and its score (Grouches, 0 pts).
I am currently getting this output:
('Tarantulas', 6)
('Lions', 5)
('Snakes', 1)
('FC Awesome', 1)
Any help would be greatly appreciated!
Well done for getting this far. You have managed to implement the logic, but have got stuck on a specific behaviour of defaultdict. There are 2 main points to note:
If a key is never touched, defaultdict won't add it to the dictionary. You can initialize a key simply by adding 0 to it.
For the specific formatting you require, you can use enumerate in a loop after sorting.
Putting these together, amend your loop as below:
for game in data_dicts:
    if game['left_score'] == game['right_score']:
        team_scores[game['left']] += 1
        team_scores[game['right']] += 1
    elif game['left_score'] > game['right_score']:
        team_scores[game['left']] += 3
        team_scores[game['right']] += 0
    else:
        team_scores[game['left']] += 0
        team_scores[game['right']] += 3
Then use enumerate in a loop. You can use operator.itemgetter and f-strings (the latter in Python 3.6+) to make your logic cleaner:
from operator import itemgetter

teams_sorted = sorted(team_scores.items(), key=itemgetter(1), reverse=True)
for idx, (team, score) in enumerate(teams_sorted, 1):
    print(f'{idx}. {team} {score} pts')
1. Tarantulas 6 pts
2. Lions 5 pts
3. Snakes 1 pts
4. FC Awesome 1 pts
5. Grouches 0 pts
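Note that the question's expected output also shares a rank between tied teams (both 1-point teams are "3.") and uses the singular "pt" for one point. A small tweak handles both; the format_table helper here is illustrative, not part of the original code:

```python
# Sketch of the tie-aware numbering from the question's expected output:
# teams with equal scores share a rank, and 'pt' is singular for 1 point.
def format_table(team_scores):
    lines = []
    rank, prev_score = 0, None
    # sort by score descending, then name ascending to break ties
    for team, score in sorted(team_scores.items(), key=lambda kv: (-kv[1], kv[0])):
        if score != prev_score:  # only advance the rank when the score changes
            rank += 1
            prev_score = score
        unit = 'pt' if score == 1 else 'pts'
        lines.append(f'{rank}. {team}, {score} {unit}')
    return lines

scores = {'Tarantulas': 6, 'Lions': 5, 'FC Awesome': 1, 'Snakes': 1, 'Grouches': 0}
for line in format_table(scores):
    print(line)
# 1. Tarantulas, 6 pts
# 2. Lions, 5 pts
# 3. FC Awesome, 1 pt
# 3. Snakes, 1 pt
# 4. Grouches, 0 pts
```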
Have you tried the csv module from the Python standard library? Extracted from the docs (https://docs.python.org/3/library/csv.html):
import csv

with open('data.csv', newline='') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=' ', quotechar='|')
    for row in spamreader:
        print(', '.join(row))
As for the code breaking when you add CSV reading: csv.reader does the split(',') for you, so left_team = row[0] and right_team = row[1].
Your code changes to something like:
import csv

def splitter(row):
    left_team, right_team = row
    return {
        'left': left_team[:-2].strip(),
        'left_score': int(left_team[-2:].strip()),
        'right': right_team[:-2].strip(),
        'right_score': int(right_team[-2:].strip())
    }

with open('data.csv') as data_obj:
    reader = csv.reader(data_obj)
    data_dicts = [splitter(row) for row in reader]
You can stick with plain-text reading if you want to split(',') manually:
with open('data.csv') as data_obj:
    data_list = [line.rstrip('\n') for line in data_obj.readlines()]
So this block of code is supposed to open the csv file and get the values from columns 1-3 (not 0). Once it has the values for the 3 columns of each row, it is supposed to add them up and divide by 3. I thought this code would work, but the addition of the 3 columns in each row doesn't seem to be working. If anyone could tell me why and how I can fix it, that would be great, thank you. I'm pretty certain the problem lies at for index, summedValue in enumerate(sums):, specifically the summedValue value.
if order == ("average score"):
    askclass = str(input("what class?"))
    if askclass == ('1'):
        with open("Class1.csv") as f:
            columns = f.readline().strip().split(" ")
            sums = [1] * len(columns)
            for line in f:
                # Skip empty lines
                if not line.strip():
                    continue
                values = line.split(" ")
                for i in range(1, len(values)):
                    sums[i] += int(values[i])
                for index, summedValues in enumerate(sums):
                    print(columns[index], 1.0 * (summedValues) / 3)
from statistics import mean
import csv

with open("Class1.csv") as f:
    # create reader object
    r = csv.reader(f)
    # skip headers
    headers = next(r)
    # extract name from row and use statistics.mean to average row[1:],
    # mapping scores to ints
    avgs = ((row[0], mean(map(int, row[1:]))) for row in r)
    # unpack name and average and print
    for name, avg in avgs:
        print(name, avg)
Unless you have written empty lines to your csv file, there won't be any; not sure how the header fits into it, but you can use it if necessary.
You can also unpack with the * syntax in Python 3, which I think is a bit nicer:
avgs = ((name, mean(map(int, row))) for name, *row in r)
for name, avg in avgs:
    print(name, avg)
To order, just sort by the average, using reverse=True to sort from highest to lowest:
from statistics import mean
import csv
from operator import itemgetter

with open("Class1.csv") as f:
    r = csv.reader(f)
    avgs = sorted(((name, mean(map(int, row))) for name, *row in r),
                  key=itemgetter(1), reverse=True)
    for name, avg in avgs:
        print(name, avg)
Passing key=itemgetter(1) means we sort by the second element of each tuple, which is the average.
using
1, 2, 3
4, 2, 3
4, 5, 3
1, 6, 3
1, 6, 6
6, 2, 3
as Class1.csv
and
askclass = str(input("what class?"))
if askclass == ('1'):
    with open("Class1.csv") as f:
        columns = f.readline().strip().split(",")
        sums = [1] * len(columns)
        for line in f:
            # Skip empty lines
            if not line.strip():
                continue
            values = line.split(",")
            for i in range(1, len(values)):
                sums[i] += int(values[i])
        for index, summedValues in enumerate(sums):
            print(columns[index], 1.0 * (summedValues) / 3)
I obtain the expected result:
what class?1
('1', 0.3333333333333333)
(' 2', 7.333333333333333)
(' 3', 6.333333333333333)
[update] Observations:
sums, defined as sums = [1] * len(columns), has one entry per column, but you ignore the first column in your operations, so sums[0] will always be 1; it does not seem necessary.
For float division it is sufficient to write summedValues / 3.0 instead of 1.0 * (summedValues) / 3.
Maybe this is what you want (the final loop moved outside the loop over lines):
for line in f:
    # Skip empty lines
    if not line.strip():
        continue
    values = line.split(" ")
    for i in range(1, len(values)):
        sums[i] += int(values[i])

for index, summedValues in enumerate(sums):
    print(columns[index], 1.0 * (summedValues) / 3)