I am trying to read a file that has a list of tasks. The usernames appear at index position 0 on every line of the file. Opening the file, reading the lines and extracting that data I can do; I can get to the point of indexing the data and printing it. What I can't do is write code to count how many times a user is present in the file. For example, if a user appears 7 times they must have 7 tasks to complete.
The code I have so far is:
user_global = []

def disp_stats():
    with open("tasks.txt", "r", encoding='cp1252') as tu:
        for lines in tu:
            data_list = lines.strip("\n").split(", ")
            data_list = data_list[0]  # this is the data I need to count for how many tasks a user has
            user_global.append(data_list)
            print(user_global)
My output when I print this is not what I expected. What I wanted was to use something like Counter to count how many times a name appears in a global list. That also didn't work out too well.
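For reference, a minimal sketch of that Counter approach (assuming the same tasks.txt layout and ", " separator as the snippet above; the example usernames in the comment are illustrative):

from collections import Counter

def disp_stats():
    users = []
    with open("tasks.txt", "r", encoding='cp1252') as tu:
        for line in tu:
            # the first comma-separated field on each line is the username
            users.append(line.strip("\n").split(", ")[0])
    task_counts = Counter(users)  # e.g. Counter({'admin': 7, 'guest': 3})
    for user, count in task_counts.items():
        print(user, "has", count, "tasks")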
So I have code below that uses a text file to find the average trade-to-GDP ratio for each year from 1990-2019. So far I have also created code to get the specific range of years from the text file, and I want to be able to call upon them, as my overall mission is to create a time plot using both elements. Here is the first code creating the list:
year_list = range(1990,2020)
filenames = ["{0}".format(year) for year in year_list]
print(filenames)
The second code here reads the text file I have and uses the statistics module to build the list of averages from the file. However, I am having trouble grabbing the specific information from the file and using the year_list variable created above to put it all together, as when this is run it gives me the error "mean requires at least one data point".
import statistics

with open('QUESTION2DATA.txt') as f:
    items = []
    for line in f.readlines():
        line = line.strip()
        if line.isdigit():
            items.append(float(line))
        else:
            print('{0}: {1}'.format(line, statistics.mean(items)))
            items = []
Here is a tabular view of the data to show what I am specifically looking for.
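One likely cause of the "mean requires at least one data point" error: str.isdigit() returns False for values that contain a decimal point or a minus sign, so items can stay empty when the data is not whole numbers. A sketch that parses with try/except instead, keeping the same file name and structure as above:

import statistics

with open('QUESTION2DATA.txt') as f:
    items = []
    for line in f:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            # accepts decimals such as "12.34", which str.isdigit() rejects
            items.append(float(line))
        except ValueError:
            # non-numeric line: treat it as a label and report the mean, if any
            if items:
                print('{0}: {1}'.format(line, statistics.mean(items)))
            items = []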
I have a file with users' names, one per line, and I need to compare each name in the file to all values in a CSV file and make a note each time the user name appears in the CSV file. I need to make the search as efficient as possible, as the CSV file is 40K lines long.
My example persons.txt file:
Smith, Robert
Samson, David
Martin, Patricia
Simpson, Marge
My example locations.csv file:
GreaterLocation,LesserLocation,GroupName,DisplayName,InBook
NorthernHemisphere,UnitedStates,Pilots,"Wilbur, Andy, super pilot",Yes
WesternHemisphere,China,Pilots,"Kirby, Mabry, loves pizza",Yes
WesternHemisphere,Japan,Drivers,"Samson, David, big kahuna",Yes
NortherHemisphere,Canada,Drivers,"Randos, Jorge",Yes
SouthernHemispher,Australia,Mechanics,"Freeman, Gordon",Yes
NortherHemisphere,Mexico,Pilots,"Simpson, Marge",Yes
SouthernHemispher,New Zealand,Mechanics,"Samson, David",Yes
My Code:
import csv

def parse_files():
    with open('data_file/persons.txt', 'r') as user_list:
        lines = user_list.readlines()
        for user_row in lines:
            new_user = user_row.strip()
            per = []
            with open('data_file/locations.csv', newline='') as target_csv:
                DictReader_loc = csv.DictReader(target_csv)
                for loc_row in DictReader_loc:
                    if new_user.lower() in loc_row['DisplayName'].lower():
                        per.append(DictReader_loc.line_num)
                        print(DictReader_loc.line_num, loc_row['DisplayName'])
            if len(per) > 0:
                print("\n" + new_user, per)
    print("Parse Complete")

def main():
    parse_files()

main()
My code currently works. Based on the sample data in the example files, the code matches the 2 instances of "Samson, David" and 1 instance of "Simpson, Marge" in the locations.csv file. I'm hoping that someone can give me guidance on how I might transform either the persons.txt file or the locations.csv file (40K+ lines) so that the process is as efficient as it can be. I think it currently takes 10-15 minutes. I know looping isn't the most efficient, but I do need to check each name and see where it appears in the csv file.
I think @Tomalak's solution with SQLite is very useful, but if you want to keep it closer to your original code, see the version below.
Effectively, it reduces the amount of file opening/closing/reading that is going on, and hopefully will speed things up.
Since your sample is very small, I could not do any real measurements.
Going forward, you can consider using pandas for these kinds of tasks - it can be very convenient for working with CSVs and more optimized than the csv module (see the sketch after the code below).
import csv

def parse_files():
    with open('persons.txt', 'r') as user_list:
        # sets are faster to match against than lists
        # do the lower() here to avoid repetition
        user_set = set(u.strip().lower() for u in user_list)

    # read the CSV once into memory: a csv reader is exhausted after one
    # pass, so looping over it per user would only match the first user
    with open("locations.csv", "r", newline='') as target_csv:
        DictReader_loc = csv.DictReader(target_csv)
        rows = [(DictReader_loc.line_num, row['DisplayName']) for row in DictReader_loc]

    for user in user_set:
        per = []
        for line_num, display_name in rows:
            if user in display_name.lower():
                per.append(line_num)
                print(line_num, display_name)
        if len(per) > 0:
            print("\n" + user, per)
    print("Parse Complete")

def main():
    parse_files()

main()
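A rough pandas sketch of the same matching (a sketch on my part, not measured against your data; pandas is a third-party package):

import pandas as pd

with open('persons.txt') as user_list:
    users = [u.strip().lower() for u in user_list if u.strip()]

df = pd.read_csv('locations.csv')
names = df['DisplayName'].str.lower()

for user in users:
    # boolean mask of the rows whose DisplayName contains this user
    mask = names.str.contains(user, regex=False)
    if mask.any():
        # +2 mimics csv line numbers (one header row, 1-based counting)
        print(user, [i + 2 for i in df.index[mask]])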
I am very new to programming and trying to learn by doing: creating a text adventure game and reading Python documentation/blogs.

My issue is that I'm attempting to save/load data in a text game to create some elements which carry over from game to game and are passed as arguments. Specifically, with this example my goal is to recall, update and save an incrementing iteration count each time the game is played past the intro. My intention here is to import the saved march_iteration number, display it to the user as a default name suggestion, then increment the iteration number and save the updated march_iteration value.

From my attempts at debugging this I seem to be updating the value and saving the updated value of 2 to the game.sav file correctly, so I believe my issue is that either I'm failing to load the data properly or I'm overwriting the saved value with the static one somehow. I've read as much documentation as I can find, but from the articles I've read on saving and loading to JSON I cannot identify where my code is wrong.
Below is a small code snippet I wrote just to try and get the save/load working. Any insight would be greatly appreciated.
import json

def _save(dummy):
    f = open("game.sav", 'w+')
    json.dump(world_states, f)
    f.close

def _continue(dummy):
    f = open("game.sav", 'r+')
    world_states = json.load(f)
    f.close

world_states = {
    "march_iteration": 1
}

def _resume():
    _continue("")

_resume()

print("world_states['march_iteration']", world_states['march_iteration'])
current_iteration = world_states["march_iteration"]

def name_the_march(current_iteration=world_states["march_iteration"]):
    march_name = input("\nWhat is the name of your march? We suggest TrinMar#{}. >".format(current_iteration))
    if len(march_name) == 0:
        print("\nThe undifferentiated units shift nervously, unnerved and confused, perhaps even angry.")
        print("\nPlease give us a proper name executor. The march must not be nameless, that would be chaos.")
        name_the_march()
    else:
        print("\nThank you Executor. The {} march begins its long journey.".format(march_name))
        world_states['march_iteration'] = (world_states['march_iteration'] + 1)
        print("world_states['march_iteration']", world_states['march_iteration'])
        # line above used only for debugging purposes
        _save("")

name_the_march()
I seem to have found a solution which works for my purposes, allowing me to load, update and resave. It isn't the most efficient, but it works; the prints are just there to show the number being properly loaded and updated before being resaved.
Pre-requisite: This example assumes you've already created a file for this to open.
import json

# Initial data
iteration = 1

# Restore from previously saved file
with open('filelocation/filename.json') as f:
    iteration = json.load(f)
    print(iteration)
    iteration = iteration + 1
    print(iteration)

# Save updated data
f = open("filename.json", 'w')
json.dump(iteration, f)
f.close()
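For what it's worth, the load problem in the original snippet looks like a scoping one: world_states = json.load(f) inside _continue binds a new local name instead of updating the module-level dict, and f.close without parentheses never actually closes the file. A minimal sketch of one way to fix both:

import json

world_states = {"march_iteration": 1}

def _save():
    # the with-block closes the file even if dump raises
    with open("game.sav", "w") as f:
        json.dump(world_states, f)

def _continue():
    global world_states  # rebind the module-level name, not a local one
    with open("game.sav", "r") as f:
        world_states = json.load(f)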
QUESTION:
I am finding issues with the syntax of the code, in particular the for loop which I use to loop through the external file.

My program is a dice game which is supposed to register users and then allow them to log in to the game afterwards. In the end it must access the external file, which has previously been used to store the winners' names (keep in mind the authorised names have a separate file), loop through it, and output the top 5 winners' names and scores to the shell.

I used a for loop to loop through the file and append it to an array called 'Top 5 Winners'; however, I seem to struggle with the syntax of the code as I am quite new to Python.
The code that accesses the file.
with open("Top 5 Winners.txt","r") as db:
top5Winners=[]
for i in db(0,len([db])):
top5Winners.append(line)
top5Winners.sort()
top5Winners.reverse()
for i in range(5):
print(top5Winners[i])
Error Code:
for i in db(0,len([db])):
The len() part of the code is the issue
NOTE:
I also wouldn't mind any tips as to how I can make this bit of code more efficient so I can apply it in my later projects.
Your indentation isn't as it should be. You indeed opened a file and made it readable, but after that you didn't do anything with it. See the following example:
with open(file, 'r') as db:
    # code with file (db)
# rest of the code
So you can combine it with your code like this:
top5winners = []  # Make a list variable
with open("Top 5 Winners.txt", "r") as db:  # Open your file
    for i in db:  # Loop through contents of file
        top5winners.append(i)  # Append iterable to list
top5winners.sort(reverse=True)  # Sort list and use reverse option
for i in range(0, 5):  # Loop through range
    print(top5winners[i])  # Print items from list
Please note that StackOverflow is intended for help with specific cases, not a site to ask others to write a piece of code.
Sincerely, Chris Fowl.
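As an efficiency tip for the top-5 step: heapq.nlargest picks the largest items without sorting the whole list. A sketch, assuming each line holds a name and an integer score separated by a comma (that layout is an assumption, not something stated in the question):

import heapq

with open("Top 5 Winners.txt", "r") as db:
    winners = [line.strip() for line in db if line.strip()]

# take the five largest by score; the "name,score" layout is assumed
top5 = heapq.nlargest(5, winners, key=lambda w: int(w.rsplit(",", 1)[1]))
for winner in top5:
    print(winner)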
This is my function for building a record of users' performed actions in a CSV file with Python. It gets the username from a global and adds the increment given in the amount parameter at the specific location in the CSV, matching the user's row and the current date.

In brief, the function reads the CSV into a list, does any modification on the data, and then rewrites the whole list back into the CSV file.

The first item of every row is the username, and the header holds the dates.
Accs\Dates,12/25/2016,12/26/2016,12/27/2016
user1,217,338,653
user2,261,0,34
user3,0,140,455
However, I'm not sure why the header sometimes gets pushed down to the second row, and the data gets wiped entirely when it crashes.

Also, I need to point out that there may be multiple scripts running this function and writing to the same file; I'm not sure if that is causing the issue.

I'm thinking maybe I can write the stats separately and uniquely for each user and combine them later, eliminating the possible clash in writing. Although it would be great if I could just improve on what I have here and read/write everything in one file.
Any fail-safe way to do what I'm trying to do here?
# requires: import csv, datetime, re -- and self.stats (file path) and
# self.username set on the instance
# Search current user in first rows and update the count in the column (today's date)
# 'amount' will be added to the respective position
def dailyStats(self, amount, code=None):
    def initStats():
        # prepping table
        with open(self.stats, 'r') as f:
            reader = csv.reader(f)
            for row in reader:
                if row:
                    self.statsTable.append(row)
                    self.statsNames.append(row[0])

    def getIndex(list, match):
        # get the index of the matched date or user
        for i, j in enumerate(list):
            if j == match:
                return i

    self.statsTable = []
    self.statsNames = []
    self.statsDates = None
    initStats()

    today = datetime.datetime.now().strftime('%m/%d/%Y')
    user_index = None
    today_index = None

    # append header if the csv is empty
    if len(self.statsTable) == 0:
        self.statsTable.append([r'Accs\Dates'])
        # rebuild updated table
        initStats()

    # add new user/date if not found in first row/column
    self.statsDates = self.statsTable[0]
    if getIndex(self.statsNames, self.username) is None:
        self.statsTable.append([self.username])
    if getIndex(self.statsDates, today) is None:
        self.statsDates.append(today)

    # rebuild statsNames after table appended
    self.statsNames = []
    for row in self.statsTable:
        self.statsNames.append(row[0])

    # getting the index of user (row) and date (column)
    user_index = getIndex(self.statsNames, self.username)
    today_index = getIndex(self.statsDates, today)

    # in the user's matched row, if there are dates before today that have
    # no data, append 0 (e.g. user1,0,0,0,) until today's column is reached
    if len(self.statsTable[user_index]) < today_index + 1:
        for i in range(0, today_index + 1 - len(self.statsTable[user_index])):
            self.statsTable[user_index].append(0)

    # insert pv or tb code if found
    if code is None:
        self.statsTable[user_index][today_index] = amount + int(re.match(r'\b\d+?\b', str(self.statsTable[user_index][today_index])).group(0))
    else:
        self.statsTable[user_index][today_index] = str(re.match(r'\b\d+?\b', str(self.statsTable[user_index][today_index])).group(0)) + ' - ' + code

    # writing final table
    with open(self.stats, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerows(self.statsTable)

    # return the summation of the user's total count
    total_follow = 0
    for i in range(1, len(self.statsTable[user_index])):
        total_follow += int(re.match(r'\b\d+?\b', str(self.statsTable[user_index][i])).group(0))
    return total_follow
As David Z says, concurrency is more likely the cause of your problem.
I will add that the CSV format is not suitable for database-style storing, indexing, or sorting, because it is plain text and sequential.

You could handle it by using an RDBMS for storing and updating your data and periodically processing your stats. Your CSV format then becomes just an import/export format.

Python offers a SQLite binding in its standard library: if you build a connector that imports/updates CSV content into a SQLite schema and then dumps results back out as CSV, you will be able to handle concurrency and keep your native format without worrying about installing a database server or new Python packages.
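A minimal sketch of that shape (the table layout, file name and function names here are illustrative, not from the question; the UPSERT syntax needs SQLite 3.24+):

import csv
import sqlite3

conn = sqlite3.connect('stats.db')  # one file; SQLite serializes writers

conn.execute("""
    CREATE TABLE IF NOT EXISTS stats (
        user  TEXT,
        day   TEXT,
        total INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (user, day)
    )
""")

def add_amount(user, day, amount):
    # UPSERT keeps exactly one row per user/day combination
    with conn:  # the with-block wraps the statement in a transaction
        conn.execute(
            "INSERT INTO stats (user, day, total) VALUES (?, ?, ?) "
            "ON CONFLICT(user, day) DO UPDATE SET total = total + excluded.total",
            (user, day, amount),
        )

def export_csv(path):
    # dump the table back out as CSV for the existing tooling
    with open(path, 'w', newline='') as f:
        csv.writer(f).writerows(conn.execute("SELECT user, day, total FROM stats"))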
Also, I need to point out that there may be multiple scripts running this function and writing to the same file; I'm not sure if that is causing the issue.
More likely than not that is exactly your issue. When two things are trying to write to the same file at the same time, the outputs from the two sources can easily get mixed up together, resulting in a file full of gibberish.
An easy way to fix this is just what you mentioned in the question, have each different process (or thread) write to its own file and then have separate code to combine all those files in the end. That's what I would probably do.
If you don't want to do that, what you can do is have different processes/threads send their information to an "aggregator process", which puts everything together and writes it to the file - the key is that only the aggregator ever writes to the file. Of course, doing that requires you to build in some method of interprocess communication (IPC), and that in turn can be tricky, depending on how you do it. Actually, one of the best ways to implement IPC for simple programs is by using temporary files, which is just the same thing as in the previous paragraph.
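A sketch of the separate-files-then-combine approach (the file naming scheme and the combine step are illustrative assumptions):

import csv
import glob
import os
from collections import Counter

def write_own_stats(user, amount):
    # each process appends only to a file named after its own pid,
    # so no two writers ever touch the same file
    with open('stats_{}.csv'.format(os.getpid()), 'a', newline='') as f:
        csv.writer(f).writerow([user, amount])

def combine(pattern='stats_*.csv'):
    # a single combiner pass merges all per-process files afterwards
    totals = Counter()
    for path in glob.glob(pattern):
        with open(path, newline='') as f:
            for user, amount in csv.reader(f):
                totals[user] += int(amount)
    return totals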