PRAW Limit number of comments - python

I am making a program which fetches comments from a specific submission on Reddit. I need to limit a number of comments which I get in order to optimize program. Here is the only solution which PRAW documentation provides; however, obviously it does not make program faster, just allows to operate later with smaller number of comments.
def get_comments(reddit: reddit, submission_id: str, limit:int = 10) -> list:
submission = reddit.submission(id=submission_id)
#get rid of MoreComments class in submission
submission.comments.replace_more(limit=None)
submission.comment_sort = "top"
comments = submission.comments.list()
return comments[:limit]
I suppose, it should be a solution to limit the number of comments directly in request.

Try this using limit.
def get_comments(reddit: reddit, submission_id: str, limit:int = 10) -> list:
submission = reddit.submission(id=submission_id)
submission.comment_sort = "top"
comments = submission.comments.list(limit=limit)
return comments
alternative
def get_comments(reddit: reddit, submission_id: str, limit:int = 10) -> list:
submission = reddit.submission(id=submission_id)
submission.comments.replace_more(limit=None)
submission.comment_sort = "top"
comments = submission.comments.list(limit=limit)
return comments

Finally, I found a solution in Reddit post. Hope it will help somebody.
def get_comments(reddit: reddit, submission_id: str, limit:int = 10) -> list:
submission = reddit.submission(id=submission_id)
submission.comment_limit = limit
submission.comments.replace_more(limit=0)
submission.comment_sort = "top"
comments = submission.comments.list()
return comments
Submission has attribute comment_limit, so you can assign value to it. Make sure you put it before submission.comments.replace_more(limit=0); otherwise it would not work.

Related

How to use other def values?

I want to use other def values.
For example, I added a 'pt' in the 'clean_beds_process' definition and add 'Patients' in the 'run' definition.
I want to patient information when the 'clean_beds_process' function is called.
However, this makes this error 'AttributeError: type object 'Patients' has no attribute 'id''
I don't know why this happen.
Maybe I have something wrong understanding of mechanism of simpy.
Please let me know how can I use a patient information when 'clean_beds_process' function is called.
Thank you.
import simpy
import random
class Patients:
def __init__(self, p_id):
self.id = p_id
self.bed_name = ""
self.admission_decision = ""
def admin_decision(self):
admin_decision_prob = random.uniform(0, 1)
if admin_decision_prob <= 0.7:
self.admission_decision = "DIS"
else:
self.dmission_decision = "IU"
return self.admission_decision
class Model:
def __init__(self, run_number):
self.env = simpy.Environment()
self.pt_ed_q = simpy.Store(self.env )
self.pt_counter = 0
self.tg = simpy.Resource(self.env, capacity = 4)
self.physician = simpy.Resource(self.env, capacity = 4)
self.bed_clean = simpy.Store(self.env)
self.bed_dirty = simpy.Store(self.env)
self.IU_bed = simpy.Resource(self.env, capacity = 50)
def generate_beds(self):
for i in range(77):
yield self.env.timeout(0)
yield self.bed_clean.put(f'bed{i}')
def generate_pt_arrivals(self):
while True:
self.pt_counter += 1
pt = Patients(self.pt_counter)
yield self.env.timeout(5)
self.env.process(self.process(pt))
def clean_beds_process(self, cleaner_id, pt):
while True:
print(pt.id)
bed = yield self.bed_dirty.get()
yield self.env.timeout(50)
yield self.bed_clean.put(bed)
def process(self, pt):
with self.tg.request() as req:
yield req
yield self.env.timeout(10)
bed = yield self.bed_clean.get()
pt.bed_name = bed
pt.admin_decision()
if pt.admission_decision == "DIS":
with self.IU_bed.request() as req:
dirty_bed_name = pt.bed_name
yield self.bed_dirty.put(dirty_bed_name)
yield self.env.timeout(600)
else:
dirty_bed_name = pt.bed_name
yield self.bed_dirty.put(dirty_bed_name)
def run(self):
self.env.process(self.generate_pt_arrivals())
self.env.process(self.generate_beds())
for i in range(2):
self.env.process(self.clean_beds_process(i+1, Patients))
self.env.run(until = 650)
run_model = Model(0)
run_model.run()
So if a patient can use either a clean bed or a dirty bed then the patient needs to make two request (one for each type of bed) and use env.any_of to wait for the first request to fire. You also need to deal with the case that both events fire at the same time. Don't forget to cancel the request you do not use. If the request that fires is for a clean bed, things stay mostly the same. But if the request is for a dirty bed, then you need to add a step to clean the bed. For this I would make the cleaners Resources instead of processes. So the patient would request a cleaner, and do a timeout for the cleaning time, release the cleaner. To collect patient data I would create a log with the patient id, key event, time, and crunch these post sim to get the stats I need. To process the log I often create a dataframe that filters the log for the first, a second dataframe that filters for the second envent, join the two dataframes on patient id. Now both events for a patient is on one row so I can get the delta. once I have have the delta I can do a sum and count. For example, if my two events are when a patient arrives, and when a patient gets a bed, get the sum of deltas and divide by the count and I have the average time to bed.
If you remember, one of the first answers I gave you awhile ago had a example to get the first available bed from two different queues
I do not have a lot of time right know, but I hope this dissertation helps a bit

multiprocessing a function with parameters that are iterated through

I'm trying to improve the speed of my program and I decided to use multiprocessing!
the problem is I can't seem to find any way to use the pool function (i think this is what i need) to use my function
here is the code that i am dealing with:
def dataLoading(output):
name = ""
link = ""
upCheck = ""
isSuccess = ""
for i in os.listdir():
with open(i) as currentFile:
data = json.loads(currentFile.read())
try:
name = data["name"]
link = data["link"]
upCheck = data["upCheck"]
isSuccess = data["isSuccess"]
except:
print("error in loading data from config: improper naming or formating used")
output[name] = [link, upCheck, isSuccess]
#working
def userCheck(link, user, isSuccess):
link = link.replace("<USERNAME>", user)
isSuccess = isSuccess.replace("<USERNAME>", user)
html = requests.get(link, headers=headers)
page_source = html.text
count = page_source.count(isSuccess)
if count > 0:
return True
else:
return False
I have a parent function to run these two together but I don't think i need to show the whole thing, just the part that gets the data iteratively:
for i in configData:
data = configData[i]
link = data[0]
print(link)
upCheck = data[1] #just for future use
isSuccess = data[2]
if userCheck(link, username, isSuccess) == True:
good.append(i)
you can see how I enter all of the data in there, how would I be able to use multiprocessing to do this when I am iterating through the dictionary to collect multiple parameters?
I like to use mp.Pool().map. I think it is easiest and most straight forward and handles most multiprocessing cases. So how does map work? For starts, we have to keep in mind that mp creates workers, each worker receives a copy of the namespace (ya the whole thing), then each worker works on what they are assigned and returns. Hence, doing something like "updating a global variable" while they work, doesn't work; since they are each going to receive a copy of the global variable and none of the workers are going to be communicating. (If you want communicating workers you need to use mp.Queue's and such, it gets complicated). Anyway, here is using map:
from multiprocessing import Pool
t = 'abcd'
def func(s):
return t[int(s)]
results = Pool().map(func,range(4))
Each worker received a copy of t, func, and the portion of range(4) they were assigned. They are then automatically tracked and everything is cleaned up in the end by Pool.
Something like your dataLoading won't work very well, we need to modify it. I also cleaned the code a little.
def loadfromfile(file):
data = json.loads(open(file).read())
items = [data.get(k,"") for k in ['name','link','upCheck','isSuccess']]
return items[0],items[1:]
output = dict(Pool().map(loadfromfile,os.listdir()))

Python Telegram Bot how to wait for user answer to a question And Return It

Context:
I am using PyTelegramBotAPi or Python Telegram Bot
I have a code I am running when a user starts the conversation.
When the user starts the conversation I need to send him the first picture and a question if He saw something in the picture, the function needs to wait for the user input and return whether he saw it or not.
After that, I will need to keep sending the picture in a loop and wait for the answer and run a bisection algorithm on it.
What I have tried so far:
I tried to use reply markup that waits for a response or an inline keyboard with handlers but I am stuck because my code is running without waiting for the user input.
The code:
#bot.message_handler(func=lambda msg: msg in ['Yes', 'No'])
#bot.message_handler(commands=['start', 'help'])
def main(message):
"""
This is my main function
"""
chat_id = message.chat.id
try:
reply_answer = message.reply_to_message.text
except AttributeError:
reply_answer = '0'
# TODO : should wait for the answer asynchnonossly
def tester(n, reply_answer):
"""
Displays the current candidate to the user and asks them to
check if they see wildfire damages.
"""
print('call......')
bisector.index = n
bot.send_photo(
chat_id=chat_id,
photo=bisector.image.save_image(),
caption=f"Did you see it Yes or No {bisector.date}",
reply_markup=types.ForceReply(selective=True))
# I SHOUL WAIT FOR THE INPUT HERE AND RETURN THE USER INPUT
return eval(reply_answer)
culprit = bisect(bisector.count, lambda x: x, partial(tester, reply_answer=reply_answer) )
bisector.index = culprit
bot.send_message(chat_id, f"Found! First apparition = {bisector.date}")
bot.polling(none_stop=True)
The algorithm I am running on the user input is something like this :
def bisect(n, mapper, tester):
"""
Runs a bisection.
- `n` is the number of elements to be bisected
- `mapper` is a callable that will transform an integer from "0" to "n"
into a value that can be tested
- `tester` returns true if the value is within the "right" range
"""
if n < 1:
raise ValueError('Cannot bissect an empty array')
left = 0
right = n - 1
while left + 1 < right:
mid = int((left + right) / 2)
val = mapper(mid)
tester_values = tester(val) # Here is where I am using the ouput from Telegram bot
if tester_values:
right = mid
else:
left = mid
return mapper(right)
I hope I was clear explaining the problem, feel free to ask any clarification.
If you know something that can point me in the right direction in order to solve this problem, let me know.
I have tried a similar question but I am not getting answers.
You should save your user info in a database. Basic fields would be:
(id, first_name, last_name, username, menu)
What is menu?
Menu keeps user's current state. When a user sends a message to your bot, you check the database to find out about user's current sate.
So if the user doesn't exist, you add them to your users table with menu set to MainMenu or WelcomeMenu or in your case PictureMenu.
Now you're going to have a listener for update function, let's assume each of these a menu.
#bot.message_handler(commands=['start', 'help'])
so when the user sends start you're going to check user's menu field inside the function.
#bot.message_handler(commands=['start', 'help'])
def main(message):
user = fetch_user_from_db(chat_id)
if user.menu == "PictureMenu":
if message.photo is Not None:
photo = message.photo[0].file_id
photo_file = download_photo_from_telegram(photo)
do_other_things()
user.menu = "Picture2Menu";
user.save();
else:
send_message("Please send a photo")
if user.menu == "Picture2Menu":
if message.photo is Not None:
photo = message.photo[0].file_id
photo_file = download_photo_from_telegram(photo)
do_other_things()
user.menu = "Picture3Menu";
user.save();
else:
send_message("Please send a photo")
...
I hope you got it.
I have found the answer:
the trick was to use next_step_handler, and message_handler_function to handle command starting with start and help
Then as suggested by #ALi in his answer, I will be saving the user input answer as well as the question id he replied to in a dictionary where keys are questions and id are the answer.
Once the user has answered all questions, I can run the algorithms on his answer
Here is how it looks like in the code :
user_dict = {}
# Handle '/start' and '/help'
#bot.message_handler(commands=['help', 'start'])
def send_welcome(message):
# initialise the the bisector and
bisector = LandsatBisector(LON, LAT)
indice = 0
message = send_current_candidate(bot, message, bisector, indice)
bot.register_next_step_handler(
message, partial(
process_step, indice, bisector))
def process_step(indice, bisector, message):
# this run a while loop and will that send picture and will stop when the count is reached
response = message.text
user = User.create_get_user(message, bisector=bisector)
if indice < bisector.count - 1:
indice += 1
try:
# get or create
user.responses[bisector.date] = response # save the response
message = send_current_candidate(bot, message, bisector, indice)
bot.register_next_step_handler(
message, partial(
process_step, indice, bisector))
except Exception as e:
print(e)
bot.reply_to(message, 'oooops')
else:
culprit = bisect(bisector.count,
lambda x: x,
partial(
tester_function,
responses=list(user.responses.values())))
bisector.index = culprit
bot.reply_to(message, f"Found! First apparition = {bisector.date}")

Python issues with objects inside of dictionary? Cant get all data back out

Sorry this is the second post in two days.. I am pulling my hair out with this. I am attempting to take data from reddit and put it into an array in a way I can pull the data out later for tensorflow to parse it. Now the issue is my second object inside of the other object is not giving me whats inside it... "<main.Submission" why am I getting this back?
Goals of this post:
1: Why am I getting <main.Submission> and how should I be doing this.
File "C:/automation/git/tensorflow/untitled0.py", line 35, in <module>
submissions[sm.id].addSubSubmission(Submission.addComment(cmt.id, cmt.author.name, cmt.body))
TypeError: addComment() missing 1 required positional argument: 'body'
Sorry for the long winded and most likely basic questions. Going from powershell to python was not as straight forward as I thought..
Thanks
Cody
import praw
# sets log in data for session
reddit = praw.Reddit(client_id='bY',
client_secret='v9',
user_agent='android:com.example.myredditapp:'
'v1.2.3 (by /u/r)')
class Submission(object):
def __init__(self, id, title, author):
self.id = id
self.title = title
self.subSubmission = {}
self.author = author
def addComment(self, id, author, body):
self.id = id
self.author = author
self.body = body
def addSubSubmission(self,submission):
self.subSubmission[submission,id] = submission
def getSubSubmission(self,id):
return self.subSubmission[id]
submissions = {}
for sm in reddit.subreddit('redditdev').hot(limit=2):
# pulls the ID and makes that the head of each
submissions[sm.id] = Submission(sm.id, sm.title, sm.author.name)
mySubmission = reddit.submission(id=sm.id)
mySubmission.comments.replace_more(limit=0)
# Get all the comments and first post and list their id author and body(comment)
for cmt in mySubmission.comments.list():
submissions[sm.id].addSubSubmission(Submission.addComment(cmt.id, cmt.author.name, cmt.body))
# My trying to read what all there??!? ##
for key in submissions.keys():
value = submissions[key]
print(key, "=", value)
for key, value in submissions.items():
print(key, "=", value)
expecting to see:
{Title = test {comment.id = 1111 {Comment = 'blah', Author = 'Bob'}}
{comment.id = 1112 {Comment = 'blah2', Author = 'Bob2'}}
}
It is giving you back the entire Submission object - but then you're printing it. How should a submission object look on screen when printed? This is something you can define in the Submission class - check out the first answer in this post: Difference between __str__ and __repr__ in Python
To explain this further: python doesn't know how to represent a class on screen. Sure, the class has attributes that are strings, lists, dicts etc, but python knows how to print those. Your class you just created? What's important? What should be printed? python doesn't know this, and is smart enough not to make any assumptions.
If you add a __repr__ function to your class, python will call it and print whatever that function returns.

Creating Rest Api in python using JSON?

I'm new to python REST API and so facing some particular problems. I want that when I enter the input as pathlabid(primary key), I want the corresponding data assigned with that key as output. When I run the following code i only get the data corresponding to the first row of table in database even when the id i enter belong to some other row.
This is the VIEWS.PY
class pathlabAPI(View):
#csrf_exempt
def dispatch(self, *args, **kwargs):
# dont worry about the CSRF here
return super(pathlabAPI, self).dispatch(*args, **kwargs)
def post(self, request):
post_data = json.loads(request.body)
Pathlabid = post_data.get('Pathlabid') or ''
lablist = []
labdict = {}
lab = pathlab()
labs = lab.apply_filter(Pathlabid = Pathlabid)
if Pathlabid :
for p in labs:
labdict["Pathlabid"] = p.Pathlabid
labdict["name"] = p.Name
labdict["email_id"] = p.Emailid
labdict["contact_no"] = p.BasicContact
labdict["alternate_contact_no"] = p.AlternateContact
labdict["bank_account_number"] = p.Accountnumber
labdict["ifsccode"] = p.IFSCcode
labdict["country"] = p.Country
labdict["homepickup"] = p.Homepickup
lablist.append(labdict)
return HttpResponse(json.dumps(lablist))
else:
for p in labs:
labdict["bank_account_number"] = p.Accountnumber
lablist.append(labdict)
return HttpResponse(json.dumps(lablist))
There are a number of issues with the overall approach and code but to fix the issue you're describing, but as a first fix I agree with the other answer: you need to take the return statement out of the loop. Right now you're returning your list as soon as you step through the loop one time, which is why you always get a list with one element. Here's a fix for that (you will need to add from django.http import JsonResponse at the top of your code):
if Pathlabid:
for p in labs:
labdict["Pathlabid"] = p.Pathlabid
labdict["name"] = p.Name
labdict["email_id"] = p.Emailid
labdict["contact_no"] = p.BasicContact
labdict["alternate_contact_no"] = p.AlternateContact
labdict["bank_account_number"] = p.Accountnumber
labdict["ifsccode"] = p.IFSCcode
labdict["country"] = p.Country
labdict["homepickup"] = p.Homepickup
lablist.append(labdict)
else:
for p in labs:
labdict["bank_account_number"] = p.Accountnumber
lablist.append(labdict)
return JsonResponse(json.dumps(lablist))
As suggested in the comments, using Django Rest Framework or a similar package would be an improvement. As a general rule, in Django or other ORMs, you want to avoid looping over a queryset like this and adjusting each element. Why not serialize the queryset itself and do the logic that's in this loop in your template or other consumer?
You are return the response in for loop so that loop break on 1st entry
import json
some_list = []
for i in data:
some_list.append({"key": "value"})
return HttpResponse(json.dumps({"some_list": some_list}), content_type="application/json")
Try above example to solve your problem

Categories

Resources