Get the URL of any submission for a subreddit - Python

I am trying to use PRAW to get new posts from subreddits on Reddit. The following code snippet shows how I get new items on a particular subreddit.
Is there a way to also get the URL of the particular submission?
submissions = r.get_subreddit('todayilearned')
submission = submissions.get_new(limit=1)
sub = [str(x) for x in submission]
print sub

PRAW allows you to do this. To get the submitted link, you can use submission.url:
[submission] = submissions.get_new(limit=1)
print submission.url
Or, if you're looking for the URL of the actual post on Reddit, you can use permalink:
[submission] = submissions.get_new(limit=1)
print submission.permalink

The documentation lists a short_link property that returns a shortened version of the URL to the submission. The full URL does not appear to be provided directly, though it could be reconstructed from the subreddit name and the submission's id, which is stored in submission.id.
In summary, use:
[submission] = submissions.get_new(limit=1)
submission.short_link
to get a link to the submission.
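If you need the full URL rather than the short link, here is a minimal sketch of the reconstruction suggested above; the /r/<subreddit>/comments/<id>/ pattern is an assumption about Reddit's URL scheme, not something the library guarantees:
[submission] = submissions.get_new(limit=1)
# Hypothetical reconstruction from the subreddit name and submission.id;
# the URL pattern below is assumed, not provided by PRAW itself.
full_url = 'https://www.reddit.com/r/%s/comments/%s/' % ('todayilearned', submission.id)
print full_url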

Related

How to get Last Updated date for a document from confluence page through api?

I am trying to get the Last Updated date for a document from Confluence using the API, but have not been able to get it. Can someone point me in the right direction? One recommended solution was to use the requests library along with Beautiful Soup and parse the HTML, but I would like to do this via an API, and so far I have not had much success.
I am using this:
https://atlassian-python-api.readthedocs.io/confluence.html
and this:
https://github.com/atlassian-api/atlassian-python-api/blob/master/atlassian/confluence.py
I saw the following in the first link I provided:
#Compare content and check is already updated or not
confluence.is_page_content_is_already_updated(page_id, body)
But what I want is to grab the date a document was last updated. This date is present in our Confluence docs under the title “Last Updated”, and it is shown on every document.
I see that this has not been answered, so here's my take after some research:
goOn = True
pages = list()
startAt = 0
debug = True  # set to False to silence the per-page output below

while goOn:
    batch = confluence.get_all_pages_from_space('<ConfluenceSpaceName>', start=startAt, limit=100, status=None, expand='title,history.lastUpdated')
    pages.extend(batch)
    startAt = len(pages)
    if len(batch) == 0:
        goOn = False

for p in pages:
    if debug: print('Page {}; updated: {}'.format(p['title'], p['history']['lastUpdated']['when']))
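If you already know the page id, you may not need to walk the whole space. A minimal sketch, assuming get_page_by_id accepts the same history.lastUpdated expansion (worth verifying against your Confluence version):
# 'history.lastUpdated' is assumed to expand the same way it does in
# get_all_pages_from_space above; page_id is your document's id.
page = confluence.get_page_by_id(page_id, expand='history.lastUpdated')
print(page['history']['lastUpdated']['when'])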
Thanks, Kirill

Is there any way to parse through reddit comments and replies through JSON and not through PRAW?

I am trying to parse through reddit comments and the replies to each comment made. However, I am trying to avoid using PRAW. This is the code I have right now to display the titles of each post within a subreddit. But how do I access the comment field and its replies?
import requests
import json

r = requests.get('http://www.reddit.com/r/wallstreetbets/new.json?count=500', headers={'User-agent': 'Chrome'})
r_comments = requests.get('https://www.reddit.com/r/wallstreetbets/comments.json')

theJSON = json.loads(r.text)
theJSON_comments = json.loads(r_comments.text)

titles = []
#print(theJSON)

# prints the titles
for child in theJSON['data']['children']:
    titles.append(child['data']['title'])
    #print(child['data']['title'])

for child2 in theJSON_comments['data']['children']:
    print(child2['data'][0])
If you're using PRAW and want to get all of the comments on a submission, you can do it like this:
submission = reddit.submission(id=<submission_id>)
submission.comments.replace_more(limit=None)
all_comments = submission.comments.list()
all_comments is then a flat list of Comment objects. It is not JSON, but it can be serialized and saved as a JSON file.
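For example, a minimal sketch of saving the comment bodies to a file (the comments.json filename is just an illustration):
import json

# Comment objects aren't directly JSON-serializable, so extract the plain
# text bodies (or whatever fields you need) before dumping.
bodies = [comment.body for comment in all_comments]
with open('comments.json', 'w') as f:
    json.dump(bodies, f)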
If you don't want to use PRAW, you can call the Reddit API manually from any language; I have a blog post that walks through setting this up in JavaScript.
I think PRAW would really help you out here, but otherwise the raw Reddit API is your best bet.
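To answer the question as asked, though, a minimal sketch without PRAW: appending .json to a thread's URL returns a two-element listing (the post, then the comment tree), and each comment's replies field holds the nested replies. The <thread_id> below is a placeholder:
import requests

def walk_comments(children, depth=0):
    # Recursively print each comment body, then descend into its replies.
    for child in children:
        if child['kind'] != 't1':  # skip 'more' placeholders in this sketch
            continue
        data = child['data']
        print('  ' * depth + data['body'])
        replies = data.get('replies')
        if replies:  # 'replies' is an empty string when there are none
            walk_comments(replies['data']['children'], depth + 1)

url = 'https://www.reddit.com/r/wallstreetbets/comments/<thread_id>.json'
resp = requests.get(url, headers={'User-agent': 'Chrome'})
post_listing, comment_listing = resp.json()
walk_comments(comment_listing['data']['children'])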

Unable to get Facebook Group members after first page using Python

I am trying to get the names of the members of a group I am a member of. I am able to get the names on the first page, but I am not sure how to go to the next page:
My Code:
url = 'https://graph.facebook.com/v2.5/1671554786408615/members?access_token=<MY_CUSTOM_ACCESS_CODE_HERE>'
json_obj = urllib2.urlopen(url)
data = json.load(json_obj)
for each in data['data']:
    print each['name']
Using the code above I am successfully getting all names on the first page but question is -- how do I go to the next page?
In the Graph API Explorer output I can see a paging section containing a next URL.
What change does my code need to keep going to the next pages and get the names of ALL members of the group?
The JSON returned by the Graph API is telling you where to get the next page of data, in data['paging']['next']. You could give something like this a try:
def printNames(url):
    json_obj = urllib2.urlopen(url)
    data = json.load(json_obj)
    for each in data['data']:
        print each['name']
    return data['paging']['next']  # Return the URL to the next page of data

url = 'https://graph.facebook.com/v2.5/1671554786408615/members?access_token=<MY_CUSTOM_ACCESS_CODE_HERE>'
url = printNames(url)
print "====END OF PAGE 1===="
url = printNames(url)
print "====END OF PAGE 2===="
You would need to add checks; for instance, data['paging']['next'] will only be present in the JSON object if there is a next page, so you might want to modify your function to return a more complex structure to convey this information, but this should give you the idea.
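Putting those checks together, a minimal sketch that keeps following next until the API stops returning one:
import json
import urllib2

url = 'https://graph.facebook.com/v2.5/1671554786408615/members?access_token=<MY_CUSTOM_ACCESS_CODE_HERE>'
while url:
    data = json.load(urllib2.urlopen(url))
    for each in data['data']:
        print each['name']
    # 'next' is only present while there are more pages to fetch
    url = data.get('paging', {}).get('next')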

How can I get the current URL or the URL clicked on and save it as a string in python?

How can I get the current URL and save it as a string in python?
I have some code that uses encodedURL = urllib.quote_plus to change the URL in a for loop going through a list. I cannot save encodedURL as a new variable because it is assigned inside a for loop and will always end up holding the last item in the list.
My end goal is that I want to get the URL of a hyperlink that the user clicks on, so I can display certain content on that specific URL.
Apologies if I have left out important information. There is too much code and too many modules to post it all here. If you need anything else please let me know.
EDIT: To add more description:
I have a page with a list of user comments about websites. Each website name is hyperlinked to the actual website, and there is a "list all comments about this website" link next to it. My goal is that when the user clicks on "list all comments about this website", another page opens showing every comment about that website. The problem is that I cannot determine which website they are referring to when they click "all comments about this website".
Don't know if it helps but this is what I am using:
z = []
for x in S:
    y = list(x)
    z.append(y)

for coms in z:
    url = urllib.quote_plus(coms[2])
    coms[2] = "'Commented on:' <a href='%s'> %s</a> (<a href='conversation?page=%s'> all </a>) " % (coms[2], coms[2], url)
    coms[3] += "<br><br>"

deCodedURL = urllib.unquote_plus(url)
text2 = interface.list_comments_page(db, **THIS IS THE PROBLEM**)
page_comments = {
    'comments_page': '<p>%s</p>' % text2,
}

if environ['PATH_INFO'] == '/conversation':
    headers = [('content-type', 'text/html')]
    start_response("200 OK", headers)
    return templating.generate_page(page_comments)
So your problem is you need to parse the URL for the query string, and urllib has some helpers for that:
>>> i
'conversation?page=http://www.google.com/'
>>> urllib.splitvalue(urllib.splitquery(i)[1])
('page', 'http://www.google.com/')
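Equivalently, a minimal sketch using the documented urlparse helpers instead of urllib's internal split functions (Python 2 module names):
import urlparse

link = 'conversation?page=http://www.google.com/'
query = urlparse.urlparse(link).query  # 'page=http://www.google.com/'
params = urlparse.parse_qs(query)      # {'page': ['http://www.google.com/']}
print params['page'][0]                # http://www.google.com/
Note that parse_qs also percent-decodes, so a quote_plus-encoded page value comes back as the original URL.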

Using the Python GData API, cannot get editable video entry

I am having trouble getting a video entry which includes a link rel="edit". I need such an entry in order to be able to call DeleteVideoEntry(...) on it.
I am retrieving the video using GetYouTubeVideoEntry(youtube_id=XXXXXXX). My yt_service is initialized with a username, password, and a developer key. I use ProgrammaticLogin. This part seems to work fine. I use the same yt_service to upload said video earlier. Also, if I change the developer key to something bogus (during debugging) and try to authenticate, I get a 403 error. This leads me to believe that authentication works OK.
Needless to say, the video entry retrieved with GetYouTubeVideoEntry(youtube_id=XXXXXXX) does not contain the edit link and I cannot use the entry in a DeleteVideoEntry(...) call.
Is there some special way to get a video entry which will contain a link element with a rel="edit"? Can anyone suggest some way to resolve my issue? Could this possibly be a bug?
Update:
For the record, when I tried getting the feed of all my uploads and then looping through the video entries, those entries do have an edit link. So using this works:
uri = 'http://gdata.youtube.com/feeds/api/users/%s/uploads' % username
feed = yt_service.GetYouTubeVideoFeed(uri)
for entry in feed.entry:
yt_service.DeleteVideoEntry(entry)
But this does not:
entry = yt_service.GetYouTubeVideoEntry(video_id = video.youtube_id)
yt_service.DeleteVideoEntry(entry)
Using the same yt_service.
I've just deleted a YouTube video using gdata and ProgrammaticLogin().
Here are the steps to reproduce:
import gdata.youtube.service
yt_service = gdata.youtube.service.YouTubeService()
yt_service.developer_key = 'developer_key'
yt_service.email = 'email'
yt_service.password = 'password'
yt_service.ProgrammaticLogin()
# username and video_id are placeholders; video_id should look like 'iu6Gq-tUsTc'
uri = 'https://gdata.youtube.com/feeds/api/users/%s/uploads/%s' % (username, video_id)
entry = yt_service.GetYouTubeUserEntry(uri=uri)
response = yt_service.DeleteVideoEntry(entry)
print response # True
yt_service.GetYouTubeVideoFeed(uri) works because GetYouTubeVideoFeed doesn't check the uri and just calls self.Get(uri, ...), though originally, I think, it expected an 'https://gdata.youtube.com/feeds/api/videos' uri.
Conversely, yt_service.GetYouTubeVideoEntry() uses YOUTUBE_VIDEO_URI = 'https://gdata.youtube.com/feeds/api/videos', but the entry it returns doesn't contain rel="edit".
Hope that helps you out
You can view the HTTP headers of the generated requests by setting the debug flag to true. This is as simple as:
yt_service = gdata.youtube.service.YouTubeService()
yt_service.debug = True
You can read about this in the gdata-python-client documentation.
