I've been using Python to access the Rdio API a fair bit so decided to add a couple methods to the Rdio module to make life easier. I keep getting stymied.
Here, as background, is some of the Rdio Python module provided by the company:
class Rdio:
    def __init__(self, consumer, token=None):
        self.__consumer = consumer
        self.token = token
    def __signed_post(self, url, params):
        auth = om(self.__consumer, url, params, self.token)
        req = urllib2.Request(url, urllib.urlencode(params), {'Authorization': auth})
        res = urllib2.urlopen(req)
        return res.read()
    def call(self, method, params=dict()):
        # make a copy of the dict
        params = dict(params)
        # put the method in the dict
        params['method'] = method
        # call to the server and parse the response
        return json.loads(self.__signed_post('http://api.rdio.com/1/', params))
Okay, all well and good. Those functions work fine. So I decided to create a method that would copy a playlist with key1 into a playlist with key2. Here's the code:
    def copy_playlist(self, key1, key2):
        #get track keys from first playlist
        playlist = self.call('get', {'keys': key1, 'extras' : 'tracks'})
        track_keys = []
        for track in tracks:
            key = track['key']
            track_keys.append(key)
        #convert track list into single, comma-separated string (which the API requires)
        keys_string = ', '.join(track_keys)
        #add the tracks to the second playlist
        self.call('addToPlaylist', {'playlist' : key2, 'tracks' : keys_string})
This code works fine if I run it from the terminal or in an external Python file, but when I include it as part of the Rdio class, instantiate the Rdio object as rdio, and call the copy_playlist method, I always get the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "rdio_extended.py", line 83, in copy_playlist
NameError: global name 'rdio' is not defined
I can't seem to get around this. There's probably a simple answer - I'm pretty new to programming - but I'm stumped.
UPDATE: Updated code formatting, and here's the actual code that creates the Rdio object:
rdio = Rdio((RDIO_CONSUMER_KEY, RDIO_CONSUMER_SECRET), (RDIO_TOKEN, RDIO_TOKEN_SECRET))
And then this is the line to call the playlist-copying function:
rdio.copy_playlist(key1, key2)
That results in the NameError described above.
I have the following python code that is working ok to use reddit's api and look up the front page of different subreddits and their rising submissions.
from pprint import pprint
import requests
import json
import datetime
import csv
import time
subredditsToScan = ["Arts", "AskReddit", "askscience", "aww", "books", "creepy", "dataisbeautiful", "DIY", "Documentaries", "EarthPorn", "explainlikeimfive", "food", "funny", "gaming", "gifs", "history", "jokes", "LifeProTips", "movies", "music", "pics", "science", "ShowerThoughts", "space", "sports", "tifu", "todayilearned", "videos", "worldnews"]
ofilePosts = open('posts.csv', 'wb')
writerPosts = csv.writer(ofilePosts, delimiter=',')
ofileUrls = open('urls.csv', 'wb')
writerUrls = csv.writer(ofileUrls, delimiter=',')
for subreddit in subredditsToScan:
    front = requests.get(r'http://www.reddit.com/r/' + subreddit + '/.json')
    rising = requests.get(r'http://www.reddit.com/r/' + subreddit + '/rising/.json')
    front.text
    rising.text
    risingData = rising.json()
    frontData = front.json()
    print(len(risingData['data']['children']))
    print(len(frontData['data']['children']))
    for i in range(0, len(risingData['data']['children'])):
        author = risingData['data']['children'][i]['data']['author']
        score = risingData['data']['children'][i]['data']['score']
        subreddit = risingData['data']['children'][i]['data']['subreddit']
        gilded = risingData['data']['children'][i]['data']['gilded']
        numOfComments = risingData['data']['children'][i]['data']['num_comments']
        linkUrl = risingData['data']['children'][i]['data']['permalink']
        timeCreated = risingData['data']['children'][i]['data']['created_utc']
        writerPosts.writerow([author, score, subreddit, gilded, numOfComments, linkUrl, timeCreated])
        writerUrls.writerow([linkUrl])
    for j in range(0, len(frontData['data']['children'])):
        author = frontData['data']['children'][j]['data']['author'].encode('utf-8').strip()
        score = frontData['data']['children'][j]['data']['score']
        subreddit = frontData['data']['children'][j]['data']['subreddit'].encode('utf-8').strip()
        gilded = frontData['data']['children'][j]['data']['gilded']
        numOfComments = frontData['data']['children'][j]['data']['num_comments']
        linkUrl = frontData['data']['children'][j]['data']['permalink'].encode('utf-8').strip()
        timeCreated = frontData['data']['children'][j]['data']['created_utc']
        writerPosts.writerow([author, score, subreddit, gilded, numOfComments, linkUrl, timeCreated])
        writerUrls.writerow([linkUrl])
It works well and scrapes the data accurately but it constantly gets interrupted, seemingly randomly, and has a run time crash, saying:
Traceback (most recent call last):
File "dataGather1.py", line 27, in <module>
for i in range(0, len(risingData['data']['children'])):
KeyError: 'data'
I have no idea why this error occurs intermittently rather than consistently. I thought I might be calling the API too often and getting blocked, so I added a sleep to my code, but that did not help. Any ideas?
When the API response contains no data, the dictionary has no 'data' key, so you get a KeyError on some subreddits. You need to wrap the lookup in a try/except block.
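A minimal sketch of that try/except, assuming the listing has the same data/children shape used in the question. The helper name extract_children and the shape of the error payload are made up for illustration:

```python
def extract_children(payload):
    """Return the posts in a listing, or an empty list if the payload has no 'data' key."""
    try:
        return payload['data']['children']
    except (KeyError, TypeError):
        # An error response (for example a rate-limit message) has no 'data' key
        return []

# Hypothetical payloads: one normal listing and one error response
ok = {'data': {'children': [{'data': {'author': 'alice', 'score': 3}}]}}
error = {'message': 'Too Many Requests', 'error': 429}

print(len(extract_children(ok)))
print(len(extract_children(error)))
```

With a helper like this the loop simply runs zero times for a subreddit whose response carries no data, instead of crashing the whole script.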
The json you are parsing doesn't contain the 'data' element. Thus you get an error. I think your hunch is correct though. It is probably rate limiting, or that you're asking for hidden/deleted entries.
Reddit is very strict about clients that access their API without playing nice. That means you should register your app and send a meaningful User-Agent with your requests, and you should probably use the Python library for this kind of thing: https://praw.readthedocs.io/en/latest/
Without registering, in my experience the direct REST reddit API is even stricter than the 1 request per 2 seconds rule they have (had?).
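As a rough illustration of sending a descriptive User-Agent, here is a sketch that only builds the request with the standard library and sends nothing. The helper name build_listing_request and the user-agent string are placeholders, and whether reddit accepts unregistered clients at all is not something this sketch settles:

```python
import urllib.request

def build_listing_request(subreddit, user_agent):
    """Build (but do not send) a request for a subreddit's rising listing."""
    url = 'http://www.reddit.com/r/' + subreddit + '/rising/.json'
    return urllib.request.Request(url, headers={'User-Agent': user_agent})

# Placeholder user agent: name the script and give a way to contact its author
req = build_listing_request('askscience', 'dataGather1/0.1 (by u/your_username)')
print(req.full_url)
print(req.get_header('User-agent'))
```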
Python raises a KeyError whenever a dict() object is requested (using the format a = adict[key]) and the key is not in the dictionary.
It seems like when you are getting this error, your data value is empty.
You might just try to get the length of the dictionary before you execute the for loop. If it’s empty, it will just not run. Some interesting error checking here might help.
size = len(risingData)
if size:
    for i in range(0, size):
        …
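Another way to get the same "just don't run the loop" behaviour is dict.get with a default, which returns the default instead of raising KeyError. This is only a sketch; children_of is a made-up helper name and the error payload is hypothetical:

```python
def children_of(listing):
    # .get returns the default when the key is missing, so no KeyError is raised
    return listing.get('data', {}).get('children', [])

rising_data = {'error': 429}  # hypothetical payload without a 'data' key
for child in children_of(rising_data):
    print(child['data']['author'])  # not reached for the payload above
print(len(children_of(rising_data)))
```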
I'm not too familiar with Python, but I have set up a BDD framework using Python behave. I now want to create a World map class that holds data and is retrievable throughout all scenarios.
For instance, I would have a World class that I can use like this:
World w
w.key.add('key', api.response)
In one scenario and in another I can then use:
World w
key = w.key.get('key').
Edit:
Or if there is a built in way of using context or similar in behave where the attributes are saved and retrievable throughout all scenarios that would be good.
Like lettuce where you can use world http://lettuce.it/tutorial/simple.html
I've tried this between scenarios but it doesn't seem to be picking it up
class World(dict):
    def __setitem__(self, key, item):
        self.__dict__[key] = item
        print(item)
    def __getitem__(self, key):
        return self.__dict__[key]
Setting the item in one step in scenario A: w.setitem('key', response)
Getting the item in another step in scenario B: w.getitem('key',)
This shows me an error though:
Traceback (most recent call last):
File "C:\Program Files (x86)\Python\lib\site-packages\behave\model.py", line 1456, in run
match.run(runner.context)
File "C:\Program Files (x86)\Python\lib\site-packages\behave\model.py", line 1903, in run
self.func(context, *args, **kwargs)
File "steps\get_account.py", line 14, in step_impl
print(w.__getitem__('appToken'))
File "C:Project\steps\world.py", line 8, in __getitem__
return self.__dict__[key]
KeyError: 'key'
It appears that the World does not hold values here between steps that are run.
Edit:
I'm unsure how to use environment.py but can see it has a way of running code before the steps. How can I allow my call to a soap client within environment.py to be called and then pass this to a particular step?
Edit:
I have made the request in environment.py and hardcoded the values, how can I pass variables to environment.py and back?
It's called "context" in the python-behave jargon. The first argument of your step definition function is an instance of the behave.runner.Context class, in which you can store your world instance. Please see the appropriate part of the tutorial.
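A sketch of how that could look in environment.py, assuming the shared data is a plain dict stored as context.world (a name chosen here for illustration). behave calls the hooks by name. Attributes set in before_all stay available for the whole run, whereas attributes first set inside a step are dropped when the scenario ends. The SimpleNamespace objects at the bottom are stand-ins so the sketch runs without behave installed:

```python
from types import SimpleNamespace

# environment.py (sketch): behave looks these hooks up by name
def before_all(context):
    # Created once for the whole run, so every scenario sees the same dict
    context.world = {}

def before_scenario(context, scenario):
    # Illustrative only: record which scenarios have started so far
    context.world.setdefault('seen', []).append(scenario.name)

# Stand-ins for the objects behave would pass in
context = SimpleNamespace()
before_all(context)
before_scenario(context, SimpleNamespace(name='Scenario A'))
context.world['key'] = 'response from scenario A'  # what a step in scenario A would do
before_scenario(context, SimpleNamespace(name='Scenario B'))
print(context.world['key'])                        # what a step in scenario B would read
```

The steps mutate the dict rather than rebinding context.world, which is why the value survives the scenario boundary.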
Have you tried the simple approach of using a global variable? For instance:
def before_all(context):
    global response
    response = api.response
def before_scenario(context, scenario):
    global response
    w.key.add('key', response)
I guess the feature can also be accessed from the context, for instance:
def before_feature(context, feature):
    feature.response = api.response
def before_scenario(context, scenario):
    w.key.add('key', context.feature.response)
You are looking for a class variable: a variable that is shared by all instances of a class.
The code in your question uses instance variables.
Read about: python_classes_objects
For instance:
class World(dict):
    __class_var = {}
    def __setitem__(self, key, item):
        World.__class_var[key] = item
    def __getitem__(self, key):
        return World.__class_var[key]
# Scenario A
A = World()
A['key'] = 'test'
print('A[\'key\']=%s' % A['key'] )
del A
# Scenario B
B = World()
print('B[\'key\']=%s' % B['key'] )
Output:
A['key']=test
B['key']=test
Tested with Python:3.4.2
Come back and Flag your Question as answered if this is working for you or comment why not.
Defining a global variable in the before_all hook, as suggested by @stovfl, did not work for me, although defining the global variable within one of my steps did.
Instead, as Szabo Peter mentioned, use the context:
context.your_variable_name = api.response
and then use context.your_variable_name anywhere the value is needed.
For this I actually used a config file (config.py). I added the variables there and retrieved them using getattr. See below:
WSDL_URL = 'wsdl'
USERNAME = 'name'
PASSWORD = 'PWD'
Then retrieved them like:
import config
getattr(config, 'USERNAME', 'username not found')
I'm trying to process URLs in a pyspark dataframe using a class that I've written and a udf. I'm aware of urllib and other URL parsing libraries, but for this case I need to use my own code.
In order to get the TLD of a URL, I cross-check it against the IANA public suffix list.
Here's a simplification of my code:
class Parser:
    # list of available public suffixes for extracting top level domains
    file = open("public_suffix_list.txt", 'r')
    data = []
    for line in file:
        if line.startswith("//") or line == '\n':
            pass
        else:
            data.append(line.strip('\n'))
    def __init__(self, url):
        self.url = url
        #the code here extracts port,protocol,query etc.
        #I think this bit below is causing the error
        matches = [r for r in self.data if r in self.hostname]
        #extra functionality in my actual class
        i = matches.index(self.string)
        try:
            self.tld = matches[i]
        # logic to find tld if no match
The class works in pure python so for example I can run
import Parser
x = Parser("www.google.com")
x.tld #returns ".com"
However when I try to do
import Parser
from pyspark.sql.functions import udf
parse = udf(lambda x: Parser(x).url)
df = sqlContext.table("tablename").select(parse("column"))
When I call an action I get
File "<stdin>", line 3, in <lambda>
File "<stdin>", line 27, in __init__
TypeError: 'in <string>' requires string as left operand
So my guess is that it's failing to interpret the data as a list of strings?
I've also tried to use
file = sc.textFile("my_file.txt")\
    .filter(lambda x: not x.startswith("//") and x != "")\
    .collect()
data = sc.broadcast(file)
to open my file instead, but that causes
Exception: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transforamtion. SparkContext can only be used on the driver, not in code that it run on workers. For more information, see SPARK-5063.
Any ideas?
Thanks in advance
EDIT: Apologies, I didn't have my code to hand so my test code didn't explain very well the problems I was having. The error I initially reported was a result of the test data I was using.
I've updated my question to be more reflective of the challenge I'm facing.
Why do you need a class in this case? (The code defining your class is incorrect: you never declared self.data before using it in the __init__ method.) The only relevant line that affects the output you want is self.string = string, so you are basically passing the identity function as the udf.
The UnicodeDecodeError is due to an encoding issue in your file; it has nothing to do with your definition of the class.
The second error is in the line sc.broadcast(file), details of which can be found here: Spark: Broadcast variables: It appears that you are attempting to reference SparkContext from a broadcast variable, action, or transforamtion
EDIT 1
I would redefine your class structure as follows. You basically need to create the instance attribute self.data by assigning self.data = data before you can use it. Also, anything you write in the class body before the __init__ method is executed when the class is defined, irrespective of whether you ever instantiate the class, so moving the file parsing out of the class does not change when it runs.
# list of available public suffixes for extracting top level domains
file = open("public_suffix_list.txt", 'r')
data = []
for line in file:
    if line.startswith("//") or line == '\n':
        pass
    else:
        data.append(line.strip('\n'))
class Parser:
    def __init__(self, url):
        self.url = url
        self.data = data
        #the code here extracts port,protocol,query etc.
        #I think this bit below is causing the error
        matches = [r for r in self.data if r in self.hostname]
        #extra functionality in my actual class
        i = matches.index(self.string)
        try:
            self.tld = matches[i]
        # logic to find tld if no match
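For comparison, here is a pure-Python sketch of the suffix lookup with the loading done in a plain function and the suffixes passed in explicitly. The names load_suffixes and find_tld are made up for illustration, the sample list is a tiny stand-in for the real file, and the wildcard and exception rules of the real public suffix list are ignored:

```python
def load_suffixes(lines):
    """Keep the non-comment, non-blank lines of a public suffix list."""
    suffixes = set()
    for line in lines:
        line = line.strip()
        if line and not line.startswith("//"):
            suffixes.add(line)
    return suffixes

def find_tld(hostname, suffixes):
    """Return the longest listed suffix that ends the hostname on a label boundary, or None."""
    labels = hostname.lower().split(".")
    # Try the longest candidate first, then drop one leading label at a time
    for start in range(len(labels)):
        candidate = ".".join(labels[start:])
        if candidate in suffixes:
            return candidate
    return None

sample = ["// comment line", "", "com", "uk", "co.uk"]  # stand-in for the file contents
suffixes = load_suffixes(sample)
print(find_tld("www.google.com", suffixes))
print(find_tld("www.example.co.uk", suffixes))
```

Comparing whole labels avoids the false positives that a substring test such as r in hostname can produce, for example "co" matching inside "coffee.example".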
I am facing an issue with code that I restructured into a class.
Previously it was simply a file with functions defined in it.
When I ran that file with python filename.py, it worked as needed.
A code sample follows:
# Getting the tenant list
# Fetch the creation_date of tenant if exists
def get_tenants():
    # Fetch tenant list
    tenants_list = keystone.tenants.list()
    # Fetch tenant ID
    for tenant in tenants_list:
        tenant_id = tenant.id
        .
        .
        .
get_tenants()
As shown in the code above, the file calls the get_tenants function at the end, and it works as needed with no error.
I have now created a class and moved all the functions into it.
The function above has been rewritten as follows:
    def get_tenants(self):
        # Fetch tenant list
        tenants_list = keystone.tenants.list()
        # Fetch tenant ID
        for tenant in tenants_list:
            tenant_id = tenant.id
Then I have called the Function as follows:
billing = BillingEngine()
billing.get_tenants()
But, now I am getting the error as follows:
root@devstack:/opt/open-stack-tools/billing# python new_class.py
Traceback (most recent call last):
File "new_class.py", line 281, in <module>
BillingEngine().get_tenants()
File "new_class.py", line 75, in get_tenants
tenants_list = keystone.tenants.list()
NameError: global name 'keystone' is not defined
Note: Will provide the full file if needed.
Maybe you need to define keystone as a class attribute and access it through self?
class Example(object):
    keystone = Keystone()
    def get_tenants(self):
        self.keystone.do_something()
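An alternative sketch, assuming the keystone client is created somewhere outside the class: pass it to the constructor and keep it on the instance, so every method reaches it through self. FakeKeystone and its tenant ids are hypothetical stand-ins so the example runs without OpenStack:

```python
from types import SimpleNamespace

class FakeKeystone:
    """Hypothetical stand-in for the real keystone client."""
    class tenants:
        @staticmethod
        def list():
            return [SimpleNamespace(id="t-1"), SimpleNamespace(id="t-2")]

class BillingEngine:
    def __init__(self, keystone):
        # Store the client on the instance instead of relying on a global name
        self.keystone = keystone

    def get_tenants(self):
        # Fetch the tenant list and collect the tenant IDs
        return [tenant.id for tenant in self.keystone.tenants.list()]

billing = BillingEngine(FakeKeystone())
print(billing.get_tenants())
```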
I'm working on a twitch irc bot and one of the components I wanted to have available was the ability for the bot to save quotes to a pastebin paste on close, and then retrieve the same quotes on start up.
I've started with the saving part, and have hit a road block where I can't seem to get a valid post, and I can't figure out a method.
#!/usr/bin/env python3
import urllib.parse
import urllib.request
# --------------------------------------------- Pastebin Requisites --------------------------------------------------
pastebin_key = 'my pastebin key' # developer api key, required. GET: http://pastebin.com/api
pastebin_password = 'password' # password for pastebin_username
pastebin_postexp = 'N' # N = never expire
pastebin_private = 0 # 0 = Public 1 = unlisted 2 = Private
pastebin_url = 'http://pastebin.com/api/api_post.php'
pastebin_username = 'username' # user corresponding with key
# --------------------------------------------- Value clean up --------------------------------------------------
pastebin_password = urllib.parse.quote(pastebin_password, safe='/')
pastebin_username = urllib.parse.quote(pastebin_username, safe='/')
# --------------------------------------------- Pastebin Functions --------------------------------------------------
def post(title, content): # used for posting a new paste
    pastebin_vars = {'api_option': 'paste', 'api_user_key': pastebin_username, 'api_paste_private': pastebin_private,
                     'api_paste_name': title, 'api_paste_expire_date': pastebin_postexp, 'api_dev_key': pastebin_key,
                     'api_user_password': pastebin_password, 'api_paste_code': content}
    try:
        str_to_paste = ', '.join("{!s}={!r}".format(key, val) for (key, val) in pastebin_vars.items()) # dict to str :D
        str_to_paste = str_to_paste.replace(":", "") # remove :
        str_to_paste = str_to_paste.replace("'", "") # remove '
        str_to_paste = str_to_paste.replace(")", "") # remove )
        str_to_paste = str_to_paste.replace(", ", "&") # replace dividers with &
        urllib.request.urlopen(pastebin_url, urllib.parse.urlencode(pastebin_vars)).read()
        print('did that work?')
    except:
        print("post submit failed :(")
    print(pastebin_url + "?" + str_to_paste) # print the output for test
post("test", "stuff")
I'm open to importing more libraries and stuff, not really sure what I'm doing wrong after working on this for two days straight :S
import urllib.parse
import urllib.request
PASTEBIN_KEY = 'xxx'
PASTEBIN_URL = 'https://pastebin.com/api/api_post.php'
PASTEBIN_LOGIN_URL = 'https://pastebin.com/api/api_login.php'
PASTEBIN_LOGIN = 'my_login_name'
PASTEBIN_PWD = 'yyy'
def pastebin_post(title, content):
    login_params = dict(
        api_dev_key=PASTEBIN_KEY,
        api_user_name=PASTEBIN_LOGIN,
        api_user_password=PASTEBIN_PWD
    )
    data = urllib.parse.urlencode(login_params).encode("utf-8")
    req = urllib.request.Request(PASTEBIN_LOGIN_URL, data)
    with urllib.request.urlopen(req) as response:
        pastebin_vars = dict(
            api_option='paste',
            api_dev_key=PASTEBIN_KEY,
            api_user_key=response.read(),
            api_paste_name=title,
            api_paste_code=content,
            api_paste_private=2,
        )
        return urllib.request.urlopen(PASTEBIN_URL, urllib.parse.urlencode(pastebin_vars).encode('utf8')).read()
rv = pastebin_post("This is my title", "These are the contents I'm posting")
print(rv)
Combining two different answers above gave me this working solution.
First, your try/except block is throwing away the actual error. You should almost never use a "bare" except clause without capturing or re-raising the original exception. See this article for a full explanation.
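A small sketch of the difference, using a made-up risky_post function in place of the real urlopen call so it runs offline. The handler names the exception type, prints the actual message, and re-raises instead of swallowing the error:

```python
def risky_post(data):
    # Hypothetical stand-in for the network call: rejects str the way urllib rejects str POST data
    if isinstance(data, str):
        raise TypeError("POST data should be bytes")
    return b"ok"

def post(data):
    try:
        return risky_post(data)
    except TypeError as exc:
        # Report the real error, then let it propagate to the caller
        print("post submit failed:", exc)
        raise

try:
    post("a=1")
except TypeError as exc:
    caught = str(exc)

print(caught)
print(post(b"a=1"))
```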
Once you remove the try/except, you will see the underlying error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "paste.py", line 42, in post
urllib.request.urlopen(pastebin_url, urllib.parse.urlencode(pastebin_vars)).read()
File "/usr/lib/python3.4/urllib/request.py", line 161, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.4/urllib/request.py", line 461, in open
req = meth(req)
File "/usr/lib/python3.4/urllib/request.py", line 1112, in do_request_
raise TypeError(msg)
TypeError: POST data should be bytes or an iterable of bytes. It cannot be of type str.
This means you're trying to pass a unicode string into a function that's expecting bytes. When you do I/O (like reading/writing files on disk, or sending/receiving data over HTTP) you typically need to encode any unicode strings as bytes. See this presentation for a good explanation of unicode vs. bytes and when you need to encode and decode.
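To make the str versus bytes distinction concrete, here is a short offline sketch. The parameter values are placeholders. urlencode returns a str, and calling .encode on it produces the bytes that urlopen expects as POST data:

```python
import urllib.parse

params = {'api_option': 'paste', 'api_paste_code': 'hello world'}  # placeholder values

encoded = urllib.parse.urlencode(params)  # a str such as 'a=1&b=2'
body = encoded.encode('utf-8')            # the same text as bytes, suitable for POST data

print(type(encoded).__name__, type(body).__name__)
print(body)
```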
Next, this line:
urllib.request.urlopen(pastebin_url, urllib.parse.urlencode(pastebin_vars)).read()
Is throwing away the response, so you have no way of knowing the result of your API call. Assign this to a variable or return it from your function so you can then inspect the value. It will either be a URL to the paste, or an error message from the API.
Next, I think your code is sending a lot of unnecessary parameters to the API and your str_to_paste statements aren't necessary.
I was able to make a paste using the following, much simpler, code:
import urllib.parse
import urllib.request
PASTEBIN_KEY = 'my-api-key' # developer api key, required. GET: http://pastebin.com/api
PASTEBIN_URL = 'http://pastebin.com/api/api_post.php'
def post(title, content): # used for posting a new paste
    pastebin_vars = dict(
        api_option='paste',
        api_dev_key=PASTEBIN_KEY,
        api_paste_name=title,
        api_paste_code=content,
    )
    return urllib.request.urlopen(PASTEBIN_URL, urllib.parse.urlencode(pastebin_vars).encode('utf8')).read()
Here it is in use:
>>> post("test", "hello\nworld.")
b'http://pastebin.com/v8jCkHDB'
I didn't know about pastebin until now. I read their api and tried it for the first time, and it worked perfectly fine.
Here's what I did:
I logged in to fetch the api_user_key.
Included that in the posting along with api_dev_key.
Checked the website, and the post was there.
Here's the code:
import urllib.parse
import urllib.request
def post(url, params):
    # Encode the form fields as bytes and POST them to the given URL
    data = urllib.parse.urlencode(params).encode("utf-8")
    req = urllib.request.Request(url, data)
    with urllib.request.urlopen(req) as response:
        return response.read()
# Logging in to fetch api_user_key
login_url = "http://pastebin.com/api/api_login.php"
login_params = {"api_dev_key": "<the dev key they gave you>",
                "api_user_name": "<username goes here>",
                "api_user_password": "<password goes here>"}
api_user_key = post(login_url, login_params)
# Posting some random text
post_url = "http://pastebin.com/api/api_post.php"
post_params = {"api_dev_key": "<the dev key they gave you>",
               "api_option": "paste",
               "api_paste_code": "<head>Testing</head>",
               "api_paste_private": "0",
               "api_paste_name": "testing.html",
               "api_paste_expire_date": "10M",
               "api_paste_format": "html5",
               "api_user_key": api_user_key}
response = post(post_url, post_params)
Only the first three parameters are needed for posting something, the rest are optional.
FYI, the API doesn't seem to accept http requests as of writing this, so make sure your URLs are in the format https://pas...