I was wondering if i could get some input from some season python exports, i have a couple questions
I am extracting data from an api request and calculating the total vulnerabilities,
what is the best way i can return this data so that i can call it in another function
what is the way i can add up all the vulnerabilities (right now its just adding it per 500 at a time, id like to do the sum of every vulnerability
def _request():
third_party_patching_filer = {
"asset": "asset.agentKey IS NOT NULL",
"vulnerability" : "vulnerability.categories NOT IN ['microsoft patch']"}
headers = _headers()
print(headers)
url1 = f"https://us.api.insight.rapid7.com/vm/v4/integration/assets"
resp = requests.post(url=url1, headers=headers, json=third_party_patching_filer, verify=False).json()
jsonData = resp
#print(jsonData)
has_next_cursor = False
nextKey = ""
if "cursor" in jsonData["metadata"]:
has_next_cursor = True
nextKey = jsonData["metadata"]["cursor"]
while has_next_cursor:
url2 = f"https://us.api.insight.rapid7.com/vm/v4/integration/assets?&size=500&cursor={nextKey}"
resp2 = requests.post(url=url2, headers=headers, json=third_party_patching_filer, verify=False).json()
cursor = resp2["metadata"]
print(cursor)
if "cursor" in cursor:
nextKey = cursor["cursor"]
print(f"next key {nextKey}")
#print(desktop_support)
for data in resp2["data"]:
for tags in data['tags']:
total_critical_vul_osswin = []
total_severe_vul_osswin = []
total_modoer_vuln_osswin = []
if tags["name"] == 'OSSWIN':
print("OSSWIN")
critical_vuln_osswin = data['critical_vulnerabilities']
severe_vuln_osswin = data['severe_vulnerabilities']
modoer_vuln_osswin = data['moderate_vulnerabilities']
total_critical_vul_osswin.append(critical_vuln_osswin)
total_severe_vul_osswin.append(severe_vuln_osswin)
total_modoer_vuln_osswin.append(modoer_vuln_osswin)
print(sum(total_critical_vul_osswin))
print(sum(total_severe_vul_osswin))
print(sum(total_modoer_vuln_osswin))
if tags["name"] == 'DESKTOP_SUPPORT':
print("Desktop")
total_critical_vul_desktop = []
total_severe_vul_desktop = []
total_modorate_vuln_desktop = []
critical_vuln_desktop = data['critical_vulnerabilities']
severe_vuln_desktop = data['severe_vulnerabilities']
moderate_vuln_desktop = data['moderate_vulnerabilities']
total_critical_vul_desktop.append(critical_vuln_desktop)
total_severe_vul_desktop.append(severe_vuln_desktop)
total_modorate_vuln_desktop.append(moderate_vuln_desktop)
print(sum(total_critical_vul_desktop))
print(sum(total_severe_vul_desktop))
print(sum(total_modorate_vuln_desktop))
else:
pass
else:
has_next_cursor = False
If you have a lot of parameters to pass, consider using a dict to combine them. Then you can just return the dict and pass it along to the next function that needs that data. Another approach would be to create a class and either access the variables directly or have helper functions that do so. The latter is a cleaner solution vs a dict, since with a dict you have to quote every variable name, and with a class you can easily add additional functionally beyond just being a container for a bunch of instance variables.
If you want the total across all the data, you should put these initializations:
total_critical_vul_osswin = []
total_severe_vul_osswin = []
total_modoer_vuln_osswin = []
before the while has_next_cursor loop (and similarly for the desktop totals). The way your code is currently, they are initialized each cursor (ie, each 500 samples based on the URL).
I am teaching myself how to use python and django to access the google places api to make nearby searches for different types of gyms.
I was only taught how to use python and django with databases you build locally.
I wrote out a full Get request for they four different searches I am doing. I looked up examples but none seem to work for me.
allgyms = requests.get('https://maps.googleapis.com/maps/api/place/nearbysearch/json?location=38.9208,-77.036&radius=2500&type=gym&key=AIzaSyDOwVK7bGap6b5Mpct1cjKMp7swFGi3uGg')
all_text = allgyms.text
alljson = json.loads(all_text)
healthclubs = requests.get('https://maps.googleapis.com/maps/api/place/nearbysearch/json?location=38.9208,-77.036&radius=2500&type=gym&keyword=healthclub&key=AIzaSyDOwVK7bGap6b5Mpct1cjKMp7swFGi3uGg')
health_text = healthclubs.text
healthjson = json.loads(health_text)
crossfit = requests.get('https://maps.googleapis.com/maps/api/place/nearbysearch/json?location=38.9208,-77.036&radius=2500&type=gym&keyword=crossfit&key=AIzaSyDOwVK7bGap6b5Mpct1cjKMp7swFGi3uGg')
cross_text = crossfit.text
crossjson = json.loads(cross_text)
I really would like to be pointed in the right direction on how to have the api key referenced only one time while changing the keywords.
Try this for better readability and better reusability
BASE_URL = 'https://maps.googleapis.com/maps/api/place/nearbysearch/json?'
LOCATION = '38.9208,-77.036'
RADIUS = '2500'
TYPE = 'gym'
API_KEY = 'AIzaSyDOwVK7bGap6b5Mpct1cjKMp7swFGi3uGg'
KEYWORDS = ''
allgyms = requests.get(BASE_URL+'location='+LOCATION+'&radius='+RADIUS+'&type='+TYPE+'&key='+API_KEY) all_text = allgyms.text
alljson = json.loads(all_text)
KEYWORDS = 'healthclub'
healthclubs = requests.get(BASE_URL+'location='+LOCATION+'&radius='+RADIUS+'&type='+TYPE+'&keyword='+KEYWORDS+'&key='+API_KEY)
health_text = healthclubs.text
healthjson = json.loads(health_text)
KEYWORDS = 'crossfit'
crossfit = requests.get(BASE_URL+'location='+LOCATION+'&radius='+RADIUS+'&type='+TYPE+'&keyword='+KEYWORDS+'&key='+API_KEY)
cross_text = crossfit.text
crossjson = json.loads(cross_text)
as V-R suggested in a comment you can go further and define function which makes things more reusable allowing you to use the that function in other places of your application
Function implementation
def makeRequest(location, radius, type, keywords):
BASE_URL = 'https://maps.googleapis.com/maps/api/place/nearbysearch/json?'
API_KEY = 'AIzaSyDOwVK7bGap6b5Mpct1cjKMp7swFGi3uGg'
result = requests.get(BASE_URL+'location='+location+'&radius='+radius+'&type='+type+'&keyword='+keywords+'&key='+API_KEY)
jsonResult = json.loads(result)
return jsonResult
Function invocation
json = makeRequest('38.9208,-77.036', '2500', 'gym', '')
Let me know if there is an issue
I am attempting to query all rows for a column called show_id. I would then like to compare each potential item to be added to the DB with the results. Now the simplest way I can think of doing that is by checking if each show is in the results. If so pass etc. However the results from the below snippet are returned as objects. So this check fails.
Is there a better way to create the query to achieve this?
shows_inDB = Show.query.filter(Show.show_id).all()
print(shows_inDB)
Results:
<app.models.user.Show object at 0x10c2c5fd0>,
<app.models.user.Show object at 0x10c2da080>,
<app.models.user.Show object at 0x10c2da0f0>
Code for the entire function:
def save_changes_show(show_details):
"""
Save the changes to the database
"""
try:
shows_inDB = Show.query.filter(Show.show_id).all()
print(shows_inDB)
for show in show_details:
#Check the show isnt already in the DB
if show['id'] in shows_inDB:
print(str(show['id']) + ' Already Present')
else:
#Add show to DB
tv_show = Show(
show_id = show['id'],
seriesName = str(show['seriesName']).encode(),
aliases = str(show['aliases']).encode(),
banner = str(show['banner']).encode(),
seriesId = str(show['seriesId']).encode(),
status = str(show['status']).encode(),
firstAired = str(show['firstAired']).encode(),
network = str(show['network']).encode(),
networkId = str(show['networkId']).encode(),
runtime = str(show['runtime']).encode(),
genre = str(show['genre']).encode(),
overview = str(show['overview']).encode(),
lastUpdated = str(show['lastUpdated']).encode(),
airsDayOfWeek = str(show['airsDayOfWeek']).encode(),
airsTime = str(show['airsTime']).encode(),
rating = str(show['rating']).encode(),
imdbId = str(show['imdbId']).encode(),
zap2itId = str(show['zap2itId']).encode(),
added = str(show['added']).encode(),
addedBy = str(show['addedBy']).encode(),
siteRating = str(show['siteRating']).encode(),
siteRatingCount = str(show['siteRatingCount']).encode(),
slug = str(show['slug']).encode()
)
db.session.add(tv_show)
db.session.commit()
except Exception:
print(traceback.print_exc())
I have decided to use the method above and extract the data I wanted into a list, comparing each show to the list.
show_compare = []
shows_inDB = Show.query.filter().all()
for item in shows_inDB:
show_compare.append(item.show_id)
for show in show_details:
#Check the show isnt already in the DB
if show['id'] in show_compare:
print(str(show['id']) + ' Already Present')
else:
#Add show to DB
For querying a specific column value, have a look at this question: Flask SQLAlchemy query, specify column names. This is the example code given in the top answer there:
result = SomeModel.query.with_entities(SomeModel.col1, SomeModel.col2)
The crux of your problem is that you want to create a new Show instance if that show doesn't already exist in the database.
Querying the database for all shows and looping through the result for each potential new show might become very inefficient if you end up with a lot of shows in the database, and finding an object by identity is what an RDBMS does best!
This function will check to see if an object exists, and create it if not. Inspired by this answer:
def add_if_not_exists(model, **kwargs):
if not model.query.filter_by(**kwargs).first():
instance = model(**kwargs)
db.session.add(instance)
So your example would look like:
def add_if_not_exists(model, **kwargs):
if not model.query.filter_by(**kwargs).first():
instance = model(**kwargs)
db.session.add(instance)
for show in show_details:
add_if_not_exists(Show, id=show['id'])
If you really want to query all shows upfront, instead of putting all of the id's into a list, you could use a set instead of a list which will speed up your inclusion test.
E.g:
show_compare = {item.show_id for item in Show.query.all()}
for show in show_details:
# ... same as your code
I'm trying to pull all tracks in a certain playlist using the Spotipy library for python.
The user_playlist_tracks function is limited to 100 tracks, regardless of the parameter limit. The Spotipy documentation describes it as:
user_playlist_tracks(user, playlist_id=None, fields=None, limit=100,
offset=0, market=None)
Get full details of the tracks of a playlist
owned by a user.
Parameters:
user
the id of the user playlist_id
the id of the playlist fields
which fields to return limit
the maximum number of tracks to return offset
the index of the first track to return market
an ISO 3166-1 alpha-2 country code.
After authenticating with Spotify, I'm currently using something like this:
username = xxxx
playlist = #fromspotipy
sp_playlist = sp.user_playlist_tracks(username, playlist_id=playlist)
tracks = sp_playlist['items']
print tracks
Is there a way to return more than 100 tracks? I've tried setting the limit=None in the function parameters, but it returns an error.
Many of the spotipy methods return paginated results, so you will have to scroll through them to view more than just the max limit. I've encountered this most often when collecting a playlist's full track listing and consequently created a custom method to handle this:
def get_playlist_tracks(username,playlist_id):
results = sp.user_playlist_tracks(username,playlist_id)
tracks = results['items']
while results['next']:
results = sp.next(results)
tracks.extend(results['items'])
return tracks
I wrote a function that can output Panda's DataFrame where it pulls all the metadata (not all of it because I didn't want to, but you can make some space for that) for playlists over 100 songs. I do it by iterating over every song, finding the metadata for each, saving the metadata to a dictionary, and then concatenating the dictionary to the DataFrame. It takes your username and the Playlist ID as input.
# Function to extract MetaData from a playlist thats longer than 100 songs
def get_playlist_tracks_more_than_100_songs(username, playlist_id):
results = sp.user_playlist_tracks(username,playlist_id)
tracks = results['items']
while results['next']:
results = sp.next(results)
tracks.extend(results['items'])
results = tracks
playlist_tracks_id = []
playlist_tracks_titles = []
playlist_tracks_artists = []
playlist_tracks_first_artists = []
playlist_tracks_first_release_date = []
playlist_tracks_popularity = []
for i in range(len(results)):
print(i) # Counter
if i == 0:
playlist_tracks_id = results[i]['track']['id']
playlist_tracks_titles = results[i]['track']['name']
playlist_tracks_first_release_date = results[i]['track']['album']['release_date']
playlist_tracks_popularity = results[i]['track']['popularity']
artist_list = []
for artist in results[i]['track']['artists']:
artist_list= artist['name']
playlist_tracks_artists = artist_list
features = sp.audio_features(playlist_tracks_id)
features_df = pd.DataFrame(data=features, columns=features[0].keys())
features_df['title'] = playlist_tracks_titles
features_df['all_artists'] = playlist_tracks_artists
features_df['popularity'] = playlist_tracks_popularity
features_df['release_date'] = playlist_tracks_first_release_date
features_df = features_df[['id', 'title', 'all_artists', 'popularity', 'release_date',
'danceability', 'energy', 'key', 'loudness',
'mode', 'acousticness', 'instrumentalness',
'liveness', 'valence', 'tempo',
'duration_ms', 'time_signature']]
continue
else:
try:
playlist_tracks_id = results[i]['track']['id']
playlist_tracks_titles = results[i]['track']['name']
playlist_tracks_first_release_date = results[i]['track']['album']['release_date']
playlist_tracks_popularity = results[i]['track']['popularity']
artist_list = []
for artist in results[i]['track']['artists']:
artist_list= artist['name']
playlist_tracks_artists = artist_list
features = sp.audio_features(playlist_tracks_id)
new_row = {'id':[playlist_tracks_id],
'title':[playlist_tracks_titles],
'all_artists':[playlist_tracks_artists],
'popularity':[playlist_tracks_popularity],
'release_date':[playlist_tracks_first_release_date],
'danceability':[features[0]['danceability']],
'energy':[features[0]['energy']],
'key':[features[0]['key']],
'loudness':[features[0]['loudness']],
'mode':[features[0]['mode']],
'acousticness':[features[0]['acousticness']],
'instrumentalness':[features[0]['instrumentalness']],
'liveness':[features[0]['liveness']],
'valence':[features[0]['valence']],
'tempo':[features[0]['tempo']],
'duration_ms':[features[0]['duration_ms']],
'time_signature':[features[0]['time_signature']]
}
dfs = [features_df, pd.DataFrame(new_row)]
features_df = pd.concat(dfs, ignore_index = True)
except:
continue
return features_df
Another way around it would be to write a for loop and do:
offset +=100
then you could concatenate the tracks at the end, or put them in a data frame.
Function Ref:
playlist_tracks(playlist_id, fields=None, limit=100, offset=0, market=None)
Reference: https://spotipy.readthedocs.io/en/2.7.0/#spotipy.client.Spotify.playlist_tracks
Below is the user_playlist_tracks module used in spotipy. (notice it defaults to 100 limit).
Try setting the limit to 200.
def user_playlist_tracks(self, user, playlist_id = None, fields=None,
limit=100, offset=0):
''' Get full details of the tracks of a playlist owned by a user.
Parameters:
- user - the id of the user
- playlist_id - the id of the playlist
- fields - which fields to return
- limit - the maximum number of tracks to return
- offset - the index of the first track to return
'''
plid = self._get_id('playlist', playlist_id)
return self._get("users/%s/playlists/%s/tracks" % (user, plid),
limit=limit, offset=offset, fields=fields)
When trying the above solutions I got key error messages. I eventually figured it out. Here is my solution. This is only for displaying tracks/artists on the next pages.
id = "5lrkIjzukk65X4ksulpA0H?si=9db60a70278a4fd6"
results = sp.playlist_items(id)
tracks = results['tracks']
next_pages = 14
track_list = []
for i in range(next_pages):
tracks = sp.next(tracks)
for y in range(0,100):
try:
track = tracks['items'][y]['track']['name']
artist = tracks['items'][y]['track']['artists'][0]['name']
track_list.append(artist)
except:
continue
print(track_list)
It's unfortunate SpotiPy makes their API Access to complicated. Try using SpotifyR in r, and you can accomplish this in a few lines of code. No loops, lists, extra variables, or appending required. Then just pop it back into python if you'd like.
library(spotifyr)
df <- get_playlist_audio_features('playlist_owner_username', 'playlist_uri')
And boom, you're done. I'm not sure what the max is, but I know it's over 300 songs because I have pulled that in.
trying to write data to my local datastore like:
drivingJson = json.loads(drivingdata)
for data in drivingJson:
keys = getKey()
index = 1
dataList = list()
for nodeData in data:
self.response.write(keys)
self.response.write("<br>")
lat = nodeData['lat']
lng = nodeData['long']
color = nodeData['color']
timestamp = datetime.datetime.strptime(nodeData['timestamp'], "%Y-%m-%d %H:%M:%S")
saveDrivingData = DrivingObject(
index = index,
lat = float(lat),
lng = float(lng),
timestamp = timestamp,
sessionKey = str(keys),
color = int(color)
)
dataList.append(saveDrivingData)
index +=1
ndb.put_multi_async(dataList)
this doesn't populate the datastore with any detail. But when i use
ndb.put_multi(dataList)
the datatstore populates well. How do I handle the asynchronous call. Thanks
put_multi_async returns a list of Future objects.
You need to call wait_any to make sure the put's complete before you return from the request.
Have a read about async all work has to complete before you return.
https://cloud.google.com/appengine/docs/python/ndb/async#using
All through the document it talks about waiting.