I'm trying to create an API that gets all the information for a single node from a Chef server.
def get_nodeInfo(self, name):
The above is the method's head, so I pass the node name there. I have tried a lot of different methods found on the internet, but I keep getting a "ChefServerNotFound: object not found" error. Does anyone have any advice for me on this?
result_set = chef.Search('node', q="name:test*")
for result in result_set:
    node = chef.Node(result["name"])
    print node
I used the above code. Thank you in advance!
First off, please don't comment on old non-bugs to try and get attention. It pisses off maintainers and is not productive. Second, you don't need to use search at all to get a single node's data; just do Node(name) and it will load.
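For example, with PyChef you can load the node directly; a minimal sketch, assuming your knife.rb/client.rb configuration is discoverable and using a placeholder node name:
from chef import autoconfigure, Node

api = autoconfigure()      # picks up your knife.rb / client.rb settings
node = Node('test-node')   # placeholder name; loads the node directly
print(node['fqdn'])        # attributes are accessed like a mapping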
I am learning Python 3 and I have a fairly simple task to complete, but I am struggling with how to glue it all together. I need to query an API and return the full list of applications, which I can do; I store this and need to use it again to gather more data for each application from a different API call.
applistfull = requests.get(url, authmethod)
if applistfull.ok:
    data = applistfull.json()
    for app in data["_embedded"]["applications"]:
        print(app["profile"]["name"], app["guid"])
        summaryguid = app["guid"]
else:
    print(applistfull.status_code)
I think I now have summaryguid, and I need to query a different API and return a value that could exist many times for each application; in this case, the compiler used to build the code.
I can statically put a GUID in the URL and return the correct information, but I haven't yet figured out how to get it to do the below for all of the above and build a master list:
summary = requests.get(f"url{summaryguid}moreurl", authmethod)
if summary.ok:
    fulldata = summary.json()
    for appsummary in fulldata["static-analysis"]["modules"]["module"]:
        print(appsummary["compiler"])
I would prefer that no one just type out the right answer yet; please drop a few hints and let me continue to work through it logically, so I learn how to deal with what I assume is a common issue in the future. My thought right now is that I need to move my second if up into my initial block and continue the logic in that space, but I am stuck there.
You are on the right track! Here is the hint: the second API request can be nested inside the loop that iterates through the list of applications from the first API call. That way, you make the second call once for each application and get the information you need.
import requests

applistfull = requests.get("url", authmethod)
if applistfull.ok:
    data = applistfull.json()
    for app in data["_embedded"]["applications"]:
        print(app["profile"]["name"], app["guid"])
        summaryguid = app["guid"]
        # nested call: fetch the summary for this application's GUID
        summary = requests.get(f"url/{summaryguid}/moreurl", authmethod)
        fulldata = summary.json()
        for appsummary in fulldata["static-analysis"]["modules"]["module"]:
            print(app["profile"]["name"], appsummary["compiler"])
else:
    print(applistfull.status_code)
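If you later want to build the master list mentioned in the question instead of just printing, one option is a dict keyed by GUID. A minimal sketch, reusing the same placeholder URL and auth as above:
compilers_by_app = {}  # guid -> list of compilers found for that application
for app in data["_embedded"]["applications"]:
    guid = app["guid"]
    summary = requests.get(f"url/{guid}/moreurl", authmethod)
    if summary.ok:
        modules = summary.json()["static-analysis"]["modules"]["module"]
        compilers_by_app[guid] = [m["compiler"] for m in modules]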
I am an extreme beginner at writing Python scripts, as I am currently learning the language.
I am writing a script to extract the branches I have that are named something like tobedeleted_branch1, tobedeleted_branch2, etc., with the help of the python-gitlab module.
With a lot of research, I was able to extract the names of the branches with the script given below, but now what I want is to delete the branches that are getting printed.
My plan was to store the printed output in a variable and delete them all in one go, but I am still not able to store them in a variable.
Once I store the 'n' number of branches in that variable, I want to delete them.
I went through the documentation, but I couldn't figure out how to make use of it in a Python script.
Module: https://python-gitlab.readthedocs.io/en/stable/index.html
Delete branch with help of module REF: https://python-gitlab.readthedocs.io/en/stable/gl_objects/branches.html#branches
Any help regarding this is highly appreciated.
import gitlab

TOKEN = "MYTOKEN"
GITLAB_HOST = 'MYINSTANCE'

gl = gitlab.Gitlab(GITLAB_HOST, private_token=TOKEN)

# set gitlab group id
group_id = 6
group = gl.groups.get(group_id, lazy=True)

# get all projects
projects = group.projects.list(include_subgroups=True, all=True)

# get all project ids
project_ids = []
for project in projects:
    project_ids.append(project.id)
print(project_ids)

for project_id in project_ids:
    project = gl.projects.get(project_id)
    branches = project.branches.list()
    for branch in branches:
        if "tobedeleted" in branch.attributes['name']:
            print(branch.attributes['name'])
Also, I am very sure this is not the cleanest way to write the script. Can you please drop your suggestions on how to make it better?
Thanks!
Branch objects have a delete method.
for branch in project.branches.list(as_list=False):
    if 'tobedeleted' in branch.name:
        branch.delete()
You can also delete a branch by name if you know its exact name already:
project.branches.delete('exact-branch-name')
As a side note:
The other thing you'll notice I've done is add the as_list=False argument to .list(). This makes sure that you paginate through all branches; otherwise, you'll only get the first page (20 per page by default). The same is true for most list methods.
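Putting that together with your original script, here is a minimal sketch (same placeholder token and host as above; the print is left in as a dry-run aid, since deletion is destructive):
import gitlab

TOKEN = "MYTOKEN"
GITLAB_HOST = 'MYINSTANCE'

gl = gitlab.Gitlab(GITLAB_HOST, private_token=TOKEN)
group = gl.groups.get(6, lazy=True)

for group_project in group.projects.list(include_subgroups=True, all=True):
    project = gl.projects.get(group_project.id)
    # as_list=False pages through every branch, not just the first 20
    for branch in project.branches.list(as_list=False):
        if "tobedeleted" in branch.name:
            print(f"Deleting {project.path_with_namespace}: {branch.name}")
            branch.delete()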
I was trying to use xgoogle, but it has not been updated for 3 years, and I just keep getting no more than 5 results even if I set 100 results per page. If anyone uses xgoogle without any problem, please let me know.
Now, since the only (apparently) available wrapper is xgoogle, another option is to use some sort of browser automation, like mechanize, but that is going to make the code entirely dependent on Google's HTML, and they might change it a lot.
The final option is to use the Custom Search API that Google offers, but it has a ridiculous 100-requests-per-day limit and pricing after that.
I need help deciding which direction I should go: what other options do you know of, and what works for you?
Thanks!
It only needs a minor patch.
The function GoogleSearch._extract_result (Line 237 of search.py) calls GoogleSearch._extract_description (Line 258), which fails, causing _extract_result to return None for most of the results and therefore showing fewer results than expected.
Fix:
In search.py, change Line 259 from this:
desc_div = result.find('div', {'class': re.compile(r'\bs\b')})
to this:
desc_div = result.find('span', {'class': 'st'})
I tested using:
#!/usr/bin/python
#
# This program does a Google search for "quick and dirty" and returns
# 200 results.
#
from xgoogle.search import GoogleSearch, SearchError

class give_me(object):
    def __init__(self, query, target):
        self.gs = GoogleSearch(query)
        self.gs.results_per_page = 50
        self.current = 0
        self.target = target
        self.buf_list = []

    def __iter__(self):
        return self

    def next(self):
        if self.current >= self.target:
            raise StopIteration
        if not self.buf_list:
            self.buf_list = self.gs.get_results()
        self.current += 1
        return self.buf_list.pop(0)

try:
    sites = {}
    for res in give_me("quick and dirty", 200):
        t_dict = {
            "title": res.title.encode('utf8'),
            "desc": res.desc.encode('utf8'),
            "url": res.url.encode('utf8'),
        }
        sites[t_dict["url"]] = t_dict
        print t_dict
except SearchError, e:
    print "Search failed: %s" % e
I think you misunderstand what xgoogle is. xgoogle is not a wrapper; it's a library that fakes being a human user with a browser, and scrapes the results. It's heavily dependent on the format of Google's search queries and results pages as of 2009, so it's no surprise that it doesn't work the same in 2013. See the announcement blog post for more details.
You can, of course, hack up the xgoogle source and try to make it work with Google's current format (as it turns out, they've only broken xgoogle by accident, and not very badly…), but it's just going to break again.
Meanwhile, you're trying to get around Google's Terms of Service:
Don’t misuse our Services. For example, don’t interfere with our Services or try to access them using a method other than the interface and the instructions that we provide.
They've been specifically asked about exactly what you're trying to do, and their answer is:
Google's Terms of Service do not allow the sending of automated queries of any sort to our system without express permission in advance from Google.
And you even say that's explicitly what you want to do:
The final option is to use the Custom Search API that Google offers, but it has a ridiculous 100-requests-per-day limit and pricing after that.
So, you're looking for a way to access Google search using a method other than the interface they provide, in a deliberate attempt to get around their free usage quota without paying. They are completely within their rights to do anything they want to break your code, and if they get enough hits from people doing this kind of thing, they will do so.
(Note that when a program is scraping the results, nobody's seeing the ads, and the ads are what pay for the whole thing.)
Of course, nobody's forcing you to use Google. EntireWeb has a free "unlimited" (as in "as long as you don't use too much, and we haven't specified the limit") search API. Bing gives you a higher quota, amortized by month instead of by day. Yahoo BOSS is flexible and super-cheap (and even offers a "discount server" that provides lower-quality results if that's not cheap enough), although I believe you're forced to type the ridiculous exclamation point. If none of them are good enough for you, then pay for Google.
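For what it's worth, if you do end up paying for Google, the Custom Search JSON API is just a plain HTTP GET; here is a minimal sketch, assuming you have created an API key and a search engine ID (cx) in the API console:
import requests

API_KEY = "YOUR_API_KEY"  # assumption: an API key from the Google API console
CX = "YOUR_ENGINE_ID"     # assumption: your custom search engine ID

resp = requests.get(
    "https://www.googleapis.com/customsearch/v1",
    params={"key": API_KEY, "cx": CX, "q": "quick and dirty"},
)
for item in resp.json().get("items", []):
    print(item["title"], item["link"])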
I'm having some trouble getting Elasticsearch integrated with an existing application, but it should be a fairly straightforward issue. I'm able to create and destroy indices, but for some reason I'm having trouble getting data into Elasticsearch and querying for it.
I'm using the pyes library and honestly finding the documentation to be less than helpful on this front. This is my current code:
def initialize_transcripts(database, mapping):
    database.indices.create_index("transcript-index")

def index_course(database, sjson_directory, course_name, mapping):
    database.put_mapping(course_name, {'properties': mapping}, "transcript-index")
    all_transcripts = grab_transcripts(sjson_directory)
    video_counter = 0
    for transcript_tuple in all_transcripts:
        data_map = {"searchable_text": transcript_tuple[0], "uuid": transcript_tuple[1]}
        database.index(data_map, "transcript-index", course_name, video_counter)
        video_counter += 1
    database.indices.refresh("transcript-index")

def search_course(database, query, course_name):
    search_query = TermQuery("searchable_text", query)
    return database.search(query=search_query)
I'm first creating the database and initializing the index, then trying to add data and search it with the latter two methods. I'm currently getting the following error:
raise ElasticSearchException(response.body, response.status, response.body)
pyes.exceptions.ElasticSearchException: No handler found for uri [/transcript-index/test-course] and method [PUT]
I'm not quite sure how to approach it, and the only reference I could find to this error suggested creating your index beforehand, which I believe I am already doing. Has anyone run into this error before? Alternatively, do you know of any good places to look that I might not be aware of?
Any help is appreciated.
For some reason, adding the ID to the index call, despite the fact that it is shown in the getting-started documentation (http://pyes.readthedocs.org/en/latest/manual/usage.html), doesn't work and in fact causes this error.
Once I removed the video_counter argument to index, this worked perfectly.
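For reference, the working indexing loop just drops the explicit ID and lets Elasticsearch assign one; a sketch of only the changed lines (grab_transcripts is the same helper as above):
for transcript_tuple in all_transcripts:
    data_map = {"searchable_text": transcript_tuple[0], "uuid": transcript_tuple[1]}
    # no explicit ID argument: Elasticsearch auto-generates one
    database.index(data_map, "transcript-index", course_name)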
Using the GData Calendar API via App Engine in Python, when you create an event there are handy little helper methods to parse the response:
new_event = calendar_service.InsertEvent(event, '/calendar/feeds/default/private/full')
helper = new_event.GetEditLink().href
When you create a new calendar:
new_calendar = gd_client.InsertCalendar(new_calendar=calendar)
I was wondering if there might be related methods that I just can't find in the documentation (or that are perhaps undocumented)?
I need to store the new calendar's ID in the datastore, so I would like something along the lines of:
new_calendar = gd_client.InsertCalendar(new_calendar=calendar)
new_calendar.getGroupLink().href
In my code, the calendar is being created, and Google is returning the Atom response with a 201, but before I resort to using ElementTree or atom.parse to extract the desired element, I was hoping someone here might be able to help.
Many thanks in advance :)
I've never used the GData API, so I could be wrong, but...
It looks like GetLink() will return the link object for any specified rel. It seems like GetEditLink() just calls GetLink(), passing in the rel of the edit link. So you should be able to call GetLink() on the response from InsertCalendar() and pass in the rel of the group link.
Here's the pydoc info that I used to figure this out: http://gdata-python-client.googlecode.com/svn/trunk/pydocs/gdata.calendar_resource.data.html
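A rough sketch of what that might look like; the rel string here is a placeholder, since you would need to inspect the returned Atom entry's <link> elements to find the exact value for the group link:
new_calendar = gd_client.InsertCalendar(new_calendar=calendar)
# 'REL_OF_GROUP_LINK' is a placeholder rel; check the entry's <link> elements
link = new_calendar.GetLink('REL_OF_GROUP_LINK')
if link is not None:
    calendar_id = link.href  # store this in the datastore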