I am trying to set ReadTheDocs.com (i.e. the commercial end of Read the Docs) versions' active state programmatically.
The idea is that when a branch is created, documentation is built for it, and when the branch is finished we delete the version's documentation (or at least stop building it).
The latter is, obviously, only cleanup and not that important (a want, not a need), but we'd strongly like to avoid having to use the project management interface to set each branch/version to active.
I've been trying to use the v2 REST API provided by RTD. I can extract version data from "GET https://readthedocs.com/api/v2/version/" and find the version I want to mess with, but I am unable to either send data back, or find something that lets me set Version.active=True for a given version id in their API.
I'm not hugely up on how to play with these APIs so any help would be much appreciated.
I am using Python and the requests library.
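For reference, this is roughly how I'm pulling the version list at the moment (the "Authorization: Token ..." header format and the shape of the JSON are assumptions on my part from what I've seen the endpoint return):

import requests

# Rough sketch of how I'm listing versions; header format and response
# fields are assumptions, not confirmed from the docs.
API_ROOT = "https://readthedocs.com/api/v2"
TOKEN = "my-rtd-api-token"

resp = requests.get(API_ROOT + "/version/",
                    headers={"Authorization": "Token " + TOKEN})
resp.raise_for_status()

for version in resp.json().get("results", []):
    print(version.get("slug"), version.get("active"))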
I searched for a solution to this because I had the same problem when automating the documentation build process in connection with my Git server.
In the end I found two different ways to change the project versions and set them to active from a script. Both scripts emulate the HTTP requests that the Read the Docs web interface sends to the server. I have a locally running instance over HTTP (without HTTPS) and it works there, but I don't know whether it works over HTTPS too.
It may be necessary to capture the packets with Wireshark and adapt the script.
First script (using Python):
import requests

def set_version_active_on_rtd():
    server_addr = "http://192.168.1.100:8000"
    project_slug = "myProject"
    rtd_user = 'mylogin'
    rtd_password = 'mypassword'

    with requests.session() as s:
        url = server_addr + "/accounts/login/"
        # fetch the login page
        s.get(url)
        if 'csrftoken' in s.cookies:
            # Django 1.6 and up
            csrftoken = s.cookies['csrftoken']
        else:
            # older versions
            csrftoken = s.cookies['csrf']
        login_data = dict(login=rtd_user, password=rtd_password,
                          csrfmiddlewaretoken=csrftoken, next='/')
        r = s.post(url, data=login_data, headers=dict(Referer=url))

        url = server_addr + "/dashboard/" + project_slug + "/versions/"
        if 'csrftoken' in s.cookies:
            # Django 1.6 and up
            csrftoken = s.cookies['csrftoken']
        else:
            # older versions
            csrftoken = s.cookies['csrf']
        '''
        The settings saved in version_data are the ones normally configured
        through the web interface.
        To set a version active, it must be configured with
            'version-<version_number>': 'on'
        and its privacy must be set with
            'privacy-<version_number>': 'public'
        To disable a version, only its privacy entry is supplied and the
        corresponding 'on' entry is left out.
        '''
        version_data = {'default-version': 'latest',
                        'version-latest': 'on',
                        'privacy-latest': 'public',
                        'privacy-master': 'public',
                        'csrfmiddlewaretoken': csrftoken}
        r = s.post(url, data=version_data, headers=dict(Referer=url))
Second script (Bash and cURL):
#!/bin/bash
RTD_SERVER='http://192.168.1.100:8000'
RTD_LOGIN='mylogin'
RTD_PASSWORD='mypassword'
RTD_SLUG='myProject'

# fetch the login page and save the first cookie
curl -c cookie1.txt "$RTD_SERVER"/accounts/login/ > /dev/null

# extract the CSRF token from the first cookie
TOKEN1=$(tail -n1 cookie1.txt | awk 'NF>1{print $NF}')

# log in, sending the first cookie, and save the second cookie
curl -b cookie1.txt -c cookie2.txt -X POST \
     -d "csrfmiddlewaretoken=$TOKEN1&login=$RTD_LOGIN&password=$RTD_PASSWORD&next=/dashboard/$RTD_SLUG/versions/" \
     "$RTD_SERVER"/accounts/login/ > /dev/null

# extract the CSRF token from the second cookie
TOKEN2=$(tail -n3 cookie2.txt | awk 'NF>1{print $NF}' | head -n1)

# send the version settings to the RTD server using the second cookie
curl -b cookie2.txt -X POST \
     -d "csrfmiddlewaretoken=$TOKEN2&default-version=latest&version-master=on&privacy-master=public&version-latest=on&privacy-latest=public" \
     "$RTD_SERVER"/dashboard/"$RTD_SLUG"/versions/ > /dev/null

# delete the cookies
rm cookie1.txt cookie2.txt
To set a default version, it may be necessary to run the script twice if the version was not yet active: the first run activates the version, and the second run sets it as the default version.
Hope it helps.
I'd appreciate your help here, thanks in advance.
My Problem:
I am using Python's requests module for GET/POST requests to a Django REST API behind a work proxy. I am unable to get past the proxy and encounter an error, summarised below.
Using the following code (what I've tried):
import os
import requests

s = requests.Session()
s.headers = {
    "User-Agent": [someGenericUserAgent]
}
s.trust_env = False

proxies = {
    'http': 'http://[domain]\[userName]:[password]@[proxy]:8080',
    'https': 'https://[domain]\[userName]:[password]@[proxy]:8080'
}

os.environ['NO_PROXY'] = [APIaddress]
os.environ['no_proxy'] = [APIaddress]

r = s.post(url=[APIaddress], proxies=proxies)
With this I get an error:
... OSError('Tunnel connection failed: 407 Proxy Authentication Required')))
Additional Context:
This is on a Windows 10 machine.
Work uses an "automatic proxy setup" script (a .pac file). Looking at the script, there are a number of proxies that are assigned automatically depending on the IP address of the machine. I have tried all of these proxies under [proxy] above, with the same error.
The above works when I am not on the work network and don't use the additional proxy settings (i.e. removing proxies=proxies on my home network).
I have no issues making a GET request to the Django REST API view via my browser through the proxy.
Things I am uncertain about:
I don't know if I am using the right [proxy]. Is there a way to verify this? I have tried the IP addresses reported by [findMyProxy].com sites, and it still doesn't work.
I don't know if I am using [domain]\[userName] correctly. Is a \ correct? My work does use a domain.
I'm fairly certain it is not a requests issue, as pip install --proxy http://[domain]\[userName]:[password]@[proxy]:8080 someModule produces the same 407 error.
Any help appreciated.
How I came to the solution:
I used curl to get a 200 response; after a lot of trial and error, the command that succeeded was:
$ curl -U DOMAIN\USER:PW -v -x http://LOCATION_OF_PAC_FILE --proxy-ntlm www.google.com
Where -U supplies the domain, user name and password.
-v is verbose, which made debugging easier.
-x is the proxy; in my case, the location of the .pac file. curl automatically determines the proxy IP from the PAC file. Requests does not do this by default (that I know of).
I used curl to determine that my proxy was using NTLM.
www.google.com is just an external site for testing the proxy auth.
NOTE: only one \ between domain and username.
Making requests use NTLM turned out not to be possible by default, so I used requests-ntlm2 instead.
Passing the PAC file through requests-ntlm2 did not work, so I used pypac to auto-discover the PAC file and then determine the proxy based on the URL.
The working code is as follows:
from pypac import PACSession
from requests_ntlm2 import (
    HttpNtlmAuth,
    HttpNtlmAdapter,
    NtlmCompatibility
)

username = 'DOMAIN\\USERNAME'
password = 'PW'

# Don't need the following thanks to pypac
# proxy_ip = 'PROXY_IP'
# proxy_port = "PORT"
# proxies = {
#     'http': 'http://{}:{}'.format(proxy_ip, proxy_port),
#     'https': 'http://{}:{}'.format(proxy_ip, proxy_port)
# }

ntlm_compatibility = NtlmCompatibility.NTLMv2_DEFAULT

# session = requests.Session() <- replaced with PACSession()
session = PACSession()
session.mount(
    'https://',
    HttpNtlmAdapter(
        username,
        password,
        ntlm_compatibility=ntlm_compatibility
    )
)
session.mount(
    'http://',
    HttpNtlmAdapter(
        username,
        password,
        ntlm_compatibility=ntlm_compatibility
    )
)
session.auth = HttpNtlmAuth(
    username,
    password,
    ntlm_compatibility=ntlm_compatibility
)

# Don't need the following thanks to pypac
# session.proxies = proxies
response = session.get('http://www.google.com')
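For the original question, the final step would then just be the POST against the API with no manual proxies dict; a sketch, where [APIaddress] is the placeholder from the question:

# PACSession resolves the proxy from the PAC file and the NTLM adapter/auth
# answer the 407 challenge, so no proxies= argument is needed here.
r = session.post([APIaddress])
print(r.status_code)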
I'm making this curl request to get some data in Python. How can I get the session ID of the curl request so that I can reuse it again?
commands.getoutput("curl -H \"Content-Type:application/json\" -k -u username:password -X GET https://10.39.11.4/wapi/v2.7/member -s")
curl has a built-in cookie jar meant for storing just the cookies and sending them back to the server.
To store cookies in the cookie jar we use the -c flag and give it the name of a file we wish to store the cookies in.
$ curl -X POST -c cookies.txt -u "user1:password1" website.com/login
$ cat cookies.txt
# Netscape HTTP Cookie File
# http://curl.haxx.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.
#HttpOnly_quiet-waters-1228.herokuapp.com FALSE / FALSE 0 _curl_test_app_rails_session cm53d2RJN1VncV........
There you can find the session ID.
As mentioned by Daniel Stenberg (the founder of cURL):
use -b cookies.txt in the subsequent curl command line to make use of those cookies
You should check out the requests library; it has session management, described here: http://docs.python-requests.org/en/master/user/advanced/#session-objects
s = requests.Session()
s.auth = ('username', 'password')
s.headers.update({'Content-Type': 'application/json'})
r = s.get('https://10.39.11.4/wapi/v2.7/member')
If you want to save a session and then load it, you should use dict_from_cookiejar and cookiejar_from_dict like this:
# Save session
s_dict = requests.utils.dict_from_cookiejar(s.cookies)
# Load session
cookies = requests.utils.cookiejar_from_dict(s_dict)
s = requests.Session()
s.cookies = cookies
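If the goal is to reuse the session across separate runs of the script, one option is to dump that dict to disk; a sketch, with cookies.json as an arbitrary file name:

import json
import requests

s = requests.Session()
s.auth = ('username', 'password')
s.get('https://10.39.11.4/wapi/v2.7/member', verify=False)

# Save the cookies to a file
with open('cookies.json', 'w') as f:
    json.dump(requests.utils.dict_from_cookiejar(s.cookies), f)

# ...later, in another run, load them back into a fresh session
with open('cookies.json') as f:
    s2 = requests.Session()
    s2.cookies = requests.utils.cookiejar_from_dict(json.load(f))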
I need to export a massive number of events from Splunk. Hence, for performance reasons, I resorted to using the REST API directly in my Python code rather than the Splunk SDK itself.
I found the following curl command to export results (it is also available here):
curl -ku username:password \
  https://splunk_host:port/servicesNS/admin/search/search/jobs/export \
  -d search="search index%3D_internal | head 3" -d output_mode=json
My attempt at simulating this using Python's HTTP libraries is as follows:
import urllib
import httplib2

# assume I have authenticated to Splunk and have a session key
base_url = "http://splunkhost:port"
search_job_urn = '/services/search/jobs/export'

myhttp = httplib2.Http(disable_ssl_certificate_validation=True)
searchjob = myhttp.request(base_url + search_job_urn, 'POST',
                           headers={'Authorization': 'Splunk %s' % sessionKey},
                           body=urllib.urlencode({'search': 'search index=indexname sourcetype=sourcename'}))[1]
print searchjob
The last print keeps printing results until it is done. For large queries I get a MemoryError. I need to be able to read the results in chunks (say 50,000 events), write them to a file, and reset the buffer for searchjob. How can I accomplish that?
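For what it's worth, the kind of chunked handling I'm after looks roughly like this when sketched with the requests library (which can stream the body via stream=True) instead of httplib2, reusing the variables from the snippet above:

import requests

# Stream the export instead of buffering it all in memory; write it to a
# file in fixed-size chunks (variables as defined in the snippet above).
response = requests.post(base_url + search_job_urn,
                         headers={'Authorization': 'Splunk %s' % sessionKey},
                         data={'search': 'search index=indexname sourcetype=sourcename',
                               'output_mode': 'json'},
                         stream=True,
                         verify=False)

with open('results.json', 'wb') as out:
    for chunk in response.iter_content(chunk_size=1024 * 1024):
        if chunk:
            out.write(chunk)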
Despite looking through the API documentation, I couldn't find anything explaining why GitHub needs cookies enabled, or how to go about enabling them. I may have missed it, though.
I'd like to use the native webapp2 framework on GAE in Python with urllib2, and stay away from high-level libraries so that I can learn this from the inside out.
Snippet from my code:
import urllib
from google.appengine.api import urlfetch

# Get user name
fields = {
    "user": username,
    "access_token": access_token
}
url = 'https://github.com/users/'
data = urllib.urlencode(fields)
result = urlfetch.fetch(url=url,
                        payload=data,
                        method=urlfetch.POST)
username = result.content
result.content returns:
Cookies must be enabled to use GitHub.
I tried putting the following (ref) at the top of my file but it didn't work:
import cookielib
import urllib2

jar = cookielib.FileCookieJar("cookies")
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))
It seems to be related to the API endpoint. From the official docs: "All API access is over HTTPS, and accessed from the api.github.com domain (or through yourdomain.com/api/v3/ for enterprise). All data is sent and received as JSON."
You get an error about cookies because you're calling the GitHub website, which requires a bunch of things like cookies and JavaScript to work. That's why you need the specific API endpoint. The following code got me an HTTP 200; note that I'm using the requests library to make the HTTP call, but you can use whichever you like.
>>> import urllib
>>> import requests
>>> url = "https://api.github.com"
>>> fields = {"user": "Ketouem"}
>>> string_query = urllib.urlencode(fields)
>>> response = requests.get(url + '?' + string_query)
>>> print response.status_code
200
>>> print response.content
'{"current_user_url":"https://api.github.com/user","authorizations_url":"https://api.github.com/authorizations","code_search_url":"https://api.github.com/search/code?q={query}{&page,per_page,sort,order}","emails_url":"https://api.github.com/user/emails","emojis_url":"https://api.github.com/emojis","events_url":"https://api.github.com/events","feeds_url":"https://api.github.com/feeds","following_url":"https://api.github.com/user/following{/target}","gists_url":"https://api.github.com/gists{/gist_id}","hub_url":"https://api.github.com/hub","issue_search_url":"https://api.github.com/search/issues?q={query}{&page,per_page,sort,order}","issues_url":"https://api.github.com/issues","keys_url":"https://api.github.com/user/keys","notifications_url":"https://api.github.com/notifications","organization_repositories_url":"https://api.github.com/orgs/{org}/repos/{?type,page,per_page,sort}","organization_url":"https://api.github.com/orgs/{org}","public_gists_url":"https://api.github.com/gists/public","rate_limit_url":"https://api.github.com/rate_limit","repository_url":"https://api.github.com/repos/{owner}/{repo}","repository_search_url":"https://api.github.com/search/repositories?q={query}{&page,per_page,sort,order}","current_user_repositories_url":"https://api.github.com/user/repos{?type,page,per_page,sort}","starred_url":"https://api.github.com/user/starred{/owner}{/repo}","starred_gists_url":"https://api.github.com/gists/starred","team_url":"https://api.github.com/teams","user_url":"https://api.github.com/users/{user}","user_organizations_url":"https://api.github.com/user/orgs","user_repositories_url":"https://api.github.com/users/{user}/repos{?type,page,per_page,sort}","user_search_url":"https://api.github.com/search/users?q={query}{&page,per_page,sort,order}"}'
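As a side note, requests can build the query string for you, so the urllib.urlencode step isn't strictly necessary:

>>> response = requests.get(url, params=fields)
>>> response.status_code
200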
The following bash function worked fine for me:
githubDelRepo(){
    if [[ $# != 2 ]] ; then
        echo "Needs username and repo-name as args 1 and 2 respectively."
    else
        curl -X DELETE -u "${1}" https://api.github.com/repos/"${1}"/"${2}"
    fi
}
Put it in ~/.bashrc, then source ~/.bashrc and run it with githubDelRepo myusername myreponame.
I had the same problem. I got into my repos by starting from my organization page.
The Python Facebook SDK seems to be missing photo uploads, and in any case I want to upload photos without using the existing SDKs.
So how can I implement these steps in Python?
You can publish a photo to a specific, existing photo album with a POST to http://graph.facebook.com/ALBUM_ID/photos.
curl -F 'access_token=...' \
     -F 'source=@file.png' \
     -F 'message=Caption for the photo' \
     https://graph.facebook.com/me/photos
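Since the question asks for plain Python, a rough equivalent of that curl call using the requests library would look something like this (the token, caption, and file name are placeholders):

import requests

# Multipart POST to the Graph API photos edge, mirroring the curl example above.
access_token = '...'
with open('file.png', 'rb') as image:
    r = requests.post('https://graph.facebook.com/me/photos',
                      data={'access_token': access_token,
                            'message': 'Caption for the photo'},
                      files={'source': image})
print(r.status_code, r.text)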
This isn't a place to get work done cheaply but to learn about programming, so it's doubtful that someone will do your work for you unless they already have the code lying around.
Anyhow, a few pointers: get yourself Tamper Data or a similar plugin for whatever browser you're using and document all the data that is sent when you do the task manually. Afterwards you just have to use Python's urllib to imitate that and parse the HTML documents for the non-static parts you need.
I have no idea about Facebook (no account), but you'll presumably have to log in and keep cookies, so here is a small example of something pretty similar I had lying around. It first logs in via the login page, then navigates to the page I'm interested in and gets the data:
import http.cookiejar
import urllib.parse
import urllib.request

def get_site_content():
    def get_response(response):
        content = response.info()['Content-Type']
        charset = content[content.rfind('=') + 1:]  # ugly hack, check orderly
        return response.read().decode(charset)

    cj = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))

    # log in first so the session cookie ends up in the cookie jar
    data = urllib.parse.urlencode({"log": "1", "j_username": USERNAME, "j_password": PASSWORD}).encode()
    opener.open("url", data)

    # then request the page we are actually interested in
    data = urllib.parse.urlencode({"months": "0"}).encode()
    resp = opener.open("url2", data)
    return get_response(resp)
You can use urllib2 or the poster module to make POST requests; take a look at these questions:
python urllib2 file send problem
Using MultipartPostHandler to POST form-data with Python