Fix potential XSS on Python using Bleach module

I am working on a cloud-based API implemented using Lambda functions on AWS. We are scanning it for security issues, and the Veracode SAST tool has identified a potential XSS on one of the endpoints.
The code looks as follows:
def lambda_handler(event: dict, _):
    event = Event(event)
    body = event.body
    id_xss = body["id"]
    …
    return {
        "status_code": 200,
        "body": {
            "message": f"Successfully endpoint execution ${id_xss}."
        }
    }
I am trying to fix it by using the Python module Bleach. So my new code would look as follows:
import bleach

def lambda_handler(event: dict, _):
    event = Event(event)
    body = event.body
    # SANITIZE THE INPUT
    id_xss = bleach.clean(body["id"], BLEACH_VALID_TAGS,
                          BLEACH_VALID_ATTRS, BLEACH_VALID_STYLES)
    …
    return {
        "status_code": 200,
        "body": {
            "message": f"Successfully endpoint execution ${id_xss}."
        }
    }
I've also tried checking that body["id"] is a natural number.
However, the Veracode SAST tool still considers it an issue. Am I using Bleach properly? What would be the right approach?
Any comment or feedback will be welcomed.
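For reference, the natural-number check mentioned above could look roughly like the sketch below. The helper names parse_natural_id and build_response are hypothetical, and nothing here is confirmed to satisfy Veracode; it only illustrates validating the value strictly and then building the response from the validated, re-encoded value rather than from the raw input.

```python
import html
import json


def parse_natural_id(raw):
    """Return raw as a non-negative int, or raise ValueError."""
    text = str(raw).strip()
    # isascii() rules out non-ASCII digits that isdigit() would accept
    if not (text.isascii() and text.isdigit()):
        raise ValueError("id must be a natural number")
    return int(text)


def build_response(raw_id):
    """Build the response only from the validated, re-encoded value."""
    safe_id = html.escape(str(parse_natural_id(raw_id)))
    return {
        "status_code": 200,
        "body": json.dumps({"message": f"Successful endpoint execution {safe_id}."}),
    }
```

Anything that is not a plain run of ASCII digits is rejected before it can reach the response body.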

Related

Snowflake External Functions using Azure Functions on Python not working

I want to create an external function that can be used to upsert rows into MongoDB. I've created the function and tested it with Postman, both locally and after publishing. I followed the documentation at https://docs.snowflake.com/en/sql-reference/external-functions-creating-azure-ui.html and at first used the JavaScript function they propose for testing, which worked. However, when I run it in Python I get an error. This is the code:
import logging
import azure.functions as func
import pymongo
import json
import os
from datetime import datetime

cluster = pymongo.MongoClient(os.environ['MongoDBConnString'])
db = cluster[f"{os.environ['MongoDB']}"]
collection = db[f"{os.environ['MongoDBCollection']}"]

def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')
    name = req.params.get('name')
    if not name:
        try:
            req_body = req.get_json()
        except ValueError:
            pass
        else:
            name = req_body.get('name')
    if name:
        return func.HttpResponse(f"Hello, {name}. This HTTP triggered function executed successfully.")
    else:
        collection.update_one(
            filter={
                '_id': req_body['_id']
            },
            update={
                '$set': {'segment_ids': req_body['segment_ids']}
            },
            upsert=True)
        return func.HttpResponse(
            json.dumps({"status_code": 200,
                        "status_message": "Upsert Success",
                        "Timestamp": datetime.utcnow().strftime("%Y-%m-%dT%H:%M:%S"),
                        "_id": req_body['_id']}),
            status_code=200,
            mimetype="text/plain"
        )
The error states that req_body is referenced before being defined, failing at the line '_id': req_body['_id']. In Snowflake I've created an external function called mongoUpsert(body variant), and I am passing a simple query to test it.
select mongoUpsert(object_construct('_id', 'someuuid', 'segment_ids', array_construct(1,2,3,4)))
From what I can tell, the function is not receiving the body I'm passing from Snowflake for some reason. I don't know what I am doing wrong. Can anyone help me? Can anyone also explain how Snowflake sends the parameters (as body, params, or headers), and is there a way to specify whether I want to pass a body or params?

External functions send and receive data in a particular format. All the parameters are sent in the request body.
https://docs.snowflake.com/en/sql-reference/external-functions-data-format.html
You can check out snowflake-labs for external function samples.
There is one specifically for Azure Python functions that calls the Translator API.

I started from scratch and stripped the layers one by one in Snowflake. The Snowflake parameter is passed in the body of the request, but wrapped in a row array, inside an outer array stored under the key 'data'. Furthermore, Snowflake expects the same schema in the response. Here is the template to use for Azure Functions with Python.
import logging
import azure.functions as func
import json

def main(req: func.HttpRequest) -> func.HttpResponse:
    # Get the request body sent by Snowflake
    req_body = req.get_json()['data'][0][1]
    ###### Do Something
    # Return Response
    message = {"Task": "Completed"}
    return func.HttpResponse(
        json.dumps({'data': [[0, message]]}),
        status_code=200)
As an example, I've used a simple JSON object:
{
"_id": "someuuid"
}
And created an external function in Snowflake called testfunc(body variant) and called it using select testfunc(object_construct('_id', 'someuuid')).
If you log the request body (using logging.info(req.get_json())), it prints the following:
{
    "data":
    [
        [
            0,
            {
                "_id": "someuuid"
            }
        ]
    ]
}
So to get the clean input I fed in from Snowflake, I have the line
req_body = req.get_json()['data'][0][1]
However, I kept getting errors on the response until I tried just echoing the input and noticed that I was returning it without the wrapping. The returned body needs to be a string (hence json.dumps()), but it also needs the wrapping. So to return a result, first define the message you want (it may be a calculation on the input or an acknowledgement), then wrap the message in {'data': [[0, message]]}, and finally serialize it to a string with json.dumps().
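The template above handles only the first row. Snowflake can batch several rows into one request, each row being [row_number, arg1, ...], and expects one [row_number, result] pair per row in the reply. A minimal sketch of that wrapping and unwrapping follows; handle_batch and process_row are hypothetical names, and it is kept free of azure.functions so it can be tried locally.

```python
import json


def handle_batch(payload, process_row):
    """Apply process_row to every row and return the wrapped JSON string.

    payload is the parsed request body: {"data": [[row_number, arg1, ...], ...]}
    """
    rows_out = []
    for row in payload["data"]:
        row_number, args = row[0], row[1:]
        # Echo the row number back so results can be matched to inputs
        rows_out.append([row_number, process_row(*args)])
    return json.dumps({"data": rows_out})


# Example with two rows, each carrying a single variant argument
payload = {"data": [[0, {"_id": "someuuid"}], [1, {"_id": "otheruuid"}]]}
reply = handle_batch(payload, lambda body: {"Task": "Completed", "_id": body["_id"]})
```

Inside the Azure function, the returned string would be passed to func.HttpResponse(..., status_code=200) as in the template.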

How to correctly generate x-instagram-gis

I have written the following function in Python3.7 to generate x-instagram-gis. According to my research regarding this topic I have gathered that I only need the rhx_gis and variables (id: profile_id, first: int<50, after: end_cursor) to generate the x-instagram-gis.
import hashlib
import json

def generate_x_instagram_gis(rhx_gis, cursor, profile_id):
    params = {
        "id": profile_id,
        "first": 12,
        "after": cursor,
    }
    # Compact separators so the JSON contains no spaces
    json_params = json.dumps(params, separators=(',', ':'))
    values = "{}:{}".format(rhx_gis, json_params)
    return hashlib.md5(values.encode('utf-8')).hexdigest()
Running the following should return: 90bd6b662f328642477076d92d599064
rhx_gis = "7733066781d53e86a089eeb454c5446d"
cursor = "QVFBZWRqS0RnbGMtaXJhQzhlRW01R0I2YngtVXNQOGRTZzdHZEdseGcyVE1MdUxFYmYyY011Zkx6dFZtQUlsYWNvRl9DWnhtalpXZ2daSU5YQnFNTFBGRg=="
profile_id = "6822549659" #https://www.instagram.com/kimimatiasraikkonen/
print(generate_x_instagram_gis(rhx_gis, cursor, profile_id))
But it returns: f5e1e4be6612701d43523d707e36672b
For reference, these are the sources I've looked at:
https://github.com/rarcega/instagram-scraper/issues/205
How to perform unauthenticated Instagram web scraping in response to recent private API changes?
I'm not entirely sure what I'm doing incorrectly. When I run this with my entire program it doesn't work, and after much testing this is the only part that causes an issue. Another thing I noted is that the MD5 is different when running on Python 3.7 and Python 2.7.

I have figured it out.
The rhx_gis value is calculated based on the user-agent sent in the headers. The rhx_gis value I was obtaining was retrieved using python requests which sets its own user-agent (python-requests or something similar), whereas the rhx_gis value I was seeing on Postman was created using a different user-agent (set on Postman)
To fix this issue I had to set the same user-agent in python requests as the one set on Postman.
headers = {
    'User-Agent': ''  # user-agent here
}
requests.get(url, headers=headers)

It seems that Instagram has updated the API again, and the format of query_variable has changed. It now looks as follows:
{
    "id":"25025320",
    "include_reel":true,
    "fetch_mutual":false,
    "first":13,
    "after":"QVFDZV9udFJKbVk3OGNlOE1LeGx3V1g0aEUyNFNSQTFUenhWOFVkWktTVzdpdUJRSk9EQXY3Ym9QQXFwTWJEci1pYklhSHFGQU1PTnl6QmhZbGpjalplSQ=="
}
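Because the signature in the question is an MD5 over the exact serialized string, the key order and separators of the variables matter. This may also contribute to the Python 2.7 vs 3.7 difference noted in the question, since plain dicts in Python 2.7 do not preserve insertion order. A small sketch, assuming the "rhx_gis:variables" scheme from the question still applies (which this thread does not confirm); sign is a hypothetical helper name:

```python
import hashlib
import json


def sign(rhx_gis, variables):
    """MD5 of 'rhx_gis:<compact JSON>', as in the question's function."""
    payload = json.dumps(variables, separators=(",", ":"))
    return hashlib.md5("{}:{}".format(rhx_gis, payload).encode("utf-8")).hexdigest()


rhx_gis = "7733066781d53e86a089eeb454c5446d"
in_order = {"id": "25025320", "first": 13}
reordered = {"first": 13, "id": "25025320"}

# Same data, different key order: the serialized strings differ, so the hashes differ
same_hash = sign(rhx_gis, in_order) == sign(rhx_gis, reordered)
```

Building the variables in the same order the site uses avoids this source of mismatch.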

Axios and Flask POST and GET requests, passing arguments

I am learning how web apps work, and after successfully creating a connection between the front end and back end I managed to perform a GET request with axios:
Route in my Flask app:
@app.route('/api/random')
def random_number():
    k = kokos()
    print(k)
    response = {'randomNumber': k}
    return jsonify(response)
My kokos() function:
def kokos():
    return (890)
Function that I call to get data from backend:
getRandomFromBackend () {
    const path = `http://localhost:5000/api/random`
    axios.get(path)
        .then(response => {this.randomNumber = response.data.randomNumber})
        .catch(error => {
            console.log(error)
        })
}
Now suppose I have an input field in my app with a value that I want to use in the function kokos() to affect the result and what is displayed in my app. Can someone explain to me how to do that?
Is this what POST requests are for, so I have to POST first and then GET? Or can I still use GET and somehow pass "arguments"? Is this even what GET and POST are for, or am I making it too complicated for myself?
Is this the proper way to do this kind of thing? I just have a lot of Python code already written and want to simply exchange data between server and client.
Thank you, Jakub

You can add a second argument:
axios.get(path, {
    params: {
        id: 122
    }
})
.then ...
You can pass id, or anything else, like this; it will be available in the GET parameters on the Python side, just as if it were passed in the URL.
Python side (Flask): http://flask.pocoo.org/docs/1.0/quickstart/#accessing-request-data
To access parameters submitted in the URL (?key=value) you can use the args attribute:
def random_number():
    id = request.args.get('id', '')
    k = kokos(id)
id will be passed to the kokos function; if no id is provided, it will be the empty string ''.
You can read the axios documentation to make more complex requests.
https://github.com/axios/axios
If you have any doubts, please comment.
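To make the round trip concrete, here is a standard-library sketch of what happens to the params object: it is encoded into the query string, and the server reads it back by key. build_url and read_arg are hypothetical helpers that only mimic the behaviour; they are not axios or Flask APIs.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_url(path, params):
    """Mimic how a params object ends up in the query string."""
    return "{}?{}".format(path, urlencode(params))


def read_arg(url, key, default=""):
    """Mimic request.args.get(key, default) for a raw URL."""
    values = parse_qs(urlsplit(url).query).get(key)
    return values[0] if values else default


url = build_url("http://localhost:5000/api/random", {"id": 122})
value = read_arg(url, "id")
```

Note that the value arrives as a string, so kokos would need to convert it (for example with int(id)) before doing arithmetic; in Flask, request.args.get('id', type=int) performs that conversion.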

How to write a GRPC python unittest

We are using gRPC as the RPC protocol for all our internal systems. Most of the system is written in Java.
In Java, we can use InProcessServerBuilder for unit tests. However, I haven't found a similar class in Python.
Can anyone provide sample code showing how to unit test gRPC in Python?

How serendipitous that you have asked this question today; our unit test framework just entered code review. So for the time being the way to test is to use the full production stack to connect your client-side and server-side code (or to violate the API and mock a lot of internal stuff) but hopefully in days to weeks the much better solution will be available to you.

Some example code to get started:
proto
syntax = "proto3";
service MyLibrary {
    rpc Search (Request) returns (Response);
}
message Request {
    string id = 1;
}
message Response {
    string status = 1;
}
python unit test
#!/usr/bin/env python
# coding=utf-8
import unittest
from grpc import StatusCode
from grpc_testing import server_from_dictionary, strict_real_time
import mylibrary_pb2
# MyLibraryServicer is your own servicer implementation; import it from wherever it is defined.


class TestCase(unittest.TestCase):
    def __init__(self, methodName) -> None:
        super().__init__(methodName)
        myServicer = MyLibraryServicer()
        servicers = {
            mylibrary_pb2.DESCRIPTOR.services_by_name['MyLibrary']: myServicer
        }
        self.test_server = server_from_dictionary(
            servicers, strict_real_time())

    def test_search(self):
        # Request.id is declared as a string in the proto file
        request = mylibrary_pb2.Request(
            id="2",
        )
        method = self.test_server.invoke_unary_unary(
            method_descriptor=(mylibrary_pb2.DESCRIPTOR
                               .services_by_name['MyLibrary']
                               .methods_by_name['Search']),
            invocation_metadata={},
            request=request, timeout=1)
        response, metadata, code, details = method.termination()
        self.assertTrue(bool(response.status))
        self.assertEqual(code, StatusCode.OK)


if __name__ == '__main__':
    unittest.main()

I found pytest-grpc easy to follow and got it working in a few minutes.
src: https://pypi.org/project/pytest-grpc/

Test Environment with Mocked REST API

Let's say I have a very simple web app which is presented as blue if the current president is a Democrat and red if they are a Republican. A REST API is used to get the current president, via the endpoint:
/presidents/current
which currently returns the JSON object:
{"name": "Donald Trump", "party": "Republican"}
So when my page loads I call the endpoint and I show red or blue depending on who is returned.
I wish to test this HTML/javascript page and I wish to mock the back-end so that I can control from within the test environment the API responses. For example:
def test_republican():
    # configure the response for this test that the web app will receive when it connects to this endpoint
    configure_endpoint(
        "/presidents/current",
        jsonify(
            name="Donald Trump",
            party="Republican"
        )
    )
    # start the web app in the browser using selenium
    load_web_app(driver, "http://localhost:8080")
    e = driver.find_element_by_name("background")
    # value_of_css_property is the Python binding's equivalent of Java's getCssValue
    assert(e.value_of_css_property("background-color") == "red")

def test_democrat():
    # configure the response for this test that the web app will receive when it connects to this endpoint
    configure_endpoint(
        "/presidents/current",
        jsonify(
            name="Barack Obama",
            party="Democrat"
        )
    )
    # start the web app in the browser using selenium
    load_web_app(driver, "http://localhost:8080")
    e = driver.find_element_by_name("background")
    assert(e.value_of_css_property("background-color") == "blue")
So the question is how should I implement the function configure_endpoint() and what libraries can you recommend me?

As @Kie mentioned, implementing configure_endpoint alone won't be enough if you're going to stub the whole server side from within Selenium Python code. You would need a web server, or something similar, that responds over HTTP to requests from within the testing environment.
It looks like the question is partly about testing client-side code. What I see is that you're trying to write a unit test for client-side logic but are using an integration testing suite to check that logic, which is strange.
The main idea is as follows.
You're trying to test client-side code, so let's create the mocks on the client side too, because this part of the code is entirely client-side.
If you actually want mocks, not stubs (see the difference here: https://stackoverflow.com/a/3459491/882187), a better way is to mock out the HTTP requests inside your JavaScript code, simply because you're testing a piece of client-side code, not server-side logic.
Keeping it isolated from the server side is a great idea that you will appreciate as your project grows and more and more endpoints appear.
For example, you can use the following approach:
var restResponder = function() { // the original responder your client-side app will use
    this.getCurrentPresident = function(successCallback) {
        $.get('/presidents/current', successCallback);
    }
};
var createMockResponder = function(president, party) { // factory that creates mocks
    var myPresident = president;
    var myParty = party;
    return function() {
        this.getCurrentPresident = function (successCallback) {
            successCallback({"name": myPresident, "party": myParty});
        }
    };
}
// somewhere swap the original restResponder with new mockResponder created by 'createMockResponder'
// then use it in your app:
function drawColor(restResponder, backgroundEl) {
    restResponder.getCurrentPresident(function(data){
        // jQuery sets inline styles with .css()
        if (data.party == "Democrat") $(backgroundEl).css('background-color', 'blue')
        else if (data.party == "Republican") $(backgroundEl).css('background-color', 'red')
        else console.info('Some strange response from server... Nevermind...');
    });
}
In practice, this implementation depends on which framework you use on the client side. If it's jQuery, then my example is enough, though it looks very wordy. If you have something more advanced, like AngularJS, you can do the same in 2-3 lines of code:
// Set up the mock http service responses
$httpBackend = $injector.get('$httpBackend');
// backend definition common for all tests
authRequestHandler = $httpBackend.when('GET', '/auth.py')
    .respond({userId: 'userX'}, {'A-Token': 'xxx'});
Check out the docs: https://docs.angularjs.org/api/ngMock/service/$httpBackend
If you still stick to the idea that you need mocks inside Selenium tests, please try this project: https://turq.readthedocs.io/en/latest/
It provides a Python DSL for describing REST responders.
Using turq, your mocks will look as follows:
path('/presidents/current').json({'name':'Barack Obama', 'party': 'Democrat'}, jsonp=False)
Also, I would recommend trying stubs instead of mocks, using this Python module: mock-server https://pypi.python.org/pypi/mock-server/0.3.7
You need to create a directory layout containing the corresponding pre-populated JSON responses and add some boilerplate code to make the mock-server respond on 'localhost:8080'. The directory layout for your example will look like this:
stub_obama/
    presidents/
        current/
            GET_200.json # will contain {"name": "Barack Obama", "party": "Democrat"}
stub_trump/
    presidents/
        current/
            GET_200.json # will contain {"name": "Donald Trump", "party": "Republican"}
However, mock-server is based on Tornado, which I think is a very heavy solution to use in tests.
I hope my answer is helpful and informative. Feel free to discuss it! I have done many projects with Selenium, with big and small tests, covering both client side and server side.

I would use the Tornado web framework.
import json
import functools
import operator
from tornado import ioloop, web, gen
from tornado.options import define, options

define("data_file", default='default/mock.json', type=str)


class Handler(web.RequestHandler):
    def data_received(self, chunk):
        pass

    def initialize(self, data):
        self.data = data

    @gen.coroutine
    def get(self, *args, **kwargs):
        # Walk the JSON data using the URL path components, then any query arguments
        path = self.request.path.split("/")[1:]
        path = functools.reduce(
            operator.add,
            [[k, v[0].decode("utf-8")] for k, v in self.request.query_arguments.items()],
            path
        )
        try:
            self.write(functools.reduce(operator.getitem, path, self.data))
        except KeyError:
            self.set_status(404)


class Application(web.Application):
    def __init__(self):
        data = {}
        with open(options.data_file) as data_file:
            data = json.load(data_file)
        handlers = [
            ('(.*)', Handler, {"data": data})
        ]
        settings = dict(
            gzip=True,
            static_hash_cache=True,
        )
        web.Application.__init__(self, handlers, **settings)


def main():
    # Read --data-file from the command line before building the application
    options.parse_command_line()
    io_loop = ioloop.IOLoop.instance()
    backend_application = Application()
    backend_application.listen(8001)
    io_loop.start()


if __name__ == "__main__":
    main()
This is code I used for mocking a REST API. It is a standalone script, but it can be embedded into your test environment as well.
I defined a JSON file that lists the different path components and what should be returned for them, like this:
{
    "presidents": {
        "current": {
            "name": "Donald Trump",
            "party": "Republican"
        }
    }
}
I saved this to a mock.json and called the script with a parameter mock_rest.py --data-file="./mock.json".
I hope that gives you a starting point and a good example.
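For comparison, a similar stand-in can be sketched with only the standard library, with a configure_endpoint function shaped like the one in the question. This is a minimal sketch under assumptions, not a complete test harness: ENDPOINTS, MockHandler, start_mock_server and fetch_json are hypothetical names, only GET is handled, and the web app under test is assumed to be pointed at this server's address.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.request import urlopen

# Hypothetical registry: URL path -> JSON-serializable response body
ENDPOINTS = {}


def configure_endpoint(path, payload):
    """Register the body that GET <path> should return."""
    ENDPOINTS[path] = payload


class MockHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        path = self.path.split("?", 1)[0]  # ignore any query string
        if path not in ENDPOINTS:
            self.send_error(404)
            return
        body = json.dumps(ENDPOINTS[path]).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def start_mock_server(port=0):
    """Serve on a background thread; port 0 lets the OS pick a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), MockHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_json(url):
    """Small client helper for checking the mock from a test."""
    with urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

In a test, call start_mock_server() once, call configure_endpoint("/presidents/current", {...}) at the top of each test, and call server.shutdown() during teardown.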

If your load_web_app function uses the requests library to access the REST API, using requests-mock is a convenient way to fake that library's functionality for test purposes.

For those who stumble upon this question, and do not want to end up writing the code to create their own mock server implementations of the API, you can use Mocktastic, which is a downloadable desktop application for Windows, MacOS and Linux, which provides an easy to use GUI to setup your mock API servers.
