How to read multiple URL-encoded strings with reqparse.RequestParser in Python

I am using Python (Flask) to read input sent from the command line with the following code, but when I pass URL-encoded strings (multiple arguments separated by a space) as input, they get merged into a single string with the space turned into '+'.
Sample.py
import urllib.parse  # needed for the quote_plus call below
from flask_restful import reqparse
parser = reqparse.RequestParser()
parser.add_argument('output')
args = parser.parse_args()
indata=args['output']
print(urllib.parse.quote_plus(indata))
run:
python sample.py
curl http://localhost:5000/mypage -d "output=ld%22+to+the+term old+%7B%0A++++pub" -X POST -v
output:
ld%22+to+the+term+old+%7B%0A++++pubin
while I expect the output to be
ld%22+to+the+term old+%7B%0A++++pubin (so that the parts can be split easily on the separator)
How can I avoid this?

You can't use raw spaces in form parameters (which is what you pass with -d in curl): the server decodes '+' to a space, so after decoding the two are indistinguishable.
I suggest you URL-encode your parameters before passing them to curl, or use another HTTP client that does that for you, e.g. requests or httpie.
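A minimal sketch of the encoding step suggested above, using only the standard urllib.parse module. The two sample parts and the choice of a space as separator are assumptions taken from the question's example, not from the Flask code itself. Each part is percent-encoded on its own, the parts are joined with a space, and the whole value is then form-encoded so the separator survives the trip to the server.

```python
from urllib.parse import parse_qs, quote_plus, unquote_plus, urlencode

# Hypothetical input: the two parts the question wants to keep apart.
parts = ['ld" to the term', 'old {\n    pub']

# Encode each part separately, then join with a literal space as the separator.
joined = " ".join(quote_plus(p) for p in parts)

# Form-encode the whole value; this is the string to hand to `curl -d`.
body = urlencode({"output": joined})

# What a form parser on the server side would recover from that body.
received = parse_qs(body)["output"][0]

# The separator is intact, so the parts can be split and decoded again.
recovered = [unquote_plus(p) for p in received.split(" ")]

print(joined)
print(recovered == parts)
```

Because the value is encoded twice (once per part, once as a form field), the server sees the space separator and the per-part '+' signs as different characters.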

Related

Content limit to update/create GitHub file with API with Python

I am trying to use the GitHub API to update one of my files, but I get errors when updating larger files. I should mention that https://docs.github.com/en/rest/repos/contents#size-limits says that files between 1 and 100 MB must use the raw or object media types and that files larger than 100 MB cannot be sent. But my file is only 149 KB and it still doesn't work.
I use this script to update my files:
import os , subprocess
Server_name_result = open(f"httpx_new.txt", "rb").read()
Server_name_encoded = subprocess.getoutput(f"""echo "$(cat httpx_new.txt)" | base64 -w 0 """)
sha_file = subprocess.getoutput("""curl -s -H "Authorization: Bearer <TOKEN>" https://api.github.com/repos/PrivetUser/PrivetRepo/contents/Servers.txt | jq -r '.sha' """)
os.system(f"""curl -X PUT -H "Accept: application/vnd.github+json" -H "Authorization: Bearer <TOKEN>" https://api.github.com/repos/PrivetUser/PrivetRepo/contents/Servers.txt -d '{{"message":"a new commit message","committer":{{"name":"name","email":"email@gmail.com"}},"content":"{Server_name_encoded}","sha":"{sha_file}"}}'""")
When my new file is 72 KB the script works fine, but when the file grows to 149 KB the script doesn't work at all: the last command is simply skipped. I believe the problem is the content parameter, because it makes the command very long. I tried double encoding, but it doesn't help. I have tested most libraries and code samples for updating file content, but none of them work, and this one has this bug.
What is the best way to update file content with Python, and how can I solve this problem so that my command executes?
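One way to test the suspicion that the command line is simply too long is to keep the payload out of the shell entirely. The following is a sketch, not a confirmed fix: it builds the same PUT request with the standard library's urllib.request, so the base64 content travels in the request body rather than in a command-line argument. The helper name build_update_request, its parameters, and the placeholder email are hypothetical; the URL, headers, and JSON fields are copied from the script above.

```python
import base64
import json
import urllib.request

API_URL = "https://api.github.com/repos/PrivetUser/PrivetRepo/contents/Servers.txt"


def build_update_request(content: bytes, sha: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the PUT request that replaces the file content."""
    payload = {
        "message": "a new commit message",
        # Placeholder committer, as in the question.
        "committer": {"name": "name", "email": "email@example.com"},
        # The API expects the new content base64-encoded.
        "content": base64.b64encode(content).decode("ascii"),
        # Blob sha of the file being replaced, fetched beforehand.
        "sha": sha,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


# Sending needs a real token and sha, so it is left commented out:
# with open("httpx_new.txt", "rb") as fh:
#     req = build_update_request(fh.read(), sha="<SHA>", token="<TOKEN>")
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```

Since the body is passed as bytes, its size is not subject to any limit on the length of a shell command or argument.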

!curl commands in Python notebook fail with 500 Internal error

I am running the code below in Google Colab and get "The server encountered an internal error or misconfiguration and was unable to complete your request." If I run the command without passing in the variable $data, as in the working command further down, it runs perfectly fine. It only seems to fail when I loop through the file and pass variables.
import csv
import json
reader = csv.reader(open('/content/drive/MyDrive/file5.csv'))
for row in reader:
    data = {"snps": row[0], "pop": "YRI", "r2_threshold": "0.9", "maf_threshold": "0.01"}
    data = json.dumps(data)
    data = "'{}'".format(data)
    !curl -k -H "Content-Type: application/json" -X POST -d "$data" 'https://ldlink.nci.nih.gov/LDlinkRest/snpclip?token=e3e559472899'
This works:
!curl -k -H "Content-Type: application/json" -X POST -d '{"snps": "rs3\nrs4", "pop":"YRI", "r2_threshold": "0.1", "maf_threshold": "0.01"}' 'https://ldlink.nci.nih.gov/LDlinkRest/snpclip?token=e3e559472899'
UPDATE: Actually, ipython does allow you to run ! escapes in a loop; the actual error in your code is purely in the incorrect quoting (especially the addition of single quotes around the data value, but there could be more).
Original (partially incorrect) answer below.
The ! escape tells your notebook (Google Colab, Jupyter, or what have you; basically whatever is running ipython as a kernel or similar) to leave Python and run a shell command. Python itself has no support for this; the closest approximation would be something like
import subprocess
...
for row in reader:
    data = {"snps": row[0], "pop": "YRI", "r2_threshold": "0.9", "maf_threshold": "0.01"}
    data = json.dumps(data)
    # This was wrong on so many levels
    # data = "'{}'".format(data)
    subprocess.run(['curl', '-k',
                    '-H', "Content-Type: application/json",
                    '-X', 'POST', '-d', data,
                    'https://ldlink.nci.nih.gov/LDlinkRest/snpclip?token=e3e559472899'],
                   text=True, check=True)
though avoiding subprocess and running Python urllib or requests code to perform the POST would be more efficient and elegant, and give you more control over what gets sent and how it gets handled.
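As a rough sketch of the urllib route mentioned above: the helper name build_snpclip_request is hypothetical, the URL and field values are copied from the question, and the request is only built here, not sent. Unlike the curl command with -k, this sketch does not disable certificate verification.

```python
import json
import urllib.request

URL = "https://ldlink.nci.nih.gov/LDlinkRest/snpclip?token=e3e559472899"


def build_snpclip_request(snps: str) -> urllib.request.Request:
    """Build a JSON POST for one row of SNP ids; no shell quoting involved."""
    payload = {"snps": snps, "pop": "YRI", "r2_threshold": "0.9", "maf_threshold": "0.01"}
    return urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_snpclip_request("rs3\nrs4")
# To actually send it: urllib.request.urlopen(req).read()
```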
How to properly quote strings when translating between shell commands and Python requires you to understand the shell's quoting behavior. I'll just briefly note that I left double quotes where they were not incorrect in your original command, but otherwise preferred single quotes, and of course, data now refers to a proper Python variable with that name, not a shell variable with the same name.
To reiterate: ipython (which is what your notebook is an interface to) knows how to run both Python code and shell script code via !; but once you ask it to run Python code, ipython hands it over to Python proper, and you are no longer in ipython.
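To illustrate the quoting problem named in the update, here is a small sketch, assuming the server parses the request body as JSON. It only shows the effect of wrapping the serialized data in single quotes; the helper is_valid_json is hypothetical.

```python
import json

data = json.dumps({"snps": "rs3\nrs4", "pop": "YRI", "r2_threshold": "0.9", "maf_threshold": "0.01"})

# The extra step from the question: literal single quotes become part of the value.
wrapped = "'{}'".format(data)


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(is_valid_json(data))
print(is_valid_json(wrapped))
```

The single quotes are shell syntax, not part of the JSON, so once they are baked into the Python string the payload no longer parses.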

How to get data from web in python using curl?

In bash when I used
myscript.sh
file="/tmp/vipin/kk.txt"
curl -L "myabcurlx=10&id-11.com" > $file
cat $file
./myscript.sh gives me below output
1,2,33abc
2,54fdd,fddg3
3,fffff,gfr54
When I tried to fetch it from Python with the code below:
mypython.py
command = curl + ' -L ' + 'myabcurlx=10&id-11.com'
output = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE).stdout.read().decode('ascii')
print(output)
Running python mypython.py throws an error. Can you please point out what is wrong with my code?
Error :
/bin/sh: line 1: &id=11: command not found
Wrong Parameter
command = curl + ' -L ' + 'myabcurlx=10&id-11.com'
Print out what this string is, or just think about it. Assuming that curl is the string 'curl' or '/usr/bin/curl' or something, you get:
curl -L myabcurlx=10&id-11.com
That’s obviously not the same thing you typed at the shell. Most importantly, that last argument is not quoted, and it has a & in the middle of it, which means that what you’re actually asking it to do is to run curl in the background and then run some other program that doesn’t exist, as if you’d done this:
curl -L myabcurlx=10 &
id-11.com
Obviously you could manually include quotes in the string:
command = curl + ' -L ' + '"myabcurlx=10&id-11.com"'
… but that won’t work if the string is, say, a variable rather than a literal in your source—especially if that variable might have quote characters within it.
The shlex module has helpers for quoting things properly.
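A short sketch of what that looks like with shlex.quote, using the URL string from the question:

```python
import shlex

url = "myabcurlx=10&id-11.com"

# shlex.quote wraps the value so the shell treats it as a single word.
quoted = shlex.quote(url)
command = "curl -L " + quoted
print(command)

# shlex.split parses the string the way a POSIX shell would split it.
args = shlex.split(command)
print(args)
```

Splitting the quoted command gives back three arguments, with the & still inside the last one.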
But the easiest thing to do is just not try to build a command line in the first place. You aren’t using any shell features here, so why add the extra headaches, performance costs, problems with the shell getting in the way of your output and retcode, and possible security issues for no benefit?
Make the arguments a list rather than a string:
command = [curl, '-L', 'myabcurlx=10&id-11.com']
… and leave off the shell=True
And it just works. No need to get spaces and quotes and escapes right.
Well, it still won’t work, because Popen doesn’t return output, it’s a constructor for a Popen object. But that’s a whole separate problem—which should be easy to solve if you read the docs.
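A minimal sketch of the list form that also captures the output, using subprocess.run. So that it runs anywhere, a child Python process that echoes its first argument stands in for curl here; that stand-in is an assumption of this example, and replacing echo_arg with ['curl', '-L', url] would give the real call.

```python
import subprocess
import sys

url = "myabcurlx=10&id-11.com"

# Stand-in for curl: a child process that prints its first argument unchanged.
echo_arg = [sys.executable, "-c", "import sys; print(sys.argv[1])", url]

# No shell=True, so the '&' is never interpreted by a shell.
result = subprocess.run(echo_arg, capture_output=True, text=True, check=True)

# The captured standard output is a normal Python string.
output = result.stdout.strip()
print(output)
```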
But for this case, an even better solution is to use the Python bindings to libcurl instead of calling the command-line tool. Or, even better, since you’re not using any of the complicated features of curl in the first place, just use requests to make the same request. Either way, you get a response object as a Python object with useful attributes like text and headers and request.headers that you can’t get from a command line tool except by parsing its output as a giant string.
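import subprocess
fileName = "/tmp/vipin/kk.txt"
# Pass the arguments as a list so that no shell ever sees the '&'
with open(fileName, "w") as f:
    # subprocess.run (there is no subprocess.read) sends curl's stdout straight to the file
    subprocess.run(["curl", "-L", "myabcurlx=10&id-11.com"], stdout=f)
print(fileName)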
recommended approaches:
https://docs.python.org/3.7/library/urllib.request.html#examples
http://docs.python-requests.org/en/master/user/install/

Newlines removed in POST request body? (Google App Engine)

I am building a REST API on Google App Engine (not using Endpoints) that will allow users to upload a CSV or tab-delimited file and search for potential duplicates. Since it's an API, I cannot use <form>s or the BlobStore's upload_url. I also cannot rely on having a single web client that will call this API. Instead, ideally, users will send the file in the body of the request.
My problem is, when I try to read the content of a tab-delimited file, I find that all newline characters have been removed, so there is no way of splitting the content into rows.
If I check the content of the file directly on the Python interpreter, I see that tabs and newlines are there (output is truncated in the example)
>>> with open('./data/occ_sample.txt') as o:
...     o.read()
...
'id\ttype\tmodified\tlanguage\trights\n123456\tPhysicalObject\t2015-11-11 11:50:59.0\ten\thttp://creativecommons.org/licenses/by-nc/3.0\n...'
The RequestHandler logs the content of the request body:
import logging
import webapp2
class ReportApi(webapp2.RequestHandler):
    def post(self):
        logging.info(self.request.body)
        ...
So when I call the API running in the dev_appserver via curl
curl -X POST -d @data/occ_sample.txt http://localhost:8080/api/v0/report
This shows up in the logs:
id type modified language rights123456 PhysicalObject 2015-11-11 11:50:59.0 en http://creativecommons.org/licenses/by-nc/3.0
As you can see, there is nothing between the last value of the headers and the first record (rights and 123456 respectively) and the same happens with the last value of each record and the first one of the next.
Am I missing something obvious here? I have tried loading the data with self.request.body, self.request.body_file and self.request.POST, and none seem to work. I also tried applying the Content-Type values text/csv, text/plain, application/csv in the request headers, with no success. Should I add a different Content-Type?
You are using the wrong curl command-line option to send your file data, and it is this option that is stripping the newlines.
The -d option parses out your data and sends an application/x-www-form-urlencoded request, and it strips newlines. From the curl manpage:
-d, --data <data>
[...]
If you start the data with the letter @, the rest should be a file name to read the data from, or - if you want curl to read the data from stdin. Multiple files can also be specified. Posting data from a file named 'foobar' would thus be done with --data @foobar. When --data is told to read from a file like that, carriage returns and newlines will be stripped out.
Bold emphasis mine.
Use the --data-binary option instead:
--data-binary <data>
(HTTP) This posts data exactly as specified with no extra processing whatsoever.
If you start the data with the letter @, the rest should be a filename. Data is posted in a similar manner as --data-ascii does, except that newlines and carriage returns are preserved and conversions are never done.
You may want to include a Content-Type header in that case; of course this depends on your handler if you care about that header.
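The difference can be pictured with a small Python emulation. This is not curl's code, only a sketch of the behaviour the manpage describes; the function names and the sample bytes are hypothetical.

```python
# Sample tab-delimited content with two rows.
sample = b"id\ttype\r\n123456\tPhysicalObject\r\n"


def like_data_from_file(raw: bytes) -> bytes:
    """Emulate `--data @file`: carriage returns and newlines are removed."""
    return raw.replace(b"\r", b"").replace(b"\n", b"")


def like_data_binary(raw: bytes) -> bytes:
    """Emulate `--data-binary @file`: bytes are passed through unchanged."""
    return raw


# With the stripped variant the rows run together and cannot be split again.
print(like_data_from_file(sample))
# With the binary variant the handler can still split the body into rows.
print(like_data_binary(sample).splitlines())
```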

NOAA Weather REST API causes error when requesting with curl

I am trying to write a python program using NOAA's Climate Data Online REST Web Services (http://www.ncdc.noaa.gov/cdo-web/webservices/v2#data). But, I am running into errors in my request responses. When attempting a request with curl from command line I input:
curl -H "token:<MYTOKEN>" http://www.ncdc.noaa.gov/cdo-web/api/v2/data?datasetid=GHCND&locationid=ZIP:22405&startdate=1999-10-05&enddate=1999-10-25
It returns this response:
[1] 24322
[2] 24323
[3] 24324
phil@philUbu:~$ <?xml version="1.0" encoding="UTF-8" standalone="yes"?><response><statusCode>400</statusCode><userMessage>There was an error with the request.</userMessage><developerMessage>Required parameter 'startdate' is missing.</developerMessage></response>
[1] Done curl -H "token:..." http://www.ncdc.noaa.gov/cdo-web/api/v2/data?datasetid=GHCND
[2]- Done locationid=ZIP:22405
[3]+ Done startdate=1999-10-05
For some reason it thinks I am missing the startdate, but I have included it and it is in the proper format according to the documentation. Does anybody have any ideas of what the problem could be?
The ampersands in the url are probably being parsed by your shell. Put single quotes around it:
curl -H "token:<MYTOKEN>" 'http://www.ncdc.noaa.gov/cdo-web/api/v2/data?datasetid=GHCND&locationid=ZIP:22405&startdate=1999-10-05&enddate=1999-10-25'
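Since the stated goal is a Python program, here is a sketch of building the same request without any shell, so the ampersands need no quoting at all. It uses only the standard library; the token value is the placeholder from the question, and the request is built but not sent.

```python
import urllib.request
from urllib.parse import urlencode

BASE = "http://www.ncdc.noaa.gov/cdo-web/api/v2/data"

params = {
    "datasetid": "GHCND",
    "locationid": "ZIP:22405",
    "startdate": "1999-10-05",
    "enddate": "1999-10-25",
}

# safe=":" keeps the colon in ZIP:22405 literal, matching the URL in the question.
url = BASE + "?" + urlencode(params, safe=":")

# The token goes in a request header, as with curl's -H option.
req = urllib.request.Request(url, headers={"token": "<MYTOKEN>"})
print(url)
# To send it: urllib.request.urlopen(req).read()
```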
