How to prevent % signs from ignoring the following 2 characters - python

I am trying to use requests to get a link's content:
r = requests.get('https://exampleurl.com/search/user_agent=Mozilla%2F5.0(Windows+NT+10.0%3B+Win64%3B+x64)+AppleWebKit%2F537.36+(KHTML,+like+Gecko)+Chrome%2F84.0.4147.105+Safari%2F537.36')
The % signs seem to swallow the following 2 characters, so the endpoint becomes invalid and returns nothing.
Probably a very beginner question, but any help is appreciated. :)

Python has a feature called raw strings. You create one by prefixing the string literal with an r, so in your case it would be:
r = requests.get(r'https://exampleurl.com/search/user_agent=Mozilla%2F5.0(Windows+NT+10.0%3B+Win64%3B+x64)+AppleWebKit%2F537.36+(KHTML,+like+Gecko)+Chrome%2F84.0.4147.105+Safari%2F537.36')
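One thing worth checking before relying on the r prefix: in a Python string literal only the backslash starts an escape sequence, so % is stored as an ordinary character either way. The sketch below uses a shortened copy of the URL from the question (trimmed here for brevity, which is an assumption of this example) and the standard library's urllib.parse.unquote to show where the %2F and %3B sequences actually get interpreted. It makes no network request.

```python
from urllib.parse import unquote

# Shortened copy of the question's URL (trimmed for this example).
plain_url = 'https://exampleurl.com/search/user_agent=Mozilla%2F5.0(Windows+NT+10.0%3B+Win64)'
raw_url = r'https://exampleurl.com/search/user_agent=Mozilla%2F5.0(Windows+NT+10.0%3B+Win64)'

# There are no backslashes, so both literals hold exactly the same characters.
print(plain_url == raw_url)

# The percent sequences are still present in the string itself.
print('%2F' in plain_url)

# They are only turned into characters when something URL-decodes them.
print(unquote('Mozilla%2F5.0'))
print(unquote('10.0%3B'))
```

If the endpoint still returns nothing, the cause is more likely on the URL-decoding side (the server or the URL layout) than in how Python stores the literal.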

Related

Issue using a variable with an r-string in Python

I'm fairly new to Python, and I've got a batch job from which I now have to save some extracts to a company SharePoint site. I've searched around and cannot seem to find a solution to the issue I keep running into. I need to pass a date into the filename, and was first having issues with using a normal string. If I just type out the entire thing as a raw string, I get the output I want:
x = r"\\mnt4793\DavWWWRoot\sites\GlobalSupply\Plastics\DataExtracts\2021-02-15_aRoute.xlsx"
print(x)
The output is: \\mnt4793\DavWWWRoot\sites\GlobalSupply\Plastics\DataExtracts\2021-02-15_aRoute.xlsx
However, if I break the string into its parts so I can get a parameter in there, I wind up having to toss an extra double-quote on the "x" parameter to keep the code from running into a "SyntaxError: EOL while scanning string literal" error:
x = r"\\mnt4793\DavWWWRoot\sites\GlobalSupply\Plastics\DataExtracts\""
timestamp = date_time_obj.date().strftime('%Y-%m-%d')
filename = "_aRoute.xlsx"
print (x + timestamp + filename)
But the output I get passes that unwanted double quote into my string: \\mnt4793\DavWWWRoot\sites\GlobalSupply\Plastics\DataExtracts\"2021-02-15_aRoute.xlsx
The syntax I need is clearly escaping me, I'm just trying to get the path built so I can save the file itself. If it happens to matter, I'm using pandas to write the file:
data = pandas.read_sql(sql, cnxn)
data.to_excel(string_goes_here)
Any help would be greatly appreciated!
Per the comment from @Matthias, it turns out that an r-string can't end with a single backslash. The quick workaround, therefore, was:
x = r"\\mnt4793\DavWWWRoot\sites\GlobalSupply\Plastics\DataExtracts" + "\\"
The comment from @sammywemmy also linked to what looks to be a much more thorough solution.
Thank you both!
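Another way to sidestep the trailing-backslash problem, sketched here as an alternative rather than as what the commenters suggested, is to let a path-joining function add the separator. The example uses ntpath (the module behind os.path on Windows) so it behaves the same on any platform. The fixed date stands in for date_time_obj from the question and is an assumption of this example.

```python
import ntpath
from datetime import date

# Raw string with no trailing backslash, so the literal is valid.
base = r"\\mnt4793\DavWWWRoot\sites\GlobalSupply\Plastics\DataExtracts"

# Stand-in for date_time_obj.date() from the question (assumed value).
timestamp = date(2021, 2, 15).strftime("%Y-%m-%d")
filename = timestamp + "_aRoute.xlsx"

# ntpath.join inserts the single backslash between folder and file name.
full_path = ntpath.join(base, filename)
print(full_path)
```

The resulting string can then be passed to data.to_excel(...) in place of string_goes_here.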

How to fix certain 'Line too long' errors in a Python file?

I have the following line inside a for loop (i.e. it's indented by 4 spaces):
abcdefgh_ijklm_nopqrstuvwxy = abcdefgh_ijklm_nopqrstuvwxy.append(abc_de)
The line is 80 characters long. How can I split it up so that I do not get a 'Line too long' notification? Please note that I've changed the variable names for privacy reasons (not my code), but I can't modify the name of the variable, so naming it something shorter to fix the problem is not a viable option.
As a secondary question, how would I split up a formatted string of the form:
data_header = f"FILE_{heading_angles}_{moment_of_inertia}_{mass_of_object}_{type_of_object}"
to span multiple lines?
I already tried
data_header = f"FILE_{heading_angles}_{moment_of_inertia}_"
f"{mass_of_object}_{type_of_object}"
but that gives me an indentation error.
Any kind of help would be greatly appreciated!
Hope that these points answer your questions:
To shorten the line, try binding the long name to a shorter temporary variable before the expression. This may be inappropriate if more involved operations are needed. For example:
a = abcdefgh_ijklm_nopqrstuvwxy
abcdefgh_ijklm_nopqrstuvwxy = a.append(abc_de)
In your case, try using a backslash (\) at the end of the line to continue the statement on the next line. For example:
if a == True and \
        b == False:
    pass
Here is a link from another discussion on a similar matter.
Hope this helps.
For your data_header example, you need parentheses around the string parts.
For example:
data_header = (
    f"FILE_{heading_angles}_{moment_of_inertia}_"
    f"{mass_of_object}_{type_of_object}"
)
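The same parenthesis trick also works for the append line from the question, since Python continues a statement across lines while a parenthesis is open. Below is a minimal sketch. The Rows class is a hypothetical stand-in for whatever object the real code uses (the question does not say), chosen only because its append() returns a new object, and the sample values are made up.

```python
class Rows:
    """Hypothetical stand-in whose append() returns a new object."""

    def __init__(self, items=()):
        self.items = tuple(items)

    def append(self, item):
        return Rows(self.items + (item,))


abcdefgh_ijklm_nopqrstuvwxy = Rows()
for abc_de in (1, 2, 3):
    # Wrapping the right-hand side in parentheses lets it move to its own line.
    abcdefgh_ijklm_nopqrstuvwxy = (
        abcdefgh_ijklm_nopqrstuvwxy.append(abc_de)
    )

# Made-up values for the f-string example.
heading_angles, moment_of_inertia = 90, 1.5
mass_of_object, type_of_object = 10, "cube"

# Adjacent string literals inside parentheses are concatenated.
data_header = (
    f"FILE_{heading_angles}_{moment_of_inertia}_"
    f"{mass_of_object}_{type_of_object}"
)
print(data_header)
```

Parentheses are generally preferred over a trailing backslash because a stray space after the backslash breaks the continuation.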

String splitting in python in CSV columns

I am working with a CSV that has a many-to-one relationship, and I have 2 problems I need assistance in solving. The first is that I have the string set up like
thisismystr=thisisanemail@addy.com,blah,blah,blah, startnewCSVcol
I need to split the string twice, once on = and once on , because I am trying to get the portion that is an e-mail address (thisisanemail@addy.com). So far I have figured out how to split the string on the = using something like this:
str = 'thisismystr=thisisanemail@addy.com,blah,blah,blah'
print(str.split("="))
The second piece of the result is "thisisanemail@addy.com,blah,blah,blah", which still leaves the ,blah,blah,blah portion to be removed. After a bit of research I am stumped, as nothing explains how to remove from the middle, just the first part or the last part. Does anyone know how to do this?
For the 2nd part, I need to do this for multiple lines, so this is more of an advice question: is it best to plug this into a variable and loop through, like (i = 1 for i, #endofCSV do splitcmd), or is there a more efficient way to do this? I am more familiar with Lua, and I am learning that the more I work with Python, the more it differs from Lua.
Please help. Thanks!
Does this solve your problem?
#!/usr/bin/env python
#-*- coding:utf-8 -*-
myString = 'thisismystr=thisisanemail@addy.com,blah,blah,blah'
a = myString.split('=')
b = []
for i in a:
    # split each piece on commas and collect all the parts in one list
    b.extend(i.split(','))
print(b)
I believe you want the email out of strings in this format: 'thisismystr=thisisanemail@addy.com,blah,blah,blah'
This is how you would do that:
line = 'thisismystr=thisisanemail@addy.com,blah,blah,blah'
# take everything after the '=', then everything before the first ','
email = line.split('=')[1].split(',')[0]
print(email)
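For the second part of the question (doing this for many lines), a plain loop or list comprehension over the lines is usually enough. The sketch below assumes every line has the key=email,rest shape shown above. The second sample line and the extract_email helper are hypothetical, added only for illustration.

```python
def extract_email(line):
    """Return the text between the first '=' and the following ','."""
    after_equals = line.split("=", 1)[1]
    return after_equals.split(",", 1)[0]


# Sample lines: the first is from the question, the second is made up.
lines = [
    "thisismystr=thisisanemail@addy.com,blah,blah,blah",
    "otherstr=someone@example.com,foo,bar",
]

emails = [extract_email(line) for line in lines]
print(emails)
```

When reading from a real file, the same helper can be applied to each line as you iterate over the open file object, so nothing has to be indexed by hand as it would be in Lua.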

Counting characters that print, not processed, in a string

I'm dealing with an annoying issue using foreign characters (ģ,č,ŗ,ļ,ā,ē,ū,ī,ņ,š,ķ,ž and their capitals). len reports a length other than 1 for each of them; for example, len('ī') is 2 (it shows up as \xc4\xab when processing text). I would like a function that gives back 1 for all those characters. Any help?
Kudos to Robᵩ for the explanatory webpage. A concise solution to my problem:
def varlen(string):
    return len(string.decode('utf-8'))
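The solution above relies on Python 2, where a plain string holds bytes. As a hedged sketch of the same idea in Python 3, where str already counts characters and only bytes need decoding, something like the following should behave equivalently. The isinstance check is an addition of this example, not part of the original answer.

```python
def varlen(data):
    """Count characters, decoding UTF-8 first if given raw bytes."""
    if isinstance(data, bytes):
        data = data.decode('utf-8')
    return len(data)


text = '\u012b'  # the character ī from the question
encoded = text.encode('utf-8')  # the two bytes \xc4\xab

print(len(encoded))      # 2: length in bytes
print(varlen(encoded))   # 1: length in characters
print(varlen(text))      # 1: str is already counted per character
```

Both calls count code points, so a letter written as a base character plus a combining mark would still count as 2.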

[Python]How to deal with a string ending with one backslash?

I'm getting some content from the Twitter API, and I have a little problem: I sometimes get a tweet ending with a single backslash.
More precisely, I'm using simplejson to parse the Twitter stream.
How can I escape this backslash?
From what I have read, such a raw string shouldn't exist...
Even if I add one backslash (two, in fact), I still get an error, as I suspected (since I have an odd number of backslashes).
Any idea?
I can just forget about these tweets too, but I'm still curious about that.
Thanks : )
Prefixing a string literal with r (which stands for "raw") makes Python treat backslashes inside it literally instead of as the start of an escape sequence. For example:
print(r'\b\n\\')
will output
\b\n\\
Have I understood the question correctly?
I guess you are looking for a method similar to stripslashes in PHP. So, here you go:
Python version of PHP's stripslashes
You can try using raw strings by prepending an r (so nothing has to be escaped) to the string or re.escape().
I'm not really sure what you need considering I haven't seen the text of the response. If none of the methods you come up with on your own or get from here work, you may have to forget about those tweets.
Unless you update your question and come back with a real problem, I'm asserting that you don't have an issue except confusion.
You get the string from the Twitter API, ergo the string does not show up in your code. “Raw strings” exist only in your code, and it is “raw strings” in code that can't end in a backslash.
Consider this:
def some_obscure_api():
    "This exists in a library, so you don't know what it does"
    return r"hello" + "\\"  # addition just for fun
my_string = some_obscure_api()
print(my_string)
See? my_string happily ends in a backslash and your code couldn't care less.
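To make the same point with parsed JSON, here is a small sketch. It uses the standard library's json module rather than simplejson (an assumption made so the example needs no install), and the payload is made up, not real Twitter data.

```python
import json

# Made-up payload: in JSON source, a backslash in a string is written as \\.
payload = r'{"text": "tweet ending in a backslash \\"}'

tweet = json.loads(payload)
text = tweet["text"]

# The parsed value is an ordinary string that ends in exactly one backslash.
print(text)
print(text.endswith("\\"))

# Serializing it again escapes the backslash for us.
print(json.dumps(tweet) == payload)
```

Note that the raw-string literal used for payload ends in "}" rather than a backslash. The restriction on trailing backslashes applies only to how literals are written in source code, never to values produced at runtime.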
