Force python to use ' instead of " for strings - python

I have a script, written in Python and SQL using the psycopg2 library, that migrates data from one database to another.
I retrieve a string from the first database and store it in a list so I can put it into the second database once I have gathered all the data I need.
If the string has an apostrophe in it, then Python represents the string using " ". The problem is that SQL interprets " " as quoting a column name and ' ' as quoting a string, whereas Python treats both as string delimiters. I want to force Python to use apostrophes to represent the string (or find another suitable workaround).
Google has not turned up anything. I can't even find a mention of the fact that Python uses " " when there are apostrophes in the string. I have considered replacing the apostrophes in my string with a different character and converting them back later, but this seems like a clumsy solution.
For example
MyString = 'it\'s'
MyList = [MyString]
print(MyList) # prints ["it's"]
print(MyList[0]) # prints it's
When I insert the new values into the database, I insert the whole list as the values.
INSERT INTO table VALUES MyList
This is where the error crops up because the string is using " " instead of ' '.
A solution on either the python or sql side would work.
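A note on what is happening here: the double quotes are not part of the string. Printing a list shows the repr() of each element, and repr() picks double quotes when the text contains an apostrophe. A minimal sketch reusing the names from the question (list_text and element_repr are names introduced only for illustration):

```python
MyString = 'it\'s'
MyList = [MyString]

# Printing a list shows repr() of each element; repr() chooses double
# quotes here only because the text itself contains an apostrophe.
list_text = str(MyList)
element_repr = repr(MyString)

print(list_text)      # ["it's"]
print(element_repr)   # "it's"
print(MyString)       # it's

# The stored value has no quote characters around it at all.
print(len(MyString))  # 4
```

So the problem only appears when the printed form of the list is pasted into the SQL text.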

Found a fix. It's a bit janky but it works. Convert the list into a string. Use the replace function like so:
MyString = MyString.replace('"',"'")
And then use that string instead.
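Another possible workaround, sketched here on the assumption that the goal is simply to get the list into one row: pass the values through placeholders and let the driver do the quoting. The standard-library sqlite3 module is used so the example runs without a server, and the table name migrated and its columns are hypothetical. psycopg2 follows the same pattern but uses %s as the placeholder instead of ?.

```python
import sqlite3

MyList = ["it's", 'say "hi"', 42]

conn = sqlite3.connect(":memory:")
# Hypothetical table, only for this illustration.
conn.execute("CREATE TABLE migrated (a TEXT, b TEXT, c INTEGER)")

# One placeholder per value; the values never become part of the SQL text.
placeholders = ", ".join("?" for _ in MyList)
conn.execute(f"INSERT INTO migrated VALUES ({placeholders})", MyList)

row = conn.execute("SELECT a, b, c FROM migrated").fetchone()
print(row)
conn.close()
```

Because the values are bound rather than formatted into the statement, it no longer matters which quote character Python would use to display them.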

Related

How to convert a regular string to a raw string? [duplicate]

I have a string s, its contents are variable. How can I make it a raw string? I'm looking for something similar to the r'' method.
I believe what you're looking for is the str.encode() method with the unicode_escape codec (string-escape in Python 2). For example, if you have a variable that you want to turn into a 'raw string':
a = '\x89'
a.encode('unicode_escape')
b'\\x89'
Note: Use string-escape for Python 2.x and older versions; in Python 3 the call above returns a bytes object.
I was searching for a similar solution and found the solution via:
casting raw strings python
Raw strings are not a different kind of string. They are a different way of describing a string in your source code. Once the string is created, it is what it is.
Since strings in Python are immutable, you cannot "make it" anything different. You can however, create a new raw string from s, like this:
raw_s = r'{}'.format(s)
As of Python 3.6, you can use the following (similar to @slashCoder):
def to_raw(string):
    return fr"{string}"
my_dir = "C:\data\projects"
to_raw(my_dir)
yields 'C:\\data\\projects'. I'm using it on a Windows 10 machine to pass directories to functions.
Raw strings apply only to string literals. They exist so that you can more conveniently express strings that would otherwise be modified by escape sequence processing. This is especially useful when writing regular expressions or other forms of code in string literals. If you want a unicode string without escape processing in Python 2, just prefix it with ur, like ur'somestring' (Python 3 does not accept the ur prefix; a plain r'somestring' is already a Unicode string there).
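To make the point about literals concrete, here is a small sketch (all variable names are illustrative): a raw literal and an ordinary literal with doubled backslashes produce the same str object, and the r prefix only changes how the source text is read.

```python
# Two spellings of the same value in source code.
raw_literal = r"C:\data\new"
escaped_literal = "C:\\data\\new"

same_value = raw_literal == escaped_literal
same_type = type(raw_literal) is str

# Without the r prefix, \n in a literal becomes one newline character.
raw_len = len(r"\n")    # backslash followed by n
cooked_len = len("\n")  # a single newline

print(same_value, same_type, raw_len, cooked_len)  # True True 2 1
```

Once a string has been created from a non-raw literal, the escape processing has already happened, so there is nothing left to "make raw".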
For Python 3, the way to do this that doesn't add double backslashes and simply preserves \n, \t, etc. is:
a = 'hello\nbobby\nsally\n'
a = a.encode('unicode-escape').decode().replace('\\\\', '\\')
print(a)
Which gives a value that can be written as CSV:
hello\nbobby\nsally\n
There doesn't seem to be a solution for other special characters, however, that may get a single \ before them. It's a bummer. Solving that would be complex.
For example, to serialize a pandas.Series containing lists of strings with special characters to a text file in the format BERT expects, with a CR between each sentence and a blank line between each document:
with open('sentences.csv', 'w') as f:
    current_idx = 0
    for idx, doc in sentences.items():
        # Insert a newline to separate documents
        if idx != current_idx:
            f.write('\n')
        # Write each sentence exactly as it appeared, one per line
        for sentence in doc:
            f.write(sentence.encode('unicode-escape').decode().replace('\\\\', '\\') + '\n')
This outputs (for the Github CodeSearchNet docstrings for all languages tokenized into sentences):
Makes sure the fast-path emits in order.
@param value the value to emit or queue up\n@param delayError if true, errors are delayed until the source has terminated\n@param disposable the resource to dispose if the drain terminates
Mirrors the one ObservableSource in an Iterable of several ObservableSources that first either emits an item or sends\na termination notification.
Scheduler:\n{@code amb} does not operate by default on a particular {@link Scheduler}.
@param the common element type\n@param sources\nan Iterable of ObservableSource sources competing to react first.
A subscription to each source will\noccur in the same order as in the Iterable.
@return an Observable that emits the same sequence as whichever of the source ObservableSources first\nemitted an item or sent a termination notification\n@see ReactiveX operators documentation: Amb
...
Just format like that:
s = "your string"; raw_s = r'{0}'.format(s)
With a small correction to @Jolly1234's answer, here is the code:
raw_string=path.encode('unicode_escape').decode()
s = "hel\nlo"
raws = '%r' % s  # conversion to an escaped ("raw") string
# print(raws) will print 'hel\nlo' with single quotes.
print(raws[1:-1])  # will print hel\nlo without single quotes.
# raws[1:-1] slices off the surrounding quotes
The solution that worked for me was:
fr"{original_string}"
Suggested in comments by @ChemEnger
I suppose the repr function can help you:
s = 't\n'
repr(s)
"'t\\n'"
repr(s)[1:-1]
't\\n'
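The two approaches above (the unicode_escape codec and slicing repr()) agree for simple input but differ once the text contains quotes. A minimal sketch, with escape_text as a hypothetical helper name:

```python
def escape_text(text):
    """Return text with newlines, tabs, etc. written as backslash escapes."""
    return text.encode("unicode_escape").decode("ascii")

# Simple case: both approaches give the same seven characters.
plain = escape_text("hel\nlo")
via_repr = repr("hel\nlo")[1:-1]

# Text containing both quote characters: repr() adds a backslash before
# the apostrophe, while the codec leaves quotes alone.
tricky = 'a"b\'c'
tricky_codec = escape_text(tricky)
tricky_repr = repr(tricky)[1:-1]

print(plain)
print(via_repr)
print(tricky_codec)
print(tricky_repr)
```

If the escaped text is going to be written to a file, the codec version avoids depending on how repr() chooses its quoting.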
Just simply use the encode function.
my_var = 'hello'
my_var_bytes = my_var.encode()
print(my_var_bytes)
And then to convert it back to a regular string, do this:
my_var_bytes = b'hello'
my_var = my_var_bytes.decode()
print(my_var)
--EDIT--
The following does not make the string raw but instead encodes it to bytes and decodes it.

Properly Escaping String Python

I'm trying to use selenium to locate a checkbox within an unordered list:
ul=browser.find_element_by_xpath('//*[@id="TestTake"]/div[2]/div/div/ol/li[{}]/div[2]/ul'.format(num))
checkbox_id=ul.find_element_by_xpath("//*[contains(text(),'{}')]".format(correct.replace("'","\'"))).get_attribute("for")
The problem with:
"//*[contains(text(),'{}')]".format(correct.replace("'","\'"))).get_attribute("for"
occurs when correct is equal to L' or something else that contains a quote.
How can I properly escape the single quote? I'm not sure whether correct will contain a quote, so I need to handle both cases, including a double quote.
Also, this is the only approach available, because I can only get the id from the for attribute, which I find by using the element's contents.
Turns out that string concatenation did the trick:
checkbox_id = ul.find_element_by_xpath("//*[contains(text(),\"" + correct + "\")]").get_attribute("for")
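The concatenation above still breaks if correct contains a double quote. XPath 1.0 string literals have no escape character, so a common workaround is to pick whichever quote the text does not contain and to fall back to concat() when it contains both. A minimal sketch, where xpath_literal is a hypothetical helper and no Selenium call is made:

```python
def xpath_literal(text):
    """Return text as an XPath 1.0 string literal (or concat() expression)."""
    if "'" not in text:
        return "'" + text + "'"
    if '"' not in text:
        return '"' + text + '"'
    # Both quote types present: split on apostrophes and glue the pieces
    # back together with concat(), quoting each apostrophe with "'".
    parts = text.split("'")
    return "concat(" + ", \"'\", ".join("'" + p + "'" for p in parts) + ")"

only_single = xpath_literal("L'")
both_quotes = xpath_literal("a'b\"c")
xpath = "//*[contains(text(), {})]".format(both_quotes)

print(only_single)
print(both_quotes)
print(xpath)
```

The resulting xpath string could then be passed to the same find_element_by_xpath call used in the question.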

Insert invisible unicode into MySQL using python3 but encountered duplicate

When I insert the device data into MySQL (v5.5.6) using Python (v3.2), I encounter a problem.
This is device A (it contains three Unicode characters and a blank space):
'\u202d\u202d \u202d'
And device B (It is only a blank space):
' '
The problem is that when I insert all the device data into MySQL, the error is
Duplicate entry 'activate_device-20151201-1-5740-01000P---‭‭ ‭--' for key 'PRIMARY'
I guess MySQL has handled the '\u202d' itself (a Unicode character to reverse a string, maybe?).
How can I simulate in Python 3 what MySQL does?
How can I avoid the duplicate?
The expected result is to translate '\u202d\u202d \u202d' to ' ' in Python 3.
Please help me.
There are some ambiguities here. Do you want to keep only the visible ASCII characters, or also visible Unicode characters?
If you want to keep only visible ASCII characters, the simple way is to use Python's built-in string module.
import string
new_string = "".join(filter(lambda x:x in string.printable, original_string))
For your specific use case, a space is part of visible ASCII, so the above will convert both '\u202d\u202d \u202d' and ' ' to ' '.
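As a runnable sketch of the filter above applied to the two device names from the question (keep_printable_ascii is a name introduced here; note that string.printable also contains tab, newline and a few other whitespace characters, not only visible ones):

```python
import string

def keep_printable_ascii(text):
    """Drop every character that is not in string.printable."""
    return "".join(ch for ch in text if ch in string.printable)

device_a = "\u202d\u202d \u202d"  # three U+202D characters and a space
device_b = " "

cleaned_a = keep_printable_ascii(device_a)
cleaned_b = keep_printable_ascii(device_b)

print(repr(cleaned_a), repr(cleaned_b))
print(cleaned_a == cleaned_b)
```

After cleaning, both names compare equal in Python, which makes it possible to detect the collision before the insert reaches MySQL.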

String Delimiter in Python

I want to split a string using "},{" as the delimiter. I have tried various things but none of them work.
string="2,1,6,4,5,1},{8,1,4,9,6,6,7,0},{6,1,2,3,9},{2,3,5,4,3 "
Split it into something like this:
2,1,6,4,5,1
8,1,4,9,6,6,7,0
6,1,2,3,9
2,3,5,4,3
string.split("},{") works at the Python console, but if I write a Python script that does this operation it does not work.
You need to assign the result of string.split("},{") to a new string. For example:
string2 = string.split("},{")
I think that is the reason you think it works at the console but not in scripts. In the console it just prints out the return value, but in the script you want to make sure you use the returned value.
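A short sketch of that difference, using the string from the question (text, parts and rows are illustrative names): split() returns a new list and leaves the original string untouched, so the result has to be stored.

```python
text = "2,1,6,4,5,1},{8,1,4,9,6,6,7,0},{6,1,2,3,9},{2,3,5,4,3"

# Calling split() without keeping the result changes nothing in a script.
text.split("},{")
still_has_delimiter = "},{" in text

# Keep the returned list and work with that instead.
parts = text.split("},{")
rows = [[int(n) for n in part.split(",")] for part in parts]

for part in parts:
    print(part)
```

The second split on "," is optional; it is included only to show that each piece can be processed further once the list has been kept.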
You need to return the string back to the caller. Assigning to the string parameter doesn't change the caller's variable, so those changes are lost.
def convert2list(string):
    string = string.strip()
    # Drop the outer "{{" and "}}" before splitting.
    string = string[2:len(string)-2].split("},{")
    # Return to caller.
    return string
# Grab return value.
converted = convert2list("{{1,2},{3,4}}")
You could do it in steps:
Split at commas to get "{...}" strings.
Remove leading and trailing curly braces.
It might not be the most Pythonic or efficient, but it's general and doable.
I was taking the input from the console in the form of arguments to the script.
So when I passed the input as {{2,4,5},{1,9,4,8,6,6,7},{1,2,3},{2,3}}, it did not arrive properly in arg[1], so the split was basically working on an empty string.
If I run the below code from a script file (in Python 2.7):
string="2,1,6,4,5,1},{8,1,4,9,6,6,7,0},{6,1,2,3,9},{2,3,5,4,3 "
print(string.split("},{"))
Then the output I got is:
['2,1,6,4,5,1', '8,1,4,9,6,6,7,0', '6,1,2,3,9', '2,3,5,4,3 ']
And the below code also works fine:
string="2,1,6,4,5,1},{8,1,4,9,6,6,7,0},{6,1,2,3,9},{2,3,5,4,3 "
def convert2list(string):
    string=string.strip()
    string=string[:len(string)].split("},{")
    print(string)
convert2list(string)
Use this:
This will split the string using },{ as the delimiter and print the items of the list on separate lines.
string = "2,1,6,4,5,1},{8,1,4,9,6,6,7,0},{6,1,2,3,9},{2,3,5,4,3"
for each in string.split('},{'):
    print(each)
Output:
2,1,6,4,5,1
8,1,4,9,6,6,7,0
6,1,2,3,9
2,3,5,4,3
If you only want to print the list of split items, you can use this simple print option.
string = "2,1,6,4,5,1},{8,1,4,9,6,6,7,0},{6,1,2,3,9},{2,3,5,4,3"
print(string.split('},{'))
Output:
['2,1,6,4,5,1', '8,1,4,9,6,6,7,0', '6,1,2,3,9', '2,3,5,4,3']
Quite simply, you have to use the split() method with "},{" as the delimiter, then print the items one by one (because the result will be a list),
like the following:
parts = string.split("},{")
for i in range(0, len(parts)):
    print(parts[i])

How can I replace double and single quotations in a string efficiently?

I'm parsing an XML file and inserting it into a database.
However, since some of the text contains double or single quotation marks, I'm having problems with the insertion. Currently I'm using the code shown below, but it seems inefficient.
s = s.replace('"', ' ')
s = s.replace("'", ' ')
Is there any way I can insert the text without replacing these quotation marks?
OR
Is there an efficient way to substitute them?
Thanks!
Why can't you insert strings containing quote marks into your database? Is there some weird data type that permits any character except a quote mark? Or are you building an insert statement with literal strings, rather than binding your strings to query parameters as you should be doing?
If you're doing
cursor.execute('insert into mytable (somefield) values ("%s")' % (mystring))
then that's unsafe and wrong. Instead, you should be doing
cursor.execute('insert into mytable (somefield) values (%(myparam)s)',
               dict(myparam=mystring))
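To show that bound parameters really do carry quote characters through unchanged, here is a small sketch using the standard-library sqlite3 module so it runs without a server. It reuses the answer's names mytable, somefield and myparam; sqlite3 writes a named placeholder as :myparam, whereas the %(myparam)s form above is the style used by drivers such as MySQLdb and psycopg2.

```python
import sqlite3

mystring = '''I said, "Don't do that"'''

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (somefield TEXT)")

# The value is passed separately from the SQL text, so neither kind of
# quote needs to be replaced or escaped by hand.
conn.execute(
    "INSERT INTO mytable (somefield) VALUES (:myparam)",
    {"myparam": mystring},
)

stored = conn.execute("SELECT somefield FROM mytable").fetchone()[0]
conn.close()

print(stored == mystring)
```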
You should use str.translate instead of doing two replace() calls (shown here for Python 3, where the table comes from str.maketrans and both arguments must have the same length):
>>> quotes_to_spaces = str.maketrans('"\'', '  ')
>>> s = s.translate(quotes_to_spaces)
You could try something like _mysql.escape_string():
>>> import _mysql
>>> a = '''I said, "Don't do that"'''
>>> a
'I said, "Don\'t do that"'
>>> _mysql.escape_string(a)
'I said, \\"Don\\\'t do that\\"'
However, the manual recommends using connection.escape_string(), but I think you need a database connection first.
