Python strip() doesn't seem to work as expected - python

I apologize for the basic question, but I'm experiencing a weird problem with a simple script that seems correct but doesn't work as expected:
#!/usr/bin/python
import json,sys
obj=json.load(sys.stdin)
oem=obj["_id"]
models = obj.get("modelli", 0)
if models != 0:
    for marca in obj["modelli"]:
        brand=obj["modelli"][marca]
        for serie in brand:
            ser=brand[serie]
            for modello in ser:
                model=modello
                marca = marca.strip()
                modello = modello.strip()
                serie = serie.strip()
                print oem,";",marca,";",serie,";",modello
It should just loop over an array from a JSON variable and print the output in CSV format, but I still get one whitespace character at the beginning and at the end of each variable (oem, marca, serie, modello), like this:
KD-CH884 ; Dell ; ; 966
This is my very first Python script and I've just followed some simple instructions, so am I missing something?
Any guesses?

The print statement is putting in that whitespace.
From the docs here:
A space is written before each object is (converted and) written,
unless the output system believes it is positioned at the beginning of
a line.
Use ';'.join(...) instead.

Python is actually stripping the whitespace. It's just the print statement:
print oem,";",marca,";",serie,";",modello
that is reintroducing the spaces. Try concatenating the variables and printing the result.

';'.join(filter(None, [oem, marca, serie, modello]))
It will only place a semicolon between non-empty strings. If a variable holds the empty string '' after being stripped, filter(None, ...) drops it from the list, because filter(None, ...) removes every falsy item.
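As a minimal sketch of the two join variants (written for Python 3; the helper name csv_line and the sample row are made up for illustration), the difference looks like this:

```python
def csv_line(fields, skip_empty=False):
    """Join stripped fields with ';' and no padding spaces."""
    cleaned = [field.strip() for field in fields]
    if skip_empty:
        # filter(None, ...) drops falsy items such as the empty string.
        cleaned = list(filter(None, cleaned))
    return ";".join(cleaned)


# Hypothetical row: the third field is empty after stripping.
row = ["KD-CH884", " Dell ", "", "966 "]
print(csv_line(row))                   # KD-CH884;Dell;;966
print(csv_line(row, skip_empty=True))  # KD-CH884;Dell;966
```

Note that dropping empty fields shifts the remaining columns to the left, so for real CSV output the plain join is probably the safer choice.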

try this:
print "%s;%s;%s;%s" % (oem,marca,serie,modello)
or
print ";".join([oem,marca,serie,modello])

Related

Issue using a variable with an r-string in Python

Fairly new to Python, and I've got a batch job that I now have to start saving some extracts from out to a company Sharepoint site. I've searched around and cannot seem to find a solution to the issue I keep running into. I need to pass a date into the filename, and was first having issues with using a normal string. If I just type out the entire thing as a raw string, I get the output I want:
x = r"\\mnt4793\DavWWWRoot\sites\GlobalSupply\Plastics\DataExtracts\2021-02-15_aRoute.xlsx"
print (x)
The output is: \\mnt4793\DavWWWRoot\sites\GlobalSupply\Plastics\DataExtracts\2021-02-15_aRoute.xlsx
However, if I break the string into its parts so I can get a parameter in there, I wind up having to add an extra double quote to "x" to keep the code from failing with "SyntaxError: EOL while scanning string literal":
x = r"\\mnt4793\DavWWWRoot\sites\GlobalSupply\Plastics\DataExtracts\""
timestamp = date_time_obj.date().strftime('%Y-%m-%d')
filename = "_aRoute.xlsx"
print (x + timestamp + filename)
But the output I get passes that unwanted double quote into my string: \\mnt4793\DavWWWRoot\sites\GlobalSupply\Plastics\DataExtracts\"2021-02-15_aRoute.xlsx
The syntax I need is clearly escaping me; I'm just trying to get the path built so I can save the file itself. If it happens to matter, I'm using pandas to write the file:
data = pandas.read_sql(sql, cnxn)
data.to_excel(string_goes_here)
Any help would be greatly appreciated!
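Per the comment from @Matthias, it turns out that a raw string can't end with a single backslash (more precisely, with an odd number of backslashes), because the backslash still escapes the closing quote. The quick workaround, therefore, was: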
x = r"\\mnt4793\DavWWWRoot\sites\GlobalSupply\Plastics\DataExtracts" + "\\"
The comment from @sammywemmy also linked to what looks to be a much more thorough solution.
Thank you both!
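A small sketch of that workaround with the date filled in (Python 3; the fixed date below stands in for date_time_obj.date() from the question and is only an assumption for the example):

```python
import datetime

# A raw string cannot end in an odd number of backslashes, so the
# trailing separator is appended as a normal, escaped string.
base_dir = r"\\mnt4793\DavWWWRoot\sites\GlobalSupply\Plastics\DataExtracts" + "\\"

# Stand-in for date_time_obj.date() from the question (assumed value).
extract_date = datetime.date(2021, 2, 15)

path = base_dir + extract_date.strftime("%Y-%m-%d") + "_aRoute.xlsx"
print(path)
```

The resulting string contains no stray double quote and can be passed to data.to_excel(...) as in the question.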

Python - \n appearing in concatenated strings

I've been having an issue with my Python code. I am trying to concatenate the value of two string objects, but when I run the code it keeps printing a '\n' between the two strings.
My code:
while i < len(valList):
    curVal = valList[i]
    print(curVal)
    markupConstant = 'markup.txt'
    markupFileName = curVal + markupConstant
    markupFile = open(markupFileName)
Now when I run this, it gives me this error:
OSError: [Errno 22] Invalid argument: 'cornWhiteTrimmed\nmarkup.txt'
See that \n between the two strings? I've dissected the code a bit by printing each string individually, and neither one contains a \n on its own. Any ideas as to what I'm doing wrong?
Thanks in advance!
The concatenation itself doesn't add the \n for sure. valList is probably the result of calling readlines() on a file object, so each element in it will have a trailing \n. Call strip on each element before using it:
while i < len(valList):
    curVal = valList[i].strip()
    print(curVal)
    markupConstant = 'markup.txt'
    markupFileName = curVal + markupConstant
    markupFile = open(markupFileName)
You don't see the \n when you print the strings because \n is the newline character: printing it just moves the output to a new line instead of showing anything visible. In the middle of a file name, however, it makes the name invalid. The fix is the strip method, which removes leading and trailing whitespace, including the newline; see its documentation here (https://www.tutorialspoint.com/python/string_strip.htm).
Just to make an addition to the other answers explaining why this came about:
When you need to actually inspect what characters a string contains, you can't simply print it. Many characters are "invisible" when printed.
Turn the string into a list first:
list(curVal)
Or my personal favorite:
[c for c in curVal]
These will create lists that properly show all hard to see characters.
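Another option, sketched here for Python 3, is repr(), which shows escape sequences without building a list. The value of cur_val below is a hypothetical line as readlines() would return it:

```python
# Hypothetical line as returned by readlines(); note the trailing newline.
cur_val = "cornWhiteTrimmed\n"

print(cur_val)             # the newline is invisible, it only ends the line
print(repr(cur_val))       # 'cornWhiteTrimmed\n'
print(list(cur_val)[-3:])  # ['e', 'd', '\n']

# Stripping before concatenating gives a usable file name.
markup_file_name = cur_val.strip() + "markup.txt"
print(repr(markup_file_name))  # 'cornWhiteTrimmedmarkup.txt'
```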

Delete last printed character python

I am writing a program in Python and want to replace the last character printed in the terminal with another character.
Pseudo code is:
print "Ofen",
print "\b", # NOT NECESSARILY \b, BUT the wanted print statement that will erase the last character printed
print "r"
I'm using Windows 8, Python 2.7, and the regular interpreter.
None of the options I have seen so far worked for me (such as \010, '\033[#D' (# is 1), and '\r').
These options were suggested in other Stack Overflow questions or other resources and don't seem to work for me.
EDIT: using sys.stdout.write doesn't change the effect either. It just doesn't erase the last printed character. Instead, when using sys.stdout.write, my output is:
Ofenr # with a square before 'r'
My questions:
Why don't these options work?
How do I achieve the desired output?
Is this related to Windows OS or Python 2.7?
Once I find out how to do it, can the same technique also erase the '\n' that Python's print statement appends?
When using print in python a line feed (aka '\n') is added. You should use sys.stdout.write() instead.
import sys
sys.stdout.write("Ofen")
sys.stdout.write("\b")
sys.stdout.write("r")
sys.stdout.flush()
Output: Ofer
You can also import the print function from Python 3. The optional end argument can be any string that will be added. In your case it is just an empty string.
from __future__ import print_function # Only needed in Python 2.X
print("Ofen",end="")
print("\b",end="") # NOT NECESSARILY \b, BUT the wanted print statement that will erase the last character printed
print("r")
Output
Ofer
I think string slicing would help you. Keep the text in a variable and print everything except its last character, followed by the replacement.
For instance:
x = "Ofen"
print (x[:-1] + "r")
would give you the result
Ofer
Hope this helps. :)
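To see what the backspace approach actually sends, here is a rough Python 3 sketch. The helper write_with_correction is a made-up name, and whether the cursor really moves back depends on the terminal, which may be why some consoles show a square instead:

```python
import io
import sys


def write_with_correction(stream, text, replacement):
    """Write text, then backspace once and overwrite the last character."""
    stream.write(text)
    stream.write("\b" + replacement)  # "\b" only asks the terminal to step left
    stream.write("\n")
    stream.flush()


write_with_correction(sys.stdout, "Ofen", "r")

# The backspace is still part of the data; only a terminal can hide it.
buffer = io.StringIO()
write_with_correction(buffer, "Ofen", "r")
print(repr(buffer.getvalue()))  # 'Ofen\x08r\n'
```

Because the character is never removed from the stream itself, building the final string first (as in the slicing answer) is the more portable choice.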

String Delimiter in Python

I want to split a string using "},{" as the delimiter. I have tried various things but none of them work.
string="2,1,6,4,5,1},{8,1,4,9,6,6,7,0},{6,1,2,3,9},{2,3,5,4,3 "
Split it into something like this:
2,1,6,4,5,1
8,1,4,9,6,6,7,0
6,1,2,3,9
2,3,5,4,3
string.split("},{") works at the Python console, but if I do the same operation in a Python script it does not work.
You need to assign the result of string.split("},{") to a variable; split() returns a new list and leaves the original string unchanged. For example:
string2 = string.split("},{")
I think that is the reason you think it works at the console but not in scripts. In the console it just prints out the return value, but in the script you want to make sure you use the returned value.
You need to return the string back to the caller. Assigning to the string parameter doesn't change the caller's variable, so those changes are lost.
def convert2list(string):
    string = string.strip()
    # Drop the outer "{{" and "}}" before splitting.
    string = string[2:len(string)-2].split("},{")
    # Return to caller.
    return string
# Grab return value.
converted = convert2list("{{1,2},{3,4}}")
You could do it in steps:
Split at commas to get "{...}" strings.
Remove leading and trailing curly braces.
It might not be the most Pythonic or efficient, but it's general and doable.
I was taking the input from the console in the form of arguments to the script....
So when I was taking the input as {{2,4,5},{1,9,4,8,6,6,7},{1,2,3},{2,3}} it was not coming properly in the arg[1] .. so the split was basically splitting on an empty string ...
If I run the below code from a script file (in Python 2.7):
string="2,1,6,4,5,1},{8,1,4,9,6,6,7,0},{6,1,2,3,9},{2,3,5,4,3 "
print string.split("},{")
Then the output I got is:
['2,1,6,4,5,1', '8,1,4,9,6,6,7,0', '6,1,2,3,9', '2,3,5,4,3 ']
And the below code also works fine:
string="2,1,6,4,5,1},{8,1,4,9,6,6,7,0},{6,1,2,3,9},{2,3,5,4,3 "
def convert2list(string):
    string=string.strip()
    string=string[:len(string)].split("},{")
    print string
convert2list(string)
Use This:
This will split the string considering },{ as a delimiter and print the list with line breaks.
string = "2,1,6,4,5,1},{8,1,4,9,6,6,7,0},{6,1,2,3,9},{2,3,5,4,3"
for each in string.split('},{'):
    print each
Output:
2,1,6,4,5,1
8,1,4,9,6,6,7,0
6,1,2,3,9
2,3,5,4,3
If you want to print the split items in the list only you can use this simple print option.
string = "2,1,6,4,5,1},{8,1,4,9,6,6,7,0},{6,1,2,3,9},{2,3,5,4,3"
print string.split('},{')
Output:
['2,1,6,4,5,1', '8,1,4,9,6,6,7,0', '6,1,2,3,9', '2,3,5,4,3']
Quite simply, use the split() method with "},{" as the delimiter, keep the list it returns, and then print its items one by one, like the following:
parts = string.split("},{")
for i in range(0, len(parts)):
    print(parts[i])
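Putting the answers together, a minimal Python 3 sketch could look like this. The function name parse_groups and the conversion to integers are assumptions made for illustration, not part of the original question:

```python
def parse_groups(text):
    """Split '{{a,b},{c}}'-style text into a list of lists of ints."""
    inner = text.strip()
    # Remove only the outer "{{" and "}}" if they are present.
    if inner.startswith("{{") and inner.endswith("}}"):
        inner = inner[2:-2]
    groups = inner.split("},{")  # split() returns a new list
    return [[int(item) for item in group.split(",")] for group in groups]


print(parse_groups("{{2,4,5},{1,9,4,8,6,6,7},{1,2,3},{2,3}}"))
# [[2, 4, 5], [1, 9, 4, 8, 6, 6, 7], [1, 2, 3], [2, 3]]
print(parse_groups("2,1,6,4,5,1},{8,1,4,9,6,6,7,0},{6,1,2,3,9},{2,3,5,4,3 "))
# [[2, 1, 6, 4, 5, 1], [8, 1, 4, 9, 6, 6, 7, 0], [6, 1, 2, 3, 9], [2, 3, 5, 4, 3]]
```

The function returns its result instead of printing it, so the caller decides what to do with the list.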

Python: 2.6 and 3.1 string matching inconsistencies

I wrote my module in Python 3.1.2, but now I have to validate it for 2.6.4.
I'm not going to post all my code since it may cause confusion.
Brief explanation:
I'm writing an XML parser (my first interaction with XML) that creates objects from the XML file. There are a lot of objects, so I have a 'unit test' that manually scans the XML and tries to find a matching object. It will print out anything that doesn't have a match.
I open the XML file and use a simple 'for' loop to read line-by-line through the file. If I match a regular expression for an 'application' (XML has different 'application' nodes), then I add it to my dictionary, d, as the key. I perform a lxml.etree.xpath() query on the title and store it as the value.
After I go through the whole thing, I iterate through my dictionary, d, and try to match the key to my value (I have to use the get() method from my 'application' class). Any time a mismatch is found, I print the key and title.
Python 3.1.2 has all matching items in the dictionary, so nothing is printed. In 2.6.4, every single value is printed (~600 in all). I can't figure out why my string comparisons aren't working.
Without further ado, here's the relevant code:
for i in d:
    if i[1:-2] != d[i].get('id'):
        print('X%sX Y%sY' % (i[1:-3], d[i].get('id')))
I slice the strings because the strings are different. Where the key would be "9626-2008olympics_Prod-SH"\n the value would be 9626-2008olympics_Prod-SH, so I have to cut the quotes and newline. I also added the Xs and Ys to the print statements to make sure that there wasn't any kind of whitespace issues.
Here is an example line of output:
X9626-2008olympics_Prod-SHX Y9626-2008olympics_Prod-SHY
Remember to ignore the Xs and Ys. Those strings are identical. I don't understand why Python2 can't match them.
Edit:
So the problem seems to be the way that I am slicing.
In Python3,
if i[1:-2] != d[i].get('id'):
this comparison works fine.
In Python2,
if i[1:-3] != d[i].get('id'):
I have to change the offset by one.
Why would strings need different offsets? The only possible thing that I can think of is that Python2 treats a newline as two characters (i.e. '\' + 'n').
Edit 2:
Updated with requested repr() information.
I added a small amount of code to produce the repr() info from the "2008olympics" example above. I have not done any slicing. It actually looks like it might not be a unicode issue. There is now a "\r" character.
Python2:
'"9626-2008olympics_Prod-SH"\r\n'
'9626-2008olympics_Prod-SH'
Python3:
'"9626-2008olympics_Prod-SH"\n'
'9626-2008olympics_Prod-SH'
Looks like this file was created/modified on Windows. Is there a way in Python2 to automatically suppress '\r'?
You are printing i[1:-3] but comparing i[1:-2] in the loop.
Very Important Question
Why are you writing code to parse XML when lxml will do all that for you? The point of unit tests is to test your code, not to ensure that the libraries you are using work!
Russell Borogrove is right.
Python 3 opens text files with universal newlines by default, so the line ending is read as the single character \n. That's why my offset of [1:-2] worked in 3: I needed to eliminate three characters, the two double quotes and \n.
In Python 2, the Windows line ending arrives as the two characters \r\n, meaning I have to eliminate four characters and use [1:-3].
I just added a manual check for the Python major version.
Here is the fixed code:
for i in d:
    # The keys in D contain quotes and a newline which need
    # to be removed. In v3, newline = 1 char and in v2,
    # newline = 2 char.
    if sys.version_info[0] < 3:
        if i[1:-3] != d[i].get('id'):
            print('%s %s' % (i[1:-3], d[i].get('id')))
    else:
        if i[1:-2] != d[i].get('id'):
            print('%s %s' % (i[1:-2], d[i].get('id')))
Thanks for the responses everyone! I appreciate your help.
repr() and %r format are your friends ... they show you (for basic types like str/unicode/bytes) exactly what you've got, including type.
Instead of
print('X%sX Y%sY' % (i[1:-3], d[i].get('id')))
do
print('%r %r' % (i, d[i].get('id')))
Note leaving off the [1:-3] so that you can see what is in i before you slice it.
Update after comment "You are perfectly right about comparing the wrong slice. However, once I change it, python2.6 works, but python3 has the problem now (i.e. it doesn't match any objects)":
How are you opening the file (two answers please, for Python 2 and 3). Are you running on Windows? Have you tried getting the repr() as I suggested?
Update after actual input finally provided by OP:
If, as it appears, your input file was created on Windows (lines are separated by "\r\n"), you can read Windows and *x text files portably by using the "universal newlines" option: open('datafile.txt', 'rU') on Python 2. Universal newlines mode is the default in Python 3. The Python 3 docs of that time allowed 'rU' there too, which would save you from testing which Python version you are running; note that the 'U' flag was later deprecated and removed in Python 3.11, where plain 'r' already gives the same behavior.
I don't understand what you're doing exactly, but would you try using strip() instead of slicing and see whether it helps?
for i in d:
    stripped = i.strip()
    if stripped != d[i].get('id'):
        print('X%sX Y%sY' % (stripped, d[i].get('id')))
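Note that strip() on its own removes the line ending but not the double quotes around the key. A possible sketch that handles both, independent of the Python version and of the line-ending style (the helper name normalize_key is made up; the sample strings are the repr() values posted in the question):

```python
def normalize_key(raw_line):
    """Drop the line ending (LF or CRLF) and the surrounding double quotes."""
    return raw_line.strip().strip('"')


expected_id = "9626-2008olympics_Prod-SH"
windows_line = '"9626-2008olympics_Prod-SH"\r\n'  # repr() seen under Python 2
unix_line = '"9626-2008olympics_Prod-SH"\n'       # repr() seen under Python 3

print(normalize_key(windows_line) == expected_id)  # True
print(normalize_key(unix_line) == expected_id)     # True
```

With the keys normalized this way, no version check and no hard-coded slice offsets should be needed.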
