Storing int or str in the list - python

I created a text file and opened it in Python using:
for word_in_line in open("test.txt"):
to loop through the words in a line of the text file.
The text file only has one line, which is:
int 111 = 3 ;
When I make a list using .split():
print("Input: {}".format(word_in_line))
line_list = word_in_line.split()
It creates:
['int', '111', '=', '3', ';']
And I was looking for a way to check if line_list[1] ('111') is an integer.
But when I try type(line_list[1]), it says that it's str because of the quotes.
My goal is to read through the text file and determine whether each token is an integer, a str, or some other data type.

What you have in your list is a string, so the type you are getting is correct and expected.
What you are looking to do is check whether your string consists entirely of digits. To do that, use the isdigit string method:
line_list[1].isdigit()
If all you want to validate is that the value is purely digits, this solution provides exactly that.
There could be other cases where you want to check whether you have some kind of number, for example 10.5. This is where isdigit will fail. For cases like that, you can take a look at this answer that provides an approach to check whether you have a float.
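As a quick illustration of where isdigit succeeds and where it does not, here is a minimal sketch. The sample tokens are made up for this example and are not taken from the question's file.

```python
# Sample tokens are illustrative, not taken from the question's file.
tokens = ['111', '10.5', '-3', '']

# str.isdigit() is True only for non-empty strings made entirely of digits,
# so a decimal point, a sign, or an empty string all make it return False.
results = {token: token.isdigit() for token in tokens}
print(results)
```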

I don't agree with the above answer.
Any string parsing like @idjaw's answer of line_list[1].isdigit() will fail on an odd edge case. For example, what if the number is a float like .50 that starts with a dot? The above approach won't work. Technically we only care about ints in this example so this won't matter, but in general it is dangerous.
In general if you are trying to check whether a string is a valid number, it is best to just try to convert the string to a number and then handle the error accordingly.
def isNumber(string):
    try:
        val = int(string)
        return True
    except ValueError:
        return False
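The same try-and-convert idea can be extended to the original goal of labelling each token. The sketch below is only one possible approach; classify_token is a hypothetical helper name, and it treats anything that float() accepts as a float.

```python
def classify_token(token):
    """Return 'int', 'float', or 'str' depending on what the token parses as."""
    # Try the strictest conversion first, then fall back to the looser one.
    for type_name, converter in (('int', int), ('float', float)):
        try:
            converter(token)
            return type_name
        except ValueError:
            continue
    return 'str'


line_list = 'int 111 = 3 ;'.split()
kinds = [classify_token(token) for token in line_list]
print(list(zip(line_list, kinds)))
```

With this ordering, a token such as '.50' fails the int conversion and is then picked up by the float conversion.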


How to convert a regular string to a raw string? [duplicate]

I have a string s, its contents are variable. How can I make it a raw string? I'm looking for something similar to the r'' method.
I believe what you're looking for is the str.encode("string-escape") function. For example, if you have a variable that you want to turn into a 'raw string':
a = '\x89'
a.encode('unicode_escape')
'\\x89'
Note: Use string-escape for Python 2.x and older versions.
I was searching for a similar solution and found the solution via:
casting raw strings python
Raw strings are not a different kind of string. They are a different way of describing a string in your source code. Once the string is created, it is what it is.
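A small sketch of that point: the r prefix only changes how the literal is read from source code, and the resulting object is an ordinary str. The path used here is a made-up example.

```python
# A raw literal and a regular literal with escaped backslashes
# contain exactly the same characters.
raw_literal = r'C:\new\table'
escaped_literal = 'C:\\new\\table'

print(raw_literal == escaped_literal)  # compares the contents
print(type(raw_literal) is str)        # there is no separate "raw" type

# Without the r prefix, \n and \t are turned into a newline and a tab,
# so the resulting string is shorter.
processed_literal = 'C:\new\table'
print(len(raw_literal), len(processed_literal))
```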
Since strings in Python are immutable, you cannot "make it" anything different. You can however, create a new raw string from s, like this:
raw_s = r'{}'.format(s)
As of Python 3.6, you can use the following (similar to @slashCoder):
def to_raw(string):
    return fr"{string}"
my_dir = "C:\data\projects"
to_raw(my_dir)
yields 'C:\\data\\projects'. I'm using it on a Windows 10 machine to pass directories to functions.
Raw strings apply only to string literals. They exist so that you can more conveniently express strings that would be modified by escape sequence processing. This is especially useful when writing out regular expressions, or other forms of code in string literals. If you want a unicode string without escape processing in Python 2, just prefix it with ur, like ur'somestring'.
For Python 3, the way to do this that doesn't add double backslashes and simply preserves \n, \t, etc. is:
a = 'hello\nbobby\nsally\n'
escaped = a.encode('unicode-escape').decode().replace('\\\\', '\\')
print(escaped)
Which gives a value that can be written as CSV:
hello\nbobby\nsally\n
There doesn't seem to be a solution for other special characters, however, that may get a single \ before them. It's a bummer. Solving that would be complex.
For example, to serialize a pandas.Series containing a list of strings with special characters into a text file in the format BERT expects, with a CR between each sentence and a blank line between each document:
with open('sentences.csv', 'w') as f:
    current_idx = 0
    for idx, doc in sentences.items():
        # Insert a newline to separate documents
        if idx != current_idx:
            f.write('\n')
        # Write each sentence exactly as it appeared to one line each
        for sentence in doc:
            f.write(sentence.encode('unicode-escape').decode().replace('\\\\', '\\') + '\n')
This outputs (for the Github CodeSearchNet docstrings for all languages tokenized into sentences):
Makes sure the fast-path emits in order.
@param value the value to emit or queue up\n@param delayError if true, errors are delayed until the source has terminated\n@param disposable the resource to dispose if the drain terminates
Mirrors the one ObservableSource in an Iterable of several ObservableSources that first either emits an item or sends\na termination notification.
Scheduler:\n{@code amb} does not operate by default on a particular {@link Scheduler}.
@param the common element type\n@param sources\nan Iterable of ObservableSource sources competing to react first.
A subscription to each source will\noccur in the same order as in the Iterable.
@return an Observable that emits the same sequence as whichever of the source ObservableSources first\nemitted an item or sent a termination notification\n@see ReactiveX operators documentation: Amb
...
Just format like that:
s = "your string"; raw_s = r'{0}'.format(s)
With a small correction to @Jolly1234's answer, here is the code:
raw_string=path.encode('unicode_escape').decode()
s = "hel\nlo"
raws = '%r' % s  # conversion to raw string
# print(raws) will print 'hel\nlo' with single quotes.
print(raws[1:-1])  # will print hel\nlo without single quotes.
# raws[1:-1] slices off the surrounding quotes
The solution that worked for me was:
fr"{original_string}"
Suggested in comments by @ChemEnger
I suppose the repr function can help you:
s = 't\n'
repr(s)
"'t\\n'"
repr(s)[1:-1]
't\\n'
Just simply use the encode function.
my_var = 'hello'
my_var_bytes = my_var.encode()
print(my_var_bytes)
And then to convert it back to a regular string, do this:
my_var_bytes = b'hello'
my_var = my_var_bytes.decode()
print(my_var)
--EDIT--
The following does not make the string raw but instead encodes it to bytes and decodes it.

Creating one for loop for 3 strings

So I have a string called 'Number' with the value 'abf573'. The task is to find out whether the string 'Number' contains only characters and digits from the hexadecimal system.
My plan was to make a for loop that goes through each position of the string 'Number' and checks with an if statement whether it is a hexadecimal character. To check that, I thought about writing down A-F, a-f and 0-9 in lists or separate strings.
My problem now is that I have never done something like this in Python. I know how to write for loops and if/else/elif statements, but I don't know how to apply them to this problem.
It would be nice if someone could give me a hint on how to do it, or tell me whether my way of thinking is even right.
I find it quite smart and fast to try to convert this string into an integer using int(), and to handle the exception ValueError which occurs if it is not possible.
Here is the beautiful short code:
my_string = 'abf573'
try:
    result = int(my_string, 16)
    print("OK")
except ValueError:
    print("NOK")
Strings are iterables. So, you can write:
Number = '12ab'
for character in Number:
    if character in 'abcdef':
        print('it is HEX')
Also, there is an isdigit method on strings, so your "number is hex" check is not Number.isdigit()
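The character-by-character plan from the question can be sketched with the standard string.hexdigits constant, which holds 0-9, a-f and A-F. This is only one way to do it; is_hex is a hypothetical helper name, and treating an empty string as invalid is an assumption.

```python
import string


def is_hex(text):
    """True if text is non-empty and every character is 0-9, a-f, or A-F."""
    return bool(text) and all(ch in string.hexdigits for ch in text)


for candidate in ['abf573', '12AB', 'xyz', '']:
    print(repr(candidate), is_hex(candidate))
```

Unlike int(text, 16), this check rejects a '0x' prefix and surrounding whitespace, both of which int accepts.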

Import string that looks like a list "[0448521958, +61439800915]" from JSON into Python and make it an actual list?

I am extracting a string out of a JSON document using python that is being sent by an app in development. This question is similar to some other questions, but I'm having trouble just using x = ast.literal_eval('[0448521958, +61439800915]') due to the plus sign.
I'm trying to get each phone number as a string in a python list x, but I'm just not sure how to do it. I'm getting this error:
raise ValueError('malformed string')
ValueError: malformed string
Your problem is not just the +.
The first number starts with 0, which makes it an octal literal. Octal only supports the digits 0-7, but the number ends with 8 (and also contains other digits greater than 7).
But it turns out your problems don't stop there.
You can use a regex to fix the plus:
import ast
import re
fixed_string = re.sub(r'\+(\d+)', r'\1', '[0445521757, +61439800915]')
ast.literal_eval(fixed_string)
I don't know what you can do about the octal number problem, however.
I think the problem is that ast.literal_eval is trying to interpret the phone numbers as numbers instead of strings. Try this:
s = '[0448521958, +61439800915]'
s.strip('[]').split(', ')
Result:
['0448521958', '+61439800915']
Technically that string isn't valid JSON. If you want to ignore the +, you could strip it out of the file or string before you evaluate it. If you want to preserve it, you'll have to enclose the value with quotes.
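If the app can be changed to quote the values, the standard json module parses them straight into a list of strings. The payload below is an assumed example of what the corrected output might look like, not something taken from the question.

```python
import json

# Assumed payload: the app sends each number as a quoted JSON string.
payload = '["0448521958", "+61439800915"]'

# json.loads keeps quoted values as str, so the leading 0 and the + survive.
numbers = json.loads(payload)
print(numbers)
```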

How to strip letters out of a string and compare values?

I have just learned Python for this project I am working on and I am having trouble comparing two values - I am using the Python xlwt and xlrd libraries and pulling values of cells from the documents. The problem is some of the values are in the format 'NP_000000000', 'IPI00000000.0', and '000000000' so I need to check which format the value is in and then strip the characters and decimal points off if necessary before comparing them.
I have tried using S1[:3] to get the value without alphabet characters, but I get a 'float is not subscriptable' error.
Then I tried doing re.sub(r'[^\d.]+', '', S1) but I get a TypeError: expected a string or buffer.
I figured the value returned via sheet.cell(x, y).value would be a string since it is alphanumeric, but it seems it is being returned as a float.
What is the best way to format these values and then compare them?
Are you trying to get the numbers from strings in the format shown, like getting 2344 from NP_2344? If yes, then use this:
float(str(S1)[3:])
to get what you want. You can change float to int.
It sounds like the API you're using is returning different types depending on the content of the cells. You have two options.
You can convert everything to a string and then do what you're currently doing:
s = str(S1)
...
You can check the types of the input and act appropriately:
if isinstance(S1, basestring):  # basestring is Python 2; use str on Python 3
    # this is a string, strip off the prefix
    pass
elif isinstance(S1, float):
    # this is a float, just use it
    pass
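Putting the two options together, one possible sketch reduces every cell value to an int before comparing. normalize_id is a hypothetical helper, the sample values are made up, and dropping everything after the first decimal point is an assumption about what the suffix means in your data.

```python
import re


def normalize_id(value):
    """Reduce a cell value to an int so different formats can be compared."""
    if isinstance(value, float):
        # Numeric cells come back as floats, e.g. 123.0
        return int(value)
    text = str(value)
    # Drop anything after the first '.', then keep the digits only.
    digits = re.sub(r'\D', '', text.split('.')[0])
    if not digits:
        raise ValueError('no digits in {!r}'.format(value))
    return int(digits)


cells = ['NP_000000123', 'IPI00000123.0', 123.0, '000000123']
normalized = [normalize_id(cell) for cell in cells]
print(normalized)
```

Converting to int discards leading zeros, which is fine for equality checks but not if the zero-padded form has to be preserved.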

Python - Writing to a text file using functions?

I wrote a simple function to write into a text file, like this:
def write_func(var):
    var = str(var)
    myfile.write(var)
a= 5
b= 5
c= a + b
write_func(c)
This will write the output to the desired file.
Now I want the output in another format, say:
write_func("Output is :"+c)
so that the output will have a meaningful name in the file. How do I do it?
And why is it that I can't write an integer to a file? Do I have to do int = str(int) before writing to a file?
You can't add/concatenate a string and integer directly.
If you do anything more complicated than "string :"+str(number), I would strongly recommend using string formatting:
write_func('Output is: %i' % (c))
Python is a strongly typed language. This means, among other things, that you cannot concatenate a string and an integer. Therefore you'll have to convert the integer to string before concatenating. This can be done using a format string (as Nick T suggested) or passing the integer to the built in str function (as NullUserException suggested).
Simple, you do:
write_func('Output is' + str(c))
You have to convert c to a string before you can concatenate it with another string. Then you can also take off the:
var = str(var)
From your function.
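The suggestions in these answers can be compared side by side in a small sketch. Here io.StringIO stands in for the question's myfile so the example runs without touching the disk, and write_func takes the file as a parameter, which is a change from the original function.

```python
import io


def write_func(out, value):
    # file.write() needs a string, so convert once here.
    out.write(str(value))


c = 5 + 5
buffer = io.StringIO()  # stands in for the question's myfile

write_func(buffer, 'Output is: ' + str(c))       # explicit conversion
write_func(buffer, '\nOutput is: %i' % c)        # %-formatting
write_func(buffer, '\nOutput is: {}'.format(c))  # str.format

print(buffer.getvalue())
```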
Why is it that I can't write an integer to a file? Do I have to do int = str(int) before writing to a file?
You can write binary data to a file, but byte representations of numbers aren't really human readable. -2 for example is 0xfffffffe in a 2's complement 32-bit integer. It's even worse when the number is a float: 2.1 is 0x40066666.
If you plan on having a human-readable file, you need to write human-readable characters to it. In an ASCII file, '0.5' isn't a number (at least not as a computer understands numbers), but the characters '0', '.' and '5'. And that's why you need to convert your numbers to strings.
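The byte values quoted above can be reproduced with the standard struct module. This is a minimal sketch that assumes big-endian 32-bit layouts, which is what the hex values in the answer correspond to.

```python
import struct

# Big-endian 32-bit signed integer and 32-bit float, shown as hex.
int_bytes = struct.pack('>i', -2)
float_bytes = struct.pack('>f', 2.1)
print(int_bytes.hex(), float_bytes.hex())

# The text form of 0.5 is just three character codes.
char_codes = [ord(ch) for ch in '0.5']
print(char_codes)
```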
From http://docs.python.org/library/stdtypes.html#file.write
file.write(str)
Write a string to the file. There is no return value. Due to buffering, the string may not actually show up in the file until the flush() or close() method is called.
Note how the documentation specifies that write's argument must be a string.
So you should create a string yourself before passing it to file.write().
