I came across an example in a Python textbook that used word[1:2] to slice a string. It was doing it this way to demonstrate that only one letter of the string would be sliced.
It got me thinking - is there ever a use case where one would use word[1:2] instead of just word[1], which returns the same result?
The unwritten rule is that slicing returns a sub-sequence and subscription returns an element. It just happens that for strings a one-character sub-sequence and a single element are the same thing. But there is a subtle API difference: slicing a string cannot raise an IndexError:
>>> s = "x"
>>> s[1:2]
''
>>> s[1]
IndexError: string index out of range
On rare occasions this can be convenient when you want to make a check and avoid the possibility of an unhandled exception.
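As a small illustration of that convenience, here is a sketch of a check that uses a slice so that empty input needs no special case. The function name starts_with_vowel is made up for this example.

```python
def starts_with_vowel(word):
    # word[:1] is '' for an empty string, whereas word[0] would raise IndexError.
    # A tuple is used on purpose: '' in "aeiou" would be True (substring test).
    return word[:1].lower() in ("a", "e", "i", "o", "u")


print(starts_with_vowel("apple"))  # True
print(starts_with_vowel("kiwi"))   # False
print(starts_with_vowel(""))       # False, and no exception
```

With word[0] instead of word[:1], the last call would need its own length check or a try/except.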
Perhaps also worth mentioning here: there is a more significant difference with bytestrings, for which slicing again returns substrings but subscription returns ordinals.
>>> b'xyz'[1]
121
>>> b'xyz'[1:2]
b'y'
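If you need to go from one form to the other, the conversions are short. A minimal sketch using only built-ins:

```python
data = b'xyz'

ordinal = data[1]          # subscription gives an int
one_byte = data[1:2]       # slicing gives bytes

print(ordinal)             # 121
print(one_byte)            # b'y'
print(bytes([ordinal]))    # b'y'  (int -> single-byte bytes)
print(one_byte[0])         # 121   (bytes -> int)
print(chr(ordinal))        # y     (int -> str, fine here because it is ASCII)
```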
Assuming that you're working with valid indices, for string slicing there's no difference, because a single element of a string is still itself a string (i.e. there's no difference between a "character" and a string that's one character long).
>>> word = 'asdf'
>>> word[1:2]
's'
>>> word[1]
's'
For other sliceable objects (e.g. a list) the two may not be equivalent:
>>> word = ['a', 's', 'd', 'f']
>>> word[1:2]
['s']
>>> word[1]
's'
word[1] is different from word[1:2]. word[1] returns the value of the list item at index 1, whereas word[1:2] is a list slice, which returns a new list containing only the item at index 1.
Example:
word = ['a', 'b', 'c']
print(word[1])
#output: b
#but also
print(word[1:2])
#output: ['b']
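One more consequence worth seeing, sketched below: the slice is a new list, so changing it leaves the original alone, and (as with strings) an out-of-range slice is simply empty.

```python
word = ['a', 'b', 'c']

piece = word[1:2]      # new list holding the item at index 1
piece.append('z')      # modifies only the new list

print(piece)           # ['b', 'z']
print(word)            # ['a', 'b', 'c']
print(word[5:6])       # [] rather than an IndexError
```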
I need to know how to identify prefixes in strings in a list. For example,
list = ['nomad', 'normal', 'nonstop', 'noob']
The answer should be 'no', since every string in the list starts with 'no'.
I was wondering if there is a method that iterates over the letters of all the strings in the list at the same time and checks whether the letters match.
Use os.path.commonprefix; it will do exactly what you want.
In [1]: list = ['nomad', 'normal', 'nonstop', 'noob']
In [2]: import os.path as p
In [3]: p.commonprefix(list)
Out[3]: 'no'
As an aside, naming a list "list" shadows the built-in list class for the rest of that scope, so I would recommend using a different variable name.
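One thing to keep in mind, shown in the sketch below: despite living in os.path, commonprefix compares character by character, so when given paths it can return something that is not a valid directory. The example paths are made up for illustration.

```python
import os.path

words = ['nomad', 'normal', 'nonstop', 'noob']
print(os.path.commonprefix(words))          # no

# Character-wise comparison: the result stops in the middle of a path component.
paths = ['/usr/lib', '/usr/local']
print(os.path.commonprefix(paths))          # /usr/l

# An empty list gives an empty string.
print(repr(os.path.commonprefix([])))       # ''
```

For plain strings, as in the question, the character-wise behaviour is exactly what is wanted.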
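Here is a version without libraries:
l = ['nomad', 'normal', 'nonstop', 'noob']
prefix = l[0]  # fallback: the whole first string is a common prefix
for i in range(len(l[0]) + 1):
    # stop at the first length i where some string disagrees with l[0]
    if False in [l[0][:i] == j[:i] for j in l]:
        prefix = l[0][:i-1]
        break
print(prefix)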
gives output:
no
There is no built-in function to do this. If you are looking for short Python code that can do this for you, here's my attempt:
def longest_common_prefix(words):
    if not words:
        return ''
    i = 0
    # the length bound keeps the loop finite when every word is identical
    while i <= len(words[0]) and len(set([word[:i] for word in words])) <= 1:
        i += 1
    return words[0][:i-1]
Explanation: words is an iterable of strings. The list comprehension
[word[:i] for word in words]
uses string slices to take the first i letters of each string. At the beginning, these would all be empty strings. Then, it would consist of the first letter of each word. Then the first two letters, and so on.
Casting to a set removes duplicates. For example, set([1, 2, 2, 3]) = {1, 2, 3}. By casting our list of prefixes to a set, we remove duplicates. If the length of the set is less than or equal to one, then they are all identical.
The counter i just keeps track of how many letters are identical so far.
We return words[0][:i-1]. We arbitrarily choose the first word and take the first i-1 letters (which would be the same for any word in the list). The reason that it's i-1 and not i is that i gets incremented before we check whether all of the words still share the same prefix.
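To make the explanation concrete, here is a small trace of the set of prefixes for the first few values of i, using the word list from the question:

```python
words = ['nomad', 'normal', 'nonstop', 'noob']

for i in range(4):
    prefixes = set(word[:i] for word in words)
    print(i, sorted(prefixes))

# 0 ['']
# 1 ['n']
# 2 ['no']
# 3 ['nom', 'non', 'noo', 'nor']
```

The set first grows beyond one element at i = 3, which is why the common prefix is the first 2 letters.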
Here's a fun one:
l = ['nomad', 'normal', 'nonstop', 'noob']
def common_prefix(lst):
    for s in zip(*lst):
        if len(set(s)) == 1:
            yield s[0]
        else:
            return
result = ''.join(common_prefix(l))
Result:
'no'
To answer the spirit of your question - zip(*lst) is what allows you to "iterate letters in every string in the list at the same time". For example, list(zip(*lst)) would look like this:
[('n', 'n', 'n', 'n'), ('o', 'o', 'o', 'o'), ('m', 'r', 'n', 'o'), ('a', 'm', 's', 'b')]
Now all you need to do is find out the common elements, i.e. the len of set for each group, and if they're common (len(set(s)) == 1) then join it back.
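The same idea can also be written with itertools.takewhile, which stops at the first column that is not uniform. This is only an alternative sketch; the function name is made up.

```python
from itertools import takewhile


def common_prefix_takewhile(strings):
    # zip(*strings) yields one tuple of characters per position and stops
    # at the end of the shortest string.
    columns = zip(*strings)
    # Keep columns only while every character in the column is the same.
    shared = takewhile(lambda column: len(set(column)) == 1, columns)
    return ''.join(column[0] for column in shared)


print(common_prefix_takewhile(['nomad', 'normal', 'nonstop', 'noob']))  # no
print(common_prefix_takewhile(['ab', 'abc']))                           # ab
```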
As an aside, you probably don't want to call your list by the name list. Any time you call list() afterwards is gonna be a headache. It's bad practice to shadow built-in names.
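Here is roughly what that headache looks like. The shadowing is done inside a made-up function so that the built-in stays usable in the rest of the example:

```python
def shadow_demo():
    list = ['nomad', 'normal']       # local variable hides the built-in list
    try:
        list('abc')                  # tries to call the list object
    except TypeError as exc:
        return type(exc).__name__


print(shadow_demo())    # TypeError
print(list('abc'))      # ['a', 'b', 'c'] (the built-in is untouched outside the function)
```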
def duplicate_count(s):
    return len([c for c in set(s.lower()) if s.lower().count(c)>1])
I'm having difficulty understanding how this code works. I was doing a codewars challenge to return the number of elements with duplicates in a string.
e.g. Asasd --> 2
I came up with my own implementation, but I wasn't able to really understand what this code does. If anyone could point me in the right direction, it would be much appreciated :)
This is, first of all, a highly inefficient solution to the problem. But let's break it down:
s.lower() would convert all the characters in a string to lower case:
In [1]: s = "Hello, WORLD"
In [2]: s.lower()
Out[2]: 'hello, world'
set(s.lower()) would create a set (make sure to read about what sets are) of characters from a string - eliminating all the duplicates:
In [3]: set(s.lower())
Out[3]: {' ', ',', 'd', 'e', 'h', 'l', 'o', 'r', 'w'}
for c in set(s.lower()) iterates over every single character in a set we created above
for every character in this set, we apply this if condition: if s.lower().count(c)>1. The count(c) here counts how many times c appears in the string. The >1 keeps only the characters that occur more than once in the string
[c for c in set(s.lower()) if s.lower().count(c)>1] is called a list comprehension. It is basically a short way of creating a list. Here, we are creating a list of characters that occur in a string more than one time. Check out this topic about how to verbalize and read the list comprehensions.
len() then just gets us the length of the list
To summarize, you iterate over the unique characters in a given string and count which of them occur in a string more than one time.
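Applied to the "Asasd" example from the question, the intermediate values look like this. This is just the one-liner unrolled into named steps; the variable names are made up.

```python
s = "Asasd"
lowered = s.lower()

# Count each distinct character once.
counts = {c: lowered.count(c) for c in set(lowered)}
duplicates = sorted(c for c, n in counts.items() if n > 1)

print(lowered)                   # asasd
print(sorted(counts.items()))    # [('a', 2), ('d', 1), ('s', 2)]
print(duplicates)                # ['a', 's']
print(len(duplicates))           # 2
```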
set(s.lower()) # gives unique elements in lower case
and
s.lower().count(c)>1 #checks if an element shows up more than once
All in all, the function finds the number of non-unique elements in a string, ignoring case.
I believe using collections.Counter is more efficient:
In [7]: from collections import Counter
In [8]: sum(v > 1 for v in Counter("Aabcc".lower()).values())
Out[8]: 2
def duplicate_count(s):
    result = []
    for c in set(s.lower()):
        if s.lower().count(c) > 1:
            result.append(c)
    return len(result)
Can someone please help me with some simple code that converts a list which has been stored as a string back into a list? The string is NOT like this:
a = u"['a','b','c']"
but the variable is like this:
a = '[a,b,c]'
So,
list(a)
would yield the following output
['[', 'a', ',', 'b', ',', 'c', ']']
Instead, I want the output to be like this:
['a', 'b', 'c']
I have even tried using the ast.literal_eval() function - on using which I got a ValueError exception stating the argument is a malformed string.
There is no standard library function that'll load such a list. But you can trivially do this with string processing:
a.strip('[]').split(',')
would give you your list.
str.strip() will remove any of the given characters from the start and end; so it'll remove any and all [ and ] characters from the start until no such characters are found anymore, then remove the same characters from the end. That suffices nicely for your input sample.
str.split() then splits the remainder (minus the [ and ] characters at either end) into separate strings at any point there is a comma:
>>> a = '[a,b,c]'
>>> a.strip('[]')
'a,b,c'
>>> a.strip('[]').split(',')
['a', 'b', 'c']
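If the real input might contain spaces around the items, or might be an empty "[]", a slightly more defensive variant could look like the sketch below. The function name parse_bracket_list is made up, and it still assumes the items themselves contain no commas or brackets.

```python
def parse_bracket_list(text):
    # Drop surrounding whitespace, then the enclosing brackets.
    inner = text.strip().strip('[]')
    # '[]' would otherwise split into [''], so handle it explicitly.
    if not inner.strip():
        return []
    # Split on commas and trim whitespace around each item.
    return [item.strip() for item in inner.split(',')]


print(parse_bracket_list('[a,b,c]'))        # ['a', 'b', 'c']
print(parse_bracket_list('[ a, b ,c ]'))    # ['a', 'b', 'c']
print(parse_bracket_list('[]'))             # []
```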
Let us use a hack.
import ast
import string
x = "[a,b,c]"
for char in x:
    if char in string.ascii_lowercase:
        x = x.replace(char, "'%s'" % char)
# Now x is "['a','b','c']"
lst = ast.literal_eval(x)  # safer than eval() for parsing literals
This checks whether each character is a lowercase letter; if it is, it replaces it with the same character wrapped in single quotes.
Why not use this solution?
Fails for duplicate elements.
Fails for elements with more than a single character.
You need to be careful about mixing up single and double quotes.
Why use this solution?
There are no reasons to use this solution rather than Martijn's. But it was fun coding it anyway.
I wish you luck in your problem.
I have been trying to figure out a simple way to replace the digits within a string with x's in Python. I was able to get something OK by doing the following:
In [73]: string = "martian2015"
In [74]: string = list(string)
In [75]: for n, i in enumerate(string):
   ....:     try:
   ....:         if isinstance(int(i), int):
   ....:             string[n]='x'
   ....:     except ValueError:
   ....:         continue
This actually yields something like the following:
In [81]: string
Out[81]: ['m', 'a', 'r', 't', 'i', 'a', 'n', 'x', 'x', 'x', 'x']
In [86]: joiner = ""
In [87]: string = joiner.join(string)
In [88]: string
Out[88]: 'martianxxxx'
My question is: is there any way of getting the result in a simpler manner without relying on error/exception handling?
Yes, using regex and the re module:
import re
new_string = re.sub(r"\d", "x", "martin2015")
The pattern r"\d" tells re.sub to match every digit in the string (the r prefix makes it a raw string, so the backslash reaches the regex engine unchanged). The second argument is what you want to replace each match with, and the third argument is your input. (re.sub stands for "substitute".)
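A couple of variations that may be useful, as a sketch: replacing a whole run of digits with a single x, and restricting the match to ASCII digits (in str patterns \d also matches decimal digits from other scripts).

```python
import re

print(re.sub(r"\d", "x", "martian2015"))     # martianxxxx
print(re.sub(r"\d+", "x", "martian2015"))    # martianx

# In str patterns \d also matches non-ASCII decimal digits,
# e.g. ARABIC-INDIC DIGIT THREE (U+0663).
text = "a\u0663"
print(re.sub(r"\d", "x", text) == "ax")                     # True
print(re.sub(r"\d", "x", text, flags=re.ASCII) == text)     # True
```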
You can use the str.isdigit function and list comprehension, like this
>>> data = "martian2015"
>>> "".join(["x" if char.isdigit() else char for char in data])
'martianxxxx'
The isdigit method returns True if the string it is called on is non-empty and all of its characters are digits. Here it is called on one character at a time, so if the character is a digit we use "x"; otherwise we use the character itself.
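A caveat if non-ASCII input is possible: isdigit is a bit broader than "something int() accepts". The sketch below uses the superscript two character to show the difference; str.isdecimal is the stricter test.

```python
superscript_two = "\u00b2"   # the character ²

print("7".isdigit())                 # True
print(superscript_two.isdigit())     # True
print(superscript_two.isdecimal())   # False

try:
    int(superscript_two)
except ValueError:
    print("int() rejects it")        # this line runs
```

For plain ASCII input such as "martian2015" the two methods agree.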
You can actually use a generator expression instead of a list comprehension to do the same, like this
>>> "".join("x" if char.isdigit() else char for char in data)
'martianxxxx'
The only difference is that generators are lazily evaluated, unlike the list comprehension, which builds the entire list up front. The generator yields values only on demand. Read more about them here.
But in this particular case, str.join builds a list from the generator internally anyway.
If you are going to do this kind of replacement often, then you might want to know about str.translate and str.maketrans.
>>> mapping = str.maketrans("0123456789", "x" * 10)
>>> "martin2015".translate(mapping)
'martinxxxx'
>>> "10-03-2015".translate(mapping)
'xx-xx-xxxx'
maketrans builds a dictionary that maps the code point of each character in the first string to the code point of the character at the same position in the second string. So when we pass that mapping to translate, every character found in the mapping is replaced with its corresponding value.
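You can print the mapping to see what it contains. A small sketch; the two-character mapping is only there to keep the output short, and string.digits is used to avoid typing the digits out:

```python
import string

mapping = str.maketrans("01", "xy")
print(mapping)    # {48: 120, 49: 121}

# string.digits is '0123456789'.
digit_mask = str.maketrans(string.digits, "x" * len(string.digits))
print("10-03-2015".translate(digit_mask))    # xx-xx-xxxx
```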
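Change isinstance to .isdigit, and assign the result of replace back, because strings are immutable and replace returns a new string:
string = "martian2015"
for i in string:
    if i.isdigit():
        string = string.replace(i, "x")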
(or regular expressions, regex / re )
In [102]: import string
In [103]: mystring
Out[103]: 'martian2015'
In [104]: a='x'*10
In [105]: leet=str.maketrans('0123456789',a)  # Python 3; in Python 2 this was string.maketrans
In [106]: mystring.translate(leet)
Out[106]: 'martianxxxx'
If you are not familiar with more advanced techniques, you can use the string module to pick out the digit characters.
import string
old_string = "martin2015"
new_string = "".join([s if s not in string.digits else "x" for s in old_string])
print(new_string)
# martinxxxx
Although I know my answer is not the best solution, I want to offer a different method to help people solve their problems.
How would I get the first character from the first string in a list in Python?
It seems that I could use mylist[0][1:] but that does not give me the first character.
>>> mylist = []
>>> mylist.append("asdf")
>>> mylist.append("jkl;")
>>> mylist[0][1:]
'sdf'
You almost had it right. The simplest way is
mylist[0][0] # get the first character from the first item in the list
but
mylist[0][:1] # get up to the first character in the first item in the list
would also work.
You want to end after the first character (character zero), not start after the first character (character zero), which is what the code in your question means.
Get the first character of a bare Python string:
>>> mystring = "hello"
>>> print(mystring[0])
h
>>> print(mystring[:1])
h
>>> print(mystring[3])
l
>>> print(mystring[-1])
o
>>> print(mystring[2:3])
l
>>> print(mystring[2:4])
ll
Get the first character from a string in the first position of a Python list:
>>> myarray = []
>>> myarray.append("blah")
>>> myarray[0][:1]
'b'
>>> myarray[0][-1]
'h'
>>> myarray[0][1:3]
'la'
NumPy operations are very different from Python list operations.
Python has list slicing, indexing and subsetting. NumPy has masking, slicing, subsetting, indexing.
These two videos cleared things up for me.
"Losing your Loops, Fast Numerical Computing with NumPy" from PyCon 2015:
https://youtu.be/EEUXKG97YRw?t=22m22s
"NumPy Beginner | SciPy 2016 Tutorial" by Alexandre Chabot LeClerc:
https://youtu.be/gtejJ3RCddE?t=1h24m54s
Indexing in Python starts from 0. You wrote [1:], which never returns the first character; it returns the rest of the string (everything except the first character).
If you have the following structure:
mylist = ['base', 'sample', 'test']
And want to get the first char of the first string (item):
mylist[0][0]
>>> b
For the first chars of all items:
[x[0] for x in mylist]
>>> ['b', 's', 't']
If you have a text:
text = 'base sample test'
text.split()[0][0]
>>> b
Try mylist[0][0]. This should return the first character.
If your list includes non-strings, e.g. mylist = [0, [1, 's'], 'string'], then the answers on here would not necessarily work. In that case, using next() to find the first string by checking for them via isinstance() would do the trick.
next(e for e in mylist if isinstance(e, str))[:1]
Note that ''[:1] returns '' while ''[0] raises IndexError, so depending on the use case, either could be useful.
The above results in StopIteration if there are no strings in mylist. In that case, one possible implementation is to set the default value to None and take the first character only if a string was found.
first = next((e for e in mylist if isinstance(e, str)), None)
first_char = first[0] if first else None
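Wrapped up as a function, that could look like the sketch below. The function name is made up; note that an empty string counts as "found" but is falsy, so it also gives None.

```python
def first_char_of_first_string(items):
    # Find the first element that is a str; None if there is none.
    first = next((e for e in items if isinstance(e, str)), None)
    # An empty string is falsy, so it also yields None here.
    return first[0] if first else None


print(first_char_of_first_string([0, [1, 's'], 'string']))  # s
print(first_char_of_first_string([0, 1.5, []]))             # None
print(first_char_of_first_string(['', 'abc']))              # None
```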