Splitting words into variables

I have been doing some Python coding recently and wanted to do the following:
import shlex
shlex.split("this is a test")
print (shlex.split("this is a test"))
It works, but I want to store the words of the split phrase in separate variables. If anyone can help me, that would be awesome. Thanks!

Like this?
>>> str = "this is a test"
>>> arr = str.split(" ")
>>> arr
['this', 'is', 'a', 'test']
>>> arr[0]
'this'
>>> a = arr[0]
>>> b = arr[1]
>>> c = arr[2]
>>> d = arr[3]
>>> a
'this'

split() returns a list. Since you probably don't know how many words there will be, you can't declare all the individual variables you will need. Instead, you should use the returned list and use it as appropriate:
words = shlex.split("this is a test")
Note that this stores the list of words in a single variable, rather than trying to store each word in its own variable. I suggest you study more about how to manipulate lists in Python.
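If the number of words is known in advance, the list can also be unpacked directly into separate names. A minimal sketch, assuming a four-word phrase; the names first, second, third, fourth, head, and rest are illustrative and not from the question:

```python
import shlex

words = shlex.split("this is a test")

# Unpacking needs exactly as many names as there are items;
# a mismatch raises ValueError.
first, second, third, fourth = words

# A starred name collects whatever is left, so the word count may vary.
head, *rest = words

print(first, fourth)  # this test
print(rest)           # ['is', 'a', 'test']
```

When the word count is not known, keeping the list and indexing into it remains the safer choice.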

Related

How to change the index of an element in a list/array to another position/index without deleting/changing the original element and its value

For example, let's say I have a list as below,
list = ['list4','this1','my3','is2'] or [1,6,'one','six']
Now I want to change the index of each element so that the order matches the numbers, or otherwise makes sense as I see fit (it needn't be numeric). Basically, I want to move each element to whatever index I choose, like so:
list = ['this1','is2','my3','list4'] or ['one',1,'six',6]
How do I do this, whether or not the elements contain numbers?
Please help. Thanks in advance.

If you don't want to use regex and learn its mini-language, use this simpler method:
list1 = ['list4','this1', 'he5re', 'my3','is2']
def mySort(string):
    if any(char.isdigit() for char in string):  # Check whether there is a digit in the string
        # Collect the digits and return the first one (only one number per string is expected)
        return [float(char) for char in string if char.isdigit()][0]
    # No digit: sort last instead of returning None, which cannot be compared with floats
    return float('inf')
list1.sort(key = mySort)
print(list1)
Inspired by this answer: https://stackoverflow.com/a/4289557/11101156

For the first one, it is easy:
>>> lst = ['list4','this1','my3','is2']
>>> lst = sorted(lst, key=lambda x:int(x[-1]))
>>> lst
['this1', 'is2', 'my3', 'list4']
But this assumes that each item is a string whose last character is a digit, and it only works while the numeric part of each item is a single digit; otherwise it breaks. For the second list, you need to define what "as I see fit" means before it can be sorted by any logic.
If there are multiple numeric characters:
>>> import re
>>> lst = ['lis22t4','th2is21','my3','is2']
>>> sorted(lst, key=lambda x:int(re.search(r'\d+$', x).group(0)))
['is2', 'my3', 'lis22t4', 'th2is21']
But you can always do:
>>> lst = [1,6,'one','six']
>>> lst = [lst[2], lst[0], lst[3], lst[1]]
>>> lst
['one', 1, 'six', 6]
Also, don't use Python built-ins as variable names. list is a bad variable name because it shadows the built-in list type.

If you just want to move element in position 'y' to position 'x' of a list, you can try this one-liner, using pop and insert:
lst.insert(x, lst.pop(y))
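As a rough illustration of that one-liner, here is a small sketch. The helper name move_item and its parameter names are hypothetical; the list is the one from the question:

```python
def move_item(items, from_index, to_index):
    """Move one element in place; move_item is a hypothetical helper name."""
    # pop() removes and returns the element, insert() puts it back elsewhere.
    items.insert(to_index, items.pop(from_index))
    return items

lst = ['list4', 'this1', 'my3', 'is2']
move_item(lst, 0, 3)
print(lst)  # ['this1', 'my3', 'is2', 'list4']
```

Note that to_index is interpreted after the pop, that is, against the list that is already one element shorter.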

If you know the order in which you want to rearrange the indexes, you can write simple code:
old_list= ['list4','this1','my3','is2']
order = [1, 3, 2, 0]
new_list = [old_list[idx] for idx in order]
If you can express your logic as a function, you can use sorted() and pass the function as the key:
old_list= ['list4','this1','my3','is2']
def extract_number(string):
    # Join every digit in the string into one number, e.g. 'th2is21' -> 221
    digits = ''.join([c for c in string if c.isdigit()])
    return int(digits)
new_list = sorted(old_list, key = extract_number)
In this case the list is sorted by the number formed by combining the digits found in each string.

a = [1,2,3,4]
def rep(s, l, ab):
    # Move the first occurrence of s in list l to index ab
    idx = l.index(s)
    q = s
    del l[idx]
    l.insert(ab, q)
    return l
l = rep(a[0], a, 2)
print(l)
Hope you like this.
It's much simpler.

Python function to modify string

I was once asked to create a function that, given a string, removes a few characters from it.
Is it possible to do this in Python?
This can be done for lists, for example:
def poplist(l):
    l.pop()
l1 = ['a', 'b', 'c', 'd']
poplist(l1)
print(l1)
>>> ['a', 'b', 'c']
What I want is to do this function for strings.
The only way I can think of doing this is to convert the string to a list, remove the characters and then join it back to a string. But then I would have to return the result.
For example:
def popstring(s):
    copys = list(s)
    copys.pop()
    s = ''.join(copys)
s1 = 'abcd'
popstring(s1)
print(s1)
>>> 'abcd'
I understand why this function doesn't work. The question is more if it is possible to do this in Python or not? If it is, can I do it without copying the string?

Strings are immutable, which means you cannot alter a str object. You can of course construct a new string that is a modification of the old one, but you cannot alter the object that s refers to in your code.
A workaround could be to use a container:
class Container:
    def __init__(self, data):
        self.data = data
popstring is then given a container; it inspects the container and puts something else into it:
def popstring(container):
    container.data = container.data[:-1]
s1 = Container('abcd')
popstring(s1)
But again: you did not change the string object itself; you only put a new string into the container.
Python does not support call by reference, so you cannot call a function:
foo(x)
and have it rebind the variable x: the function receives a copy of the reference to x, so it cannot change what the caller's variable x refers to.
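The usual alternative is to return the new string and let the caller rebind the name. A minimal sketch of the difference, reusing the popstring name from the question; rebind_only and unchanged are hypothetical names added for illustration:

```python
def popstring(s):
    # Slicing builds a new string; the caller's string is left untouched.
    return s[:-1]

def rebind_only(s):
    # Rebinding the local name has no effect outside the function.
    s = s[:-1]

s1 = 'abcd'
rebind_only(s1)
unchanged = s1        # still 'abcd'
s1 = popstring(s1)    # the caller rebinds s1 to the returned string
print(unchanged, s1)  # abcd abc
```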

Strings are immutable, so your main option is to create a new string by slicing and assign it back.
#removing the last char
>>> s = 'abcd'
>>> s = s[:-1]
=> 'abc'
Another easy method may be to use a list and then join its elements to create your string. Of course, it all depends on your preference.
>>> l = ['a', 'b', 'c', 'd']
>>> ''.join(l)
=> 'abcd'
>>> l.pop()
=> 'd'
>>> ''.join(l)
=> 'abc'
In case you want to remove the character at a certain index given by pos (index 2 here), you can slice the string as:
>>> s='abcd'
>>> pos = 2
>>> s = s[:pos] + s[pos+1:]
=> 'abd'
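The same idea extends to removing several characters at once, which is closer to the "remove a few characters" wording of the question. A possible sketch; remove_indices is a hypothetical helper, not something from the answers:

```python
def remove_indices(s, positions):
    """Return a copy of s without the characters at the given indices."""
    drop = set(positions)
    # Keep every character whose index is not in the drop set.
    return ''.join(ch for i, ch in enumerate(s) if i not in drop)

print(remove_indices('abcd', [2]))     # abd
print(remove_indices('abcd', [0, 3]))  # bc
```

Like slicing, this builds a new string, so the result has to be assigned back by the caller.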

You could use bytearray instead:
s1 = bytearray(b'abcd') # NB: must specify encoding if coming from plain string
s1.pop() # now, s1 == bytearray(b'abc')
s1.decode() # returns 'abc'
Caveats:
if you plan to filter arbitrary text (i.e. not pure ASCII), using bytearray is a very bad idea
in this age of concurrency and parallelism, it might be a bad idea to use mutation
By the way, perhaps this is an instance of the XY problem. Do you really need to mutate strings in the first place?
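To make the first caveat concrete, here is a small sketch of what can go wrong with non-ASCII text, assuming UTF-8 encoding. The sample word and the decode_failed flag are illustrative:

```python
text = "caf\u00e9"               # "café"; the last character takes two bytes in UTF-8
buf = bytearray(text, "utf-8")   # 5 bytes for 4 characters
buf.pop()                        # drops only the second byte of that character

try:
    buf.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    # The remaining bytes end in the middle of a multi-byte sequence.
    decode_failed = True

print(len(buf), decode_failed)  # 4 True
```

pop() works on bytes, not characters, so it is only safe when every character is a single byte.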

You can remove part of a string and assign the result to another string:
s = 'abc'
s2 = s[1:]
print(s2)

You can't do that. You can still concatenate, but you can't pop until the string is converted into a list.
>>> s = 'hello'
>>> s+='world'
>>> s
'helloworld'
>>> s.pop()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'pop'
>>> list(s).pop()
'd'
But you can still play with slicing:
>>> s[:-1]
'helloworl'
>>> s[1:]
'elloworld'
>>>

Python disable list string "breaking"

Is there a way to stop list() from breaking a string into characters? For example:
>>> a = "foo"
>>> b = list()
>>> b.append(list(a))
>>> b
[['f', 'o', 'o']]
Is there a way to have a list inside a list that holds the string unbroken, for example [["foo"],["bar"]]?

Very easy:
>>> a = "foo"
>>> b = list()
>>> b.append([a])
>>> b
[['foo']]

Do this:
>>> a = "foo"
>>> b = list()
>>> b.append([a])
>>> b
[['foo']]

The reason this happens is that the list function works by taking each element of the sequence you pass it and putting them in a list. A string in Python is a sequence whose elements are the individual characters.
Having this abstract concept of a "sequence" means that a lot of Python functions can work on multiple data types, as long as they accept a sequence. Once you get used to this idea, hopefully you'll start finding this concept more useful than surprising.
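A short sketch of the difference between calling list() on a sequence and wrapping a value in a list literal; the variable names word and nested are illustrative:

```python
word = "foo"

print(list(word))       # ['f', 'o', 'o']  (list() iterates over the string)
print([word])           # ['foo']          (a literal wraps the string as one item)
print(list((1, 2, 3)))  # [1, 2, 3]        (the same rule applies to a tuple)

# Building the nested structure from the question, one string per inner list.
nested = []
for w in ("foo", "bar"):
    nested.append([w])
print(nested)           # [['foo'], ['bar']]
```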

It sounds like you want to break on word boundaries instead of on each letter.
Try something like
a = "foo bar"
b = list()
b.append(a.split(' ')) # [['foo', 'bar']]
Example with a regular expression (to support multiple consecutive spaces):
import re
a = "foo    bar"
b = list()
b.append(re.split(r'\s+', a)) # [['foo', 'bar']]

Get the first character of the first string in a list?

How would I get the first character from the first string in a list in Python?
It seems that I could use mylist[0][1:] but that does not give me the first character.
>>> mylist = []
>>> mylist.append("asdf")
>>> mylist.append("jkl;")
>>> mylist[0][1:]
'sdf'

You almost had it right. The simplest way is
mylist[0][0] # get the first character from the first item in the list
but
mylist[0][:1] # get up to the first character in the first item in the list
would also work.
You want to end after the first character (character zero), not start after the first character (character zero), which is what the code in your question means.

Get the first character of a bare python string:
>>> mystring = "hello"
>>> print(mystring[0])
h
>>> print(mystring[:1])
h
>>> print(mystring[3])
l
>>> print(mystring[-1])
o
>>> print(mystring[2:3])
l
>>> print(mystring[2:4])
ll
Get the first character from a string in the first position of a python list:
>>> myarray = []
>>> myarray.append("blah")
>>> myarray[0][:1]
'b'
>>> myarray[0][-1]
'h'
>>> myarray[0][1:3]
'la'
NumPy operations are very different from Python list operations.
Python has list slicing, indexing and subsetting. NumPy has masking, slicing, subsetting and indexing.
These two videos cleared things up for me.
"Losing your Loops, Fast Numerical Computing with NumPy" by PyCon 2015:
https://youtu.be/EEUXKG97YRw?t=22m22s
"NumPy Beginner | SciPy 2016 Tutorial" by Alexandre Chabot LeClerc:
https://youtu.be/gtejJ3RCddE?t=1h24m54s

Indexing in Python starts from 0. You wrote [1:], which never returns the first character: it returns the rest of the string (everything except the first character).
If you have the following structure:
mylist = ['base', 'sample', 'test']
and want to get the first character of the first string (item):
mylist[0][0]
>>> b
For the first characters of all items:
[x[0] for x in mylist]
>>> ['b', 's', 't']
If you have a text:
text = 'base sample test'
text.split()[0][0]
>>> b

Try mylist[0][0]. This should return the first character.

If your list includes non-strings, e.g. mylist = [0, [1, 's'], 'string'], then the answers on here would not necessarily work. In that case, using next() to find the first string by checking for them via isinstance() would do the trick.
next(e for e in mylist if isinstance(e, str))[:1]
Note that ''[:1] returns '' while ''[0] raises IndexError, so depending on the use case, either could be useful.
The above raises StopIteration if there are no strings in mylist. In that case, one possible implementation is to set the default value to None and take the first character only if a string was found.
first = next((e for e in mylist if isinstance(e, str)), None)
first_char = first[0] if first else None

Find array item in a string

I know I can use string.find() to find a substring in a string.
But what is the easiest way to find out if one of the array items has a substring match in a string without using a loop?
Pseudocode:
string = 'I would like an apple.'
search = ['apple','orange', 'banana']
string.find(search) # == True

You could use a generator expression (which is still a loop under the hood):
any(x in string for x in search)
The generator expression is the part inside the parentheses. It creates an iterable that yields the value of x in string for each x in the list search. x in string in turn tells you whether string contains the substring x. Finally, the Python built-in any() iterates over the iterable it is passed and returns True if any of its items evaluates to True.
Alternatively, you could use a regular expression to avoid the loop:
import re
re.search("|".join(search), string)
I would go for the first solution, since regular expressions have pitfalls (escaping etc.).
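If the regular expression route is taken anyway, the escaping pitfall can be reduced with re.escape. A hedged sketch; contains_any is a hypothetical helper name, and the second example string is made up for illustration:

```python
import re

def contains_any(text, needles):
    """Return True if any needle occurs literally in text (hypothetical helper)."""
    if not needles:
        # An empty pattern would match everything, so handle this case first.
        return False
    # re.escape makes characters such as '.' or '+' match literally.
    pattern = "|".join(re.escape(n) for n in needles)
    return re.search(pattern, text) is not None

print(contains_any('I would like an apple.', ['apple', 'orange', 'banana']))  # True
print(contains_any('size 3x5', ['3.5']))  # False
```

Without the escaping, the pattern 3.5 would match "3x5", because an unescaped dot matches any character.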

Strings in Python are sequences, and you can do a quick membership test by just asking if one string exists inside of another:
>>> mystr = "I'd like an apple"
>>> 'apple' in mystr
True
Sven got it right in his first answer above. To check if any of several strings exist in some other string, you'd do:
>>> ls = ['apple', 'orange']
>>> any(x in mystr for x in ls)
True
Worth noting for future reference is that the built-in 'all()' function would return true only if all items in 'ls' were members of 'mystr':
>>> ls = ['apple', 'orange']
>>> all(x in mystr for x in ls)
False
>>> ls = ['apple', 'like']
>>> all(x in mystr for x in ls)
True

A simple way is
import re
regx = re.compile('[ ,;:!?.]')
string = 'I would like an apple.'
search = ['apple','orange', 'banana']
print(any(x in regx.split(string) for x in search))
EDIT
Correction, after having read Sven's answer: evidently the string does not need to be split; any(x in string for x in search) works pretty well.
If you want no loop:
import re
regx = re.compile('[ ,;:!?.]')
string = 'I would like an apple.'
search = ['apple','orange', 'banana']
print(regx.split(string))
print(set(regx.split(string)) & set(search))
result
['I', 'would', 'like', 'an', 'apple', '']
{'apple'}
