Splitting a string after a specific character in python [duplicate]

This question already has answers here:
How to get a string after a specific substring?
(9 answers)
How can I split a URL string up into separate parts in Python?
(6 answers)
Closed 2 years ago.
I want to take everything that comes after = and assign it to a new variable.
Example:
https://www.exaple.com/index.php?id=24124
I want to extract whatever comes after =, which in this case is 24124, and put it into a new variable.

You can of course split this specific string. rsplit() would be a good choice since you are interested in the rightmost value:
s = "https://www.exaple.com/index.php?id=24124"
rest, n = s.rsplit('=', 1)
# n == '24124'
However, if you are dealing with URLs this is fragile. For example, a URL for the same page might look like:
s = "https://www.exaple.com/index.php?id=24124#anchor"
and the above split would return '24124#anchor', which is probably not what you want.
Python includes good URL parsing, which you should use if you are dealing with URLs. In this case it is just as simple to get what you want, and less fragile:
from urllib.parse import (parse_qs, urlparse)
s = "https://www.exaple.com/index.php?id=24124"
qs = urlparse(s)
parse_qs(qs.query)['id'][0]
# '24124'
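As a rough sketch of the same approach wrapped in a helper (the name get_query_value is made up for illustration), the fragment in the second URL no longer leaks into the result, and a missing key gives None instead of a KeyError:

```python
from urllib.parse import parse_qs, urlparse


def get_query_value(url, key):
    """Return the first value of `key` in the URL's query string, or None."""
    # urlparse separates the query ("id=24124") from the fragment ("anchor")
    values = parse_qs(urlparse(url).query).get(key)
    return values[0] if values else None


with_anchor = get_query_value("https://www.exaple.com/index.php?id=24124#anchor", "id")
missing = get_query_value("https://www.exaple.com/index.php", "id")
print(with_anchor)
print(missing)
```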

Simply use .split() and take only the second part:
url = 'https://www.exaple.com/index.php?id=24124'
print(url.split('=')[1])

For your specific case, you could do...
url = "https://www.exaple.com/index.php?id=24124"
id_number = url.split('=')[1]
If you want to store id_number as an integer, then id_number = int(url.split('=')[1]) instead.
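Note that indexing with [1] raises an IndexError when the string contains no = at all. A minimal sketch of a more forgiving variant, assuming you would rather get None in that case (value_after is a hypothetical helper name), could use str.partition:

```python
def value_after(text, sep="="):
    """Return everything after the first `sep`, or None if `sep` is absent."""
    # partition always returns three parts; `found` is empty when sep is missing
    before, found, after = text.partition(sep)
    return after if found else None


url = "https://www.exaple.com/index.php?id=24124"
id_text = value_after(url)
# Convert only when a value was actually found
id_number = int(id_text) if id_text is not None else None
print(id_number)
```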

Related

How I read a word after the # symbol [duplicate]

This question already has answers here:
How to extract the substring between two markers?
(22 answers)
Closed last month.
I'm having a problem. I need to create the #everyone_or_person feature. A bit like discord. But I'll have to be able to read the word after the # and stop reading when there is a ("SPACE"/"_") and check for that word in the list. I've appended a simple version as an example. I knew it would not work but I couldn't think of anything else.
input = input("input: ")
value = input.find("#")
output = input.partition("#")[0]
print(str(output))
I've tried to look up how to do it but to no avail.
Simply use split:
test = "Some input with #your_desired_value in it"
result = test.split("#")[1].split(" ")[0]
print(result)
This splits your text at the #, takes the entire string after the #, splits again at the first space, and takes the string before that. Note that it stops only at a space, not at an underscore.
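If the word should also end at an underscore, as the question asks, one possible sketch uses a regular expression instead. The helper name mention_after_hash and the known_names list are assumptions made up for this example:

```python
import re


def mention_after_hash(text):
    """Return the word following the first '#', stopping at whitespace or '_'."""
    # [^\s_]+ matches one or more characters that are neither whitespace nor '_'
    match = re.search(r"#([^\s_]+)", text)
    return match.group(1) if match else None


known_names = ["everyone", "alice"]  # hypothetical list to check against
name = mention_after_hash("hello #everyone_here how are you")
print(name, name in known_names)
```

Unlike chained split calls, this returns None rather than raising an IndexError when the text contains no # at all.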

Converting a string variable to a regular expression in python [duplicate]

This question already has answers here:
How to use a variable inside a regular expression?
(12 answers)
Closed 2 years ago.
I am creating a python function with two inputs: a file and a string, in which user can find the location of the string in the file. I figured the best way to do this would be with regular expressions. I have converted the file to one big string (file_string) earlier in the code. For example, let's say the user wants to find "hello" in the file.
input = "hello"
user_input = "r'(" + input + ")'"
regex = re.compile(user_input)
for match in regex.finditer(file_string):
    print(match.start())
Creating a new string with r' ' around the input variable is not working. However, the code works perfectly if I replace user_input with r'hello'. How can I convert the string input the user enters to an expression that can be put into re.compile()?
Thanks in advance.
The r is just part of the syntax for raw string literals, which are useful for simplifying some regular expressions. For example, "\\foo" and r'\foo' produce the same str object. It is not part of the value of the string itself.
All you need to do is create a string with the value of input between ( and ).
input = "hello"
user_input = "(" + input + ")"
More efficiently (if only marginally so)
user_input = "({})".format(input)
Or more simply in recent versions of Python:
user_input = f'({input})'
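One caveat worth considering: if the user's text can contain characters that are special in regular expressions (such as . or parentheses), it is usually safer to pass it through re.escape first so it is matched literally. A small sketch, where find_positions and the sample file_string are assumptions for illustration:

```python
import re


def find_positions(needle, haystack):
    """Return the start index of every literal occurrence of `needle`."""
    # re.escape neutralises characters such as '.', '(' and ')'
    pattern = re.compile(re.escape(needle))
    return [m.start() for m in pattern.finditer(haystack)]


file_string = "say hello, then say hello.world (hello)"  # stand-in for the file contents
plain_hits = find_positions("hello", file_string)
paren_hits = find_positions("(hello)", file_string)
print(plain_hits)
print(paren_hits)
```

For a purely literal search, str.find in a loop would also work without regular expressions.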

Python regular expression parentheses matching not returning the proper substring [duplicate]

This question already has answers here:
Python non-greedy regexes
(7 answers)
Closed 3 years ago.
I am trying to use python's re to match a certain format.
import re
a = "y:=select(R,(time>50)and(qty<10))"
b = re.search("=.+\(",a).group(0)
print(b)
I actually want to select the portion "=select(" from string a, but my code outputs =select(R,(time>50)and(. I tried re.findall, but it returns the same output. It does not stop at the first ( and only stops at the last one. Where am I going wrong? Your help is greatly appreciated. I basically want to find the function name, in this case select. The strategy I used was to match what appears after = and before (.
You are missing a '?' in your pattern. Without it, .+ is greedy and matches as much as possible. Try this:
=.+?\(
Another method that works - you specify explicitly what you need:
import re
a = "y:=select(R,(time>50)and(qty<10))"
# make sure your piece does not contain "("
b = re.search(r"=[^(]+\(", a).group(0)
print(b)
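Since the stated goal is the function name itself, a capture group can return just that part without the surrounding = and (. This is only a sketch of that idea using the same string:

```python
import re

a = "y:=select(R,(time>50)and(qty<10))"

# Negated character class: capture everything after '=' up to the first '('
name_by_class = re.search(r"=([^(]+)\(", a).group(1)

# Non-greedy quantifier: capture as little as possible before the first '('
name_by_lazy = re.search(r"=(.+?)\(", a).group(1)

print(name_by_class, name_by_lazy)
```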

Python matches the part after a .* at its last occurrence [duplicate]

This question already has answers here:
My regex is matching too much. How do I make it stop? [duplicate]
(5 answers)
Closed 5 years ago.
I'm trying to read the server states from the Guild Wars API. For that I match the server name, which is followed by an occasional language specifier and a ",\n that I intend to match with .*, and after that comes the population. But instead of matching the first occurrence of population, it matches the last one. Can someone tell me why (and how to fix this)?
Edit: I found a workaround. By substituting .* with .{,20} it works.
relevant part of the API
"name": "Riverside [DE]",
"population": "Full"
with urlopen('https://api.guildwars2.com/v2/worlds?ids=all') as api:
    s = api.read()
s = s.decode('utf-8')
search = re.search(r'''Riverside.*"population": "''',s,re.S)
print(search)
s = s[search.span()[1]:]
state = re.search(r'[a-zA-Z]*',s)
print(state)
There are two things:
You can use .*? (with a trailing question mark), which is non-greedy and stops at the first match. I don't consider this a good solution, though.
Instead, once you have the data, parse it as JSON and do your manipulation on the resulting objects:
import json
from urllib.request import urlopen
with urlopen('https://api.guildwars2.com/v2/worlds?ids=all') as api:
    s = api.read()
s = s.decode('utf-8')
jsondata = json.loads(s)
# filter() returns an iterator in Python 3, so build a list before indexing
filtered_data = [a for a in jsondata if "Riverside" in str(a["name"])]
print(filtered_data[0]["population"])
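To see the difference between .* and .*? mentioned above without calling the API, here is a small sketch on a made-up sample. The second entry ("Elsewhere", "High") is invented purely for illustration and is not real API data:

```python
import re

# Hypothetical excerpt shaped like the API output quoted in the question
sample = '''"name": "Riverside [DE]",
"population": "Full"
"name": "Elsewhere",
"population": "High"'''

# Greedy: .* runs to the end of the text, then backtracks to the LAST "population"
greedy = re.search(r'Riverside.*"population": "(\w+)', sample, re.S)

# Non-greedy: .*? expands only as far as needed, reaching the FIRST "population"
lazy = re.search(r'Riverside.*?"population": "(\w+)', sample, re.S)

print(greedy.group(1), lazy.group(1))
```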

Differentiating between double and single characters? [duplicate]

This question already has an answer here:
Dynamic variable name in python
(1 answer)
Closed 6 years ago.
I have a weird variable name for a dictionary that I have to use that looks like so:
val.display_counter1.rjj = {}
This is illegal so I've decided to use this format:
val_display__counter1_rjj = {}
Later in the code I need to match up that dict variable name with the original name. So I'm trying to find a way to replace those single underscores with dots and the double underscores with a single underscore. I'm sure that there is a regex solution, but regex isn't my strong suit.
Is there a way to selectively replace like this?
Edit:
There is some confusion with my question so allow me to clarify. The original name:
val.display_counter1.rjj
This is NOT a variable in itself but merely an item name from the 3D software package Modo. There are many items that share this format. What I am trying to do is create a class of dicts that will store information about these items. I want to name the dicts for the items and be able to match them in program.
For me to make this match I need to revert my dict name back to its original form:
val_display__counter1_rjj --> val.display_counter1.rjj
All I need to know is how to make the Regex match ONLY the single underscore and discard the matches that are surrounded by other underscores.
Also, not sure why this is marked as duplicate. But my question doesn't involve dynamic variables.
Well, I am new to Python.
But hope this works!!!
import re
val_display__counter1_rij = {}
l = ['val_display__counter1_rij', 'val_display___counter1_rij', 'val.display__counter1_rij']  # list of variables to match
for x in l:
    if "." not in x:
        article = re.sub(r'(?is)_', '.', x)
        if ".." in article:
            article = article.replace("..", "__")
        if article == 'val.display__counter1.rij':
            print(article)
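For the exact mapping described in the question (single underscore to a dot, double underscore to a single underscore), one possible sketch does both replacements in a single re.sub pass. The function name restore_item_name is hypothetical, and the sketch assumes names never contain runs of three or more underscores:

```python
import re


def restore_item_name(dict_name):
    """Map '__' back to '_' and a lone '_' back to '.'."""
    # '__' is listed first in the alternation, so a double underscore is
    # consumed as one match before a single '_' can match
    return re.sub(r"__|_", lambda m: "_" if m.group() == "__" else ".", dict_name)


original = restore_item_name("val_display__counter1_rjj")
print(original)
```

Doing it in one pass avoids the ordering problem of two separate str.replace calls, where the output of the first replacement would be altered by the second.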
