Split string in multiple places - python

New to programming and currently working with python. I am trying to take a user inputted string (containing letters, numbers and special characters), I then need to split it multiple times at different points to reform new strings. I have done research on the splitting of strings (and lists) and feel I understand it but I still know there must be a better way to do this than I can think of.
This is what I currently have
ass=input("Enter Assembly Number: ")
#Sample Input 1 - BF90UQ70321-14
#Sample Input 2 - BS73OA91136-43
ass0=ass[0]
ass1=ass[1]
ass2=ass[2]
ass3=ass[3]
ass4=ass[4]
ass5=ass[5]
ass6=ass[6]
ass7=ass[7]
ass8=ass[8]
ass9=ass[9]
ass10=ass[10]
ass11=ass[11]
ass12=ass[12]
ass13=ass[13]
code1=ass0+ass2+ass3+ass4+ass5+ass6+ass13
code2=ass0+ass2+ass3+ass4+ass5+ass6+ass9
code3=ass1+ass4+ass6+ass7+ass12+ass6+ass13
code4=ass1+ass2+ass4+ass5+ass6+ass9+ass12
# require 21 different code variations
Please tell me that there is a better way to do this.
Thank you

Take a look at this code and search for "python string slicing" (a nice tutorial for beginners is at https://www.youtube.com/watch?v=EqAgMUPRh7U).
String (and list) slicing is used a lot in Python, so be sure to learn it well. The upper index is excluded from the slice, which may not feel intuitive at first, but it becomes second nature.
ass="ABCDEFGHIJKLMN"
code1 = ass[0] + ass[2:7] + ass[13] # ass[2:7] is to extract 5 chars starting from index 2 (7 is excluded)
code2 = ass[0] + ass[2:7] + ass[9]
code3 = ass[1] + ass[4] + ass[6:8] + ass[12] + ass[6] + ass[13]
code4 = ass[1:3] + ass[4:7] + ass[9] + ass[12]
PS: You should probably also check that the string length is 14 before working with it.
EDIT: Second solution
Here is another solution, perhaps it is easier to follow:
def extract_chars(mask):
    chars = ""
    for i in mask:
        chars += ass[i]
    return chars
mask = [0, 2, 3, 4, 5, 6, 13]
print(extract_chars(mask))
Here you define a mask of indexes of the chars you want to extract.
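As a hedged sketch of how the mask idea could scale to the 21 variations mentioned in the question, all masks can be kept in one dictionary and every code built in a loop. The names CODE_MASKS and build_codes are made up for illustration, only the four masks given in the question are filled in, and the length check follows the PS above.

```python
# Hypothetical table of masks; the index lists are copied from code1..code4
# in the question. Further variations would be added as extra entries.
CODE_MASKS = {
    "code1": [0, 2, 3, 4, 5, 6, 13],
    "code2": [0, 2, 3, 4, 5, 6, 9],
    "code3": [1, 4, 6, 7, 12, 6, 13],
    "code4": [1, 2, 4, 5, 6, 9, 12],
}


def build_codes(ass, masks=CODE_MASKS, expected_len=14):
    """Return a dict mapping each code name to the characters its mask selects."""
    if len(ass) != expected_len:
        raise ValueError("expected %d characters, got %d" % (expected_len, len(ass)))
    return {name: "".join(ass[i] for i in mask) for name, mask in masks.items()}


codes = build_codes("BF90UQ70321-14")
for name, value in codes.items():
    print(name, value)
```

Adding a new variation then only means adding one more entry to the dictionary.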

You can try something like this,
input1 = 'BF90UQ70321-14'
code = lambda anum, pos: ''.join(anum[p] for p in pos)
code4 = code(input1, (1,2,4,5,6,9,12))

Related

Amateur Text Editor Malfunctioning

My brother and I are creating a simple text editor that changes entries to pig latin using Python. Code below:
our_word = ("cat")
vowels = ("a","e","i","o","u")
#remember I have to compare variables not strings
way = "way"
for i in range(len(our_word)):
    for j in range(len(vowels)):
        # checking if there is any vowel present
        if our_word[i] == vowels[j]:
            # if there is a vowel, our_word[i] will now be replaced with way
            # .replace is a string method; the dot is how it is called on the string
            our_word = our_word.replace(our_word[i], way)
print(our_word)
Right now we're testing the word 'cat' but the program when run returns the following:
/Users/x/PycharmProjects/pythonProject3/venv/bin/python /Users/x/PycharmProjects/pythonProject3/main.py
cwwayyt
Process finished with exit code 0
We're not sure why there is a double 'w' and a double 'y'. It seems the word 'cat' is edited once to 'cwayt' and then a second time to 'cwwayyt'.
Any suggestions are welcome!
The problem arises because, on the next iteration of the for loop after the substitution, you look at the next position, which is part of the "way" that you just substituted in. Instead, you need to skip past it. You would also run into another problem: the loop only runs up to the original length rather than the new, increased length. In this situation you are probably better off using a while loop with an index variable that you can manipulate to point to the correct place as needed. For example:
our_word = "cat"
vowels = "aeiou"
way = "way"
i = 0
while i < len(our_word):
    if our_word[i] in vowels:
        our_word = our_word[:i] + way + our_word[i + 1:]
        i += len(way)  # <=== if you made a substitution, skip over the bit
                       #      that you just substituted in place
    else:
        i += 1  # <=== if you didn't make any substitution
                #      just go to the next position next time
print(our_word)
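Because each character is handled independently, the same result can also be sketched without any index bookkeeping by building a new string in a single pass over the original characters. This is an alternative to the while loop, not the method described above, and replace_vowels is a hypothetical helper name.

```python
def replace_vowels(word, replacement="way", vowels="aeiou"):
    # Iterate over the original characters only, so the inserted text
    # is never scanned again and cannot trigger a second substitution.
    return "".join(replacement if ch in vowels else ch for ch in word)


print(replace_vowels("cat"))  # cwayt
```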

Looking for vulnerabilities in my code (split method)

Here I've tried to recreate the str.split() method in Python. I've tried and tested this code and it works fine, but I'm looking for vulnerabilities to correct. Do check it out and give feedback, if any.
Edit: Apologies for not being clear. I meant to ask for inputs where the code won't work. I'm also trying to think of a more refined way without looking at the source code.
def splitt(string,split_by = ' '):
    output = []
    x = 0
    for i in range(string.count(split_by)):
        output.append((string[x:string.index(split_by,x+1)]).strip())
        x = string.index(split_by,x+1)
    output.append((((string[::-1])[:len(string)-x])[::-1]).strip())
    return output
There are in fact a few problems with your code:
by searching from x+1, you may miss an occurrence of split_by at the very start of the string, causing index to fail in the last iteration; instead, search from x and add len(split_by) to the offset for the next call to index
you are calling index more often than necessary
strip only makes sense if the separator is whitespace, and even then it might remove more than intended, e.g. trailing spaces when splitting lines
there is no need to reverse the string twice in the last step
This should fix those problems:
def splitt(string,split_by=' '):
    output = []
    x = 0
    for i in range(string.count(split_by)):
        x2 = string.index(split_by, x)
        output.append(string[x:x2])
        x = x2 + len(split_by)
    output.append(string[x:])
    return output
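One way to look for remaining exceptions is to compare the function against str.split with an explicit separator on a handful of edge cases. The sketch below repeats the fixed function so it runs on its own; the guard for an empty separator is my own addition (str.split raises ValueError in that case, whereas the loop would otherwise produce a list of empty strings), and the test cases are made up.

```python
def splitt(string, split_by=' '):
    if not split_by:
        # Added guard, mirroring str.split, which rejects an empty separator.
        raise ValueError("empty separator")
    output = []
    x = 0
    for _ in range(string.count(split_by)):
        x2 = string.index(split_by, x)
        output.append(string[x:x2])
        x = x2 + len(split_by)
    output.append(string[x:])
    return output


# Edge cases: leading separator, consecutive separators, multi-character
# separator, empty input, separator absent, overlapping matches.
cases = [("a b c", " "), (" lead", " "), ("a,,b", ","),
         ("a--b--c", "--"), ("", ","), ("abc", ","), ("aaa", "aa")]
for text, sep in cases:
    assert splitt(text, sep) == text.split(sep), (text, sep)
print("all cases match str.split")
```

Note that this only matches str.split when a separator is passed explicitly; calling str.split() with no argument splits on runs of whitespace, which this function does not imitate.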

is there a way to modify a string to remove a decimal?

I have a file with a lot of images. Each image is named something like:
100304.jpg
100305.jpg
100306.jpg
etc...
I also have a spreadsheet in which each image is a row: the first value in the row is the name, and the values after it are various decimals and 0's describing features of the image.
The issue is that when I pull the name from the sheet, something adds a decimal, so the file cannot be moved with shutil.move().
import xlrd
import shutil
dataLocation = "C:/Users/User/Documents/Python/Project/sort_solutions_rev1.xlsx"
imageLocBase = "C:/Users/User/Documents/Python/Project/unsorted"
print("Specify which folder to put images in. Type the number only.")
print("1")
print("2")
print("3")
typeOfSet = int(input(""))
#Sorting for folder 1
if typeOfSet == 1:
    #Identifying what to move
    name = str(sheet.cell(int(nameRow), 0).value)
    sortDataStorage = (sheet.cell(int(nameRow), 8).value) #float
    sortDataStorageNoFloat = str(sortDataStorage) #non-float
    print("Processing: " + name)
    print(name + " has a correlation of " + (sortDataStorageNoFloat))
    #sorting for this folder utilizes the information in column 8)
    if sortDataStorage >= sortAc:
        print("test success")
        folderPath = "C:/Users/User/Documents/Python/Project/Image Folder/Folder1"
        shutil.move(imageLocBase + "/" + name, folderPath)
        print(name + " has been sorted.")
    else:
        print(name + " does not meet correlation requirement. Moving to next image.")
The issue I'm having occurs with the shutil.move(imageLocBase + "/" +name, folderPath)
For some reason my code takes the name from the spreadsheet (ex: 100304) and then adds a ".0" So when trying to move a file, it is trying to move 100304.0 (which doesn't exist) instead of 100304.
Using pandas to read your Excel file.
As suggested in a comment on the original question, here is a quick example of how to use pandas to read your Excel file, along with an example of the data structure.
Any questions, feel free to shout, or have a look into the docs.
import pandas as pd
# My path looks a little different as I'm on Linux.
path = '~/Desktop/so/MyImages.xlsx'
df = pd.read_excel(path)
Data Structure
This is completely contrived as I don't have an example of your actual file.
   IMAGE_NAME  FEATURE_1  FEATURE_2  FEATURE_3
0  100304.jpg     0.0111      0.111      1.111
1  100305.jpg     0.0222      0.222      2.222
2  100306.jpg     0.0333      0.333      3.333
Hope this helps get you started.
Suggestion:
Excel likes to think it's clever and does 'unexpected' things, as you're experiencing with the decimal (data type) issue. Perhaps consider storing your image data in a database (SQLite) or as a plain old CSV file. Pandas can read from either of these as well! :-)
splitOn = '.'
nameOfFile = name.split(splitOn, 1)[0]
This should work.
Take your file name, e.g. 12345.0, and store it in a variable as a string:
name = "12345.0"
Now we need to split this string, in this case on the ".". We save the separator in a second variable:
splitOn = '.'
Then we call Python's .split on the string, passing the separator and the maximum number of splits:
nameOfFile = name.split(splitOn, 1)[0]
Read literally: take 12345.0, split it at ".", make only one split, and get the two parts back in a list, so 12345 is at position 0 (1st value) and 0 is at position 1 (2nd value). Because lists are 0-based, we ask for [0] to keep the first part. (If you ever get confused with lists, arrays etc., just start counting from 0 instead of 1 on your fingers: positions 0 1 2 3 4 are the 1st, 2nd, 3rd, 4th and 5th values.)
So:
name = "12345.0"
splitOn = '.'
nameOfFile = name.split(splitOn, 1)[0]
print(nameOfFile)
The output will be
12345
I hope that helps.
https://www.geeksforgeeks.org/python-string-split/
OR
as highlighted below, convert the float to an int
https://www.geeksforgeeks.org/type-conversion-python/
If the value is stored as a float:
name = 12345.0
newName = int(name)
This drops the fractional part (which is 0 here, so nothing is lost).
OR
if the float is stored as a string:
print(int(float(name)))
Apparently the value you retrieve from the spreadsheet is parsed as a float, so when you cast it to a string it retains the decimal part.
You can trim the ".0" from the string value, or cast it to an integer before casting it to a string.
You could also check the spreadsheet's cell format and make sure it is not a number format (I don't know the exact setting). With that fixed, your data probably won't come with the .0 anymore.
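The "cast to integer before casting to string" suggestion could be sketched as a small helper. The name cell_to_name is hypothetical, and the sketch assumes numeric cells arrive as Python floats, which is what the question's output suggests; cells that already hold text are passed through unchanged.

```python
def cell_to_name(value):
    """Turn a spreadsheet cell value into a name without a spurious '.0'."""
    # Whole-number floats such as 100304.0 become "100304".
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    # Anything else (text cells, genuine decimals) is converted as-is.
    return str(value)


print(cell_to_name(100304.0))      # 100304
print(cell_to_name("100304.jpg"))  # 100304.jpg
```

In the question's code this would replace the str(...) call on the name cell, leaving the shutil.move line as it is.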
If ".0" is always added to the end of the value, you can read the string "name" this way:
shutil.move(imageLocBase + "/" + name[:-2], folderPath)
A string is like a list: we can choose which elements to read.
This technique is called slicing.
Sorry for my English. Bye
All these people have taken time to reply, please out of politeness rate the replies.

Python: Change variable suffix with for loop

I know this was asked a lot but I can not work with/understand the answers so far.
I want to change the suffix of variables in a for loop.
I tried all answers the stackoverflow search provides. But it is difficult to understand specific codes the questioner often presents.
So for clarification I use an easy example. This is not meant as application-oriented. I just want to understand how I can change the suffix.
var_1 = 10
var_2 = 100
var_3 = 1000
for i in range(1,4):
    test_i = var_i + 1
    print(test_i)
Expected result:
creating and printing variables:
test_1 = 11
test_2 = 101
test_3 = 1001
Expected Output
11
101
1001
Error: var_i is read as a variable name without the changes for i.
I would advise against using eval in 99.99% of all cases. What you could do is use the built-in getattr function:
import sys
var_1 = 10
var_2 = 100
var_3 = 1000
for i in range(1,4):
    test_i = getattr(sys.modules[__name__], f"var_{i}") + 1
    print(test_i)
Instead of doing a convoluted naming convention, try to conceive of your problem using a data structure like dictionaries, for example.
var={}
var[1] = 10
var[2] = 100
var[3] = 1000
test={}
for i in range(1,4):
    test[i] = var[i] + 1
print(test)
If somehow you are given var_1 etc as input, maybe use .split("_") to retrieve the index number and use that as the dictionary keys (they can be strings or values).
Small explanation about using indexing variable names. If you are starting out learning to program, there are many reasons not to use the eval, exec, or getattr methods. Most simply, it is inefficient, not scalable, and is extremely hard to use anywhere else in the script.
I am not one to insist on "best practices" if there is an easier way to do something, but this is something you will want to learn to avoid. We write programs to avoid having to type things like this.
If you are given that var_2 text as a starting point, then I would use string parsing tools to split and convert the string to values and variable names.
By using a dictionary, you can have 1000 non-consecutive variables and simply loop through them or assign new associations. If you are doing an experiment, for example, and call your values tree_1, tree_10 etc, then you will always be stuck typing out the full variable names in your code rather than simply looping through all the entries in a container called tree.
This is a little related to using a bunch of if:else statements to assign values:
# inefficient way -- avoid
if name == 'var_1':
    test_1 = 11
elif name == 'var_2':
    test_2 = 101
It is so much easier just to say:
test[i]= var[i]+1
and that one line will work for any number of values.
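If the var_1-style names really do arrive as text, the parsing step mentioned above might look like the following sketch. The input dictionary raw is invented for illustration (it stands in for wherever the names and values come from), and the keys are deliberately non-consecutive.

```python
# Hypothetical input: names with a numeric suffix mapped to their values.
raw = {"var_1": 10, "var_2": 100, "var_10": 1000}

# Split each name on "_" and use the numeric suffix as the dictionary key.
var = {int(name.split("_")[1]): value for name, value in raw.items()}

# One expression handles every entry, however many there are.
test = {i: value + 1 for i, value in var.items()}

print(test)
```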
for i in range(1, 4):
    print(eval('var_' + str(i)))
Step by step:
1) Convert the loop index to a string:
stringified_number = str(i)
2) evaluate your expression during runtime:
evaluated_variable = eval('var_' + stringified_number)

Python - Zip code to Barcode

The code is supposed to take a 5 digit zip code input and convert it to bar codes as the output. The bar code for each digit is:
{1:'...!!',2:'..!.!',3:'..!!.',4:'.!..!',5:'.!.!.',6:'.!!..',7:'!...!',8:'!..!.',9:'!.!..',0:'!!...'}
For example, the zip code 95014 is supposed to produce:
!!.!.. .!.!. !!... ...!! .!..! ...!!!
There is an extra ! at the start and end, that is used to determine where the bar code starts and stops. Notice that at the end of the bar code is an extra ...!! which is an 1. This is the check digit and you get the check digit by:
Adding up all the digits in the zipcode to make the sum Z
Choosing the check digit C so that Z + C is a multiple of 10
For example, the zipcode 95014 has a sum of Z = 9 + 5 + 0 + 1 + 4 = 19, so the check digit C is 1 to make the total sum Z + C equal to 20, which is a multiple of 10.
def printDigit(digit):
    digit_dict = {1:'...!!',2:'..!.!',3:'..!!.',4:'.!..!',5:'.!.!.',6:'.!!..',7:'!...!',8:'!..!.',9:'!.!..',0:'!!...'}
    return digit_dict[digit]
def printBarCode(zip_code):
    sum_digits=0
    num=zip_code
    while num!=0:
        sum_digits+=(num%10)
        num/=10
    rem = 20-(sum_digits%20)
    answer=[]
    for i in str(zip_code):
        answer.append(printDigit(int(i)))
    final='!'+' '.join(answer)+'!'
    return final
print printBarCode(95014)
The code I currently have produces an output of
!!.!.. .!.!. !!... ...!! .!..!!
for the zip code 95014, which is missing the check digit. Is there something missing in my code that is causing it not to output the check digit? Also, what should I include in my code to have it ask the user for the zip code?
Your code computes rem based on the sum of the digits, but you never use it to add the check-digit bars to the output (answer and final). You need to add code to do that in order to get the right answer. I suspect you're also not computing rem correctly, since you're using %20 rather than %10.
I'd replace the last few lines of your function with:
    rem = (10 - sum_digits) % 10 # correct computation for the check digit
    answer=[]
    for i in str(zip_code):
        answer.append(printDigit(int(i)))
    answer.append(printDigit(rem)) # add the check digit to the answer!
    final='!'+' '.join(answer)+'!'
    return final
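Putting the pieces together, a self-contained Python 3 sketch might look like the one below. The names DIGIT_BARS, check_digit and zip_to_barcode are my own, and taking the zip code as a string is a design assumption: it keeps leading zeros and makes it easy to accept user input, which the question also asks about.

```python
# Bar patterns copied from the question.
DIGIT_BARS = {1: '...!!', 2: '..!.!', 3: '..!!.', 4: '.!..!', 5: '.!.!.',
              6: '.!!..', 7: '!...!', 8: '!..!.', 9: '!.!..', 0: '!!...'}


def check_digit(digits):
    # Smallest C in 0..9 such that sum(digits) + C is a multiple of 10.
    return (10 - sum(digits) % 10) % 10


def zip_to_barcode(zip_code):
    """Convert a 5-digit zip code string into its bar code string."""
    if len(zip_code) != 5 or not zip_code.isdecimal():
        raise ValueError("expected exactly 5 digits")
    digits = [int(ch) for ch in zip_code]
    digits.append(check_digit(digits))
    return '!' + ' '.join(DIGIT_BARS[d] for d in digits) + '!'


# To ask the user instead of hard-coding the value, one option is:
#     zip_code = input("Enter a 5-digit zip code: ").strip()
print(zip_to_barcode("95014"))
```

For 95014 the digit sum is 19, so the check digit is 1 and the printed bar code matches the expected output given in the question.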
Interesting problem. I noticed that you solved the problem as a C-style programmer. I'm guessing your background is in C/C++. I'd like to offer a more Pythonic way:
def printBarCode(zip_code):
    digit_dict = {1:'...!!',2:'..!.!',3:'..!!.',4:'.!..!',5:'.!.!.',
                  6:'.!!..',7:'!...!',8:'!..!.',9:'!.!..',0:'!!...'}
    zip_code_list = [int(num) for num in str(zip_code)]
    bar_code = ' '.join([digit_dict[num] for num in zip_code_list])
    # the outer % 10 maps a digit sum that is already a multiple of 10 to check digit 0
    check_code = digit_dict[(10 - sum(zip_code_list) % 10) % 10]
    return '!{} {}!'.format(bar_code, check_code)
print(printBarCode(95014))
I used list comprehensions to work with each digit rather than explicit loops. I could have used the map() function instead, but list comprehensions are more Pythonic. Also, I used str.format (the Python 3.x style) for string formatting. Here is the output:
!!.!.. .!.!. !!... ...!! .!..! ...!!!
