Not getting the output I'm expecting when using .join - python

I'm trying to remove the brackets and commas using .join. It works in other places in my program but not here. This is the code:
def load():
    fileName = raw_input("Please enter the name of the save file to load. Please don't enter '.txt'.")
    return open(fileName+".txt", "r")

fileToLoad = load()
fileData = fileToLoad.readlines()
code = (fileData[4])
splitcode = "".join(code)
print code
print splitcode
and the two outputs I'm getting are both:
['Y', 'G', 'R']
['Y', 'G', 'R']
I thought that the second output should be:
YGR
Thanks for the help!

It appears that code is the literal string "['Y', 'G', 'R']" and not a list as would be required for join to work as you expect. The simplest way around this is to first convert code into a list by calling ast.literal_eval on it, or, if you can be absolutely sure the contents of the file contain nothing malicious or malformed, eval.
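As a minimal sketch of the ast.literal_eval approach: joining a string with "" just gives the same string back, which is why both prints look identical. The sample string below is a stand-in for fileData[4], assuming that line of the save file holds exactly the text shown in the question.

```python
import ast

# Stand-in for fileData[4]: readlines() returns text, including the newline.
code = "['Y', 'G', 'R']\n"

# literal_eval only parses Python literals (lists, strings, numbers, ...)
# and raises ValueError or SyntaxError on anything else.
parsed = ast.literal_eval(code.strip())

joined = "".join(parsed)
print(joined)  # YGR
```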

Rather than doing something as dangerous as eval, you could try turning it in to valid JSON then loading that.
import json
code = code.replace("'", '"')
listified = json.loads(code)
joined = ''.join(listified)

Related

Imported string in dataframe not the same as same string assigned to variable

I need to import an Excel file - either .csv or .xlsx - and then use that data to update some fields on a website. However, I am running into a problem where a string I import from the file does not compare equal to the same string typed directly in my code.
I've tried .csv and .xlsx, and both file formats give me the same problem.
I'm using selenium to get the data from a field on a website.
search = driver.find_element(By.ID,'comments')
comment_text = search.get_attribute("value")
# Comment text will display something generic like:
comment_text = 'This is a comment. This is an addition test. This is an addition test. \nThis confirms that I can add comments.\nThis comment should be different from the others in the field. It is a test to make sure everything is alright.'
I import my file:
request_list = pd.read_excel(r'PATH\request_comms.xlsx')
Which looks similar to this
   Requests  Comment
0     11001  This comment should be different...
1     12012  Request needs to be updated by...
2     13012  NaN
If I set the string to a variable and test if it exists as a substring:
comment = "This comment should be different from the others in the field. It is a test to make sure everything is alright."
if comment in comment_text:
    print(True)
else:
    print(False)
It will print True.
However, if I try with the imported string:
if request_list['Comment'][0] in comment_text:
    print(True)
else:
    print(False)
It will print False
If I compare the two strings
if request_list['Comment'][0] == comment:
    print(True)
else:
    print(False)
It will print False.
The strings are typed exactly the same. I have also created my own dataframe with the comment, and that worked.
df = pd.DataFrame(data={'Comment': [comment]})
if df['Comment'][0] in comment_text:
    print(True)
else:
    print(False)
Which prints True.
So the only difference seems to be importing the file. Does anybody know why this would be the case? In both the Excel and CSV files the cells were set to text, and the dataframe column type is string.
Edit:
I also tested the two strings this way:
first_set = set(request_list['Comment'][0])
second_set = set(comment)
difference = first_set.symmetric_difference(second_set)
print(difference)
And got "{'m', 'c', 'b', 'g', 'y', 'r', 'k', 't', 'f', 's', 'o', 'This comment should be different from the others in the field. It is a test to make sure everything is alright.', 'u', 'n', 'd', 'T', 'h', 'I', 'i', ' ', 'e', 'l', '.', 'v', 'a'}" as my output.
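One way to narrow this down, offered as a sketch rather than a diagnosis: the set output above includes the entire sentence as a single element, which can only happen if one of the two values was a container holding the string (a list, tuple or similar) rather than the string itself at the time of that test. The hypothetical helper describe_difference below reports a type mismatch first and otherwise the first differing character. The non-breaking space in the sample is an assumed example of an invisible difference, not something taken from the question.

```python
def describe_difference(a, b):
    """Return a short report on how two values differ, or None if they are equal."""
    if type(a) is not type(b):
        return "types differ: {} vs {}".format(type(a).__name__, type(b).__name__)
    if a == b:
        return None
    # Walk both values in step and report the first position that differs.
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return "first difference at index {}: {!r} vs {!r}".format(i, x, y)
    return "lengths differ: {} vs {}".format(len(a), len(b))

typed = "This is a test."
imported = "This is\xa0a test."  # hypothetical: non-breaking space from a spreadsheet

print(describe_difference(imported, typed))
print(describe_difference((typed,), typed))
```

Printing repr() of both values is a quicker first check, since it makes escape sequences and stray whitespace visible.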

.split() returning empty results

I am trying to split a list that I have converted with str(), but I don't seem to be returning any results?
My code is as follows:
import csv

def csv_read(file_obj):
    reader=csv.DictReader(file_obj,delimiter=',')
    for line in reader:
        unique_id.append(line["LUSERFLD4"])
        total_amt.append(line["LAMOUNT1"])
        luserfld10.append(line["LUSERFLD10"])
        break
        bal_fwd, junk, sdc, junk2, est_read=str(luserfld10).split(' ')

if __name__=="__main__":
    with open("UT_0004A493.csv") as f_obj:
        csv_read(f_obj)
    print (luserfld10)
    print (bal_fwd)
    print (sdc)
    print (est_read)
print (luserfld10) returns ['N | N | Y'] which is correct. (Due to system limitations when creating the csv file, this field holds three separate values)
All variables have been defined and I'm not getting any errors, but my last three print commands are returning empty lists?
I've tried dedenting the .split() line, but then I can unpack only one value.
How do I get them to each return N or Y?
Why isn't it working as it is?
I'm sure it's obvious, but this is my first week of coding and I haven't been able to find the answer anywhere here. Any help (with explanations please) would be appreciated :)
Edit: all defined variables are as follows:
luserfld10=[]
bal_fwd=[]
sdc=[]
est_read=[]
etc.
File contents I'm not certain how to show? I hope this is okay?
LACCNBR,LAMOUNT1,LUSERFLD4,LUSERFLD5,LUSERFLD6,LUSERFLD8,LUSERFLD9,LUSERFLD10
1290,-12847.28,VAAA0022179,84889.363,Off Peak - nil,5524.11,,N | N | N
2540255724,12847.28,VAAA0022179,84889.363,Off Peak - nil,5524.11,,N | N | N
If the luserfld10 is ['N | N | Y']
then,
luserfld10[0].replace('|', '').split()
Result:
['N', 'N', 'Y']
Even if you fix the .split stuff in
bal_fwd, junk, sdc, junk2, est_read=str(luserfld10).split(' ')
it won't do what you want because it's assigning the results of the split to local names bal_fwd, sdc, etc, that only exist inside the csv_read function, not to the names you defined outside the function in the global scope.
You could use global statements to tell Python to assign those values to the global names, but it's generally best to avoid using the global statement unless you really need it. Also, merely using a global statement won't put the string data into your bal_fwd list. Instead, it will bind the global name to your string data and discard the list. If you want to put the string into the list you need to .append it, like you did with unique_id. You don't need global for that, since you aren't performing an assignment, you're just modifying the existing list object.
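The difference between rebinding a name inside a function and modifying an existing list in place can be seen in a small sketch. The names results, assign_locally and append_in_place are made up for illustration.

```python
results = []  # module-level (global) list

def assign_locally(value):
    # Assignment creates a new local name 'results';
    # the global list is left untouched.
    results = [value]
    return results

def append_in_place(value):
    # No assignment here, so 'results' refers to the global list,
    # and append() modifies that existing object.
    results.append(value)

assign_locally('N')
print(results)  # []

append_in_place('N')
print(results)  # ['N']
```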
Here's a repaired version of your code, tested with the data sample you posted.
import csv

unique_id = []
total_amt = []
luserfld10 = []
bal_fwd = []
sdc = []
est_read = []

def csv_read(file_obj):
    for line in csv.DictReader(file_obj, delimiter=','):
        unique_id.append(line["LUSERFLD4"])
        total_amt.append(line["LAMOUNT1"])
        fld10 = line["LUSERFLD10"]
        luserfld10.append(fld10)
        t = fld10.split(' | ')
        bal_fwd.append(t[0])
        sdc.append(t[1])
        est_read.append(t[2])

if __name__=="__main__":
    with open("UT_0004A493.csv") as f_obj:
        csv_read(f_obj)
    print('id', unique_id)
    print('amt', total_amt)
    print('fld10', luserfld10)
    print('bal', bal_fwd)
    print('sdc', sdc)
    print('est_read', est_read)
output
id ['VAAA0022179', 'VAAA0022179']
amt ['-12847.28', '12847.28']
fld10 ['N | N | N', 'N | N | N']
bal ['N', 'N']
sdc ['N', 'N']
est_read ['N', 'N']
I should mention that using t = fld10.split(' | ') is a bit fragile: if the separator isn't exactly ' | ' then the split will fail. So if there's a possibility that there might not be exactly one space either side of the pipe (|) then you should use a variation of Jinje's suggestion:
t = fld10.replace('|', ' ').split()
This replaces all pipe chars with spaces, and then splits on runs of white space, so it's guaranteed to split the subfields correctly, assuming there's at least one space or pipe between each subfield (Jinje's original suggestion will fail if both spaces are missing on either side of the pipe).
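To make the fragility concrete, here is a short comparison of the two approaches. The function names split_strict and split_tolerant and the irregularly spaced samples are invented for this sketch; only the first sample matches the format in the posted file.

```python
def split_strict(field):
    # Splits only on the exact separator ' | '.
    return field.split(' | ')

def split_tolerant(field):
    # Turns every pipe into a space, then splits on runs of whitespace.
    return field.replace('|', ' ').split()

for sample in ['N | N | Y', 'N|N|Y', 'N |N  | Y']:
    print(repr(sample), split_strict(sample), split_tolerant(sample))
```

Only the tolerant version returns three subfields for all three samples.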
Breaking your data up into separate lists may not be a great strategy: you have to be careful to keep the lists synchronised, so it's tricky to sort them or to add or remove items. And it's tedious to manipulate all the data as a unit when you have it spread out over half a dozen named lists.
One option is to put your data into a dictionary of lists:
import csv
from pprint import pprint

def csv_read(file_obj):
    data = {
        'unique_id': [],
        'total_amt': [],
        'bal_fwd': [],
        'sdc': [],
        'est_read': [],
    }
    for line in csv.DictReader(file_obj, delimiter=','):
        data['unique_id'].append(line["LUSERFLD4"])
        data['total_amt'].append(line["LAMOUNT1"])
        fld10 = line["LUSERFLD10"]
        t = fld10.split(' | ')
        data['bal_fwd'].append(t[0])
        data['sdc'].append(t[1])
        data['est_read'].append(t[2])
    return data

if __name__=="__main__":
    with open("UT_0004A493.csv") as f_obj:
        data = csv_read(f_obj)
    pprint(data)
output
{'bal_fwd': ['N', 'N'],
'est_read': ['N', 'N'],
'sdc': ['N', 'N'],
'total_amt': ['-12847.28', '12847.28'],
'unique_id': ['VAAA0022179', 'VAAA0022179']}
Note that csv_read doesn't directly modify any global variables. It creates a dictionary of lists and passes it back to the code that calls it. This makes the code more modular; trying to debug large programs that use globals can become a nightmare because you have to keep track of every part of the program that modifies those globals.
Alternatively, you can put the data into a list of dictionaries, one per row.
import csv
from pprint import pprint

def csv_read(file_obj):
    data = []
    for line in csv.DictReader(file_obj, delimiter=','):
        luserfld10 = line["LUSERFLD10"]
        bal_fwd, sdc, est_read = luserfld10.split(' | ')
        # Put the desired data into a new dictionary
        row = {
            'unique_id': line["LUSERFLD4"],
            'total_amt': line["LAMOUNT1"],
            'bal_fwd': bal_fwd,
            'sdc': sdc,
            'est_read': est_read,
        }
        data.append(row)
    return data

if __name__=="__main__":
    with open("UT_0004A493.csv") as f_obj:
        data = csv_read(f_obj)
    pprint(data)
output
[{'bal_fwd': 'N',
'est_read': 'N',
'sdc': 'N',
'total_amt': '-12847.28',
'unique_id': 'VAAA0022179'},
{'bal_fwd': 'N',
'est_read': 'N',
'sdc': 'N',
'total_amt': '12847.28',
'unique_id': 'VAAA0022179'}]

Python - Iterating through a list of list with a specifically formatted output; file output

Sorry to ask such a trivial question but I can't find the answer anywhere and it's my first day using Python (need it for work). I think my problem is trying to use Python like C. Anyway, here is what I have:
for i in data:
    for j in i:
        print("{}\t".format(j))
Which gives me data in the form of
elem[0][0]
elem[1][0]
elem[2][0]
...
elem[0][1]
elem[1][1]
...
i.e. all at once. What I really want to do, is access each element directly so I can output the list of lists data to a file whereby the elements are separated by tabs, not commas.
Here's my bastardised Python code for outputting the array to a file:
k=0
with open("Output.txt", "w") as text_file:
    for j in data:
        print("{}".format(data[k]), file=text_file)
        k += 1
So basically, I have a list of lists which I want to save to a file in tab-delimited format, but currently it comes out comma-separated. My approach would involve iterating through the lists again, element by element, and saving the output by forcing in the tabs.
Here's data excerpts (though changed to meaningless values)
data
['a', 'a', 306518, ' 1111111', 'a', '-', .... ]
['a', 'a', 306518, ' 1111111', 'a', '-', .... ]
....
text_file
a a 306518 1111111 a -....
a a 306518 1111111 a -....
....
for i in data:
    print("\t".join(str(x) for x in i))
if data is something like this '[[1,2,3],[2,3,4]]'
for j in data:
    text_file.write('%s\n' % '\t'.join(str(x) for x in j))
I think this should work:
with open(somefile, 'w') as your_file:
    for values in data:
        print("\t".join(str(v) for v in values), file=your_file)
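Putting the answers above together, here is a self-contained sketch. The function name write_tab_delimited and the second sample row are made up for illustration, and an in-memory io.StringIO stands in for the real output file so the result can be inspected directly.

```python
import io

def write_tab_delimited(rows, file_obj):
    """Write each row as one line, with elements separated by tabs."""
    for row in rows:
        # str() is needed because join() only accepts strings,
        # and the rows mix strings with integers.
        file_obj.write("\t".join(str(item) for item in row) + "\n")

data = [
    ['a', 'a', 306518, ' 1111111', 'a', '-'],
    ['b', 'b', 306519, ' 2222222', 'b', '-'],
]

buffer = io.StringIO()
write_tab_delimited(data, buffer)
print(buffer.getvalue(), end="")
```

To write to disk instead, pass a file object opened with open("Output.txt", "w") in place of the buffer.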

How do I format the output of a list of lists into a text file properly?

I am really new to Python and I am struggling with some problems while working on a student project. Basically, I read data from a text file which is formatted in columns. I store the data in a list of lists, sort and manipulate it, and write it to a file again. My problem is aligning the written data in proper columns. I found some approaches like
"%i, %f, %e" % (1000, 1000, 1000)
but I don't know how many columns there will be. So I wonder if there is a way to set all columns to a fixed width.
This is what the input data looks like:
2 232.248E-09 74.6825 2.5 5.00008 499.482
5 10. 74.6825 2.5 -16.4304 -12.3
This is how I store the data in a list of list:
filename = getInput('MyPath', workdir)
lines = []
f = open(filename, 'r')
while 1:
    line = f.readline()
    if line == '':
        break
    splitted = line.split()
    lines.append(splitted)
f.close()
To write the data, I first put all the row elements of the list of lists into one string with a fixed gap between the elements. What I need instead is a fixed total width for each column, including the element itself. I also don't know the number of columns in the file in advance.
for k in xrange(len(lines)):
    stringlist=""
    for i in lines[k]:
        stringlist = stringlist+str(i)+' '
    lines[k] = stringlist+'\n'
f = open(workdir2, 'w')
for i in range(len(lines)):
    f.write(lines[i])
f.close()
This code works basically, but sadly the output isn't formatted properly.
Thank you very much in advance for any help on this issue!
You are absolutely right about being able to format widths as you have above using string formatting. But as you correctly point out, the tricky bit is doing this for a variable-sized output list. Instead, you could use the join() function:
output = ['a', 'b', 'c', 'd', 'e',]
# format each column (len(output)) with a width of 10 spaces
width = [10]*len(output)
# write it out, using the join() function
with open('output_example', 'w') as f:
    f.write(''.join('%*s' % i for i in zip(width, output)))
will write out:
'         a         b         c         d         e'
As you can see, the length of the format array width is determined by the length of the output, len(output). This is flexible enough that you can generate it on the fly.
Hope this helps!
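Applied to a whole list of lists with an unknown number of columns, the same idea might look like the sketch below. It uses str.rjust rather than the '%*s' formatting shown above; both right-align an element within a fixed width. The function name format_rows and the default width of 14 are assumptions made for this example.

```python
def format_rows(rows, width=14):
    """Return the rows as text, each element right-aligned in `width` characters."""
    lines = []
    for row in rows:
        # rjust pads on the left; an element longer than `width`
        # is left as-is rather than truncated.
        lines.append("".join(str(item).rjust(width) for item in row))
    return "\n".join(lines) + "\n"

rows = [
    ['2', '232.248E-09', '74.6825', '2.5', '5.00008', '499.482'],
    ['5', '10.', '74.6825', '2.5', '-16.4304', '-12.3'],
]

print(format_rows(rows), end="")
```

Because the width is applied per element, rows with different numbers of columns are handled without any change.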
String formatting might be the way to go:
>>> print("%10s%9s" % ("test1", "test2"))
     test1    test2
Though you might want to first create strings from those numbers and then format them as I showed above.
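I cannot fully follow your writing code, but try something along these lines:
# enumerate is a builtin, so no import is needed;
# zeilen is the list of lines ("lines" in the question)
with open(workdir2, 'w') as datei:
    for key, item in enumerate(zeilen):
        # adjust the format string to the widths you need
        line = "%4i %s\n" % (key, item)
        datei.write(line)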

can a list be converted to an integer

I am trying to write a program to convert a message into a secret code. I'm trying to create a basic version to work up from. Here is the problem.
data = input('statement')
for line in data:
    code = ('l' == '1',
            'a' == '2'
            'r' == '3',
            'y' == '4')
    line = line.replace(data, code, [data])
    print(line)
The point of the above program is that when I input my name:
larry
the output should be
12334
but I keep receiving this message:
TypeError: 'list' object cannot be interpreted as an integer
So I assumed this meant that my code variable must be an integer to be used in replace().
Is there a way to convert that string into an integer, or is there another way to fix this?
The reason why your original code gave you the error is because of line.replace(data, code, [data]). The str.replace method can take 3 arguments. The first is the string you want to replace, the second is the replacement string, and the third, optional argument is how many instances of the string you want to replace - an integer. You were passing a list as the third argument.
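A quick illustration of that third argument, using the name from the question (the variable names first_only and every_r are just for this sketch):

```python
name = "larry"

# Third argument: the maximum number of replacements, as an integer.
first_only = name.replace("r", "3", 1)
every_r = name.replace("r", "3")

print(first_only)  # la3ry
print(every_r)     # la33y
```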
However, there are other problems to your code as well.
code is currently a tuple of False values, because each entry such as 'l' == '1' is a comparison rather than a mapping. What you need is a dictionary. You might also want to assign it outside of the loop, so you don't evaluate it every iteration.
code = {'l': '1', 'a': '2', 'r': '3', 'y': '4'}
Then, change your loop to this:
data = ''.join(code[i] for i in data)
print(data) gives you the desired output.
Note however that if a letter in the input isn't in the dictionary, you'll get an error. You can use the dict.get method to supply a default value if the key isn't in the dictionary.
data = ''.join(code.get(i, ' ') for i in data)
Where the second argument to code.get specifies the default value.
So your code should look like this:
code = {'l': '1', 'a': '2', 'r': '3', 'y': '4'}
data = input()
data = ''.join(code.get(i, ' ') for i in data)
print(data)
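The same logic can be wrapped in a function so it is easy to try without typing input. This is only a sketch: the names CODE and encode are made up here, and the default of a single space follows the dict.get example above.

```python
CODE = {'l': '1', 'a': '2', 'r': '3', 'y': '4'}

def encode(message, table=CODE, default=' '):
    """Replace each character using `table`; unknown characters become `default`."""
    return ''.join(table.get(ch, default) for ch in message)

print(encode('larry'))  # 12334
print(encode('barry'))  # ' 2334' (the unknown 'b' becomes a space)
```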
Just to sum up:
% cat ./test.py
#!/usr/bin/env python
data = raw_input()
code = {'l': '1', 'a': '2',
'r': '3', 'y': '4'}
out = ''.join(code[i] for i in data)
print (out)
% python ./test.py
larry
12334
You can use translate:
>>> print("Larry".lower().translate(str.maketrans('lary', '1234')))
12334
(assuming Python 3)
The previous comments should give you a good explanation on your error message,
so I will just give you another way to make the translation from data to code.
We can make use of Python's translate method.
# We will use the "maketrans" function. In Python 2 it is not a builtin, so we need to import it from the string module.
from string import maketrans
data = raw_input('statement')
# I recommend using raw_input when dealing with strings, this way
# we won't need to write the string in quotes.
# Now, we create a translation table
# (it defines the mapping between letters and digits similarly to the dict)
trans_table = maketrans('lary', '1234')
# And we translate the guy based on the trans_table
secret_data = data.translate(trans_table)
# secret_data is now a string, but according to the post title you want integer. So we convert the string into an integer.
secret_data = int(secret_data)
print secret_data
Just for the record, if you are interested in protecting data, you should also look into
hashing.
Hashing is widely used to produce a fixed-length digest of data. Note that it is one-way: the original message cannot be recovered from the digest, so it does not replace an encoding that has to be reversed.
A simple example of hashing in Python (using the SHA-256 hash function):
>>> import hashlib
>>> data = raw_input('statement: ')
statement: larry
>>> secret_data = hashlib.sha256(data)
>>> print secret_data.hexdigest()
0d098b1c0162939e05719f059f0f844ed989472e9e6a53283a00fe92127ac27f
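The session above is Python 2. In Python 3, hashlib expects bytes rather than text, so the string has to be encoded first. A minimal sketch, assuming UTF-8 is the desired encoding:

```python
import hashlib

message = "larry"

# hashlib works on bytes, so the text is encoded first (UTF-8 assumed here).
digest = hashlib.sha256(message.encode("utf-8")).hexdigest()

print(digest)  # 64 hexadecimal characters
```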
