Add column number to every column - python

I'm reading a text file and appending it to a list row by row, but I need to change every row as shown below.
My list looks like [[1st row], [2nd row], [3rd row], ........, [8000th row]], i.e. a list of lists.
My rows: every row contains 23 letters, and I want to change letters 2 to 23 but not the first one.
e,a,b,c,d,r,y,t,w,s,e,t......s (the 23rd letter, or index 22 if you count from 0)
t,y,e,e,s,f,g,r,t,q,w,e,r,.....s
What I want is:
e,a1,b2,c3,d4,r5,y6,t7,w8,s9,e10,t11......s22
t,y1,e2,e3,s4,f5,g6,r7,t8,q9,w10,e11,r12,.....s22
My main code:
import csv

datas = []
with open('C:/Users/xxx/Desktop/input/mushrooms.csv', 'r') as csvfile:
    spamreader = csv.reader(csvfile)
    for row in spamreader:
        datas.append(row)
print(datas[0])
# --> ['p', 'x', 's', 'n', 't', 'p', 'f', 'c', 'n', 'k', 'e', 'e', 's', 's', 'w', 'w', 'p', 'w', 'o', 'p', 'k', 's', 'u']
How can I do that in Python?

row = ['e','a','b','c','d','r','y','t','w','s','e','t']
newrow = row[0:1] + [letter + str(num) for num,letter in enumerate(row[1:],1)]
In your specific example,
newdatas = [row[0:1] + [letter + str(num) for num,letter in enumerate(row[1:],1)] for row in datas]
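As a small worked example of the comprehension above, here is a sketch on two shortened rows. The function name number_columns and the sample rows are made up for illustration; the logic is the same row[0:1] + enumerate(row[1:], 1) pattern.

```python
def number_columns(rows):
    """Append the column position to every cell except the first one in each row."""
    return [
        # row[:1] keeps the first cell unchanged; enumerate(..., 1) numbers the rest from 1
        row[:1] + [letter + str(num) for num, letter in enumerate(row[1:], 1)]
        for row in rows
    ]


# Hypothetical, shortened rows (the real file has 23 cells per row)
datas = [
    ['e', 'a', 'b', 'c'],
    ['t', 'y', 'e', 'e'],
]
newdatas = number_columns(datas)
print(newdatas)
```

The original list is left untouched because each row is rebuilt rather than modified in place.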

Related

For-loop not inserting a line break when using zip_longest in Python 3

I am writing a simple text comparison tool. It takes two text files, a template and a target, and compares each character in each line using two for-loops. Any differences are highlighted with a Unicode full block symbol (\u2588). In case the target line is longer than the template, I am using itertools.zip_longest to fill the nonexistent characters with a fill value.
from itertools import zip_longest

def compare(filename1, filename2):
    file1 = open(filename1, "r")
    file2 = open(filename2, "r")
    for line1, line2 in zip_longest(file1, file2):
        for char1, char2 in zip_longest(line1, line2, fillvalue=None):
            if char1 == char2:
                print(char2, end='')
            elif char1 == None:
                print('\u2588', end='')

compare('template.txt', 'target.txt')
Template file:    Target file:
First line        First lineXX
Second line       Second line
Third line        Third line
However, this appears to mess with Python's automatic line break placement. When a line ends with such a fill value, a line break is not generated, giving this result:
First line██Second line
Third line
Instead of:
First line██
Second line
Third line
The issue persisted after rewriting the script to use .append and .join (not shown to keep it short), though it allowed me to highlight the issue:
Result when both files are identical:
['F', 'i', 'r', 's', 't', ' ', 'l', 'i', 'n', 'e', '\n']
First line
['S', 'e', 'c', 'o', 'n', 'd', ' ', 'l', 'i', 'n', 'e', '\n']
Second line
['T', 'h', 'i', 'r', 'd', ' ', 'l', 'i', 'n', 'e']
Third line
Result when first line of target file has two more characters:
['F', 'i', 'r', 's', 't', ' ', 'l', 'i', 'n', 'e', '█', '█']
First line██['S', 'e', 'c', 'o', 'n', 'd', ' ', 'l', 'i', 'n', 'e', '\n']
Second line
['T', 'h', 'i', 'r', 'd', ' ', 'l', 'i', 'n', 'e']
Third line
As you can see, Python automatically adds a line break \n if the lines are of identical length, but as soon as zip_longest is involved, the last character in the list is the block, not a line break. Why does this happen?
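Python does not add the line break for you: each line read from a file still ends with its own '\n', and your loop prints it only when it lines up with the other file's '\n'. When the target line is longer, the template's '\n' is paired with an ordinary character ('\n' against 'X'), which matches neither branch, so nothing is printed; the target's own '\n' is then paired with the fill value and printed as a block. Strip your lines before comparing characters and print a new line after each line: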
from itertools import zip_longest

def compare(filename1, filename2):
    with open(filename1, "r") as file1, open(filename2, "r") as file2:
        # fillvalue='' keeps .strip() from failing if one file has more lines
        for line1, line2 in zip_longest(file1, file2, fillvalue=''):
            line1, line2 = line1.strip(), line2.strip()  # <- HERE: drop the trailing '\n'
            for char1, char2 in zip_longest(line1, line2, fillvalue=None):
                if char1 == char2:
                    print(char2, end='')
                elif char1 is None:
                    print('\u2588', end='')
            print()  # <- HERE: end the output line explicitly

compare('template.txt', 'target.txt')
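To see where the line break goes missing, it can help to look at the pairs zip_longest actually produces. The sketch below is only an illustration: highlight_extra is a hypothetical helper that returns the highlighted line instead of printing it, and it uses rstrip('\n') rather than strip() so that leading spaces are kept, which is an assumption about what the tool should do.

```python
from itertools import zip_longest

BLOCK = '\u2588'


def highlight_extra(template_line, target_line):
    """Return target_line with characters beyond the template replaced by blocks."""
    # Drop only the trailing newline so it never takes part in the comparison
    template_line = template_line.rstrip('\n')
    target_line = target_line.rstrip('\n')
    out = []
    for char1, char2 in zip_longest(template_line, target_line, fillvalue=None):
        if char1 == char2:
            out.append(char2)
        elif char1 is None:
            out.append(BLOCK)
    return ''.join(out)


# Without stripping, the template's '\n' is paired with 'X' and the target's
# '\n' is paired with the fill value, so no newline is ever printed.
pairs = list(zip_longest('e\n', 'eXX\n'))
print(pairs)

print(highlight_extra('First line\n', 'First lineXX\n'))
```

As in the code above, characters that differ while both lines still have text are silently dropped; only the extra characters of a longer target line are marked.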

Iterate over columns in a row in pandas

I have a csv file with following headers
question_no,question,A,B,C,D
where A, B, C, D are the options for a question. The number of options can vary from file to file (e.g. 4 options: A,B,C,D; 6 options: A,B,C,D,E,F). I am trying to get the values of the options in each row using the following code.
data = pd.read_csv(request.FILES['myfile'])
optioncodes = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
col_nos = len(data.columns)
opt_lmt = col_nos - 2
for (idx, row) in data.iterrows():
    print(row.question_no)
    for j in range(opt_lmt):
        print(row.optioncodes[j])
but I am getting the error
'Series' object has no attribute 'optioncodes'
How can I achieve this?
The dot accessor (df.col_name or series.index_value) is only a shortcut for the named element accessor (df['col_name'] or series['index_value']), and it is valid only under two conditions:
- the name must be a literal written in the code, whereas here it is held in a variable
- the name must be a valid identifier (no spaces or special characters)
Writing row.optioncodes therefore looks for an element literally named "optioncodes". What you want here is just:
...
    for j in range(opt_lmt):
        print(row[optioncodes[j]])
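A variant that avoids the hard-coded optioncodes list is to take the option column names from the DataFrame itself. This is a sketch under the assumption that the first two columns are always question_no and question and everything after them is an option; the CSV text and the helper name options_per_row are invented for the example.

```python
import io

import pandas as pd

# Hypothetical CSV standing in for the uploaded file
csv_text = """question_no,question,A,B,C,D
1,Capital of France?,Paris,Rome,Oslo,Bern
2,Largest planet?,Mars,Jupiter,Venus,Earth
"""


def options_per_row(data):
    """Collect the option values of every row, whatever the number of option columns."""
    option_columns = data.columns[2:]  # assumes columns 0 and 1 are question_no, question
    return [
        [row[col] for col in option_columns]  # bracket access works with a variable name
        for _, row in data.iterrows()
    ]


data = pd.read_csv(io.StringIO(csv_text))
for options in options_per_row(data):
    print(options)
```

Because the column names come from data.columns, the same code handles files with four, six, or any other number of options.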

Python script to generate a word with specific structure and letter combinations

I want to write a really short script that will help me generate a random/nonsense word with the following qualities:
-Has 8 letters
-First letter is "A"
-Second and Fourth letters are random letters
-Fifth letter is a vowel
-Sixth and Seventh letters are random letters and are the same
-Eighth letter is a vowel that's not "a"
This is what I have tried so far (using all the info I could find and understand online)
firsts = 'A'
seconds = ['a','b','c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
thirds = ['a', 'e', 'i', 'o', 'u', 'y']
fourths = ['a','b','c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
fifths = ['a', 'e', 'i', 'o', 'u', 'y']
sixths = sevenths = ['a','b','c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
eighths = ['e', 'i', 'o', 'u', 'y']
print [''.join(first, second, third, fourth, fifth)
    for first in firsts
    for second in seconds
    for third in thirds
    for fourth in fourths
    for fifth in fifths
    for sixth in sixths
    for seventh in sevenths
    for eighth in eighths]
However it keeps showing a SyntaxError: invalid syntax after the for and now I have absolutely no idea how to make this work. If possible please look into this for me, thank you so much!
The function you need for picking a random letter is random.choice. Pass it a list and it returns a random element of that list. It also works with strings, because a string is a sequence of characters. To make your life easier, use the string module: string.ascii_lowercase is a constant holding all the letters from a to z, so you don't have to type them out. Lastly, you don't need loops to join strings together. Keep it simple and just add them together.
import string
from random import choice
first = 'A'
second = choice(string.ascii_lowercase)
third = choice(string.ascii_lowercase)
fourth = choice(string.ascii_lowercase)
fifth = choice("aeiou")
sixthSeventh = choice(string.ascii_lowercase)
eighth = choice("eiou")
word = first + second + third + fourth + fifth + sixthSeventh + sixthSeventh + eighth
print(word)
Try this:
import random
sixth=random.choice(sixths)
s='A'+random.choice(seconds)+random.choice(thirds)+random.choice(fourths)+random.choice(fifths)+sixth+sixth+random.choice(eighths)
print(s)
Output:
Awixonno
Ahiwojjy
etc
There are several things to consider. First, the str.join() method takes in an iterable (e.g. a list), not a bunch of individual elements. Doing
''.join([first, second, third, fourth, fifth])
fixes the program in this respect. If you are using Python 3, print() is a function, and so you should add parentheses around the entire list comprehension.
With the syntax out of the way, let's get to a more interesting problem: your program constructs every possible word, all 82,255,680 of them. This takes a lot of time and memory. What you probably want is to pick just one. You can of course do this by first constructing all of them and then picking one at random, but it is far cheaper to pick one letter at random from each of firsts, seconds, etc. and then join these. Note that the sixth letter must be picked only once and used twice, otherwise the sixth and seventh letters will usually differ. All together then:
import random
firsts = ['A']
seconds = ['a','b','c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
thirds = ['a', 'e', 'i', 'o', 'u', 'y']
fourths = ['a','b','c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
fifths = ['a', 'e', 'i', 'o', 'u', 'y']
sixths = sevenths = ['a','b','c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
eighths = ['e', 'i', 'o', 'u', 'y']
sixth = random.choice(sixths)  # picked once so the sixth and seventh letters match
result = ''.join([
    random.choice(firsts),
    random.choice(seconds),
    random.choice(thirds),
    random.choice(fourths),
    random.choice(fifths),
    sixth,
    sixth,
    random.choice(eighths),
])
print(result)
To improve the code from here, try to:
Find a way to generate the "data" in a neater way than writing it out explicitly. As an example:
import string
seconds = list(string.ascii_lowercase) # you don't even need list()!
Instead of having separate variables firsts, seconds, etc., collect them into a single variable, e.g. one list in which each original list is replaced by a single str containing all of its characters.
This will implement what you describe. You can make the code neater by putting the choices into an overall list rather than have several different variables, but you will have to explicitly deal with the fact that the sixth and seventh letters are the same; they will not be guaranteed to be the same simply because there are the same choices available for each of them.
The list choices_list could contain sub-lists per your original code, but as you are choosing single characters it will work equally with strings when using random.choice and this also makes the code a bit neater.
import random
choices_list = [
    'A',
    'abcdefghijklmnopqrstuvwxyz',
    'aeiouy',
    'abcdefghijklmnopqrstuvwxyz',
    'aeiouy',
    'abcdefghijklmnopqrstuvwxyz',
    'eiouy'
]
letters = [random.choice(choices) for choices in choices_list]
word = ''.join(letters[:6] + letters[5:]) # here the 6th letter gets repeated
print(word)
Some example outputs:
Alaeovve
Aievellu
Ategiwwo
Aeuzykko
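The approaches above can also be wrapped in a function so that the structure of the word is easy to check. This is only a sketch: make_word and its rng parameter are hypothetical names, and it follows the bullet list in the question literally (letters two to four random, no "y" among the vowels), which is an assumption since the question's own lists treat the third letter as a vowel.

```python
import random
import string

VOWELS = "aeiou"


def make_word(rng=random):
    """Build one 8-letter word: 'A', three random letters, a vowel, a doubled letter, a non-'a' vowel."""
    doubled = rng.choice(string.ascii_lowercase)  # picked once, used for letters 6 and 7
    return "".join([
        "A",
        rng.choice(string.ascii_lowercase),
        rng.choice(string.ascii_lowercase),
        rng.choice(string.ascii_lowercase),
        rng.choice(VOWELS),
        doubled,
        doubled,
        rng.choice("eiou"),  # any vowel except 'a'
    ])


print(make_word())
```

Passing a seeded random.Random instance as rng makes the output repeatable, which is convenient when testing.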
Here's the syntax fix:
print(["".join([first, second, third])
       for first in firsts
       for second in seconds
       for third in thirds])
This method might take up a lot of memory.
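To get a feel for how much memory that would be, the number of combinations can be computed without building any of them, and itertools.product can yield combinations lazily instead of storing them in a list. The sketch below assumes the eight pools from the question (with sixths and sevenths as independent loops, as in the original comprehension) and needs Python 3.8+ for math.prod.

```python
import itertools
import math
import string

lowercase = string.ascii_lowercase
# The eight pools from the question, written as strings
pools = ['A', lowercase, 'aeiouy', lowercase, 'aeiouy', lowercase, lowercase, 'eiouy']

# Number of words the nested loops would build: the product of the pool sizes
total = math.prod(len(pool) for pool in pools)
print(total)

# itertools.product generates combinations one at a time; islice takes only the first three
first_three = [''.join(combo) for combo in itertools.islice(itertools.product(*pools), 3)]
print(first_three)
```

Only the three requested words are ever created here, so memory use stays small regardless of the total.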

Trying to use a list to populate cells in a row

I want to use a list to populate a row of data cells.
import csv

new_list = ['X', 'N', 'N', 'N', 'X', 'N', 'N', 'X', 'N', 'N', 'N', 'N', 'X', 'N', 'N',]
def new_list_report(new_list):
    with open('newlist.csv', 'w') as csvfile:
        thewriter = csv.writer(csvfile)
        for word in new_list:
            thewriter.writerow(word)
When I execute the code, it populates each data cell in a column. What I would like to do is instead populate each cell in a row. How could I do this?
You want all the elements of new_list to appear as cells in a single row? Then write the whole list as a row, instead of each element of new_list as a row.
import csv

new_list = ['X', 'N', 'N', 'N', 'X', 'N', 'N', 'X', 'N', 'N', 'N', 'N', 'X', 'N', 'N',]

def new_list_report(new_list):
    # newline='' is what the csv module recommends for files it writes
    with open('newlist.csv', 'w', newline='') as csvfile:
        thewriter = csv.writer(csvfile)
        thewriter.writerow(new_list)
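The difference is easy to see by writing to an in-memory buffer instead of a file. In this sketch write_cells is a hypothetical helper and the list is shortened; writerow (and writerows) treat each row as a sequence of cells, so a bare string such as 'X' becomes a row of its own.

```python
import csv
import io

new_list = ['X', 'N', 'N', 'X']  # shortened sample


def write_cells(rows):
    """Write rows with csv.writer into a string so the result can be inspected."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator='\n')
    writer.writerows(rows)
    return buffer.getvalue()


per_element = write_cells(new_list)    # every element becomes its own row (one column)
whole_list = write_cells([new_list])   # the whole list becomes one row
print(repr(per_element))
print(repr(whole_list))
```

The first call reproduces the column layout from the question, and the second gives the single row that was wanted.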

Python loops are missing results

I am reading a file with about 13,000 names on it into a list.
Then, I look at each character of each item on that list and if there is a match I remove that line from the list of 13,000.
If I run it once, it removes about half of the list. On the 11th run it seems to cut it down to 9%. Why is this script missing results? Why does it catch them with successive runs?
Using Python 3.
with open(fname) as f:
    lines = f.read().splitlines()
bad_letters = ['B', 'C', 'F', 'G', 'H', 'J', 'L', 'O', 'P', 'Q', 'U', 'W', 'X']

def clean(callsigns, bad):
    removeline = 0
    for line in callsigns:
        for character in line:
            if character in bad:
                removeline = 1
        if removeline == 1:
            lines.remove(line)
            removeline = 0
    return callsigns

for x in range (0, 11):
    lines = clean(lines, bad_letters)
    print (len(lines))
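You are changing (i.e., mutating) the lines list while you're looping (i.e. iterating) over it. This is never a good idea: when an item is removed, every later item shifts one position to the left, but the loop still advances to the next index, so the item that moved into the freed slot is never examined. That is why lines are skipped and not removed in the first go, and why each further run catches some of the ones left behind.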
There are many ways of fixing this. In the below example, we keep track of which lines to remove, and remove them in a separate loop in a way so that the indices do not change.
with open(fname) as f:
    lines = f.read().splitlines()
bad_letters = ['B', 'C', 'F', 'G', 'H', 'J', 'L', 'O', 'P', 'Q', 'U', 'W', 'X']

def clean(callsigns, bad):
    to_remove = []
    for line_i, line in enumerate(callsigns):
        for b in bad:
            if b in line:
                # We're removing this line, take note of it.
                to_remove.append(line_i)
                break
    # Remove the lines in a second step. Reverse it so the indices don't change.
    for r in reversed(to_remove):
        del callsigns[r]
    return callsigns

# A single pass is enough now, so the loop over range(0, 11) is no longer needed.
lines = clean(lines, bad_letters)
print(len(lines))
Save the names you want to keep in a separate list, for example like this:
with open(fname) as f:
    lines = f.read().splitlines()
bad_letters = ['B', 'C', 'F', 'G', 'H', 'J', 'L', 'O', 'P', 'Q', 'U', 'W', 'X']

def clean(callsigns, bad):
    valid = [i for i in callsigns if not any(j in i for j in bad)]
    return valid

valid_names = clean(lines,bad_letters)
print (len(valid_names))
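The skipping is easy to reproduce on a handful of items. In this sketch the call signs and both function names are invented for illustration; the first function removes items from the list it is iterating over, and the second builds a new list.

```python
def remove_while_iterating(names, bad):
    """Buggy on purpose: removes items from the list that is being iterated."""
    names = list(names)  # work on a copy so the caller's list is not changed
    for name in names:
        if any(letter in name for letter in bad):
            names.remove(name)  # shifts later items left; the next one is skipped
    return names


def filter_copy(names, bad):
    """Build a new list containing only the names without bad letters."""
    return [name for name in names if not any(letter in name for letter in bad)]


# Hypothetical call signs: 'AB1' and 'AC2' are adjacent and both contain a bad letter
callsigns = ['AB1', 'AC2', 'AD3', 'AF4', 'AE5']
bad = ['B', 'C', 'F']

print(remove_while_iterating(callsigns, bad))  # 'AC2' survives because it was skipped
print(filter_copy(callsigns, bad))
```

Two bad entries in a row are enough to trigger the problem: removing the first moves the second into the slot the loop has already passed.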
