Python: Length of list as single integer

I'm new to Python and I'm trying to output the length of a list as a single integer, eg:
l1 = ['a', 'b', 'c']
len(l1) = 3
However, it is printing on cmdline with 1s down the page, eg:
1
1
1
1
1
1
etc
How can I get it to just output the number rather than a list of 1s?
(Here's the code:)
def Q3():
    from datetime import datetime, timedelta
    inputauth = open("auth.log", "r")
    authStrings = inputauth.readlines()
    failedPass = 'Failed password for'
    for line in authStrings:
        time = line[7:15]
        dateHour = line[0:9]
        countAttack1 = []
        if time in line and failedPass in line:
            if dateHour == 'Feb 3 08':
                countAttack1.append(time)
                length1 = len(countAttack1)
                print(length1)
Ideally, I'd like to output the number in a print call so that I could format it, e.g.:
print("Attack 1: " + length1)

I think your if statements and the print are inside the loop. If so, just print the length outside the loop's scope.
Please share the complete code for a better answer.

Well, as Syed Abdul Wahab said, the problem is that the list is recreated on every iteration of the loop. As a result, print reports 1, because that is the actual length of the list at that point.
The other problem, the repeated printing, is similar: you are printing on each pass through the loop.
The solution is then simple: initialize the list outside the loop, and also report outside the loop.
def Q3():
    from datetime import datetime, timedelta
    inputauth = open("auth.log", "r")
    authStrings = inputauth.readlines()
    failedPass = 'Failed password for'
    countAttack1 = [] # after this line the countAttack will be empty
    for line in authStrings:
        time = line[7:15]
        dateHour = line[0:9]
        if time in line and failedPass in line:
            if dateHour == 'Feb 3 08':
                countAttack1.append(time)
    length1 = len(countAttack1)
    print("Attack 1: " + str(length1))
I'd also like to take a moment to point you to string formatting. While the documentation is complex, it will make printing much easier; the print above becomes:
print("Attack 1: {0}".format(length1))
Looking further at the code reveals a peculiarity: you check whether time is in line, but just a few lines above you create time as a slice of line, so it will always be in line. (Even when line is shorter than expected, slicing does not raise an error; time is then a shorter or empty string, which is still contained in line.) So that if statement can be simplified to:
if failedPass in line:
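The same idea can be sketched without building a list at all, since only the count is needed. This is a minimal sketch, not the asker's exact code: the function name count_failed_logins, its parameters, and the sample log lines are assumptions made for illustration, and it matches the date and hour with startswith instead of fixed slices.

```python
def count_failed_logins(lines, date_hour_prefix, marker="Failed password for"):
    """Count lines that start with date_hour_prefix and contain marker."""
    count = 0  # initialised once, before the loop
    for line in lines:
        if line.startswith(date_hour_prefix) and marker in line:
            count += 1
    return count  # reported once, after the loop


# Hypothetical sample lines in a syslog-like layout (not taken from auth.log).
sample = [
    "Feb  3 08:01:02 host sshd[100]: Failed password for root from 10.0.0.1",
    "Feb  3 08:05:09 host sshd[101]: Accepted password for alice from 10.0.0.2",
    "Feb  3 09:00:00 host sshd[102]: Failed password for root from 10.0.0.1",
    "Feb  3 08:59:59 host sshd[103]: Failed password for bob from 10.0.0.3",
]

attack1 = count_failed_logins(sample, "Feb  3 08")
print("Attack 1: {0}".format(attack1))  # prints: Attack 1: 2
```

With a real file, the lines argument could be the open file object itself, because iterating over a file yields one line at a time.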

Here is the function that prints the length:
def print_length():
    if time in line and failedPass in line:
        if dateHour == 'Feb 3 08':
            countAttack1.append(time)
    length1 = len(countAttack1)
    print(length1)
print_length()
>>>Print length of the List.

Related

How to read ONLY 1 word in python?

I've created an empty text file, and saved some stuff to it. This is what I saved:
Saish ddd TestUser ForTestUse
There is a space before these words. Anyways, I wanted to know how to read only 1 WORD in the text file using python. This is the code I used:
#Uncommenting the line below the line does literally nothing.
import time
#import mmap, re
print("Loading Data...")
time.sleep(2)
with open("User_Data.txt") as f:
    lines = f.read() ##Assume the sample file has 3 lines
    first = lines.split(None, 1)[0]
print(first)
print("Type user number 1 - 4 for using different user.")
ans = input('Is the name above correct?(y/1 - 4) ')
if ans == 'y':
    print("Ok! You will be called", first)
elif ans == '1':
    print("You are already registered to", first)
elif ans == '2':
    print('Switching to accounts...')
    time.sleep(0.5)
    with open("User_Data.txt") as f:
        lines = f.read() ##Assume the sample file has 3 lines
        second = lines.split(None, 2)[2]
    print(second)
    #Fix the password issue! Very important as this is SECURITY!!!
When I run the code, my output is:
Loading Data...
Saish
Type user number 1 - 4 for using different user.
Is the name above correct?(y/1 - 4) 2
Switching to accounts...
TestUser ForTestUse
As you can see, it displays both "TestUser" and "ForTestUse", while I only want it to display "TestUser".
When you give a limit to split(), all the items from that limit to the end are combined. So if you do
lines = 'Saish ddd TestUser ForTestUse'
split = lines.split(None, 2)
the result is
['Saish', 'ddd', 'TestUser ForTestUse']
If you just want the third word, don't give a limit to split().
second = lines.split()[2]
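A short sketch of how the limit changes the result, using the line from the question (including its leading space). The variable names are only for illustration.

```python
lines = " Saish ddd TestUser ForTestUse"

# With a limit of 2, at most two splits happen; the rest stays in one piece.
limited = lines.split(None, 2)
print(limited)  # ['Saish', 'ddd', 'TestUser ForTestUse']

# Without a limit, every run of whitespace separates two words.
words = lines.split()
print(words)  # ['Saish', 'ddd', 'TestUser', 'ForTestUse']

second = words[2]
print(second)  # TestUser
```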
You can call it directly without passing None:
lines.split()[2]
I understand you're passing (None, 2) because you want to get None if there is no value at index 2.
A simple way to check whether the index is available in the list:
Python 2
2 in zip(*enumerate(lines.split()))[0]
Python 3
2 in list(zip(*enumerate(lines.split())))[0]
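A plainer alternative to the zip/enumerate check is to compare the index against the length of the word list. The helper below is a sketch; word_at and its default parameter are hypothetical names, not part of any library.

```python
def word_at(text, index, default=None):
    """Return the whitespace-separated word at index, or default if absent."""
    words = text.split()
    if 0 <= index < len(words):
        return words[index]
    return default


line = " Saish ddd TestUser ForTestUse"
print(word_at(line, 2))   # TestUser
print(word_at(line, 10))  # None
print(word_at("", 0))     # None, an empty string has no words
```

Unlike indexing the result of zip directly, this also copes with an empty string, where there is no first element to index.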

Trying to transcribe using ascii values using ord and chr

Trying to make an encrypter that shifts ascii values of each character in a message by the value of a corresponding character in a password - Output always results in either a single character or a string index out of range error:
msg = input()
pw = input()
pwN = 0
msgN = 0
for i in msg:
    newmsg =""
    nchar = chr(ord(msg[msgN]) + ord(pw[pwN]))
    pwN += 1
    msgN += 1
    if pwN > len(pw):
        pwN = 0
    newmsg += nchar
print (newmsg)
Running it in this form results in a single character rather than a message length string in some cases, and in others gives me this error:
Traceback (most recent call last):
  File "file", line 8, in <module>
    nchar = str(chr(ord(msg[msgN]) + ord(pw[pwN])))
IndexError: string index out of range
I can't figure out what I'm missing.
The issue is that you're setting newmsg to the empty string in each loop. Moving newmsg = "" before the for loop should fix the issue of single characters, although figuring out the out of range error is difficult because of your manual increasing of several indices while also iterating over msg.
I would suggest taking a look at the iteration features Python offers. You are technically iterating over msg, but never actually use i, instead relying solely on indices. A more pythonic way to solve this might be as follows:
from itertools import cycle
msg = input()
pw = input()
newmsg = ""
for mchar, pwchar in zip(msg, cycle(pw)): # cycle loops the password so that abc becomes abcabcabc...
    newmsg += chr(ord(mchar) + ord(pwchar))
print(newmsg)
That works if you want to stick to the loop. I would even use a generator expression with join to build the string in one go:
from itertools import cycle
msg = input()
pw = input()
newmsg = "".join(chr(ord(mchar) + ord(pwchar)) for mchar, pwchar in zip(msg, cycle(pw)))
print(newmsg)
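As for the out of range error in the original loop: the check pwN > len(pw) lets pwN reach len(pw), which is one past the last valid index of pw. A minimal sketch of two ways around that follows; the function names encrypt_indexed and shift, and the sign parameter used to reverse the shift, are assumptions added for illustration.

```python
from itertools import cycle


def encrypt_indexed(msg, pw):
    """Index-based variant: the modulo keeps the password index in range."""
    newmsg = ""  # created once, before the loop
    for n, mchar in enumerate(msg):
        newmsg += chr(ord(mchar) + ord(pw[n % len(pw)]))
    return newmsg


def shift(msg, pw, sign=1):
    """cycle-based variant; sign=-1 undoes a shift made with sign=1."""
    return "".join(
        chr(ord(mchar) + sign * ord(pwchar))
        for mchar, pwchar in zip(msg, cycle(pw))
    )


secret = shift("hello", "ab")
print(secret == encrypt_indexed("hello", "ab"))  # True
print(shift(secret, "ab", sign=-1))              # hello
```

Both variants assume a non-empty password; encrypt_indexed would divide by zero on an empty one.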

File reading & counting & sorting by hours in Python

I'm new to Python & here is my question
Write a program to read through the mbox-short.txt and figure out the distribution by hour of the day for each of the messages. You can pull the hour out from the 'From ' line by finding the time and then splitting the string a second time using a colon.
From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
Once you have accumulated the counts for each hour, print out the counts, sorted by hour as shown below.
Link of the file:
http://www.pythonlearn.com/code/mbox-short.txt
This is my code:
name = raw_input("Enter file:")
if len(name) < 1 : name = "mbox-short.txt"
handle = open(name)
counts = dict()
for line in handle:
    if not line.startswith ("From "):continue
    #words = line.split()
    col = line.find(':')
    coll = col - 2
    print coll
    #zero = line.find('0')
    #one = line.find('1')
    #b = line[ zero or one : col ]
    #print b
    #hour = words[5:6]
    #print hour
    #for line in hour:
    #    hr = line.split(':')
    #    x = hr[1]
    for x in coll:
        counts[x] = counts.get(x,0) + 1
for key, value in sorted(counts.items()):
    print key, value
My first try used list splitting (see the comments), and it didn't work because it treated the 0 and the 1 as the first and second letters, not as numbers.
The second try used line.find(':'), which partially worked, but with minutes rather than hours as required!
First question
Why when I write line.find(:), it takes automatically the 2 numbers after?
Second question
Why when I run the program now, it gives an error
TypeError: 'int' object is not iterable on line 26 ??
Third question
Why it considered 0 & 1 as first & second letters of the line not 0 & 1 numbers
Finally
If possible please solve me this problem with a little of explanation please (with the same codes to keep my learning sequence)
Thank you...
First question
Why when I write line.find(:), it takes automatically the 2 numbers
after?
str.find() returns the first index of the character that you want to find. If your string is "From 00:00:00", it returns 7, as the first ':' is at index 7.
Second question
Why when I run the program now, it gives an error TypeError: 'int'
object is not iterable on line 26 ??
As said above, it returns an int, which you cannot iterate over.
Third question
Why it considered 0 & 1 as first & second letters of the line not 0 &
1 numbers
I don't really understand what you mean here. As far as I can tell, you try to find the first index at which '0' or '1' occurs and assume that it is the first character of the hour? What about 8-11pm (hours that start with 2)?
Finally If possible please solve me this problem with a little of
explanation please (with the same codes to keep my learning sequence)
Sure, it will be like this:
count = {} # hour -> number of messages
f = open("mbox-short.txt")
for line in f:
    if not line.startswith("From "): continue
    first_colon_index = line.find(":")
    if first_colon_index == -1: # there is no ':'
        continue
    first_char_hour_index = first_colon_index - 2
    # string slicing
    # [a:b] gets the substring from index a up to, but not including, index b
    hour = line[first_char_hour_index:first_char_hour_index+2]
    hour_int = int(hour)
    # if the key exists, increase by 1. If not, set to 1
    if hour_int in count:
        count[hour_int] += 1
    else:
        count[hour_int] = 1
f.close()
# print hour & count, in sorted order
for hour in sorted(count):
    print hour, count[hour]
The part about string slicing can be confusing; you can read more about it in the Python docs.
You also have to be sure that there is no other ":" earlier in the line, or this method will fail because the first ":" will not be the one between hour and minute.
To make sure it works, it's better to use a regex. Something like:
import re

count = {} # hour -> number of messages
f = open("mbox-short.txt")
for line in f:
    if not line.startswith("From"): continue
    match = re.search(r'^From.*?([0-9]{2}:[0-9]{2}:[0-9]{2})', line)
    if match:
        time = match.group(1) # hh:mm:ss
        hh = int(time.split(":")[0])
        # if the key exists, increase by 1. If not, set to 1
        if hh in count:
            count[hh] += 1
        else:
            count[hh] = 1
f.close()
# print hour & count, in sorted order
for hour in sorted(count):
    print hour, count[hour]
That's because str.find() returns the index of the found substring, not the substring itself. Consequently, when you subtract 2 from it and then try to loop through the result, Python complains that you're trying to loop through an integer and raises a TypeError.
You can grab the whole time string as:
time_start = line.find(":")
if time_start == -1: # not found
    continue
time_string = line[time_start-2:time_start+6] # slice out the whole time string
You can then split time_string on : to get hours, minutes and seconds, e.g. hours, minutes, seconds = time_string.split(":", 2) (just keep in mind that those will be strings, not integers), or if you just want the hour:
hour = int(line[time_start-2:time_start])
You can take it from there - just increase your dict value and when you're done with parsing the file sort everything out.
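Putting the pieces together, here is a Python 3 sketch of the approach the exercise hints at: split the 'From ' line into words, then split the time on the colon. The function name hour_distribution and the example.org sample lines are assumptions for illustration, and collections.Counter stands in for the manual dict bookkeeping.

```python
from collections import Counter


def hour_distribution(lines):
    """Count messages per hour, taken from the time field of 'From ' lines."""
    counts = Counter()
    for line in lines:
        if not line.startswith("From "):
            continue  # skips headers such as 'From:' as well
        words = line.split()
        if len(words) < 6:
            continue  # not the expected 'From addr day month date time year'
        hour = words[5].split(":")[0]  # '09:14:16' -> '09'
        counts[hour] += 1
    return counts


sample = [
    "From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008",
    "From: stephen.marquard@uct.ac.za",
    "From a@example.org Fri Jan  4 18:10:48 2008",
    "From b@example.org Fri Jan  4 09:05:31 2008",
]

counts = hour_distribution(sample)
for hour, total in sorted(counts.items()):
    print(hour, total)
# prints:
# 09 2
# 18 1
```

The hours are kept as zero-padded strings, so sorting them as text gives the same order as sorting them as numbers.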

The String is Not Read Fully

I wrote a program to generate a string of digits, consisting of 0, 1, 2, and 3, with length s, and to write the output to the decode.txt file. Below is the code:
import numpy as np
n_one =int(input('Insert the amount of 1: '))
n_two =int(input('Insert the amount of 2: '))
n_three = int(input('Insert the amount of 3: '))
l = n_one+n_two+n_three
n_zero = l+1
s = (2*(n_zero))-1
data = [0]*n_zero + [1]*n_one + [2]*n_two + [3]*n_three
print ("Data string length is %d" % len(data))
while data[0] == 0 and data[s-1]!=0:
    np.random.shuffle(data)
datastring = ''.join(map(str, data))
datastring = str(int(datastring))
files = open('decode.txt', 'w')
files.write(datastring)
files.close()
print("Data string is : %s " % datastring)
The problem occurs when I try to read the file from another program: the program doesn't read the last value of the string.
For example, if the string generated is 30112030000, the other program will only read 3011203000, meaning the last 0 is not read.
But if I type 30112030000 directly into the .txt file, every value is read. I can't figure out what is wrong in my code.
Thank you
Some programs might not like the fact that the file doesn't end with a newline. Try adding files.write('\n') before you close it.
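A sketch of that suggestion, writing the string with a trailing newline and reading it back. Whether this cures the problem depends on how the other program reads the file, which is not shown in the question; the temporary directory is only there so the sketch does not touch a real decode.txt.

```python
import os
import tempfile

datastring = "30112030000"  # example value from the question

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "decode.txt")
    # The with block closes the file, so the data is flushed to disk.
    with open(path, "w") as out:
        out.write(datastring + "\n")  # end the line explicitly
    with open(path) as src:
        raw = src.read()

read_back = raw.rstrip("\n")  # a reader can drop the newline again
print(repr(raw))
print(read_back == datastring)  # True
```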

How do you make tables with previously stored strings?

So the question basically gives me 19 DNA sequences and wants me to make a basic text table. The first column has to be the sequence ID, the second column the length of the sequence, the third is the number of "A"s, the 4th is "G"s, the 5th is "C"s, the 6th is "T"s, the 7th is %GC, and the 8th is whether or not it has "TGA" in the sequence. Then I get all these values and write a table to "dna_stats.txt".
Here is my code:
fh = open("dna.fasta","r")
Acount = 0
Ccount = 0
Gcount = 0
Tcount = 0
seq=0
alllines = fh.readlines()
for line in alllines:
    if line.startswith(">"):
        seq+=1
        continue
    Acount+=line.count("A")
    Ccount+=line.count("C")
    Gcount+=line.count("G")
    Tcount+=line.count("T")
    genomeSize=Acount+Gcount+Ccount+Tcount
    percentGC=(Gcount+Ccount)*100.00/genomeSize
    print "sequence", seq
    print "Length of Sequence",len(line)
    print Acount,Ccount,Gcount,Tcount
    print "Percent of GC","%.2f"%(percentGC)
    if "TGA" in line:
        print "Yes"
    else:
        print "No"
    fh2 = open("dna_stats.txt","w")
    for line in alllines:
        splitlines = line.split()
        lenstr=str(len(line))
        seqstr = str(seq)
        fh2.write(seqstr+"\t"+lenstr+"\n")
I found that you have to convert the variables into strings. I have all of the values calculated correctly when I print them out in the terminal. However, I keep getting only 19 for the first column, when it should go 1,2,3,4,5,etc. to represent all of the sequences. I tried it with the other variables and it just got the total amounts of the whole file. I started trying to make the table but have not finished it.
So my biggest issue is that I don't know how to get the values for the variables for each specific line.
I am new to python and programming in general so any tips or tricks or anything at all will really help.
I am using python version 2.7
Well, your biggest issue:
for line in alllines: #1
    ...
    fh2 = open("dna_stats.txt","w")
    for line in alllines: #2
        ....
Indentation matters. This says "for every line (#1), open a file and then loop over every line again(#2)..."
De-indent those things.
This puts the info in a dictionary as you go and allows for DNA sequences to go over multiple lines:
from __future__ import division # ensure things like 1/2 is 0.5 rather than 0
from collections import defaultdict
fh = open("dna.fasta","r")
alllines = fh.readlines()
fh2 = open("dna_stats.txt","w")
seq=0
data = dict()
for line in alllines:
    if line.startswith(">"):
        seq+=1
        data[seq]=defaultdict(int) #default value will be zero if key is not present hence we can do +=1 without originally initializing to zero
        data[seq]['seq']=seq
        previous_line_end = "" #TGA might be split across lines
        continue
    data[seq]['Acount']+=line.count("A")
    data[seq]['Ccount']+=line.count("C")
    data[seq]['Gcount']+=line.count("G")
    data[seq]['Tcount']+=line.count("T")
    #the counts are running totals, so assign the sum rather than adding it again
    data[seq]['genomeSize']=data[seq]['Acount']+data[seq]['Gcount']+data[seq]['Ccount']+data[seq]['Tcount']
    line_over = previous_line_end + line[:3]
    data[seq]['hasTGA']= data[seq]['hasTGA'] or ("TGA" in line) or ("TGA" in line_over)
    previous_line_end = str.strip(line[-4:]) #save previous_line_end for next line removing new line character.
for seq in sorted(data.keys()):
    data[seq]['percentGC']=(data[seq]['Gcount']+data[seq]['Ccount'])*100.00/data[seq]['genomeSize']
    #one row per sequence: id, length, A, G, C, T, %GC, has TGA
    s = '%(seq)d, %(genomeSize)d, %(Acount)d, %(Gcount)d, %(Ccount)d, %(Tcount)d, %(percentGC).2f, %(hasTGA)s\n'
    fh2.write(s % data[seq])
fh.close()
fh2.close()
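For comparison, here is a Python 3 sketch that first joins each sequence into one string and only then computes the statistics, which sidesteps the line-boundary bookkeeping for "TGA". The function name fasta_stats, the use of the header text as the sequence ID, and the tiny sample records are assumptions for illustration; they are not taken from dna.fasta.

```python
def fasta_stats(lines):
    """Return one row per sequence: id, length, A, G, C, T, %GC, has TGA."""
    sequences = {}  # header text -> full sequence (insertion order is kept)
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:]
            sequences[current] = ""
        elif current is not None:
            sequences[current] += line  # join multi-line sequences

    rows = []
    for seq_id, seq in sequences.items():
        gc = (seq.count("G") + seq.count("C")) * 100.0 / len(seq) if seq else 0.0
        rows.append((
            seq_id, len(seq),
            seq.count("A"), seq.count("G"), seq.count("C"), seq.count("T"),
            "%.2f" % gc,
            "Yes" if "TGA" in seq else "No",
        ))
    return rows


# Hypothetical input: the first sequence spans two lines.
sample = [">seq1", "ATG", "ACC", ">seq2", "GGCC"]

rows = fasta_stats(sample)
for row in rows:
    print("\t".join(str(value) for value in row))
```

Each row could be written to dna_stats.txt with the same tab-joined line plus a trailing newline.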
