I'm trying to write shorter, more Pythonic, more readable Python. I have this working solution for Project Euler's problem 8 (find the greatest product of 5 consecutive digits in a 1000-digit number).
Suggestions for writing a more pythonic version of this script?
numstring = ''
for line in open('8.txt'):
    numstring += line.rstrip()
nums = [int(x) for x in numstring]
best=0
for i in range(len(nums)-4):
    subset = nums[i:i+5]
    product=1
    for x in subset:
        product *= x
    if product>best:
        best=product
        bestsubset=subset
print best
print bestsubset
For example: there's gotta be a one-liner for the below snippet. I'm sure there's a past topic on here but I'm not sure how to describe what I'm doing below.
numstring = ''
for line in open('8.txt'):
    numstring += line.rstrip()
Any suggestions? thanks guys!
I'm working on a full answer, but for now here's the one liner
numstring = ''.join(x.rstrip() for x in open('8.txt'))
Edit: Here you go! One liner for the search. List comprehensions are wonderful.
from operator import mul
def prod(list):
    return reduce(mul, list)
numstring = ''.join(x.rstrip() for x in open('8.txt'))
nums = [int(x) for x in numstring]
print max(prod(nums[i:i+5]) for i in range(len(nums)-4))
from operator import mul
def product(nums):
    return reduce(mul, nums)
nums = [int(c) for c in open('8.txt').read() if c.isdigit()]
# Stop at len(nums) - 4 so that every slice holds exactly 5 digits
result = max(product(nums[i:i+5]) for i in range(len(nums) - 4))
Here is my solution. I tried to write the most "Pythonic" code that I know how to write.
with open('8.txt') as f:
    numstring = f.read().replace('\n', '')
nums = [int(x) for x in numstring]
def sub_lists(lst, length):
    for i in range(len(lst) - (length - 1)):
        yield lst[i:i+length]
def prod(lst):
    p = 1
    for x in lst:
        p *= x
    return p
best = max(prod(lst) for lst in sub_lists(nums, 5))
print(best)
Arguably, this is one of the ideal cases to use reduce so maybe prod() should be:
# from functools import reduce # uncomment this line for Python 3.x
from operator import mul
def prod(lst):
    return reduce(mul, lst, 1)
I don't like to try to write one-liners where there is a reason to have more than one line. I really like the with statement, and it's my habit to use that for all I/O. For this small problem, you could just do the one-liner, and if you are using PyPy or something the file will get closed when your small program finishes executing and exits. But I like the two-liner using with so I wrote that.
I love the one-liner by @Steven Rumbalski:
nums = [int(c) for c in open('8.txt').read() if c.isdigit()]
Here's how I would probably write that:
with open("8.txt") as f:
    nums = [int(ch) for ch in f.read() if ch.isdigit()]
Again, for this kind of short program, your file will be closed when the program exits so you don't really need to worry about making sure the file gets closed; but I like to make a habit of using with.
As far as explaining what that last bit was, first you create an empty string called numstring:
numstring = ''
Then you loop over every line of text in the file 8.txt (each line is read as a string):
for line in open('8.txt'):
For every line you find, you append the result of line.rstrip() to numstring. rstrip 'strips' the trailing whitespace (newlines, spaces, tabs etc.) from the string:
numstring += line.rstrip()
Say you had a file, 8.txt that contains the text: LineOne \nLyneDeux\t\nLionTree you'd get a result that looked something like this in the end:
>>>'LineOne' #loop first time
>>>'LineOneLyneDeux' # second time around the bush
>>>'LineOneLyneDeuxLionTree' #final answer, reggie
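The walk-through above can be checked without creating a file. This is a minimal sketch for Python 3 that assumes io.StringIO is an acceptable stand-in for open('8.txt'); the names fake_file and snapshots are made up for the illustration.

```python
import io

# Stand-in for open('8.txt'); the contents are the sample text from above.
fake_file = io.StringIO("LineOne \nLyneDeux\t\nLionTree")

numstring = ''
snapshots = []  # hypothetical helper list: numstring after each pass
for line in fake_file:
    numstring += line.rstrip()  # drop trailing spaces, tabs and the newline
    snapshots.append(numstring)

print(snapshots)
```

Each entry in snapshots corresponds to one pass through the loop, so the last entry is the final value of numstring.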
Here's a full solution! First read out the number:
with open("8.txt") as infile:
    number = infile.read().replace("\n", "")
Then create a list of lists with 5 consecutive numbers:
cons_numbers = [list(map(int, number[i:i+5])) for i in range(len(number) - 4)]
Then find the largest and print it:
import operator
print(max(reduce(operator.mul, nums) for nums in cons_numbers))
If you're using Python 3.x you need to replace reduce with functools.reduce.
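For comparison, here is a sketch of the same sliding-window idea for Python 3.8 or newer, where math.prod can stand in for reduce. The function name greatest_window_product and the short sample digit string are made up for the illustration; they are not part of the original problem data.

```python
from math import prod  # available in Python 3.8+

def greatest_window_product(digit_string, width=5):
    """Return the largest product of `width` adjacent digits."""
    digits = [int(ch) for ch in digit_string if ch.isdigit()]
    # len(digits) - width + 1 start positions give full-width windows only.
    return max(prod(digits[i:i + width])
               for i in range(len(digits) - width + 1))

# Made-up sample input, not the real 1000-digit number.
print(greatest_window_product("3675356291", width=5))  # 3150
```

If the input holds fewer digits than width, max() receives an empty sequence and raises ValueError, which seems a reasonable failure mode for a sketch like this.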
Beginner here. I'm having problems with this task: accum("hello") should return "H-Ee-Lll-Llll-Ooooo". But what I get with my code is "H-Ee-Lll- Lll -Ooooo". It doesn't work for double characters. Is this because the iteration variable in "for i in s" "skips" over double "i's" or something? And do you have an idea how I can fix this? I'm not striving for elegant code or something, my goal atm is to try and make easily readable lines for myself :)
Thank you!
(Sorry if this is something basic, I didn't really know what to search for!)
def accum(s):
    s_list = []
    s = [ele for ele in s]
    for i in s:
        sum_ind = ((s.index(i)) + 1) * i
        s_list.append(sum_ind)
    s_list = [e.capitalize() for e in s_list]
    s_list = '-'.join(s_list)
    return s_list
Here's a way to do it:
def accum(stri):
    p = []
    for i, s in enumerate(stri, 1):
        p.append((s*i).capitalize())
    return '-'.join(p)
accum('hello')
'H-Ee-Lll-Llll-Ooooo'
Take a quick read about: enumerate
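To see why the original version breaks on doubled letters, it may help to compare what index and enumerate report for the same input. A small sketch for Python 3; the variable names are chosen only for this illustration.

```python
letters = list("hello")

# list.index returns the position of the FIRST match, so both l's map to 2.
positions_via_index = [letters.index(ch) for ch in letters]

# enumerate pairs every item with its own position.
positions_via_enumerate = [i for i, ch in enumerate(letters)]

print(positions_via_index)      # [0, 1, 2, 2, 4]
print(positions_via_enumerate)  # [0, 1, 2, 3, 4]
```

So the loop variable does not skip anything; the second "l" is simply looked up at the position of the first one.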
I think you could solve this easily with enumerate:
def accum(s):
    r = []
    for i, letter in enumerate(s):
        r.append(letter.upper() + (letter*i).lower())
    return '-'.join(r)
Here is one way:
def accum(s):
    return "-".join( (c*i).capitalize() for i,c in enumerate(s,1) )
yields:
'H-Ee-Lll-Llll-Ooooo'
Since many comments mention enumerate, here is a short explanation of how it works.
Your task is: for each letter in the string, find its index (position), then capitalize the letter and glue it to index-many lowercase copies of itself.
So you need a counter that keeps track of each letter's position. We could do something like this (a dummy, slow example):
def accum(word):
    ans = ""
    for index in range(len(word)):
        letter = word[index]
        ans += letter.upper() + index*letter.lower() + "-"
    ans = ans[:-1] #Remove trailing '-'
    return ans
But this function gets slow for long inputs, because it builds the result by repeated string concatenation instead of collecting the pieces and joining them once.
That's why people ask you to use enumerate. In short it keeps track of your indices.
for index, name in enumerate(["John", "Lex", "Vui"]):
    print(name, "is found at", index)
That's it !
(I'm not writing out the answer you wanted, since almost everyone else has already provided good ones; my aim was to explain the use of enumerate and to show a slower alternative for your problem.)
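Putting the two points together (use enumerate for the counter, and join the pieces once instead of concatenating), one possible sketch for Python 3 looks like this. The start=1 argument is an assumption about the desired counting: it makes the first letter count as 1 rather than 0.

```python
def accum(word):
    """Build 'A-Bb-Ccc' style output by collecting parts and joining once."""
    parts = []
    # start=1 makes the counter begin at 1, so each letter is repeated
    # as many times as its 1-based position.
    for count, letter in enumerate(word, start=1):
        parts.append((letter * count).capitalize())
    return "-".join(parts)

print(accum("hello"))  # H-Ee-Lll-Llll-Ooooo
```

Because the pieces go into a list and are joined at the end, there is no trailing '-' to strip afterwards.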
I am a newbie in python and I am working on a function that I expect to pass a string like abcd and it outputs something like A-Bb-Ccc-Dddd.
I have created the following.
def mumbler(s):
chars = list(s)
mumbled = []
result = []
for char in chars:
caps = char.upper()
num = chars.index(char)
low = char.lower()
mumbled.append( caps+ low*num)
for i in mumbled:
result.append(i+'-')
result = ''.join(result)
return result[:-1]
It works for most cases. However, when I pass a string like Abcda, it fails to return the expected output, which in this case is A-Bb-Ccc-Dddd-Aaaaa.
How should I go about solving this?
Thank you for taking the time to answer this.
You can do it in a much simpler way using list comprehension and enumerate
>>> s = 'abcd'
>>> '-'.join([c.upper() + c.lower()*i for i,c in enumerate(s)])
'A-Bb-Ccc-Dddd'
If you want to make your own code work, you'll just need to convert the result list to a string outside your second for-loop. Note that chars.index(char) returns the first matching position, so this version still gives the wrong repeat count when the exact same character occurs twice (for example 'abcda'); enumerate avoids that.
def mumbler(s):
    chars = list(s)
    mumbled = []
    result = []
    for char in chars:
        caps = char.upper()
        num = chars.index(char)
        low = char.lower()
        mumbled.append( caps+ low*num)
    for i in mumbled:
        result.append(i+'-')
    result = ''.join(result)
    return result[:-1]
mumbler('Abcda')
'A-Bb-Ccc-Dddd-Aaaaa'
Go for a simple 1-liner - next() on count for maintaining the times to repeat and title() for title-casing:
from itertools import count
s = 'Abcda'
i = count(1)
print('-'.join([(x * next(i)).title() for x in s]))
# A-Bb-Ccc-Dddd-Aaaaa
I am trying to make a reverse function which takes an input (text) and outputs the reversed version. So "Polar" would print raloP.
def reverse(text):
    list = []
    text = str(text)
    x = len(text) - 1
    list.append("T" * x)
    for i in text:
        list.insert(x, i)
        x -= 1
    print "".join(list)
reverse("Something")
As others have mentioned, Python already provides a couple of ways to reverse a string. The simple way is to use extended slicing: s[::-1] creates a reversed version of string s. Another way is to use the reversed function: ''.join(reversed(s)). But I guess it can be instructive to try implementing it for yourself.
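Both built-in approaches mentioned above fit in a few lines. A quick sketch for Python 3, using the "Polar" example from the question; the variable names are made up for the illustration.

```python
s = "Polar"

# Extended slicing with a step of -1 walks the string backwards.
reversed_by_slice = s[::-1]

# reversed() yields the characters last to first; join glues them together.
reversed_by_join = "".join(reversed(s))

print(reversed_by_slice)  # raloP
print(reversed_by_join)   # raloP
```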
There are several problems with your code.
Firstly,
list = []
You shouldn't use list as a variable name because that shadows the built-in list type. It won't hurt here, but it makes the code confusing, and if you did try to use list() later on in the function it would raise an exception with a cryptic error message.
text = str(text)
is redundant. text is already a string. str(text) returns the original string object, so it doesn't hurt anything, but it's still pointless.
x = len(text) - 1
list.append("T" * x)
You have an off-by-one error here. You really want to fill the list with as many items as are in the original string, this is short by one. Also, this code appends the string as a single item to the list, not as x separate items of one char each.
list.insert(x, i)
The .insert method inserts new items into a list, the subsequent items after the insertion point get moved up to make room. We don't want that, we just want to overwrite the current item at the x position, and we can do that by indexing.
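The difference between inserting and overwriting is easy to see on a tiny list. A sketch for Python 3 with made-up list names:

```python
shifted = ["a", "b", "c"]
shifted.insert(1, "X")   # later items move up; the list grows by one
print(shifted)           # ['a', 'X', 'b', 'c']

overwritten = ["a", "b", "c"]
overwritten[1] = "X"     # the item at index 1 is replaced; length unchanged
print(overwritten)       # ['a', 'X', 'c']
```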
When your code doesn't behave the way you expect it to, it's a Good Idea to add print statements at strategic places to make sure that variables have the value that they're supposed to have. That makes it much easier to find where things are going wrong.
Anyway, here's a repaired version of your code.
def reverse(text):
    lst = []
    x = len(text)
    lst.extend("T" * x)
    for i in text:
        x -= 1
        lst[x] = i
    print "".join(lst)
reverse("Something")
output
gnihtemoS
Here's an alternative approach, showing how to do it with .insert:
def reverse(text):
    lst = []
    for i in text:
        lst.insert(0, i)
    print "".join(lst)
Finally, instead of using a list we could use string concatenation. However, this approach is less efficient, especially with huge strings, but in modern versions of Python it's not as inefficient as it once was, as the str type has been optimised to handle this fairly common operation.
def reverse(text):
    s = ''
    for i in text:
        s = i + s
    print s
BTW, you really should be learning Python 3, Python 2 reaches its official End Of Life in 2020.
You can try:
def reverse(text):
    return text[::-1]
print(reverse("Something")) # python 3
print reverse("Something") # python 2
Easier way to do so:
def reverse(text):
    rev = ""
    i = len(text) - 1
    while i > -1:
        rev += text[i]
        i = i - 1
    return rev
print(reverse("Something"))
result: gnihtemoS
You could simply do
print "something"[::-1]
Hello, I am fairly new to programming.
I would like to know whether there is a function or method that tells us how many letters have been changed in a string.
example:
input:
"Cold"
output:
"Hold"
Hence only 1 letter was changed
or the example:
input:
"Deer"
output:
"Dial"
Hence 3 letters were changed
I spoke too soon. First result googling:
https://pypi.python.org/pypi/python-Levenshtein/
This should be able to measure the minimum number of changes needed to get from one string to another.
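To give an idea of what such a measure computes, here is a pure-Python sketch of the classic dynamic-programming edit distance. It is an illustration only and does not use or reproduce the API of the package linked above; the function name edit_distance is made up.

```python
def edit_distance(a, b):
    """Minimum number of single-character insertions, deletions
    or substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]

print(edit_distance("Cold", "Hold"))  # 1
print(edit_distance("Deer", "Dial"))  # 3
```

Unlike a plain position-by-position comparison, this also handles strings of different lengths.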
If you don't need to consider character insertions or deletions, the problem is reduced to simply counting the number of characters that are different between the strings.
Since you're new to programming, an imperative-style program would be:
def differences(string1,string2):
    i=0
    different=0
    for i in range(len(string1)):
        if string1[i]!=string2[i]:
            different= different+1
    return different
something slightly more pythonic would be:
def differences(string1,string2):
    different=0
    for a,b in zip(string1,string2):
        if a!=b:
            different+= 1
    return different
or, if you want to go fully functional:
def differences(string1,string2):
    # Python 2 only: tuple parameters such as lambda (x,y) were removed in Python 3
    return sum(map(lambda (x,y):x!=y, zip(string1,string2)))
which, as @DSM suggested, is equivalent to the more readable generator expression:
def differences(string1,string2):
    return sum(x != y for x,y in zip(string1, string2))
Use the itertools library as follows (Python 3.x)
from itertools import zip_longest
def change_count(string1, string2):
    count = 0
    for i, (char1, char2) in enumerate(zip_longest(string1, string2)):
        if char1 != char2:
            count = count + 1
    return count
string1 = input("Enter one string: ")
string2 = input("Enter another string: ")
changed = change_count(string1, string2)
print("Times changed: ", changed)
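The practical difference between zip and zip_longest only shows up when the strings have different lengths: zip stops at the end of the shorter string, while zip_longest pads the shorter one with None. A sketch for Python 3, with made-up variable names and the words "Cold" and "Waldo" chosen as sample input:

```python
from itertools import zip_longest

a, b = "Cold", "Waldo"

# zip ignores the extra 'o' at the end of "Waldo".
count_zip = sum(x != y for x, y in zip(a, b))

# zip_longest compares that 'o' against None, which counts as a change.
count_longest = sum(x != y for x, y in zip_longest(a, b))

print(count_zip)      # 2
print(count_longest)  # 3
```

Which behaviour is right depends on whether extra trailing characters should count as changes.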
Check out the difflib library, particularly the ndiff function. Note: this is kind of overkill for the required job, but it is really great for seeing the differences between two files (you can see which lines are new, which are changed, and so on).
import difflib
word1 = "Cold"
word2 = "Waldo"
i = 0
differences = difflib.ndiff(word1, word2)
for line in differences:
    if line[0] != " ":  # any line not starting with a space marks a difference
        i += 1
print(i)
I am having some trouble with a piece of code below:
Input: li is a nested list as below:
li = [['>0123456789 mouse gene 1\n', 'ATGTTGGGTT/CTTAGTTG\n', 'ATGGGGTTCCT/A\n'], ['>9876543210 mouse gene 2\n', 'ATTTGGTTTCCT\n', 'ATTCAATTTTAAGGGGGGGG\n']]
Using the function below, my desired output is simply the 2nd to the 9th digits following '>' under the condition that the number of '/' present in the entire sublist is > 1.
Instead, my code gives the digits to all entries. Also, it gives them multiple times. I therefore assume something is wrong with my counter and my for loop. I can't quite figure this out.
Any help, greatly appreciated.
import os
cwd = os.getcwd()
def func_one():
    outp = open('something.txt', 'w') #output file
    li = []
    for i in os.listdir(cwd):
        if i.endswith('.ext'):
            inp = open(i, 'r').readlines()
            li.append(inp)
    count = 0
    lis = []
    for i in li:
        for j in i:
            for k in j[1:]: #ignore first entry in sublist
                if k == '/':
                    count += 1
                if count > 1:
                    lis.append(i[0][1:10])
                    next_func(lis, outp)
Thanks,
S :-)
Your indentation is possibly wrong, you should check count > 1 within the for j in i loop, not within the one that checks every single character in j[1:].
Also, here's a much easier way to do the same thing:
def count_slashes(items):
    return sum(item.count('/') for item in items)
for item in li:
    if count_slashes(item[1:]) > 1:
        print item[0][1:10]
Or, if you need the IDs in a list:
result = [item[0][1:10] for item in li if count_slashes(item[1:]) > 1]
Python list comprehensions and generator expressions are really powerful tools, try to learn how to use them as it makes your life much simpler. The count_slashes function above uses a generator expression, and my last code snippet uses a list comprehension to construct the result list in a nice and concise way.
Tamás has suggested a good solution, although it uses a very different style of coding than you do. Still, since your question was "I am having some trouble with a piece of code below", I think something more is called for.
How to avoid these problems in the future
You've made several mistakes in your approach to getting from "I think I know how to write this code" to having actual working code.
You are using meaningless names for your variables which makes it nearly impossible to understand your code, including for yourself. The thought "but I know what each variable means" is obviously wrong, otherwise you would have managed to solve this yourself. Notice below, where I fix your code, how difficult it is to describe and discuss your code.
You are trying to solve the whole problem at once instead of breaking it down into pieces. Write small functions or pieces of code that do just one thing, one piece at a time. For each piece you work on, get it right and test it to make sure it is right. Then go on writing other pieces which perhaps use pieces you've already got. I'm saying "pieces" but usually this means functions, methods or classes.
Fixing your code
That is what you asked for and nobody else has done so.
You need to move the count = 0 line to after the for i in li: line (indented appropriately). This will reset the counter for every sub-list. Second, once you have appended to lis and run your next_func, you need to break out of the for k in j[1:] loop and the encompassing for j in i: loop.
Here's a working code example (without the next_func but you can add that next to the append):
>>> li = [['>0123456789 mouse gene 1\n', 'ATGTTGGGTT/CTTAGTTG\n', 'ATGGGGTTCCT/A\n'], ['>9876543210 mouse gene 2\n', 'ATTTGGTTTCCT\n', 'ATTCAATTTTAAGGGGGGGG\n']]
>>> lis = []
>>> for i in li:
        count = 0
        for j in i:
            break_out = False
            for k in j[1:]:
                if k == '/':
                    count += 1
                if count > 1:
                    lis.append(i[0][1:10])
                    break_out = True
                    break
            if break_out:
                break
>>> lis
['012345678']
Re-writing your code to make it readable
This is so you see what I meant in the beginning of my answer.
>>> def count_slashes(gene):
        "count the number of '/' characters in the DNA sequences of the gene."
        count = 0
        dna_sequences = gene[1:]
        for sequence in dna_sequences:
            count += sequence.count('/')
        return count
>>> def get_gene_name(gene):
        "get the name of the gene"
        gene_title_line = gene[0]
        gene_name = gene_title_line[1:10]
        return gene_name
>>> genes = [['>0123456789 mouse gene 1\n', 'ATGTTGGGTT/CTTAGTTG\n', 'ATGGGGTTCCT/A\n'], ['>9876543210 mouse gene 2\n', 'ATTTGGTTTCCT\n', 'ATTCAATTTTAAGGGGGGGG\n']]
>>> results = []
>>> for gene in genes:
        if count_slashes(gene) > 1:
            results.append(get_gene_name(gene))
>>> results
['012345678']
>>>
import glob
lis = []
with open('output.txt', 'w') as outfile:
    for file in glob.iglob('*.ext'):
        content = open(file).read()
        if content.partition('\n')[2].count('/') > 1:
            lis.append(content[1:10])
    next_func(lis, outfile)
The reason you get digits for all entries is that you're not resetting the counter for each sublist.