Recently I've been working on code that stores sets of values in lists, with each of those lists kept inside one outer list. I then strip some characters from the values and convert them to float. Finally, I take the smallest number from each list and save it to another list. The problem is that when I move those numbers to the new list, I only get a single number rather than the whole list. Can somebody help me?
Here's the code:
from typing import List
from bs4 import BeautifulSoup
import requests
import pandas as pd
from decimal import Decimal
ListaPreciosCromos = list()
ListaUrl = ['https://steamcommunity.com/market/search?category_753_Game%5B%5D=tag_app_495570&category_753_cardborder%5B%5D=tag_cardborder_0&category_753_item_class%5B%5D=tag_item_class_2#p1_price_asc', 'https://steamcommunity.com/market/search?category_753_Game%5B%5D=tag_app_540190&category_753_cardborder%5B%5D=tag_cardborder_0&category_753_item_class%5B%5D=tag_item_class_2#p1_price_asc', 'https://steamcommunity.com/market/search?category_753_Game%5B%5D=tag_app_607210&category_753_cardborder%5B%5D=tag_cardborder_0&category_753_item_class%5B%5D=tag_item_class_2#p1_price_asc',]
PageCromos = [requests.get(x) for x in ListaUrl]
SoupCromos = [BeautifulSoup(x.content, "html.parser") for x in PageCromos]
PrecioCromos = [x.find_all("span", {"data-price": True}) for x in SoupCromos]
for x in PrecioCromos:
    for i in x:
        Cromolist2 = [h.replace("$","") for h in i]
        CromoList3 = [h.replace("USD","") for h in Cromolist2]
        CromoList4 = [float(h) for h in CromoList3]
        CantidadCromos = len(CromoList4)
        CromoList5 = sorted(CromoList4)
        CromoList6 = CromoList5[0]
print(CromoList6)
Output:
0.06
CromoList6 = CromoList5[0] overwrites CromoList6 on every pass through the loop, so only the last value survives. Create the list once before the loop (CromoList6 = []) and change that line to CromoList6.append(CromoList5[0]).
Instead of calling replace() twice, you can use strip('USD$'), which removes any of the characters U, S, D and $ from both ends of the string.
You can use min() to get the smallest value of a list instead of sorting it: min() takes O(N) time, while sorted() takes O(N log N).
Since you need a list of minimum values, append each page's minimum to a result list, so that it ends up holding one minimum per URL.
Here is a shortened version of your code with the changes mentioned above.
from typing import List
from bs4 import BeautifulSoup
import requests
ListaPreciosCromos = list()
ListaUrl = ['https://steamcommunity.com/market/search?category_753_Game%5B%5D=tag_app_495570&category_753_cardborder%5B%5D=tag_cardborder_0&category_753_item_class%5B%5D=tag_item_class_2#p1_price_asc', 'https://steamcommunity.com/market/search?category_753_Game%5B%5D=tag_app_540190&category_753_cardborder%5B%5D=tag_cardborder_0&category_753_item_class%5B%5D=tag_item_class_2#p1_price_asc', 'https://steamcommunity.com/market/search?category_753_Game%5B%5D=tag_app_607210&category_753_cardborder%5B%5D=tag_cardborder_0&category_753_item_class%5B%5D=tag_item_class_2#p1_price_asc',]
PageCromos = [requests.get(x) for x in ListaUrl]
SoupCromos = [BeautifulSoup(x.content, "html.parser") for x in PageCromos]
PrecioCromos = [x.find_all("span", {"data-price": True}) for x in SoupCromos]
min_CromoList = []
for item in PrecioCromos:
    # One list of prices per page; strip the currency markers before converting.
    CromoList = [float(i.text.strip('USD$')) for i in item]
    min_CromoList.append(min(CromoList))
print(min_CromoList)
Output:
[0.04, 0.05, 0.05]
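The parsing step can be tried without any network access. The following is a minimal sketch, assuming the scraped text looks like "$0.06 USD"; the sample strings and the helper name parse_price are made up for illustration and are not taken from the Steam pages.

```python
# Hypothetical price strings, shaped like the text of the scraped <span> tags.
pages = [
    ["$0.06 USD", "$0.04 USD", "$0.10 USD"],
    ["$0.05 USD", "$0.07 USD"],
]


def parse_price(text):
    # strip() removes any of the listed characters from both ends of the string;
    # float() tolerates the whitespace that is left over.
    return float(text.strip("USD$"))


# One minimum per page, collected into a single list.
min_per_page = [min(parse_price(t) for t in page) for page in pages]
print(min_per_page)
```

Because strip() works on a set of characters rather than on a substring, it would also eat a leading or trailing U, S or D that belongs to the value itself, which is harmless for numeric prices like these.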
I need to solve for exmax, eymax, exymax and Nxmax, each of which is a list (there are two values for each variable, zipped in the proper order), given that all of these variables are some combination of the others.
My problem is that the results come back with type 'FiniteSet', which stops me from iterating over them properly.
import math
import numpy as np
from astropy.table import QTable, Table, Column
from collections import Counter
import operator
from sympy import *
exmax= symbols('exmax')
eymax= symbols('eymax')
exymax= symbols('exymax')
Nxmax=symbols('Nxmax')
# This isn't the actual value, but it is important to have a len of two here for later on
Stiffnessofplies=[1, 1]
Nxmax=[78.4613527541947*exmax + 8.06201746514537e-15*exymax + 4.07395485454472*eymax,
69.4081197440953*exmax + 1.35798495151491*eymax]
exmax= [{(-1.0275144618526e-16*exymax - 0.0519230769230769*eymax,)},
{(-0.0195652173913043*eymax,)}]
eymax = [{(-0.0284210526315789*exmax + 8.11515424209734e-19*exymax,)},
{(-0.299999999999999*exmax,)}]
exymax = [{(-7.78938885521292e-17*exmax + 1.12391245013323e-18*eymax,)}, {(0,)}]
exmax2=[]
for i in exmax:
    for j in i:
        exmax2.append(j)
eymax2=[]
for i in eymax:
    for j in i:
        eymax2.append(j)
exymax2=[]
for i in exymax:
    for j in i:
        exymax2.append(j)
I wrote these last three loops to try to flatten everything out and make it iterable. Here are other things I have tried:
#Pleasework=[]
#for i in range(0,len(Stiffnessofplies)):
# linsolve([exmax2[i]], [eymax2[i]], [exymax2[i]], [Nxmax[i]], (exmax, eymax, exymax,Nxmax))
#System= exmax2[0],eymax2[0],exymax2[0]
#linsolve(System, exmax,eymax,exymax,Nxmax)
#Masterlist=list(zip(exmax,eymax,exymax,Nxmax))
I think one of my main issues is that the type I'm getting back, 'FiniteSet', really doesn't work well when I try to iterate over the list to get both values.
I have the following code:
U_abs = abs(U)
index_max = np.argmax(U_abs[k:n,k])
memory_1 = U[k:n,k]
memory_2 = U[k:n,index_max]
print(memory_1)
print(memory_2)
U[k:n,k] = memory_2
U[k:n,index_max]= memory_1
print(memory_1)
print(memory_2)
I need the values of memory_1 and memory_2 not to change, but when I assign to U[k:n,k] and U[k:n,index_max], the values of memory_1 and memory_2 change as well. This is my first day with Python. Any idea how to fix this?
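I'm assuming that U is a NumPy array. If so, slicing it returns a view onto the same data rather than a new array, so memory_1 and memory_2 change whenever U does. You can replace lines 3 and 4 with explicit copies, using the copy() method:
memory_1 = U[k:n,k].copy()
memory_2 = U[k:n,index_max].copy()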
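The difference between a view and a copy can be seen on a small array. This is only a sketch: the 4x3 array and the names view and snapshot are invented for the demonstration and stand in for U, memory_1 and memory_2.

```python
import numpy as np

# A small stand-in for U; the values are arbitrary.
U = np.arange(12.0).reshape(4, 3)

view = U[1:4, 0]             # basic slicing: shares memory with U
snapshot = U[1:4, 0].copy()  # independent copy of the same values

U[1:4, 0] = -1.0             # overwrite that part of the column in place

print(view)      # follows the change made to U
print(snapshot)  # still holds the original values

# np.shares_memory reports whether two arrays overlap in memory.
print(np.shares_memory(view, U), np.shares_memory(snapshot, U))
```

With copies taken first, swapping two columns works as intended, because the saved values no longer depend on what is later written into U.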
I am new to Python, and am struggling with a task that I assume is an extremely simple one for an experienced programmer.
I am trying to create a list of lists of coordinates for different lines. For instance:
list = [ [(x,y), (x,y), (x,y)], [Line 2 Coordinates], ....]
I have the following code:
from random import randint
masterlist_x = list(range(-5,6))
oneline = []
data = []
numberoflines = list(range(2))
i = 1
for i in numberoflines:
    slope = randint(-5,5)
    y_int = randint(-10,10)
    for element in masterlist_x:
        oneline.append((element,slope * element + y_int))
    data.append(oneline)
The variable that should hold the coordinates of a single line (oneline) ends up holding the points of both lines.
I know this is an issue with the outer looping mechanism, but I am not sure how to proceed.
Any and all help is much appreciated. Thank you very much!
@khuynh is right: you simply had oneline = [] in the wrong place, so the coordinates of every line ended up in the same list.
Also, there are a couple of unnecessary things in your code:
you don't need to wrap range() in list(); a for loop can iterate over a range directly
you don't need to declare i before the for loop; the loop assigns it itself
i is never actually used, which is fine, but the Python convention for an unused variable is _
Fixed version:
from random import randint
masterlist_x = range(-5,6)
data = []
numberoflines = range(2)
for _ in numberoflines:
    # Start a fresh list for every line, so the lines do not share points.
    oneline = []
    slope = randint(-5,5)
    y_int = randint(-10,10)
    for element in masterlist_x:
        oneline.append((element,slope * element + y_int))
    data.append(oneline)
print(data)
You can also run it online: https://repl.it/repls/GreedyRuralProduct
I suspect the whole thing could also be written with much less code, and more simply, as a list comprehension.
UPDATE: the inner loop is indeed well suited to a list comprehension. The outer loop could probably be turned into one as well, making the whole thing two nested list comprehensions, but I only got confused when I tried that. This version is clear, though:
from random import randint
masterlist_x = range(-5,6)
data = []
numberoflines = range(2)
for _ in numberoflines:
    slope = randint(-5,5)
    y_int = randint(-10,10)
    oneline = [(element, slope * element + y_int)
               for element in masterlist_x]
    data.append(oneline)
print(data)
Again on repl.it too: https://repl.it/repls/SoupyIllustriousApplicationsoftware
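One way to turn the outer loop into a comprehension as well, without nesting everything on one line, is to move the per-line work into a small function. This is just a sketch: the helper name random_line and the seeded Random instance are additions for illustration and are not part of the code above.

```python
from random import Random


def random_line(xs, rng):
    """Return the points of one line y = slope * x + y_int with random coefficients."""
    slope = rng.randint(-5, 5)
    y_int = rng.randint(-10, 10)
    return [(x, slope * x + y_int) for x in xs]


rng = Random(0)  # seeded only so that repeated runs give the same lines
xs = range(-5, 6)

# Each call builds a new list, so the lines never share points.
data = [random_line(xs, rng) for _ in range(2)]
print(data)
```

Drawing slope and y_int inside the function keeps both values fixed for all points of a line, which is the part that gets awkward when everything is squeezed into two nested comprehensions.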
I am trying to grep some results pages for work, and eventually print them out to an HTML page so that nobody has to look through each section manually.
How I would eventually use it: I feed this function a results page, it greps through the 5 different sections, and then I can produce an HTML output (that's what the print/substitute area is for) with all the different results.
OK, MASSIVE EDIT: I removed the old code because I was asking too many questions. I fixed my code using some of the suggestions, but I am still interested in the advantage of using a human-readable dict instead of just a list. Here is my working code, which gets all the right results into a 'list of lists'; I then output the first section in my eventual HTML block:
import urllib
import re
import string
import sys
def ipv6_results(input_page):
    sections = ['/spec.p2/summary.html', '/nd.p2/summary.html',
                '/addr.p2/summary.html', '/pmtu.p2/summary.html',
                '/icmp.p2/summary.html']
    variables_output=[]
    for s in sections:
        temp_list = []
        page = input_page + s
        #print page
        url_reference = urllib.urlopen(page)
        html_page = url_reference.read()
        m = re.search(r'TOTAL</B></TD><TD>:</TD><TD>([0-9,]+)', html_page)
        temp_list.append(int(m.group(1)) )
        m = re.search(r'PASS</B></TD><TD>:</TD><TD>([0-9,]+)', html_page)
        temp_list.append(int(m.group(1)))
        m = re.search(r'FAIL</FONT></B></TD><TD>:</TD><TD>([0-9,]+)', html_page)
        temp_list.append(int(m.group(1)))
        variables_output.append(temp_list)
    #print variables to check them :)
    print "------"
    print variables_output
    print "Ready Logo Phase 2"
    print "Section | Total | Pass | Fail |"
    #this next part is eventually going to output an html block
    output = string.Template("""
1 - RFC2460-IPv6 Specs $spec_total $spec_pass $spec_fail
""")
    print output.substitute(spec_total=variables_output[0][0], spec_pass=variables_output[0][1],
                            spec_fail=variables_output[0][2])
    return 1
imagine the tabbing is correct :( I wish this was more like paste bin, suggestions welcome on pasting code in here
Generally, you don't declare the shape of the list first, and then fill in the values. Instead, you build the list as you discover the values.
Your variables have a lot of structure. You've got inner lists of 3 elements, always in the order 'total', 'pass', 'fail'. Perhaps these 3-element lists should be namedtuples. That way, you can access the three parts with human-readable names (data.total, data.passed, data.failed) instead of cryptic index numbers (data[0], data[1], data[2]). Note that pass is a Python keyword, so it cannot be used as a field name; passed and failed are used here instead.
Next, your 3-tuples differ by prefixes: 'spec', 'nd', 'addr', etc.
These sound like keys to a dict rather than elements of a list.
So perhaps consider making variables a dict. That way, you can access the particular 3-tuple you want with the human-readable variables['nd'] instead of variables[1]. And you can access the nd_fail value with variables['nd'].failed instead of variables[1][2]:
import collections
# define the namedtuple class Point (used below).
# `pass` is a keyword and is rejected as a field name, hence passed/failed.
Point = collections.namedtuple('Point', 'total passed failed')
# Notice we declare `variables` empty at first; we'll fill in the values later.
variables={}
keys=('spec','nd','addr','pmtu','icmp')
# Pair each key with its section so that every page is fetched exactly once.
for key, s in zip(keys, sections):
    page = input_page + s
    url_reference = urllib.urlopen(page)
    html_page = url_reference.read()
    m = re.search(r'TOTAL</B></TD><TD>:</TD><TD>([0-9,]+)', html_page)
    ntotal = int(m.group(1))
    m = re.search(r'PASS</B></TD><TD>:</TD><TD>([0-9,]+)', html_page)
    npass = int(m.group(1))
    m = re.search(r'FAIL</FONT></B></TD><TD>:</TD><TD>([0-9,]+)', html_page)
    nfail = int(m.group(1))
    # We create an instance of the namedtuple on the right-hand side
    # and store the value in `variables[key]`, thus building the
    # variables dict incrementally.
    variables[key]=Point(ntotal,npass,nfail)
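The dict-of-namedtuples idea can be tried without downloading anything. The sketch below is written for Python 3 and makes several assumptions: the HTML fragments are invented stand-ins for the summary pages, and the names Result and count are hypothetical helpers, not part of the code above.

```python
import re
from collections import namedtuple

# `pass` is a Python keyword, so the fields are named passed/failed here.
Result = namedtuple("Result", "total passed failed")

# Hypothetical page contents standing in for the downloaded summary pages.
pages = {
    "spec": "<B>TOTAL</B></TD><TD>:</TD><TD>54 "
            "<B>PASS</B></TD><TD>:</TD><TD>50 "
            "<B><FONT>FAIL</FONT></B></TD><TD>:</TD><TD>4",
    "nd": "<B>TOTAL</B></TD><TD>:</TD><TD>1,024 "
          "<B>PASS</B></TD><TD>:</TD><TD>1,000 "
          "<B><FONT>FAIL</FONT></B></TD><TD>:</TD><TD>24",
}


def count(label, html):
    """Extract the number that follows the given label in the summary table."""
    m = re.search(label + r"(?:</FONT>)?</B></TD><TD>:</TD><TD>([0-9,]+)", html)
    # The pattern allows commas, so drop thousands separators before converting.
    return int(m.group(1).replace(",", ""))


results = {
    key: Result(count("TOTAL", html), count("PASS", html), count("FAIL", html))
    for key, html in pages.items()
}

print(results["nd"].failed)  # readable access instead of results[1][2]
```

A namedtuple still behaves like a plain tuple, so code that indexes with [0], [1], [2] keeps working while newer code can use the field names.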
The first thing to note is that those lists only hold the values the variables had at the time of assignment. You would be changing the list elements, but not the variables themselves.
I would seriously consider using classes and building structures out of those, including lists of class instances.
For example:
class SectionResult:
    # `pass` is a reserved word in Python, so the attributes are named passed/failed.
    def __init__(self, total = 0, passed = 0, failed = 0):
        self.total = total
        self.passed = passed
        self.failed = failed
Since it looks like each group should link up with a section, you can create a list of dictionaries (or perhaps a list of class instances?) with the bits associated with a section:
sections = [{'results' : SectionResult(), 'filename': '/addr.p2/summary.html'}, ....]
Then in the loop:
for section in sections:
    page = input_page + section['filename']
    url_reference = urllib.urlopen(page)
    html_page = url_reference.read()
    m = re.search(r'TOTAL</B></TD><TD>:</TD><TD>([0-9,]+)', html_page)
    section['results'].total = int(m.group(1))
    m = re.search(r'PASS</B></TD><TD>:</TD><TD>([0-9,]+)', html_page)
    section['results'].passed = int(m.group(1))
    m = re.search(r'FAIL</FONT></B></TD><TD>:</TD><TD>([0-9,]+)', html_page)
    section['results'].failed = int(m.group(1))
I would use a dictionary inside a list. Maybe something like:
def ipv6_results(input_page):
    # The keys are string literals; bare names such as pass would be a syntax error.
    results = [{'file_name':'/spec.p2/summary.html', 'total':0, 'pass':0, 'fail':0},
               {'file_name':'/nd.p2/summary.html', 'total':0, 'pass':0, 'fail':0},
               {'file_name':'/addr.p2/summary.html', 'total':0, 'pass':0, 'fail':0},
               {'file_name':'/pmtu.p2/summary.html', 'total':0, 'pass':0, 'fail':0},
               {'file_name':'/icmp.p2/summary.html', 'total':0, 'pass':0, 'fail':0}]
    for r in results:
        url_reference = urllib.urlopen(input_page + r['file_name'])
        html_page = url_reference.read()
        m = re.search(r'TOTAL</B></TD><TD>:</TD><TD>([0-9,]+)', html_page)
        r['total'] = int(m.group(1))
        m = re.search(r'PASS</B></TD><TD>:</TD><TD>([0-9,]+)', html_page)
        r['pass'] = int(m.group(1))
        m = re.search(r'FAIL</FONT></B></TD><TD>:</TD><TD>([0-9,]+)', html_page)
        r['fail'] = int(m.group(1))
    for r in results:
        print r['total']
        print r['pass']
        print r['fail']
I have read this answer, which is potentially the best way to randomize a list of strings in Python. I'm just wondering whether that's the most efficient way to do it, because I have a list of about 30 million elements built by the following code:
import json
from sets import Set
from random import shuffle
a = []
for i in range(0,193):
    json_data = open("C:/Twitter/user/user_" + str(i) + ".json")
    data = json.load(json_data)
    for j in range(0,len(data)):
        a.append(data[j]['su'])
new = list(Set(a))
print "Cleaned length is: " + str(len(new))
## Take Cleaned List and Randomize it for Analysis
shuffle(new)
If there is a more efficient way to do it, I'd greatly appreciate any advice on how to do it.
A couple of possible suggestions:
import json
from random import shuffle
a = set()
for i in range(193):
    with open("C:/Twitter/user/user_{0}.json".format(i)) as json_data:
        data = json.load(json_data)
    a.update(d['su'] for d in data)
print("Cleaned length is {0}".format(len(a)))
# Take Cleaned List and Randomize it for Analysis
new = list(a)
shuffle(new)
The only way to know whether this is faster is to profile it!
Do you prefer sets.Set to the built-in set() for a reason?
I have introduced a with statement (the preferred way of opening files, as it guarantees they get closed).
It did not appear that you were doing anything with 'a' as a list except converting it to a set; why not make it a set from the start?
Rather than iterating over an index and then looking each item up by that index, I just iterate over the data items...
...which makes it easy to rewrite as a generator expression.
If you are going to shuffle, you're probably better off using the solution from this question:
randomly mix lines of 3 million-line file
Basically, the random number generator behind shuffle has a period that is small compared with the number of possible orderings, so it cannot reach every permutation of 3 million lines, let alone 30 million. If you can load the data into memory, your best bet is what they suggest there: assign a random number to each line and sort on it.
See this thread. Here is that approach applied to your code:
import json
import random
from operator import itemgetter
a = set()
for i in range(0,193):
    json_data = open("C:/Twitter/user/user_" + str(i) + ".json")
    data = json.load(json_data)
    a.update(d['su'] for d in data)
print "Cleaned length is: " + str(len(a))
# Pair each element with a random key, sort on the key, then drop the key.
new = [(random.random(), el) for el in a]
new.sort()
new = map(itemgetter(1), new)
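The same decorate, sort, undecorate idea can be written as a small Python 3 function. This is a sketch under a few assumptions: the function name shuffled_by_key and the optional rng parameter are introduced here for illustration, and the sample names are invented.

```python
import random


def shuffled_by_key(items, rng=random):
    """Return a new list with the items ordered by independently drawn random keys."""
    keyed = [(rng.random(), item) for item in items]
    # Sort on the random key only, so the items themselves are never compared.
    keyed.sort(key=lambda pair: pair[0])
    return [item for _, item in keyed]


names = ["ana", "bo", "cy", "di", "ed", "flo"]
result = shuffled_by_key(names, random.Random(1))  # seeded for repeatable runs
print(result)
```

Sorting costs O(N log N) where random.shuffle is O(N), and both draw their numbers from the same kind of generator, so this sketch does not by itself show that the sorted version is more random or faster; it only shows the mechanics.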
I don't know if it will be any faster but you could try numpy's shuffle.