Compare two lists in python and obtain non-equality - python

This piece of code in theory have to compare two lists which have the ID of a tweet, and in this comparison if it already exists in screen printing , otherwise not.
But I print all or not being listed.
Any suggestions to compare these two lists of ID's and if not the ID of the first list in the second then print it ?
Sorry for the little efficient code . ( and my English )
What I seek is actually not do RT ( retweet ) repeatedly when I already have . I use Tweepy library , I read the timeline , and make the tweet RT I did not do RT
def analizarRT():
timeline = []
temp = []
RT = []
fileRT = openFile('rt.txt')
for status in api.user_timeline('cnn', count='6'):
timeline.append(status)
for i in range(6):
temp.append(timeline[i].id)
for a in range(6):
for b in range(6):
if str(temp[a]) == fileRT[b]:
pass
else:
RT.append(temp[a])
for i in RT:
print i
Solved add this function !
def estaElemento(tweetId, arreglo):
encontrado = False
for a in range(len(arreglo)):
if str(tweetId) == arreglo[a].strip():
encontrado = True
break
return encontrado

Its a simple program, don't complicate it. As per your comments, there are two lists:)
1. timeline
2. fileRT
Now, you want to compare the id's in both these lists. Before you do that, you must know the nature of these two lists.
I mean, what is the type of data in the lists?
Is it
list of strings? or
list of objects? or
list of integers?
So, find out that, debug it, or use print statements in your code. Or please add these details in your question. So, you can give a perfect answer.
Mean while, try this:
if timeline.id == fileRT.id should work.
Edited:
def analizarRT():
timeline = []
fileRT = openFile('rt.txt')
for status in api.user_timeline('cnn', count='6'):
timeline.append(status)
for i in range(6):
for b in range(6):
if timeline[i].id == fileRT[b].id:
pass
else:
newlist.append(timeline[i].id)
print newlist
As per your question, you want to obtain them, right?. I have appended them in a newlist. Now you can say print newlist to see the items

your else statement is associated with the for statement, you probably need to add one more indent to make it work on the if statement.

Related

Using Try Except to iterate through a list in Python

I'm trying to iterate through a list of NFL QBs (over 100) and add create a list of links that I will use later.
The links follow a standard format, however if there are multiple players with the same name (such as 'Josh Allen') the link format needs to change.
I've been trying to do this with different nested while/for loops with Try/Except with little to no success. This is what I have so far:
test = ['Josh Allen', 'Lamar Jackson', 'Derek Carr']
empty_list=[]
name_int = 0
for names in test:
try:
q_b_name = names.split()
link1=q_b_name[1][0].capitalize()
link2=q_b_name[1][0:4].capitalize()+q_b_name[0][0:2].capitalize()+f'0{name_int}'
q_b = pd.read_html(f'https://www.pro-football-reference.com/players/{link1}/{link2}/gamelog/')
q_b1 = q_b[0]
#filter_status is a function that only works with QB data
df = filter_stats(q_b1)
#triggers the try if the link wasn't a QB
df.head(5)
empty_list.append(f'https://www.pro-football-reference.com/players/{link1}/{link2}/gamelog/')
except:
#adds one to the variable to change the link to find the proper QB link
name_int += 1
The result only appends the final correct link. I need to append each correct link to the empty list.
Still a beginner in Python and trying to challenge myself with different projects. Thanks!
As stated, the try/except will work in that it will try the code under the try block. If at any point within that block it fails or raises and exception/error, it goes and executes the block of code under the except.
There are better ways to go about this problem (for example, I'd use BeautifulSoup to simply check the html for the "QB" position), but since you are a beginner, I think trying to learn this process will help you understand the loops.
So what this code does:
1 It formats your player name into the link format.
2 We initialize a while loop that will it will enter
3 It gets the table.
4a) It enters a function that checks if the table contains 'passing'
stats by looking at the column headers.
4b) If it finds 'passing' in the column, it will return a True statement to indicate it is a "QB" type of table (keep in mind sometimes there might be runningbacks or other positions who have passing stats, but we'll ignore that). If it returns True, the while loop will stop and go to the next name in your test list
4c) If it returns False, it'll increment your name_int and check the next one
5 To take care of a case where it never finds a QB table, the while loop will go to False if it tries 10 iterations
Code:
import pandas as pd
def check_stats(q_b1):
for col in q_b1.columns:
if 'passing' in col.lower():
return True
return False
test = ['Josh Allen', 'Lamar Jackson', 'Derek Carr']
empty_list=[]
for names in test:
name_int = 0
q_b_name = names.split()
link1=q_b_name[1][0].capitalize()
qbStatsInTable = False
while qbStatsInTable == False:
link2=q_b_name[1][0:4].capitalize()+q_b_name[0][0:2].capitalize()+f'0{name_int}'
url = f'https://www.pro-football-reference.com/players/{link1}/{link2}/gamelog/'
try:
q_b = pd.read_html(url, header=0)
q_b1 = q_b[0]
except Exception as e:
print(e)
break
#Check if "passing" in the table columns
qbStatsInTable = check_stats(q_b1)
if qbStatsInTable == True:
print(f'{names} - Found QB Stats in {link1}/{link2}/gamelog/')
empty_list.append(f'https://www.pro-football-reference.com/players/{link1}/{link2}/gamelog/')
else:
name_int += 1
if name_int == 10:
print(f'Did not find a link for {names}')
qbStatsInTable = False
Output:
print(empty_list)
['https://www.pro-football-reference.com/players/A/AlleJo02/gamelog/', 'https://www.pro-football-reference.com/players/J/JackLa00/gamelog/', 'https://www.pro-football-reference.com/players/C/CarrDe02/gamelog/']

Python difference between sets not working

I am working on a web-scraping project. As the code runs, first is saved a list of around 100 products. The links are saved into one list called current_products.
Then after some time this is done again but saved into other list called new_products. Then I compare if they are the same (no new products listed on the website) or the new_products list is different from the current_items list and there was item added. I have the following code:
if (new_products == current_products):
print("No new items found. Trying again in 5 seconds...")
new_products = []
else:
print("New products found")
products_new = set(new_products) - set(current_products)
print(products_new)
As a testing data i put a duplicate link into the new_items. The code prints that the lists are different but then doesnt print any new product link (the difference between these lists) just empty set ({}). Any idea how to fix it?
EDIT: It doesnt work with DUPLICATE test data. When i put brand new link (that is not in either list) it works perfectly. Is there a way how to IGNORE duplicates?
Here is a trivial example of how this could be happening:
>>> a = [1, 2]
>>> b = [2, 1]
>>> a == b
False
>>> set(a) - set(b)
set()
To avoid this, either sort the lists before comparing them, or just compare sets to start with:
ds = set(a) - set(b)
if ds:
print('diffs')
else:
print('no diffs')
Making a set is more efficient for large data than sorting because it's an O(n) operation vs O(n log n) for sorting.
I think a more efficient code would be
products_new = set(new_products) - set(current_products)
if (len(products_new) == 0):
print("No new items found. Trying again in 5 seconds...")
new_products = []
else:
print("New products found")
print(products_new)

Python won't detect function loop if statement correctly

I have the following code
def numTest():
getNum = "https://sms-activate.ru/stubs/handler_api.php?api_key=" + api + "&action=getNumber&service=go&country=3"
numReq = requests.get(getNum)
smsNum = numReq.text
cleanNum = smsNum.split(":")
print(cleanNum)
reply = cleanNum[:6]
if reply == "ACCESS":
numID = cleanNum[1]
smsNo = cleanNum[2].replace("1", "", 1)
print(numID)
print(smsNo)
else:
numTest()
When the code is ran it doesn't detect the reply properly. So the API can either get back something such as ['ACCESS_NUMBER', '379609689', '12165419985'] or this ['NO_NUMBERS']
If it is the first one I need to split it and just keep array [1] and [2] and if it says No Numbers I need to run the loop again. What happens as well is if I get a number on the first try it stops and works correctly but if I get No numbers it trys again and if it gets a number it keeps going.
cleannum is a list, you're looking to find out if the first elements first 6 characters are ACCESS, not the first 6 elements of the list (which will never equal a string)
reply = cleanNum[0][:6]
Not sure if I understood your problem right. But if the array you mentioned is "cleanNum", then you should be using cleanNum[0][:6] here.

Deleting/Removing element from a list when comparing to another list Python

So I have a good one. I'm trying to build two lists (ku_coins and bin_coins) of crypto tickers from two different exchanges, but I don't want to double up, so if it appears on both exchanges I want to remove it from ku_coins.
A slight complication occurs as Kucoin symbols come in as AION-BTC, while Binance symbols come in as AIONBTC, but it's no problem.
So firstly, I create the two lists of symbols, which runs fine, no problem. What I then try and do is loop through the Kucoin symbols and convert them to the Binance style symbol, so AIONBTC instead of AION-BTC. Then if it appears in the Binance list I want to remove it from the Kucoin list. However, it appears to randomly refuse to remove a handful of symbols that match the requirement. For example AION.
It removes the majority of doubled up symbols but in AIONs case for example it just won't delete it.
If I just do print(i) after this loop:
for i in ku_coins:
if str(i[:-4] + 'BTC') in bin_coins:
It will happily print AION-BTC as one of the symbols, as it fits the requirement perfectly. However, when I stick the ku_coins.remove(i) command in before printing, it suddenly decideds not to print AION suggesting it doesn't match the requirements. And it's doing my head in. Obviously the remove command is causing the problem, but I can't for the life of me figure out why. Any help really appreciated.
import requests
import json
ku_dict = json.loads(requests.get('https://api.kucoin.com/api/v1/market/allTickers').text)
ku_syms = ku_dict['data']['ticker']
ku_coins = []
for x in range(0, len(ku_syms)):
if ku_syms[x]['symbol'][-3:] == 'BTC':
ku_coins.append(ku_syms[x]['symbol'])
bin_syms = json.loads(requests.get('https://www.binance.com/api/v3/ticker/bookTicker').text)
bin_coins = []
for i in bin_syms:
if i['symbol'][-3:] == 'BTC':
bin_coins.append(i['symbol'])
ku_coins.sort()
bin_coins.sort()
for i in ku_coins:
if str(i[:-4] + 'BTC') in bin_coins:
ku_coins.remove(i)
#top bantz, #Fourier has already mentioned that you shouldn't modify a list you're iterating over. What you can do in this case is to create a copy of ku_coins first then iterate over that, and then remove the element from the original ku_coins that matches your if condition. See below:
ku_coins.sort()
bin_coins.sort()
# Create a copy
ku_coins_ = ku_coins[:]
# Then iterate over that copy
for i in ku_coins_:
if str(i[:-4] + 'BTC') in bin_coins:
ku_coins.remove(i)
How about modifying the code to:
while ku_coins:
i = ku_coins.pop()
if str(i[:-4] + 'BTC') in bin_coins:
pass
else:
# do something
the pop() method removes i from the ku_coins list
pop()

Python: How to speed up creating of objects?

I'm creating objects derived from a rather large txt file. My code is working properly but takes a long time to run. This is because the elements I'm looking for in the first place are not ordered and not (necessarily) unique. For example I am looking for a digit-code that might be used twice in the file but could be in the first and the last row. My idea was to check how often a certain code is used...
counter=collections.Counter([l[3] for l in self.body])
...and then loop through the counter. Advance: if a code is only used once you don't have to iterate over the whole file. However You are stuck with a lot of iterations which makes the process really slow.
So my question really is: how can I improve my code? Another idea of course is to oder the data first. But that could take quite long as well.
The crucial part is this method:
def get_pc(self):
counter=collections.Counter([l[3] for l in self.body])
# This returns something like this {'187':'2', '199':'1',...}
pcode = []
#loop through entries of counter
for k,v in counter.iteritems():
i = 0
#find post code in body
for l in self.body:
if i == v:
break
# find fist appearence of key
if l[3] == k:
#first encounter...
if i == 0:
#...so create object
self.pc = CodeCana(k,l[2])
pcode.append(self.pc)
i += 1
# make attributes
self.pc.attr((l[0],l[1]),l[4])
if v <= 1:
break
return pcode
I hope the code explains the problem sufficiently. If not, let me know and I will expand the provided information.
You are looping over body way too many times. Collapse this into one loop, and track the CodeCana items in a dictionary instead:
def get_pc(self):
pcs = dict()
pcode = []
for l in self.body:
pc = pcs.get(l[3])
if pc is None:
pc = pcs[l[3]] = CodeCana(l[3], l[2])
pcode.append(pc)
pc.attr((l[0],l[1]),l[4])
return pcode
Counting all items first then trying to limit looping over body by that many times while still looping over all the different types of items defeats the purpose somewhat...
You may want to consider giving the various indices in l names. You can use tuple unpacking:
for foo, bar, baz, egg, ham in self.body:
pc = pcs.get(egg)
if pc is None:
pc = pcs[egg] = CodeCana(egg, baz)
pcode.append(pc)
pc.attr((foo, bar), ham)
but building body out of a namedtuple-based class would help in code documentation and debugging even more.

Categories

Resources