I'm making a bot that compares the last buy and sell orders received by fetching a cryptocurrency exchange and prints the difference.
My problem right now is that it prints the last order received over and over; I think it's because of the while loop. Is there a way to make it print only the last two without printing the same thing multiple times? I was thinking of using OrderedDict, but I don't know how to use it on JSON. Here is the code involved:
import time, requests, json

while True:
    BU = requests.session()
    URL = 'https://bittrex.com/api/v1.1/public/getmarkethistory?market=BTC-DOGE'
    r = BU.get(URL, timeout=(15, 10))
    time.sleep(1)
    MarketPairs = json.loads(r.content)
    for element in MarketPairs['result']:
        id = element['Id']
        price = element['Price']
        tot = element['Total']
        timestamp = element['TimeStamp']    # renamed: 'time' would shadow the time module
        order_type = element['OrderType']   # renamed: 'type' shadows the built-in

        if timestamp > '2017-12-11T21:37:01.103':
            print order_type, id, tot, price, timestamp
        time.sleep(1)
I guess this is what you want.
It will print only the last price, and only if it's different from the previous one:
import requests as req
import time

previous = None
while True:
    url = 'https://bittrex.com/api/v1.1/public/getmarkethistory?market=BTC-DOGE'
    response = req.get(url, timeout=(15, 10)).json()
    result = response["result"]
    last_price_dict = result[0]          # the most recent trade is first in the list
    id = last_price_dict["Id"]
    price = last_price_dict["Price"]
    total = last_price_dict["Total"]
    timestamp = last_price_dict["TimeStamp"]
    order_type = last_price_dict["OrderType"]
    this_one = (id, total, price, timestamp)
    if id != previous:                   # only print when a new trade has arrived
        print(this_one)
        previous = id
    time.sleep(3)
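If the goal is also to print the difference between the last two trades (as described in the question), the same idea extends naturally: keep only the previous price and compare it with the newest one. A minimal sketch along those lines, reusing the same endpoint and the Id/Price/OrderType fields from the code above:

import requests as req
import time

url = 'https://bittrex.com/api/v1.1/public/getmarkethistory?market=BTC-DOGE'
prev_id = None
prev_price = None

while True:
    latest = req.get(url, timeout=(15, 10)).json()["result"][0]   # newest trade
    if latest["Id"] != prev_id:                                   # only react to new trades
        if prev_price is not None:
            print(latest["OrderType"], latest["Price"],
                  "difference:", latest["Price"] - prev_price)
        prev_id, prev_price = latest["Id"], latest["Price"]
    time.sleep(3)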
This is my assignment:
You need to write a Python program that reads the current price of the XRP/USDT futures on the Binance exchange in real time (as fast as possible). If the price falls by 1% from the maximum price in the last hour, the program should print a message to the console and keep working, constantly reading the current price.
I learned how to receive data, but how can I go further?
import requests
import json
import pandas as pd
import datetime

base = 'https://testnet.binancefuture.com'
path = '/fapi/v1/klines'
url = base + path
param = {'symbol': 'XRPUSDT', 'interval': '1h', 'limit': 10}

r = requests.get(url, params=param)
if r.status_code == 200:
    data = pd.DataFrame(r.json())
    print(data)
else:
    print('Error')
You can try this. I've defined a function for the price check, and the rest is the main loop:

import time

def price_check(df):
    # klines rows are lists; column 4 is the close price (returned as strings), so convert first
    prices = df[4].astype(float)
    max_value = prices.max()            # max close within the window
    min_value = prices.min()            # min close within the window
    if min_value / max_value < 0.99:    # 1% threshold
        print("Alert")

while True:                             # adjust the check frequency with time.sleep()
    response = requests.get(url, params=param)
    if response.status_code == 200:
        data = pd.DataFrame(response.json())
        price_check(data)
    time.sleep(1)
import requests
import time

def get_price():
    url = "https://api.binance.com/api/v3/ticker/price?symbol=XRPUSDT"
    response = requests.get(url)
    return float(response.json()["price"])

def check_price_drop(price, highest_price):
    if price / highest_price < 0.99:    # current price is more than 1% below the high
        print("Price dropped by 1%!")

highest_price = 0
while True:
    price = get_price()
    if price > highest_price:           # remember the highest price seen so far
        highest_price = price
    check_price_drop(price, highest_price)
    time.sleep(10)
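Note that the loop above compares against an all-time high rather than the maximum of the last hour. One way to track a rolling one-hour maximum is to keep timestamped samples in a deque and drop anything older than an hour; a rough sketch reusing the get_price() helper defined above:

import time
from collections import deque

window = deque()        # (timestamp, price) samples from the last hour
ONE_HOUR = 60 * 60

while True:
    now = time.time()
    price = get_price()                     # helper defined above
    window.append((now, price))
    while window and window[0][0] < now - ONE_HOUR:
        window.popleft()                    # discard samples older than one hour
    hour_max = max(p for _, p in window)
    if price / hour_max < 0.99:             # fell 1% from the hourly maximum
        print("Price dropped by 1% from the last hour's maximum!")
    time.sleep(1)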
Basically, if I loop over a datetime and perform a scan with a per-day date range, like:
table_hook = dynamodb_resource.Table('table1')
date_filter = Key('date_column').between('2021-01-01T00:00:00+00:00', '2021-01-01T23:59:59+00:00')

response = table_hook.scan(FilterExpression=date_filter)
incoming_data = response['Items']

if response['Count'] == 0:
    return

_counter = 1
while 'LastEvaluatedKey' in response:
    response = table_hook.scan(ExclusiveStartKey=response['LastEvaluatedKey'])
    if (
        parser.parse(response['Items'][0]['date_column']).replace(tzinfo=None) < parser.parse('2021-01-01T00:00:00+00:00').replace(tzinfo=None)
        or
        parser.parse(response['Items'][0]['date_column']).replace(tzinfo=None) > parser.parse('2021-06-07T23:59:59+00:00').replace(tzinfo=None)
    ):
        break
    incoming_data.extend(response['Items'])
    _counter += 1
    print("|-> Getting page %s" % _counter)
At the end of the Day1-to-Day2 loop it returns X rows.
But if I perform the same paginated scan, with the same range (Day1 to Day2), without the loop, it returns Y rows.
And to make it worse, when I call table.describe_table(TableName='table1'), the row_count field reports Z rows. I literally don't understand what is going on!
Based on the help above, I found my error: basically I was not passing the filter again when paginating, so the fixed code is:
table_hook = dynamodb_resource.Table('table1')
date_filter = Key('date_column').between('2021-01-01T00:00:00+00:00', '2021-01-01T23:59:59+00:00')

response = table_hook.scan(FilterExpression=date_filter)
incoming_data = response['Items']

_counter = 1
while 'LastEvaluatedKey' in response:
    response = table_hook.scan(FilterExpression=date_filter,
                               ExclusiveStartKey=response['LastEvaluatedKey'])
    incoming_data.extend(response['Items'])
    _counter += 1
    print("|-> Getting page %s" % _counter)
I am a student working on a scraping project, and I am having trouble completing my script because it fills my computer's memory with all of the data it stores.
It currently keeps all of the data until the end, so my solution would be to break the scrape up into smaller chunks and write the data out periodically, instead of building one big list and writing it all out at the end.
In order to do this, I would need to stop my scroll method, scrape the loaded profiles, write out the data I have collected, and then repeat this process without duplicating my data. It would be appreciated if someone could show me how to do this. Thank you for your help :)
Here's my current code:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from time import sleep
from selenium.common.exceptions import NoSuchElementException

Data = []

driver = webdriver.Chrome()
driver.get("https://directory.bcsp.org/")
count = int(input("Number of Pages to Scrape: "))
body = driver.find_element_by_xpath("//body")
profile_count = driver.find_elements_by_xpath("//div[@align='right']/a")

while len(profile_count) < count:  # Scroll until "count" links are loaded
    body.send_keys(Keys.END)
    sleep(1)
    profile_count = driver.find_elements_by_xpath("//div[@align='right']/a")

for link in profile_count:  # Visit each profile link
    temp = link.get_attribute('href')
    driver.execute_script("window.open('');")          # open new tab
    driver.switch_to.window(driver.window_handles[1])  # focus new tab
    driver.get(temp)
    # scrape code
    Name = driver.find_element_by_xpath('/html/body/table/tbody/tr/td/table/tbody/tr/td[5]/div/table[1]/tbody/tr/td[1]/div[2]/div').text
    IssuedBy = "Board of Certified Safety Professionals"
    CertificationorDesignaationNumber = driver.find_element_by_xpath('/html/body/table/tbody/tr/td/table/tbody/tr/td[5]/div/table[1]/tbody/tr/td[3]/table/tbody/tr[1]/td[3]/div[2]').text
    CertfiedorDesignatedSince = driver.find_element_by_xpath('/html/body/table/tbody/tr/td/table/tbody/tr/td[5]/div/table[1]/tbody/tr/td[3]/table/tbody/tr[3]/td[1]/div[2]').text
    try:
        AccreditedBy = driver.find_element_by_xpath('/html/body/table/tbody/tr/td/table/tbody/tr/td[5]/div/table[1]/tbody/tr/td[3]/table/tbody/tr[5]/td[3]/div[2]/a').text
    except NoSuchElementException:
        AccreditedBy = "N/A"
    try:
        Expires = driver.find_element_by_xpath('/html/body/table/tbody/tr/td/table/tbody/tr/td[5]/div/table[1]/tbody/tr/td[3]/table/tbody/tr[5]/td[1]/div[2]').text
    except NoSuchElementException:
        Expires = "N/A"
    info = Name, IssuedBy, CertificationorDesignaationNumber, CertfiedorDesignatedSince, AccreditedBy, Expires + "\n"
    Data.extend(info)
    driver.close()                                      # close the profile tab
    driver.switch_to.window(driver.window_handles[0])   # back to the results tab

with open("Spredsheet.txt", "w") as output:
    output.write(','.join(Data))
driver.close()
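Before switching libraries, here is a minimal sketch of the periodic-write idea applied to a Selenium loop like the one above: buffer the scraped rows and append them to the file every few profiles so nothing large stays in memory. The batch size and file name are arbitrary choices, and the scraping of each profile is assumed to happen exactly as in the script above:

import csv

BATCH_SIZE = 25            # flush after this many profiles (arbitrary choice)
buffer = []

def flush(rows, path="Spreadsheet.csv"):
    """Append the buffered rows to the CSV file and clear them from memory."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerows(rows)
    rows.clear()

for link in profile_count:             # same list of links as in the script above
    # ... open the profile and scrape Name, IssuedBy, etc. exactly as above ...
    row = (Name, IssuedBy, CertificationorDesignaationNumber,
           CertfiedorDesignatedSince, AccreditedBy, Expires)
    buffer.append(row)
    if len(buffer) >= BATCH_SIZE:
        flush(buffer)                  # periodic write keeps memory use flat

flush(buffer)                          # write whatever is left at the end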
Try the below approach using requests and BeautifulSoup. In the script I have used the API URL fetched from the website itself (for example: API URL).
First it creates the URL (refer to the first URL) for the first iteration, then adds the headers and data to the .csv file.
On the second iteration it creates the URL again (refer to the second URL) with 2 extra params, start_on_page and show_per_page, where start_on_page starts at 20 and is incremented by 20 on each iteration, and show_per_page is set to 100 to extract 100 records per iteration, and so on until all the data has been dumped into the .csv file (second iteration API URL).
The script dumps 4 things: number, name, location and profile URL.
On each iteration the data is appended to the .csv file, so this approach resolves your memory issue.
Do not forget to set file_path to the directory where you want the .csv file created before running the script.
import requests
from urllib3.exceptions import InsecureRequestWarning
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)
from bs4 import BeautifulSoup as bs
import csv

def scrap_directory_data():
    list_of_credentials = []
    file_path = ''
    file_name = 'credential_list.csv'
    count = 0
    page_number = 0
    page_size = 100
    create_url = ''
    main_url = 'https://directory.bcsp.org/search_results.php?'
    first_iteration_url = 'first_name=&last_name=&city=&state=&country=&certification=&unauthorized=0&retired=0&specialties=&industries='
    number_of_records = 0
    csv_headers = ['#', 'Name', 'Location', 'Profile URL']
    while True:
        if count == 0:
            create_url = main_url + first_iteration_url
            print('-' * 100)
            print('1st iteration URL created: ' + create_url)
            print('-' * 100)
        else:
            create_url = main_url + 'start_on_page=' + str(page_number) + '&show_per_page=' + str(page_size) + '&' + first_iteration_url
            print('-' * 100)
            print('Other than first iteration URL created: ' + create_url)
            print('-' * 100)
        page = requests.get(create_url, verify=False)
        extracted_text = bs(page.text, 'lxml')
        result = extracted_text.find_all('tr')
        if len(result) > 0:
            for idx, data in enumerate(result):
                if idx > 0:
                    number_of_records += 1
                    name = data.contents[1].text
                    location = data.contents[3].text
                    profile_url = data.contents[5].contents[0].attrs['href']
                    list_of_credentials.append({
                        '#': number_of_records,
                        'Name': name,
                        'Location': location,
                        'Profile URL': profile_url
                    })
                print(data)
                with open(file_path + file_name, 'a+') as cred_CSV:
                    csvwriter = csv.DictWriter(cred_CSV, delimiter=',', lineterminator='\n', fieldnames=csv_headers)
                    if idx == 0 and count == 0:
                        print('Writing CSV header now...')
                        csvwriter.writeheader()
                    else:
                        for item in list_of_credentials:
                            print('Writing data rows now..')
                            print(item)
                            csvwriter.writerow(item)
                        list_of_credentials = []
        count += 1
        page_number += 20

scrap_directory_data()
I'm using the Medium API to get some information, but after some API calls the Python script ends with this error:
IndexError: list index out of range
Here is my Python code:
def get_post_responses(posts):
    #start = time.time()
    count = 0
    print('Retrieving the post responses...')
    responses = []
    for post in posts:
        url = MEDIUM + '/_/api/posts/' + post + '/responses'
        count = count + 1
        print("number of times api called", count)
        response = requests.get(url)
        response_dict = clean_json_response(response)
        responses += response_dict['payload']['value']
        #end = time.time()
        #four = end - start
        #global time_cal
        #time_cal.append(four)
    return responses

def check_if_high_recommends(response, recommend_min):
    if response['virtuals']['recommends'] >= recommend_min:
        return True

def check_if_recent(response):
    limit_date = datetime.now() - timedelta(days=360)
    creation_epoch_time = response['createdAt'] / 1000
    creation_date = datetime.fromtimestamp(creation_epoch_time)
    if creation_date >= limit_date:
        return True
It needs to work for more than 10,000 followers for a user.
I got an answer to my question:
I just needed to use a try/except block around the part that fails.
try:
    response_dict = clean_json_response(response)
    responses += response_dict['payload']['value']
except (KeyError, IndexError):
    continue   # skip this post if the payload is missing or malformed
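Swallowing the exception keeps the loop alive, but if the IndexError only shows up after many calls, it may be the endpoint throttling the requests or returning an unexpected body (an assumption, not something the traceback proves). In that case, checking the status code and backing off before retrying is another option; a rough sketch:

import time
import requests

def get_with_backoff(url, retries=5, wait=10):
    """Retry a GET request with a simple fixed backoff when the response is not 200 OK."""
    for attempt in range(retries):
        response = requests.get(url)
        if response.status_code == 200:
            return response
        print("Got status", response.status_code, "- waiting", wait, "seconds before retrying")
        time.sleep(wait)
    return None   # caller decides what to do after repeated failures

Then response = get_with_backoff(url) could replace the plain requests.get(url) call inside get_post_responses, skipping the post when None comes back.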
I have a stream from an API that constantly updates the price. The goal is to compare the last two prices, and if x > y then do something. I can get the prices into an array; however, the array grows very large very quickly. How can I limit the number of elements to 2 and then compare them?
My code:
def stream_to_queue(self):
    response = self.connect_to_stream()
    if response.status_code != 200:
        return
    prices = []
    for line in response.iter_lines(1):
        if line:
            try:
                msg = json.loads(line)
            except Exception as e:
                print "Caught exception when converting message into json\n" + str(e)
                return
            if msg.has_key("instrument") or msg.has_key("tick"):
                price = msg["tick"]["ask"]
                prices.append(price)
                print prices
Thanks in advance for the help!
You could use a deque with maxlen set to 2:
from collections import deque
deq = deque(maxlen=2)
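A short usage sketch: once the deque holds two prices, each new append silently drops the oldest one, so the last two values are always at index 0 and 1:

from collections import deque

prices = deque(maxlen=2)

for price in [1.1010, 1.1012, 1.1009]:   # stand-in for streamed ask prices
    prices.append(price)
    if len(prices) == 2 and prices[1] > prices[0]:
        print("price went up:", prices[0], "->", prices[1])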
You could also manually check the size and rearrange:
if len(arr) == 2:
    arr[0], arr[1] = arr[1], new_value
if msg.has_key("instrument") or msg.has_key("tick"):
price = msg["tick"]["ask"]
last_price = None
if prices:
last_price = prices[-1]
prices = [last_price]
if last_price > price:
#do stuff
prices.append(price)