I need to use a file as a Queue but I don't know how to start (any other approach is also welcome). I have a non-secure transmission between my device and a computer, and I need all the data to be saved until it is sent and successfully received. The DATA is a list which always holds the same type and amount of elements. I imagine something like this as the file structure:
FILE
DATA 0 <- send_pointer
DATA 1
DATA 2
DATA 3
<- new_item
So the code will look like:
while True:
    DATA = data_gather()
    FILE.write(DATA, new_item)
    new_item += 1
    x = FILE.read(send_pointer)
    if send_function(x):
        FILE.delete(send_pointer)
        send_pointer += 1
    else:
        print('error sending x')
I hope you understand my issue; my English is not the best.
EDIT
I installed this module: https://pypi.python.org/pypi/pqueue/0.1.1
But I don't know how to use it well. I can't find a way to delete the data I have already read from the file.
Thanks!
EDIT 2
Solved with pqueue.
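For completeness, a minimal sketch of how pqueue could replace the hand-rolled file queue above, assuming its Queue offers the stdlib-style put/get/task_done interface (task_done being what finally removes an item from the on-disk queue); the directory names are placeholders, and data_gather/send_function are the question's own functions:

from pqueue import Queue

# Both directories are placeholders; pqueue keeps the queued items on disk there.
q = Queue("./queue_data", tempdir="./queue_tmp")

while True:
    DATA = data_gather()          # the question's own data source
    q.put(DATA)                   # persisted to disk immediately
    x = q.get()                   # oldest item not yet confirmed as sent
    while not send_function(x):   # the question's own send routine; retry until it succeeds
        print('error sending x')
    q.task_done()                 # only now is the item dropped from the on-disk queue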
#!/usr/bin/python
import time

offset = 0
while True:
    infile = open("./log.txt")
    infile.seek(offset)
    for line in infile:
        print(line)  # do something
    offset = infile.tell()
    infile.close()
    time.sleep(10)
Only the lines appended to log.txt since the previous pass are printed with this method, because the file offset is remembered between iterations.
I am new to Python. Can anyone help with how to generate an auto-incrementing name like B00001, B00002, B00003, ... so that a button click automatically saves the Excel file under that name in a specific folder?
I have tried with
global numXlsx
numXlsx = 1
wb.save(f'invoice/B{numXlsx}.xlsx')
numXlsx += 1
But when I click the button a few times with different data, it still keeps overwriting the B1.xlsx file. Can anyone help with this? :)
It sounds like the biggest problem you're having is that each button click is re-starting the execution of your python script, so using a global variable won't work since that doesn't persist across executions. In this case, I'd suggest using something like the pickle module to store and reload your counter value each time you execute your script. Using that module, your solution could look something like this:
import pickle
from pathlib import Path

# creates file if it doesn't exist
myfile = Path("save.p")
myfile.touch(exist_ok=True)

persisted = {}
with open(myfile, "rb") as f:
    try:
        persisted = pickle.load(f)
    except EOFError:
        print("file was empty, nothing to load")

# use get() to avoid KeyError if key doesn't exist
if persisted.get('counter') is None:
    persisted['counter'] = 1

wb.save(f"invoice/B{persisted.get('counter')}.xlsx")
persisted['counter'] += 1

# save everything back into the same file to be used next execution
with open(myfile, "wb") as f:
    pickle.dump(persisted, f)
BONUS: If you want the count to be padded with zeros in the filename, use persisted.get('counter'):05d in the curly brackets when saving the file. The 5 indicates you want the resulting value to be at least 5 characters long, so for example 2 would become 00002 and 111 would become 00111.
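For example, combined with the code above, the save line would become (same persisted dict as before):

wb.save(f"invoice/B{persisted.get('counter'):05d}.xlsx")  # counter 2 -> B00002.xlsx, 111 -> B00111.xlsx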
You can try using a global variable and incrementing it every time.
Try something like this (initialize it to 0 first):
global numXlsx  # this is your counter variable
wb.save(f'folder/B{numXlsx}.xlsx')
numXlsx += 1  # increment the variable so it does not overwrite the file as your code is doing
Have a nice day!
I'm new to the platform, this is my first message and I need your help.
I'm working on a school project where I have to analyze data. I chose to analyze the Binance stream, especially the trades. I had no problem using their websocket; I receive lines of data.
The problem is that I get a lot of lines: in a single second I can receive 10, 15 lines or many more.
I would like to process only 1 line per second, for example.
I tried putting a time.sleep(1) in, but it doesn't work. It just "pauses" the stream and then resumes at the line where it stopped. I want to skip some lines entirely, which is why I would like to keep only 1 line per second.
I use this library
https://python-binance.readthedocs.io/en/latest/websockets.html
def handle_message(msg):
    if msg['e'] == 'error':
        print(msg['m'])
    else:
        bitcoins_exchanged = float(msg['p']) * float(msg['q'])
        timestamp = msg['T'] / 1000
        timestamp = datetime.fromtimestamp(timestamp).strftime('%Y-%m-%d %H:%M:%S')
        print("{} - {} - Price: {}".format(timestamp, msg['s'], msg['p']))

conn_key = bm.start_trade_socket(BTCUSDT, handle_message)
bm.start()
Thanks for your help. For information, I use Python 2.7.
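One way to get the "1 line per second" behaviour described above, sketched under the assumption that the callback may simply ignore messages: keep the timestamp of the last processed trade and return early for anything arriving less than a second later. last_processed and the one-second threshold are introduced here only for illustration, and the callback body is adapted from the question's code:

import time
from datetime import datetime

last_processed = [0.0]  # stored in a list so the callback can update it (works on Python 2.7)

def handle_message(msg):
    now = time.time()
    if now - last_processed[0] < 1.0:
        return  # skip this trade: less than a second since the last one we kept
    last_processed[0] = now
    if msg['e'] == 'error':
        print(msg['m'])
    else:
        timestamp = datetime.fromtimestamp(msg['T'] / 1000).strftime('%Y-%m-%d %H:%M:%S')
        print("{} - {} - Price: {}".format(timestamp, msg['s'], msg['p']))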
I'm a newbie in Python, and this is my first time working with a REST API in Python. First let me explain what I want to do. I have a CSV file with the name of a product and some other details; these entries are missing data after a migration. My job now is to check in the downstream Application 1 whether it contains each product or whether it is missing there too; if it is missing there as well, I have to keep digging further back.
So now I have the API of Application 1 (this returns the product name and details if the product exists) and an API for OAuth 2. That one creates a token, and I use the token to access the Application 1 API (it looks like this: https://Applciationname/rest/< productname >); I get < productname > from a list built from the first column of the CSV file. Everything is working fine, but my list has 3000 entries and it takes almost 2 hours to complete.
Is there a faster way to check this? BTW, I'm calling the token API only once. This is how my code looks:
list = []
# reading the csv and appending to list (using with open and csv.reader here)

get_token = requests.get(tokenurl, OAuthdetails)  # similar type of code
token_dict = json.loads(get_token.content.decode())
token = token_dict['access_token']

headers = {
    'Authorization': 'Bearer ' + str(token)
}
url = 'https://Applciationname/rest/'

for element in list:
    full_url = url + element
    api_response = requests.get(full_url, headers=headers)
    received_data = json.loads(api_response.content.decode())
    if api_response.status_code == 200 and len(received_data) != 0:
        pass  # write the element value to the "successcall" text file (using with open here)
    else:
        pass  # write the element value to the "failurecall" text file (using with open here)
Now could you please help me optimize this, so that I can find the product names which are not in APP 1 faster?
You could use threading for your for loop, like so:
import threading

lock = threading.RLock()
thread_list = []

def check_api(full_url):
    api_response = requests.get(full_url, headers=headers)
    received_data = json.loads(api_response.content.decode())
    if api_response.status_code == 200 and len(received_data) != 0:
        # don't forget to take the lock when writing to the file
        with lock:
            with open("successcall.txt", "a") as f:
                f.write(str(received_data) + "\n")
    else:
        # again, don't forget "with lock" like the one above
        pass  # write the element value to the "failurecall" text file (using with open here)

for element in list:
    full_url = url + element
    t = threading.Thread(target=check_api, args=(full_url,))
    thread_list.append(t)

# start all threads
for thread in thread_list:
    thread.start()

# wait for them all to finish
for thread in thread_list:
    thread.join()
You should also avoid writing to the same file from multiple threads at once, since it can corrupt the output, unless you guard the writes with a lock as shown above.
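As a side note on the design, not part of the original answer: launching one thread per product (3000 at once) can overwhelm both the client and the API. A bounded worker pool gives the same I/O overlap with a fixed number of threads; this sketch reuses the question's url, headers and product list (still named list), and the 20 workers are an arbitrary example value:

import json
from concurrent.futures import ThreadPoolExecutor

import requests

def check_product(element):
    # Returns (element, found) so all file writing can happen in one thread afterwards.
    api_response = requests.get(url + element, headers=headers)
    received_data = json.loads(api_response.content.decode())
    return element, api_response.status_code == 200 and len(received_data) != 0

# url, headers and the product list (named "list") come from the question's code.
with ThreadPoolExecutor(max_workers=20) as pool, \
        open("successcall.txt", "a") as ok, open("failurecall.txt", "a") as bad:
    for element, found in pool.map(check_product, list):
        (ok if found else bad).write(element + "\n")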
I apologize if this is a very beginner-ish question. But I have a multivariate data set from reddit ( https://files.pushshift.io/reddit/submissions/), but the files are way too big. Is it possible to downsample one of these files down to 20% or less, and either save it as a new file (json or csv) or directly read it as a pandas dataframe? Any help will be very appreciated!
Here is my attempt thus far
def load_json_df(filename, num_bytes=-1):
    '''Load the first `num_bytes` of the filename as a json blob, convert each line into a row in a Pandas data frame.'''
    fs = open(filename, encoding='utf-8')
    df = pd.DataFrame([json.loads(x) for x in fs.readlines(num_bytes)])
    fs.close()
    return df

january_df = load_json_df('RS_2019-01.json')
january_df.sample(frac=0.2)
However this gave me a memory error while trying to open it. Is there a way to downsample it without having to open the entire file?
The problem is that it is not possible to determine exactly what 20% of the data is without reading the entire file first; only then do you know how many lines make up 20%.
Reading a large file into memory all at once generally throws this kind of memory error. You can avoid it by reading the file line by line with the code below:
data = []
counter = 0
with open('file') as f:
    for line in f:
        data.append(json.loads(line))
        counter += 1
You should then be able to do this
df = pd.DataFrame(data)  # or slice first, e.g. data[:counter // 5], if you only want 20%
I downloaded the first of the files, i.e. https://files.pushshift.io/reddit/submissions/RS_2011-01.bz2,
decompressed it and looked at the contents. As it happens, it is not a single JSON document but rather JSON Lines: a series of JSON objects, one per line (see http://jsonlines.org/ ). This means you can simply cut out as many lines as you want, using any tool you like (for example, a text editor). Or you can process the file sequentially in your Python script, taking only every fifth line into account, like this:
with open('RS_2019-01.json', 'r') as infile:
    for i, line in enumerate(infile):
        if i % 5 == 0:
            j = json.loads(line)
            # process the data here
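If taking exactly every fifth line could bias the sample, a random ~20% can be drawn in the same streaming fashion; this variation is mine, not part of the answer above, and keeps the same file name:

import json
import random

random.seed(0)          # optional: makes the 20% sample reproducible
sampled = []
with open('RS_2019-01.json', 'r') as infile:
    for line in infile:
        if random.random() < 0.2:   # keep roughly one line in five
            sampled.append(json.loads(line))
# sampled can then be fed straight into pd.DataFrame(sampled)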
I am very new to programming and trying to learn by doing: creating a text adventure game while reading Python documentation and blogs.
My issue is that I'm attempting to save/load data in a text game to create elements which carry over from game to game and are passed as arguments. Specifically, my goal is to recall, update and save an iteration counter that increments each time the game is played past the intro. My intention here is to load the saved march_iteration number, display it to the user as a default name suggestion, then increment it and save the updated march_iteration number.
From my attempts at debugging, I seem to be updating the value and saving the updated value of 2 to the game.sav file correctly, so I believe I'm either failing to load the data properly or somehow overwriting the loaded value with the static one. I've read as much documentation as I can find, but from the articles I've read on saving and loading to JSON I cannot identify where my code is wrong.
Below is a small code snippet I wrote just to try and get the save/load working. Any insight would be greatly appreciated.
import json

def _save(dummy):
    f = open("game.sav", 'w+')
    json.dump(world_states, f)
    f.close

def _continue(dummy):
    f = open("game.sav", 'r+')
    world_states = json.load(f)
    f.close

world_states = {
    "march_iteration": 1
}

def _resume():
    _continue("")

_resume()
print("world_states['march_iteration']", world_states['march_iteration'])
current_iteration = world_states["march_iteration"]

def name_the_march(curent_iteration=world_states["march_iteration"]):
    march_name = input("\nWhat is the name of your march? We suggest TrinMar#{}. >".format(current_iteration))
    if len(march_name) == 0:
        print("\nThe undifferentiated units shift nervously, unnerved and confused, perhaps even angry.")
        print("\nPlease give us a proper name, executor. The march must not be nameless, that would be chaos.")
        name_the_march()
    else:
        print("\nThank you Executor. The {} march begins its long journey.".format(march_name))
        world_states['march_iteration'] = world_states['march_iteration'] + 1
        print("world_states['march_iteration']", world_states['march_iteration'])
        # Line above used only for debugging purposes

name_the_march()
name_the_march()
I seem to have found a solution which works for my purposes, allowing me to load, update and re-save. It isn't the most efficient, but it works; the prints are just there to show the number being properly loaded and updated before being re-saved.
Pre-requisite: this example assumes you've already created the file for it to open.
import json

# Initial data
iteration = 1

# Restore previously saved data from a file
with open('filelocation/filename.json') as f:
    iteration = json.load(f)

print(iteration)
iteration = iteration + 1
print(iteration)

# Save the updated data
f = open("filename.json", 'w')
json.dump(iteration, f)
f.close()
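A possible refinement, not part of the solution above: the save step can use the same with-statement style as the load step (so the file is closed even if json.dump raises), and an existence check removes the pre-requisite of creating the file by hand on the very first run. SAVE_FILE is just an illustrative name:

import json
import os

SAVE_FILE = "filename.json"   # illustrative name

# Load the counter if a save file exists, otherwise start fresh
if os.path.exists(SAVE_FILE):
    with open(SAVE_FILE) as f:
        iteration = json.load(f)
else:
    iteration = 1

iteration += 1

# Save the updated counter; the with-statement closes the file automatically
with open(SAVE_FILE, 'w') as f:
    json.dump(iteration, f)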