I am using the Python 3 Splunk API to export some massive logs.
My code essentially follows the splunk API guidelines:
import splunklib.client as client
import splunklib.results as results
import pandas as pd
kwargs_export = {"earliest_time": "2019-08-19T12:00:00.000-00:00",
"latest_time": "2019-08-19T14:00:00.000-00:00",
"search_mode": "normal"}
exportsearch_results = service.jobs.export(mysearchquery, **kwargs_export)
reader = results.ResultsReader(exportsearch_results)
df = pd.DataFrame(list(reader))
But this is extremely slow...
Ultimately I want to store the output of the search as a csv to disk. Is there any way to speed the export?
Thanks!
Check this out; it works:
kwargs_export = {"earliest_time": "-1d",
                 "latest_time": "now",
                 "search_mode": "normal"}
service = client.connect(**args)
job = service.jobs.create(query, **kwargs_export)
with open(filename, 'wb') as out_f:
    try:
        job_results = job.results(output_mode="csv", count=0)
        for result in job_results:
            out_f.write(result)
    except Exception:
        print("Session timed out. Reauthenticating")
I am looking to use a pipeline to insert data into a Redis time series, but I cannot find a way to call ts.add via a pipeline.
I can do a basic example with get/set:
import redis
import json
redis_client = redis.Redis(host='xxx.xxx.xxx.xxx', port='xxxxx', password='xxxx')
pipe = redis_client.pipeline()
pipe.set(1,'apple')
pipe.set(2,'orange')
pipe.execute()
I can't find a way to insert into a time series:
import redis
import json
redis_client = redis.Redis(host='xxx.xxx.xxx.xxx', port='xxxxx', password='xxxx')
pipe = redis_client.pipeline()
pipe.ts.add('TS1', 1652683016, 55) # <----- this is what I want to do!
pipe.ts.add('TS1', 1652683017, 59) # <----- this is what I want to do!
pipe.execute()
As of this writing (redis-py 4.3.1) there exists another pipeline object on the timeseries class itself. The following will work:
import redis
r = redis.Redis()
pipe = r.ts().pipeline()
pipe.add("TS1", 1, 123123123123)
pipe.add("TS1", 2, 123123123451)
...
pipe.add("TS1", 15, 123123126957)
pipe.execute()
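If batching inserts is the only goal, note that redis-py also exposes TS.MADD, which writes many samples in one command without a pipeline at all. A minimal sketch, reusing the TS1 key and the timestamps from your question:
import redis

r = redis.Redis()
# Each tuple is (key, timestamp, value); all samples go in one round trip.
r.ts().madd([("TS1", 1652683016, 55), ("TS1", 1652683017, 59)])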
I have the following Python code for a Flask server. I am trying to have this part of the code list all my vehicles that match the horsepower that I put in through my browser. I want it to return all the car names that match the horsepower, but what I have doesn't seem to be working: it returns nothing. I know the issue is somewhere in the for statement, but I don't know how to fix it.
This is my first time doing something like this and I've been trying multiple things for hours. I can't figure it out. Could you please help?
from flask import Flask
from flask import request
import os, json

app = Flask(__name__, static_folder='flask')

@app.route('/HORSEPOWER')
def horsepower():
    horsepower = request.args.get('horsepower')
    message = "<h3>HORSEPOWER "+str(horsepower)+"</h3>"
    path = os.getcwd() + "/data/vehicles.json"
    with open(path) as f:
        data = json.load(f)
        for record in data:
            horsepower = int(record["Horsepower"])
            if horsepower == record:
                car = record["Car"]
    return message
The following example should meet your expectations.
from flask import Flask
from flask import request
import os, json

app = Flask(__name__)

@app.route('/horsepower')
def horsepower():
    # The URL parameter is automatically converted to an integer.
    horsepower = request.args.get('horsepower', type=int)
    # Read the file, which is located in the data folder relative to
    # the application root directory.
    path = os.path.join(app.root_path, 'data', 'vehicles.json')
    with open(path) as f:
        data = json.load(f)
    # Build a list of the names of the cars whose horsepower matches
    # the parameter passed.
    cars = [record['Car'] for record in data if horsepower == int(record["Horsepower"])]
    # The result is then output separated by commas.
    return f'''
        <h3>HORSEPOWER {horsepower}</h3>
        <p>{','.join(cars)}</p>
    '''
There are many ways to write the loop; I used a compact list comprehension in the example. Written out in full, it looks like this:
cars = []
for record in data:
    if horsepower == int(record['Horsepower']):
        cars.append(record['Car'])
One tip: pay attention to when you overwrite the value of a variable by reusing its name, as your original code does with horsepower inside the loop.
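To check the route without a browser, you can use Flask's built-in test client; the horsepower value 130 below is just a hypothetical example:
# Assumes the app defined above and a data/vehicles.json file exist.
with app.test_client() as client:
    response = client.get('/horsepower?horsepower=130')
    print(response.data.decode())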
I added a line to the Python code speedtest.py that I found at pimylifeup.com. I hoped it would let me track the internet provider and IP address along with all the other speed information the code provides. But when I execute it, the findall() call only grabs the first word after "from", i.e. the provider name. I would also like it to return the IP address that appears after the provider. I have attached the code below. Can you help me modify it to return what I am looking for?
Here is an example of what is returned by speedtest-cli:
$ speedtest-cli
Retrieving speedtest.net configuration...
Testing from Biglobe (111.111.111.111)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by GLBB Japan (Naha) [51.24 km]: 118.566 ms
Testing download speed................................................................................
Download: 4.00 Mbit/s
Testing upload speed......................................................................................................
Upload: 13.19 Mbit/s
$
And this is an example of what is returned by speedtest.py to my .csv file:
Date,Time,Ping,Download (Mbit/s),Upload(Mbit/s),myip
05/30/20,12:47,76.391,12.28,19.43,Biglobe
This is what I want it to return.
Date,Time,Ping,Download (Mbit/s),Upload (Mbit/s),myip
05/30/20,12:31,75.158,14.29,19.54,Biglobe 111.111.111.111
Or maybe:
05/30/20,12:31,75.158,14.29,19.54,Biglobe,111.111.111.111
Here is the code that I am using. And thank you for any help you can provide.
import os
import re
import subprocess
import time

response = subprocess.Popen('/usr/local/bin/speedtest-cli', shell=True, stdout=subprocess.PIPE).stdout.read().decode('utf-8')

ping = re.findall('km]:\s(.*?)\s', response, re.MULTILINE)
download = re.findall('Download:\s(.*?)\s', response, re.MULTILINE)
upload = re.findall('Upload:\s(.*?)\s', response, re.MULTILINE)
myip = re.findall('from\s(.*?)\s', response, re.MULTILINE)

ping = ping[0].replace(',', '.')
download = download[0].replace(',', '.')
upload = upload[0].replace(',', '.')
myip = myip[0]

try:
    f = open('/home/pi/speedtest/speedtestz.csv', 'a+')
    if os.stat('/home/pi/speedtest/speedtestz.csv').st_size == 0:
        f.write('Date,Time,Ping,Download (Mbit/s),Upload (Mbit/s),myip\r\n')
except:
    pass

f.write('{},{},{},{},{},{}\r\n'.format(time.strftime('%m/%d/%y'), time.strftime('%H:%M'), ping, download, upload, myip))
Let me know if this works for you; it should do everything you're looking for.
#!/usr/bin/env python
import os
import csv
import time
import subprocess
from decimal import *

file_path = '/home/pi/speedtest/speedtestz.csv'

def format_speed(bits_string):
    """ changes a string in bits/s to megabits/s, rounded to two decimal places """
    return (Decimal(bits_string) / 1000000).quantize(Decimal('.01'), rounding=ROUND_UP)

def write_csv(row):
    """ writes a header row if one does not exist, then a test result row """
    # straight from the csv docs
    # see: https://docs.python.org/3/library/csv.html
    with open(file_path, 'a+', newline='') as csvfile:
        writer = csv.writer(csvfile, delimiter=',', quotechar='"')
        if os.stat(file_path).st_size == 0:
            writer.writerow(['Date','Time','Ping','Download (Mbit/s)','Upload (Mbit/s)','myip'])
        writer.writerow(row)

response = subprocess.run(['/usr/local/bin/speedtest-cli', '--csv'], capture_output=True, encoding='utf-8')

# if speedtest-cli exited with no errors / ran successfully
if response.returncode == 0:
    # from the csv docs:
    # "And while the module doesn't directly support parsing strings, it can easily be done"
    # this removes quotes and spaces vs doing a string split on ','
    # csv.reader returns an iterator, so we turn that into a list
    cols = list(csv.reader([response.stdout]))[0]
    # turns 13.45 ping into 13
    ping = Decimal(cols[5]).quantize(Decimal('1.'))
    # speedtest-cli --csv returns speeds in bits/s; convert to Mbit/s
    download = format_speed(cols[6])
    upload = format_speed(cols[7])
    ip = cols[9]
    date = time.strftime('%m/%d/%y')
    time = time.strftime('%H:%M')  # note: this shadows the time module, which is not used again
    write_csv([date,time,ping,download,upload,ip])
else:
    print('speedtest-cli returned error: %s' % response.stderr)
$/usr/local/bin/speedtest-cli --csv-header > speedtestz.csv
$/usr/local/bin/speedtest-cli --csv >> speedtestz.csv
output:
Server ID,Sponsor,Server Name,Timestamp,Distance,Ping,Download,Upload,Share,IP Address
Does that not get you what you're looking for? Run the first command once to create the csv with a header row. Subsequent runs use the append operator '>>', which adds a test result row each time you run it.
All of those regexes will bite you if speedtest-cli, or a library it depends on, decides to change its output format.
Plenty of ways to do it though. Hope this helps
I have a little issue here. I want to read lines from a text file using Python and send them to Amazon Web Services SQS (Simple Queue Service). I've actually managed to do this using boto, but the problem is that the lines don't arrive in order; they come randomly, like line 4, line 1, line 5, etc.
Here is my code:
import boto.sqs

conn = boto.sqs.connect_to_region("us-east-2",
    aws_access_key_id='AKIAIJIQZG5TR3NMW3LQ',
    aws_secret_access_key='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')
q = conn.create_queue('test')

with open('read.txt', 'r') as read_file:
    from boto.sqs.message import RawMessage
    for line in read_file:
        m = RawMessage()
        m.set_body(line)
        q.write(m)
So, what to do? Well, we need to create a FIFO queue (which I also managed to do using boto3 in Python), but now I have problems sending the lines of the text file to it. Here is the code I used to create a FIFO queue in SQS:
import boto3

AWS_ACCESS_KEY = 'AKIAIJIQZG5TR3NMW3LQ'
AWS_SECRET_ACCESS_KEY = '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'

sqs_client = boto3.resource(
    'sqs',
    aws_access_key_id=AWS_ACCESS_KEY,
    aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
    region_name='us-east-2'
)

queue_name = 'demo_queue.fifo'
response = sqs_client.create_queue(
    QueueName=queue_name,
    Attributes={
        'FifoQueue': 'true',
        'ContentBasedDeduplication': 'true'
    }
)

with open('read.txt', 'r') as read_file:
    from boto.sqs.message import RawMessage
    for line in read_file:
        m = RawMessage()
        m.set_body(line)
        queue_name.write(m)
Does anybody know how to solve this? Thanks.
Try replacing
queue_name.write(m)
with a send on the actual queue object: queue_name is just a string, so you should use the Queue object that boto3 returns, e.g. from create_queue or get_queue_by_name, and call its send_message method.
Also, when only specifying MessageBody and MessageGroupId in boto3, make sure that content-based deduplication is enabled for the queue, or specify a MessageDeduplicationId string; otherwise the send will fail.
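A minimal sketch of the send loop using boto3 only (queue and file names taken from your code; the MessageGroupId value 'lines' is just a hypothetical label):
import boto3

sqs = boto3.resource('sqs', region_name='us-east-2')
queue = sqs.get_queue_by_name(QueueName='demo_queue.fifo')

with open('read.txt', 'r') as read_file:
    for line in read_file:
        # A single MessageGroupId preserves ordering within the group;
        # ContentBasedDeduplication on the queue removes the need to pass
        # a MessageDeduplicationId here.
        queue.send_message(MessageBody=line, MessageGroupId='lines')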
Is there an equivalent function in PyMongo or mongoengine to MongoDB's mongodump? I can't seem to find anything in the docs.
Use case: I need to periodically backup a remote mongo database. The local machine is a production server that does not have mongo installed, and I do not have admin rights, so I can't use subprocess to call mongodump. I could install the mongo client locally on a virtualenv, but I'd prefer an API call.
Thanks a lot :-).
For my relatively small database, I eventually used the following solution. It's not really suitable for big or complex databases, but it suffices for my case: it dumps every collection as JSON to the backup directory. It's clunky, but it does not rely on anything other than pymongo.
from os.path import join
import pymongo
from bson.json_util import dumps

def backup_db(backup_db_dir):
    client = pymongo.MongoClient(host=<host>, port=<port>)
    database = client[<db_name>]
    authenticated = database.authenticate(<uname>, <pwd>)
    assert authenticated, "Could not authenticate to database!"
    collections = database.collection_names()
    for collection_name in collections:
        collection = database[collection_name].find()
        jsonpath = join(backup_db_dir, collection_name + ".json")
        with open(jsonpath, 'wb') as jsonfile:
            jsonfile.write(dumps(collection))
The accepted answer no longer works with current PyMongo. Here is revised code:
from os.path import join
import pymongo
from bson.json_util import dumps

def backup_db(backup_db_dir):
    client = pymongo.MongoClient(host=..., port=..., username=..., password=...)
    database = client[<db_name>]
    # collection_names() was removed in PyMongo 4; use list_collection_names().
    collections = database.list_collection_names()
    for collection_name in collections:
        collection = database[collection_name].find()
        jsonpath = join(backup_db_dir, collection_name + ".json")
        # dumps() returns a str, so encode it before writing to the binary file.
        with open(jsonpath, 'wb') as jsonfile:
            jsonfile.write(dumps(collection).encode())

backup_db('.')
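For completeness, restoring is just the reverse: read each JSON file back and insert the documents. A minimal sketch under the same assumptions (connection placeholders as above; it expects the file layout produced by backup_db, and restore_db is a hypothetical helper name):
from os import listdir
from os.path import join
import pymongo
from bson.json_util import loads

def restore_db(backup_db_dir):
    client = pymongo.MongoClient(host=..., port=..., username=..., password=...)
    database = client[<db_name>]
    for filename in listdir(backup_db_dir):
        if not filename.endswith('.json'):
            continue
        collection_name = filename[:-len('.json')]
        with open(join(backup_db_dir, filename), 'rb') as jsonfile:
            # loads() parses the extended-JSON array written by backup_db.
            documents = loads(jsonfile.read().decode())
        if documents:
            database[collection_name].insert_many(documents)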