Saving API text to a text file in Python

I have an API that lets you download proxies. Every time I try to save the response to a text file in Python, it is saved with extra spacing between the entries. If I print it, however, there is no extra spacing to be found. Why is this happening and how can I remove the spacing?
import requests
proxyrequest = requests.get("https://api.proxyscrape.com?request=getproxies&proxytype=http")
with open("proxies.txt", "w") as proxywrite:
    proxywrite.write(proxyrequest.text)
What I get:
1.10.189.84:44452
1.0.160.41:4145
1.0.150.125:4145
1.10.188.93:37389
1.0.142.155:4145
1.0.155.32:4145
1.0.220.235:4145
1.0.161.67:4145
114.104.137.34:1080
What I need:
1.10.189.84:44452
1.0.160.41:4145
1.0.150.125:4145
1.10.188.93:37389
1.0.142.155:4145
1.0.155.32:4145
1.0.220.235:4145
1.0.161.67:4145
114.104.137.34:1080

Here is a solution that works fine :)
import requests
proxyrequest = requests.get("https://api.proxyscrape.com?request=getproxies&proxytype=http")
proxyrequest_format = proxyrequest.text.strip()
proxyrequest_format = proxyrequest_format.replace("\r","")
list_proxies = list(proxyrequest_format.split("\n"))
with open("proxies.txt", "w") as proxywrite:
    for proxy in list_proxies:
        proxywrite.write("%s\n" % proxy)
Output:
1.10.188.202:8080
1.0.210.16:8080
1.119.166.180:8080
1.10.188.85:8080
1.197.204.40:9999
01.10.188.202:8080
1.0.190.69:8080
1.2.254.185:8080

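This is most likely because of the '\r' characters in the text. If the API returns lines ending in '\r\n' and the file is opened in text mode on Windows, every '\n' is written as '\r\n', so each line ends in '\r\r\n' and shows up with extra spacing. Removing only '\n' would leave the stray '\r' characters behind, so remove '\r' instead. Use the following code:
proxywrite.write(proxyrequest.text.replace('\r',''))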
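The line-ending explanation can be checked without calling the API. The following is a minimal sketch, assuming the response uses "\r\n" line endings; `sample_text` and `save_lines` are hypothetical names made up for illustration, and the sample string stands in for `proxyrequest.text`.

```python
import os
import tempfile

# Hypothetical sample standing in for proxyrequest.text (assumed "\r\n" endings).
sample_text = "1.10.189.84:44452\r\n1.0.160.41:4145\r\n1.0.150.125:4145\r\n"

def save_lines(text, path):
    """Write one entry per line using a single LF, on any platform."""
    # splitlines() treats "\r\n", "\n" and "\r" alike; empty entries are skipped.
    lines = [line for line in text.splitlines() if line.strip()]
    # newline="" switches off newline translation, so "\n" is written unchanged.
    with open(path, "w", newline="") as handle:
        handle.write("\n".join(lines) + "\n")
    return lines

path = os.path.join(tempfile.mkdtemp(), "proxies.txt")
saved = save_lines(sample_text, path)

# Read the file back as bytes to inspect the real line endings.
with open(path, "rb") as handle:
    raw = handle.read()
```

Reading the file back in binary mode is the easiest way to see which bytes were actually written, since printing hides the difference between "\r\n" and "\n".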

Related

Python writing to file and json returns None/null instead of value

I'm trying to write data to a file with the following code
#!/usr/bin/python37all
print('Content-type: text/html\n\n')
import cgi
from Alarm import *
import json
htmldata = cgi.FieldStorage()
alarm_time = htmldata.getvalue('alarm_time')
alarm_date = htmldata.getvalue('alarm_date')
print(alarm_time,alarm_date)
data = {'time':alarm_time,'date':alarm_date}
# print(data['time'],data['date'])
with open('alarm_data.txt','w') as f:
    json.dump(data,f)
...
but when opening the file, I get the following output:
{'time':null,'date':null}
The print statement returns what I expect it to: 14:26 2020-12-12.
I've tried this same method with f.write() but it returns both values as None. This is being run on a raspberry pi. Why aren't the correct values being written?
--EDIT--
The JSON string I expect to see is the following: {'time':'14:26','date':'2020-12-12'}
Perhaps you meant:
data = {'time':str(alarm_time), 'date':str(alarm_date)}
I would expect to see your file contents like this:
{"time":"14:26","date":"2020-12-12"}
Note the double quotes: ". JSON is very strict about these things, so don't fool yourself into having single quotes ' in a file and expecting a JSON parser to accept it.
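The quoting rule, and where the null values come from, can be seen in a short sketch. It uses only the standard json module; `parses_as_json` is a hypothetical helper name, and the dictionary values are the ones from the question.

```python
import json

data = {'time': '14:26', 'date': '2020-12-12'}

# Python's single-quoted strings are written out with JSON double quotes.
encoded = json.dumps(data)

# A value that is None at dump time is serialized as null.
missing = json.dumps({'time': None})

def parses_as_json(text):
    """Return True if text is valid JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

single_quoted_ok = parses_as_json("{'time':'14:26'}")
double_quoted_ok = parses_as_json(encoded)
```

A null in the file therefore suggests that the values were None at the moment json.dump ran, which is worth checking before changing how they are converted.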

Changing output of speedtest.py and speedtest-cli to include IP address in output .csv file

I added a line to the Python script "speedtest.py" that I found at pimylifeup.com. I hoped it would let me track the internet provider and IP address along with all the other speed information the script provides. But when I execute it, the code only grabs the next word after the findall call. I would also like it to return the IP address that appears after the provider. I have attached the code below. Can you help me modify it to return what I am looking for?
Here is an example of what is returned by speedtest-cli:
$ speedtest-cli
Retrieving speedtest.net configuration...
Testing from Biglobe (111.111.111.111)...
Retrieving speedtest.net server list...
Selecting best server based on ping...
Hosted by GLBB Japan (Naha) [51.24 km]: 118.566 ms
Testing download speed................................................................................
Download: 4.00 Mbit/s
Testing upload speed......................................................................................................
Upload: 13.19 Mbit/s
$
And this is an example of what is being returned by speedtest.py to my .csv file:
Date,Time,Ping,Download (Mbit/s),Upload(Mbit/s),myip
05/30/20,12:47,76.391,12.28,19.43,Biglobe
This is what I want it to return.
Date,Time,Ping,Download (Mbit/s),Upload (Mbit/s),myip
05/30/20,12:31,75.158,14.29,19.54,Biglobe 111.111.111.111
Or maybe:
05/30/20,12:31,75.158,14.29,19.54,Biglobe,111.111.111.111
Here is the code that I am using. And thank you for any help you can provide.
import os
import re
import subprocess
import time
response = subprocess.Popen('/usr/local/bin/speedtest-cli', shell=True, stdout=subprocess.PIPE).stdout.read().decode('utf-8')
ping = re.findall(r'km]:\s(.*?)\s', response, re.MULTILINE)
download = re.findall(r'Download:\s(.*?)\s', response, re.MULTILINE)
upload = re.findall(r'Upload:\s(.*?)\s', response, re.MULTILINE)
myip = re.findall(r'from\s(.*?)\s', response, re.MULTILINE)
ping = ping[0].replace(',', '.')
download = download[0].replace(',', '.')
upload = upload[0].replace(',', '.')
myip = myip[0]
try:
    f = open('/home/pi/speedtest/speedtestz.csv', 'a+')
    if os.stat('/home/pi/speedtest/speedtestz.csv').st_size == 0:
        f.write('Date,Time,Ping,Download (Mbit/s),Upload (Mbit/s),myip\r\n')
except:
    pass
f.write('{},{},{},{},{},{}\r\n'.format(time.strftime('%m/%d/%y'), time.strftime('%H:%M'), ping, download, upload, myip))
Let me know if this works for you, it should do everything you're looking for
#!/usr/bin/env python3
import os
import csv
import time
import subprocess
from decimal import Decimal, ROUND_UP
file_path = '/home/pi/speedtest/speedtestz.csv'
def format_speed(bits_string):
    """ changes string bit/s to megabits/s and rounds to two decimal places """
    return (Decimal(bits_string) / 1000000).quantize(Decimal('.01'), rounding=ROUND_UP)
def write_csv(row):
    """ writes a header row if one does not exist and test result row """
    # straight from csv man page
    # see: https://docs.python.org/3/library/csv.html
    with open(file_path, 'a+', newline='') as csvfile:
        writer = csv.writer(csvfile, delimiter=',', quotechar='"')
        if os.stat(file_path).st_size == 0:
            writer.writerow(['Date','Time','Ping','Download (Mbit/s)','Upload (Mbit/s)','myip'])
        writer.writerow(row)
response = subprocess.run(['/usr/local/bin/speedtest-cli', '--csv'], capture_output=True, encoding='utf-8')
# if speedtest-cli exited with no errors / ran successfully
if response.returncode == 0:
    # from the csv man page
    # "And while the module doesn't directly support parsing strings, it can easily be done"
    # this will remove quotes and spaces vs doing a string split on ','
    # csv.reader returns an iterator, so we turn that into a list
    cols = list(csv.reader([response.stdout]))[0]
    # turns 13.45 ping to 13
    ping = Decimal(cols[5]).quantize(Decimal('1.'))
    # speedtest-cli --csv returns speed in bits/s, convert to megabits/s
    download = format_speed(cols[6])
    upload = format_speed(cols[7])
    ip = cols[9]
    date = time.strftime('%m/%d/%y')
    # named clock so the time module is not shadowed
    clock = time.strftime('%H:%M')
    write_csv([date,clock,ping,download,upload,ip])
else:
    print('speedtest-cli returned error: %s' % response.stderr)
$ /usr/local/bin/speedtest-cli --csv-header > speedtestz.csv
$ /usr/local/bin/speedtest-cli --csv >> speedtestz.csv
output:
Server ID,Sponsor,Server Name,Timestamp,Distance,Ping,Download,Upload,Share,IP Address
Does that not get you what you're looking for? Run the first command once to create the CSV with a header row. Subsequent runs use the append operator '>>', which adds a test result row each time you run it.
Relying on all of those regexes will bite you if speedtest-cli, or a library it depends on, changes the format of its human-readable output.
There are plenty of ways to do it, though. Hope this helps.
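If you do want to stay with the regex approach from the question, the provider and the IP address can be captured together from the "Testing from" line. This is only a sketch: the sample text is the output quoted in the question, and it assumes speedtest-cli keeps printing the address in parentheses after the provider name.

```python
import re

# Sample speedtest-cli output lines taken from the question.
response = (
    "Retrieving speedtest.net configuration...\n"
    "Testing from Biglobe (111.111.111.111)...\n"
    "Retrieving speedtest.net server list...\n"
)

# Group 1 is the provider name (may contain spaces), group 2 the IP in parentheses.
match = re.search(r"Testing from (.+?) \(([\d.]+)\)", response)
provider, ip = match.groups() if match else ("", "")

# Single CSV field, as in the first desired output format.
myip = "{} {}".format(provider, ip)
```

Writing `provider` and `ip` as two separate fields instead gives the second format from the question, at the cost of needing one more column in the header row.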

parsing API with Python - how to handle JSON with BOM

I'm using Python 2.7.11 on Windows to get JSON data from an API (data on trees in Warsaw, Poland, but never mind that). I want to generate an output CSV file with all the data provided by the API, for further analysis. I started with a script I used for another project (also discussed here on Stack Overflow and corrected for me by @Martin Taylor). That script didn't work, so I tried to modify it using my very basic understanding, googling around and applying the pdb debugger. At the moment, the result looks like this:
import pdb
import json
import urllib2
import csv
pdb.set_trace()
url = "https://api.um.warszawa.pl/api/action/datastore_search/?resource_id=ed6217dd-c8d0-4f7b-8bed-3b7eb81a95ba"
myfile = 'C:/dane/drzewa.csv'
csv_myfile = csv.writer(open(myfile, 'wb'))
cols = ['numer_adres', 'stan_zdrowia', 'y_wgs84', 'dzielnica', 'adres', 'lokalizacja', 'wiek_w_dni', 'srednica_k', 'pnie_obwod', 'miasto', 'jednostka', 'x_pl2000', 'wysokosc', 'y_pl2000', 'numer_inw', 'x_wgs84', '_id', 'gatunek_1', 'gatunek', 'data_wyk_pom']
csv_myfile.writerow(cols)
def api_iterate(myfile):
    while True:
        global url
        print url
        json_page = urllib2.urlopen(url)
        data = json.load(json_page)
        json_page.close()
        for data_object in data['result']['records']:
            csv_myfile.writerow([data_object[col] for col in cols])
        try:
            url = data['_links']['next']
        except KeyError as e:
            break
with open(myfile, 'wb'):
    api_iterate(myfile)
I'm a very fresh Python user, so I get confused all the time. I have now reached the point where, while reading the objects in the JSON dictionary, I get a KeyError associated with the 'x_wgs84' element. I suppose it has something to do with the fact that in the source URL this element is preceded by a U+FEFF Unicode character. I tried to get around this, but I got stuck and would appreciate assistance.
I suspect the code may be corrupt in several other ways - as I mentioned, I'm a very unskilled programmer (yet).
You need to use the key together with the Unicode character.
An easy way to find out what the key actually looks like is to print the keys:
>>> import requests
>>> res = requests.get('https://api.um.warszawa.pl/api/action/datastore_search/?resource_id=ed6217dd-c8d0-4f7b-8bed-3b7eb81a95ba')
>>> data = res.json()
>>> records = data['result']['records']
>>> records[0]
{u'numer_adres': u'', u'stan_zdrowia': u'dobry', u'y_wgs84': u'52.21865', u'y_pl2000': u'5787241.04475524', u'adres': u'ul. ALPEJSKA', u'x_pl2000': u'7511793.96937063', u'lokalizacja': u'Ulica ALPEJSKA', u'wiek_w_dni': u'60', u'miasto': u'Warszawa', u'jednostka': u'Dzielnica Wawer', u'pnie_obwod': u'73', u'wysokosc': u'14', u'data_wyk_pom': u'20130709', u'dzielnica': u'Wawer', u'\ufeffx_wgs84': u'21.172584', u'numer_inw': u'D386200', u'_id': 125435, u'gatunek_1': u'Quercus robur', u'gatunek': u'd\u0105b szypu\u0142kowy', u'srednica_k': u'7'}
>>> records[0].keys()
[u'numer_adres', u'stan_zdrowia', u'y_wgs84', u'y_pl2000', u'adres', u'x_pl2000', u'lokalizacja', u'wiek_w_dni', u'miasto', u'jednostka', u'pnie_obwod', u'wysokosc', u'data_wyk_pom', u'dzielnica', u'\ufeffx_wgs84', u'numer_inw', u'_id', u'gatunek_1', u'gatunek', u'srednica_k']
>>> records[0][u'\ufeffx_wgs84']
u'21.172584'
As you can see, to get your key, you need to write it as '\ufeffx_wgs84' with the unicode character that is causing trouble.
Note: I don't know if you are using Python 2 or 3, but in Python 2 you might need to put a u before your string literal to declare it as a Unicode string.
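An alternative to hard-coding the odd key is to strip the byte order mark from every key before looking anything up. The sketch below is written for Python 3 and uses a shortened record from the output above; `strip_bom_from_keys` is a hypothetical helper name. In Python 2 the string literals would need the u prefix.

```python
BOM = "\ufeff"  # U+FEFF, the byte order mark seen in front of x_wgs84

def strip_bom_from_keys(record):
    """Return a copy of record with any leading BOM removed from its keys."""
    return {key.lstrip(BOM): value for key, value in record.items()}

# Shortened record based on the API output shown above.
record = {"y_wgs84": "52.21865", "\ufeffx_wgs84": "21.172584", "_id": 125435}
clean = strip_bom_from_keys(record)
```

After this step the column list from the question can keep using the plain 'x_wgs84' name.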

gammu receive sms message python fails

I have found a script on this website http://wammu.eu/docs/manual/smsd/run.html
#!/usr/bin/python
import os
import sys
numparts = int(os.environ['DECODED_PARTS'])
# Are there any decoded parts?
if numparts == 0:
    print('No decoded parts!')
    sys.exit(1)
# Get all text parts
text = ''
for i in range(1, numparts + 1):
    varname = 'DECODED_%d_TEXT' % i
    if varname in os.environ:
        text = text + os.environ[varname]
# Do something with the text
f = open('/home/pi/output.txt','w')
f.write('Number %s have sent text: %s' % (os.environ['SMS_1_NUMBER'], text))
I know that my gammu-smsd is working fine, because I can turn off the LED lamp on my Raspberry Pi by sending an SMS to it. My question is: why is this script failing? Nothing happens, and when I try to run the script by itself it also fails.
What I would like to do is just receive the sms and then read the content and save the content and phonenumber which sent the sms to a file.
I hope you understand my issue.
Thank you in advance, all the best.
In the gammu-smsd config file, you can use the file backend which does this for you automatically.
See this example from the gammu documentation
http://wammu.eu/docs/manual/smsd/config.html#files-service
[smsd]
Service = files
PIN = 1234
LogFile = syslog
InboxPath = /var/spool/sms/inbox/
OutboxPath = /var/spool/sms/outbox/
SentSMSPath = /var/spool/sms/sent/
ErrorSMSPath = /var/spool/sms/error/
Also see options for the file backend to tailor to your needs.
http://wammu.eu/docs/manual/smsd/config.html#files-backend-options
Hope this helps :)
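Separately from the files backend, one likely reason the script fails when started by hand is that DECODED_PARTS and the other variables only exist when gammu-smsd launches the script, so os.environ['DECODED_PARTS'] raises a KeyError. The following is a minimal sketch of a more forgiving version; `collect_sms_text` and the values in `fake_env` are hypothetical and only imitate what gammu-smsd would export.

```python
def collect_sms_text(env):
    """Join all decoded text parts found in an smsd-style environment mapping."""
    # .get() with a default avoids a KeyError when the variable is not set,
    # for example when the script is started by hand instead of by gammu-smsd.
    numparts = int(env.get('DECODED_PARTS', '0'))
    parts = []
    for i in range(1, numparts + 1):
        parts.append(env.get('DECODED_%d_TEXT' % i, ''))
    return ''.join(parts)

# Hypothetical values imitating the variables gammu-smsd would set.
fake_env = {
    'DECODED_PARTS': '2',
    'DECODED_1_TEXT': 'Hello ',
    'DECODED_2_TEXT': 'world',
    'SMS_1_NUMBER': '+4512345678',
}

text = collect_sms_text(fake_env)
empty = collect_sms_text({})  # no variables set: no exception, empty text
line = 'Number %s has sent text: %s' % (fake_env.get('SMS_1_NUMBER', 'unknown'), text)
```

In the real script the function would be called as collect_sms_text(os.environ), and passing a plain dictionary as above makes it possible to test the logic without a modem.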

expected string or buffer when processing text file with python

I'm processing a large (120 MB) text file from my Thunderbird IMAP directory and attempting to extract to/from info from the headers using mbox and regex. The process runs for a while until I eventually get an exception: "TypeError: expected string or buffer".
The exception references the fifth line of this code:
PAT_EMAIL = re.compile(r"[0-9A-Za-z._-]+@[0-9A-Za-z._-]+")
temp_list = []
mymbox = mbox("data.txt")
for email in mymbox.values():
    from_address = PAT_EMAIL.findall(email["from"])
    to_address = PAT_EMAIL.findall(email["to"])
    for item in from_address:
        temp_list.append(item) #items are added to a temporary list where they are sorted then written to file
I've run the code on other (smaller) files, so I'm guessing the issue is my file. The file appears to be just a bunch of text. Can someone point me in the right direction for debugging this?
There can only be one from address (I think!):
In the following:
from_address = PAT_EMAIL.findall(email["from"])
I have a feeling you're trying to duplicate the work of email.message_from_file and email.utils.parseaddr
>>> from email.utils import parseaddr
>>> s = "Jon Clements <jon@example.com>"
>>> parseaddr(s)
('Jon Clements', 'jon@example.com')
So you can use parseaddr(email['from'])[1] to get the email address and use that.
Similarly, you may wish to look at email.utils.getaddresses to handle to and cc addresses...
Well, I didn't solve the issue but have worked around it for my own purposes. I inserted a try statement so that the iteration just continues past any TypeError. For every thousand email addresses I'm getting about 8 failures, which will suffice. Thanks for your input!
PAT_EMAIL = re.compile(r"[0-9A-Za-z._-]+@[0-9A-Za-z._-]+")
temp_list = []
mymbox = mbox("data.txt")
for email in mymbox.values():
    try:
        from_address = PAT_EMAIL.findall(email["from"])
    except TypeError:
        print "TypeError!"
    try:
        to_address = PAT_EMAIL.findall(email["to"])
    except TypeError:
        print "TypeError!"
    for item in from_address:
        temp_list.append(item) #items are added to a temporary list where they are sorted then written to file
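A likely cause of the TypeError is that some messages have no From or To header at all: indexing a message with a missing header returns None, and findall cannot search None. The sketch below, written for Python 3, combines that guard with the parseaddr and getaddresses suggestion above. `extract_addresses` is a hypothetical helper, and the two messages are made-up samples rather than data from the mbox file.

```python
from email import message_from_string
from email.utils import getaddresses, parseaddr

def extract_addresses(message):
    """Return (from_address, to_addresses) without failing on missing headers."""
    # message["from"] would be None when the header is absent; get() with a
    # default hands parseaddr an empty string instead.
    from_address = parseaddr(message.get("from", ""))[1]
    # get_all() returns every To header; the default [] covers a missing one.
    to_addresses = [addr for _, addr in getaddresses(message.get_all("to", []))]
    return from_address, to_addresses

# Made-up sample messages: one complete, one without a From header.
complete = message_from_string(
    "From: Jon Clements <jon@example.com>\n"
    "To: a@example.com, B <b@example.com>\n"
    "\n"
    "body\n"
)
no_from = message_from_string("To: a@example.com\n\nbody\n")

complete_result = extract_addresses(complete)
no_from_result = extract_addresses(no_from)
```

With this approach the try/except blocks are no longer needed, and messages that lack a header simply contribute an empty address that can be filtered out.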
