I need to get geo data for a bunch of IPs (eventually I will need data for 3k+ IPs). I was able to successfully get geo data for individual IPs. Now I'm trying to create a loop which iterates through IPs stored as separate lines in a text file and calls the ipstack API to get geo data for each one. But the code returns data only for the last IP in the file, with a 'missing_access_key' error for all the other ones. I'm a Python beginner, so any help would be appreciated.
fh = open('IPs.txt')
for line in fh:
    ip = line
    api = 'http://api.ipstack.com/' + ip + '?access_key=' + access_key
    result = urllib.request.urlopen(api).read()
    result = result.decode()
    result = json.loads(result)
    print(result)
I also tried reading all the lines with readlines() first:
fh = open('IPs.txt', 'r')
Lines = fh.readlines()
for line in Lines:
    ip = line
    api = 'http://api.ipstack.com/' + ip + '?access_key=' + access_key
    result = urllib.request.urlopen(api).read()
    result = result.decode()
    result = json.loads(result)
    print(result)
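A likely cause is that each line read from the file keeps its trailing newline, so the URL becomes 'http://api.ipstack.com/1.2.3.4\n?access_key=...' and the query string never reaches the API; that matches the 'missing_access_key' error for every IP except the last one, since the last line usually has no newline. A minimal sketch of the loop with the whitespace stripped, assuming access_key holds your real ipstack key:

import json
import urllib.request

access_key = 'YOUR_ACCESS_KEY'  # assumption: placeholder for the real ipstack key

with open('IPs.txt') as fh:
    for line in fh:
        ip = line.strip()  # drop the trailing newline and any spaces
        if not ip:         # skip blank lines
            continue
        api = 'http://api.ipstack.com/' + ip + '?access_key=' + access_key
        result = json.loads(urllib.request.urlopen(api).read().decode())
        print(result)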
I'm trying to grab Twitter user data by screen name using Python.
All the script does is loop over each of the Twitter accounts in the ids variable and, for each one, grab its profile information and add it as a row to the output file.
But I'm getting an error.
This is my code:
# LIST OF TWITTER USER IDS
ids = "4816,9715012,13023422, 13393052, 14226882, 14235041, 14292458, 14335586, 14730894,\
15029174, 15474846, 15634728, 15689319, 15782399, 15946841, 16116519, 16148677, 16223542,\
16315120, 16566133, 16686673, 16801671, 41900627, 42645839, 42731742, 44157002, 44988185,\
48073289, 48827616, 49702654, 50310311, 50361094,"

# THE VARIABLE USERS IS A JSON FILE WITH DATA ON THE 32 TWITTER USERS LISTED ABOVE
users = t.lookup_user(user_id=ids)

# NAME OUR OUTPUT FILE - %i WILL BE REPLACED BY CURRENT MONTH, DAY, AND YEAR
outfn = "twitter_user_data_%i.%i.%i.txt" % (now.month, now.day, now.year)

# NAMES FOR HEADER ROW IN OUTPUT FILE
fields = "id screen_name name created_at url followers_count friends_count statuses_count \
favourites_count listed_count \
contributors_enabled description protected location lang expanded_url".split()

# INITIALIZE OUTPUT FILE AND WRITE HEADER ROW
outfp = open(outfn, "w")
#outfp.write(string.join(fields, "\t") + "\n") # header
outfp.write("\t".join(fields) + "\n") # header

# THIS BLOCK WILL LOOP OVER EACH OF THESE IDS, CREATE VARIABLES, AND OUTPUT TO FILE
for entry in users:
    # CREATE EMPTY DICTIONARY
    r = {}
    for f in fields:
        r[f] = ""
    # ASSIGN VALUE OF 'ID' FIELD IN JSON TO 'ID' FIELD IN OUR DICTIONARY
    r['id'] = entry['id']
    # SAME WITH 'SCREEN_NAME' HERE, AND FOR REST OF THE VARIABLES
    r['screen_name'] = entry['screen_name']
    r['name'] = entry['name']
    r['created_at'] = entry['created_at']
    r['url'] = entry['url']
    r['followers_count'] = entry['followers_count']
    r['friends_count'] = entry['friends_count']
    r['statuses_count'] = entry['statuses_count']
    r['favourites_count'] = entry['favourites_count']
    r['listed_count'] = entry['listed_count']
    r['contributors_enabled'] = entry['contributors_enabled']
    r['description'] = entry['description']
    r['protected'] = entry['protected']
    r['location'] = entry['location']
    r['lang'] = entry['lang']
    # NOT EVERY ID WILL HAVE A 'URL' KEY, SO CHECK FOR ITS EXISTENCE WITH IF CLAUSE
    if 'url' in entry['entities']:
        r['expanded_url'] = entry['entities']['url']['urls'][0]['expanded_url']
    else:
        r['expanded_url'] = ''
    print(r)

    # CREATE EMPTY LIST
    lst = []
    # ADD DATA FOR EACH VARIABLE
    for f in fields:
        lst.append(str(r[f]).replace("\/", "/"))
    # WRITE ROW WITH DATA IN LIST
    #outfp.write(string.join(lst, "\t").encode("utf-8") + "\n")
    outfp.write("\t".join(lst).encode('utf-8') + '\n')

outfp.close()
The error message
TypeError Traceback (most recent call last)
<ipython-input-54-441137b1bb4d> in <module>()
37 #WRITE ROW WITH DATA IN LIST
38 #outfp.write(string.join(lst, "\t").encode("utf-8") + "\n")
---> 39 outfp.write("\t".join(lst).encode('utf-8') + '\n')
40
41 outfp.close()
TypeError: can't concat str to bytes
Any idea on how to fix this? The version of Python is 3.6.5
Your help will be greatly appreciated. Thanks.
Edit:
This is a screenshot of part of my file after I opened the output file in binary mode:
outfp.write("\t".join(lst).encode('utf-8') + '\n')
After you do .encode() on a string you get an instance of bytes. You can't add another string (like \n) to bytes. That's what the error is telling you.
So you need to add the \n before you encode the string. Like below:
outfp.write(("\t".join(lst) + '\n').encode('utf-8'))
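Alternatively, since this is Python 3, you could open the output file in text mode with an explicit encoding and skip the manual .encode() altogether. A minimal sketch, reusing outfn, fields and lst from the script above:

outfp = open(outfn, "w", encoding="utf-8")   # text mode with an explicit encoding
outfp.write("\t".join(fields) + "\n")        # header row, plain str is fine
# ... build lst for each user exactly as before ...
outfp.write("\t".join(lst) + "\n")           # no .encode() needed inside the loop
outfp.close()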
I have a script which basically reads a domain from a text file and finds its email addresses. I want to add multiple domain names (one per line); the script should take each domain, run the function, and move on to the next line when it finishes. I tried to google for a solution but I'm not sure how to find the appropriate answer.
f = open("demo.txt", "r")
url = f.readline()
extractUrl(url)
def extractUrl(url):
try:
print("Searching emails... please wait")
count = 0
listUrl = []
req = urllib.request.Request(
url,
data=None,
headers={
'User-Agent': ua.random
})
try:
conn = urllib.request.urlopen(req, timeout=10)
status = conn.getcode()
contentType = conn.info().get_content_type()
html = conn.read().decode('utf-8')
emails = re.findall(
r '[A-Za-z0-9._%+-]+#[A-Za-z0-9.-]+\.[A-Za-z]{2,4}', html)
for email in emails:
if (email not in listUrl):
count += 1
print(str(count) + " - " + email)
listUrl.append(email)
print(str(count) + " emails were found")
Python files are iterable, so it's basically as simple as:
for line in f:
    extractUrl(line)
But you may want to do it right (ensure the file is closed whatever happens, ignore possible empty lines, etc.):
# use `with open(...)` to ensure the file will be correctly closed
with open("demo.txt", "r") as f:
    # use `enumerate` to get line numbers too -
    # we might need them for information
    for lineno, line in enumerate(f, 1):
        # make sure the line is clean (no leading / trailing whitespace)
        # and not empty:
        line = line.strip()
        # skip empty lines
        if not line:
            continue
        # ok, this one _should_ match - but something could go wrong
        try:
            extractUrl(line)
        except Exception as e:
            # mentioning the line number in the error report might help debugging
            print("oops, failed to get urls for line {} ('{}') : {}".format(lineno, line, e))
I have multiple routers/switches. I want to read each router's IP address from a CSV file and write the outputs into an Excel workbook, creating one sheet per device.
I can connect and get the outputs with the code below, but I couldn't create the Excel file with multiple dynamic sheets in it. I tried xlsxwriter and xlwt; what is your suggestion?
import csv
from netmiko import ConnectHandler

router = {}
output_dict = {}

with open('Devices.csv', mode='r') as devicesFile:
    devicesDict = csv.DictReader(devicesFile, dialect='excel')
    for row in devicesDict:
        devicetype = row['device_type']
        hostname = row['hostname']
        ipaddress = row['ip']
        username = row['username']
        password = row['password']
        router = {'host': hostname, 'device_type': devicetype, 'ip': ipaddress, 'username': username, 'password': password}
        net_connect = ConnectHandler(**router)
        output = net_connect.send_command('display clock')
        print('\n\n>>> Hostname {0} <<<'.format(row['hostname']))
        print(output)
        print('>>>>>>>>> End <<<<<<<<<')

def net_connect(row, output_q):
    ipaddress = row['ip']
    output_dict[ipaddress] = output
    output_q.put(output_dict)
My working code so far is below; it splits the output into lines and writes them to Excel.
for y, line in enumerate(output.splitlines()):
    rowx = y + 3
    for x, value in enumerate(line.splitlines()):
        colx = x + 2
        if value.isdigit():
            value = int(value)
        sh.write(rowx, colx, value)
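No answer is shown for this one, so the following is only a minimal sketch of one way to do it with xlsxwriter: create the workbook once, add a worksheet named after each device, and write that device's output lines to its own sheet. The devices list below is a made-up stand-in for the (hostname, output) pairs collected by the netmiko loop above:

import xlsxwriter

# assumption: stand-in data; in the real script this would come from the
# ConnectHandler loop (hostname from the CSV row, output from send_command)
devices = [
    ('router1', 'line 1\nline 2'),
    ('switch2', 'line 1'),
]

workbook = xlsxwriter.Workbook('device_outputs.xlsx')
for hostname, output in devices:
    # Excel sheet names must be unique and at most 31 characters long
    sheet = workbook.add_worksheet(hostname[:31])
    for rownum, line in enumerate(output.splitlines()):
        sheet.write(rownum, 0, line)
workbook.close()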
I'm using Tweepy to collect tweets from the Twitter API by their Tweet ID.
I'm trying to read in a file full of the IDs, get the previous tweet from the conversation stream, then store that tweet and its author's screen name etc. in a text file. Some of the tweets have been deleted or the user's profile has been set to private, in which case I want to ignore that tweet and move on to the next. However, for some reason, I'm not collecting all accessible tweets. It's storing maybe 3/4 of all tweets that aren't private and haven't been deleted. Any ideas why it's not catching everything?
Thanks in advance.
def getTweet(tweetID, tweetObj, callTweetObj, i):
    tweet = callTweetObj.text.encode("utf8")
    callUserName = callTweetObj.user.screen_name
    callTweetID = tweetObj.in_reply_to_status_id_str

    with open("call_tweets.txt", "a") as calltweets:
        output = (callTweetObj.text.encode('utf-8') + "\t" + callTweetID + "\t" + tweetID)
        calltweets.write(output)
        print output

    with open("callauthors.txt", "a") as callauthors:
        cauthors = (callUserName + "\t" + "\t" + callTweetID + "\n")
        callauthors.write(cauthors)

    with open("callIDs.txt", "a") as callIDs:
        callIDs.write(callTweetID + "\n")

    with open("newResponseIDs.txt", "a") as responseIDs:
        responseIDs.write(tweetID)

count = 0
file = "Response_IDs.txt"
with open(file, 'r+') as f:
    lines = f.readlines()
    for i in range(0, len(lines)):
        tweetID = lines[i]
        sleep(5)
        try:
            tweetObj = api.get_status(tweetID)
            callTweetID = tweetObj.in_reply_to_status_id_str
            callTweetObj = api.get_status(callTweetID)
            getTweet(tweetID, tweetObj, callTweetObj, i)
            count = count + 1
            print count
        except:
            pass
You haven't given any information about the responses coming back from api.get_status, so it's hard to tell exactly what the error is.
However, it might be that you have reached the rate limit for the statuses/show/:id request. The API documentation specifies that this request is limited to 180 calls per window.
You can use Tweepy to call application/rate_limit_status:
response = api.rate_limit_status()
remaining = response['resources']['statuses']['/statuses/show/:id']['remaining']
assert remaining > 0
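As a side note, the bare except: pass in the loop silently swallows rate-limit errors (and every other error), which would explain tweets disappearing without any message. A hedged sketch, assuming the standard Tweepy 3.x API client and the auth/tweetID variables from the script above, that lets Tweepy wait out the rate limit and at least logs skipped tweets:

import tweepy

# assumption: auth is the already-configured OAuthHandler from the asker's setup
api = tweepy.API(auth, wait_on_rate_limit=True)

try:
    tweetObj = api.get_status(tweetID)
except tweepy.TweepError as e:
    # deleted or protected tweets are still skipped, but now visibly
    print("skipping {}: {}".format(tweetID.strip(), e))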
I have a function which takes a list of custom objects, conforms some values, then writes them to a CSV file. Something really strange is happening: when the list only contains a few objects, the resulting CSV file is always blank. When the list is longer, the function works fine. Is it some kind of weird anomaly with the temporary file, perhaps?
I should point out that this function returns the temporary file to a web server, allowing the user to download the CSV. The web server function is shown below the main function.
def makeCSV(things):
    from tempfile import NamedTemporaryFile
    # make the csv headers from an object
    headers = [h for h in dir(things[0]) if not h.startswith('_')]

    # this just pretties up the object and returns it as a dict
    def cleanVals(item):
        new_item = {}
        for h in headers:
            try:
                new_item[h] = getattr(item, h)
            except:
                new_item[h] = ''
            if isinstance(new_item[h], list):
                if new_item[h]:
                    new_item[h] = [z.__str__() for z in new_item[h]]
                    new_item[h] = ', '.join(new_item[h])
                else:
                    new_item[h] = ''
            new_item[h] = new_item[h].__str__()
        return new_item

    things = map(cleanVals, things)

    f = NamedTemporaryFile(delete=True)
    dw = csv.DictWriter(f, sorted(headers), restval='', extrasaction='ignore')
    dw.writer.writerow(dw.fieldnames)
    for t in things:
        try:
            dw.writerow(t)
            # I can always see the dicts here...
            print t
        except Exception as e:
            # and there are no exceptions
            print e
    return f
Web server function:
f = makeCSV(search_results)
response = FileResponse(f.name)
response.headers['Content-Disposition'] = (
    "attachment; filename=export_%s.csv" % collection)
return response
Any help or advice greatly appreciated!
Summarizing eumiro's answer: the file needs to be flushed. Call f.flush() at the end of makeCSV().
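A minimal sketch of that fix, showing just the end of makeCSV() with the flush added:

    for t in things:
        try:
            dw.writerow(t)
        except Exception as e:
            print e
    # flush the buffered rows to disk before handing the file to the web server;
    # with only a few small rows, nothing may have hit the file yet
    f.flush()
    return f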