Traceback error when using search_engine_parser - python

I am writing a simple script to search Google using search_engine_parser; it was working fine until yesterday. All search queries are stored in a test.csv file:
from search_engine_parser.core.engines.google import Search as GoogleSearch
import csv

with open('/Users/John/Desktop/test.csv') as csv_file:
    csv_reader = csv.reader(csv_file)
    header = next(csv_reader)
    # Check that the file is not empty
    if header != None:
        for row in csv_reader:
            gsearch = GoogleSearch()
            gresults = gsearch.search(row)
            print(gresults["titles"][0])
I am getting the error below:
"/Users/John/Documents/python scripts/venv/bin/python" "/Users/John/Documents/python scripts/search_parser2.py"
Samsung Galaxy J7 - Full phone specifications - GSMArena.com
Search for samsung sm-n920c - GSMArena.com
Traceback (most recent call last):
File "/Users/John/Documents/python scripts/venv/lib/python3.7/site-packages/search_engine_parser/core/base.py", line 240, in get_results
search_results = self.parse_result(results, **kwargs)
File "/Users/John/Documents/python scripts/venv/lib/python3.7/site-packages/search_engine_parser/core/base.py", line 151, in parse_result
rdict = self.parse_single_result(each, **kwargs)
File "/Users/John/Documents/python scripts/venv/lib/python3.7/site-packages/search_engine_parser/core/engines/google.py", line 74, in parse_single_result
title = r_elem.find('div', class_='BNeawe').text
AttributeError: 'NoneType' object has no attribute 'text'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/John/Documents/python scripts/search_parser2.py", line 10, in <module>
gresults = gsearch.search(row)
File "/Users/John/Documents/python scripts/venv/lib/python3.7/site-packages/search_engine_parser/core/base.py", line 270, in search
return self.get_results(soup, **kwargs)
File "/Users/John/Documents/python scripts/venv/lib/python3.7/site-packages/search_engine_parser/core/base.py", line 244, in get_results
"The returned results could not be parsed. This might be due to site updates or "
search_engine_parser.core.exceptions.NoResultsOrTrafficError: The returned results could not be parsed. This might be due to site updates or server errors. Drop an issue at https://github.com/bisoncorps/search-engine-parser if this persists
Process finished with exit code 1
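The exception message itself names the two usual causes: Google changed its result markup, or it is throttling the scraper. A minimal defensive sketch (assuming each CSV row joins into a single query string, and that pausing between requests helps against throttling) would catch the error and keep going:

import csv
import time

from search_engine_parser.core.engines.google import Search as GoogleSearch
from search_engine_parser.core.exceptions import NoResultsOrTrafficError

gsearch = GoogleSearch()

with open('/Users/John/Desktop/test.csv') as csv_file:
    csv_reader = csv.reader(csv_file)
    header = next(csv_reader)  # skip the header row
    for row in csv_reader:
        query = " ".join(row)  # assumption: the whole row forms one query
        try:
            gresults = gsearch.search(query)
            print(gresults["titles"][0])
        except NoResultsOrTrafficError:
            # markup change or rate limiting; skip this query instead of crashing
            print("no parsable results for {!r}".format(query))
        time.sleep(5)  # pause between requests to reduce the chance of throttling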

Merging PDFs in python using convertapi

I'm trying to use the convertapi module to merge PDFs in Python 3.8. I've tried multiple ways, but I'm unable to figure out the origin of the returned error. Here is my function:
def merger(output_path, input_paths):
    dictFiles = {}
    for i, path in enumerate(input_paths):
        dictFiles[f'File[{i}]'] = path
    convertapi.api_secret = 'my-api-secret'
    result = convertapi.convert('merge', dictFiles, from_format = 'pdf')
    result.save_files(output_path)
And here is the error that is returned:
Exception in Tkinter callback
Traceback (most recent call last):
File "C:\Python\Python38\lib\site-packages\convertapi\client.py", line 46, in handle_response
r.raise_for_status()
File "C:\Python\Python38\lib\site-packages\requests\models.py", line 941, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url:
https://v2.convertapi.com/convert/pdf/to/merge?Secret=my-api-secret
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
...
File "D:\Desktop\merger.py", line 46, in merger
result = convertapi.convert('merge', dictFiles, from_format = 'pdf')
File "C:\Python\Python38\lib\site-packages\convertapi\api.py", line 7, in convert
return task.run()
File "C:\Python\Python38\lib\site-packages\convertapi\task.py", line 26, in run
response = convertapi.client.post(path, params, timeout = timeout)
File "C:\Python\Python38\lib\site-packages\convertapi\client.py", line 16, in post
return self.handle_response(r)
File "C:\Python\Python38\lib\site-packages\convertapi\client.py", line 49, in handle_response
raise ApiError(r.json())
convertapi.exceptions.ApiError: Parameter validation error. Code: 4000. {'Files': ['Files array item
count must be greater than 0.']}
I suspect the error comes from the fact that the dict is created before the merge, because when I enter the dictionary directly into convertapi.convert(), I don't get the same error:
def merger(output_path, input_paths):
    convertapi.api_secret = 'my-api-secret'
    convertapi.convert('merge', {
        'Files[0]': 'path/to/file1.pdf',
        'Files[1]': 'path/to/file2.pdf'
    }, from_format = 'pdf').save_files(output_path)
Here I get a different error:
Exception in Tkinter callback
Traceback (most recent call last):
File "C:\Python\Python38\lib\site-packages\convertapi\client.py", line 46, in handle_response
r.raise_for_status()
File "C:\Python\Python38\lib\site-packages\requests\models.py", line 941, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url:
https://v2.convertapi.com/convert/pdf/to/merge?Secret=my-api-secret
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
...
File "D:\Desktop\merger.py", line 50, in merger
convertapi.convert('merge', {
File "C:\Python\Python38\lib\site-packages\convertapi\api.py", line 7, in convert
return task.run()
File "C:\Python\Python38\lib\site-packages\convertapi\task.py", line 26, in run
response = convertapi.client.post(path, params, timeout = timeout)
File "C:\Python\Python38\lib\site-packages\convertapi\client.py", line 16, in post
return self.handle_response(r)
File "C:\Python\Python38\lib\site-packages\convertapi\client.py", line 49, in handle_response
raise ApiError(r.json())
convertapi.exceptions.ApiError: Unable to download remote file. Code: 5007.
Note that I'm not using PyPDF2 to merge the files here, because I get errors when a file contains certain characters (mostly Chinese characters).
If you go to https://www.convertapi.com/pdf-to-merge and scroll down, you will easily find the snippet builder, and among all the programming snippets you will find a Python one.
convertapi.api_secret = 'Your_secret'
convertapi.convert('merge', {
    'Files[0]': '/path/to/dpa.pdf',
    'Files[1]': '/path/to/sample.pdf'
}, from_format = 'pdf').save_files('/path/to/dir')
And if you take some time to analyze the snippet, you will see that the plural Files is used for the files array, not the singular File as in your code:
def merger(output_path, input_paths):
    dictFiles = {}
    for i, path in enumerate(input_paths):
        dictFiles[f'File[{i}]'] = path  # singular 'File[...]' -- this is the bug
    convertapi.api_secret = 'my-api-secret'
    result = convertapi.convert('merge', dictFiles, from_format = 'pdf')
    result.save_files(output_path)
convertapi.exceptions.ApiError: Parameter validation error. Code: 4000. {'Files': ['Files array item
count must be greater than 0.']}
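With that reading, a minimal corrected sketch of your function (the only change is the plural key name) would be:

import convertapi

def merger(output_path, input_paths):
    dictFiles = {}
    for i, path in enumerate(input_paths):
        dictFiles[f'Files[{i}]'] = path  # plural 'Files', matching the official snippet
    convertapi.api_secret = 'my-api-secret'
    result = convertapi.convert('merge', dictFiles, from_format = 'pdf')
    result.save_files(output_path)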
As for the second error, you didn't provide the code so I can't help you.

AttributeError: 'AudioFileClip' object has no attribute 'reader'

When I tried to run this code, I got the AttributeError below, but I don't know why. Please help me.
import os
import random
from moviepy.editor import *

List_image = os.listdir("müzik\kapaklar")
kapak_num = random.sample(range(1, len(List_image)+1), len(List_image))
List_songs = os.listdir("müzik\hazırlanacaklar")

for x in range(len(List_songs)):
    songs = List_songs[x+1]
    kapak = List_image[kapak_num[x+1]]
    audio = AudioFileClip(songs) ### problem is here ###
    image = ImageClip(kapak).set_duration(audio.duration)
    video = image.set_audio(audio)
    outfile = f"müzik\yapılanlar\%n_with_image.mp4",songs
    video.write_videofile(outfile, fps=1)
Error message:
C:\Users\erkan\Desktop\ChatCop\venv\Scripts\python.exe "C:/Users/erkan/Desktop/ChatCop/deus ex machina.py"
Traceback (most recent call last):
File "C:/Users/erkan/Desktop/ChatCop/deus ex machina.py", line 15, in <module>
audio = AudioFileClip(songs) # the problem is here
File "C:\Users\erkan\Desktop\ChatCop\venv\lib\site-packages\moviepy\audio\io\AudioFileClip.py", line 72, in __init__
buffersize=buffersize)
File "C:\Users\erkan\Desktop\ChatCop\venv\lib\site-packages\moviepy\audio\io\readers.py", line 50, in __init__
infos = ffmpeg_parse_infos(filename)
File "C:\Users\erkan\Desktop\ChatCop\venv\lib\site-packages\moviepy\video\io\ffmpeg_reader.py", line 276, in ffmpeg_parse_infos
"path.")%filename)
OSError: MoviePy error: the file 70s Japanese Jazz Mix (Rare Groove, Jazz-Funk, Hard Bop, Modal, Fusion, Breaks).mp3 could not be found!
Please check that you entered the correct path.
Exception ignored in: <bound method AudioFileClip.__del__ of <moviepy.audio.io.AudioFileClip.AudioFileClip object at 0x0000026210D45710>>
Traceback (most recent call last):
File "C:\Users\erkan\Desktop\ChatCop\venv\lib\site-packages\moviepy\audio\io\AudioFileClip.py", line 94, in __del__
self.close()
File "C:\Users\erkan\Desktop\ChatCop\venv\lib\site-packages\moviepy\audio\io\AudioFileClip.py", line 89, in close
if self.reader:
AttributeError: 'AudioFileClip' object has no attribute 'reader'
Process finished with exit code 1
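The OSError itself names the cause: os.listdir() returns bare file names, so AudioFileClip looks for the mp3 in the current working directory instead of in müzik\hazırlanacaklar. A sketch of the loop with the directory joined back onto each name (and without the x+1 indexing, which overruns both lists on the last iteration) could look like this:

import os
import random
from moviepy.editor import AudioFileClip, ImageClip

image_dir = os.path.join("müzik", "kapaklar")
song_dir = os.path.join("müzik", "hazırlanacaklar")
out_dir = os.path.join("müzik", "yapılanlar")

images = os.listdir(image_dir)
songs = os.listdir(song_dir)
order = random.sample(range(len(images)), len(images))  # shuffled valid indices

for x, song in enumerate(songs):  # assumes at least as many covers as songs
    kapak = images[order[x]]
    audio = AudioFileClip(os.path.join(song_dir, song))  # full path, not bare name
    image = ImageClip(os.path.join(image_dir, kapak)).set_duration(audio.duration)
    video = image.set_audio(audio)
    name = os.path.splitext(song)[0]
    video.write_videofile(os.path.join(out_dir, f"{name}_with_image.mp4"), fps=1)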

memory error when retrieving data from Songkick

I have built a scraper to retrieve concert data from Songkick using their API. However, it takes a lot of time to retrieve all the data for these artists. After scraping for approximately 15 hours the script is still running, but the JSON file doesn't change anymore. I interrupted the script and checked whether I could access my data with TinyDB. Unfortunately I get the following error. Does anybody know why this is happening?
Error:
('cannot fetch url', 'http://api.songkick.com/api/3.0/artists/8689004/gigography.json?apikey=###########&min_date=2015-04-25&max_date=2017-03-01')
8961344
Traceback (most recent call last):
File "C:\Users\rmlj\Dropbox\Data\concerts.py", line 42, in <module>
load_events()
File "C:\Users\rmlj\Dropbox\Data\concerts.py", line 27, in load_events
print(artist)
File "C:\Python27\lib\idlelib\PyShell.py", line 1356, in write
return self.shell.write(s, self.tags)
KeyboardInterrupt
>>> mydat = db.all()
Traceback (most recent call last):
File "<pyshell#0>", line 1, in <module>
mydat = db.all()
File "C:\Python27\lib\site-packages\tinydb\database.py", line 304, in all
return list(itervalues(self._read()))
File "C:\Python27\lib\site-packages\tinydb\database.py", line 277, in _read
return self._storage.read()
File "C:\Python27\lib\site-packages\tinydb\database.py", line 31, in read
raw_data = (self._storage.read() or {})[self._table_name]
File "C:\Python27\lib\site-packages\tinydb\storages.py", line 105, in read
return json.load(self._handle)
File "C:\Python27\lib\json\__init__.py", line 287, in load
return loads(fp.read(),
MemoryError
Below you can find my script:
import urllib2
import requests
import json
import csv
import codecs
from tinydb import TinyDB, Query

db = TinyDB('events.json')

def load_events():
    MIN_DATE = "2015-04-25"
    MAX_DATE = "2017-03-01"
    API_KEY = "###############"
    with open('artistid.txt', 'r') as f:
        for a in f:
            artist = a.strip()
            print(artist)
            url_base = 'http://api.songkick.com/api/3.0/artists/{}/gigography.json?apikey={}&min_date={}&max_date={}'
            url = url_base.format(artist, API_KEY, MIN_DATE, MAX_DATE)
            # url = u'http://api.songkick.com/api/3.0/search/artists.json?query='+artist+'&apikey=WBmvXDarTCEfqq7h'
            try:
                r = requests.get(url)
                resp = r.json()
                if(resp['resultsPage']['totalEntries']):
                    results = resp['resultsPage']['results']['event']
                    for x in results:
                        print(x)
                        db.insert(x)
            except:
                print('cannot fetch url', url)

load_events()
db.close()
print("End of script")
MemoryError is a built-in Python exception (https://docs.python.org/3.6/library/exceptions.html#MemoryError), so it looks like the process is out of memory and this isn't really related to Songkick.
This question probably has the information you need to debug this: How to debug a MemoryError in Python? Tools for tracking memory use?
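If it is the single huge events.json that json.load cannot fit into memory, one workaround (plain JSON-lines with the stdlib, not a TinyDB feature) is to append one event per line while scraping and read the file back lazily:

import json

def append_event(event, path='events.jsonl'):
    # stands in for db.insert(event): one JSON object per line, append-only
    with open(path, 'a') as out:
        out.write(json.dumps(event) + '\n')

def iter_events(path='events.jsonl'):
    # stands in for db.all(): yields one event at a time, never the whole file
    with open(path) as f:
        for line in f:
            yield json.loads(line)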

Access neo4j via python-joern

When using Joern, I accessed the Neo4j database via python-joern with the following code.
from joern.all import JoernSteps
j = JoernSteps()
j.setGraphDbURL('http://localhost:7474/db/data/')
j.connectToDatabase()
res = j.runGremlinQuery('getFunctionsByName("main")')
for r in res: print r
I get an error like this:
Traceback (most recent call last):
File "test.py", line 11, in <module>
res = j.runGremlinQuery('getFunctionsByName("main")')
File "/home/binbin/Downloads/python-joern-0.3.1/joern/all.py", line 44, in runGremlinQuery
return self.gremlin.execute(finalQuery)
File "/usr/local/lib/python2.7/dist-packages/py2neo-2.0-py2.7-linux-x86_64.egg/py2neo/ext/gremlin/__init__.py", line 36, in execute
response = self.resources["execute_script"].post({"script": script})
File "/usr/local/lib/python2.7/dist-packages/py2neo-2.0-py2.7-linux-x86_64.egg/py2neo/core.py", line 288, in post
raise_from(self.error_class(message, **content), error)
File "/usr/local/lib/python2.7/dist-packages/py2neo-2.0-py2.7-linux-x86_64.egg/py2neo/util.py", line 215, in raise_from
raise exception
py2neo.error.NoClassDefFoundError: javax/transaction/SystemException
How can I fix it?
I searched a lot for an answer to my question. Finally I found the solution here: https://github.com/fabsx00/python-joern/issues/14. Anyone who has the same problem can look there.

KeyError when assigning ''praw.Reddit'' to variable

I could successfully connect to Reddit's servers with OAuth2 some time ago, but when running my script just now, I get a KeyError followed by a NoSectionError. The code is below, followed by the exceptions (the code has been reduced to its essentials).
import praw
# Configuration
APP_UA = 'useragent'
...
...
...
r = praw.Reddit(APP_UA)
Error message:
Traceback (most recent call last):
File "D:\Directory\Python\lib\configparser.py", line 843, in items
d.update(self._sections[section])
KeyError: 'useragent'
A NoSectionError occurred when handling the above exception.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\Directory\Python\Projects\myprj for Reddit, globaloffensive\oddshotcrawler.py", line 19, in <module>
r = praw.Reddit(APP_UA)
File "D:\Directory\Python\lib\site-packages\praw\reddit.py", line 84, in __init__
**config_settings)
File "D:\Directory\Python\lib\site-packages\praw\config.py", line 47, in __init__
raw = dict(Config.CONFIG.items(site_name), **settings)
File "D:\Directory\Python\lib\configparser.py", line 846, in items
raise NoSectionError(section)
configparser.NoSectionError: No section: 'useragent'
[Finished in 0.2s]
Try giving it a user_agent kwarg. Passing APP_UA positionally makes PRAW treat it as a site_name to look up in praw.ini, which is why the traceback ends with No section: 'useragent'.
r = praw.Reddit(user_agent=APP_UA)
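For an OAuth2 script app, the usual keyword form spells out all three required settings. A minimal sketch (client_id and client_secret are placeholders for your own app credentials, not values from the question):

import praw

reddit = praw.Reddit(
    client_id='my_client_id',          # placeholder: from your app at reddit.com/prefs/apps
    client_secret='my_client_secret',  # placeholder
    user_agent='useragent',
)
print(reddit.read_only)  # True until login credentials are also supplied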
