I have installed a module called Google Scraper (https://github.com/NikolaiT/GoogleScraper) and I copied this code from his profile:
import sys
from GoogleScraper import scrape_with_config, GoogleSearchError
from GoogleScraper.database import ScraperSearch, SERP, Link
def basic_usage():
    # See in the config.cfg file for possible values
    config = {
        'SCRAPING': {
            'use_own_ip': 'True',
            'keyword': 'Let\'s go bubbles!',
            'search_engines': 'yandex',
            'num_pages_for_keyword': 1
        },
        'SELENIUM': {
            'sel_browser': 'chrome',
        },
        'GLOBAL': {
            'do_caching': 'False'
        }
    }
    try:
        sqlalchemy_session = scrape_with_config(config)
    except GoogleSearchError as e:
        print(e)
    # let's inspect what we got
    for search in sqlalchemy_session.query(ScraperSearch).all():
        for serp in search.serps:
            print(serp)
            for link in serp.links:
                print(link)
A list of links should be printed, but nothing happens. I guess the SQLAlchemy part is not recognized... I have not used it before. I installed a module called SQLAlchemy, but do I need to set up anything else? What could cause this code not to work?
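One thing worth checking in the snippet (an observation about the control flow, not a confirmed diagnosis): if scrape_with_config raises GoogleSearchError, the exception is printed but sqlalchemy_session is never assigned, so the inspection loop either raises NameError or, if basic_usage() is never called at all, simply never runs. A minimal sketch of the unbound-name pitfall:

```python
def risky():
    try:
        session = 1 / 0  # stand-in for a call that raises (e.g. scrape_with_config)
    except ZeroDivisionError as e:
        print(e)  # the error is printed, but execution continues...
    # ...and `session` was never bound, so any later use would raise NameError
    return "session" in locals()

print(risky())  # prints "division by zero", then False
```

Checking whether the exception branch fired (or whether the function is actually invoked anywhere) would narrow down why nothing is printed.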
I've been following a tutorial to learn Python and smart contracts (I'm totally new to coding), and while following every step to the letter, VS Code keeps returning the following message: INFO: Could not find files for the given pattern(s).
Although it still performs whatever action I ask of it:
from solcx import compile_standard, install_solc
import json
from web3 import Web3
import os
from dotenv import load_dotenv
load_dotenv()
install_solc("0.6.0")
with open("./SimpleStorage.sol", "r") as file:
    simple_storage_file = file.read()

compiled_sol = compile_standard(
    {
        "language": "Solidity",
        "sources": {"SimpleStorage.sol": {"content": simple_storage_file}},
        "settings": {
            "outputSelection": {
                "*": {"*": ["abi", "metadata", "evm.bytecode", "evm.sourceMap"]}
            }
        },
    },
    solc_version="0.6.0",
)

with open("compiled_code.json", "w") as file:
    json.dump(compiled_sol, file)

# get bytecode
bytecode = compiled_sol["contracts"]["SimpleStorage.sol"]["SimpleStorage"]["evm"][
    "bytecode"
]["object"]
# get ABI
abi = compiled_sol["contracts"]["SimpleStorage.sol"]["SimpleStorage"]["abi"]
w3 = Web3(Web3.HTTPProvider("HTTP://127.0.0.1:7545"))
chain_id = 1337
my_address = "0x237d38135A752544a4980438c3dd9dFDe409Fb49"
private_key = os.getenv("PRIVATE_KEY")
# create the contract in python
SimpleStorage = w3.eth.contract(abi=abi, bytecode=bytecode)
# get the latest transaction
nonce = w3.eth.getTransactionCount(my_address)
# 1. build a transation
# 2. Sign a transation
# 3. Send a transation
transaction = SimpleStorage.constructor().buildTransaction(
    {"chainId": chain_id, "from": my_address, "nonce": nonce}
)
signed_txn = w3.eth.account.sign_transaction(
    transaction, private_key=private_key
)
private_key = os.getenv("PRIVATE_KEY")
# Send the signed transaction
print("Deploying contract...")
tx_hash = w3.eth.send_raw_transaction(signed_txn.rawTransaction)
tx_receipt = w3.eth.wait_for_transaction_receipt(tx_hash)
print("Deployed!")
# working with the contract
# contract address
# Contract ABI
simple_storage = w3.eth.contract(address=tx_receipt.contractAddress, abi=abi)
# Call > simulate making the call and getting the return value, doesn't make a change on the blockchain
# Transact > actually makes a state change
# Initial value of favorite number
print(simple_storage.functions.retrieve().call())
print("Updating contract...")
store_transaction = simple_storage.functions.store(15).buildTransaction(
    {"chainId": chain_id, "from": my_address, "nonce": nonce + 1}
)
signed_store_txn = w3.eth.account.sign_transaction(
    store_transaction, private_key=private_key
)
send_store_tx = w3.eth.send_raw_transaction(signed_store_txn.rawTransaction)
tx_receipt = w3.eth.wait_for_transaction_receipt(send_store_tx)
print("Updated!")
print(simple_storage.functions.retrieve().call())
And the result in the terminal is :
PS C:\Users\chret\Documents\demo\web3_py_simple_storage> python deploy.py
INFO: Could not find files for the given pattern(s).
Deploying contract...
Deployed!
0
Updating contract...
Updated!
15
So I'm fairly confused: should I just ignore the warning "Could not find files for the given pattern(s)"? Or is there anything I can do to fix it, and will it create issues as I keep coding in these files? I've tried relocating the folders and adding the path to the PATH environment variable, but the message keeps showing up.
It's been doing this from the beginning, and it never appears in the video I'm following (freeCodeCamp's 16-hour blockchain tutorial on YouTube).
Thank you!
You're importing solcx. During the import it runs solcx/install.py, and near the end of that file it has this code:
try:
    # try to set the result of `which`/`where` as the default
    _default_solc_binary = _get_which_solc()
except Exception:
    # if not available, use the most recent solcx installed version
    if get_installed_solc_versions():
        set_solc_version(get_installed_solc_versions()[0], silent=True)
The _get_which_solc() function is defined earlier in the file and, on Windows, runs this line:
response = subprocess.check_output(["where.exe", "solc"], encoding="utf8").strip()
which errors when no solc binary is on the PATH and sends the message you are worried about to the console:
INFO: Could not find files for the given pattern(s).
This error is expected and handled in the except Exception: clause (see above).
So there's nothing to worry about; you can ignore the warning :)
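The lookup-with-fallback pattern solcx uses can be sketched portably. This is an illustration, not solcx's exact code: shutil.which stands in for the where.exe/which subprocess call, and find_default_solc is a hypothetical name.

```python
import shutil

def find_default_solc():
    """Mimic solcx's startup behaviour: prefer a solc binary on the
    system PATH; if none is found (the case that triggers the INFO
    message on Windows), fall back to None so the caller can use a
    solcx-installed version instead."""
    try:
        path = shutil.which("solc")
        if path is None:
            raise FileNotFoundError("solc not found on PATH")
        return path
    except FileNotFoundError:
        return None  # caller then falls back to get_installed_solc_versions()
```

The point is that the failed lookup is part of the normal startup path, which is why the INFO line is harmless.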
I too had this problem.
I wasn't able to fix it in my Windows environment. But the GitHub repo links to a tutorial for setting up the Brownie stack in an Ubuntu environment on Windows, and this has been working flawlessly for me. It's also easy to set up.
https://medium.com/@cromewar/how-to-setup-windows-10-11-for-smart-contract-development-and-brownie-e7d8d13555b3
It's not mentioned in the article, but currently (26/11/2021) you will want to install Node v16 and ganache v7.0.0-alpha.2 instead, due to compatibility issues.
Refer to this link for NVM & Node versions:
https://learn.microsoft.com/en-us/windows/dev-environment/javascript/nodejs-on-wsl
Suppose we have a smart contract main.sol, and it contains two (or one) contracts in a file, like this:
pragma solidity ^0.8.0;
import "./SafeERC20.sol";
contract mainContract {
    ... (any code can be here)
}

contract childContract {
    ... (other code here)
}
and our Python file, a.py:
import json
import os
import web3.eth
from web3 import Web3, HTTPProvider
from solcx import install_solc, set_solc_version, compile_standard
from dotenv import load_dotenv

# install the solidity version here
install_solc('v0.8.0')
set_solc_version('v0.8.0')
file_path = "."
name = "main.sol"
input = {
    'language': 'Solidity',
    'sources': {
        name: {'urls': [file_path + "/" + name]}
    },
    'settings': {
        'outputSelection': {
            '*': {
                '*': ["abi", "metadata", "evm.bytecode", "evm.bytecode.sourceMap"],
            },
            'def': {name: ["abi", "evm.bytecode.opcodes"]},
        }
    }
}
output = compile_standard(input, allow_paths=file_path)
contracts = output["contracts"]
with open('compiled_code.json', "w") as file:
    json.dump(output, file)
bytecode = contracts["main.sol"]["mainContract"]["evm"]["bytecode"]["object"]
abi = contracts["main.sol"]["mainContract"]["abi"]
# Deploy on local ganache:
# w3 = Web3(Web3.HTTPProvider("HTTP://127.0.0.1:7545"))
# chainId = 1337
# myAddress = "0x6235207DE426B0E3739529F1c53c14aaA271D..."
# privateKey = "0xdbe7f5a9c95ea2df023ad9......."

# Deploy on rinkeby via infura:
w3 = Web3(Web3.HTTPProvider("https://rinkeby.infura.io/v3/......"))
chainId = 4
myAddress = "0xBa842323C4747609CeCEd164d61896d2Cf4..."
privateKey ="0x99de2de028a52668d3e94a00d47c4500db0afed3fe8e40..."
SCOnline = w3.eth.contract(abi=abi, bytecode=bytecode)
nonce = w3.eth.getTransactionCount(myAddress)
transaction = SCOnline.constructor().buildTransaction({
    "gasPrice": w3.eth.gas_price, "chainId": chainId, "from": myAddress, "nonce": nonce
})
signedTrx = w3.eth.account.sign_transaction(transaction, private_key=privateKey)
txHash = w3.eth.send_raw_transaction(signedTrx.rawTransaction)
txReceipt = w3.eth.wait_for_transaction_receipt(txHash)
I'm using Scrapy and Splash to scrape this website (for various reasons I'm using Splash and Scrapy even though I know I could scrape its API). My problem is that I only want my Lua script to return the job listings' URLs rather than the whole splash:html() page. I've been trying to do that, but I'm getting the error message below:
{
    "error": 400,
    "description": "Error happened while executing Lua script",
    "type": "ScriptError",
    "info": {
        "message": "Lua error: /app/splash/lua_modules/libs/treat.lua:45: cannot change a protected metatable",
        "type": "LUA_ERROR"
    }
}
The Lua script I've been using is shown below:
function main(splash, args)
    assert(splash:go(args.url))
    splash:wait(5.0)
    local treat = require('treat')
    listings = assert(splash:select_all("ul.job_listings > li > a"))
    return {
        listing_urls = treat.as_array(listings)
    }
end
function treat.as_array(tbl)
    -- the same function is available in
    -- Splash Python code as lua._mark_table_as_array
    if type(tbl) ~= 'table' or wraputils.is_wrapped(tbl) then
        error('as_array argument must be a table', 2)
    end
    setmetatable(tbl, {__metatable="array"})
    return tbl
end
treat.as_array tries to change the metatable of its argument.
The error is raised because the metatable of listings has its __metatable field set.
From https://www.lua.org/manual/5.3/manual.html#pdf-setmetatable
If the original metatable has a __metatable field, raises an error.
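One way to sidestep the protected metatable entirely (a sketch, assuming returning the full page is acceptable after all): return splash:html() from the Lua script and extract the anchors on the Python side. Using only the standard library for illustration:

```python
from html.parser import HTMLParser

class JobLinkExtractor(HTMLParser):
    """Collect href values from the <a> tags in the rendered page."""
    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.urls.append(value)

# stand-in for the HTML that splash:html() would return
html = ('<ul class="job_listings">'
        '<li><a href="/job/1">A</a></li>'
        '<li><a href="/job/2">B</a></li></ul>')
parser = JobLinkExtractor()
parser.feed(html)
print(parser.urls)  # ['/job/1', '/job/2']
```

In a real Scrapy spider, response.css("ul.job_listings > li > a::attr(href)").getall() does the same extraction in one line.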
Here's a simplified version of the JSON I am working with:
{
    "libraries": [
        {
            "library-1": {
                "file": {
                    "url": "foobar.com/.../library-1.bin"
                }
            }
        },
        {
            "library-2": {
                "application": {
                    "url": "barfoo.com/.../library-2.exe"
                }
            }
        }
    ]
}
Using json, I can json.loads() this file. I need to be able to find the 'url', download it, and save it to a local folder called libraries. In this case, I'd create two folders within libraries/: one called library-1, the other library-2. Within these folders would be whatever was downloaded from the url.
The issue, however, is being able to get to the url:
my_json = json.loads(...)  # get the json
for library in my_json['libraries']:
    file.download(library['file']['url'])  # doesn't access ['application']['url']
Since the JSON I am using has a variety of keys, sometimes 'file', other times 'dll', etc., I can't use one specific dictionary key. How can I use multiple keys? Is there a modular way to do this?
Edit: There are numerous keys; 'file', 'application' and 'dll' are only some examples.
You can just iterate through each level of the dictionary and download the files if you find a url.
urls = []
for library in my_json['libraries']:
    for lib_name, lib_data in library.items():
        for module_name, module_data in lib_data.items():
            url = module_data.get('url')
            if url is not None:
                # create local directory with lib_name
                # download files from url to local directory
                urls.append(url)

# urls = ['foobar.com/.../library-1.bin', 'barfoo.com/.../library-2.exe']
This should work:
for library in my_json['libraries']:
    for value in library.values():
        for inner in value.values():
            file.download(inner['url'])
I would suggest doing it like this:
for library in my_json['libraries']:
    library_data = library.popitem()[1].popitem()[1]
    file.download(library_data['url'])
Try this
for library in my_json['libraries']:
    if 'file' in library:
        file.download(library['file']['url'])
    elif 'dll' in library:
        file.download(library['dll']['url'])
It just checks whether your dict (created by parsing the JSON) has a key named 'file'. If so, it uses the 'url' of the dict corresponding to the 'file' key. If not, it tries the same with the 'dll' key.
Edit: If you don't know the key to access the dict containing the url, try this.
for library in my_json['libraries']:
    for key in library:
        if 'url' in library[key]:
            file.download(library[key]['url'])
This iterates over all the keys in your library; whichever key's value contains a 'url', it downloads using that.
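If the nesting depth itself may vary between entries, a recursive walk (a generic sketch, not tied to the answers above) collects every 'url' wherever it appears:

```python
def find_urls(obj):
    """Recursively collect the value of every 'url' key in nested dicts/lists."""
    urls = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key == "url":
                urls.append(value)
            else:
                urls.extend(find_urls(value))
    elif isinstance(obj, list):
        for item in obj:
            urls.extend(find_urls(item))
    return urls

# works no matter which key ('file', 'application', 'dll', ...) wraps the url
my_json = {"libraries": [
    {"library-1": {"file": {"url": "foobar.com/library-1.bin"}}},
    {"library-2": {"application": {"url": "barfoo.com/library-2.exe"}}},
]}
print(find_urls(my_json))  # ['foobar.com/library-1.bin', 'barfoo.com/library-2.exe']
```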
How do I add post-processing options equivalent to --embed-thumbnail and --add-metadata when using youtube-dl in a Python script?
I read the following documentation, but couldn't find the post-processing 'key value' options:
https://github.com/rg3/youtube-dl/blob/master/README.md#embedding-youtube-dl
The full list of options is documented in YoutubeDL.py. If you only want to replicate command-line options, you can also have a look in __init__.py.
To replicate --embed-thumbnail and --add-metadata, use the following:
from __future__ import unicode_literals
import youtube_dl
ydl_opts = {
    'writethumbnail': True,
    'postprocessors': [{
        'key': 'FFmpegMetadata'
    }, {
        'key': 'EmbedThumbnail',
        'already_have_thumbnail': True,  # overwrite any thumbnails already present
    }],
}
with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    ydl.download(['https://www.youtube.com/watch?v=BaW_jenozKc'])
I tried to upload a video file using Python. The problem is that the system cannot find the file even though I pass the path of the file. My code is like this:
import os
import requests

# step 1
host = 'https://blablabla.com'
test = {
    "upload_phase": "start",
    "file_size": 1063565
}
params = {
    "access_token": my_access_token,
    "fields": "video_id, start_offset, end_offset, upload_session_id",
}
vids = requests.post(host, params=params, data=test)
vids = vids.json()
try:
    video_id = vids["video_id"],
    start_offset = vids["start_offset"],
    end_offset = vids["end_offset"],
    upload_session_id = vids["upload_session_id"]
except:
    pass
print(vids)

###############################################################################
# step 2
###############################################################################
test = {
    "upload_phase": "transfer",
    "start_offset": start_offset,
    "upload_session_id": upload_session_id,
    "video_file_chunk": os.path.realpath('/home/def/Videos/test.mp4')
}
params = {
    "access_token": my_access_token,
    "fields": "start_offset, end_offset",
}
vids = requests.post(host, params=params, data=test)
vids = vids.json()
try:
    start_offset = vids["start_offset"],
    end_offset = vids["end_offset"]
except:
    pass
print(vids)
I tried many ways, like os.path.abspath, os.path, os.path.dirname, os.path.basename, os.path.isfile, os.path.isabs, os.path.isdir; it still doesn't work, even when I import os.path or os.
In your code you just send the path to your file as a string to the server, not the file itself. You should try something like:
# You should replace 'file_to_upload' with the field name the server actually expects.
# If you don't know what the server expects, check the browser's dev console
# while uploading a file manually.
my_file = {'file_to_upload': open(os.path.realpath('/home/def/Videos/test.mp4'), 'rb')}
vids = requests.post(host, params=params, files=my_file)
Also note that you might need to use requests.Session() to be able to handle cookies and the access token.