netCDF4 - Python error - python

Can anyone tell me what I did wrong? I am using Python installed through conda, and the files come from http://meop40.troja.mff.cuni.cz:11180/gw.projekt/data.stratopauza/netcdf.profily/
Why does it tell me that the file doesn't exist?
>>> import netCDF4
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> url = 'http://meop40.troja.mff.cuni.cz:11180/gw.projekt/data.stratopauza/netcdf.profily/atmPrf_C001.2010.227.00.03.G04_2013.3520_nc'
>>> nc = netCDF4.Dataset(url)
syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or SCAN_ERROR
context: <!DOCTYPE^ HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><html><head><title>404 Not Found</title></head><body><h1>Not Found</h1><p>The requested URL /gw.projekt/data.stratopauza/netcdf.profily/atmPrf_C001.2010.227.00.03.G04_2013.3520_nc.dds was not found on this server.</p><hr><address>Apache/2.4.12 (Ubuntu) Server at meop40.troja.mff.cuni.cz Port 11180</address></body></html>
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "netCDF4\_netCDF4.pyx", line 1811, in netCDF4._netCDF4.Dataset.__init__ (netCDF4\_netCDF4.c:12626)
IOError: NetCDF: file not found

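netCDF4.Dataset() can only access remote NetCDF files that are served by an OPeNDAP service, which can return metadata about the file. Your URL points to a plain file on an Apache web server, so the library's request for the OPeNDAP metadata (the .dds document named in the error) returns a 404 HTML page, and that failure is reported as "file not found". The error message is misleading: the file exists, but it is not served through OPeNDAP.
There is a brief tutorial that mentions this and gives basic information at: http://unidata.github.io/netcdf4-python/#section1
I downloaded the file and had no problem opening it. You should use the method in the answer to your previous question: https://stackoverflow.com/a/44622713/1211981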
Update:
Go to:
http://meop40.troja.mff.cuni.cz:11180/gw.projekt/data.stratopauza/netcdf.profily/
Click one or more of the links and save to a folder where you will run your script. Change your script or python commands to:
>>> url = 'atmPrf_C001.2010.227.00.03.G04_2013.3520_nc'
>>> nc = netCDF4.Dataset(url)
netCDF4.Dataset() accepts either an OPeNDAP URL or a local file name and works the same way with both. In this case it will recognize the local file as NetCDF.
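If you would rather script the download than click the links, the same idea can be sketched as below. This is only a minimal sketch under assumptions: the helper names local_name_from_url and download_and_open are made up for illustration, the server is assumed to allow plain HTTP downloads of the file, and netCDF4 is imported inside the function so the file-name helper can be used without it.

```python
import os
import urllib.request
from urllib.parse import urlsplit

URL = ('http://meop40.troja.mff.cuni.cz:11180/gw.projekt/'
       'data.stratopauza/netcdf.profily/'
       'atmPrf_C001.2010.227.00.03.G04_2013.3520_nc')


def local_name_from_url(url):
    """Return the last path component of the URL, used as the local file name."""
    return os.path.basename(urlsplit(url).path)


def download_and_open(url, dest_dir='.'):
    """Fetch the file over plain HTTP (no OPeNDAP), then open the local copy."""
    import netCDF4  # imported here so local_name_from_url works without it

    local_path = os.path.join(dest_dir, local_name_from_url(url))
    if not os.path.exists(local_path):
        # Hypothetical convenience: skip the download if the file is already there.
        urllib.request.urlretrieve(url, local_path)
    return netCDF4.Dataset(local_path)


# Example usage (needs network access and the netCDF4 package):
# nc = download_and_open(URL)
# print(nc.variables.keys())
# nc.close()
```

The point is that the HTTP download and the netCDF4.Dataset() call are two separate steps; Dataset() only ever sees a local path.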

Related

Flink Python Datastream API Kafka Consumer

I'm new to PyFlink. I'm trying to write a Python program that reads data from a Kafka topic and prints it to stdout. I followed the question "Flink Python Datastream API Kafka Producer Sink Serializaion", but I keep seeing a NoSuchMethodError due to a version mismatch. I have added the flink-sql-connector-kafka jar available at https://repo.maven.apache.org/maven2/org/apache/flink/flink-sql-connector-kafka_2.11/1.13.0/flink-sql-connector-kafka_2.11-1.13.0.jar. Can someone help me with a proper example of how to do this? Following is my code:
import json
import os
from pyflink.common import SimpleStringSchema
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.connectors import FlinkKafkaConsumer
from pyflink.common.typeinfo import Types


def my_map(obj):
    json_obj = json.loads(json.loads(obj))
    return json.dumps(json_obj["name"])


def kafkaread():
    env = StreamExecutionEnvironment.get_execution_environment()
    env.add_jars("file:///automation/flink/flink-sql-connector-kafka_2.11-1.10.1.jar")
    deserialization_schema = SimpleStringSchema()
    kafkaSource = FlinkKafkaConsumer(
        topics='test',
        deserialization_schema=deserialization_schema,
        properties={'bootstrap.servers': '10.234.175.22:9092', 'group.id': 'test'}
    )
    ds = env.add_source(kafkaSource).print()
    env.execute('kafkaread')


if __name__ == '__main__':
    kafkaread()
But Python doesn't recognise the jar file and throws the following error:
Traceback (most recent call last):
  File "flinkKafka.py", line 31, in <module>
    kafkaread()
  File "flinkKafka.py", line 20, in kafkaread
    kafkaSource = FlinkKafkaConsumer(
  File "/automation/flink/venv/lib/python3.8/site-packages/pyflink/datastream/connectors.py", line 186, in __init__
    j_flink_kafka_consumer = _get_kafka_consumer(topics, properties, deserialization_schema,
  File "/automation/flink/venv/lib/python3.8/site-packages/pyflink/datastream/connectors.py", line 336, in _get_kafka_consumer
    j_flink_kafka_consumer = j_consumer_clz(topics,
  File "/automation/flink/venv/lib/python3.8/site-packages/pyflink/util/exceptions.py", line 185, in wrapped_call
    raise TypeError(
TypeError: Could not found the Java class 'org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer'. The Java dependencies could be specified via command line argument '--jarfile' or the config option 'pipeline.jars'
What is the correct location to add the jar file?
I see that you downloaded flink-sql-connector-kafka_2.11-1.13.0.jar, but the code loads flink-sql-connector-kafka_2.11-1.10.1.jar.
Maybe you should check that.
You just need to check the path to the flink-sql-connector jar.
You should add the flink-sql-connector-kafka jar file that matches your PyFlink and Scala versions. If the versions are correct, check that the jar file actually exists at the path you pass to add_jars.
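Both checks suggested above (does the jar exist at that path, and does its version match) can be done before the path ever reaches add_jars. The following is a rough sketch, not PyFlink API: connector_version and checked_jar_uri are hypothetical helper names, and the regular expression assumes the jar follows the name_SCALA-FLINK.jar pattern seen in the question.

```python
import os
import re
from pathlib import Path


def connector_version(jar_name):
    """Extract (scala_version, flink_version) from a connector jar file name."""
    match = re.search(r'_(\d+\.\d+)-(\d+\.\d+\.\d+)\.jar$', jar_name)
    if match is None:
        raise ValueError('unrecognised jar name: %s' % jar_name)
    return match.group(1), match.group(2)


def checked_jar_uri(jar_path, expected_flink_version):
    """Return a file:// URI for the jar after checking it exists and matches."""
    if not os.path.isfile(jar_path):
        raise FileNotFoundError(jar_path)
    _, flink_version = connector_version(os.path.basename(jar_path))
    if flink_version != expected_flink_version:
        raise ValueError('jar is built for Flink %s, expected %s'
                         % (flink_version, expected_flink_version))
    return Path(jar_path).resolve().as_uri()


# Example usage inside kafkaread(), assuming PyFlink 1.13.0 is installed:
# env.add_jars(checked_jar_uri(
#     '/automation/flink/flink-sql-connector-kafka_2.11-1.13.0.jar', '1.13.0'))
```

With the paths from the question, this would raise FileNotFoundError for the 1.10.1 jar if only the 1.13.0 jar was downloaded, which is easier to read than the Java class lookup error.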

HDFS: Read data from HDFS to parse XML files in HDFS using Python3

I have about 1500 XML files in HDFS, each about 2-3 GB. I need to write a Python script that parses the XML files as part of a MapReduce job. However, I am having trouble accessing the files in HDFS from Python.
I tried the following script and received an error.
from snakebite.client import Client


def connection():
    hadoop_client = Client('HDFS_hostname', 'HDFS_port', use_trash=False)
    for x in hadoop_client.ls(['/']):
        print(x)
Following is the error:
Traceback (most recent call last):
  File "/home/ubuntu/PycharmProjects/textmining/read_data_from_HDFS.py", line 5, in <module>
    from snakebite.client import Client
  File "/usr/local/lib/python3.6/dist-packages/snakebite/client.py", line 1473
    baseTime = min(time * (1L << retries), cap);
                            ^
SyntaxError: invalid syntax
What is the best recommended way to access files from HDFS using python?
pip install snakebite-py3
This will solve the issue; I came across the same problem.
The original snakebite package is not compatible with Python 3.x (the 1L long-integer literal in the traceback is Python 2 syntax), so it can only be used with Python 2.
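To see why the import fails before any HDFS call is made, the offending literal can be compiled on its own. This is just an illustrative sketch; is_valid_expression is a made-up helper name and is not part of snakebite.

```python
def is_valid_expression(source):
    """Return True if `source` compiles as an expression on this interpreter."""
    try:
        compile(source, '<example>', 'eval')
    except SyntaxError:
        return False
    return True


# The Python 2 spelling from snakebite/client.py, reduced to the literal itself.
print(is_valid_expression('1L << 3'))  # False on Python 3
# The Python 3 spelling: ints are arbitrary precision, so no suffix is needed.
print(is_valid_expression('1 << 3'))   # True
```

Because the SyntaxError is raised while the module is being parsed, it appears at the import line, regardless of whether connection() is ever called.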

How to make screenshot from web page? [duplicate]

How can I take a screenshot of any URL (web page)?
I tried:
from .ghost import Ghost
ghost = Ghost(wait_timeout=4)
ghost.open('http://www.google.com')
ghost.capture_to('screen_shot.png')
Result:
No module named '__main__.ghost'; '__main__' is not a package
I also tried the approaches from these questions:
Python Webkit making web-site screenshots using virtual framebuffer
Take screenshot of multiple URLs using selenium (python)
Fastest way to take a screenshot with python on windows
Take a screenshot of open website in python script
I've also tried other methods that are not listed here.
Nothing succeeded: every attempt ended in an error or a missing module.
I'm tired. Is there an easy way to take a screenshot of a web page using Python 3.x?
upd1:
C:\prg\PY\PUMA\tests>py save-web-html.py
Traceback (most recent call last):
  File "save-web-html.py", line 2, in <module>
    from .ghost import Ghost
ModuleNotFoundError: No module named '__main__.ghost'; '__main__' is not a package
upd2:
C:\prg\PY\PUMA\tests>py save-web-html.py
Exception ignored in: <bound method Ghost.__del__ of <ghost.ghost.Ghost object at 0x0000020A169CF860>>
Traceback (most recent call last):
  File "C:\Users\Coar\AppData\Local\Programs\Python\Python36\lib\site-packages\ghost\ghost.py", line 325, in __del__
    self.exit()
  File "C:\Users\Coar\AppData\Local\Programs\Python\Python36\lib\site-packages\ghost\ghost.py", line 315, in exit
    self._app.quit()
AttributeError: 'NoneType' object has no attribute 'quit'
Traceback (most recent call last):
  File "save-web-html.py", line 4, in <module>
    ghost = Ghost(wait_timeout=4)
TypeError: __init__() got an unexpected keyword argument 'wait_timeout'
In the late 90's this may have been a simple task: just render some HTML to an image instead of the screen.
But these days web pages require client-side execution to build parts of their DOM and re-render based on client-side initiated AJAX (or equivalent) requests; it's the whole "web 2.0" thing.
Rendering a site such as http://google.com from a simple HTML response should be easy, but rendering something like https://www.facebook.com/ or https://www.kogan.com/ takes many back-and-forth requests before the page displays what you expect to see.
So restricting this to a pure Python solution may not be feasible; I'm not aware of a Python-based browser.
Consider running a separate service to take the screenshots, and use your core application (in Python) to fetch the requested screenshots.
I just tried a few with Docker; many of them struggle with HTTPS and the aforementioned AJAX behaviour.
earlyclaim/docker-manet appears to work (demo page).
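From the Python side, fetching a screenshot from such a service could look roughly like the sketch below. Treat the details as assumptions: the service is assumed to listen on http://localhost:8891/ and to accept the target page in a url query parameter, and capture_url and save_screenshot are hypothetical helper names; check the documentation of whichever service you run.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def capture_url(service_base, page_url):
    """Build the request URL for the screenshot service (assumed ?url= API)."""
    return service_base.rstrip('/') + '/?' + urlencode({'url': page_url})


def save_screenshot(service_base, page_url, out_path):
    """Ask the service to render page_url and write the returned image bytes."""
    with urlopen(capture_url(service_base, page_url)) as response:
        image_bytes = response.read()
    with open(out_path, 'wb') as handle:
        handle.write(image_bytes)
    return out_path


# Example usage (needs the screenshot service to be running locally):
# save_screenshot('http://localhost:8891/', 'http://www.google.com',
#                 'screen_shot.png')
```

This keeps the browser rendering out of the Python process entirely; the script only makes one HTTP request and saves the response.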
Edit: from your comments, you need the data from a graph that is rendered using a second request.
You just need the JSON returned by https://www.minnowbooster.net/limit/chart
try:
    from urllib.request import urlopen  # py3
except ImportError:
    from urllib2 import urlopen  # py2
import json

url = 'https://www.minnowbooster.net/limit/chart'
response = urlopen(url)
data_str = response.read().decode()
data = json.loads(data_str)
print(data)

Adding new portgroups to vmware virtual switches using pysphere

I'm trying to automate the addition of new portgroups to ESXi hosts using pysphere. I'm using the following code fragment:
from pysphere import MORTypes
from pysphere import VIServer, VIProperty, VIMor  # VIMor is used below
from pysphere.resources import VimService_services as VI

s = VIServer()
s.connect(vcenter, user, password)
host_system = s.get_hosts().keys()[17]
prop = VIProperty(s, host_system)
propname = prop.configManager._obj.get_element_networkSystem()
vswitch = prop.configManager.networkSystem.networkInfo.vswitch[0]
network_system = VIMor(propname, MORTypes.HostServiceSystem)


def add_port_group(name, vlan_id, vswitch, network_system):
    request = VI.AddPortGroupRequestMsg()
    _this = request.new__this(network_system)
    _this.set_attribute_type(network_system.get_attribute_type())
    request.set_element__this(_this)
    portgrp = request.new_portgrp()
    portgrp.set_element_name(name)
    portgrp.set_element_vlanId(vlan_id)
    portgrp.set_element_vswitchName(vswitch)
    portgrp.set_element_policy(portgrp.new_policy())
    request.set_element_portgrp(portgrp)
    s._proxy.AddPortGroup(request)
However, when I attempt to run it, I get the following error:
>>> add_port_group(name, vlan_id, vswitch, network_system)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 12, in add_port_group
  File "/usr/lib/python2.6/site-packages/pysphere-0.1.8-py2.6.egg/pysphere/resources/VimService_services.py", line 4344, in AddPortGroup
    response = self.binding.Receive(AddPortGroupResponseMsg.typecode)
  File "/usr/lib/python2.6/site-packages/pysphere-0.1.8-py2.6.egg/pysphere/ZSI/client.py", line 545, in Receive
    return _Binding.Receive(self, replytype, **kw)
  File "/usr/lib/python2.6/site-packages/pysphere-0.1.8-py2.6.egg/pysphere/ZSI/client.py", line 464, in Receive
    raise FaultException(msg)
pysphere.ZSI.FaultException: The object has already been deleted or has not been completely created
I've attempted to swap in different values for "vswitch" and "network_system", but I haven't had any success. Has anyone attempted to do something similar with pysphere successfully?
I can accomplish what I need through Powershell, which demonstrates that it isn't a vmware issue, but I don't want to use Powershell in this particular case.
I tried your code on one of our vSpheres.
It seems you are passing the object to set_element_vswitchName rather than the name. Maybe this will help:
vswitch = prop.configManager.networkSystem.networkInfo.vswitch[0].name

How to interact with pynessus

I am using http://code.google.com/p/pynessus/ so that I can interact with Nessus from Python, but I run into problems when trying to connect to the server. I am not sure what I need to set pynessus to.
I try connecting to the server using the following syntax, as directed by the documentation on the site, but I receive the following error:
n = pynessus.NessusServer(localhost, 8834, root, password123)
Error:
root#bt:~/Desktop# ./nessus.py
Traceback (most recent call last):
  File "./nessus.py", line 634, in <module>
    n = pynessus.NessusServer(localhost, 8834, root, password123)
NameError: name 'pynessus' is not defined
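The problem is that you didn't import the pynessus module. To solve this, place the downloaded pynessus.py in the same folder as your Python script and add the line
import pynessus
at the top of that script. You can reference the pynessus library in your script only after that line. Note also that localhost, root and password123 are written as bare names in your call; unless they are variables you defined earlier, they need to be quoted strings, otherwise Python will raise a NameError for them next.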
