I'm trying to launch an AWS EMR cluster using the boto library, and that part works well.
However, I need to install some required Python libraries on the cluster, so I tried to add a bootstrap action step using boto.emr.bootstrap_action.
But it gives the error below:
Traceback (most recent call last):
File "run_on_emr_cluster.py", line 46, in <module>
steps=[step])
File "/usr/local/lib/python2.7/dist-packages/boto/emr/connection.py", line 552, in run_jobflow
bootstrap_action_args = [self._build_bootstrap_action_args(bootstrap_action) for bootstrap_action in bootstrap_actions]
File "/usr/local/lib/python2.7/dist-packages/boto/emr/connection.py", line 623, in _build_bootstrap_action_args
bootstrap_action_params['ScriptBootstrapAction.Path'] = bootstrap_action.path
AttributeError: 'str' object has no attribute 'path'
Code below:
from boto.emr.connection import EmrConnection
conn = EmrConnection('...', '...')

from boto.emr.step import StreamingStep
step = StreamingStep(name='mapper1',
                     mapper='s3://xxx/mapper1.py',
                     reducer='s3://xxx/reducer1.py',
                     input='s3://xxx/input/',
                     output='s3://xxx/output/')

from boto.emr.bootstrap_action import BootstrapAction
bootstrap_action = BootstrapAction(name='install related packages',
                                   path='s3://xxx/bootstrap.sh',
                                   bootstrap_action_args=None)

job = conn.run_jobflow(name='emr_test',
                       log_uri='s3://xxx/logs',
                       master_instance_type='m1.small',
                       slave_instance_type='m1.small',
                       num_instances=1,
                       action_on_failure='TERMINATE_JOB_FLOW',
                       keep_alive=False,
                       bootstrap_actions='[bootstrap_action]',
                       steps=[step])
What's the proper way of passing bootstrap arguments?
You are passing the bootstrap_actions argument as a literal string rather than as a list containing the BootstrapAction object you just created. Try this:
job = conn.run_jobflow(name='emr_test',
                       log_uri='s3://xxx/logs',
                       master_instance_type='m1.small',
                       slave_instance_type='m1.small',
                       num_instances=1,
                       action_on_failure='TERMINATE_JOB_FLOW',
                       keep_alive=False,
                       bootstrap_actions=[bootstrap_action],
                       steps=[step])
Notice that the bootstrap_actions argument is different here: it is now an actual list containing the BootstrapAction object, not the string '[bootstrap_action]'.
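If your bootstrap script itself takes arguments, you can pass them through bootstrap_action_args, the same parameter your code already sets to None. A minimal sketch, assuming a hypothetical bootstrap.sh that accepts the names of packages to install:

bootstrap_action = BootstrapAction(name='install related packages',
                                   path='s3://xxx/bootstrap.sh',
                                   # Hypothetical arguments; they are forwarded to
                                   # bootstrap.sh on each node as "$1", "$2", ...
                                   bootstrap_action_args=['numpy', 'simplejson'])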
I'm new to PyFlink. I'm trying to write a Python program that reads data from a Kafka topic and prints it to stdout. I followed the link Flink Python Datastream API Kafka Producer Sink Serialization, but I keep seeing NoSuchMethodError due to a version mismatch. I have added the flink-sql-connector-kafka jar available at https://repo.maven.apache.org/maven2/org/apache/flink/flink-sql-connector-kafka_2.11/1.13.0/flink-sql-connector-kafka_2.11-1.13.0.jar. Can someone help me with a proper example to do this? Following is my code:
import json
import os

from pyflink.common import SimpleStringSchema
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.connectors import FlinkKafkaConsumer
from pyflink.common.typeinfo import Types

def my_map(obj):
    json_obj = json.loads(json.loads(obj))
    return json.dumps(json_obj["name"])

def kafkaread():
    env = StreamExecutionEnvironment.get_execution_environment()
    env.add_jars("file:///automation/flink/flink-sql-connector-kafka_2.11-1.10.1.jar")
    deserialization_schema = SimpleStringSchema()
    kafkaSource = FlinkKafkaConsumer(
        topics='test',
        deserialization_schema=deserialization_schema,
        properties={'bootstrap.servers': '10.234.175.22:9092', 'group.id': 'test'}
    )
    ds = env.add_source(kafkaSource).print()
    env.execute('kafkaread')

if __name__ == '__main__':
    kafkaread()
But Python doesn't recognise the jar file and throws the following error:
Traceback (most recent call last):
File "flinkKafka.py", line 31, in <module>
kafkaread()
File "flinkKafka.py", line 20, in kafkaread
kafkaSource = FlinkKafkaConsumer(
File "/automation/flink/venv/lib/python3.8/site-packages/pyflink/datastream/connectors.py", line 186, in __init__
j_flink_kafka_consumer = _get_kafka_consumer(topics, properties, deserialization_schema,
File "/automation/flink/venv/lib/python3.8/site-packages/pyflink/datastream/connectors.py", line 336, in _get_kafka_consumer
j_flink_kafka_consumer = j_consumer_clz(topics,
File "/automation/flink/venv/lib/python3.8/site-packages/pyflink/util/exceptions.py", line 185, in wrapped_call
raise TypeError(
TypeError: Could not found the Java class 'org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer'. The Java dependencies could be specified via command line argument '--jarfile' or the config option 'pipeline.jars'
What is the correct location to add the jar file?
I see that you downloaded flink-sql-connector-kafka_2.11-1.13.0.jar, but the code loads flink-sql-connector-kafka_2.11-1.10.1.jar.
Maybe you should check that.
You just need to check the path to the flink-sql-connector jar.
You should add the jar file of flink-sql-connector-kafka that matches your PyFlink and Scala versions. If the versions are correct, check that the path passed to the add_jars function actually points to where the jar package is.
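A minimal sketch of the fix, assuming the 1.13.0 jar from the question's download link was saved under /automation/flink/: the add_jars call should reference that exact file name:

# The file name must match the jar that was actually downloaded (1.13.0, not 1.10.1).
env.add_jars("file:///automation/flink/flink-sql-connector-kafka_2.11-1.13.0.jar")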
I'm running some code and it gives me an error I can't solve!
How can I add the missing attribute?
The relevant part of the code:
ALL_FILES = provider.getDataFiles('indoor3d_sem_seg_hdf5_data/all_files.txt') #line 63
room_filelist = [line.rstrip() for line in open('indoor3d_sem_seg_hdf5_data/room_filelist.txt')]
The error:
Traceback (most recent call last):
File "E:\Research\Codes\pointnet\pointnet-master\sem_seg\train.py", line 63, in <module>
ALL_FILES = provider.getDataFiles('indoor3d_sem_seg_hdf5_data/all_files.txt')
AttributeError: module 'provider' has no attribute 'getDataFiles'
First, check if you have import provider in your code; you can also do from model import *.
I found out that you are using pointnet, so I searched the source code and found that this method is:
def getDataFiles(list_filename):
    return [line.rstrip() for line in open(list_filename)]
You can search your library for this method; it might not be in provider.py.
You could just add this method to your code, but the better idea is to search for it.
In your case, provider.py should be at \pointnet\pointnet-master\, and there is also a train.py at that location.
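If you would rather not copy files around, a minimal sketch of an alternative, assuming provider.py sits one directory above the script you are running (as the traceback's layout suggests):

import os
import sys

# Hypothetical layout: provider.py lives in the parent directory of this script.
BASE_DIR = os.path.dirname(os.path.abspath(__file__))
sys.path.append(os.path.dirname(BASE_DIR))
import provider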
Problem solved! All I had to do was copy the provider.py file next to the sem_seg script I was running. It appears it couldn't be found from the previous location.
I am trying to prepare a VM Resizing Recommendations Report using a Python 3.7 script. My code is as follows:
import datetime
import logging

from google.cloud import bigquery
from google.cloud import recommender
from google.cloud.exceptions import NotFound
from googleapiclient import discovery

def main(event, context):
    client = recommender.RecommenderClient()
    recommender_type = "google.compute.instance.MachineTypeRecommender"
    projects = list_projects()  # This gives the list of projects
    # I hard-code the zones below:
    zones = ["us-east1-b", "us-east1-c", "us-east1-d", "us-east4-c", "us-east4-b", "us-east4-a",
             "us-central1-c", "us-central1-a", "us-central1-f", "us-central1-b",
             "us-west1-b", "us-west1-c", "us-west1-a"]
    for zone in zones:
        parent = client.recommender_path("my-project", zone, recommender_type)
        for element in client.list_recommendations(parent):  # On this line I get the error; logs below
****Parent**** projects/my-project/locations/us-east1-b/recommenders/google.compute.instance.MachineTypeRecommender
Traceback (most recent call last):
File "main.py", line 187, in main
x=list(client.list_recommendations(parent))
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/google/cloud/recommender_v1/services/recommender/client.py", line 734, in list_recommendations
request = recommender_service.ListRecommendationsRequest(request)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/proto/message.py", line 441, in __init__
raise TypeError(
TypeError: Invalid constructor input for ListRecommendationsRequest: 'projects/my-project/locations/us-east1-b/recommenders/google.compute.instance.MachineTypeRecommender'
During handling of the above exception, another exception occurred:
"projects/my-project/locations/us-east1-b/recommenders/google.compute.instance.MachineTypeRecommender" is the parameter being passed to the list_recommendations function. I am not sure what is wrong in the constructor as I am getting this: Invalid constructor input for ListRecommendationsRequest
I am new to this Google api. What can I try next?
I have checked the code on my end, and it seems to be a syntax error when calling the method. As per the library documentation, the first positional parameter should be a request object or None, so the parent parameter has to be passed by name. Changing the line as follows, I didn't get the error:
client.list_recommendations(parent=parent)
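In the context of the question's loop, a sketch of the corrected call (names are taken from the question; the print is just a placeholder):

for zone in zones:
    parent = client.recommender_path("my-project", zone, recommender_type)
    # parent must be a keyword argument; passed positionally it is treated
    # as the request object, which triggers the TypeError above.
    for element in client.list_recommendations(parent=parent):
        print(element)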
I want to collect all inventory host and group variables in a hierarchical data structure and send them to Consul, to make them available at runtime.
Calling this method (https://github.com/ansible/ansible/blob/devel/lib/ansible/inventory/manager.py#L160), I got the error:
inventory.get_vars()
Traceback (most recent call last):
File "<input>", line 1, in <module>
inventory.get_vars()
File "<>/.virtualenvs/ansible27/lib/python2.7/site-packages/ansible/inventory/manager.py", line 160, in get_vars
return self._inventory.get_vars(args, kwargs)
AttributeError: 'InventoryData' object has no attribute 'get_vars'
My script:
import pprint
pp = pprint.PrettyPrinter(indent=4).pprint
from ansible.parsing.dataloader import DataLoader
from ansible.vars.manager import VariableManager
from ansible.inventory.manager import InventoryManager
loader = DataLoader()
inventory = InventoryManager(loader=loader, sources='inventories/itops-vms.yml')
variable_manager = VariableManager(loader=loader, inventory=inventory)
# shows groups as well
pp(inventory.groups)
# shows dict as well with content
pp(variable_manager.get_vars())
# creates an unhandled exception
inventory.get_vars()
How do I do that the right way?
Python 2.7.15
ansible==2.6.2
OS: macOS High Sierra
The error itself seems to be caused by a bug: the get_vars method of the inventory object calls the get_vars method of the InventoryData object, which is not implemented.
You need to specify the group, for example:
>>> inventory.groups['all'].get_vars()
{u'my_var': u'value'}
You can create a dictionary with that data:
{g: inventory.groups[g].get_vars() for g in inventory.groups}
The above gets only the variables defined in the inventory itself (which is what the question asks about). If you wanted to get a structure with variables from group_vars, host_vars, etc. (as you indicated in your comment: "I want to get something similar to $ ansible-inventory -i inventories/itops-vms.yml --graph --vars"), you'd need to collect the data from different sources, just like Ansible does.
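A minimal sketch of that heavier approach, reusing the loader, inventory, and variable_manager objects from the question's script; get_vars(host=...) merges group_vars, host_vars, and inventory variables for each host:

# Effective variables per host, roughly what ansible-inventory --vars reports.
all_vars = {
    host.name: variable_manager.get_vars(host=host)
    for host in inventory.get_hosts()
}
pp(all_vars)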
I'm trying to automate the addition of new portgroups to ESXi hosts using pysphere. I'm using the following code fragment:
from pysphere import MORTypes
from pysphere import VIServer, VIProperty
from pysphere.resources import VimService_services as VI
# VIMor is used below but was missing from the original fragment:
from pysphere import VIMor

s = VIServer()
s.connect(vcenter, user, password)

host_system = s.get_hosts().keys()[17]
prop = VIProperty(s, host_system)
propname = prop.configManager._obj.get_element_networkSystem()
vswitch = prop.configManager.networkSystem.networkInfo.vswitch[0]
network_system = VIMor(propname, MORTypes.HostServiceSystem)

def add_port_group(name, vlan_id, vswitch, network_system):
    request = VI.AddPortGroupRequestMsg()
    _this = request.new__this(network_system)
    _this.set_attribute_type(network_system.get_attribute_type())
    request.set_element__this(_this)

    portgrp = request.new_portgrp()
    portgrp.set_element_name(name)
    portgrp.set_element_vlanId(vlan_id)
    portgrp.set_element_vswitchName(vswitch)
    portgrp.set_element_policy(portgrp.new_policy())
    request.set_element_portgrp(portgrp)

    s._proxy.AddPortGroup(request)
However, when I attempt to run it, I get the following error:
>>> add_port_group(name, vlan_id, vswitch, network_system)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 12, in add_port_group
File "/usr/lib/python2.6/site-packages/pysphere-0.1.8- py2.6.egg/pysphere/resources/VimService_services.py", line 4344, in AddPortGroup
response = self.binding.Receive(AddPortGroupResponseMsg.typecode)
File "/usr/lib/python2.6/site-packages/pysphere-0.1.8- py2.6.egg/pysphere/ZSI/client.py", line 545, in Receive
return _Binding.Receive(self, replytype, **kw)
File "/usr/lib/python2.6/site-packages/pysphere-0.1.8- py2.6.egg/pysphere/ZSI/client.py", line 464, in Receive
raise FaultException(msg)
pysphere.ZSI.FaultException: The object has already been deleted or has not been completely created
I've attempted to swap in different values for "vswitch" and "network_system", but I haven't had any success. Has anyone managed to do something similar with pysphere?
I can accomplish what I need through PowerShell, which demonstrates that it isn't a VMware issue, but I don't want to use PowerShell in this particular case.
I tried your code on one of our vSpheres.
It seems you are passing the vswitch object to set_element_vswitchName rather than its name. Maybe this will help:
vswitch = prop.configManager.networkSystem.networkInfo.vswitch[0].name
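With that one change, a hypothetical call using the rest of the question's code would look like this (the portgroup name and VLAN id here are made-up example values):

vswitch = prop.configManager.networkSystem.networkInfo.vswitch[0].name  # the name string, not the object
add_port_group('my-portgroup', 100, vswitch, network_system)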