I'm trying to write a script that fetches Google's AJAX search results (for example: http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=filetype:pdf ) and downloads every file. Right now I'm stuck trying to convert the response to a Python dictionary so it's easier to move through.
import subprocess
import ast
subprocess.call("curl -G -d 'q=filetype:pdf&v=1.0' http://ajax.googleapis.com/ajax/services/search/web > output",stderr=subprocess.STDOUT,shell=True)
file = open('output','r')
contents = file.read()
output_dict = ast.literal_eval(contents)
print output_dict
When I run it, I get:
$ python script.py
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 2643 0 2643 0 0 15926 0 --:--:-- --:--:-- --:--:-- 26696
Traceback (most recent call last):
File "script.py", line 7, in <module>
output_dict = ast.literal_eval(contents)
File "/usr/lib/python2.7/ast.py", line 80, in literal_eval
return _convert(node_or_string)
File "/usr/lib/python2.7/ast.py", line 63, in _convert
in zip(node.keys, node.values))
File "/usr/lib/python2.7/ast.py", line 62, in <genexpr>
return dict((_convert(k), _convert(v)) for k, v
File "/usr/lib/python2.7/ast.py", line 79, in _convert
raise ValueError('malformed string')
ValueError: malformed string
The file looks like:
{"responseData": {"results":[{"GsearchResultClass":"GwebSearch",
"unescapedUrl":"http://www.foundationdb.com/AlphaLicenseAgreement.pdf",
"url":"http://www.foundationdb.com/AlphaLicenseAgreement.pdf",
"visibleUrl":"www.foundationdb.com",
"cacheUrl":"http://www.google.com/search?q\u003dcache:W7zhFlfbm6UJ:www.foundationdb.com",
"title":"FoundationDB Alpha Software Evaluation License Agreement",
"titleNoFormatting":"FoundationDB Alpha Software Evaluation License Agreement",
"content":"FOUNDATIONDB. ALPHA SOFTWARE EVALUATION LICENSE AGREEMENT. PLEASE READ CAREFULLY THE TERMS OF THIS ALPHA SOFTWARE \u003cb\u003e...\u003c/b\u003e",
"fileFormat":"PDF/Adobe Acrobat"
},
{"GsearchResultClass":"GwebSearch",
"unescapedUrl":"https://subreg.cz/registration_agreement.pdf",
"url":"https://subreg.cz/registration_agreement.pdf",
"visibleUrl":"subreg.cz",
"cacheUrl":"http://www.google.com/search?q\u003dcache:ODtRmQsiHD0J:subreg.cz",
"title":"Registration Agreement",
"titleNoFormatting":"Registration Agreement",
"content":"Registration Agreement. In order to complete the registration process you must read and agree to be bound by all terms and conditions herein. TERMS AND \u003cb\u003e...\u003c/b\u003e",
"fileFormat":"PDF/Adobe Acrobat"
},
{"GsearchResultClass":"GwebSearch",
"unescapedUrl":"http://supportdetails.com/export.pdf",
"url":"http://supportdetails.com/export.pdf",
"visibleUrl":"supportdetails.com",
"cacheUrl":"http://www.google.com/search?q\u003dcache:h0LvxrTTKzIJ:supportdetails.com",
"title":"Export PDF - Support Details",
"titleNoFormatting":"Export PDF - Support Details",
"content":"",
"fileFormat":"PDF/Adobe Acrobat"
},
{"GsearchResultClass":"GwebSearch",
"unescapedUrl":"http://www.fws.gov/le/pdf/travelpetbird.pdf",
"url":"http://www.fws.gov/le/pdf/travelpetbird.pdf",
"visibleUrl":"www.fws.gov",
"cacheUrl":"",
"title":"pet bird",
"titleNoFormatting":"pet bird",
"content":"U.S. Fish \u0026amp; Wildlife Service. Traveling Abroad with. Your Pet Bird. The Wild Bird Conservation Act (Act), a significant step in international conservation efforts to \u003cb\u003e...\u003c/b\u003e",
"fileFormat":"PDF/Adobe Acrobat"
}],
"cursor":{"resultCount":"72,800,000",
"pages":[{"start":"0","label":1},
{"start":"4","label":2},
{"start":"8","label":3},
{"start":"12","label":4},
{"start":"16","label":5},
{"start":"20","label":6},
{"start":"24","label":7},
{"start":"28","label":8}],
"estimatedResultCount":"72800000",
"currentPageIndex":0,
"moreResultsUrl":"http://www.google.com/search?oe\u003dutf8\u0026ie\u003dutf8\u0026source\u003duds\u0026start\u003d0\u0026hl\u003den\u0026q\u003dfiletype:pdf","searchResultTime":"0.04"
}
},
"responseDetails": null,
"responseStatus": 200
}
God that took forever to format
Google returns JSON, so use the json module instead of the ast module you are using now.
import json

file = open('output', 'r')
output_dict = json.load(file)
You may also want to study the urllib2 module to load the URL response instead of relying on curl.
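Once the response parses, the PDF links sit under responseData -> results; a minimal sketch against a trimmed copy of the sample response shown in the question (json only, no network):

```python
import json

# Trimmed copy of the sample response from the question (assumed shape).
sample = '''
{"responseData": {"results": [
  {"GsearchResultClass": "GwebSearch",
   "url": "http://www.foundationdb.com/AlphaLicenseAgreement.pdf"},
  {"GsearchResultClass": "GwebSearch",
   "url": "https://subreg.cz/registration_agreement.pdf"}
]},
 "responseDetails": null,
 "responseStatus": 200}
'''

output_dict = json.loads(sample)

# Collect the direct PDF links from the parsed structure.
urls = [result["url"] for result in output_dict["responseData"]["results"]]
```

Each entry in `urls` can then be handed to whatever download step you prefer.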
I am trying to implement ScroogeCoin using the fastecdsa library. I currently run into an error when my create_coins function is called. The error points to the signing call (tx["signature"]) and says that it cannot convert an integer to bytes.
import hashlib
import json

from fastecdsa import keys, curve, ecdsa

class ScroogeCoin(object):
    def __init__(self):
        self.private_key, self.public_key = keys.gen_keypair(curve.secp256k1)
        self.address = hashlib.sha256(json.dumps(self.public_key.x).encode()).hexdigest()
        self.chain = []
        self.current_transactions = []

    def create_coins(self, receivers: dict):
        """
        Scrooge adds value to some coins
        :param receivers: {account:amount, account:amount, ...}
        """
        tx = {
            "sender": self.address,  # address,
            # coins that are created do not come from anywhere
            "location": {"block": -1, "tx": -1},
            "receivers": receivers,
        }
        tx["hash"] = hashlib.sha256(json.dumps(tx).encode()).hexdigest()  # hash of tx
        tx["signature"] = ecdsa.sign(self.private_key, tx["hash"])  # signed hash of tx
        self.current_transactions.append(tx)
    ...
When this function is run in the main function:
...
Scrooge = ScroogeCoin()
users = [User(Scrooge) for i in range(10)]
Scrooge.create_coins({users[0].address:10, users[1].address:20, users[3].address:50})
...
It produces this error:
Traceback (most recent call last):
File "D:\Scrooge_coin_assignmnet.py", line 216, in <module>
main()
File "D:\Scrooge_coin_assignmnet.py", line 197, in main
Scrooge.create_coins({users[0].address:10, users[1].address:20, users[3].address:50})
File "D:\Scrooge_coin_assignmnet.py", line 27, in create_coins
tx["signature"] = ecdsa.sign(self.private_key, tx["hash"])# signed hash of tx
File "C:\Users\d\AppData\Local\Programs\Python\Python311\Lib\site-packages\fastecdsa\ecdsa.py", line 36, in sign
rfc6979 = RFC6979(msg, d, curve.q, hashfunc, prehashed=prehashed)
File "C:\Users\d\AppData\Local\Programs\Python\Python311\Lib\site-packages\fastecdsa\util.py", line 25, in __init__
self.msg = msg_bytes(msg)
File "C:\Users\d\AppData\Local\Programs\Python\Python311\Lib\site-packages\fastecdsa\util.py", line 153, in msg_bytes
raise ValueError('Msg "{}" of type {} cannot be converted to bytes'.format(
ValueError: Msg "21783419755125685845542189331366569080312572314742637241373298325693730090205" of type <class 'int'> cannot be converted to bytes
I've tried to play around and change it to bytes by using encode on tx["hash"], as well as things like bytes.fromhex(), but it still gives the same error. I wanted to ask others who are more skilled to see if they can spot how I am messing up.
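For what it's worth, the value in the traceback looks suspicious: fastecdsa's sign() takes the message first and the private key second (the traceback shows it forwarding RFC6979(msg, d, ...)), so ecdsa.sign(self.private_key, tx["hash"]) would hand the integer key in as the message. A stand-in sketch of the call shape (sign_stub is a hypothetical placeholder, not fastecdsa, and performs no real ECDSA signing):

```python
import hashlib
import json

def sign_stub(msg, d):
    """Hypothetical stand-in mirroring fastecdsa's parameter order:
    the message comes first, the private key (an int) second."""
    if isinstance(msg, str):
        msg = msg.encode()
    if not isinstance(msg, (bytes, bytearray)):
        raise ValueError('Msg "{}" of type {} cannot be converted to bytes'
                         .format(msg, type(msg)))
    # Not a real signature -- just enough to exercise the argument order.
    return hashlib.sha256(msg + d.to_bytes(32, "big")).hexdigest()

private_key = 2 ** 200 + 1  # placeholder integer key
tx = {"sender": "addr", "location": {"block": -1, "tx": -1}, "receivers": {"a": 10}}
tx["hash"] = hashlib.sha256(json.dumps(tx).encode()).hexdigest()

# Key first (as in the question) reproduces the ValueError:
try:
    sign_stub(private_key, tx["hash"])
    swapped_ok = True
except ValueError:
    swapped_ok = False

# Message first, key second works:
tx["signature"] = sign_stub(tx["hash"], private_key)
```

If this diagnosis holds, the fix would be swapping the two arguments in the real ecdsa.sign call.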
I am starting to get an overview of Bitcoin and wanted to write a very simple program converting private keys into public keys, addresses, etc. However, for some private keys it suddenly fails when I try to compute the compressed address. As far as I can see, it fails because the compressed public key I am passing to bitcoin.pubkey_to_address() is one digit too short. As it is one of the most famous Bitcoin libraries, I assume there is a fundamental error in my understanding. Thus the question: what am I doing wrong in computing the compressed Bitcoin address from a private key?
I have installed the library with pip3 install bitcoin. In the minimal example below, I am using Python 3.8.5 on Ubuntu 20.04.
import bitcoin

class WalletWithBalance:
    def __init__(self, _private_key: str):
        self.private_key_uncompressed_hex: str = _private_key
        # public keys:
        self.public_key_uncompressed_as_x_y_tuple_hex = self.get_private_key_as_x_y_tuple()
        self.public_key_compressed_hex = self.get_compressed_public_key_hex()
        # addresses:
        self.address_compressed = bitcoin.pubkey_to_address(self.public_key_compressed_hex)

    def get_public_key_as_raw_hex_str(self):
        public_key_as_raw_hex_str = bitcoin.encode_pubkey(self.public_key_uncompressed_as_x_y_tuple_hex, 'hex')
        return public_key_as_raw_hex_str

    def get_private_key_as_x_y_tuple(self):
        private_key_raw_decimal_number = bitcoin.decode_privkey(self.private_key_uncompressed_hex, 'hex')
        return bitcoin.fast_multiply(bitcoin.G, private_key_raw_decimal_number)

    def get_compressed_public_key_hex(self):
        (public_key_x, public_key_y) = self.public_key_uncompressed_as_x_y_tuple_hex
        return self.get_compressed_prefix(public_key_y) + bitcoin.encode(public_key_x, 16)

    @staticmethod
    def get_compressed_prefix(public_key_y):
        if public_key_y % 2 == 0:
            return "02"
        else:
            return "03"

if __name__ == "__main__":
    wallet = WalletWithBalance(_private_key="0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c")
The stacktrace reads:
/usr/bin/python3 /foo/minimal_example.py
Traceback (most recent call last):
File "/foo/minimal_example.py", line 36, in <module>
wallet = WalletWithBalance(_private_key="0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c0c")
File "/foo/minimal_example.py", line 13, in __init__
self.address_compressed = bitcoin.pubkey_to_address(self.public_key_compressed_hex)
File "/DoeJohn/.local/lib/python3.8/site-packages/bitcoin/main.py", line 452, in pubkey_to_address
return bin_to_b58check(bin_hash160(pubkey), magicbyte)
File "/DoeJohn/.local/lib/python3.8/site-packages/bitcoin/main.py", line 334, in bin_hash160
intermed = hashlib.sha256(string).digest()
TypeError: Unicode-objects must be encoded before hashing
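One hedged observation on the "one digit too short" symptom: if bitcoin.encode(public_key_x, 16) returns minimal-length hex (no leading zeros), then any x coordinate whose top nibble is zero yields fewer than the 64 hex digits a compressed key needs. A stdlib-only sketch of the padding, making no assumptions about the bitcoin package:

```python
def compressed_pubkey_hex(x, y):
    """Compressed SEC1 form: a 02/03 parity prefix plus x as exactly 64 hex digits."""
    prefix = "02" if y % 2 == 0 else "03"
    return prefix + format(x, "064x")  # format() zero-pads short coordinates

# An x coordinate with only 61 significant hex digits; an unpadded
# encoder would emit it 3 digits short.
x = (1 << 244) - 1
key = compressed_pubkey_hex(x, 2)  # y even -> "02" prefix
```

With padding, the result is always 66 hex characters (1 prefix byte + 32 coordinate bytes), which is the length pubkey_to_address-style functions expect for a compressed key.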
I'm using a Beam pipeline to preprocess my text into an integer bag of words, similar to this example: https://github.com/GoogleCloudPlatform/cloudml-samples/blob/master/reddit_tft/reddit.py
words = tft.map(tf.string_split, inputs[name])
result[name + '_bow'] = tft.string_to_int(
words, frequency_threshold=frequency_threshold)
Preprocessing and training seem to work fine. I train a simple linear model and point to the transform function and run an experiment.
The saved_model.pbtxt seems to have the dictionary saved and my goal is to be able to deploy this model on google cloud ml for prediction and query it with raw text as input:
{"inputs" : { "title": "E. D. Abbott Ltd", "text" : "Abbott of Farnham E D Abbott Limited was a British coachbuilding business" }}
When running
gcloud ml-engine local predict \
--model-dir=$MODEL_DIR \
--json-instances="$DATA_DIR/test.json" \
I get the error below and have no idea what I'm doing wrong.
Source code / logs
WARNING:root:MetaGraph has multiple signatures 2. Support for multiple signatures is limited. By default we select named signatures.
ERROR:root:Exception during running the graph: Unable to get element from the feed as bytes.
Traceback (most recent call last):
  File "lib/googlecloudsdk/command_lib/ml_engine/local_predict.py", line 136, in <module>
    main()
  File "lib/googlecloudsdk/command_lib/ml_engine/local_predict.py", line 131, in main
    instances=instances)
  File "/Users/xyz/Downloads/google-cloud-sdk/lib/third_party/cloud_ml_engine_sdk/prediction/prediction_lib.py", line 656, in local_predict
    _, predictions = model.predict(instances)
  File "/Users/xyz/Downloads/google-cloud-sdk/lib/third_party/cloud_ml_engine_sdk/prediction/prediction_lib.py", line 553, in predict
    outputs = self._client.predict(columns, stats)
  File "/Users/xyz/Downloads/google-cloud-sdk/lib/third_party/cloud_ml_engine_sdk/prediction/prediction_lib.py", line 382, in predict
    "Exception during running the graph: " + str(e))
prediction_lib.PredictionError: (4, 'Exception during running the graph: Unable to get element from the feed as bytes.')
def feature_columns(vocab_size=100000):
    result = []
    for key in TEXT_COLUMNS:
        column = tf.contrib.layers.sparse_column_with_integerized_feature(key, vocab_size, combiner='sum')
        result.append(column)
    return result

model_fn = tf.contrib.learn.LinearClassifier(
    feature_columns=feature_columns(),
    n_classes=15,
    model_dir=output_dir
)

def get_transformed_reader_input_fn(transformed_metadata,
                                    transformed_data_paths,
                                    batch_size,
                                    mode):
    """Wrap the get input features function to provide the runtime arguments."""
    return input_fn_maker.build_training_input_fn(
        metadata=transformed_metadata,
        file_pattern=(
            transformed_data_paths[0] if len(transformed_data_paths) == 1
            else transformed_data_paths),
        training_batch_size=batch_size,
        label_keys=[LABEL_COLUMN],
        reader=gzip_reader_fn,
        key_feature_name='key',
        reader_num_threads=4,
        queue_capacity=batch_size * 2,
        randomize_input=(mode != tf.contrib.learn.ModeKeys.EVAL),
        num_epochs=(1 if mode == tf.contrib.learn.ModeKeys.EVAL else None))

transformed_metadata = metadata_io.read_metadata(
    args.transformed_metadata_path)
raw_metadata = metadata_io.read_metadata(args.raw_metadata_path)

train_input_fn = get_transformed_reader_input_fn(
    transformed_metadata, args.train_data_paths, args.batch_size,
    tf.contrib.learn.ModeKeys.TRAIN)

eval_input_fn = get_transformed_reader_input_fn(
    transformed_metadata, args.eval_data_paths, args.batch_size,
    tf.contrib.learn.ModeKeys.EVAL)

serving_input_fn = input_fn_maker.build_parsing_transforming_serving_input_fn(
    raw_metadata,
    args.transform_savedmodel,
    raw_label_keys=[],
    raw_feature_keys=model.TEXT_COLUMNS)

export_strategy = tf.contrib.learn.utils.make_export_strategy(
    serving_input_fn,
    default_output_alternative_key=None,
    exports_to_keep=5,
    as_text=True)

return Experiment(
    estimator=model_fn,
    train_input_fn=train_input_fn,
    eval_input_fn=eval_input_fn,
    export_strategies=export_strategy,
    eval_metrics=model.get_eval_metrics(),
    train_monitors=[],
    train_steps=args.train_steps,
    eval_steps=args.eval_steps,
    min_eval_frequency=1
)
The docs for build_parsing_transforming_serving_input_fn() say it makes an input function that applies transforms to raw data encoded as tf.Examples in a serialized string. Making things more complicated, that string then has to be base64-encoded to send to the prediction service (see the section on data encoding).
I would recommend using build_default_transforming_serving_input_fn() which is for json input. Then your json file should just have
{ "title": "E. D. Abbott Ltd", "text" : "Abbott of Farnham E D Abbott Limited was a British coachbuilding business" }
{ "title": "another title", "text" : "more text" }
...
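For completeness, the base64 step for the parsing serving input fn looks roughly like this on the client side. This is a sketch: tf_example_bytes is a stand-in for a real serialized tf.Example, and the {"b64": ...} JSON wrapper is the prediction service's convention for binary payloads:

```python
import base64
import json

# Stand-in for a serialized tf.Example protocol buffer (hypothetical bytes).
tf_example_bytes = b"serialized tf.Example bytes"

# Binary fields are sent as {"b64": <base64 string>} JSON values,
# one instance per line of the input file.
instance = {"b64": base64.b64encode(tf_example_bytes).decode("ascii")}
line = json.dumps(instance)

# The service decodes the field back to the original bytes before parsing.
decoded = base64.b64decode(json.loads(line)["b64"])
```

Using the json serving input fn instead, as suggested above, avoids this encoding dance entirely.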
I'm trying to load the following JSON string in python:
{
"Motivo_da_Venda_Perdida":"",
"Data_Visita":"2015-03-17 08:09:55",
"Cliente":{
"Distribuidor1_Modelo":"",
"RG":"",
"Distribuidor1_Marca":"Selecione",
"PlataformaMilho1_Quantidade":"",
"Telefone_Fazenda":"",
"Pulverizador1_Quantidade":"",
"Endereco_Fazenda":"",
"Nome_Fazenda":"",
"Area_Total_Fazenda":"",
"PlataformaMilho1_Marca":"Selecione",
"Trator1_Modelo":"",
"Tipo_Cultura3":"Selecione",
"Tipo_Cultura4":"Selecione",
"Cultura2_Hectares":"",
"Colheitadeira1_Quantidade":"",
"Tipo_Cultura1":"Soja",
"Tipo_Cultura2":"Selecione",
"Plantadeira1_Marca":"Stara",
"Autopropelido1_Modelo":"",
"Email_Fazenda":"",
"Autopropelido1_Marca":"Stara",
"Distribuidor1_Quantidade":"",
"PlataformaMilho1_Modelo":"",
"Trator1_Marca":"Jonh deere",
"Email":"",
"CPF":"46621644000",
"Endereco_Rua":"PAQUINHAS, S/N",
"Caixa_Postal_Fazenda":"",
"Cidade_Fazenda":"",
"Plantadeira1_Quantidade":"",
"Colheitadeira1_Marca":"New holland",
"Data_Nascimento":"2015-02-20",
"Cultura4_Hectares":"",
"Nome_Cliente":"MILTON CASTIONE",
"Cep_Fazenda":"",
"Telefone":"5491290687",
"Cultura3_Hectares":"",
"Trator1_Quantidade":"",
"Cultura1_Hectares":"",
"Autopropelido1_Quantidade":"",
"Pulverizador1_Modelo":"",
"Caixa_Postal":"",
"Estado":"RS",
"Endereco_Numero":"",
"Cidade":"COLORADO",
"Colheitadeira1_Modelo":"",
"Pulverizador1_Marca":"Selecione",
"CEP":"99460000",
"Inscricao_Estadual":"0",
"Plantadeira1_Modelo":"",
"Estado_Fazenda":"RS",
"Bairro":""
},
"Quilometragem":"00",
"Modelo_Pretendido":"Selecione",
"Quantidade_Prevista_Aquisicao":"",
"Id_Revenda":"1",
"Contato":"05491290687",
"Pendencia_Para_Proxima_Visita":"",
"Data_Proxima_Visita":"2015-04-17 08:09:55",
"Valor_de_Venda":"",
"Maquina_Usada":"0",
"Id_Vendedor":"2",
"Propensao_Compra":"Propensao_Compra_Frio",
"Comentarios":"despertar compra",
"Sistema_Compra":"Sistema_Compra_Finame",
"Outro_Produto":"",
"Data_Prevista_Aquisicao":"2015-04-17 08:09:55",
"Objetivo_Visita":"Despertar_Interesse",
"Tipo_Contato":"Telefonico"}
However, I get the following error when I try to load it:
File "python_file.py", line 107, in busca_proxima_mensagem
Visita = json.loads(corpo)
File "/usr/lib/python2.7/json/__init__.py", line 338, in loads
return _default_decoder.decode(s)
File "/usr/lib/python2.7/json/decoder.py", line 369, in decode
raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 1 column 2 - line 6 column 84 (char 1 - 1020)
But this JSON seems to be valid according to this site: http://jsonformatter.curiousconcept.com/. What am I doing wrong? Why can't I load this string as a JSON object?
I'm trying to load the string from AWS SQS like this:
import json
...
result = fila.get_messages(1, 30, 'SentTimestamp')
for message in result:
corpo = message.get_body()
Visita = json.loads(corpo)
OK, so I figured out what is causing me problems: there is a slash in the value of a key:
"Endereco_Rua":"PAQUINHAS, S/N",
However, I'm telling Python to filter that out (code below), but it's not working. How can I remove it? I can't do it at the origin that created the data, as I don't have access to the interface the user uses to fill it in.
result = fila.get_messages(1, 30, 'SentTimestamp')
for message in result:
corpo = message.get_body()
corpo = corpo.replace("/", "") #Filtering slashes
Visita = json.loads(corpo)
Found a solution! Besides the slash character, this error sometimes also happened with no visible cause. I ended up solving it by adding the following lines to my Python code:
1) At the start of my code, along with the other Python imports:
from boto.sqs.message import RawMessage
2) Changing my SQS queue to use/fetch raw data:
fila = sqs_conn.get_queue(constantes.fila_SQS)
fila.set_message_class(RawMessage)
Hope this helps anyone who is having the same issue.
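As a footnote on the original "Extra data" error: slashes in string values are legal JSON, so they were never the problem. json.loads raises "Extra data" when a complete JSON value is followed by more content, which is exactly what a wrapped or concatenated message body produces. A quick illustration:

```python
import json

body = '{"Cliente": {"Endereco_Rua": "PAQUINHAS, S/N"}}'
cliente = json.loads(body)  # parses fine: the slash needs no escaping

wrapped = body + "\n" + body  # two JSON documents in a single string
try:
    json.loads(wrapped)
    error = None
except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
    error = str(exc)
```

This is consistent with RawMessage fixing things: the default boto message class wraps/decodes the body, so the string handed to json.loads was not the bare JSON document.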
I'm trying to use the ghmm Python module on Mac OS X with Python 2.7. I've managed to get everything installed, and I can import ghmm in the Python environment, but there are errors when I run this (from the ghmm tutorial; UnfairCasino can be found here: http://ghmm.sourceforge.net/UnfairCasino.py):
from ghmm import *
from UnfairCasino import test_seq
sigma = IntegerRange(1,7)
A = [[0.9, 0.1], [0.3, 0.7]]
efair = [1.0 / 6] * 6
eloaded = [3.0 / 13, 3.0 / 13, 2.0 / 13, 2.0 / 13, 2.0 / 13, 1.0 / 13]
B = [efair, eloaded]
pi = [0.5] * 2
m = HMMFromMatrices(sigma, DiscreteDistribution(sigma), A, B, pi)
v = m.viterbi(test_seq)
Specifically I get this error:
GHMM ghmm.py:148 - sequence.c:ghmm_dseq_free(1199): Attempted m_free on NULL pointer. Bad program, BAD! No cookie for you.
python(52313,0x7fff70940cc0) malloc: *** error for object 0x74706d6574744120: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug
Abort trap
and when I set the ghmm.py logger to "DEBUG", the log prints out the following just before:
GHMM ghmm.py:2333 - HMM.viterbi() -- begin
GHMM ghmm.py:849 - EmissionSequence.asSequenceSet() -- begin >
GHMM ghmm.py:862 - EmissionSequence.asSequenceSet() -- end >
Traceback (most recent call last):
  File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/logging/__init__.py", line 842, in emit
    msg = self.format(record)
  File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/logging/__init__.py", line 719, in format
    return fmt.format(record)
  File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/logging/__init__.py", line 464, in format
    record.message = record.getMessage()
  File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/logging/__init__.py", line 328, in getMessage
    msg = msg % self.args
TypeError: not all arguments converted during string formatting
Logged from file ghmm.py, line 1159
Traceback (most recent call last):
  File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/logging/__init__.py", line 842, in emit
    msg = self.format(record)
  File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/logging/__init__.py", line 719, in format
    return fmt.format(record)
  File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/logging/__init__.py", line 464, in format
    record.message = record.getMessage()
  File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/logging/__init__.py", line 328, in getMessage
    msg = msg % self.args
TypeError: not all arguments converted during string formatting
Logged from file ghmm.py, line 949
GHMM ghmm.py:2354 - HMM.viterbi() -- end
GHMM ghmm.py:1167 - del SequenceSubSet >
So I suspect it has something to do with the way sequences are deleted once the Viterbi function completes, but I'm not sure whether this means I need to modify the Python code, the C code, or compile ghmm and the wrappers differently. Any help/suggestions would be greatly appreciated, as I have been trying to get this library to work for the last 4 days.
Given the age of this question, you've probably moved on to something else, but this seemed to be the only related result I found. The issue is that a double free is happening, due to some weirdness in how the Python method EmissionSequence.asSequenceSet is executed. Look at how it is implemented in ghmm.py (~lines 845-863):
def asSequenceSet(self):
    """
    #returns this EmissionSequence as a one element SequenceSet
    """
    log.debug("EmissionSequence.asSequenceSet() -- begin " + repr(self.cseq))
    seq = self.sequenceAllocationFunction(1)

    # checking for state labels in the source C sequence struct
    if self.emissionDomain.CDataType == "int" and self.cseq.state_labels is not None:
        log.debug("EmissionSequence.asSequenceSet() -- found labels !")
        seq.calloc_state_labels()
        self.cseq.copyStateLabel(0, seq, 0)

    seq.setLength(0, self.cseq.getLength(0))
    seq.setSequence(0, self.cseq.getSequence(0))
    seq.setWeight(0, self.cseq.getWeight(0))
    log.debug("EmissionSequence.asSequenceSet() -- end " + repr(seq))

    return SequenceSetSubset(self.emissionDomain, seq, self)
This should probably raise some red flags, since it seems to reach into the C a bit much (not that I know for sure; I haven't looked too far into it).
Anyways, if you look a little above this function, there is another function called 'sequenceSet':
def sequenceSet(self):
    """
    #return a one-element SequenceSet with this sequence.
    """
    # in order to copy the sequence in 'self', we first create an empty SequenceSet and then
    # add 'self'
    seqSet = SequenceSet(self.emissionDomain, [])
    seqSet.cseq.add(self.cseq)
    return seqSet
It seems to have the same purpose but is implemented differently. Anyways, if you replace the body of EmissionSequence.asSequenceSet in ghmm.py with just:
def asSequenceSet(self):
    """
    #returns this EmissionSequence as a one element SequenceSet
    """
    return self.sequenceSet()
and then rebuild/reinstall the ghmm module, the code will work without crashing, and you should be able to go on your merry way. I'm not sure if this can be submitted as a fix, since the ghmm project looks a little dead, but hopefully this is simple enough to help anyone in dire straits using this library.