Dumbo(Python)/Hadoop unexpected output

I'm trying to run the following code with Dumbo (Python) on Hadoop:
https://github.com/klbostee/dumbo/wiki/Short-tutorial#jobs-and-runners
I followed every step of the tutorial, but when I run the code in the Hadoop environment I get the following output:
SEQ/org.apache.hadoop.typedbytes.TypedBytesWritable/org.apache.hadoop.typedbytes.TypedBytesWritable�������ޭǡ�q���%�O��������������172.16.1.10������������������172.16.1.12������������������172.16.1.30������
It should return a list of IP addresses, each with a connection count.
Why do those characters appear? Is it an encoding problem? How do I fix it? Thanks.
I have the same problem when I try the other programs in the tutorial.

Answering my own question: that output is Dumbo's serialized form (a Hadoop SequenceFile of typed bytes). There is no error.
To convert it into readable text, the following command is sufficient (the answer was in the tutorial; I just hadn't seen it):
dumbo cat ipcounts/part* -hadoop /usr/local/hadoop | sort -k2,2nr | head -n 5
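If the post-processing is easier to do in Python than in the shell, the `sort -k2,2nr | head -n 5` step can be sketched roughly as below. This assumes the text produced by `dumbo cat` has one tab-separated `key<TAB>count` pair per line; the function name `top_counts` and the sample counts are made up for illustration.

```python
def top_counts(lines, n=5):
    """Return the n (key, count) pairs with the highest counts."""
    pairs = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Assumption: the count is the last tab-separated field.
        key, count = line.rsplit("\t", 1)
        pairs.append((key, int(count)))
    # Highest count first; ties keep their input order.
    return sorted(pairs, key=lambda kv: kv[1], reverse=True)[:n]


# Made-up sample lines standing in for the decoded job output.
sample = ["172.16.1.10\t3", "172.16.1.12\t7", "172.16.1.30\t1"]
for ip, count in top_counts(sample, n=2):
    print(ip, count)
```

To use it on real output, the decoded text could be piped in and `sys.stdin` passed as `lines`.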

Related

Treetools package - computational linguistics

How can I obtain the LCFRS grammar using treetools? I used the following terminal command
treetools grammar wsj_0001.prd output leftright --dest-format rcg --markov v:1 h:2
where wsj_0001.prd is a tree file, but the output file I get is empty.
https://pypi.python.org/pypi/treetools/0.1.0 - I used the last of the commands listed there.
Thanks.

I would suggest using a sentence from the Clark paper

Simple split string not working in zapier

I'm using Zapier to connect different apps. I need to split a string, custom_id, that has 6 parts separated by underscores. For example: sk000_i093_14.50_5_MNE_2017-07-25
Here's my code:
split_str = input_data['custom_id'].split("_")
output = [{'sk':split_str[0], 'buy_invoice':split_str[1], 'sales_amt':split_str[2], 'UPI':split_str[3], 'buyer':split_str[4], 'date_buy':split_str[5]}]
I also tried it this way:
sk, buy_invoice, sales_amt, upi, buyer, date_buy = input_data['custom_id'].split("_")
output = [{'sk':sk, 'buy_invoice':buy_invoice, 'sales_amt':sales_amt, 'upi':upi, 'buyer':buyer, 'date_buy':date_buy}]
I've searched and searched and haven't found anything specific to Zapier that explains why my simple string split isn't working. When I test the code, Zapier doesn't give a useful error message, just:
"Bargle. We hit an error creating a run python. Error: Your code had an error!"
I've tried running it multiple ways, but whenever I try to retrieve the data from the split I get the same unhelpful error message.
Any help is very much appreciated! Thanks!
UPDATE:
When you go to test the code, Zapier shows test data for input_data. Even though this data is showing up correctly, during the actual test run input_data is empty! So there was nothing wrong with the split. Phew!
Thanks!

The split was correct. The problem was that input_data wasn't being populated: even though Zapier's test view showed the correct data, input_data was empty during the actual run. I added some more key:value pairs to input_data because I needed them, refreshed the webpage, refreshed the fields, and re-tested the code; input_data finally got populated and the code ran perfectly.
Thanks to PRMoureu and E. Ducateme for giving me the idea to check my input_data (Duh!).
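Since the real cause was an empty input_data, a guard around the split would have surfaced it instead of the generic error. Below is a minimal sketch of that idea, not an official Zapier pattern: the helper name `parse_custom_id` and the `error` key are hypothetical, and `input_data` is defined by hand only so the snippet runs outside Zapier.

```python
FIELDS = ["sk", "buy_invoice", "sales_amt", "upi", "buyer", "date_buy"]


def parse_custom_id(input_data):
    """Split custom_id into named parts, or report what was actually received."""
    custom_id = input_data.get("custom_id", "")
    parts = custom_id.split("_")
    if len(parts) != len(FIELDS):
        # A missing or empty custom_id ends up here instead of raising.
        message = "expected %d parts in custom_id, got %r" % (len(FIELDS), custom_id)
        return [{"error": message}]
    return [dict(zip(FIELDS, parts))]


# In a Zapier code step, input_data is supplied by Zapier itself;
# it is defined here only so the sketch runs standalone.
input_data = {"custom_id": "sk000_i093_14.50_5_MNE_2017-07-25"}
output = parse_custom_id(input_data)
print(output)
```

With an empty input_data the step then returns a readable `error` entry showing the value it received, which makes the "input_data is empty" case visible in the test run.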

Tor API example does not work correctly

I'm trying to run the example named "Using PycURL" from https://stem.torproject.org/tutorials/to_russia_with_love.html
Everything works fine until the end, where I get this error:
TypeError : String argument expected, got 'bytes'
Unable to reach http://google.com <<23, 'Failed writing body <0 != 144>'>>
How can I fix this?
I've tried using PycURL on its own, without any proxy, and it works fine.
But this example does not work.
I'm running Python 3.4 on Windows; here is my source code: http://pastebin.com/zFWrXU5E
Thanks.
P.S. I need this to work with PycURL specifically, because it is the most useful for my tasks.
P.S. #2: I wrote a small workaround that seems to work: http://pastebin.com/x8PtL9i3
Heh.
P.S. #3: I found where the error occurs: it's in PycURL's WRITEFUNCTION; somehow the io.StringIO().write function does not work ...

Solved.
The problem was with Python 3.4, because the StringIO object changed.
All you need to do is change the output variable's type from StringIO to BytesIO, then convert the bytes to a string before printing the result.
Here is the working source code: http://pastebin.com/Ad8ENTGe
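The StringIO-to-BytesIO change can be illustrated without PycURL or a network connection. On Python 3, PycURL hands the response body to the WRITEFUNCTION callback as bytes, so the sketch below feeds byte chunks to each buffer's write method. The helper name `collect`, the sample chunks, and the UTF-8 encoding are assumptions for illustration only.

```python
import io


def collect(chunks, buffer):
    """Feed each chunk to buffer.write, the way a write callback would."""
    for chunk in chunks:
        buffer.write(chunk)
    return buffer.getvalue()


# Stand-in for the pieces of a response body, which arrive as bytes.
chunks = [b"<html>", b"hello", b"</html>"]

try:
    collect(chunks, io.StringIO())
except TypeError as exc:
    # io.StringIO only accepts str on Python 3, hence the original error.
    print("StringIO rejected bytes:", exc)

# BytesIO accepts the chunks; decode once at the end for printing.
body = collect(chunks, io.BytesIO()).decode("utf-8")
print(body)
```

In the real script the same pattern would apply: pass the `write` method of an `io.BytesIO` object as the write callback and decode `getvalue()` afterwards.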
Thanks.
P.S. Who placed -1 ???
haters...

gcutil moveinstances failing due to "KeyError: u'CPUS'"

I am trying to move my micro-sized Compute Engine from us-central2-a to us-central1-a, since Google will be doing maintenance on the first zone in a week. I am running gcutil-1.9.0 on my Windows machine, via Cygwin.
I ran the exact command they suggested:
gcutil moveinstances --replace_deprecated --source_zone=us-central2-a --destination_zone=us-central1-a ".*" --project=careful-isotope-239
and got the following result:
Checking destination zone...
Retrieving instances in us-central2-a matching: .*...
Checking disk preconditions...
Checking quotas...
KeyError: u'CPUS'
So, this is evidently a Python error, but I have no idea how to proceed. Anybody have ideas?
Thanks,
Tim

You should use the --service_version=v1beta15 flag; they've broken the API for getzone (moveinstances is trying to verify the CPUS quota).

Starting John the Ripper via a python script

I've been working on a Python script that extracts the password hash from a Mac.
Now I want to take it to the next level and crack it.
After some quick research I found John the Ripper (http://www.openwall.com/john/) and decided to try it. (Note: I have tried other software, but none of it has been able to crack my test hash.)
The problem is that when I try to start John the Ripper, it fails. (I'm using a custom Mac build of version 1.7.3; I haven't tried updating yet and would prefer not to.)
Current script (after about 1 000 000 changes and retries):
output__ = "1dc74ff22b199305242d62f76f6a5c5c47b4c2e3"
print output__
txt = file('john/sha1.txt','wt')
sha1textfile = "%s:%s" % (output2[0], output__)
txt.write(sha1textfile)
txt2 = file('startjohn.command', 'wt')
stjtextfile = """
#!/bin/bash
cd /hax/john
./run/john sha1.txt
"""
txt2.write(stjtextfile)
shell('chmod 777 startjohn.command')
shell('open startjohn.command')
The error I get is:
/hax/startjohn.command ; exit;
My-MacBook:~ albertfreakman$ /hax/startjohn.command ; exit;
No password hashes loaded
logout
Help me solve this problem and save me from insanity!
Sincerely, Duke.
Some quick notes:
output__ is my test hash; the hash-extraction part is already working.
If you have a solution that uses a hash cracker other than John, that's even better, as long as it can use either a wordlist or brute force.
The hash is SHA1
Thanks!

Okay, I found the problem: my test hash didn't have CAPITAL LETTERS and therefore wasn't accepted by John the Ripper.
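Going by that finding, the hash line could be normalized to uppercase before it is written to sha1.txt. The sketch below only takes the answer's report at face value (whether a given John build really requires uppercase is not verified here); the helper name `john_line` and the user name `testuser` are hypothetical, since the original script takes the user from `output2[0]`.

```python
import string


def john_line(user, sha1_hex):
    """Build a 'user:HASH' line with the SHA-1 digest in uppercase hex."""
    digest = sha1_hex.strip().upper()
    if len(digest) != 40 or any(c not in string.hexdigits for c in digest):
        raise ValueError("not a 40-character hex SHA-1 digest: %r" % sha1_hex)
    return "%s:%s" % (user, digest)


# "testuser" is a placeholder; the hash is the test hash from the question.
line = john_line("testuser", "1dc74ff22b199305242d62f76f6a5c5c47b4c2e3")
print(line)
```

The resulting line is what would be written to john/sha1.txt in place of the lowercase version.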
