Error when implementing gensim.LdaMallet

I was following the instructions in this tutorial (http://radimrehurek.com/2014/03/tutorial-on-mallet-in-python/), but I got an error when I tried to train the model:
model = models.LdaMallet(mallet_path, corpus, num_topics=10, id2word=corpus.dictionary)
IOError: [Errno 2] No such file or directory: 'c:\\users\\brlu\\appdata\\local\\temp\\c6a13a_state.mallet.gz'
Any thoughts would be appreciated.
Thanks.

This can happen for two reasons:
1. There is a space in your mallet path.
2. The MALLET_HOME environment variable is not set.
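Both conditions can be checked from Python before training. The following is only a minimal sketch: the helper name check_mallet_setup and the example path are made up for illustration and are not part of gensim or MALLET.

```python
import os


def check_mallet_setup(mallet_path, environ=None):
    """Return a list of likely configuration problems (empty if none found)."""
    environ = os.environ if environ is None else environ
    problems = []
    if " " in mallet_path:
        problems.append("mallet_path contains a space")
    if not environ.get("MALLET_HOME"):
        problems.append("MALLET_HOME is not set")
    return problems


# Placeholder path and an empty environment, purely for demonstration.
print(check_mallet_setup("C:/Program Files/mallet/bin/mallet", environ={}))
```

An empty list only means that neither of these two causes applies; it does not prove that MALLET itself runs.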

In my case, I had forgotten to import gensim's mallet wrapper. The following code resolved the error:
import os
from gensim.models.wrappers import LdaMallet
os.environ['MALLET_HOME'] = 'C:/.../mallet-2.0.8/'
A more detailed explanation can be found here:
https://github.com/RaRe-Technologies/gensim/issues/2137

Make sure that mallet works properly from the command line.
Look in your temp folder ('c:\users\brlu\appdata\local\temp\...'): if it contains some files, you can deduce at which step the mallet wrapper fails. Then try that step at the command line.
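To see which intermediate files were written, you could list the temp folder from Python. This sketch assumes that the wrapper's files have "mallet" somewhere in their names, as the file in the error message does ('c6a13a_state.mallet.gz'); the helper name list_mallet_files is hypothetical.

```python
import os
import tempfile


def list_mallet_files(folder=None):
    """Return sorted names of files in `folder` whose name contains 'mallet'."""
    folder = tempfile.gettempdir() if folder is None else folder
    return sorted(
        name
        for name in os.listdir(folder)
        if "mallet" in name.lower() and os.path.isfile(os.path.join(folder, name))
    )


# With no argument, the current temp directory is inspected.
print(list_mallet_files())
```

If the list is empty, the failure probably happened before MALLET wrote anything, which points back at the path or environment setup.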

I had similar problems with gensim + MALLET on Windows:
Make sure that MALLET_HOME is set.
Escape the backslashes when setting mallet_path in Python:
mallet_path = 'c:\\mallet-2.0.7\\bin\\mallet'
LDA_model = gensim.models.LdaMallet(mallet_path, ...
Also, it might be useful to modify line 142 in Python\Lib\site-packages\gensim\models\ldamallet.py: change --token-regex '\S+' to --token-regex \"\S+\"
Hope it helps.
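The reason escaping matters for this particular path is that "\b" (as in "\bin") is itself an escape sequence in an ordinary Python string. A short illustration, using the same path as above; a raw string is an alternative to doubling the backslashes:

```python
# Doubling each backslash and using a raw string give the same path.
escaped = 'c:\\mallet-2.0.7\\bin\\mallet'
raw = r'c:\mallet-2.0.7\bin\mallet'
print(escaped == raw)

# Without escaping, "\b" is read as a single backspace character (code 8),
# so "\bin" would no longer contain a backslash at all.
print('\bin' == chr(8) + 'in')
```

Both comparisons print True.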

Try the following:
import tempfile
tempfile.tempdir='some_other_non_system_temp_directory'
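A slightly fuller sketch of the same idea, assuming that the wrapper asks tempfile.gettempdir() for its working location (which is why the override has to happen before the model is created). The helper name use_temp_dir and the path in the usage comment are hypothetical.

```python
import os
import tempfile


def use_temp_dir(path):
    """Create `path` if needed and make it the default temp directory."""
    os.makedirs(path, exist_ok=True)
    tempfile.tempdir = path
    # gettempdir() now reports the overridden location.
    return tempfile.gettempdir()


# Usage (hypothetical folder, ideally one without spaces in its path):
# use_temp_dir("C:/mallet_tmp")
```

The override only lasts for the current Python process, so it needs to run each time before training.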

Related

SentiStrength: [WinError 2] The system cannot find the file specified

I tried to use SentiStrength with Python to classify text sentiment.
import sentistrength
from sentistrength import PySentiStr
senti = PySentiStr()
senti.setSentiStrengthPath('C:/Users/xx/SentiStrengthCom.jar')
senti.setSentiStrengthLanguageFolderPath ('C:/Users/xx/SentStrength_Data_Sept2011/')
str_arr = ['What a lovely day', 'What a bad day']
result = senti.getSentiment(str_arr, score='scale')
When I execute the last line, I get the error [WinError 2] The system cannot find the file specified. The files do seem to exist, though, because the check below prints no error message.
import os
SentiStrengthLocation = "C:/Users/xx/SentiStrengthCom.jar"
SentiStrengthLanguageFolder = "C:/Users/xx/SentStrength_Data_Sept2011/"
if not os.path.isfile(SentiStrengthLocation):
    print("SentiStrength not found at: ", SentiStrengthLocation)
if not os.path.isdir(SentiStrengthLanguageFolder):
    print("SentiStrength data folder not found at: ", SentiStrengthLanguageFolder)
I am really looking forward to your help! Thanks a lot!
Also, do you have any recommendations on how to perform good sentiment analysis in Python?
Edit: I tried it on Colab and it works there. Is it possible that missing admin rights prevent access to the file?

According to this comment on a GitHub issue, is it possible that you don't have Java installed? The package might be throwing the error because of that.
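One quick way to test this is to ask Python whether a java executable is on the PATH. This is only a sketch and assumes that the wrapper starts java as an external program; in that case [WinError 2] would refer to java itself rather than to the .jar or data folder. The helper name find_java is made up.

```python
import shutil


def find_java():
    """Return the full path of the `java` executable on PATH, or None."""
    return shutil.which("java")


java_path = find_java()
if java_path:
    print("java found at:", java_path)
else:
    print("java not found on PATH")
```

If java is not found, installing a Java runtime (or adding its bin folder to PATH) would be the next thing to try.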

Import Skin Weight Maps (Python) - (Maya)

As mentioned in this post, I would like to import a skin weight map (a .weightMap file) into a scene without having to open a dialog box. Trying to reverse-engineer the script mentioned in the reply didn't get me anywhere.
When I do it manually through Maya's UI, the script history shows...
ImportSkinWeightMaps;
...as a command. But my searches on this keep leading me to the deformerWeights command.
The problem is that the documentation has no example of the correct syntax. Writing the flags and the path by trial and error didn't work, and additional searches keep hinting that I need to use an .xml file for some reason, when all I want to do is import a .weightMap file.
I even ended up looking at weight importer scripts on highend3d.com, hoping to see what proper import syntax looks like.
All I need is the correct syntax (or command) for something like:
mel.eval("ImportSkinWeightMaps;")
or
cmds.deformerWeights (p = "path to my .weightMap file", im=True, )
or
from pymel.core import *
pymel.core.runtime.ImportSkinWeightMaps ( 'targetOject', 'path to .weightMap file' )
Any help would be greatly appreciated.
Thanks!

Why not use cmds.skinPercent?
It is more reliable.
http://tech-artists.org/forum/showthread.php?5490-Faster-way-to-find-number-of-influences-Maya&p=27598#post27598

How to customize Stanford NER in python?

I learned how to customize Stanford NER (Named Entity Recognizer) in Java from here:
http://nlp.stanford.edu/software/crf-faq.shtml#a
But I am developing my project in Python, and I need to train my classifier with some custom entities.
I searched a lot for a solution but could not find one. Any ideas? If it is not possible, is there any other way to train my classifier with custom entities, e.g. with nltk or other Python libraries?
EDIT: Code addition
This is what I did to set up and test Stanford NER which worked nicely:
from nltk.tag.stanford import StanfordNERTagger
path_to_model = "C:\..\stanford-ner-2016-10-31\classifiers\english.all.3class.distsim.crf.ser"
path_to_jar = "C:\..\stanford-ner-2016-10-31\stanford-ner.jar"
nertagger=StanfordNERTagger(path_to_model, path_to_jar)
query="Show me the best eye doctor in Munich"
print(nertagger.tag(query.split()))
This code worked successfully. Then I downloaded the sample austen.prop file along with jane-austen-emma-ch1.tsv and jane-austen-emma-ch2.tsv, and put them in a custom folder inside the NER tagger library folder. I modified jane-austen-emma-ch1.tsv with my custom entity tags. The austen.prop file points to jane-austen-emma-ch1.tsv. I then modified the above code as follows, but it does not work:
from nltk.tag.stanford import StanfordNERTagger
path_to_model = "C:\..\stanford-ner-2016-10-31\custom/austen.prop"
path_to_jar = "C:\..\stanford-ner-2016-10-31\stanford-ner.jar"
nertagger=StanfordNERTagger(path_to_model, path_to_jar)
query="Show me the best eye doctor in Munich"
print(nertagger.tag(query.split()))
But this code is producing the following error:
Exception in thread "main" edu.stanford.nlp.io.RuntimeIOException: java.io.StreamCorruptedException: invalid stream header: 236C6F63
raise OSError('Java command failed : ' + str(cmd))
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifierNoExceptions(AbstractSequenceClassifier.java:1507)
at edu.stanford.nlp.ie.crf.CRFClassifier.main(CRFClassifier.java:3017)
Caused by: java.io.StreamCorruptedException: invalid stream header: 236C6F63
OSError: Java command failed : ['C:\\Program Files\\Java\\jdk1.8.0_111\\bin\\java.exe', '-mx1000m', '-cp', 'C:/Users/HP/Desktop/Downloads1/Compressed/stanford-ner-2016-10-31/stanford-ner-2016-10-31\\stanford-ner-3.7.0-javadoc.jar;C:/Users/HP/Desktop/Downloads1/Compressed/stanford-ner-2016-10-31/stanford-ner-2016-10-31\\stanford-ner-3.7.0-sources.jar;C:/Users/HP/Desktop/Downloads1/Compressed/stanford-ner-2016-10-31/stanford-ner-2016-10-31\\stanford-ner-3.7.0.jar;C:/Users/HP/Desktop/Downloads1/Compressed/stanford-ner-2016-10-31/stanford-ner-2016-10-31\\stanford-ner.jar;C:/Users/HP/Desktop/Downloads1/Compressed/stanford-ner-2016-10-31/stanford-ner-2016-10-31\\lib\\joda-time.jar;C:/Users/HP/Desktop/Downloads1/Compressed/stanford-ner-2016-10-31/stanford-ner-2016-10-31\\lib\\jollyday-0.4.9.jar;C:/Users/HP/Desktop/Downloads1/Compressed/stanford-ner-2016-10-31/stanford-ner-2016-10-31\\lib\\stanford-ner-resources.jar', 'edu.stanford.nlp.ie.crf.CRFClassifier', '-loadClassifier', 'C:/Users/HP/Desktop/Downloads1/Compressed/stanford-ner-2016-10-31/stanford-ner-2016-10-31/custom/austen.prop', '-textFile', 'C:\\Users\\HP\\AppData\\Local\\Temp\\tmppk8_741f', '-outputFormat', 'slashTags', '-tokenizerFactory', 'edu.stanford.nlp.process.WhitespaceTokenizer', '-tokenizerOptions', '"tokenizeNLs=false"', '-encoding', 'utf8']
at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:808)
at java.io.ObjectInputStream.<init>(ObjectInputStream.java:301)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1462)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifier(AbstractSequenceClassifier.java:1494)
at edu.stanford.nlp.ie.AbstractSequenceClassifier.loadClassifierNoExceptions(AbstractSequenceClassifier.java:1505)
... 1 more

The Stanford NER classifier is a Java program. NLTK's module is only an interface to the Java executable. So you train a model exactly as you did before (or as you saw done in the link you provide).
In your code, you are confusing the training of a model with its use to chunk new text. The .prop file contains instructions for training a new model; it is not itself a model. This is what I recommend:
1. Forget about Python/NLTK for the moment, and train a new model from the Windows command line (CMD prompt or whatever). Follow the how-to you mention in your question to generate, from your .prop file, a serialized model (.ser file) named ner-model.ser.gz or whatever you decide to call it.
2. In your Python code, set the path_to_model variable to point to the .ser file you generated in step 1.
If you really want to control the training process from Python, you could use the subprocess module to issue the appropriate command-line commands. But it sounds like you don't really need this; just try to understand what these steps do so that you can carry them out properly.
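The error message itself supports this diagnosis: the "invalid stream header" bytes are plain text, as you would expect from the start of a .prop file rather than a serialized model. The sketch below decodes them and builds a training command. The -prop flag follows the Stanford NER FAQ linked in the question; the helper names and the relative paths are placeholders, so check them against your own installation.

```python
def decode_stream_header(hex_header):
    """Decode the hex header reported by Java into ASCII text."""
    return bytes.fromhex(hex_header).decode("ascii")


def build_training_command(path_to_jar, path_to_prop):
    """Build the java command that trains a CRF model from a .prop file."""
    return [
        "java", "-cp", path_to_jar,
        "edu.stanford.nlp.ie.crf.CRFClassifier",
        "-prop", path_to_prop,
    ]


# The four header bytes from the traceback are text, not a Java object stream.
print(decode_stream_header("236C6F63"))  # prints "#loc"

# Placeholder paths; adjust to where the jar and .prop file actually live.
cmd = build_training_command("stanford-ner.jar", "custom/austen.prop")
print(" ".join(cmd))

# To actually train (requires Java and the Stanford NER files):
# import subprocess
# subprocess.run(cmd, check=True)
```

Once training has produced the serialized model, that file (not the .prop file) is what path_to_model should point to.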

Saving NLTK Alignments

I am using NLTK 3.2 and I was wondering how to save NLTK alignments. I found this link: "How to save Python NLTK alignment models for later use?", but it seems that there is no align() method. I also figured out that nltk.align has been renamed to nltk.translate, but I still cannot access the align() method. Thanks!

Yeah, you are right. The align method became private in the current version. So, if you want to use that method, you have to modify the source code.
To modify the source code, you first have to find the directory that contains the file:
1. Open your terminal.
2. Start Python and type these commands:
$ python
>>> import nltk
>>> nltk.translate.ibm1.__file__
Now go to that directory and find the file 'ibm1.py'. Open the file and rename the method __align to align.
It's the last method in the file.
CAUTION:
The align method returns an Alignment object instead of the AlignedSent that earlier versions returned.
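As an alternative to editing the installed file, Python's name mangling can reach a double-underscore method from outside its class. This is a different technique from the one described above, shown here on a toy stand-in class (ToyModel is invented for the example). For NLTK, the mangled name would be _IBMModel1__align only if the method is defined directly on a class named IBMModel1, which is an assumption to verify in your version.

```python
class ToyModel:
    """Stand-in for a model class with a double-underscore 'private' method."""

    def __align(self, pair):
        # Toy behaviour: pair position i of the source with position i of the target.
        source, target = pair
        return [(i, i) for i in range(min(len(source), len(target)))]


model = ToyModel()

# model.__align(...) raises AttributeError from outside the class,
# but the mangled name _ToyModel__align is reachable.
alignment = model._ToyModel__align((["das", "haus"], ["the", "house"]))
print(alignment)  # [(0, 0), (1, 1)]
```

Relying on a mangled name is fragile across library versions, so it is best treated as a stopgap.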

How to download a file using Python

I tried to download something from the Internet using Python. I am using urllib.retriever from the urllib module, but I just can't get it to work. I would like to be able to save the downloaded file to a location of my choice.
If someone could explain how to do it with clear examples, that would be very much appreciated.

I suggest using urllib2 (Python 2 only) like so:
import urllib2
source = urllib2.urlopen("http://someUrl.com/somePage.html").read()
open("/path/to/someFile", "wb").write(source)
You could even shorten it to the following (although you wouldn't want to if you plan to enclose each individual call in a try/except):
open("/path/to/someFile", "wb").write(urllib2.urlopen("http://someUrl.com/somePage.html").read())

In Python 3, you can also use urllib.request:
import urllib.request
source = urllib.request.urlopen("full_url").read()
and then write it to a file as in the answer above:
open("/path/to/someFile", "wb").write(source)
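For Python 3, the two steps can be wrapped in a small function that closes both the connection and the file, and that streams the data instead of reading it all into memory. This is a minimal sketch; the function name download is made up, and the URL and path in the usage comment are the placeholders from the answers above.

```python
import shutil
import urllib.request


def download(url, destination):
    """Stream the resource at `url` into the file `destination`; return the path."""
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out_file:
        # Copy in chunks so large files do not have to fit in memory.
        shutil.copyfileobj(response, out_file)
    return destination


# Usage (placeholder URL and path):
# download("http://someUrl.com/somePage.html", "/path/to/someFile")
```

Network failures surface as exceptions from urlopen (urllib.error.URLError and its subclasses), so the call can be wrapped in a try/except if needed.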
