Using WordNet with PyScript - python

I'm trying to use WordNet within PyScript, but I can't seem to load WordNet properly.
At first I tried:
<py-env>
- nltk
</py-env>
<py-script>
import nltk
from nltk.corpus import wordnet as wn
</py-script>
This gave me a LookupError (resource_not_found), along with the message:
Please use the NLTK Downloader to obtain the resource: >>> import nltk >>> nltk.download('wordnet')
I then tried:
<py-script>
import nltk
nltk.download('wordnet')
from nltk.corpus import wordnet as wn
</py-script>
which gave me this message in the console:
writing to py-3f0adca1-a38a-4161-c36f-7e6548260aa5 [nltk_data] Error loading wordnet: <urlopen error unknown url type:
[nltk_data] https> true
I looked at the responses here: "Pyodide filesystem for NLTK resources: missing files"
and tried to replicate their code:
from js import fetch
from pathlib import Path
import asyncio, os, sys, io, zipfile
response = await fetch('https://github.com/nltk/wordnet/archive/refs/heads/master.zip')
js_buffer = await response.arrayBuffer()
py_buffer = js_buffer.to_py() # this is a memoryview
stream = py_buffer.tobytes() # now we have a bytes object
d = Path("/nltk/wordnet")
d.mkdir(parents=True, exist_ok=True)
Path('/nltk/wordnet/master.zip').write_bytes(stream)
zipfile.ZipFile('/nltk/wordnet/master.zip').extractall(
path='/nltk/wordnet/'
)
This is the error message that I got:
APPENDING: True ==> py-2880055f-8922-cb23-34e4-db404fb1d7a4 --> PythonError: Traceback (most recent call last):
File "/lib/python3.10/asyncio/futures.py", line 201, in result
raise self._exception
File "/lib/python3.10/asyncio/tasks.py", line 232, in __step
result = coro.send(None)
File "/lib/python3.10/site-packages/_pyodide/_base.py", line 500, in eval_code_async
await CodeRunner(
File "/lib/python3.10/site-packages/_pyodide/_base.py", line 353, in run_async
await coroutine
File "<exec>", line 21, in <module>
File "/lib/python3.10/zipfile.py", line 1258, in __init__
self._RealGetContents()
File "/lib/python3.10/zipfile.py", line 1325, in _RealGetContents
raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
What am I doing wrong? Thanks!
UPDATE:
I tried installing the wn library from PyPI using
await micropip.install('https://files.pythonhosted.org/packages/ce/f1/53b07100f5c3d41fd33fc78ebb9e99d736b0460ced8acff94840311ffc60/wn-0.9.1-py3-none-any.whl')
but I get this error:
JsException(PythonError: Traceback (most recent call last):
File "/lib/python3.10/asyncio/futures.py", line 201, in result
raise self._exception
File "/lib/python3.10/asyncio/tasks.py", line 232, in __step
result = coro.send(None)
File "/lib/python3.10/site-packages/_pyodide/_base.py", line 500, in eval_code_async
await CodeRunner(
File "/lib/python3.10/site-packages/_pyodide/_base.py", line 353, in run_async
await coroutine
File "", line 14, in <module>
File "/lib/python3.10/site-packages/wn/__init__.py", line 47, in <module>
from wn._add import add, remove
File "/lib/python3.10/site-packages/wn/_add.py", line 21, in <module>
from wn.project import iterpackages
File "/lib/python3.10/site-packages/wn/project.py", line 12, in <module>
import lzma
File "/lib/python3.10/lzma.py", line 27, in <module>
from _lzma import *
ModuleNotFoundError: No module named '_lzma'
)
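The BadZipFile traceback earlier in this question means the bytes written to master.zip were not a complete, valid ZIP archive, so whatever fetch returned was something else (the thread does not establish what). Below is a minimal diagnostic sketch, assuming the bytes have already been obtained as in the fetch snippet above. `extract_if_zip` is a hypothetical helper name, not part of Pyodide or NLTK:

```python
import io
import zipfile


def extract_if_zip(data: bytes, target_dir: str) -> bool:
    """Extract data into target_dir only if it is a valid ZIP archive."""
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        # Show the first bytes: a real ZIP starts with b"PK", while an
        # HTML error page or an empty body will look different.
        print("Not a ZIP archive; first bytes:", data[:20])
        return False
    buffer.seek(0)
    with zipfile.ZipFile(buffer) as archive:
        archive.extractall(path=target_dir)
    return True
```

Calling `extract_if_zip(stream, '/nltk/wordnet/')` in place of the write_bytes and extractall steps would print the start of the response instead of raising, which may help show what the server actually sent.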

Related

Error in librosa.load('path.webm') RuntimeError: File contains data in an unknown format

I am trying to load sound files in a Python script.
I installed and loaded the right versions of the modules, successfully imported librosa and soundfile, and even installed ffmpeg (which I found was a solution to this same error for mp3 files):
import os
import json
from scipy import signal
import librosa
[...]
y, sr = librosa.load(directory_path + filename + '.webm', mono = True)
My code works in notebooks (Kaggle), but for reasons I can't figure out it doesn't work when run on a computer cluster. And I really need the cluster for its computational power and memory. :/
The error:
Traceback (most recent call last):
File "/cluster/apps/nss/gcc-6.3.0/python/3.8.5/x86_64/lib64/python3.8/site-packages/librosa/core/audio.py", line 146, in load
with sf.SoundFile(path) as sf_desc:
File "/cluster/home/.local/lib/python3.8/site-packages/soundfile.py", line 740, in __init__
self._file = self._open(file, mode_int, closefd)
File "/cluster/home/.local/lib/python3.8/site-packages/soundfile.py", line 1264, in _open
_error_check(_snd.sf_error(file_ptr),
File "/cluster/home/.local/lib/python3.8/site-packages/soundfile.py", line 1455, in _error_check
raise RuntimeError(prefix + _ffi.string(err_str).decode('utf-8', 'replace'))
RuntimeError: Error opening 'input/00092.webm': File contains data in an unknown format.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "analysis_python.py", line 47, in <module>
y,sr = librosa.load(directory_path + filename + '.webm')
File "/cluster/apps/nss/gcc-6.3.0/python/3.8.5/x86_64/lib64/python3.8/site-packages/librosa/core/audio.py", line 163, in load
y, sr_native = __audioread_load(path, offset, duration, dtype)
File "/cluster/apps/nss/gcc-6.3.0/python/3.8.5/x86_64/lib64/python3.8/site-packages/librosa/core/audio.py", line 187, in __audioread_load
with audioread.audio_open(path) as input_file:
File "/cluster/apps/nss/gcc-6.3.0/python/3.8.5/x86_64/lib64/python3.8/site-packages/audioread/__init__.py", line 116, in audio_open
raise NoBackendError()
audioread.exceptions.NoBackendError
I searched the internet for a solution, but had no luck. Thank you in advance for your help.

ImportError: cannot import name 'dump_csp_header' from 'werkzeug.http' (C:\Users\karan\Anaconda3\lib\site-packages\werkzeug\http.py)

I am getting this error when running TensorBoard:
(base) C:\Users\karan\TF_2_Notebooks_and_Data\03-ANNs>tensorboard --logdir logs\fit
Traceback (most recent call last):
File "C:\Users\karan\Anaconda3\Scripts\tensorboard-script.py", line 5, in <module>
from tensorboard.main import run_main
File "C:\Users\karan\Anaconda3\lib\site-packages\tensorboard\main.py", line 40, in <module>
from tensorboard import default
File "C:\Users\karan\Anaconda3\lib\site-packages\tensorboard\default.py", line 38, in <module>
from tensorboard.plugins.audio import audio_plugin
File "C:\Users\karan\Anaconda3\lib\site-packages\tensorboard\plugins\audio\audio_plugin.py", line 23, in <module>
from werkzeug import wrappers
File "C:\Users\karan\Anaconda3\lib\site-packages\werkzeug\__init__.py", line 151, in <module>
__import__('werkzeug.exceptions')
File "C:\Users\karan\Anaconda3\lib\site-packages\werkzeug\exceptions.py", line 71, in <module>
from werkzeug.wrappers import Response
File "C:\Users\karan\Anaconda3\lib\site-packages\werkzeug\wrappers\__init__.py", line 26, in <module>
from .common_descriptors import CommonRequestDescriptorsMixin
File "C:\Users\karan\Anaconda3\lib\site-packages\werkzeug\wrappers\common_descriptors.py", line 7, in <module>
from ..http import dump_csp_header
ImportError: cannot import name 'dump_csp_header' from 'werkzeug.http' (C:\Users\karan\Anaconda3\lib\site-packages\werkzeug\http.py)
and also when running:
board = TensorBoard(log_dir=log_directory,
                    histogram_freq=1,
                    write_graph=True,
                    write_images=True,
                    update_freq='epoch',
                    profile_batch=2,
                    embeddings_freq=1)
If I set embeddings_freq=1 instead of embeddings_freq=0, I get an error:
Train on 426 samples, validate on 143 samples
Traceback (most recent call last):
File "C:\Users\karan\Anaconda3\lib\site-packages\tensorflow_core\python\keras\callbacks.py", line 1561, in _configure_embeddings
from tensorboard.plugins import projector
File "C:\Users\karan\Anaconda3\lib\site-packages\tensorboard\plugins\projector\__init__.py", line 32, in <module>
from tensorboard.plugins.projector import projector_plugin as _projector_plugin
File "C:\Users\karan\Anaconda3\lib\site-packages\tensorboard\plugins\projector\projector_plugin.py", line 30, in <module>
from werkzeug import wrappers
File "C:\Users\karan\Anaconda3\lib\site-packages\werkzeug\__init__.py", line 151, in <module>
__import__('werkzeug.exceptions')
File "C:\Users\karan\Anaconda3\lib\site-packages\werkzeug\exceptions.py", line 71, in <module>
from werkzeug.wrappers import Response
File "C:\Users\karan\Anaconda3\lib\site-packages\werkzeug\wrappers\__init__.py", line 26, in <module>
from .common_descriptors import CommonRequestDescriptorsMixin
File "C:\Users\karan\Anaconda3\lib\site-packages\werkzeug\wrappers\common_descriptors.py", line 7, in <module>
from ..http import dump_csp_header
ImportError: cannot import name 'dump_csp_header' from 'werkzeug.http' (C:\Users\karan\Anaconda3\lib\site-packages\werkzeug\http.py)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "", line 5, in <module>
callbacks=[board,early_stop]
File "C:\Users\karan\Anaconda3\lib\site-packages\tensorflow_core\python\keras\engine\training.py", line 819, in fit
use_multiprocessing=use_multiprocessing)
File "C:\Users\karan\Anaconda3\lib\site-packages\tensorflow_core\python\keras\engine\training_v2.py", line 307, in fit
mode=ModeKeys.TRAIN)
File "C:\Users\karan\Anaconda3\lib\site-packages\tensorflow_core\python\keras\callbacks.py", line 107, in configure_callbacks
callback_list.set_model(callback_model)
File "C:\Users\karan\Anaconda3\lib\site-packages\tensorflow_core\python\keras\callbacks.py", line 222, in set_model
callback.set_model(model)
File "C:\Users\karan\Anaconda3\lib\site-packages\tensorflow_core\python\keras\callbacks.py", line 1549, in set_model
self._configure_embeddings()
File "C:\Users\karan\Anaconda3\lib\site-packages\tensorflow_core\python\keras\callbacks.py", line 1563, in _configure_embeddings
raise ImportError('Failed to import TensorBoard. Please make sure that '
ImportError: Failed to import TensorBoard. Please make sure that TensorBoard integration is complete.
What should I do? I am a beginner, not an expert. I didn't create an Anaconda environment; I am running in (base). Could someone please reply quickly? My exams are coming up and I need to resolve this as soon as possible. My thanks to the community.
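The failing import is inside werkzeug itself (werkzeug\wrappers\common_descriptors.py importing from werkzeug.http), which may indicate that the installed werkzeug files come from more than one version. This is a guess, not something the thread confirms. Below is a minimal sketch for recording which versions are installed before reinstalling anything. It assumes Python 3.8 or newer for importlib.metadata, and `installed_version` is a hypothetical helper name:

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Print the versions of the packages that appear in the traceback.
for name in ("tensorboard", "tensorflow", "werkzeug"):
    print(name, installed_version(name))
```

Run it from the same (base) environment that produces the error, so that it reports the packages TensorBoard actually imports.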

google cloud speech ImportError: cannot import name 'enums'

I'm using the google-cloud-speech API for my project. I'm using pipenv for the virtual environment, and I installed google-cloud-speech with
pipenv install google-cloud-speech
and
pipenv update google-cloud-speech
I followed these docs: https://cloud.google.com/speech-to-text/docs/reference/libraries
This is my code:
google.py:
#!/usr/bin/env python
# coding: utf-8
import argparse
import io
import sys
import codecs
import datetime
import locale
import os
from google.cloud import speech_v1 as speech
from google.cloud.speech import enums
from google.cloud.speech import types
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = os.path.join("alt_speech_dev_01-fa5fec6806d9.json")
def get_model_by_language_id(language_id):
    model = ''
    if language_id == 1:
        model = 'ja-JP'
    elif language_id == 2:
        model = 'en-US'
    elif language_id == 3:
        model = "zh-CN"
    else:
        # Raise a real exception; raising a bare string is a TypeError.
        raise ValueError('Not Match Lang')
    return model
def transcribe_gcs_without_speech_contexts(audio_file_path, model):
    client = speech.SpeechClient()
    with io.open(audio_file_path, 'rb') as audio_file:
        content = audio_file.read()
    audio = types.RecognitionAudio(content=content)
    config = {
        "encoding": enums.RecognitionConfig.AudioEncoding.FLAC,
        "sample_rate_hertz": 16000,
        "languageCode": model
    }
    operation = client.long_running_recognize(config, audio)
    print('Waiting for operation to complete...')
    operationResult = operation.result()
    ret=''
    for result in operationResult.results:
        for alternative in result.alternatives:
            ret = alternative.transcript
    return ret
def transcribe_gcs(audio_file_path, model, keywords=None):
    client = speech.SpeechClient()
    with io.open(audio_file_path, 'rb') as audio_file:
        content = audio_file.read()
    audio = types.RecognitionAudio(content=content)
    config = {
        "encoding": enums.RecognitionConfig.AudioEncoding.FLAC,
        "sample_rate_hertz": 16000,
        "languageCode": model,
        "speech_contexts":[{"phrases":keywords}]
    }
    operation = client.long_running_recognize(config, audio)
    print('Waiting for operation to complete...')
    operationResult = operation.result()
    ret=''
    for result in operationResult.results:
        for alternative in result.alternatives:
            ret = alternative.transcript
    return ret
transcribe_gcs_without_speech_contexts('alt_en.wav', get_model_by_language_id(2))
When I try to run the Python file with
python google.py
it returns ImportError: cannot import name 'SpeechClient' with the following traceback:
Traceback (most recent call last):
File "google.py", line 11, in <module>
from google.cloud import speech_v1 as speech
File "/home/hoanglinh/Documents/practice_speech/.venv/lib/python3.6/site-packages/google/cloud/speech_v1/__init__.py", line 17, in <module>
from google.cloud.speech_v1.gapic import speech_client
File "/home/hoanglinh/Documents/practice_speech/.venv/lib/python3.6/site-packages/google/cloud/speech_v1/gapic/speech_client.py", line 18, in <module>
import pkg_resources
File "/home/hoanglinh/Documents/practice_speech/.venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3241, in <module>
@_call_aside
File "/home/hoanglinh/Documents/practice_speech/.venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3225, in _call_aside
f(*args, **kwargs)
File "/home/hoanglinh/Documents/practice_speech/.venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3269, in _initialize_master_working_set
for dist in working_set
File "/home/hoanglinh/Documents/practice_speech/.venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3269, in <genexpr>
for dist in working_set
File "/home/hoanglinh/Documents/practice_speech/.venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2776, in activate
declare_namespace(pkg)
File "/home/hoanglinh/Documents/practice_speech/.venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2275, in declare_namespace
_handle_ns(packageName, path_item)
File "/home/hoanglinh/Documents/practice_speech/.venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2208, in _handle_ns
loader.load_module(packageName)
File "/home/hoanglinh/Documents/practice_speech/google.py", line 12, in <module>
from google.cloud.speech import enums
File "/home/hoanglinh/Documents/practice_speech/.venv/lib/python3.6/site-packages/google/cloud/speech.py", line 19, in <module>
from google.cloud.speech_v1 import SpeechClient
ImportError: cannot import name 'SpeechClient'
Am I doing something wrong? When I search for the error online, there is only one question, and it has no answer.
UPDATE:
I changed from
from google.cloud import speech_v1 as speech
to
from google.cloud import speech
Now I get a different error, with this traceback:
Traceback (most recent call last):
File "google.py", line 11, in <module>
from google.cloud import speech
File "/home/hoanglinh/Documents/practice_speech/.venv/lib/python3.6/site-packages/google/cloud/speech.py", line 19, in <module>
from google.cloud.speech_v1 import SpeechClient
File "/home/hoanglinh/Documents/practice_speech/.venv/lib/python3.6/site-packages/google/cloud/speech_v1/__init__.py", line 17, in <module>
from google.cloud.speech_v1.gapic import speech_client
File "/home/hoanglinh/Documents/practice_speech/.venv/lib/python3.6/site-packages/google/cloud/speech_v1/gapic/speech_client.py", line 18, in <module>
import pkg_resources
File "/home/hoanglinh/Documents/practice_speech/.venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3241, in <module>
@_call_aside
File "/home/hoanglinh/Documents/practice_speech/.venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3225, in _call_aside
f(*args, **kwargs)
File "/home/hoanglinh/Documents/practice_speech/.venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3269, in _initialize_master_working_set
for dist in working_set
File "/home/hoanglinh/Documents/practice_speech/.venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 3269, in <genexpr>
for dist in working_set
File "/home/hoanglinh/Documents/practice_speech/.venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2776, in activate
declare_namespace(pkg)
File "/home/hoanglinh/Documents/practice_speech/.venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2275, in declare_namespace
_handle_ns(packageName, path_item)
File "/home/hoanglinh/Documents/practice_speech/.venv/lib/python3.6/site-packages/pkg_resources/__init__.py", line 2208, in _handle_ns
loader.load_module(packageName)
File "/home/hoanglinh/Documents/practice_speech/google.py", line 12, in <module>
from google.cloud.speech import enums
ImportError: cannot import name 'enums'
Has anyone used this library before? There seem to be so many errors just from following its docs.
The following error message is seen
from google.cloud.speech import enums
ImportError: cannot import name 'enums'
if a new installation of the Google Speech API was performed. Please see this page.
Along the same lines, using the nanos attribute results in the following message if you have updated the API:
AttributeError: 'datetime.timedelta' object has no attribute 'nanos'
Please see this page. Use 'microseconds' instead of 'nanos'.
First solution: check whether python3.6/site-packages/google/cloud contains speech_v1. If it does not, you need to install it first.
Second solution: check whether python3.6/site-packages/google/cloud contains an existing speech file. If it does, the import error is caused by shadowing, since your alias is 'speech'.
Hope this helps.
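Shadowing can also come from the project folder itself: the traceback in the question passes through the asker's own /home/hoanglinh/Documents/practice_speech/google.py while pkg_resources is setting up the google namespace. Whether renaming that script resolves the error is not confirmed in this thread. A small sketch for spotting such name clashes, where `find_shadowing_candidates` is a hypothetical helper:

```python
from pathlib import Path


def find_shadowing_candidates(script_dir, package_name):
    """List files or folders in script_dir that share a top-level package name."""
    base = Path(script_dir)
    candidates = [base / f"{package_name}.py", base / package_name]
    return [path for path in candidates if path.exists()]


# Example: look in the current working directory for anything named "google".
for path in find_shadowing_candidates(".", "google"):
    print("May shadow the installed package:", path)
```

An empty result only rules out a clash in that one directory; other entries on sys.path are not checked.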
Try these lines of code if you're using speech_v1:
from google.cloud import speech_v1 as speech
from google.cloud.speech_v1 import enums
from google.cloud.speech_v1 import types
If you're using speech:
from google.cloud import speech
from google.cloud.speech import enums
from google.cloud.speech import types
If you can, check this link.
Google has moved AudioEncoding under google.cloud.speech_v1.types. You can use it by importing types and then running the code below:
from google.cloud.speech_v1 import types
types.RecognitionConfig.AudioEncoding.LINEAR16
From the Google Cloud documentation:
Enums and Types
WARNING: Breaking change
The submodules enums and types have been removed.
Before:
from google.cloud import videointelligence
features = [videointelligence.enums.Feature.SPEECH_TRANSCRIPTION]
video_context = videointelligence.types.VideoContext()
After:
from google.cloud import videointelligence
features = [videointelligence.Feature.SPEECH_TRANSCRIPTION]
video_context = videointelligence.VideoContext()

NLTK panlex_lite giving me error

I'm trying to use NLTK for my NLP learning in Python.
A package called "panlex_lite" keeps giving me an error, so I tried the following:
import nltk
nltk.download('all', halt_on_error = False)
and it gives me the following error:
[nltk_data] | Downloading package panlex_lite to
[nltk_data] | /Users/Harshil/nltk_data...
[nltk_data] | Unzipping corpora/panlex_lite.zip.
Traceback (most recent call last):
File "<pyshell#1>", line 1, in <module>
nltk.download('all', halt_on_error = False)
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/nltk/downloader.py", line 664, in download
for msg in self.incr_download(info_or_id, download_dir, force):
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/nltk/downloader.py", line 543, in incr_download
for msg in self.incr_download(info.children, download_dir, force):
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/nltk/downloader.py", line 529, in incr_download
for msg in self._download_list(info_or_id, download_dir, force):
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/nltk/downloader.py", line 572, in _download_list
for msg in self.incr_download(item, download_dir, force):
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/nltk/downloader.py", line 549, in incr_download
for msg in self._download_package(info, download_dir, force):
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/nltk/downloader.py", line 638, in _download_package
for msg in _unzip_iter(filepath, zipdir, verbose=False):
File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/nltk/downloader.py", line 2039, in _unzip_iter
outfile.write(contents)
OSError: [Errno 22] Invalid argument
Is there any way to fix this? I've tried the "halt_on_error = False" option, but it still gives me the error.
Thanks.
Here's a "dirty" hack:
$ rm /Users/Harshil/nltk_data/corpora/panlex_lite.zip
$ rm -r /Users/Harshil/nltk_data/corpora/panlex_lite
$ python
>>> import nltk
>>> dler = nltk.downloader.Downloader()
>>> dler._update_index()
>>> dler._status_cache['panlex_lite'] = 'installed' # Trick the index into treating panlex_lite as already installed.
>>> dler.download('all')
Also, try earthy:
pip install earthy
TL;DR:
import earthy
path_to_nltk_data = '/home/yourusername/nltk_data/'
earthy.download('all', path_to_nltk_data) # Excludes the third party (non-NLTK) packages.
To download panlex_lite exclusively:
import earthy
earthy.download('panlex_lite', path_to_nltk_data)
To download all third-party datasets not natively hosted in the nltk_data GitHub repository:
import earthy
earthy.download('third_party', path_to_nltk_data)

Error with pydub in python

I have successfully imported pydub,
but when I run this code:
from pydub import AudioSegment
song = AudioSegment.from_mp3("c:\mks.mp3")
first_ten_seconds = song[:10000]
song.export("d:\mks.mp3", format="mp3")
it gives the following error:
python "C:\Users\mKs\Desktop\mks2.py"
Process started >>>
Traceback (most recent call last):
File "C:\Users\mKs\Desktop\mks2.py", line 2, in <module>
song=AudioSegment.from_mp3("c:\mks.mp3");
File "C:\Python27\lib\site-packages\pydub-0.5.2-py2.7.egg\pydub\audio_segment.py", line 194, in from_mp3
return cls.from_file(file, 'mp3')
File "C:\Python27\lib\site-packages\pydub-0.5.2-py2.7.egg\pydub\audio_segment.py", line 189, in from_file
return cls.from_wav(output)
File "C:\Python27\lib\site-packages\pydub-0.5.2-py2.7.egg\pydub\audio_segment.py", line 206, in from_wav
return cls(data=file)
File "C:\Python27\lib\site-packages\pydub-0.5.2-py2.7.egg\pydub\audio_segment.py", line 33, in __init__
raw = wave.open(StringIO(data), 'rb')
File "C:\Python27\lib\wave.py", line 498, in open
return Wave_read(f)
File "C:\Python27\lib\wave.py", line 163, in __init__
self.initfp(f)
File "C:\Python27\lib\wave.py", line 128, in initfp
self._file = Chunk(file, bigendian = 0)
File "C:\Python27\lib\chunk.py", line 63, in __init__
raise EOFError
EOFError
I would love to get help on this topic
The only issue I see with your code is the trailing ";" at the end of the last 3 lines. Please remove those and see if you still get the error.
In addition, make sure you have ffmpeg (http://www.ffmpeg.org/) installed. It is required to support all non-WAV file formats.
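A quick way to check the ffmpeg suggestion is to ask Python whether an ffmpeg executable is visible on the PATH. This is only a sketch: it assumes Python 3.3 or newer for shutil.which (the question uses Python 2.7), and `ffmpeg_available` is a hypothetical helper name, not part of pydub:

```python
import shutil


def ffmpeg_available():
    """Return True if an ffmpeg executable can be found on the PATH."""
    return shutil.which("ffmpeg") is not None


if ffmpeg_available():
    print("ffmpeg found at", shutil.which("ffmpeg"))
else:
    print("ffmpeg not found on PATH; install it or add its folder to PATH")
```

Finding ffmpeg on the PATH does not by itself guarantee that pydub can decode a given file, but a missing executable is worth ruling out first.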
ADDED:
I think you have broken module dependencies in your Python installation.
I tried the code you provided above with Python 2.7.2 (using a WAV file), and it worked fine for me:
>>> from pydub import AudioSegment
>>> song = AudioSegment.from_wav('goodbye.wav')
>>> first_ten_seconds = song[:10000]
>>> song.export('goodbye1.wav',format='wav')
<open file 'goodbye1.wav', mode 'wb+' at 0x10cf2b270>
