Why can't I download a dataset with the Gensim download API

When I do the below:
>>> import gensim.downloader as api
>>> model = api.load("glove-twitter-25") # load glove vectors
the gensim.downloader API throws the below error:
[Errno 2] No such file or directory:
'/Users/vtim/gensim-data/information.json'.
What am I doing wrong?

I had the same problem and solved it with the steps below. I am using a Mac, PyCharm, and virtualenv. I don't have much Python experience, but this is how I did it:
1.1 Create a folder named 'gensim-data' at '/Users/vtim/gensim-data'. You can do this by running 'mkdir gensim-data' from your home directory in the terminal (the same place where you run pip install commands).
1.2 Add the folder to your project as a content root (so that the code can access it). In PyCharm, open the main application menu (next to the Apple logo on a Mac) and go to PyCharm -> Preferences, then Project -> Project Structure, and choose 'Add content root' in the right-hand menu. Find the gensim-data folder you just made and add it.
1.3 You should now see the 'gensim-data' folder in your project folder, next to venv (virtualenv) if you are using it. Create a file named 'information.json' in the 'gensim-data' folder, then copy the contents of this link into it: https://github.com/RaRe-Technologies/gensim-data/blob/master/list.json
(The underlying problem is that the gensim.downloader API may not be able to write files to that directory, or may not be able to read them. In my case it could do neither.)
If your code is still not working, try the next step:
2.1 In my case the API also could not fetch the right files from the internet. That problem is solved here: https://stackoverflow.com/a/42098127/14075343 . Find the folder/application named Python 3.8 (if you are using version 3.8) on your computer, open it, and double-click 'Install Certificates.command'. Alternatively, run 'open /Applications/Python\ 3.8/Install\ Certificates.command' from the terminal.
Now the code should work. If it still doesn't, you can try running the commands below. I am not sure they make a difference, but I ran them while looking for the solution:
sudo python3 -m pip install --upgrade gensim
sudo -H pip install virtualenv
sudo chown -R $USERNAME /Users/$USERNAME/Library/Caches/pip
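The folder check from steps 1.1 to 1.3 can also be done from Python. This is only a minimal sketch using the standard library: the helper name check_gensim_data_dir is made up for illustration and is not part of gensim, and the default location assumes the data folder lives at ~/gensim-data, as in the error message.

```python
import os
from pathlib import Path


def check_gensim_data_dir(base_dir=None):
    """Create the data folder if needed and report whether it is usable.

    Hypothetical helper, not part of gensim. Assumes the folder is
    ~/gensim-data unless another path is passed in.
    """
    base = Path(base_dir) if base_dir else Path.home() / "gensim-data"
    base.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir gensim-data`
    return {
        "path": str(base),
        "readable": os.access(base, os.R_OK),
        "writable": os.access(base, os.W_OK),
        "has_information_json": (base / "information.json").is_file(),
    }
```

If "writable" comes back False, fixing the folder's permissions is probably a better first step than creating information.json by hand.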

I had both the 'information.json' issue and the certificate issue, and was able to resolve them by following the steps above. As a tip, you can also test it from the command line with
python3 -m gensim.downloader -i word2vec-google-news-300
Replace word2vec-google-news-300 with the name of the dataset that you want to download, taken from
https://github.com/RaRe-Technologies/gensim-data/blob/master/list.json
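For the certificate issue, it can help to see which CA files Python is looking for before and after running 'Install Certificates.command'. The sketch below only uses the standard ssl module; the function name describe_cert_paths is made up for this example, and it assumes the download goes through Python's default SSL settings.

```python
import ssl


def describe_cert_paths():
    """Return the CA file and directory that OpenSSL uses by default."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,  # None when the default file does not exist
        "capath": paths.capath,  # None when the default directory does not exist
        "openssl_cafile": paths.openssl_cafile,  # path compiled into OpenSSL
    }


if __name__ == "__main__":
    for key, value in describe_cert_paths().items():
        print(f"{key}: {value}")
```

If both "cafile" and "capath" are None, missing certificates are a plausible cause of the download failure.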

Related

openai command not found (mac)

I'm trying to follow the fine-tuning guide for OpenAI here.
I ran:
pip install --upgrade openai
which installed without any errors.
But even after restarting my terminal, I still get
zsh: command not found: openai
Here is the output of echo $PATH:
/bin:/usr/bin:/usr/local/bin:/Users/nickrose/Downloads/google-cloud-sdk/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin
Here is the output of which python:
/usr/bin/python
Any tips for how to fix this? I'm on MacOS Big Sur 11.6.

Basically pip installs the packages under its related python directory, in a directory called site-packages (most likely, I'm not a python expert tbh). This is not included in the path you provided. First, ask pip to show the location to the package:
pip show openai
The output would be something like this:
Name: openai
Version: 0.22.0
Summary: Python client library for the OpenAI API
Home-page: https://github.com/openai/openai-python
Author: OpenAI
Author-email: support@openai.com
License:
Location: /Users/<USER>/DIR/TO/SOME/PYTHON/site-packages
Requires: numpy, openpyxl, pandas, pandas-stubs, requests, tqdm
Required-by:
So your package will be available in
/Users/<USER>/DIR/TO/SOME/PYTHON/site-packages/openai
Either add /Users/<USER>/DIR/TO/SOME/PYTHON/site-packages/ to your path, or use the complete address to your package, or try to access it using your python:
python -m openai # -m stands for module
To get more information about the -m flag, run python --help.
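As a cross-check on the PATH question, here is a minimal sketch that asks the running interpreter where it puts command-line scripts and whether that directory is on PATH. It assumes pip placed the openai launcher in the interpreter's default scripts directory, which is common but not guaranteed (a --user install, for example, uses a different directory). The function names are made up for this example.

```python
import os
import shutil
import sysconfig


def scripts_dir():
    """Directory where this interpreter's default scheme installs scripts."""
    return sysconfig.get_path("scripts")


def is_on_path(directory, path_value=None):
    """True if `directory` is one of the entries in PATH."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    entries = [os.path.normpath(p) for p in path_value.split(os.pathsep) if p]
    return os.path.normpath(directory) in entries


if __name__ == "__main__":
    print("scripts directory:", scripts_dir())
    print("on PATH:", is_on_path(scripts_dir()))
    print("openai found at:", shutil.which("openai"))  # None if not on PATH
```

Run it with the same interpreter that pip belongs to (for example python3 check.py, where check.py is whatever you name the file); otherwise the reported directory may not be the one pip used.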
Update
So as you mentioned in the comments, you get permission denied after you add the directory to your path. This actually means that the package exists, but your OS does not permit it to be executed. What you have to do is locate your package, and then run:
sudo chmod +x /PATH/TO/script
And the reason you're getting command not found after you use sudo directly with the package, is that you update your path variable in zsh, but when you use sudo, superuser uses sh instead of zsh.

This doesn't answer the question directly but offers an alternative if you only want to prepare the data set and create the new model for fine-tuning. It doesn't matter which system you have.
After a lot of struggle I decided it was not worth the hassle to run the CLI on my specific machine because of so many different configurations and the mess. My end goal was just to create a model and upload it to OpenAI.
So if someone else stumbles on this post, just use Google Colab. I have also shared one of mine with steps to follow in here.
In case the links don't work in the future I'll list the steps here below as well:
(Step 1)
Set your API key (The already added api key is fake so please replace it with your own):
%env OPENAI_API_KEY=sk-Kz8Weh1234ddgYBmsdfinsdf7ndsfg55532432
(Step 2)
Install the openai package with pip like the following:
!pip install -Uq openai
(Step 3)
Import the openai package like the following:
import openai
(Step 4)
Make sure to upload the promptdata.csv file in the Google Colab folders.
The way to do it is:
On the right side you'll see a Hamburger Menu icon. Click on it.
You'll see the "Table of Contents"
Click on the last folder icon on the top. If you hover on the icon it says "Files".
Now you'll see a folder called "sample_data".
Click on the three dots menu for "sample_data" and then select "upload".
You should be able to upload your csv file
It is not mandatory to upload a CSV file. You can also upload a TSV, XLSX, JSON or JSONL file, as listed in the OpenAI documentation here. But it will always be converted to a JSONL file after running the command below.
Once you're done uploading the file you can run the below command to prepare your data set which will return you a new JSONL file at the same location where the original file was with all the corrections the tool provides.
!openai tools fine_tunes.prepare_data -f "/content/sample_data/promptdata.csv"
(Step 5)
Run the below command once again after the corrections and it will most likely say "No remediations found".
!openai tools fine_tunes.prepare_data -f "/content/sample_data/promptdata_prepared.jsonl"
(Step 6)
Finally run the below command using the file promptdata_prepared.jsonl and create a model.
!openai api fine_tunes.create -t "/content/sample_data/promptdata_prepared.jsonl"
(Step 7)
Once the model is created note the name of the "Uploaded model"
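To make the CSV to JSONL conversion less of a black box, here is a rough sketch of what a JSONL file looks like: one JSON object per line. It is not what the openai tool runs internally (the tool also suggests corrections); it assumes the CSV has a header row with "prompt" and "completion" columns, and the function name csv_to_jsonl is made up for this example.

```python
import csv
import json


def csv_to_jsonl(csv_path, jsonl_path):
    """Write one JSON object per CSV row and return the number of rows written.

    Assumes a header row with `prompt` and `completion` columns.
    """
    count = 0
    with open(csv_path, newline="", encoding="utf-8") as src, \
            open(jsonl_path, "w", encoding="utf-8") as dst:
        for row in csv.DictReader(src):
            record = {"prompt": row["prompt"], "completion": row["completion"]}
            dst.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count
```

In Colab this could be called as csv_to_jsonl("/content/sample_data/promptdata.csv", "/content/sample_data/promptdata.jsonl"), although running the prepare_data command from Step 4 is still the safer route because it also checks the data.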

So what happens is that after installing the package there are no actual executables available. That's why you get the error message when you try to execute for example:
openai --help
What I managed to find is that the actual parsing of the commands is done in
/Users/<USER>/DIR_TO_PYTHON/site-packages/openai/_openai_scripts.py
That's just a Python script which is not executable by default, so you need a workaround. The easiest one I found is to create an executable that calls the script with the given arguments. Below are the steps I followed to make it work on "macOS Monterey 12.0.1".
Locate the "openai" package which should be in
/Users/<USER>/DIR_TO_PYTHON/site-packages/
Make sure you are in the "openai" package folder and run
sudo vim /bin/openai
That should create a new file, put in the following command and make sure the path to the file is correct
python3 /Users/<USER>/DIR_TO_PYTHON/site-packages/openai/_openai_scripts.py $@
$@ stands for the parameters that you pass when you call the executable
After saving the file the next step is making it executable which is done with
chmod +x /bin/openai
The last step is adding it to the PATH, which is done by adding the file path to /etc/paths. After restarting the terminal, you should have a fully working openai command globally.

I was facing a similar issue. It may be that the global Python on your machine does not match the pip installation path, so pip installs into some other Python folder (for example 3.9) while Python 3.10 is set globally on your Mac.
First install a fresh Python using Homebrew
brew install python
It will install the latest Python on your machine. Then try to install openai again using
pip3 install openai
OR using pip (you can try installing using both and see which works as per your system config)
pip install openai
Now
ENJOY a cup of coffee ;)
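One way to rule out a mismatch between pip and python is to call pip through the interpreter itself. The sketch below only builds and prints such a command; the helper name pip_command is made up for this example.

```python
import sys


def pip_command(*args):
    """Build a pip command tied to the interpreter running this script."""
    return [sys.executable, "-m", "pip", *args]


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("version:", sys.version.split()[0])
    print("install with:", " ".join(pip_command("install", "openai")))
```

Running the printed command installs into the same Python that printed it, whichever of pip or pip3 happens to be first on PATH.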

Copy complete virtualenv to another pc

I have a virtualenv located at /home/user/virtualenvs/Environment. Now I need this environment at another PC. So I installed virtualenv-clone and used it to clone /Environment. Then I copied it to the other PC via USB. I can activate it with source activate, but when I try to start the python interpreter with sudo ./Environment/bin/python I get
./bin/python: 1: ./bin/python: Syntax Error: "(" unexpected
Executing it without sudo gives me an error telling me that there is an error in the binaries format.
But how can this be? I just copied it. Or is there a better way to do this? I can not just use pip freeze because there are some packages in /Environment/lib/python2.7/site-packages/ which I wrote myself and I need to copy them, too. As I understand it pip freeze just creates a list of packages which pip then downloads and installs.

Do the following steps on the source machine:
workon [environment_name]
pip freeze > requirements.txt
copy requirements.txt to other PC
On the other PC:
create a virtual environment using mkvirtualenv [environment_name]
workon [environment_name]
pip install -r requirements.txt
You should be done.
Other Resources:
How to Copy/Clone a Virtual Environment from Server to Local Machine
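To see what pip freeze would record, the installed distributions can also be listed from inside Python. This is a minimal sketch using importlib.metadata from the standard library (Python 3.8+); it is not pip's own implementation, and the function name freeze is just borrowed for readability.

```python
from importlib import metadata


def freeze():
    """Return 'name==version' lines for distributions visible to this interpreter."""
    lines = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        if meta is None:
            continue  # skip installs without readable metadata
        name = meta.get("Name")
        version = meta.get("Version")
        if name and version:
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    print("\n".join(freeze()))
```

Packages you wrote yourself will show up in this list if they were installed, but pip on the other PC can only fetch them if they are available from a package index or a local path, which is the limitation raised in the question.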

Pip Freeze Not Applicable For You?
Scenario: you have libraries installed on your current system that are very hard to migrate using pip freeze, and I am talking really hard, because you have to download and install the wheels manually (gdal, fiona, rasterio, for example), and even then the project may still crash, possibly because they were installed in the wrong order or the dependencies were wrong and so on.
This is the experience I had when I was brought on board a project.
For such a case, when you finally get the environment right you basically don't want to go through the same hell again when you move your project to a new machine. Which I did, multiple times. Until finally I found a solution.
Now, disclaimer before I move on:
I don't advocate for this method as the best, but it was the best for my case at the time.
I also cannot guarantee it will work when switching between different OSes, as I have only tried it between Windows machines. In fact I don't expect it to work when you move from Windows to other OSes, as the structure of the virtualenv folder on a Unix-based OS is different from that on Windows.
Finally, the best way to do all of this is to use Docker. My plan is to eventually do so. I have just never used Docker for a non-web-app project before and I needed a quick fix as my computer broke down and the project could not be delayed. I will update this thread when I can once I apply Docker to the project.
THE HACK
So this is what I did:
Install the same base Python on your new machine. If you have 3.9 on the old, install 3.9 on the new one and so on. Keep note of where the executable can be located, usually something like C:\Users\User\Appdata\Local\Programs\Python\PythonXX
Compress your virtual env folder, copy it into the project directory on your new machine, and extract all files there.
Using a text editor of your choice, or preferably an IDE, use the 'Search in all files' feature to look for all references to your old machine paths: C:\Users\your-old-username
Replace these with your new paths. In my case I had to do it in the following files inside the virtual env folder: pyvenv.cfg, Scripts/activate, Scripts/activate.bat, Scripts/activate.fish and Scripts/activate.nu.
And that's it!
Good luck everyone.
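The search-and-replace step can be scripted. The following is a minimal sketch, not a tested migration tool: the function name rewrite_paths is made up, it only touches the five files named above, and it assumes those files are UTF-8 text.

```python
from pathlib import Path

# Files named in the answer above, relative to the virtual env folder.
FILES_TO_PATCH = [
    "pyvenv.cfg",
    "Scripts/activate",
    "Scripts/activate.bat",
    "Scripts/activate.fish",
    "Scripts/activate.nu",
]


def rewrite_paths(venv_dir, old_prefix, new_prefix):
    """Replace old_prefix with new_prefix in the listed files.

    Returns the relative names of the files that were changed.
    """
    changed = []
    for relative in FILES_TO_PATCH:
        target = Path(venv_dir) / relative
        if not target.is_file():
            continue  # not every venv contains every activate script
        text = target.read_text(encoding="utf-8")
        if old_prefix in text:
            target.write_text(text.replace(old_prefix, new_prefix), encoding="utf-8")
            changed.append(relative)
    return changed
```

A call might look like rewrite_paths("venv", r"C:\Users\old-name", r"C:\Users\new-name"), where both user names are placeholders. Keeping a copy of the folder before running it is a sensible precaution.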

I think what happens is that the symbolic links in the source environment are copied to the target machine as regular files (no longer links). You should copy the environment with rsync -l to keep those links.
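If rsync is not available, a similar copy that keeps symbolic links can be sketched in Python with shutil.copytree and symlinks=True. The function name copy_keeping_links is made up for this example, and it assumes the destination folder does not exist yet.

```python
import shutil
from pathlib import Path


def copy_keeping_links(src, dst):
    """Copy a directory tree, recreating symbolic links instead of following them.

    Returns the relative paths of the links found in the copy.
    """
    shutil.copytree(src, dst, symlinks=True)
    return sorted(
        str(p.relative_to(dst)) for p in Path(dst).rglob("*") if p.is_symlink()
    )
```

Keeping the links only helps if their targets also exist on the other PC: a link that points to an absolute interpreter path on the old machine will still be broken after the copy.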

Usually I use virtualenv to create a new environment, then I go to the environment where I want to copy from, copy all the folders and paste it into the environment folder I just created, but most importantly when asking if you want to replace the Destination files, choose to skip these files. This way you keep your settings.
At least for me, this has worked very well.
I hope it works for you too.

Here is my experience.
Suppose the other PC does not have Python installed.
Python version: 3.7.3
Platform: Windows 10, 7 (64bit)
The following works for me.
Steps:
download Windows embeddable zip file
download get-pip.py (because the embeddable zip file does not provide pip)
[Optional] install tkinter, see this article: Python embeddable zip: install Tkinter
Choose a packaging method (I use NSIS: https://nsis.sourceforge.io/Download)
Folder structure:
- main.nsi
- InstallData
  - contains: Step1 & Step2
- YourSitePackages # I mean the packages you do not intend to publish to PyPI.
  - LICENSE
  - MANIFEST.in
  - README.rst
  - ...
  - requirements.txt
  - setup.py
The abbreviated content about main.nsi is as follows:
!define InstallDirPath "$PROGRAMFILES\ENV_PYTHON37_X64"
!define EnvScriptsPath "${InstallDirPath}\Scripts"
...
CreateDirectory "${InstallDirPath}" # Make sure the directory exists before the writing of Uninstaller. Otherwise, it may not write correctly!
SetOutPath "${InstallDirPath}"
SetOverwrite on
File /nonfatal /r "InstallData\*.*"
SetOutPath "${InstallDirPath}\temp"
SetOverwrite on
File /nonfatal /r "YourSitePackages\*.*"
nsExec::ExecToStack '"${InstallDirPath}\python.exe" "${InstallDirPath}\get-pip.py"' # install pip
nsExec::ExecToStack '"${InstallDirPath}\Scripts\pip.exe" install "${InstallDirPath}\temp\."' # install your library. same as: `pip install .`
RMDir /r "${InstallDirPath}\temp" # remove source folder.
...
/*
Push ${EnvScriptsPath} # Be Careful about the length of the HKLM.Path. it is recommended to write it to the HKCU.Path, it is difficult for the user path to exceed the length limit
Call AddToPath # https://nsis.sourceforge.io/Path_Manipulation
*/
hope someone will benefit from this.

Finish installation of PyMySQL on Mac running Yosemite

I am trying to create a local development space on my laptop, running Apache-MySQL-Python. I have each component installed, but am having difficulty connecting Python to MySQL. I have used these instructions, including installing pip and PyMySQL: https://github.com/PyMySQL/PyMySQL#installation
When I get to the part that says to enter this:
$ cp .travis.databases.json pymysql/tests/databases.json
I get this:
$ cp .travis.databases.json pymysql/tests/databases.json
cp: .travis.databases.json: No such file or directory
I can't locate the .travis.databases.json file (I have hidden files showing), even though my $PATH is:
/Library/Frameworks/Python.framework/Versions/3.4/bin:/opt/local/bin:/opt/local/sbin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:
Is my path wrong, or is there something else I'm missing? If it makes a difference, all of my tools (installers, pkgs, etc.) are in a folder on my desktop. Apache Server is up and running, too.

Did you install it from source or using pip? .travis.databases.json is used only for running the test suite and if you're building from source then it is here in the git repo. If you installed it using pip then you'll want to copy that file locally.

Error while loading shared libraries: libreadline.so.5:

I'm trying to run the command sudo pip install --upgrade virtualenv, but I keep receiving the following error:
/opt/bitnami/python/bin/.python2.7.bin: error while loading shared libraries:
libreadline.so.5: cannot open shared object file: No such file or directory
I've attempted to use the recommendation in the linked question (Bitnami - /opt/bitnami/python/bin/.python2.7.bin: error while loading shared libraries: libreadline.so.5), but it was not helpful.
Why do I receive the error?

I figured this out.
You have to be at root level, which you reach by issuing the sudo su command.
Now, while at root level, run the following command: . /opt/bitnami/scripts/setenv.sh
I'm logging into my server using SSH, and apparently I have to follow the same steps every session.

Installing virtualenv using pip installs it in the Bitnami stack,
so to use virtualenv we need to execute the setenv.sh shell script.
This script sets up the environment for virtualenv, but we need to run it every time.
It is therefore better to install virtualenv system-wide using sudo apt-get:
sudo apt-get install python-virtualenv

So while maplesyrup's answer is good, I have found a solution that works better in practice.
Run sudo echo '. /opt/bitnami/scripts/setenv.sh' >> /opt/bitnami/.bitnamirc
This will append the script call in maplesyrup's answer, but then it will be called at every logon. The only downside is you have to enter your password immediately after logging in through ssh, but it is much better than having to manually call the script each time you login.

The required file is not in the library directory. This usually happens because an update replaced that version of the file with a newer one (e.g. libreadline.so.5 replaced by libreadline.so.8). To fix this, first check the library directory (/usr/lib) to see whether another version of the file exists; if it does, you can create a link to that file, named after the file that is missing.
The following example creates a link named after the missing file (libreadline.so.5) that points to libreadline.so.8. But be CAREFUL, because your terminal might be unable to accept input if the libreadline.so.* that the link points to is lost.
cd /usr/lib
ln -sf libreadline.so.8 -T libreadline.so.5
this solution works for me.
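Before creating the link, it is worth listing which versions of the library are actually present. Here is a minimal sketch of that check; the function name find_library_versions and the default directory list are assumptions for this example (many distributions keep libraries in other folders, such as an architecture-specific subfolder of /usr/lib), and it has to be run with a Python interpreter that starts correctly.

```python
from pathlib import Path


def find_library_versions(stem="libreadline.so", directories=("/usr/lib", "/lib")):
    """List files whose names start with `stem` in the given directories."""
    found = []
    for directory in directories:
        base = Path(directory)
        if base.is_dir():
            found.extend(sorted(str(p) for p in base.glob(stem + "*")))
    return found


if __name__ == "__main__":
    for path in find_library_versions():
        print(path)
```

An empty result means no version was found in the directories that were searched, in which case a link would have nothing to point to.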

How to install Flask on Windows?

I have a project in which I need to create a webpage that displays the latest weather from my CSV file.
I would like some details on how to do it (I don't really get the http://flask.pocoo.org/docs/installation/#installation installation setup).
Could anyone explain how to do it simply?
Thanks.
I'm running on Windows 7, with the Windows Powershell.

Install pip as described here: How do I install pip on Windows?
Then do
pip install flask
That installation tutorial is a bit misleading, it refers to actually running it in a production environment.

First install flask using pip,
pip install Flask
* If pip is not installed then install pip
Then copy below program (hello.py)
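from flask import Flask
app = Flask(__name__)
# Register hello() as the handler for the site root
@app.route("/")
def hello():
    return "Hello World!"
# Start the development server when the file is run directly
if __name__ == "__main__":
    app.run()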
Now, run the program
python hello.py
Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
Just copy paste the above address line in your browser.
Reference: http://flask.pocoo.org/
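For the CSV part of the question, a rough sketch of reading the latest row and turning it into HTML could look like the following. It uses only the standard library; the function names are made up, and it assumes the CSV has a header row and that the newest reading is the last row.

```python
import csv
from html import escape


def latest_row(csv_path):
    """Return the last data row of the CSV as a dict, or None if it has no rows."""
    last = None
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            last = row
    return last


def render_row(row):
    """Turn a row dict into a small HTML list, escaping keys and values."""
    if row is None:
        return "<p>No data yet.</p>"
    items = "".join(
        f"<li>{escape(str(key))}: {escape(str(value))}</li>"
        for key, value in row.items()
    )
    return f"<ul>{items}</ul>"
```

Inside hello.py, the hello() function could then return render_row(latest_row("weather.csv")) instead of the fixed string, where weather.csv is a placeholder for your own file.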

Assuming you are a PyCharm user, it's pretty easy to install Flask.
This will also help users without shell pip access.
Open Settings (Ctrl+Alt+S) >>
Go to Project Interpreter >>
Double-click pip >> Search for flask
Select it and click Install Package (check Install to site users if intending to use Flask for this project alone)
Done!!!
Cases in which flask is not shown in pip:
Open Manage Repository>>
Add(+) >> Add this following url
https://www.palletsprojects.com/p/flask/
Now back to pip, it will show related packages of flask,
select flask>>
install package>>
Voila!!!

https://www.youtube.com/watch?v=QjtW-wnXlUY&t=38s
Follow as in the url
This is how i do :
1) create an app.py in Sublime Text or Pycharm, or whatever the ide, and in that app.py have this code
from flask import Flask
app = Flask(__name__)
# Register helloWorld() as the handler for the site root
@app.route('/')
def helloWorld():
    return '<h1>Hello!</h1>'
This is a very basic program that prints a hello, to test that Flask is working. I would advise creating app.py in a new folder, then navigating to that folder in the command prompt.
Type these commands in the command prompt:
>py -m venv env
>env\Scripts\activate
>pip install flask
Then
>set FLASK_APP=app.py
>flask run
Then press Enter and everything should work.
The name of my file is app.py; use the name of your own file in the line
set FLASK_APP=app.py
Also, if your Python path is not set: on Windows, Python is in the AppData folder, which is hidden, so you first have to make it visible and then set the correct path under environment variables. This is how you reach the environment variables:
Control panel ->> system and security ->> system ->> advanced system setting
Then in system properties you get environment variables
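To confirm that Flask ended up in the interpreter you are actually running, a quick check like the sketch below can help. It uses importlib.util.find_spec from the standard library; the helper name is_installed is made up for this example.

```python
import importlib.util
import sys


def is_installed(module_name):
    """True if the current interpreter can find `module_name`."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("flask installed:", is_installed("flask"))
```

If this prints False while pip reports Flask as installed, pip and python most likely belong to different installations or to a different virtual environment.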

On Windows, installation of easy_install is a little bit trickier, but still quite easy. The easiest way to do it is to download the distribute_setup.py file and run it. The easiest way to run the file is to open your downloads folder and double-click on the file.
Next, add the easy_install command and other Python scripts to the command search path, by adding your Python installation’s Scripts folder to the PATH environment variable. To do that, right-click on the “Computer” icon on the Desktop or in the Start menu, and choose “Properties”. Then click on “Advanced System settings” (in Windows XP, click on the “Advanced” tab instead). Then click on the “Environment variables” button. Finally, double-click on the “Path” variable in the “System variables” section, and add the path of your Python interpreter’s Scripts folder. Be sure to delimit it from existing values with a semicolon. Assuming you are using Python 2.7 on the default path, add the following value:
;C:\Python27\Scripts
And you are done! To check that it worked, open the Command Prompt and execute easy_install. If you have User Account Control enabled on Windows Vista or Windows 7, it should prompt you for administrator privileges.
Now that you have easy_install, you can use it to install pip:
easy_install pip

First: I assumed you already have Python 2.7 or 3.4 installed.
1: In the Control Panel, open the System option (alternately, you can right-click on My Computer and select Properties). Select the “Advanced system settings” link.
In the System Properties dialog, click “Environment Variables”.
In the Environment Variables dialog, click the New button underneath the “System variables” section.
If the above does not work for you, append C:\Python27 and C:\Python27\Scripts to your PATH; then it should work.
Run this command (Windows cmd terminal): pip install virtualenv
If you already have pip, you can upgrade them by running:
pip install --upgrade pip setuptools
Create your project. Then, run virtualenv flask

Here's a step-by-step procedure (assuming you've already installed Python):
First install Chocolatey:
Open a terminal (Run as Administrator) and type in the command line:
C:/> @powershell -NoProfile -ExecutionPolicy Bypass -Command "iex ((new-object net.webclient).DownloadString('https://chocolatey.org/install.ps1'))" && SET PATH=%PATH%;%ALLUSERSPROFILE%\chocolatey\bin
it will take some time to get chocolatey installed on your machine. sit back n relax...
now install pip. type in terminal
cinst easy.install pip
now type in terminal:
pip install flask
YOU'RE DONE !!!
Tested on Win 8.1 with Python 2.7

I have Windows 10 and Python 3.5. @uku's answer is correct. However, the problem I faced was finding where the Python scripts that need to be added to the environment variable are located. I found that we need to add
C:\Users\\AppData\Local\Programs\Python\Python35\Scripts
the above location to the environment variable. If it still does not work, search for python on the C drive and find the Scripts location.

If you are using Windows, go to the Python installation path and run pip from there, like this:
D:\Python37\Scripts>pip install Flask
It takes a moment to download the package.

If you are a PyCharm user, it's easy to install Flask.
First open PyCharm, then:
Open Settings (Ctrl+Alt+S)
Go to Project Interpreter
Double-click pip >>
In the search bar (top of the page), search for flask and click Install Package
In cases where flask is not shown in pip: Open Manage Repository >> Add (+) >> add the following URL
https://www.palletsprojects.com/p/flask/
Now, back in pip, it will show packages related to flask,
select flask>>
install package
