Google Drive doesn't recognize my uploaded xes file - python

I tried to load an XES file in Google Colab, and for a new dataset it just doesn't work.
Google Drive is mounted, the new file is stored in exactly the same folder as the old one, and I only switched the path.
Here is the code I used:
!pip install pm4py
import pm4py
from pm4py.objects.log.importer.xes import importer as xes_importer
from google.colab import drive
drive.mount('/content/gdrive')
log = xes_importer.apply('/content/gdrive/MyDrive/BPI Challenge 2017.xes.gz')
For another file stored in the same folder, the code worked. I tried it with both the .gz and the .xes file and it's the same: it works for two other datasets I already had, but not for several new ones I tried.
Now I get the following error message:
File "<string>", line unknown
SyntaxError: invalid or missing encoding declaration for '/tmp/tmp0rajagivgz'
I tried unpacking the file (from .gz to .xes) before uploading it, but that doesn't work either; only the file name in the error message changes.
I assume the problem is that Google Drive treats the file as a zip archive even though the file I originally uploaded isn't one. On my desktop, both datasets (the old one and the new one after unpacking) are recognized as XES.
I don't know why Drive doesn't upload the new dataset as the file that it is.
Do you have any recommendations?
Any explanation would be helpful.
Best,
Linda
Edit: added the code
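For reference, a minimal sketch of how to check what actually landed on Drive (it reuses the path from the code above; the first bytes tell you whether the file is a real gzip stream, a ZIP archive, plain XML/XES, or possibly an incomplete upload):

path = '/content/gdrive/MyDrive/BPI Challenge 2017.xes.gz'
with open(path, 'rb') as f:
    head = f.read(4)
if head[:2] == b'\x1f\x8b':
    print('gzip stream - the .xes.gz importer should handle it')
elif head[:2] == b'PK':
    print('ZIP archive - not a plain gzip stream')
elif head.startswith(b'<?xm') or head.startswith(b'<log'):
    print('uncompressed XML/XES')
else:
    print('unexpected header (possibly an incomplete upload):', head)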

Related

How to import python files in Google Colab

So I have been trying to make this file compatible with Google Colab, but I'm not able to find any way to do it.
EfficientDet-DeepSORT-Tracker is the main folder of this entire package
This picture is from one of the files placed alongside backbone.py
How can I fix the fact that the file isn't able to detect backbone.py?
EDIT for more context: I shared the errors I got when trying to run waymo_open_dataset.py, which isn't able to detect the other .py files alongside it.
According to this past question, you can write import filename for filename.py. So in the main.py file you are trying to run in Colab, import the required files at the top of main.py, as in the sketch below.
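A minimal sketch of that idea, assuming the repository was uploaded or cloned to /content/EfficientDet-DeepSORT-Tracker (the folder name from the post):

import sys
# make the repository folder importable; the path is an assumption based on the post
sys.path.append('/content/EfficientDet-DeepSORT-Tracker')
import backbone  # backbone.py sitting in that folder now resolves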

Is it safe to save files in python library?

I wanted to use Python turtle and the code didn't work until I saved the file in the Python library folder (before that it gave an error saying 'turtle' is not recognised).
When I saved the file in the library, I saved it as "hello23.py", but it got saved in the library as "pytube-pytube-v12.0.0-0-ge85.py".
It was saved under the same name as the pytube file ("pytube" is the package used to download YouTube videos with Python). Why is that?
I then saved another file by creating a new folder in the library, and it got saved fine.
My OS is Mac. Is it safe to save files in the library? I've heard you shouldn't save user files in the library.
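For what it's worth, one common cause of a "'turtle' is not recognised" error is a script that shadows the standard-library module (e.g. a file named turtle.py in the same folder), not where the file is saved. A minimal sketch that should run from any normal user folder, assuming a standard Python install:

# save this anywhere (e.g. hello23.py in Documents), just don't name it turtle.py,
# otherwise the script shadows the standard-library turtle module
import turtle

t = turtle.Turtle()
t.forward(100)
turtle.done()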

Unzip zip file on Google Colab

I am trying to unzip my zip file on Google Colab, but it is not working for me. Below are the details.
Command used
!unzip "drive/MyDrive/Dog Vision2/dog-breed-identification.zip" -d "drive/MyDrive/Dog Vision2"
Error Message
Archive: drive/MyDrive/Dog Vision2/dog-breed-identification.zip
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of drive/MyDrive/Dog Vision2/dog-breed-identification.zip or
drive/MyDrive/Dog Vision2/dog-breed-identification.zip.zip, and cannot find drive/MyDrive/Dog Vision2/dog-breed-identification.zip.ZIP, period.
Note: I am able to unzip the same file on my local machine.
My bad, I just saw that my zip file was not fully uploaded.
For others: please re-check whether the file has finished uploading to Drive before running the unzip command. Please find below a screenshot showing where you can check it.
Note: at the bottom of the left panel you can see whether the file has uploaded successfully or not. The red highlight is the zip that was not uploaded successfully; the green highlight is the zip whose upload is still in progress.
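A small sketch of how to check this programmatically before extracting (the path is the one from the question); zipfile.is_zipfile returns False while the archive is still incomplete, because the end-of-central-directory record is missing:

import os, zipfile
zip_path = 'drive/MyDrive/Dog Vision2/dog-breed-identification.zip'
print(os.path.getsize(zip_path), 'bytes on Drive')    # compare against the size of the local copy
print('complete zip:', zipfile.is_zipfile(zip_path))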

Files downloaded from Dropbox API come as a Zip

I am trying to download a file from my Dropbox account, however, on Linux (Raspbian) when I execute the line:
dbx = dropbox.Dropbox(TOKEN)
dbx.files_download_to_file(LOCAL_PATH,r'/file.ppsx')
It is downloaded as a zip. I do not have this problem executing the code on Windows. I'd like to note the file is a .ppsx, a PowerPoint presentation file. I have no problem downloading it manually from Dropbox. My question is, how can I circumvent this problem and download it unzipped?
It turned out that Dropbox did not actually send the file as a zip; rather, the file ended up saved under the wrong name (the name of the directory it was downloaded to). I circumvented the problem by using os.rename. This solved it and allowed me to open the file within the same script.
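A rough sketch of that workaround (TOKEN is a placeholder as in the question, and the local paths are made-up examples):

import os
import dropbox

dbx = dropbox.Dropbox(TOKEN)
local_path = '/home/pi/file_download'        # hypothetical download target
dbx.files_download_to_file(local_path, r'/file.ppsx')

# if the file arrives under an unexpected name, rename it to the .ppsx name you want
os.rename(local_path, '/home/pi/file.ppsx')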

How to upload and save large data to Google Colaboratory from local drive?

I have downloaded large image training data as zip from this Kaggle link
https://www.kaggle.com/c/yelp-restaurant-photo-classification/data
How do I efficiently achieve the following?
Create a project folder in Google Colaboratory
Upload zip file to project folder
unzip the files
Thanks
EDIT: I tried the code below but it's crashing for my large zip file. Is there a better/more efficient way to do this where I can just specify the location of the file on my local drive?
from google.colab import files
uploaded = files.upload()
for fn in uploaded.keys():
    print('User uploaded file "{name}" with length {length} bytes'.format(
        name=fn, length=len(uploaded[fn])))
!pip install kaggle
api_token = {"username":"USERNAME","key":"API_KEY"}
import json
import zipfile
import os
with open('/content/.kaggle/kaggle.json', 'w') as file:
json.dump(api_token, file)
!chmod 600 /content/.kaggle/kaggle.json
!kaggle config set -n path -v /content
!kaggle competitions download -c jigsaw-toxic-comment-classification-challenge
os.chdir('/content/competitions/jigsaw-toxic-comment-classification-challenge')
for file in os.listdir():
zip_ref = zipfile.ZipFile(file, 'r')
zip_ref.extractall()
zip_ref.close()
There is a minor change on line 9, without which I was encountering an error.
source: https://gist.github.com/jayspeidell/d10b84b8d3da52df723beacc5b15cb27
I couldn't add this as a comment because of the reputation requirement.
You may refer to these threads:
Import data into Google Colaboratory
Load local data files to Colaboratory
Also check out the I/O example notebook. For example, for access to .xls files, you'll want to upload the file to Google Sheets first. Then you can use the gspread recipes in the same I/O example notebook.
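One way that gspread recipe typically looks in Colab (a sketch; the spreadsheet name is a placeholder and the auth snippet may differ slightly from the notebook's current version):

from google.colab import auth
auth.authenticate_user()

import gspread
from google.auth import default
creds, _ = default()
gc = gspread.authorize(creds)

worksheet = gc.open('my_spreadsheet').sheet1   # 'my_spreadsheet' is a placeholder title
rows = worksheet.get_all_values()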
You may need to use the kaggle-cli module to help with the download.
It’s discussed in this fast.ai thread.
I just wrote this script that downloads and extracts data from the Kaggle API to a Colab notebook. You just need to paste in your username, API key, and competition name.
https://gist.github.com/jayspeidell/d10b84b8d3da52df723beacc5b15cb27
The manual upload function in Colab is kind of buggy now, and it's better to download files via wget or an API service anyway because you start with a fresh VM each time you open the notebook. This way the data will download automatically.
Another option is to upload the data to Dropbox (if it fits) and get a download link. Then in the notebook do
!wget link -O new-name && ls
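A hedged example of that approach (the share URL is a placeholder; note the capital -O for the output name, and that Dropbox share links usually need dl=1 instead of dl=0 to return the raw file):

!wget -O data.zip "https://www.dropbox.com/s/<file-id>/data.zip?dl=1"
!unzip -q data.zip && ls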
