How to read the string and long features in tensorflow - python

In TensorFlow I can't read string or long values; it seems only short float values are allowed. Why?
import tensorflow as tf
import numpy as np
# Data sets
IRIS_TRAINING = "seRelFeatures.csv"
IRIS_TEST = "seRelFeatures.csv"
# Load datasets.
training_set = tf.contrib.learn.datasets.base.load_csv(filename=IRIS_TRAINING, target_dtype=np.int)
test_set = tf.contrib.learn.datasets.base.load_csv(filename=IRIS_TEST, target_dtype=np.int)
Here is the error:
/home/xuejiao/anaconda2/bin/python /home/xuejiao/Desktop/HDSO_DirectAnswer/training_testing/dnn_semiSuper.py
Traceback (most recent call last):
  File "/home/xuejiao/Desktop/HDSO_DirectAnswer/training_testing/dnn_semiSuper.py", line 9, in <module>
    training_set = tf.contrib.learn.datasets.base.load_csv(filename=IRIS_TRAINING, target_dtype=np.int)
  File "/home/xuejiao/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py", line 47, in load_csv
    target[i] = np.asarray(ir.pop(target_column), dtype=target_dtype)
  File "/home/xuejiao/anaconda2/lib/python2.7/site-packages/numpy/core/numeric.py", line 482, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: invalid literal for long() with base 10: ''
Process finished with exit code 1

Your error is ValueError: invalid literal for long() with base 10: ''. It simply means that an empty string is being passed where an integer (or the string representation of an integer) is expected. I'd check the data in the CSV files.

Actually, I solved this problem myself. The error
ValueError: invalid literal for long() with base 10: ''
means that the file has some empty cells, even though none were visible when I viewed it.
After checking, I found the cause: I had deleted the last column, but I only cleared its contents without deleting the cells themselves, so no empty cells showed up in the view.
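A quick way to locate such invisible empty cells is to scan the CSV before handing it to the loader. The following is only a minimal sketch using the standard csv module; the function name find_empty_cells and the sample data are made up for illustration and are not part of TensorFlow.

```python
import csv
import io


def find_empty_cells(csv_text):
    """Return 1-based (row_number, column_number) pairs for every empty cell."""
    empty = []
    reader = csv.reader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        for column_number, cell in enumerate(row, start=1):
            if cell.strip() == "":
                empty.append((row_number, column_number))
    return empty


# Hypothetical data: the last column was cleared but not deleted,
# so every row ends with a trailing comma and therefore an empty cell.
sample = "5.1,3.5,0,\n4.9,3.0,1,\n"
print(find_empty_cells(sample))
```

For a file on disk, the same loop can be run over csv.reader(open(path)) instead of the in-memory string.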

Related

Iterating through DataLoader (PyTorch): RuntimeError: Expected object of scalar type unsigned char but got scalar type float for sequence element 9

I am new to PyTorch and am running into an unexpected error. The overall context is trying to build a building segmentation model from SpaceNet imagery. I forked this repo from someone at Microsoft AI who built a segmentation model, and I am just trying to re-run her training scripts.
I've been able to download the data, and do the pre-processing. My issue comes when trying to actually train the model, I am trying to iterate through my DataLoader, and I get the following error message:
RuntimeError: Expected object of scalar type unsigned char but got
scalar type float for sequence element 9.
Snippets of code that are useful:
I have a dataset.py that creates the SpaceNetDataset class and looks like:
import os
# Ignore warnings
import warnings
import numpy as np
from PIL import Image
import torch
from torch.utils.data import Dataset
warnings.filterwarnings('ignore')

class SpaceNetDataset(Dataset):
    """Class representing a SpaceNet dataset, such as a training set."""

    def __init__(self, root_dir, splits=['trainval', 'test'], transform=None):
        """
        Args:
            root_dir (string): Directory containing folder annotations and .txt files with the
                train/val/test splits
            splits: ['trainval', 'test'] - the SpaceNet utilities code would create these two
                splits while converting the labels from polygons to mask annotations. The two
                splits are created after chipping larger images into the required input size with
                some overlaps. Thus to have splits that do not have overlapping areas, we manually
                split the images (not chips) into train/val/test using utils/split_train_val_test.py,
                followed by using the SpaceNet utilities to annotate each folder, and combine the
                trainval and test splits it creates inside each folder.
            transform (callable, optional): Optional transform to be applied
                on a sample.
        """
        self.root_dir = root_dir
        self.transform = transform
        self.image_list = []
        self.xml_list = []
        data_files = []
        for split in splits:
            with open(os.path.join(root_dir, split + '.txt')) as f:
                data_files.extend(f.read().splitlines())
        for line in data_files:
            line = line.split(' ')
            image_name = line[0].split('/')[-1]
            xml_name = line[1].split('/')[-1]
            self.image_list.append(image_name)
            self.xml_list.append(xml_name)

    def __len__(self):
        return len(self.image_list)

    def __getitem__(self, idx):
        img_path = os.path.join(self.root_dir, 'RGB-PanSharpen', self.image_list[idx])
        target_path = os.path.join(self.root_dir, 'annotations', self.image_list[idx].replace('.tif', 'segcls.tif'))
        image = np.array(Image.open(img_path))
        target = np.array(Image.open(target_path))
        target[target == 100] = 1  # building interior
        target[target == 255] = 2  # border
        sample = {'image': image, 'target': target, 'image_name': self.image_list[idx]}
        if self.transform:
            sample = self.transform(sample)
        return sample
To create the DataLoader, I have something like:
dset_train = SpaceNetDataset(data_path_train, split_tags, transform=T.Compose([ToTensor()]))
loader_train = DataLoader(dset_train, batch_size=train_batch_size, shuffle=True,
num_workers=num_workers)
I then iterate over the data loader by doing something like:
for batch in loader_train:
    image_tensors = batch['image']
    images = batch['image'].cpu().numpy()
    break  # take the first shuffled batch
but then I get the error:
Traceback (most recent call last):
  File "training/train_aml.py", line 137, in <module>
    sample_images_train, sample_images_train_tensors = get_sample_images(which_set='train')
  File "training/train_aml.py", line 123, in get_sample_images
    for i, batch in enumerate(loader):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.6/dist-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 74, in default_collate
    return {key: default_collate([d[key] for d in batch]) for key in elem}
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 74, in <dictcomp>
    return {key: default_collate([d[key] for d in batch]) for key in elem}
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/collate.py", line 55, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: Expected object of scalar type unsigned char but got scalar type float for sequence element 9.
The error seems quite similar to this one, although I did try a similar solution by casting:
dtype = torch.cuda.CharTensor if torch.cuda.is_available() else torch.CharTensor
for batch in loader:
    batch['image'] = batch['image'].type(dtype)
    batch['target'] = batch['target'].type(dtype)
but I end up with the same error.
A couple of other things that are weird:
This seems to be non-deterministic. Most of the time I get this error, but sometimes the code keeps running (not sure why).
The "sequence element" number at the end of the error message keeps changing. In this case it was "sequence element 9", sometimes it's "sequence element 2", etc. Not sure why.
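Ah, never mind.
It turns out that unsigned char comes from C++, where it holds values from 0 to 255, so it makes sense that this is what is expected for image data. The traceback ends in torch.stack inside default_collate, which apparently needs every sample in the batch to have the same dtype, so the error appears whenever a batch mixes unsigned char and float arrays.
So I actually fixed this by casting both arrays to a single integer dtype:
image = np.array(Image.open(img_path)).astype(int)  # np.int was only an alias for the builtin int
target = np.array(Image.open(target_path)).astype(int)
inside the SpaceNetDataset class, and it seemed to work!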
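To see why the element number keeps changing and why the error sometimes disappears, it can help to check the dtype of each sample before batching. With shuffle=True the odd sample lands at a different position in each run, or not in the first batch at all. The sketch below uses plain NumPy arrays as stand-ins for the loaded images; the helper name mismatched_dtypes and the tiny fake batch are assumptions for illustration, not part of PyTorch or the original repo.

```python
import numpy as np


def mismatched_dtypes(samples):
    """Return (index, dtype) for every sample whose dtype differs from the first one."""
    expected = samples[0].dtype
    return [(i, s.dtype) for i, s in enumerate(samples) if s.dtype != expected]


# Hypothetical batch: three uint8 images and one float image at position 2.
batch = [np.zeros((2, 2), dtype=np.uint8) for _ in range(4)]
batch[2] = batch[2].astype(np.float32)

print(mismatched_dtypes(batch))

# Casting every sample to one dtype, as the fix above does, removes the mismatch.
uniform = [s.astype(np.int64) for s in batch]
print(mismatched_dtypes(uniform))
```

Casting after the DataLoader, as in the question's first attempt, cannot help because the failure happens earlier, while the batch is being assembled.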

ValueError: invalid literal for int() with base 10: '5df1170a921ee283d8529aa3'

I am trying to retrieve one book from a JSON collection of books. When retrieving a single book, I get this ValueError on the line marked in the code below.
Error:
show_all_books
page_num = int(request.args.get('pn'))
ValueError: invalid literal for int() with base 10: '5df1170a921ee283d8529aa3'
@app.route("/api/v1.0/books", methods=["GET"])
def show_all_books():
    page_num, page_size = 1, 10
    if request.args.get('pn'):
        page_num = int(request.args.get('pn'))  # <-- ValueError raised here
    if request.args.get('ps'):
        page_size = int(request.args.get('ps'))
    page_start = (page_size * (page_num - 1))
You are getting this error because the string contains characters that are not valid in a base-10 integer, so it cannot be converted:
>>> int('5df1170a921ee283d8529aa3')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '5df1170a921ee283d8529aa3'
If you try with a number:
>>> int('1231')
1231
So I think the problem is with the data you receive from get('pn'). The default base of the int() function is 10, but you can parse the string as hexadecimal by passing base 16:
>>> int('5df1170a921ee283d8529aa3', 16)
29073565845337865796180941475L
You are getting this error because the string passed to the function cannot be represented as an integer.
For instance, if you were to do:
int('abc')
you'd get the same error, because the string abc cannot be represented as an integer.
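Since query parameters come from the client, one defensive option is to fall back to a default instead of letting int() raise. This is only a sketch: parse_positive_int is a hypothetical helper, not a Flask function, and whether a silent fallback or an error response is the better behavior depends on the API.

```python
def parse_positive_int(raw, default):
    """Return raw as a positive int, or default if raw is missing or not a valid number."""
    if raw is None:
        return default
    try:
        value = int(raw)
    except ValueError:
        return default
    return value if value > 0 else default


print(parse_positive_int("3", 1))
print(parse_positive_int("5df1170a921ee283d8529aa3", 1))
print(parse_positive_int(None, 10))
```

In the view from the question this would be used as page_num = parse_positive_int(request.args.get('pn'), 1). The value in the error message looks more like a record identifier than a page number, so it may also be worth checking which URL the client is actually calling.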

Traceback (most recent call last) np.int8

I get the error shown below. Please help me solve it. Thanks.
Traceback (most recent call last):
  File "C:/Users/Admin/PycharmProjects/frec/part3.py", line 15, in <module>
    Training_Data.append(np.asarray(images, dtype=np.uint8))
  File "C:\Users\Admin\.virtualenvs\frec\lib\site-packages\numpy\core\numeric.py", line 538, in asarray
    return array(a, dtype, copy=False, order=order)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
I have no idea how to find a solution for it.
for i, files in enumerate(onlyfiles):
image_path = data_path + onlyfiles[i]
images = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
Training_Data.append(np.asarray(images, dtype=np.uint8))
Labels.append(i)
no idea.
Probably the indexes in the for loop.
Two issues.
1) The body of the for loop is not indented. The code as you have shown it does not loop over the last 4 lines. You need to indent them if you want them to be evaluated in the context of the loop. Otherwise, your indexes, i and files, are not defined.
for i, files in enumerate(onlyfiles):
    image_path = data_path + onlyfiles[i]
    images = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    Training_Data.append(np.asarray(images, dtype=np.uint8))
    Labels.append(i)
2) The loop unpacks two variables but uses only i. With enumerate, i is the index and files is the file name, so files and onlyfiles[i] hold the same value and one of them is redundant.
In any case, the TypeError shows that np.asarray received None, so images is None for at least one file. You need to find out which one.
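cv2.imread returns None instead of raising when it cannot read a file (for example a wrong path or a non-image file in the folder), so one way to find the culprit is to collect the paths that fail. The sketch below takes the reader as a parameter so it runs without OpenCV; load_readable and fake_reader are hypothetical names, and in real code the reader would wrap cv2.imread.

```python
def load_readable(paths, reader):
    """Split paths into successfully loaded items and the paths the reader rejected."""
    loaded, failed = [], []
    for path in paths:
        item = reader(path)
        if item is None:  # the reader signals an unreadable file by returning None
            failed.append(path)
        else:
            loaded.append(item)
    return loaded, failed


# Hypothetical stand-in for cv2.imread so the sketch runs without OpenCV:
# it "reads" .jpg files and returns None for anything else.
def fake_reader(path):
    return [[0, 0], [0, 0]] if path.endswith(".jpg") else None


loaded, failed = load_readable(["a.jpg", "Thumbs.db", "b.jpg"], fake_reader)
print(len(loaded), failed)
```

With OpenCV installed, the call would look like load_readable(paths, lambda p: cv2.imread(p, cv2.IMREAD_GRAYSCALE)), and the failed list names the files to inspect.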

Reading a single number from another file using python

I am reading a number from a file, p2.txt. This file contains only one number, which is an integer, let's say 10.
test_file = open('p2.txt', 'r')
test_lines = test_file.readlines()
test_file.close()
ferNum= test_lines[0]
print int(ferNum)
However, I am getting an error:
print int(ferNum)
ValueError: invalid literal for int() with base 10: '1.100000000000000000e+01\n'
I can see that it is treating it just as a line of text. How can I parse that number into a variable? Any suggestions? Regards
The problem is that even though the value of the number is an integer (11), it is represented in scientific notation, so you have to read it as a float first.
>>> float('1.100000000000000000e+01\n')
11.0
>>> int('1.100000000000000000e+01\n')
Traceback (most recent call last):
File "<pyshell#4>", line 1, in <module>
int('1.100000000000000000e+01\n')
ValueError: invalid literal for int() with base 10: '1.100000000000000000e+01\n'
You can of course convert first to a float then to an int after that.
>>> int(float('1.100000000000000000e+01\n'))
11
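If the file might also contain a value that is not a whole number, int(float(...)) silently truncates it. The following sketch wraps the two-step conversion and rejects non-integral values; parse_integer_text is a hypothetical helper name, and raising ValueError in that case is a design choice rather than the only option.

```python
def parse_integer_text(text):
    """Parse text such as '11', '11.0' or '1.1e+01' into an int; reject non-integral values."""
    value = float(text.strip())
    if not value.is_integer():
        raise ValueError("not an integer value: %r" % text)
    return int(value)


print(parse_integer_text("1.100000000000000000e+01\n"))
```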

ValueError: could not convert string to float: TF_008_2s

I'm very confused here. I have a code to read a csv file with 2 columns. Here is a piece of data:
1,TF_001_2s
2,TF_002_2s
3,TF_003_2s
4,TF_004_2s
5,TF_005_2s
6,TF_006_2s
7,TF_007_1s
8,TF_008_2s
9,TF_009_2s
10,TF_010_1s
What I want to do is replace each number from the first column with the name from the second column in a vector. Here is my function:
def rename_flights(self):
    index_flight = self.Names.index("Flight ID")
    with open('Flight_names.csv','rb') as typical_flights:
        typical_flights_read = csv.reader(typical_flights,delimiter=',')
        for i in range(len(self.matrix[index_flight])):
            flight_id = int(self.matrix[index_flight][i])
            for new_typical_flights in typical_flights_read:
                if flight_id == int(new_typical_flights[0]):
                    self.matrix[index_flight][i] = new_typical_flights[1]
This function takes the index of an entry in a vector (Names) and uses it to index a matrix. The matrix may contain integers, and those are the numbers I want to replace with the strings from the CSV file. I hope that is clear enough. The problem is in the last line, where I assign the string to the correct position in the matrix. It raises:
ValueError: could not convert string to float: TF_008_2s
I don't know what is going wrong, because I'm not converting any string to float. I hope you can help me.
Thanks in advance.
Edit #1:
Traceback (most recent call last):
  File "script_server.py", line 566, in <module>
    reading(name)
  File "script_server.py", line 552, in reading
    info.rename_flights()
  File "script_server.py", line 272, in rename_flights
    self.results_matrix_rounded[index_flight][i] = new_typical_flights[1]
ValueError: could not convert string to float: TF_008_2s
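One possible reading of this traceback, assuming results_matrix_rounded is a NumPy array of floats (the question does not say), is that assigning a string into it makes NumPy attempt the string-to-float conversion itself. Separately, csv.reader is an iterator, so the inner loop consumes it on the first pass and finds nothing for later values of i. A minimal sketch that avoids both issues builds a lookup dictionary once and writes the names into a new list; load_flight_names and this standalone rename_flights are hypothetical helpers, not the asker's code.

```python
import csv
import io


def load_flight_names(csv_text):
    """Build {flight number: flight name} from two-column CSV text."""
    return {int(number): name for number, name in csv.reader(io.StringIO(csv_text))}


def rename_flights(flight_ids, names):
    """Return a new list with each numeric ID replaced by its name (unknown IDs are kept)."""
    return [names.get(int(flight_id), flight_id) for flight_id in flight_ids]


names = load_flight_names("7,TF_007_1s\n8,TF_008_2s\n")
print(rename_flights([8.0, 7.0, 8.0], names))
```

Keeping the names in a separate list (or in an array with an object dtype) sidesteps the conversion, because a float array cannot store strings.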
