Machine Learning Study Notes (CNN on a Custom Dataset with PyTorch)
- Pytorch tutorial
- Creating the Dataset
- DataLoader
- Model Definition
- Training
- total evaluation
- each class evaluation
Let's try a simple two-class classification with a CNN.
Pytorch tutorial
Training a Classifier — PyTorch Tutorials 1.4.0 documentation
Transfer Learning for Computer Vision Tutorial — PyTorch Tutorials 1.4.0 documentation
Writing Custom Datasets, DataLoaders and Transforms — PyTorch Tutorials 1.4.0 documentation
Creating the Dataset
https://download.pytorch.org/tutorial/hymenoptera_data.zip
Download the ants and bees dataset from the URL above.
The directory layout is:
train
- ants
- bees
val
- ants
- bees
Prepare your own dataset in the same layout, with one subdirectory per class.
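Before handing the folders to a loader, it can help to confirm that each split contains the expected class subdirectories. The following is a minimal sketch using only the standard library. The helper name `count_images_per_class` and the set of accepted file extensions are assumptions for illustration and are not part of torchvision.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your own data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images_per_class(split_dir):
    """Return {class_name: image_count} for one split such as 'train' or 'val'.

    Each immediate subdirectory of split_dir is treated as one class.
    """
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files next to the class folders
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts


if __name__ == "__main__":
    for split in ("train", "val"):
        split_dir = Path("hymenoptera_data") / split
        if split_dir.is_dir():
            print(split, count_images_per_class(split_dir))
```

A class folder that is missing or reports zero images usually points to a misnamed directory.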
DataLoader
# -*- coding: utf-8 -*-
import torch
from torchvision import transforms, datasets

# Transforms applied to each loaded image
data_transform = transforms.Compose([
    # RandomSizedCrop was renamed to RandomResizedCrop in newer torchvision
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

# Load the training data
hymenoptera_dataset = datasets.ImageFolder(root='hymenoptera_data/train',
                                           transform=data_transform)
dataset_loader = torch.utils.data.DataLoader(hymenoptera_dataset,
                                             batch_size=4, shuffle=True,
                                             num_workers=4)

# Load the test data (note: the same random augmentation is applied here)
hymenoptera_testset = datasets.ImageFolder(root='hymenoptera_data/val',
                                           transform=data_transform)
dataset_testloader = torch.utils.data.DataLoader(hymenoptera_testset,
                                                 batch_size=4, shuffle=False,
                                                 num_workers=4)

classes = ('ants', 'bees')
Each batch has the following shape:
for i, data in enumerate(dataset_loader, 0):
    inputs, labels = data
    print(inputs.size())
    print(labels.size())
    break  # one batch is enough to check the shape
->
torch.Size([4, 3, 224, 224])
torch.Size([4])
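The `Normalize` step in the transform above applies `(x - mean) / std` to each channel, after `ToTensor` has scaled pixel values to the range [0, 1]. Below is a minimal pure-Python sketch of that arithmetic for a single RGB pixel. The helper name `normalize_pixel` is hypothetical and only mirrors the per-channel formula.

```python
# Per-channel statistics used in the transform above
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Apply (value - mean) / std to each channel of one RGB pixel in [0, 1]."""
    return [(value - m) / s for value, m, s in zip(rgb, mean, std)]


# A pixel equal to the channel means maps to zero in every channel.
print(normalize_pixel([0.485, 0.456, 0.406]))
# A pure white pixel maps to a positive value in every channel.
print(normalize_pixel([1.0, 1.0, 1.0]))
```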
Using custom transforms
class Crop(object):
    """Crop the image.

    Args:
        left_up (tuple): Desired crop left up position.
        right_down (tuple): Desired crop right down position.
    """

    def __init__(self, left_up, right_down):
        self._left_up = left_up
        self._right_down = right_down

    def __call__(self, img):
        # PIL's crop takes a (left, upper, right, lower) box
        image = img.crop((self._left_up[0], self._left_up[1],
                          self._right_down[0], self._right_down[1]))
        return image
It can be used as follows:
data_transform = transforms.Compose([
    # transforms.Scale was renamed to transforms.Resize in newer torchvision
    transforms.Resize(224),
    Crop((50, 100), (100, 200)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
])
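The `Crop` transform passes a `(left, upper, right, lower)` box to PIL's `Image.crop`, so the output size is the difference between the two corners. A small sketch of that arithmetic follows. The helper `crop_output_size` is hypothetical, and rejecting empty boxes is a choice made here for illustration, not PIL's own behavior.

```python
def crop_output_size(left_up, right_down):
    """Return (width, height) of the region between two (x, y) corners."""
    width = right_down[0] - left_up[0]
    height = right_down[1] - left_up[1]
    if width <= 0 or height <= 0:
        # Assumption for this sketch: treat an empty or inverted box as an error.
        raise ValueError("right_down must be below and to the right of left_up")
    return width, height


print(crop_output_size((50, 100), (100, 200)))  # (50, 100)
```

With the corners used above, every image comes out 50 pixels wide and 100 pixels high, so a network whose fully connected layer assumes 224 x 224 inputs would likely need its input size adjusted.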
Using PIL
PyTorch (torchvision) uses PIL to load and process images.
Model Definition
# Variable is only needed for pre-0.4 style code; newer PyTorch uses tensors directly
from torch.autograd import Variable
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 53 * 53, 120)  # (((224 - 4) / 2 ) - 4) / 2 = 53
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 2)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 53 * 53)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = Net()
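The `16 * 53 * 53` input size of `fc1` follows from the usual output-size formula for convolution and pooling layers with dilation 1: `floor((size + 2 * padding - kernel) / stride) + 1`. The sketch below traces one side of a 224-pixel input through the two conv/pool stages. The helper names are hypothetical, and the kernel sizes are taken from the `Net` definition above.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Output length of one spatial dimension for a conv or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def feature_map_side(input_side):
    """Trace one spatial side through conv1 -> pool -> conv2 -> pool."""
    side = conv_output_size(input_side, kernel=5)       # conv1: 5x5, stride 1
    side = conv_output_size(side, kernel=2, stride=2)   # pool: 2x2, stride 2
    side = conv_output_size(side, kernel=5)             # conv2: 5x5, stride 1
    side = conv_output_size(side, kernel=2, stride=2)   # pool: 2x2, stride 2
    return side


side = feature_map_side(224)
print(side, 16 * side * side)  # 53 44944
```

Changing the crop size in the transform changes this value, so the `fc1` input size has to be recomputed for any other input resolution.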
Training
import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

for epoch in range(1):  # increase the range to loop over the dataset multiple times
    running_loss = 0.0
    for i, data in enumerate(dataset_loader, 0):
        # get the inputs (wrapping in Variable is unnecessary since PyTorch 0.4)
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics; loss.item() replaces the older loss.data[0]
        running_loss += loss.item()
        if i % 10 == 9:  # print every 10 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 10))
            running_loss = 0.0

print('Finished Training')
total evaluation
correct = 0
total = 0
with torch.no_grad():  # gradients are not needed for evaluation
    for data in dataset_testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        # .item() converts the 0-dim tensor to a Python number
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the test images: %d %%' % (
    100 * correct / total))
each class evaluation
class_correct = [0.0] * 2
class_total = [0.0] * 2
with torch.no_grad():  # gradients are not needed for evaluation
    for data in dataset_testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels)
        # Use the actual batch size; the last batch may hold fewer than 4 samples
        for i in range(labels.size(0)):
            label = labels[i].item()
            class_correct[label] += c[i].item()
            class_total[label] += 1

for i in range(2):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))
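The per-class bookkeeping above does not depend on PyTorch. Below is a minimal sketch with plain Python lists, assuming predictions and labels are integer class indices. The helper `per_class_accuracy` and the sample values are hypothetical and are not results from the ants and bees dataset.

```python
def per_class_accuracy(predictions, labels, num_classes):
    """Return a list of accuracies, one per class index.

    A class with no samples gets None instead of raising ZeroDivisionError.
    """
    correct = [0] * num_classes
    total = [0] * num_classes
    for pred, label in zip(predictions, labels):
        total[label] += 1
        if pred == label:
            correct[label] += 1
    return [c / t if t else None for c, t in zip(correct, total)]


classes = ("ants", "bees")
# Made-up predictions and labels, for illustration only
accuracies = per_class_accuracy([0, 1, 1, 0, 1], [0, 1, 0, 0, 1], len(classes))
for name, acc in zip(classes, accuracies):
    print("Accuracy of %5s : %s" % (name, acc))
```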