Introduction
For a detailed introduction to machine learning, we refer to the lectures of Nando de Freitas (University of Oxford). Lecture notes are available on his homepage, and the lectures are available on YouTube. A brief, illustrative example of how convolutional neural networks (CNNs) work is given in Brandon Rohrer's 'Data Science and Robots' blog, which uses two simple image categories (X and O images). The figure below illustrates how an exemplary CNN solves this task.
Our CNN has three convolutional layers (16, 32 and 32 filters) with pooling and activation layers followed by a fully connected layer and softmax classification. Exemplary filters and feature maps for an example of each category are plotted. While the first convolution layer detects simple features (e.g. edges) and has 2D weight matrices, higher convolutional layers combine multiple (lower level) features at different spatial positions (illustrated by red lines in the figure) and have 3D weight matrices. This hierarchy of feature detection is the core of CNN function.
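The difference between 2D and 3D weight matrices can be made concrete with a small Python sketch. The filter counts (16, 32, 32) are taken from the text; the 3x3 kernel size and the single-channel (grayscale) input are assumptions chosen for illustration only, since the text does not state them.

```python
# Hypothetical settings: the text gives only the filter counts (16, 32, 32).
KERNEL_SIZE = 3                   # assumed kernel height and width
INPUT_CHANNELS = 1                # assumed grayscale X/O images
FILTERS_PER_LAYER = [16, 32, 32]  # from the text


def conv_weight_shapes(input_channels, filters_per_layer, kernel_size):
    """Return one (height, width, in_channels, n_filters) tuple per layer."""
    shapes = []
    in_channels = input_channels
    for n_filters in filters_per_layer:
        shapes.append((kernel_size, kernel_size, in_channels, n_filters))
        # The feature maps produced by this layer become the next layer's input channels.
        in_channels = n_filters
    return shapes


shapes = conv_weight_shapes(INPUT_CHANNELS, FILTERS_PER_LAYER, KERNEL_SIZE)
for layer, (h, w, c, n) in enumerate(shapes, start=1):
    kind = "2D" if c == 1 else "3D"
    print(f"conv{layer}: {n} filters, each {h}x{w}x{c} ({kind} weights)")
```

Under these assumptions, each first-layer filter spans a single input channel, while each filter in the higher layers spans all feature maps of the layer below, which is what lets it combine several lower-level features.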
Training a CNN in Matlab yields a net object, but changing the network structure in Matlab is fairly difficult, so I would like to train a model with Caffe and then import that model into Matlab, since most of us work in Matlab. How can this be done? Caffe: a fast open framework for deep learning.
CNNs with Matlab
The example figures above were generated with Matlab. Convolutional Neural Networks were introduced in the Neural Network Toolbox in Matlab R2016a (see e.g. the webinar on CNNs with Matlab).
Here is our corresponding Matlab code for training the CNN and image classification. The RAW circle and cross image files are available here. The code is also available on GitHub.
CNNs with Caffe
The Caffe framework offers more flexible CNN architectures than Matlab and is highly optimized for speed (CUDA and cuDNN support). It is developed by the Berkeley Vision and Learning Center (BVLC) and is released under the BSD 2-Clause license. Nvidia Digits is based on Caffe and can be used as a GUI and a convenient interface for multi-GPU systems. For demonstration purposes, we also implemented the 'X' and 'O' example from above in Caffe.
First, the network structure has to be defined (ANet_3conv.prototxt). Solver instructions are given in a separate file (ANet_3conv_solver.prototxt). To test single input images, we wrote this Python program (ClassifySingleImagesWithPython.py). Note that you need a deploy file (ANet_3conv_deploy.prototxt) as described here.
This ZIP archive contains the corresponding Caffe code for training the CNN and image classification and the RAW circle and cross images (HDF5 and List format). The code is also available on GitHub.
CNNs with TensorFlow
The TensorFlow framework for machine learning also offers flexible CNN architectures and is optimized for speed. TensorFlow is developed by Google and is published under the Apache 2.0 open source license. For demonstration purposes, we also implemented the 'X' and 'O' example from above in TensorFlow.
You just need the following two Python files: TensorFlow_XO_example_2-categories.py and TensorFlow_XO_dataReadIn.py. The RAW circle and cross image files are available here.
Acknowledgment
Dr. Alexander Hanuschkin gratefully acknowledges the support of NVIDIA Corporation for our research.
CS 2770: Homework 1 (Matlab Version)
Due: 2/9/2017, 11:59pm
In this homework assignment, you will use a deep network to perform image categorization. You will first use a pretrained network (trained on a different problem) to extract features. You will then use these features to train an SVM classifier which discriminates between 20 object categories. Next, you will take a network with weights initialized from the same pretrained network and train it on this task. Finally, you will compare the performance of the pretrained network to the network you trained on this problem.
You will use the Caffe package, a very popular deep learning framework for computer vision. Caffe is a C++ framework, but has both Python and Matlab interfaces. This page is for the Matlab interface. We have installed Caffe for you on the nietzsche.cs.pitt.edu server.
Training the CNN in this assignment may take a long time, and several of you will be using the limited computing resources at the same time, so be sure to start this assignment early.
Part I: SSH Basics - Getting Connected to the Server and Transferring Files
- You will be connecting to the server via SSH. If you are using a Windows machine and haven't used SSH before, you will first need to download an SSH client such as PuTTY. You can download PuTTY from here. If you are using a Mac or Linux, you already have SSH installed.
- This server only allows incoming connections from computers in the CS department or via the VPN client. If you are connecting from off campus, you must first install the Pulse VPN client (see here for instructions) in order to connect to the server. Connect to the VPN before trying to ssh to the server.
- If you are on a Mac or Linux, open a terminal and type: ssh nietzsche.cs.pitt.edu and press enter to connect to the server. If you are on Windows, open PuTTY and for the host name, enter nietzsche.cs.pitt.edu and click Open to connect to the server. You will need to enter your departmental username and password when prompted by the server.
- Once you are logged in, you will be taken to your AFS home directory and will probably see a 'public' and 'private' directory (if you have not changed these yourself). Make sure to put any assignment files you are working on in the private directory (or another directory which no one except you can access).
- You can either write your Matlab assignment file on your own computer and transfer it to the server, using scp (on Mac or Linux) or WinSCP (a separate download on Windows), or write it directly on the server using a text editor such as vim. On Mac or Linux, an scp command to copy a file you've written to the server might look like this (where my username is chris):
scp file.m chris@nietzsche.cs.pitt.edu:/afs/cs.pitt.edu/usr0/chris/private/
This command will copy the Matlab file from your computer to your AFS storage space. If you are on Windows and install WinSCP, you will be presented with a GUI where you can drag and drop files from your computer to your AFS space.
Part II: Setting Up Your Environment and Matlab for Caffe
- Caffe requires libraries to be visible to Matlab for it to work. We need to tell Matlab where these libraries are located. Before starting Matlab, copy paste the following directly into the shell on the server:
bash (press enter after each line)
export LD_LIBRARY_PATH=/tmp/caffe/ffmpeg:/opt/cuda-8.0-cuDNN5.1/lib64:/tmp/caffe/opencv/install/lib:/tmp/caffe/anaconda2/lib:/opt/OpenBLAS/lib:/usr/local/lib
export PATH=/opt/cuda-8.0-cuDNN5.1/bin:/tmp/caffe/anaconda2/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
- To launch Matlab on the server, type matlab -nodisplay. We include -nodisplay so that Matlab does not attempt to open its GUI. You can then begin typing commands as you normally would in Matlab or run a script that you write.
- Matlab needs to know where to find its interface to the Caffe library. The Matlab addpath command allows you to specify the location of files your scripts need to run. Add the following line to the top of your script: addpath('/tmp/caffe/matlab/')
- You will be using a GPU to accelerate your CNNs. There are 4 GPUs on this machine. Type nvidia-smi (before starting Matlab) to view them. Look in the center column and you will see four lines like 0MiB / 11439MiB, which show the memory utilization of each GPU. The first number is the current utilization. Note which GPU has the least memory in use (this will change depending on who is using which GPUs). Once a model is loaded on a GPU, that memory cannot be used by anyone else, so make sure to exit Matlab when you are done so that you do not hold memory unnecessarily.
- Add the following lines to the top of your script:
caffe.set_device(GPU_ID)  % replace GPU_ID with the GPU number you noted above (0-3)
caffe.set_mode_gpu()
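Choosing the GPU by eye works well for four devices; the same rule can also be written down. The following Python sketch assumes you have already copied the four memory fields from the nvidia-smi output into a list. The readings and the helper names (used_mib, least_used_gpu) are hypothetical and only illustrate the selection rule.

```python
def used_mib(memory_field):
    """Parse the first number from a field such as '0MiB / 11439MiB'."""
    used, _total = memory_field.split("/")
    return int(used.replace("MiB", ""))


def least_used_gpu(memory_fields):
    """Return the index (GPU number) of the field with the least memory in use."""
    usage = [used_mib(field) for field in memory_fields]
    return usage.index(min(usage))


# Hypothetical readings for GPUs 0-3, in the format shown by nvidia-smi.
fields = [
    "5120MiB / 11439MiB",
    "0MiB / 11439MiB",
    "11200MiB / 11439MiB",
    "300MiB / 11439MiB",
]
gpu_id = least_used_gpu(fields)
print("use GPU", gpu_id)
```

The resulting number is the value to pass to caffe.set_device in your Matlab script.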
Part III: Preparing the Dataset for the Experiment
- We need to subtract the mean of the train data from each image before the CNN classifies it. Load the mean image of the train data by using the following command:
image_mean = caffe.io.read_mean('/tmp/caffe/models/data_mean.binaryproto');
- The data for this assignment is located at /tmp/caffe/data/. You will find 20 folders with images in them. Each folder is the category of the image. For each image, you will need to extract image features from the CNN and store them in a variable along with the folder name that the image came from. Later, you will train a linear SVM using these features to predict which folder an image came from. In the second part of the assignment, you will train the network on these images. Note: You can use the Matlab imageSet function with the recursive flag in order to easily get a list of all the images and the folder that they are in.
- You will need to randomly withhold 10% of the images as a validation set for training the CNN. Withhold an additional 10% as a test set for evaluation. You can use the Matlab datasample function with 'replace' set to false to randomly sample from the data (make sure to then remove the images you sample from the list you sampled them from). Make sure to retain your data split for the entire assignment because you will use the same data split for training, validating, and testing the neural network.
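The split described above can be sketched as follows. This is a Python illustration of the logic only, not the Matlab datasample call the assignment asks for; the sample list, the seed, and the function name split_dataset are hypothetical.

```python
import random


def split_dataset(items, val_fraction=0.10, test_fraction=0.10, seed=0):
    """Shuffle once, then cut into disjoint train / validation / test lists."""
    rng = random.Random(seed)  # fixed seed so the same split can be recreated later
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    n_test = round(len(shuffled) * test_fraction)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


# Hypothetical stand-in for the real list of (image path, folder name) pairs.
samples = [(f"img_{i:03d}.jpg", f"class_{i % 20:02d}") for i in range(200)]
train, val, test = split_dataset(samples)
print(len(train), len(val), len(test))
```

Shuffling once and slicing guarantees that no image lands in two sets, which is the same effect as sampling without replacement and then removing the sampled images from the list. Saving the three lists to disk (or fixing the seed, as here) keeps the split identical across all parts of the assignment.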
Part IV: Using a Pretrained Network as a Feature Extractor
- We will now load in a pretrained CNN model. The model we are loading has been trained on 1.4M images to classify images into 1000 classes (which aren't necessarily the animals we will be classifying). Add the following line to your script:
net = caffe.Net('/tmp/caffe/models/deploy.prototxt', '/tmp/caffe/models/weights.caffemodel', 'test')
The caffe.Net function loads a network model for use in Matlab. The first argument specifies a file containing the network structure, which tells Caffe how the various network layers connect. The second argument specifies the learned model file; its weights, learned during training, are copied into the network structure created from the first argument. The final argument tells Caffe to load the network in test mode rather than train mode. You will see a lot of output once you execute this command, which you can ignore.
- To extract the features for an image, first load the image in Matlab using the caffe.io.load_image('/path/to/image/to/load.jpg') function. Do not use Matlab's imread function. Caffe expects images in BGR format (instead of RGB), with the width and height dimensions flipped, and in single precision. The load_image function does all of this for you automatically. After loading the image, use Matlab's imresize function to resize the image to height and width 227 (which is what this model expects). After resizing the image, subtract the image_mean we loaded previously from the image. You can then run the image through the neural network by using the command net.forward({image});
- Once an image has been run through the neural network, we are ready to extract features from the network for that image. You will be extracting features from the fc8 layer of the network. To extract the feature for an image, use the command net.blobs('fc8').get_data(). Store the features you extract, along with the folder that the image came from, for training the SVM.
- Train a linear SVM using Matlab's fitcecoc function on the train set but do not train on the withheld validation set or test set. To specify that Matlab should train a linear SVM, pass the following templateSVM to the fitcecoc function: templateSVM('Standardize',1,'KernelFunction','linear'); Matlab will also automatically standardize your data for you. Note you will not be using the validation set for this part of the assignment.
- Test your SVM on the test set and report the accuracy of the SVM at predicting the folder that the image was in. Also include a confusion matrix of the predictions using the confusionmat function and include it in your submission. What do you observe about the types of errors the network makes?
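To make the evaluation step concrete, here is a Python sketch of what a confusion matrix and the accuracy derived from it contain. It follows the same layout as Matlab's confusionmat (rows are true classes, columns are predicted classes). The three class names and the label lists are hypothetical toy data; the assignment itself has 20 folders.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical labels for three classes.
classes = ["cat", "dog", "horse"]
y_true = ["cat", "cat", "dog", "dog", "horse", "horse"]
y_pred = ["cat", "dog", "dog", "dog", "horse", "cat"]

cm = confusion_matrix(y_true, y_pred, classes)
for name, row in zip(classes, cm):
    print(f"{name:>6}: {row}")
print("accuracy:", accuracy(cm))
```

Off-diagonal entries show which pairs of categories get confused, which is the information needed to answer the question about the types of errors.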
Part V: Preparing Your Own Network
- Before we train the network, we must first set up the network solver, which contains parameters necessary for training the network. Copy all of the prototxt files from the /tmp/caffe/models directory to your own directory.
- We will begin by editing the solver.prototxt file. You will see the syntax of the file when you open it. Each variable is on its own line and is followed by a colon and then the parameter.
- We will first start out by setting the learning rate. Set the base_lr parameter to be 0.0001. We are using a slightly lower learning rate than usual because our batch size (the number of images we use to compute the gradient at each step) will be small (if we use too high of a learning rate with a small batch size, the error will not decrease because the changes to the weights in the network are too large).
- During training, Caffe will decrease the learning rate after so many iterations so that later training can have less impact on the weights (the idea is that after a few passes through the training data, the weights don't need to change as much). Set the gamma parameter to 0.1. This means that the learning rate will decrease by 10X every stepsize iterations.
- We want to save a copy of the network every epoch (one pass through the train data). To find out how many iterations are in an epoch, first figure out the number of images in your train set (from the data split in Part III), then divide that number by the train batch size (which is 8) and round down to the nearest whole number. Set the snapshot parameter to the number of iterations in an epoch.
- Set the snapshot_prefix to the directory you wish to save your trained models.
- Set the net parameter to the full path of where your train_val.prototxt file is.
- Finally, set the stepsize to the number of iterations in 10 epochs (so 10 times the number from snapshot). This means the learning rate will decrease by 10X every 10 epochs.
- Now, we will need to change the train_val.prototxt file to handle our problem. Currently, the network is trained to handle 1000 object classes. We need to change the classifier output so that there are only 20 outputs (for our 20 categories). Find the line: num_output: 1000 and change it to num_output: 20 to accommodate the 20 object classes in our dataset. You will also need to rename the layer you changed since you changed the dimensions of the layer. Search the file for fc8 and rename it to something of your choice (it appears in multiple places, so be sure to change them all). While you are in this file, you can view the overall network structure and see the different layers in the network.
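The arithmetic behind the solver settings above can be sketched in a few lines of Python. The batch size, base_lr and gamma are the values given in the text; the train set size of 1000 images is a hypothetical placeholder, and the learning-rate formula assumes the solver uses Caffe's "step" learning-rate policy, which matches the behaviour described above but is not named in the text.

```python
BATCH_SIZE = 8         # train batch size from the assignment
BASE_LR = 0.0001       # base_lr from the assignment
GAMMA = 0.1            # gamma from the assignment
N_TRAIN_IMAGES = 1000  # hypothetical; use the size of your own train set


def iterations_per_epoch(n_images, batch_size):
    """Number of full minibatches in one pass over the data, rounded down."""
    return n_images // batch_size


def step_learning_rate(iteration, base_lr, gamma, stepsize):
    """Step policy (assumed): multiply the rate by gamma once every `stepsize` iterations."""
    return base_lr * gamma ** (iteration // stepsize)


snapshot = iterations_per_epoch(N_TRAIN_IMAGES, BATCH_SIZE)
stepsize = 10 * snapshot  # the learning rate drops every 10 epochs

print("snapshot:", snapshot)
print("stepsize:", stepsize)
for epoch in (0, 9, 10, 20):
    lr = step_learning_rate(epoch * snapshot, BASE_LR, GAMMA, stepsize)
    print(f"epoch {epoch:2d}: learning rate {lr:.1e}")
```

With 25 epochs of training, the rate therefore changes twice under these assumptions: after epoch 10 and after epoch 20.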
Part VI: Training and Evaluating Your Own Network
- We are now ready to begin training in Matlab. Begin by creating a Caffe solver:
solver = caffe.Solver('Path to your solver.prototxt');
This instantiates the solver in Matlab. However, we don't have enough data to train the network entirely from scratch, so we will initialize the network to the same weights we used before. To do this, type:
solver.net.copy_from('/tmp/caffe/models/weights.caffemodel');
- Write a loop that passes through your train set 25 times (25 epochs). You will process 8 images each iteration. For each iteration, randomly choose 8 images and their labels from your train set (but do not use the same images again in that epoch). Note: Caffe expects 0-indexed integer labels, so your labels should run from 0 to 19 and must not be strings. Load the 8 images and subtract image_mean from each, as you did when extracting features in Part IV. You will now create an input 'blob' for the Caffe network from the 8 preprocessed images. To do this, concatenate the 8 images along the fourth dimension using the cat(4,...) function to form a Matlab array of shape [227 227 3 8]. Also create an 8x1 labels array which contains an integer from 0 to 19 for each of the images in the input image blob.
- Provide Caffe with the data and labels using these commands:
solver.net.blobs('data').set_data(INPUT MINIBATCH)
solver.net.blobs('label').set_data(INPUT LABELS)
- Train the network on the minibatch using solver.step(1). This tells Caffe to perform one update of the weights using your minibatch.
- After each step of the solver, get the value of the 'loss' layer and save it in an array. You can read the value of a layer with get_data(), in the same way you read the fc8 features in Part IV.
- After each epoch of training, evaluate the model on the validation set. To do this, load and preprocess the images as usual, and run the images through the network by providing the images and their labels as you did for the training minibatches (you will need to run the images through the network in batches of 8). However, instead of calling net.forward, you need to access the network using solver.net.forward_prefilled(). Do not use solver.step, because we are not training on the validation set. Finally, get the accuracy on each minibatch from the validation set by getting the result of the accuracy layer. Take the average of the accuracies of all minibatches in the validation set and you have the accuracy of the network at that epoch.
- After training, you can use the solver.net.save('FILENAME.caffemodel') command to save your final trained network.
- Provide a plot of the train losses in your report. Also, provide a second plot of your validation set accuracies (you should have 25 numbers in this plot).
- Perform Part IV using your trained network instead of the pretrained model. Use your network which had the best accuracy on the validation set. You can reuse all of your code from Part IV. You will need to change the line to point to your network instead of the pretrained model:
net = caffe.Net(MODIFIED DEPLOY FILE, YOUR CAFFEMODEL FILE, 'test')
Note: you will also need to modify the deploy.prototxt file to have num_output: 20 and to have the name of the layer that you changed in the train_val.prototxt file (i.e. find all fc8 in the deploy.prototxt file and rename it to whatever name you chose).
- Report the accuracy of your network on the train set and test set without the SVM. To do this, you can extract the network's classification scores for each image by accessing the output of the fc8 layer (remember to access your renamed version) and using the class with the max score as the network's prediction to compute the accuracy.
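Two pieces of bookkeeping from this part are easy to get wrong: drawing minibatches so that no image repeats within an epoch, and turning classification scores into an accuracy. The Python sketch below illustrates both with plain lists. It does not call Caffe, and the function names, the toy sizes and the scores are hypothetical.

```python
import random


def epoch_minibatches(n_samples, batch_size, rng):
    """Yield index lists covering a shuffled epoch; each sample is used at most once."""
    order = list(range(n_samples))
    rng.shuffle(order)
    n_batches = n_samples // batch_size  # drop the incomplete final batch
    for b in range(n_batches):
        yield order[b * batch_size:(b + 1) * batch_size]


def argmax(scores):
    """Index of the largest score, i.e. the predicted 0-indexed class."""
    return max(range(len(scores)), key=lambda i: scores[i])


def accuracy_from_scores(score_rows, labels):
    """Share of rows whose highest-scoring class equals the label."""
    hits = sum(argmax(row) == label for row, label in zip(score_rows, labels))
    return hits / len(labels)


rng = random.Random(0)
batches = list(epoch_minibatches(n_samples=20, batch_size=8, rng=rng))
print(len(batches), "minibatches of", len(batches[0]), "images")

# Hypothetical scores for 3 images and 4 classes, with 0-indexed labels.
scores = [[0.1, 2.0, 0.3, 0.0], [1.5, 0.2, 0.1, 0.9], [0.0, 0.1, 0.2, 3.0]]
labels = [1, 0, 2]
print("accuracy:", accuracy_from_scores(scores, labels))
```

Reshuffling at the start of every epoch gives a different random order each time while still visiting each image at most once per epoch. In Matlab, remember that max returns 1-based indices, so subtract 1 before comparing against the 0-indexed labels.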
If you need additional help with Matlab Caffe syntax, you may want to consult the Caffe interface tutorial. Scroll down to the 'Use MatCaffe' section. It is short and covers basics of how to create a network, perform input and output, access data blobs, and train a network.
Grading rubric:
- [10 points] Setting up and splitting the data correctly.
- [30 points] Accuracy of pretrained model using SVM, and confusion matrix.
- [40 points] Accuracy of trained model without SVM.
- [10 points] Accuracy of trained model using SVM.
- [10 points] Plot of train losses and validation accuracies.