CS231n


Note for CS231n

  1. Data preprocessing: normalize the features in your data to have zero mean and unit variance before feeding them into the deep model (a sketch follows this list).
  2. Data-splitting tricks: the size of the validation split (and whether to use cross-validation) depends on how many hyperparameters you have and how much of an influence you expect them to have.
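
A minimal NumPy sketch of point 1; `X_train` and `X_test` are hypothetical data matrices with one example per row, and the statistics are computed on the training set only:

```python
import numpy as np

# Hypothetical data: one flattened example per row.
X_train = np.random.randn(5000, 3072)
X_test = np.random.randn(500, 3072)

# Per-feature statistics, computed on the training set only.
mean = X_train.mean(axis=0)
std = X_train.std(axis=0) + 1e-8  # epsilon avoids division by zero

# Apply the same transform to both splits.
X_train = (X_train - mean) / std
X_test = (X_test - mean) / std
```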

kNN drawbacks:

  1. it must remember all the training data, i.e. store the entire training set
  2. classifying a single image is expensive: every prediction requires comparing against all stored training examples (see the sketch below)
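
A minimal sketch of a 1-nearest-neighbor classifier with L1 distance that illustrates both points: "training" just stores the data, while predicting one image scans the entire training set:

```python
import numpy as np

class NearestNeighbor:
    def train(self, X, y):
        # Point 1: training is nothing more than remembering the data.
        self.Xtr = X
        self.ytr = y

    def predict_one(self, x):
        # Point 2: one prediction compares x against every stored
        # training example (L1 distance here), which is expensive.
        distances = np.sum(np.abs(self.Xtr - x), axis=1)
        return self.ytr[np.argmin(distances)]
```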


The neural network approach to the classification task has two main components:

  1. score function: maps the raw data to class scores.
  2. loss function: quantifies the agreement between the predicted scores and the ground-truth labels (a concrete example follows this list).
    The whole setup is then cast as an optimization problem: minimize the loss with respect to the parameters of the score function.
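
As a concrete example of point 2, here is a minimal sketch of the multiclass SVM loss for a single example, one common choice of loss function in CS231n; `scores` holds the predicted class scores and `y` is the index of the correct class:

```python
import numpy as np

def svm_loss_single(scores, y, delta=1.0):
    # Penalize every class whose score is not at least
    # `delta` below the score of the correct class.
    margins = np.maximum(0, scores - scores[y] + delta)
    margins[y] = 0  # the correct class contributes no loss
    return np.sum(margins)
```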

A single matrix multiplication Wx_i effectively evaluates 10 separate classifiers in parallel (one for each class), where each classifier is a row of W.
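
A quick shape check of that claim for CIFAR-10-sized inputs (the dimensions are illustrative):

```python
import numpy as np

W = np.random.randn(10, 3072)  # one row of weights per class
b = np.random.randn(10)        # one bias per class
x_i = np.random.randn(3072)    # a flattened 32x32x3 image

scores = W.dot(x_i) + b        # shape (10,): all 10 classifiers at once
```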

Classifying a test image involves just a single matrix multiplication and addition, which is much faster than comparing the test image against every training image, as kNN does.
That is also why we like deep models: although training is time-consuming, testing only costs matrix multiplications and additions, with a cost that is independent of the training set size.

Depending on precisely what values we set for these weights, the function has the capacity to like or dislike (depending on the sign of each weight) certain colors at certain positions in the image. That is to say, the colors in an image influence how it is classified.

Caffe Training Experiments

Caffe is an open-source deep learning framework originally created by Yangqing Jia which allows you to leverage your GPU for training neural networks. As opposed to other deep learning frameworks like Theano or Torch, you don't have to program the algorithms yourself; instead, you specify your network by means of configuration files. Obviously this approach is less time-consuming than programming everything on your own, but it also forces you to stay within the boundaries of the framework. Practically, though, this won't matter most of the time, as the framework Caffe provides is quite powerful and continuously advancing.

Defining the Model and Meta-Parameters

Training a model and applying it requires at least three configuration files. The format of those configuration files follows an interface description language called protocol buffers. It superficially resembles JSON but is significantly different, and it is actually supposed to replace JSON in use cases where the data document needs to be validatable (by means of a custom schema, like this one for Caffe) and serializable.
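
To get a feel for the format, here is a hypothetical sketch of what the meta-parameter file (config.prototxt) could look like. The field names are standard Caffe solver parameters, but all values here are invented for illustration:

```
net: "model_train_test.prototxt"   # the network definition file
test_iter: 100                     # batches per test pass
test_interval: 500                 # test every 500 training iterations
base_lr: 0.01                      # initial learning rate
momentum: 0.9
weight_decay: 0.0005
lr_policy: "step"                  # drop the learning rate in steps
gamma: 0.1                         # factor applied at each drop
stepsize: 5000                     # iterations between drops
max_iter: 10000
snapshot: 5000                     # save intermediate models
snapshot_prefix: "snapshots/model"
solver_mode: GPU                   # train on the GPU
```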

For training you need one prototxt file holding the meta-parameters of the training and the model (config.prototxt) and another defining the graph of the network (model_train_test.prototxt), connecting the layers in a directed, acyclic fashion. Note that the data flows from bottom to top with regard to how the order of layers is specified. The example network here is composed of five layers (a hypothetical prototxt fragment illustrating the wiring follows the list):

  1. data layer (one for TRAINing and one for TESTing)
  2. inner product layer (the weights I)
  3. rectified linear units (the hidden layer)
  4. inner product layer (the weights II)
  5. output layer (softmax for classification)
    A. a softmax layer giving the loss
    B. an accuracy layer, so we can see how the network improves while training
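
For illustration, here is a hypothetical fragment of model_train_test.prototxt wiring layers 2 and 3 together, written in the current Caffe layer syntax. The layer names and hidden size are invented; the bottom/top fields make the bottom-to-top data flow mentioned above explicit:

```
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "data"            # consumes the output of the data layer
  top: "ip1"
  inner_product_param {
    num_output: 64          # hypothetical number of hidden units
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"                # in-place rectification
}
```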

Blog Configuration

Create a tag cloud page

Generate a new page named "tags":

```bash
hexo new page "tags"
```

Edit the newly generated file ./source/tags/index.md:

```yaml
title: All tags
date: 2014-12-22 12:39:04
type: "tags"
comments: false # do not activate the Disqus comments for this page
---
```

Add "tags" to the "menu" section: open the _config.yml in the blog root path and add:

```yaml
menu:
  home: /
  archives: /archives
  tags: /tags
```