문서 내용

여기서는 GPyOpt와 tensorflow를 사용하여, ML Model의 parameter를 최적화하는 실험을 한다.

Data: MNIST
Model: FNN Model
- input: 784-dim
- output: 10-dim
- parameter
  - learning-rate (0.01, 0.5)
  - layer 수 {2, 3}
  - hidden_size {30, 60}
  - batch_size {500, 1000, 2000}

tensorflow model with score

여기서는 먼저 tensorflow model을 만든다. Bayesian Optimization는 model을 학습하고 결과를 받는 함수를 objective function으로 생각하기 때문에 accuracy에 -를 취해주어야 한다.

TODO: Objective function의 scale에 따라서 다르게 최적화가 되는지… 알아봐야할 듯!

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('../MNIST_data', one_hot=True)

Extracting ../MNIST_data/train-images-idx3-ubyte.gz
Extracting ../MNIST_data/train-labels-idx1-ubyte.gz
Extracting ../MNIST_data/t10k-images-idx3-ubyte.gz
Extracting ../MNIST_data/t10k-labels-idx1-ubyte.gz

def objective_function(args):
    '''
    args[0] : layer 개수
    args[1] : hidden size
    '''
    lr_rate = args[0, 0]
    num_layer = args[0, 1].astype(int)
    output_size = args[0, 2].astype(int)
    batch_size = args[0, 3].astype(int)

    
    print("실험 시작\nargs: {}".format(args))
    # reset graph
    tf.reset_default_graph()
    
    x = tf.placeholder(tf.float32, shape=[None, 784])
    y_ = tf.placeholder(tf.float32, shape=[None, 10])
    
    input_size = 784
    remain_num_layer = num_layer-1
    
    new_x = x  # feed_dict에 x로 들어가기 위해서...
    for i in range(remain_num_layer):
        W = tf.Variable(tf.random_normal([input_size, output_size]))
        b = tf.Variable(tf.random_normal([output_size]))
        new_x = tf.nn.relu(tf.matmul(new_x, W) + b)
        input_size = output_size

    W = tf.Variable(tf.random_normal([input_size, 10]))
    b = tf.Variable(tf.random_normal([10]))
    
    y = tf.matmul(new_x, W) + b
    cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_,
                                                                           logits=y))

    global_step = tf.Variable(0, trainable=False)

    lr_rate_decayed = tf.train.exponential_decay(lr_rate,
                                                 global_step,
                                                 10,
                                                 0.99,
                                                 staircase=True)
    train_step = tf.train.GradientDescentOptimizer(lr_rate_decayed).minimize(cross_entropy)

    acc = [0]
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())

        for i in range(1000):
            batch = mnist.train.next_batch(batch_size)
            train_step.run(feed_dict={x: batch[0], y_: batch[1]})

            correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
            accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
            if i % 50 is 0:
                new_acc = accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels})
                print("중간 결과: {}".format(new_acc))
                if abs(new_acc - acc[-1]) < 0.0001: #효과가 별로 없으면
                    break
                else:
                    acc.append(new_acc)
                
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
        
        acc = accuracy.eval(feed_dict={x : mnist.test.images,
                                       y_: mnist.test.labels})
        print("실험 하나 종료! 결과\nargs: {}, acc: {}".format(args, acc))
        return 1 - acc  # fMin이기 때문에...

Bayesian Optimization

위 코드 마지막을 보면 1-acc로 accuracy에 마이너스를 취해줬지..

이제 다음 코드로 Bayesian Optimization을 하면 끝!

from GPyOpt.methods import BayesianOptimization
import numpy as np

# continuous가 discrete 앞에 위치해야함! BUGBUG!! https://github.com/SheffieldML/GPyOpt/issues/85
domain = [{'name': 'lr_rate',
          'type': 'continuous',
          'domain': (0.01, 0.5),
           'dimensionality': 1},
          {'name': 'num layer',
           'type': 'discrete',
           'domain':(2, 3),
           'dimensionality': 1},
          {'name': 'hidden size',
           'type': 'discrete',
           'domain': (30, 60),
           'dimensionality': 1},
          {'name': 'batch size',
           'type': 'discrete',
           'domain': (500, 1000, 2000),
           'dimensionality': 1}]

# --- Solve your problem
myBopt = BayesianOptimization(f=objective_function,
                              domain=domain,
                              initial_design_numdata=5)
myBopt.run_optimization(max_iter=15)
myBopt.plot_acquisition()

실험 시작
args: [[  3.27519352e-01   3.00000000e+00   6.00000000e+01   1.00000000e+03]]
중간 결과: 0.11749999970197678
...
중간 결과: 0.8468000292778015
중간 결과: 0.8496000170707703
실험 하나 종료! 결과
args: [[   0.5    2.    30.   500. ]], acc: 0.8436999917030334

결과 보기

다음 코드로 결과를 확인할 수 있다.

print(myBopt.x_opt)  # lr_rate, #layer, hidden_size, batch_size

N, _ = myBopt.X.shape
for i in range(N):
    if np.array_equal(myBopt.x_opt, myBopt.X[i, :]):
        print(1 - myBopt.Y[i,:])  # accuracy 변환

[  2.72432763e-01   2.00000000e+00   6.00000000e+01   2.00000000e+03]
[ 0.87239999]

부록

# 돌릴 필요는 없고 참고삼아...
args = np.array([[2, 10, 500, 0.3]])
objective_function(args)

Hulk의 개인 공부용 블로그

GPyOpt - Bayesian Optimization and tensorflow 예제 만들기

문서 내용

tensorflow model with score

Bayesian Optimization

결과 보기

부록

Hulk의 개인 공부용 블로그

GPyOpt - Bayesian Optimization and tensorflow 예제 만들기

문서 내용

tensorflow model with score

Bayesian Optimization

결과 보기

부록

Related Posts

논문 정리 URL 변경 27 Jan 2020

MobileNet 정리 06 Oct 2019

pytorch dataset 정리 30 Sep 2019