How to do hyperparameter tuning in pipelines

09/24/2024

Applies to: Azure CLI ml extension v2 (current), Python SDK azure-ai-ml v2 (current)

In this article, you learn how to automate hyperparameter tuning in Azure Machine Learning pipelines by using the Azure Machine Learning CLI v2 or the Azure Machine Learning SDK for Python v2.

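Hyperparameters are adjustable parameters that let you control the model training process. Hyperparameter tuning is the process of finding the configuration of hyperparameters that results in the best performance. Azure Machine Learning lets you automate hyperparameter tuning and run trials in parallel to optimize hyperparameters efficiently.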

Prerequisites

You have an Azure Machine Learning account and workspace.
You understand Azure Machine Learning pipelines and hyperparameter tuning of a model.

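Create and run a hyperparameter tuning pipeline

The following examples come from "Run a pipeline job using sweep (hyperdrive) in pipeline" in the Azure Machine Learning examples repository. For more information about creating pipelines with components, see "Create and run machine learning pipelines using components with the Azure Machine Learning CLI".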

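Create a command component with hyperparameter inputs

The Azure Machine Learning pipeline must have a command component with hyperparameter inputs. The following train.yml file from the example project defines a trial component that has the c_value, kernel, and coef0 hyperparameter inputs and runs the source code located in the ./train-src folder.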

$schema: https://azuremlschemas.azureedge.net/latest/commandComponent.schema.json
type: command

name: train_model
display_name: train_model
version: 1

inputs: 
  data:
    type: uri_folder
  c_value:
    type: number
    default: 1.0
  kernel:
    type: string
    default: rbf
  degree:
    type: integer
    default: 3
  gamma:
    type: string
    default: scale
  coef0: 
    type: number
    default: 0
  shrinking:
    type: boolean
    default: false
  probability:
    type: boolean
    default: false
  tol:
    type: number
    default: 1e-3
  cache_size:
    type: number
    default: 1024
  verbose:
    type: boolean
    default: false
  max_iter:
    type: integer
    default: -1
  decision_function_shape:
    type: string
    default: ovr
  break_ties:
    type: boolean
    default: false
  random_state:
    type: integer
    default: 42

outputs:
  model_output:
    type: mlflow_model
  test_data:
    type: uri_folder
  
code: ./train-src

environment: azureml://registries/azureml/environments/sklearn-1.5/labels/latest

command: >-
  python train.py 
  --data ${{inputs.data}}
  --C ${{inputs.c_value}}
  --kernel ${{inputs.kernel}}
  --degree ${{inputs.degree}}
  --gamma ${{inputs.gamma}}
  --coef0 ${{inputs.coef0}}
  --shrinking ${{inputs.shrinking}}
  --probability ${{inputs.probability}}
  --tol ${{inputs.tol}}
  --cache_size ${{inputs.cache_size}}
  --verbose ${{inputs.verbose}}
  --max_iter ${{inputs.max_iter}}
  --decision_function_shape ${{inputs.decision_function_shape}}
  --break_ties ${{inputs.break_ties}}
  --random_state ${{inputs.random_state}}
  --model_output ${{outputs.model_output}}
  --test_data ${{outputs.test_data}}
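
To make the link between the inputs section and the command concrete, the following is a minimal Python sketch of how `${{inputs.name}}` and `${{outputs.name}}` placeholders could be expanded into a command line. It is only an illustration: the `render_command` helper and the sample values are hypothetical, and this is not how Azure Machine Learning resolves these expressions internally.

```python
import re

# Matches placeholders such as ${{inputs.c_value}} or ${{outputs.model_output}}.
PLACEHOLDER = re.compile(r"\$\{\{\s*(inputs|outputs)\.(\w+)\s*\}\}")


def render_command(template: str, inputs: dict, outputs: dict) -> str:
    """Replace each placeholder with the matching input or output value."""
    values = {"inputs": inputs, "outputs": outputs}

    def substitute(match: re.Match) -> str:
        group, name = match.group(1), match.group(2)
        if name not in values[group]:
            raise KeyError(f"No value supplied for {group}.{name}")
        return str(values[group][name])

    return PLACEHOLDER.sub(substitute, template)


# A shortened version of the component command, with illustrative values.
template = (
    "python train.py --C ${{inputs.c_value}} --kernel ${{inputs.kernel}} "
    "--model_output ${{outputs.model_output}}"
)
rendered = render_command(
    template,
    inputs={"c_value": 0.7, "kernel": "rbf"},
    outputs={"model_output": "./outputs/model"},
)
print(rendered)
```

Each trial of a sweep receives a different set of input values, so the same template yields a different command line per trial.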

Create the trial component source code

The source code for this example is a single train.py file. This code runs in every trial of the sweep job.

# imports
import os
import mlflow
import argparse

import pandas as pd
from pathlib import Path

from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# define functions
def main(args):
    # enable auto logging
    mlflow.autolog()

    # setup parameters
    params = {
        "C": args.C,
        "kernel": args.kernel,
        "degree": args.degree,
        "gamma": args.gamma,
        "coef0": args.coef0,
        "shrinking": args.shrinking,
        "probability": args.probability,
        "tol": args.tol,
        "cache_size": args.cache_size,
        "class_weight": args.class_weight,
        "verbose": args.verbose,
        "max_iter": args.max_iter,
        "decision_function_shape": args.decision_function_shape,
        "break_ties": args.break_ties,
        "random_state": args.random_state,
    }

    # read in data
    df = pd.read_csv(args.data)

    # process data
    X_train, X_test, y_train, y_test = process_data(df, args.random_state)

    # train model
    model = train_model(params, X_train, X_test, y_train, y_test)
    # Output the model and test data
    # write to local folder first, then copy to output folder

    mlflow.sklearn.save_model(model, "model")

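    # distutils was removed in Python 3.12, so use shutil from the standard library
    import shutil

    # copy the locally saved model folder into the job's output folder
    from_directory = "model"
    to_directory = args.model_output

    # dirs_exist_ok lets the copy succeed when the output folder already exists
    shutil.copytree(from_directory, to_directory, dirs_exist_ok=True)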

    X_test.to_csv(Path(args.test_data) / "X_test.csv", index=False)
    y_test.to_csv(Path(args.test_data) / "y_test.csv", index=False)


def process_data(df, random_state):
    # split dataframe into X and y
    X = df.drop(["species"], axis=1)
    y = df["species"]

    # train/test split
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=random_state
    )

    # return split data
    return X_train, X_test, y_train, y_test


def train_model(params, X_train, X_test, y_train, y_test):
    # train model
    model = SVC(**params)
    model = model.fit(X_train, y_train)

    # return model
    return model


def str2bool(value):
    # argparse's type=bool treats any non-empty string, including "False", as True,
    # so convert the text explicitly
    if value.lower() in ("true", "1"):
        return True
    if value.lower() in ("false", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean value, got {value!r}")


def parse_args():
    # setup arg parser
    parser = argparse.ArgumentParser()

    # add arguments
    parser.add_argument("--data", type=str)
    parser.add_argument("--C", type=float, default=1.0)
    parser.add_argument("--kernel", type=str, default="rbf")
    parser.add_argument("--degree", type=int, default=3)
    parser.add_argument("--gamma", type=str, default="scale")
    parser.add_argument("--coef0", type=float, default=0)
    parser.add_argument("--shrinking", type=str2bool, default=False)
    parser.add_argument("--probability", type=str2bool, default=False)
    parser.add_argument("--tol", type=float, default=1e-3)
    parser.add_argument("--cache_size", type=float, default=1024)
    parser.add_argument("--class_weight", type=dict, default=None)
    parser.add_argument("--verbose", type=str2bool, default=False)
    parser.add_argument("--max_iter", type=int, default=-1)
    parser.add_argument("--decision_function_shape", type=str, default="ovr")
    parser.add_argument("--break_ties", type=str2bool, default=False)
    parser.add_argument("--random_state", type=int, default=42)
    parser.add_argument("--model_output", type=str, help="Path of output model")
    parser.add_argument("--test_data", type=str, help="Path of output test data")

    # parse args
    args = parser.parse_args()

    # return args
    return args


# run script
if __name__ == "__main__":
    # parse args
    args = parse_args()

    # run main function
    main(args)

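Note

Make sure to log the metrics in the trial component source code with exactly the same name as the primary_metric value in the pipeline file. This example uses mlflow.autolog(), which is the recommended way to track machine learning experiments. For more information about MLflow, see "Track ML experiments and models with MLflow".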
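
Because the name has to match exactly, a small local check can catch naming mistakes before a sweep is submitted. The sketch below is an assumption-laden illustration: `assert_primary_metric_logged` is a hypothetical helper, and the metric values are placeholders rather than results from a real run.

```python
def assert_primary_metric_logged(logged_metrics: dict, primary_metric: str) -> float:
    """Return the primary metric value, or raise if the exact name is missing."""
    if primary_metric in logged_metrics:
        return logged_metrics[primary_metric]
    # Point out near misses that differ only by case or surrounding spaces.
    wanted = primary_metric.strip().lower()
    near = [name for name in logged_metrics if name.strip().lower() == wanted]
    hint = f" Did you mean {near[0]!r}?" if near else ""
    raise KeyError(f"Metric {primary_metric!r} was not logged.{hint}")


# Placeholder values for illustration only.
metrics = {"training_f1_score": 0.93, "training_accuracy_score": 0.95}
value = assert_primary_metric_logged(metrics, "training_f1_score")
print(value)
```

If you log metrics yourself instead of relying on autologging, `mlflow.log_metric("training_f1_score", value)` is one way to record a value under the expected name.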

Create a pipeline with a hyperparameter sweep step

Azure CLI
Python SDK

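Given the command component defined in train.yml, the following code creates a two-step train and predict pipeline definition file. In sweep_step, the required step type is sweep, and the c_value, kernel, and coef0 hyperparameter inputs for the trial component are added to search_space.

The following example highlights the sweep_step hyperparameter tuning.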

$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json
type: pipeline
display_name: pipeline_with_hyperparameter_sweep
description: Tune hyperparameters using TF component
settings:
    default_compute: azureml:cpu-cluster
jobs:
  sweep_step:
    type: sweep
    inputs:
      data: 
        type: uri_file
        path: wasbs://datasets@azuremlexamples.blob.core.windows.net/iris.csv
      degree: 3
      gamma: "scale"
      shrinking: False
      probability: False
      tol: 0.001
      cache_size: 1024
      verbose: False
      max_iter: -1
      decision_function_shape: "ovr"
      break_ties: False
      random_state: 42
    outputs:
      model_output:
      test_data:
    sampling_algorithm: random
    trial: ./train.yml
    search_space:
      c_value:
        type: uniform
        min_value: 0.5
        max_value: 0.9
      kernel:
        type: choice
        values: ["rbf", "linear", "poly"]
      coef0:
        type: uniform
        min_value: 0.1
        max_value: 1
    objective:
      goal: maximize
      primary_metric: training_f1_score
    limits:
      max_total_trials: 5
      max_concurrent_trials: 3
      timeout: 7200

  predict_step:
    type: command
    inputs:
      model: ${{parent.jobs.sweep_step.outputs.model_output}}
      test_data: ${{parent.jobs.sweep_step.outputs.test_data}}
    outputs:
      predict_result:
    component: ./predict.yml
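
As a rough mental model of what random sampling over this search space means, the following Python sketch draws one configuration per trial from the same uniform and choice distributions. It is a conceptual illustration only and not the sampler that Azure Machine Learning uses; the `sample_trial` helper and the fixed seed are assumptions made for the example.

```python
import random

# Search space mirroring the sweep_step definition in this article.
SEARCH_SPACE = {
    "c_value": {"type": "uniform", "min_value": 0.5, "max_value": 0.9},
    "kernel": {"type": "choice", "values": ["rbf", "linear", "poly"]},
    "coef0": {"type": "uniform", "min_value": 0.1, "max_value": 1},
}


def sample_trial(space: dict, rng: random.Random) -> dict:
    """Draw one hyperparameter configuration from the search space."""
    config = {}
    for name, spec in space.items():
        if spec["type"] == "uniform":
            config[name] = rng.uniform(spec["min_value"], spec["max_value"])
        elif spec["type"] == "choice":
            config[name] = rng.choice(spec["values"])
        else:
            raise ValueError(f"Unsupported distribution: {spec['type']}")
    return config


rng = random.Random(0)  # fixed seed so the sketch is repeatable
max_total_trials = 5  # same limit as in the YAML above
trials = [sample_trial(SEARCH_SPACE, rng) for _ in range(max_total_trials)]
for trial in trials:
    print(trial)
```

Inputs that are not listed in search_space, such as degree or tol, keep the fixed values given under inputs for every trial.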

In the v2 SDK, you can enable hyperparameter tuning for a command component by calling the .sweep() method. The following pipeline definition shows how to enable sweep for train_model.

This example first loads train_component_func, which is defined in the train.yml file. To create train_model, the code adds the c_value, kernel, and coef0 hyperparameters to the search space. sweep_step then defines primary_metric, sampling_algorithm, and other parameters.

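from azure.ai.ml import Input, load_component
from azure.ai.ml.dsl import pipeline
from azure.ai.ml.sweep import Choice, Uniform

# load the trial and scoring components from their YAML definitions
train_component_func = load_component(source="./train.yml")
score_component_func = load_component(source="./predict.yml")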

# define a pipeline
@pipeline()
def pipeline_with_hyperparameter_sweep():
    """Tune hyperparameters using sample components."""
    train_model = train_component_func(
        data=Input(
            type="uri_file",
            path="wasbs://datasets@azuremlexamples.blob.core.windows.net/iris.csv",
        ),
        c_value=Uniform(min_value=0.5, max_value=0.9),
        kernel=Choice(["rbf", "linear", "poly"]),
        coef0=Uniform(min_value=0.1, max_value=1),
        degree=3,
        gamma="scale",
        shrinking=False,
        probability=False,
        tol=0.001,
        cache_size=1024,
        verbose=False,
        max_iter=-1,
        decision_function_shape="ovr",
        break_ties=False,
        random_state=42,
    )
    sweep_step = train_model.sweep(
        primary_metric="training_f1_score",
        goal="maximize",
        sampling_algorithm="random",
        compute="cpu-cluster",
    )
    sweep_step.set_limits(max_total_trials=20, max_concurrent_trials=10, timeout=7200)

    score_data = score_component_func(
        model=sweep_step.outputs.model_output, test_data=sweep_step.outputs.test_data
    )


pipeline_job = pipeline_with_hyperparameter_sweep()

# set pipeline level compute
pipeline_job.settings.default_compute = "cpu-cluster"

For the full sweep job schema, see "CLI (v2) sweep job YAML schema".

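Submit the hyperparameter tuning pipeline job

After you submit this pipeline job, Azure Machine Learning runs the trial component multiple times to sweep over the hyperparameters, based on the search space and limits that you defined in sweep_step.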
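
The submission step itself is not shown in this article, so here is a minimal sketch of what it might look like with the Python SDK. The `submit_pipeline` wrapper and the experiment name are hypothetical; the wrapper only assumes a client object that exposes `jobs.create_or_update`, as `MLClient` from azure-ai-ml does.

```python
def submit_pipeline(ml_client, pipeline_job, experiment_name: str):
    """Submit a pipeline job and return the job object the service sends back."""
    returned_job = ml_client.jobs.create_or_update(
        pipeline_job, experiment_name=experiment_name
    )
    # studio_url links to the pipeline graph in Azure Machine Learning studio.
    print(f"Submitted {returned_job.name}: {returned_job.studio_url}")
    return returned_job


# Usage, assuming azure-ai-ml and azure-identity are installed and that the
# subscription, resource group, and workspace placeholders are filled in:
#
#   from azure.ai.ml import MLClient
#   from azure.identity import DefaultAzureCredential
#
#   ml_client = MLClient(
#       DefaultAzureCredential(),
#       subscription_id="<subscription-id>",
#       resource_group_name="<resource-group>",
#       workspace_name="<workspace-name>",
#   )
#   submit_pipeline(ml_client, pipeline_job, "pipeline_samples")
```

With the CLI, the equivalent step is `az ml job create --file <pipeline-yaml-file>`, where the file is the pipeline definition shown earlier.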

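View hyperparameter tuning results in studio

After you submit a pipeline job, the SDK or CLI widget gives you a web URL link to the pipeline graph in the Azure Machine Learning studio UI.

To view hyperparameter tuning results, double-click the sweep step in the pipeline graph, select the Child jobs tab in the details panel, and then select the child job.

On the child job page, select the Trials tab to see and compare metrics for all the child runs. Select any of the child runs to see the details for that run.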
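
The comparison that the Trials tab supports can be pictured as picking the trial whose primary metric is best for the configured goal. The following sketch shows that selection logic in plain Python; the `best_trial` helper is hypothetical and the metric values are made up for illustration, not taken from a real run.

```python
def best_trial(trials: list, primary_metric: str, goal: str) -> dict:
    """Pick the trial with the best primary metric for the given goal."""
    if goal not in ("maximize", "minimize"):
        raise ValueError("goal must be 'maximize' or 'minimize'")
    # Ignore trials that never logged the primary metric.
    scored = [t for t in trials if primary_metric in t["metrics"]]
    if not scored:
        raise ValueError(f"No trial logged {primary_metric!r}")
    pick = max if goal == "maximize" else min
    return pick(scored, key=lambda t: t["metrics"][primary_metric])


# Made-up values for illustration only; they are not results from a real run.
trials = [
    {"name": "trial_1", "metrics": {"training_f1_score": 0.91}},
    {"name": "trial_2", "metrics": {"training_f1_score": 0.96}},
    {"name": "trial_3", "metrics": {"training_f1_score": 0.88}},
]
print(best_trial(trials, "training_f1_score", "maximize")["name"])  # trial_2
print(best_trial(trials, "training_f1_score", "minimize")["name"])  # trial_3
```

The two calls show why the goal matters: for a score where higher is better, such as an F1 score, a minimize goal selects the weakest trial.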

If a child run failed, select the Outputs + logs tab on the child run page to see useful debug information.
