Original training script
Suppose you have a Python script that trains a model (see below). Your goal is to find the hyperparameters that maximize the validation accuracy (val_acc).
In your Python script, you define two functions: train_one_epoch and evaluate_one_epoch. The train_one_epoch function simulates training for one epoch and returns the training accuracy and loss. The evaluate_one_epoch function simulates evaluating the model on the validation data set and returns the validation accuracy and loss.
You define a configuration dictionary (config) that contains hyperparameter values such as the learning rate (lr), batch size (batch_size), and number of epochs (epochs). The values in the configuration dictionary control the training process.
Next, you define a function called main that mimics a typical training loop. For each epoch, the accuracy and loss are computed on the training and validation data sets.
This code is a mock training script. It does not train a model, but simulates the training process by generating random accuracy and loss values. The purpose of this code is to demonstrate how to integrate W&B into your training script.
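The script described above might look like the following sketch. The function names (train_one_epoch, evaluate_one_epoch, main) and configuration keys (lr, batch_size, epochs) follow the description; the specific formulas that generate the mock accuracy and loss values are illustrative.

```python
import random

# Hyperparameters that control the (simulated) training process.
config = {"lr": 0.0001, "batch_size": 32, "epochs": 5}

def train_one_epoch(epoch, lr, batch_size):
    """Simulate training for one epoch; return mock accuracy and loss."""
    acc = min(1.0, 0.25 + epoch * 0.1 + random.random() * 0.1)
    loss = max(0.0, 1.0 - epoch * 0.1 - random.random() * 0.1)
    return acc, loss

def evaluate_one_epoch(epoch):
    """Simulate evaluating on the validation set; return mock accuracy and loss."""
    val_acc = min(1.0, 0.2 + epoch * 0.1 + random.random() * 0.1)
    val_loss = max(0.0, 1.1 - epoch * 0.1 - random.random() * 0.1)
    return val_acc, val_loss

def main():
    # A typical training loop: compute metrics on both data sets each epoch.
    for epoch in range(1, config["epochs"] + 1):
        train_acc, train_loss = train_one_epoch(
            epoch, config["lr"], config["batch_size"]
        )
        val_acc, val_loss = evaluate_one_epoch(epoch)
        print(f"epoch {epoch}: train_acc={train_acc:.3f}, val_acc={val_acc:.3f}")

main()
```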
Training script with W&B Python SDK
How you integrate W&B into your Python script or notebook depends on how you manage sweeps. You can start a sweep job within a Python notebook or script, or from the command line.
Add the following to your Python script:
- Create a dictionary object where the key-value pairs define a sweep configuration. The sweep configuration defines the hyperparameters you want W&B to explore on your behalf, along with the metric you want to optimize. Continuing from the previous example, the batch size (batch_size), epochs (epochs), and learning rate (lr) are the hyperparameters to vary during each sweep. You want to maximize the validation accuracy, so you set "goal": "maximize" and name the variable you want to optimize for, in this case val_acc ("name": "val_acc").
- Pass the sweep configuration dictionary to wandb.sweep(). This initializes the sweep and returns a sweep ID (sweep_id). For more information, see Initialize sweeps.
- At the top of your script, import the W&B Python SDK (wandb).
- Within your main function, use the wandb.init() API to generate a background process to sync and log data as a W&B Run. Pass the project name as a parameter to the wandb.init() method. If you do not pass a project name, W&B uses the default project name.
- Fetch the hyperparameter values from the wandb.Run.config object. This lets you use the hyperparameter values defined in the sweep configuration dictionary instead of hard-coded values.
- Log the metric you are optimizing for to W&B using wandb.Run.log(). You must log the metric defined in your configuration. For example, if you define the metric to optimize as val_acc, you must log val_acc. If you do not log the metric, W&B does not know what to optimize for. Within the configuration dictionary (sweep_configuration in this example), you define the sweep to maximize the val_acc value.
- Start the sweep with wandb.agent(). Provide the sweep ID, the name of the function the sweep will execute (function=main), and set the maximum number of runs to try to four (count=4).
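Putting the steps above together, a minimal sketch of the instrumented script could look like this. The project name "my-first-sweep", the search method, and the parameter ranges are illustrative choices; the launch calls are left commented out because they contact the W&B service and require a logged-in wandb installation.

```python
import random

# Step 1: the sweep configuration — hyperparameters to explore and the
# metric to optimize. The "random" method and the ranges are illustrative.
sweep_configuration = {
    "method": "random",
    "metric": {"goal": "maximize", "name": "val_acc"},
    "parameters": {
        "lr": {"min": 0.0001, "max": 0.1},
        "batch_size": {"values": [16, 32, 64]},
        "epochs": {"values": [5, 10, 15]},
    },
}

def evaluate_one_epoch(epoch):
    """Simulate evaluation; return a mock validation accuracy."""
    return min(1.0, 0.2 + epoch * 0.1 + random.random() * 0.1)

def main():
    # Imported here so the mock helper above stays usable without W&B installed.
    import wandb

    # Step 4: start a W&B Run; pass a project name (illustrative here).
    run = wandb.init(project="my-first-sweep")

    # Step 5: fetch hyperparameters from the sweep instead of hard coding them.
    lr = run.config["lr"]              # used by a real training step
    batch_size = run.config["batch_size"]
    epochs = run.config["epochs"]

    for epoch in range(1, epochs + 1):
        val_acc = evaluate_one_epoch(epoch)
        # Step 6: log the metric named in the sweep configuration.
        run.log({"val_acc": val_acc})

# Steps 2 and 7 — uncomment to initialize the sweep and run the agent
# for at most four runs:
#   import wandb
#   sweep_id = wandb.sweep(sweep=sweep_configuration)
#   wandb.agent(sweep_id, function=main, count=4)
```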
Logging metrics to W&B in a sweep

You must log the metric you define and are optimizing for both in your sweep configuration and with wandb.Run.log(). For example, if you define the metric to optimize as val_acc within your sweep configuration, you must also log val_acc to W&B. If you do not log the metric, W&B does not know what to optimize for.

The following is an incorrect example of logging the metric to W&B. The metric optimized for in the sweep configuration is val_acc, but the code logs val_acc within a nested dictionary under the key validation. You must log the metric directly, not within a nested dictionary.
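To make the failure mode concrete, here is a sketch using plain dictionaries to show the two shapes. In a real script each dictionary would be the argument to wandb.Run.log(); the sweep looks up the metric by its top-level key, so only the second shape works.

```python
metric_name = "val_acc"  # the metric named in the sweep configuration

# Incorrect: the metric is nested under "validation", so there is no
# top-level "val_acc" key for the sweep to find.
bad_payload = {"validation": {"val_acc": 0.91}}

# Correct: the metric is logged directly under the name the sweep expects.
good_payload = {"val_acc": 0.91}

print(metric_name in bad_payload)   # False
print(metric_name in good_payload)  # True
```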