enchanter.engine
BaseRunner
- class enchanter.engine.BaseRunner
Bases: abc.ABC, enchanter.engine.saving.RunnerIO
A class for creating runners to train PyTorch models.
Examples
>>> from comet_ml import Experiment
>>> import torch
>>> from enchanter.engine import BaseRunner
>>> class Runner(BaseRunner):
>>>     def __init__(self):
>>>         super(Runner, self).__init__()
>>>         self.model = torch.nn.Linear(10, 10)
>>>         self.optimizer = torch.optim.Adam(self.model.parameters())
>>>         self.experiment = Experiment()
>>>         self.criterion = torch.nn.CrossEntropyLoss()
>>>
>>>     def train_step(self, batch):
>>>         x, y = batch
>>>         out = self.model(x)
>>>         loss = self.criterion(out, y)
>>>
>>>         return {"loss": loss}
Methods
add_loader(mode, loader) – A method to register a DataLoader to be used for training etc.
backward(loss) – Calculates the gradient.
fit(x, y, **kwargs) – Scikit-Learn style training method.
freeze() – A method to freeze the model’s parameters so that their gradients are not calculated.
initialize() – The method that prepares the Runner.
log_hyperparams([dic, prefix]) – Logs hyperparameters.
predict(x) – A method that makes predictions based on the given input.
quite() – Quit Runner.
run([phase, verbose, sleep_time]) – Executes the runner.
train_step(batch) – Computes the output and the loss for one batch when training the neural network.
train_end(outputs) – This method is executed at the end of each step of neural network training.
train_cycle(epoch, loader) – This is a training loop for the neural net.
train_config(epochs[, checkpoint_path, monitor]) – This method is used to specify the number of epochs and related settings when you execute using the .run() method.
test_step(batch) – This method is executed at every step when testing the neural net.
test_end(outputs) – This method is executed at the end of each step of neural network testing.
test_cycle(loader) – This is a testing loop for the neural net.
unfreeze() – A method that makes the parameters frozen by .freeze() trainable again.
update optimizer
update_scheduler(epoch) – Method called to update the value of the scheduler.
val_step(batch) – This method is executed at every step when validating the neural net.
val_end(outputs) – This method is executed at the end of each step of neural network validation.
val_cycle(epoch, loader) – This is a validation loop for the neural net.
- add_loader(mode: str, loader: Union[torch.utils.data.dataloader.DataLoader, Any])
A method to register a DataLoader to be used for training etc. in a runner.
- Parameters
mode (str) – Specify one of [‘train’, ‘val’, ‘test’].
loader (torch.utils.data.DataLoader) –
Examples
>>> train_loader = DataLoader(...)
>>> runner: BaseRunner = ...
>>> runner.add_loader("train", train_loader)
- backward(loss: torch.Tensor) → None
Calculates the gradient. If self.scaler is a torch.cuda.amp.GradScaler object, the backward pass is automatically processed by amp.
- Parameters
loss (torch.Tensor) –
- Returns
None
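The scaler branch described above can be pictured with a minimal sketch. This is not the library's implementation: `backward_sketch` and its `scaler` argument are hypothetical names used for illustration, and only `GradScaler.scale()` is an actual PyTorch call.

```python
from typing import Optional

import torch
from torch.cuda.amp import GradScaler


def backward_sketch(loss: torch.Tensor, scaler: Optional[GradScaler] = None) -> None:
    """Hypothetical stand-in for BaseRunner.backward (an assumption, not library code)."""
    if isinstance(scaler, GradScaler):
        # With a GradScaler, the loss is scaled before backpropagation
        # so that small half-precision gradients do not underflow.
        scaler.scale(loss).backward()
    else:
        # Without a scaler, this is an ordinary backward pass.
        loss.backward()


model = torch.nn.Linear(4, 2)
loss = model(torch.randn(8, 4)).sum()
backward_sketch(loss)  # no scaler is given, so the plain branch runs
```

After the call, the gradients are available on the model parameters, for example `model.weight.grad`.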
- fit(x: numpy.ndarray, y: numpy.ndarray, **kwargs)
Scikit-Learn style training method.
- Parameters
x – Training data
y – Label
**kwargs –
- freeze() → None
A method to freeze the model’s parameters so that their gradients are not calculated.
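As a rough illustration of what freezing and unfreezing usually mean in PyTorch, the sketch below toggles `requires_grad` on every parameter. It assumes this is how the runner behaves; `freeze_sketch` and `unfreeze_sketch` are hypothetical helpers, not part of enchanter.

```python
import torch


def freeze_sketch(model: torch.nn.Module) -> None:
    """Hypothetical helper: stop gradient computation for all parameters."""
    for param in model.parameters():
        param.requires_grad = False


def unfreeze_sketch(model: torch.nn.Module) -> None:
    """Hypothetical helper: make all parameters trainable again."""
    for param in model.parameters():
        param.requires_grad = True


model = torch.nn.Linear(10, 10)

freeze_sketch(model)
frozen = [p.requires_grad for p in model.parameters()]

unfreeze_sketch(model)
thawed = [p.requires_grad for p in model.parameters()]
```

Here `frozen` holds the flags recorded while the model was frozen, and `thawed` the flags after it was made trainable again.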
- initialize() → None
The method that prepares the Runner. If the variables required for execution, such as self.model, self.optimizer, and self.experiment, are not defined, the program will exit with an error message. If there are no problems, the model is passed to the CPU or GPU.
- Returns
None
- log_hyperparams(dic: Optional[Dict] = None, prefix: Optional[str] = None) → None
Logs hyperparameters.
- Parameters
dic (Dict) –
prefix (str) –
- Returns
None
- predict(x: Union[torch.Tensor, numpy.ndarray]) → numpy.ndarray
A method that makes predictions based on the given input.
- Parameters
x (Union[torch.Tensor, np.ndarray]) –
- Returns
The predictions, as a numpy.ndarray.
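The input and output types in the signature suggest a conversion step on both sides of the model call. The following is a minimal sketch of that idea only; `predict_sketch` is a hypothetical function, and the real method may handle devices and dtypes differently.

```python
import numpy as np
import torch


def predict_sketch(model: torch.nn.Module, x) -> np.ndarray:
    """Hypothetical helper: accept an ndarray or a Tensor and return an ndarray."""
    if isinstance(x, np.ndarray):
        x = torch.from_numpy(x)
    model.eval()
    with torch.no_grad():  # gradients are not needed for inference
        out = model(x)
    return out.cpu().numpy()


model = torch.nn.Linear(10, 3)
pred = predict_sketch(model, np.zeros((4, 10), dtype=np.float32))
```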
- quite() → None
Quit Runner.
When this method is executed, it sends an exit command to comet.ml.
- run(phase: str = 'all', verbose: bool = True, sleep_time: int = 1)
Executes the runner. Before running it, you must register a data loader using self.add_loader().
- Parameters
phase (str) – Determines the execution phase. Specify one of the following. Default: all
train
val
test
all
debug
verbose (bool) – If true, progress is displayed.
sleep_time (int) – The time to wait for data transfer to the comet.ml server (in seconds). Default: 1 (sec).
Notes
If a TypeError occurs even though there is no mistake in the monitor expression specified by .train_config(), the reason may be that it takes a long time to transfer the data to the comet.ml server. Try setting sleep_time to about 5 seconds.
- Returns
None
- test_cycle(loader: torch.utils.data.dataloader.DataLoader) → None
This is a testing loop for the neural net.
- Parameters
loader –
- Returns
None
- test_end(outputs: List) → Dict[str, torch.Tensor]
This method is executed at the end of each step of neural network testing.
- Parameters
outputs –
- Returns
You need to return a dictionary.
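One plausible shape for such a method is to average the per-step losses. The sketch below assumes that `outputs` is the list of dictionaries returned by the step method and that each contains a "loss" entry; the function name and the "avg_loss" key are illustrative assumptions, not names defined by the library.

```python
from typing import Dict, List

import torch


def aggregate_step_outputs(outputs: List[Dict[str, torch.Tensor]]) -> Dict[str, torch.Tensor]:
    """Hypothetical aggregation: average the loss collected over all steps."""
    avg_loss = torch.stack([out["loss"] for out in outputs]).mean()
    return {"avg_loss": avg_loss}


step_outputs = [{"loss": torch.tensor(1.0)}, {"loss": torch.tensor(3.0)}]
result = aggregate_step_outputs(step_outputs)
```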
- test_step(batch: Tuple) → Dict[str, torch.Tensor]
This method is executed at every step when testing the neural net. See train_step() for help.
- Parameters
batch – A tuple containing the data and labels obtained from the PyTorch DataLoader.
- Returns
You need to return a dictionary.
- train_config(epochs: int, checkpoint_path: Optional[str] = None, monitor: Optional[str] = None)
This method is used to specify the number of epochs and related settings when you execute using the .run() method.
Examples
>>> runner: BaseRunner = ...
>>> runner.train_config(
>>>     epochs=10,
>>>     checkpoint_path="/path/to/checkpoint_dir",
>>>     monitor="validate_avg_acc >= 0.75"
>>> )
- Parameters
epochs (int) – Specify the number of training epochs.
checkpoint_path – Specify the name of the directory where checkpoints are stored. If monitor is not specified, weights are stored for all epochs.
monitor – Save only the epochs that satisfy the specified expression. checkpoint_path must be set together with it.
- Returns
None
Notes
When you specify the monitor argument, be sure to put a space between ‘keyword’, ‘symbol’, and ‘value’.
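The spacing requirement is easier to see with a small sketch of how a 'keyword symbol value' string could be evaluated. This is an assumption about the general approach, not the library's parser; `check_monitor` and the set of supported symbols are hypothetical.

```python
import operator

# Comparison symbols understood by this sketch (an assumption; the library
# may support a different set).
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "==": operator.eq,
}


def check_monitor(expression: str, metrics: dict) -> bool:
    """Hypothetical helper: evaluate 'keyword symbol value' against logged metrics."""
    # Splitting on single spaces is why the three parts must be space-separated.
    keyword, symbol, value = expression.split(" ")
    return _OPS[symbol](metrics[keyword], float(value))


ok = check_monitor("validate_avg_acc >= 0.75", {"validate_avg_acc": 0.8})
bad = check_monitor("validate_avg_acc >= 0.75", {"validate_avg_acc": 0.5})
```

In this sketch, an expression written without spaces, such as "validate_avg_acc>=0.75", cannot be split into three parts and raises a ValueError.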
- train_cycle(epoch: int, loader: torch.utils.data.dataloader.DataLoader) → None
This is a training loop for neural net.
- Parameters
epoch –
loader (torch.utils.data.DataLoader) –
- train_end(outputs: List) → Dict[str, torch.Tensor]
This method is executed at the end of each step of neural network training.
- Parameters
outputs –
- Returns
A dictionary (Dict[str, torch.Tensor]).
- train_step(batch: Tuple) → Dict[str, torch.Tensor]
When training the neural network with a loop such as
>>> import torch.nn as nn
>>> train_loader: DataLoader = ...
>>> model: nn.Module = ...
>>> criterion = ...
>>> for x, y in train_loader:
>>>     out = model(x)
>>>     loss = criterion(out, y)
this method is responsible for the body of the loop shown above.
- Parameters
batch – A tuple containing the data and labels obtained from the PyTorch DataLoader.
- Returns
You need to return a dictionary with the key ‘loss’.
Examples
>>> def train_step(self, batch):
>>>     x, y = batch
>>>     out = self.model(x)
>>>     loss = nn.functional.cross_entropy(out, y)
>>>     return {"loss": loss}
- unfreeze() → None
A method that makes the parameters frozen by .freeze() trainable again.
- update_scheduler(epoch: int) → None
Method called to update the value of the scheduler. It is called after updating the Optimizer.
- Parameters
epoch (int) – Current Epoch
- Returns
None
Examples
>>> from torch.optim import lr_scheduler
>>> from enchanter.tasks import ClassificationRunner
>>> runner: BaseRunner = ClassificationRunner(
>>>     model=...,
>>>     optimizer=...,
>>>     criterion=...,
>>>     scheduler=[
>>>         lr_scheduler.CosineAnnealingLR(...),
>>>         lr_scheduler.StepLR(...)
>>>     ]
>>> )
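Since the schedulers are passed as a list and updated after the Optimizer, the update can be pictured as stepping each scheduler in turn. The sketch below is only an illustration under that assumption; `update_scheduler_sketch` is a hypothetical function, and the real method may use the epoch argument differently.

```python
import torch
from torch.optim import lr_scheduler

model = torch.nn.Linear(2, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# A single StepLR that halves the learning rate after every step.
schedulers = [lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)]


def update_scheduler_sketch(schedulers, epoch: int) -> None:
    """Hypothetical helper: advance every registered scheduler once."""
    for scheduler in schedulers:
        scheduler.step()


optimizer.step()                       # the Optimizer is updated first
update_scheduler_sketch(schedulers, epoch=0)  # then the schedulers
lr = optimizer.param_groups[0]["lr"]   # learning rate after the update
```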
- val_cycle(epoch: int, loader: torch.utils.data.dataloader.DataLoader) → None
This is a validation loop for the neural net.
- Parameters
epoch –
loader –
- Returns
None
RunnerIO
- class enchanter.engine.RunnerIO
Bases: object
A class responsible for loading and saving parameters such as PyTorch model weights and Optimizer state.
Methods
fetch model name
A method to output model weights and Optimizer state as a dictionary.
load_checkpoint(checkpoint) – Takes a dictionary with keys model_state_dict and optimizer_state_dict and uses them to restore the state of the model and the Optimizer.
save([directory, epoch, filename]) – Save the model and the Optimizer state file in the specified directory.
load(filename[, map_location]) – Restores the model and Optimizer state based on the specified file.
- load(filename: str, map_location: str = 'cpu')
Restores the model and Optimizer state based on the specified file.
- Parameters
filename (str) –
map_location (str) – default: ‘cpu’
- load_checkpoint(checkpoint: Dict[str, collections.OrderedDict])
Takes a dictionary with keys model_state_dict and optimizer_state_dict and uses them to restore the state of the model and the Optimizer.
- Parameters
checkpoint – Takes a dictionary with the following keys and values.
model_state_dict: model weights
optimizer_state_dict: Optimizer state
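The expected dictionary can be illustrated with plain PyTorch objects. This sketch only shows the two keys named above and how such a dictionary restores state through `load_state_dict`; it does not use RunnerIO itself, and the variable names are illustrative.

```python
import torch

# Source objects whose state is captured.
model = torch.nn.Linear(3, 1)
optimizer = torch.optim.Adam(model.parameters())

checkpoint = {
    "model_state_dict": model.state_dict(),          # model weights
    "optimizer_state_dict": optimizer.state_dict(),  # Optimizer state
}

# Fresh objects that receive the saved state.
restored_model = torch.nn.Linear(3, 1)
restored_optimizer = torch.optim.Adam(restored_model.parameters())
restored_model.load_state_dict(checkpoint["model_state_dict"])
restored_optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
```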
- save(directory: Optional[str] = None, epoch: Optional[int] = None, filename: Optional[str] = None)
Save the model and the Optimizer state file in the specified directory.
Notes
The enchanter_checkpoints_epoch_{}.pth file contains model_state_dict and optimizer_state_dict.
- Parameters
directory (Optional[str]) –
epoch (Optional[int]) –
filename (Optional[str]) –
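The file name pattern in the note suggests that the epoch number fills the `{}` placeholder. The sketch below shows one way the three arguments could combine into a path; how the library actually resolves them is an assumption here, and `checkpoint_path_sketch` is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional


def checkpoint_path_sketch(directory: str, epoch: int, filename: Optional[str] = None) -> Path:
    """Hypothetical helper: build the checkpoint path from directory, epoch and filename."""
    if filename is None:
        # Fall back to the pattern mentioned in the note, filled with the epoch.
        filename = "enchanter_checkpoints_epoch_{}.pth".format(epoch)
    return Path(directory) / filename


path = checkpoint_path_sketch("checkpoints", 3)
custom = checkpoint_path_sketch("checkpoints", 3, filename="best.pth")
```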
enchanter.engine.modules
is_jupyter
get_dataset
- enchanter.engine.modules.get_dataset(x: Union[numpy.ndarray, torch.Tensor], y: Optional[Union[numpy.ndarray, torch.Tensor]] = None) → torch.utils.data.dataset.Dataset
Generates a torch.utils.data.TensorDataset based on the values entered.
Examples
>>> import torch
>>> from enchanter.engine.modules import get_dataset
>>> x = torch.randn(512, 6)
>>> y = torch.randint(0, 9, size=[512])
>>> ds = get_dataset(x, y)
- Parameters
x (Union[np.ndarray, torch.Tensor]) –
y (Optional[Union[np.ndarray, torch.Tensor]]) –
- Returns
torch.utils.data.TensorDataset
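Given the accepted input types, the function presumably converts NumPy arrays to tensors before wrapping them. The following is a minimal sketch under that assumption; `get_dataset_sketch` is a hypothetical re-implementation for illustration, not the library's code.

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset


def get_dataset_sketch(x, y=None) -> TensorDataset:
    """Hypothetical helper: wrap arrays or tensors in a TensorDataset."""
    if isinstance(x, np.ndarray):
        x = torch.from_numpy(x)
    if y is None:
        # Without labels, the dataset holds only the inputs.
        return TensorDataset(x)
    if isinstance(y, np.ndarray):
        y = torch.from_numpy(y)
    return TensorDataset(x, y)


ds = get_dataset_sketch(np.zeros((512, 6), dtype=np.float32), np.arange(512))
unlabeled = get_dataset_sketch(torch.randn(8, 6))
```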
fix_seed
- enchanter.engine.modules.fix_seed(seed: int, deterministic: bool = False, benchmark: bool = False) → None
Fixes the seed values of PyTorch, NumPy, and pure Python random at once.
Examples
>>> import torch
>>> import numpy as np
>>> from enchanter.engine.modules import fix_seed
>>> fix_seed(0)
>>> x = torch.randn(...)
>>> y = np.random.randn(...)
- Parameters
seed (int) – random state (seed)
deterministic (bool) – Whether to ensure reproducibility as much as possible on CuDNN.
benchmark (bool) –
- Returns
None
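Seeding the three libraries together typically looks like the sketch below. It assumes that the deterministic and benchmark arguments map to the cuDNN backend flags of the same names; `fix_seed_sketch` is a hypothetical function, not the library's implementation.

```python
import random

import numpy as np
import torch


def fix_seed_sketch(seed: int, deterministic: bool = False, benchmark: bool = False) -> None:
    """Hypothetical helper: seed Python, NumPy and PyTorch in one call."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    # Assumption: the two flags are forwarded to the cuDNN backend settings.
    torch.backends.cudnn.deterministic = deterministic
    torch.backends.cudnn.benchmark = benchmark


fix_seed_sketch(0)
first = (random.random(), np.random.rand(), torch.rand(1).item())

fix_seed_sketch(0)
second = (random.random(), np.random.rand(), torch.rand(1).item())
```

Because the seed is reset before each draw, `first` and `second` hold the same three values.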