- The most recent dials and parsnip releases introduced tuning integration for the lightgbm `num_leaves` engine argument. The `num_leaves` parameter sets the maximum number of leaves (terminal nodes) per tree and is an important tuning parameter for lightgbm (tidymodels/dials#256, tidymodels/parsnip#838). With the newest versions of dials, parsnip, and bonsai installed, tune this argument by marking the `num_leaves` engine argument for tuning when defining your model specification:
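A minimal sketch of such a specification follows. The regression mode and the object name `bt_spec` are illustrative assumptions, not part of the release notes.

```r
library(bonsai)  # attaches parsnip and registers the "lightgbm" engine
library(tune)    # provides the tune() placeholder

# Mark the lightgbm engine argument `num_leaves` for tuning.
bt_spec <-
  boost_tree() %>%
  set_mode("regression") %>%  # illustrative choice of mode
  set_engine("lightgbm", num_leaves = tune())

bt_spec
```

The resulting specification can then be passed to the tune package's grid or iterative search functions like any other tunable model.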
- Fixed a bug where lightgbm's parallelism argument `num_threads` was overridden when passed via `param` rather than as a main argument. By default, then, lightgbm will fit sequentially rather than with `num_threads = foreach::getDoParWorkers()`. The user can still set `num_threads` via engine arguments with `engine = "lightgbm"`:
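As a hedged sketch, an explicit thread count might be passed like this. The value `2`, the regression mode, and the object name `par_spec` are arbitrary assumptions for illustration.

```r
library(bonsai)  # attaches parsnip and registers the "lightgbm" engine

# Request two threads from lightgbm via an engine argument.
par_spec <-
  boost_tree() %>%
  set_mode("regression") %>%  # illustrative choice of mode
  set_engine("lightgbm", num_threads = 2)

par_spec
```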
Note that, when tuning hyperparameters with the tune package, detection of the parallel backend will still work as usual.
- `stop_iter` now maps to the lightgbm argument `early_stopping_round` rather than its alias `early_stopping_rounds`. This does not affect parsnip's interface to lightgbm (i.e. via `boost_tree() %>% set_engine("lightgbm")`), but it will introduce errors for code that uses the `train_lightgbm()` wrapper directly and sets `early_stopping_round` by its alias `early_stopping_rounds`.
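Through the parsnip interface, early stopping is requested with the main argument `stop_iter`, so code written this way should be unaffected by the change. In the sketch below, the numeric values, the regression mode, and the `validation` proportion are illustrative assumptions.

```r
library(bonsai)  # attaches parsnip and registers the "lightgbm" engine

# `stop_iter` is the parsnip-level name; bonsai translates it to
# lightgbm's `early_stopping_round` when fitting.
es_spec <-
  boost_tree(trees = 500, stop_iter = 10) %>%
  set_mode("regression") %>%  # illustrative choice of mode
  # Hold out 20% of the training data to monitor for early stopping
  # (assumed value, chosen only for illustration).
  set_engine("lightgbm", validation = 0.2)

es_spec
```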
- Disallowed passing main model arguments as engine arguments to `set_engine("lightgbm", ...)` via aliases. That is, if a main argument is marked for tuning and a lightgbm alias is supplied as an engine argument, bonsai will now error rather than supplying both to lightgbm and letting that package resolve the aliases. Users can still interface with non-main `boost_tree()` arguments via their lightgbm aliases (#53).
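A sketch of the pattern that remains supported: tune main arguments under their parsnip names, and reserve aliases for parameters that have no main-argument counterpart. The use of `reg_alpha` (a lightgbm alias for `lambda_l1`), its value, and the regression mode are assumptions chosen for illustration.

```r
library(bonsai)  # attaches parsnip and registers the "lightgbm" engine
library(tune)    # provides the tune() placeholder

alias_spec <-
  # The number of trees is a main argument, so it is marked for tuning
  # under its parsnip name rather than under a lightgbm alias.
  boost_tree(trees = tune()) %>%
  set_mode("regression") %>%  # illustrative choice of mode
  # `lambda_l1` has no main-argument counterpart in boost_tree(), so
  # it is passed as an engine argument under its alias (arbitrary value).
  set_engine("lightgbm", reg_alpha = 0.1)

alias_spec
```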
CRAN release: 2022-08-31
- Enabled bagging with lightgbm via the `sample_size` argument to `boost_tree()` (#32 and tidymodels/parsnip#768). The following docs, now available in `?details_boost_tree_lightgbm`, describe the interface in detail:
The `sample_size` argument is translated to the `bagging_fraction` parameter passed to `lgb.train()`. The argument is interpreted by lightgbm as a proportion rather than a count, so bonsai internally reparameterizes the `sample_size` argument with `dials::sample_prop()` during tuning.
To effectively enable bagging, the user would also need to set the `bagging_freq` argument to lightgbm. `bagging_freq` defaults to 0, which means bagging is disabled, and a `bagging_freq` argument of `k` means that the booster will perform bagging at every `k`th boosting iteration. Thus, by default, the `sample_size` argument would be ignored without setting this argument manually. Other boosting libraries, like xgboost, do not have an analogous argument to `bagging_freq` and use `k = 1` when the analogue to `bagging_fraction` is in (0, 1). bonsai will thus automatically set `bagging_freq = 1` in `lgb.train()` if `sample_size` (i.e. `bagging_fraction`) is not equal to 1 and no `bagging_freq` value is supplied. This default can be overridden by setting the `bagging_freq` argument to `set_engine()` manually.
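The bagging interface described above might be used as in the following sketch. The proportion `0.5`, the frequency `5`, the regression mode, and the object names are illustrative assumptions.

```r
library(bonsai)  # attaches parsnip and registers the "lightgbm" engine

# Bag half of the rows; with no bagging_freq supplied, bonsai is
# described as setting bagging_freq = 1 on the user's behalf.
bag_default <-
  boost_tree(sample_size = 0.5) %>%
  set_mode("regression") %>%  # illustrative choice of mode
  set_engine("lightgbm")

# Override that default: re-draw the bagged sample every 5th iteration.
bag_custom <-
  boost_tree(sample_size = 0.5) %>%
  set_mode("regression") %>%
  set_engine("lightgbm", bagging_freq = 5)

bag_custom
```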
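- Corrected mapping of the `mtry` argument in `boost_tree()` with the lightgbm engine. `mtry` previously mapped to the `feature_fraction` argument to `lgb.train()` but was documented as mapping to an argument more closely resembling `feature_fraction_bynode`. `mtry` now maps to `feature_fraction_bynode`.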
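One way to inspect how a main argument is handed to the engine is parsnip's `translate()`, which prints the fit template for a specification. The sketch below assumes an arbitrary `mtry` value and a regression mode; no particular printed output is claimed here.

```r
library(bonsai)  # attaches parsnip and registers the "lightgbm" engine

# Sample 3 predictors at each split (arbitrary illustrative value).
mtry_spec <-
  boost_tree(mtry = 3) %>%
  set_mode("regression") %>%  # illustrative choice of mode
  set_engine("lightgbm")

# Show the engine-level call template that parsnip will use when fitting.
translate(mtry_spec)
```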
- Fixed error in lightgbm with engine argument `objective = "tweedie"` and response values less than 1.
- A number of documentation improvements, increases in testing coverage, and changes to internals in anticipation of the 4.0.0 release of the lightgbm package. Thank you to @jameslamb for the effort and expertise!