| Title: | Hybrid Stepwise Regression with Single-Split Dummy Encoding |
|---|---|
| Description: | Implements 'SplitWise', a hybrid regression approach that transforms numeric variables into either single-split (0/1) dummy variables or retains them as continuous predictors. The transformation is followed by stepwise selection to identify the most relevant variables. The default 'iterative' mode adaptively explores partial synergies among variables to enhance model performance, while an alternative 'univariate' mode applies simpler transformations independently to each predictor. For details, see Kurbucz et al. (2025) <doi:10.48550/arXiv.2505.15423>. |
| Authors: | Marcell T. Kurbucz [aut, cre], Nikolaos Tzivanakis [aut], Nilufer Sari Aslam [aut], Adam M. Sykulski [aut] |
| Maintainer: | Marcell T. Kurbucz <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 1.0.2 |
| Built: | 2026-05-27 09:54:29 UTC |
| Source: | https://github.com/cran/SplitWise |

Either transforms each numeric variable into a single-split (0/1) dummy
or keeps it linear, then runs stats::step() for stepwise selection.
The user can choose between a simpler univariate transformation and an
iterative approach.
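
To make the two encodings concrete, the base R sketch below fits the same response once with a predictor kept linear and once with that predictor replaced by a single-split 0/1 dummy. The cut point (the median of hp) and the names `d`, `cut_point`, and `hp_dummy` are assumptions chosen for illustration; this is not how SplitWise picks its split.

```r
# Illustration only: compare a linear term with a single-split dummy.
# The median cut point is an arbitrary assumption, not SplitWise's rule.
data(mtcars)
d <- mtcars                                   # work on a copy

cut_point  <- median(d$hp)                    # hypothetical threshold
d$hp_dummy <- as.integer(d$hp >= cut_point)   # 0/1 encoding of hp

fit_linear <- lm(mpg ~ hp, data = d)          # predictor kept continuous
fit_dummy  <- lm(mpg ~ hp_dummy, data = d)    # predictor as a single split

# Lower AIC indicates the better trade-off between fit and complexity.
c(linear = AIC(fit_linear), dummy = AIC(fit_dummy))
```

Both models estimate the same number of parameters, so the AIC difference here reflects fit alone.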
    splitwise(
      formula,
      data,
      transformation_mode = c("iterative", "univariate"),
      direction = c("backward", "forward", "both"),
      min_support = 0.1,
      min_improvement = 3,
      criterion = c("AIC", "BIC"),
      exclude_vars = NULL,
      verbose = FALSE,
      steps = 1000,
      k = 2,
      ...
    )

    ## S3 method for class 'splitwise_lm'
    print(x, ...)

    ## S3 method for class 'splitwise_lm'
    summary(object, ...)

    ## S3 method for class 'splitwise_lm'
    predict(object, newdata, ...)

    ## S3 method for class 'splitwise_lm'
    coef(object, ...)

    ## S3 method for class 'splitwise_lm'
    fitted(object, ...)

    ## S3 method for class 'splitwise_lm'
    residuals(object, ...)

    ## S3 method for class 'splitwise_lm'
    model.matrix(object, ...)

| Argument | Description |
|---|---|
| formula | A formula specifying the response and (initial) predictors, e.g. … |
| data | A data frame containing the variables used in formula. |
| transformation_mode | Either "iterative" or "univariate". |
| direction | Stepwise direction: "backward", "forward", or "both". |
| min_support | Minimum fraction (between 0 and 0.5) of observations needed in either group when making a dummy split. Prevents over-fragmented or tiny dummy groups. Default = 0.1. |
| min_improvement | Minimum required improvement (in AIC/BIC units) for accepting a dummy split or variable transformation. Helps guard against overfitting from marginal improvements. Default = 3. |
| criterion | Either "AIC" or "BIC". |
| exclude_vars | A character vector naming variables that should be forced to remain linear (i.e., no dummy splits allowed). Default = NULL. |
| verbose | Logical; if … |
| steps | Maximum number of steps for step(). |
| k | Penalty multiple for the number of degrees of freedom (used by step()). |
| ... | Additional arguments passed to … |
| x | A splitwise_lm object. |
| object | An object of class splitwise_lm. |
| newdata | A data frame of new data (with original predictors) to generate predictions for. The appropriate dummy variables will be generated using the transformation rules learned during model training. If omitted, predictions for the training data are returned. |

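
The roles of min_support and min_improvement can be sketched with a small base R helper. `find_single_split()` is a hypothetical function written for this illustration and is not part of SplitWise. It assumes candidate cut points at midpoints between distinct values and AIC as the criterion. The package's actual search may differ.

```r
# Hypothetical helper, not part of SplitWise: searches one cut point for a
# single numeric predictor under min_support and min_improvement constraints.
find_single_split <- function(y, x, min_support = 0.1, min_improvement = 3) {
  base_aic <- AIC(lm(y ~ x))            # reference: predictor kept linear
  best <- list(threshold = NA_real_, aic = base_aic)

  # Candidate cut points: midpoints between consecutive distinct values.
  vals <- sort(unique(x))
  if (length(vals) < 2) return(best)
  candidates <- (head(vals, -1) + tail(vals, -1)) / 2

  for (cp in candidates) {
    d <- as.integer(x >= cp)
    share <- mean(d)                    # fraction of observations with d == 1
    # Both groups must hold at least min_support of the observations.
    if (share < min_support || (1 - share) < min_support) next
    aic <- AIC(lm(y ~ d))
    if (aic < best$aic) best <- list(threshold = cp, aic = aic)
  }

  # Accept the split only if it beats the linear fit by min_improvement.
  if (is.na(best$threshold) || base_aic - best$aic < min_improvement) {
    return(list(threshold = NA_real_, aic = base_aic))
  }
  best
}

data(mtcars)
res <- find_single_split(mtcars$mpg, mtcars$hp)
res$threshold  # NA means the predictor would stay linear under these settings
```

Raising min_improvement makes the helper more conservative about replacing a linear term, and raising min_support rules out splits that isolate only a few observations.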
An S3 object of class c("splitwise_lm", "lm"), storing:

| Component | Description |
|---|---|
| splitwise_info | List containing transformation decisions, final data, and call. |

print(splitwise_lm): Prints a summary of the splitwise_lm object.
summary(splitwise_lm): Provides a detailed summary, including how dummies
were created.
predict(splitwise_lm): Generate predictions from a splitwise_lm object
using learned transformation rules.
coef(splitwise_lm): Extract model coefficients from a SplitWise linear
model.
fitted(splitwise_lm): Extract fitted values from a SplitWise linear model.
residuals(splitwise_lm): Extract residuals from a SplitWise linear model.
model.matrix(splitwise_lm): Extract the model matrix from a SplitWise linear model.
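
The methods listed above can be exercised as in the following sketch. It assumes the SplitWise package is installed and does nothing otherwise. The object names `model`, `new_cars`, and `preds` are illustrative.

```r
# Assumes the SplitWise package is installed; the block is skipped otherwise.
if (requireNamespace("SplitWise", quietly = TRUE)) {
  data(mtcars)
  model <- SplitWise::splitwise(
    mpg ~ .,
    data = mtcars,
    transformation_mode = "univariate",
    direction = "backward"
  )

  print(head(coef(model)))       # coefficients of the selected model
  print(head(fitted(model)))     # fitted values on the training data
  print(head(residuals(model)))  # training residuals

  # newdata holds the original predictors; per the predict() description,
  # the dummy variables are rebuilt from the learned transformation rules.
  new_cars <- mtcars[1:5, ]
  preds <- predict(model, newdata = new_cars)
  print(preds)
}
```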
    # Load the mtcars dataset
    data(mtcars)

    # Univariate transformations (AIC-based, backward stepwise)
    model_uni <- splitwise(
      mpg ~ .,
      data = mtcars,
      transformation_mode = "univariate",
      direction = "backward"
    )
    summary(model_uni)

    # Iterative approach (BIC-based, forward stepwise)
    # Note: typically set k = log(nrow(mtcars)) for BIC in step().
    model_iter <- splitwise(
      mpg ~ .,
      data = mtcars,
      transformation_mode = "iterative",
      direction = "forward",
      criterion = "BIC",
      k = log(nrow(mtcars))
    )
    summary(model_iter)
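
The note about k = log(nrow(mtcars)) in the example can be checked with base R alone. The sketch below uses an arbitrary illustrative model (mpg on wt, hp, and qsec) and shows that the penalty multiple k turns AIC into BIC when k equals the log of the sample size. Whether splitwise() adjusts k automatically when criterion = "BIC" is not stated here, so the example passes it explicitly, as the original example does.

```r
# Base R only: the penalty multiple k turns AIC into BIC when k = log(n).
data(mtcars)
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)  # arbitrary illustrative model
n <- nrow(mtcars)

aic_default <- AIC(fit)                # default penalty, k = 2
bic_via_k   <- AIC(fit, k = log(n))    # same penalty as BIC
all.equal(bic_via_k, BIC(fit))         # TRUE

# stats::step() accepts the same k; with k = log(n), selection is BIC-based.
sel <- step(fit, direction = "backward", k = log(n), trace = 0)
formula(sel)
```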