
AHB_MIP_match

AHB_MIP_match(data, holdout = 0.1, treated_column_name = "treated",
              outcome_column_name = "outcome", black_box = "BART", 
              cv = T, gamma0 = 3, gamma1 = 3, Beta = 2, m = 1, M = 1e+05, 
              n_prune = ifelse(is.numeric(holdout), round(0.1 * (1 - holdout) * nrow(data)), round(0.1 * nrow(data))))
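A minimal sketch of a call, using only argument names from the signature above. The toy data frame, its covariates `X1` and `X2`, and the 0/1 coding of the treatment column are assumptions made for illustration, not requirements confirmed by this page. The call is guarded so that it runs only when a function named AHB_MIP_match is already loaded in the session.

```r
set.seed(42)

# Hypothetical toy data: two covariates, a binary treatment, a numeric outcome.
n <- 200
toy <- data.frame(
  X1 = rnorm(n),
  X2 = rnorm(n),
  treated = rbinom(n, 1, 0.5)
)
toy$outcome <- toy$X1 + 2 * toy$treated + rnorm(n)

# Run the match only if the function is available in this session.
if (exists("AHB_MIP_match", mode = "function")) {
  fit <- AHB_MIP_match(
    data = toy,
    holdout = 0.2,                      # 20% of rows become the holdout set
    treated_column_name = "treated",
    outcome_column_name = "outcome",
    black_box = "BART",
    cv = FALSE                          # skip cross-validation to keep the run short
  )
  str(fit$CATE)
}
```

With holdout = 0.2, the description of `data` implies that roughly 160 of the 200 rows are matched and the rest are used to train the outcome model.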

Parameters

data:
file, Dataframe, required
If holdout is not a numeric value, this is the data to be matched. If holdout is a numeric scalar between 0 and 1, that proportion of data will be made into a holdout set and only the remaining proportion of data will be matched.
holdout:
numeric, file, Dataframe, optional (default = 0.1)
Holdout data used to train the outcome model. If a numeric scalar, that proportion of data will be made into a holdout set and only the remaining proportion of data will be matched. Otherwise, if a file path or dataframe is provided, that dataset will serve as the holdout data.
treated_column_name:
string, optional (default = 'treated')
The name of the column which specifies whether a unit is treated or control.
outcome_column_name:
string, optional (default = 'outcome')
The name of the column which specifies each unit's outcome.
black_box:
string, optional (default = 'BART')
The method used to fit the outcome model Y. If "BART" and cv = F, uses dbarts::bart with keeptrees = TRUE, keepevery = 10, verbose = FALSE, k = 2, and ntree = 200, then the default predict method to estimate the outcome. If "BART" and cv = T, k and ntree are set to the best values found by cross-validation. Further black_box options are planned for future releases.
cv:
logical, optional (default = T)
If TRUE, cross-validation is performed on the training set to fit the outcome model Y.
gamma0:
A numeric scalar, optional (default = 3)
One of the hyperparameters of the global MIP; it controls the weight placed on the outcome function portion of the loss.
gamma1:
A numeric scalar, optional (default = 3)
One of the hyperparameters of the global MIP; it controls the weight placed on the outcome function portion of the loss.
beta:
A numeric scalar, optional (default = 2)
One of the hyperparameters of the global MIP; it controls the weight placed on the outcome function portion of the loss.
m:
An integer scalar, optional (default = 1)
The minimum number of control units that the box must contain when estimating causal effects for a single treated unit.
M:
A positive integer scalar, optional (default = 1e+5)
Controls the weight placed on decision variable wij, which is an indicator for whether a unit is in the box.
n_prune:
A positive integer scalar, optional (default = 0.1 * nrow(dataset to be matched))
The number of candidate units on which the MIP is run when constructing each box. "Dataset" below refers to the dataset being matched. If the dataset has fewer than 400 units, the MIP is run on the whole dataset for each treated unit. If the dataset is larger and your machine does not have enough memory for the computation, set n_prune below 400, or smaller still. The fewer candidate units the MIP is run on, the faster the program runs.
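The default for n_prune depends on how holdout is supplied. The sketch below reproduces the default expression from the signature in base R; `default_n_prune` is a hypothetical helper written for this illustration and is not part of the package.

```r
# Hypothetical helper: evaluates the n_prune default exactly as written
# in the AHB_MIP_match signature.
default_n_prune <- function(data, holdout = 0.1) {
  ifelse(is.numeric(holdout),
         round(0.1 * (1 - holdout) * nrow(data)),  # holdout is a proportion
         round(0.1 * nrow(data)))                  # holdout is a separate dataset
}

toy <- data.frame(x = seq_len(1000))

# Proportion holdout: 10% of the 900 rows left for matching.
n_prop <- default_n_prune(toy, holdout = 0.1)

# Separate holdout data frame: 10% of all 1000 rows.
n_sep <- default_n_prune(toy, holdout = toy)

print(c(n_prop, n_sep))
```

In both cases the default is 10% of the rows that are actually matched, so passing an explicit, smaller n_prune is the way to reduce memory use on large datasets.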

Returns

$data:
dataframe
Data set that was matched by AHB_MIP_match(). If holdout is not a numeric value, then $data is the same as the data input into AHB_MIP_match(). If holdout is a numeric scalar between 0 and 1, $data is the remaining proportion of data that were matched.
$units_id:
integer vector
An integer vector with the unit_id of each test treated unit.
$CATE:
numeric vector
A numeric vector of conditional average treatment effect (CATE) estimates, one for each test treated unit, computed within its matched group in $MGs.
$bins:
numeric vector
An array of two lists where the first list contains the lower bounds and the second list contains the upper bounds for each hyper-box. Each row of each list corresponds to the hyper-box for a test treated unit in $units_id.
$MGs:
list
A list of all the matched groups formed by AHB_MIP_match(). Each row corresponds to one test treated unit and contains the unit_id of every unit that falls into its box, including the treated unit itself.
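As a rough illustration of how these return values line up, the sketch below builds a mock result by hand and summarizes it. The numbers are made up, and storing $MGs as a list of integer vectors is an assumption; only the element names $units_id, $CATE, and $MGs come from the descriptions above.

```r
# Mock object with the documented element names; values are invented
# for illustration and do not come from a real AHB_MIP_match() run.
mock_result <- list(
  units_id = c(3L, 7L, 12L),
  CATE     = c(1.8, 2.3, 2.1),
  MGs      = list(c(3L, 5L, 9L), c(7L, 2L), c(12L, 5L, 14L, 20L))
)

# One row per test treated unit: its id, its CATE estimate,
# and the number of units in its matched group.
summary_df <- data.frame(
  unit_id = mock_result$units_id,
  CATE    = mock_result$CATE,
  mg_size = lengths(mock_result$MGs)
)
print(summary_df)

# Average of the per-unit estimates across the test treated units.
avg_cate <- mean(summary_df$CATE)
```

Because position i of $units_id, $CATE, and $MGs all refer to the same test treated unit, the three can be combined by position without a join.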