Link Search Menu Expand Document

AHB_fast_match

AHB_fast_match(data, holdout = 0.1, treated_column_name = "treated", 
               outcome_column_name = "outcome", black_box = "BART", 
               cv = T, C = 0.1)

Parameters

data:
file, Dataframe, required
If holdout is not a numeric value, this is the data to be matched. If holdout is a numeric scalar between 0 and 1, that proportion of data will be made into a holdout set and only the remaining proportion of data will be matched.
holdout:
numeric, file, Dataframe, optional (default = 0.1)
Holdout data used to train the outcome model. If a numeric scalar, that proportion of data will be made into a holdout set and only the remaining proportion of data will be matched. Otherwise, if a file path or dataframe is provided, that dataset will serve as the holdout data.
treated_column_name:
string, optional (default = 'treated')
The name of the column which specifies whether a unit is treated or control.
outcome_column_name:
string, optional (default = 'outcome')
The name of the column which specifies each unit outcome.
black_box
string, optional (default = 'BART)
Denotes the method to be used to generate outcome model Y. If "BART" and cv = F, uses dbarts::bart with keeptrees = TRUE, keepevery = 10, verbose = FALSE, k = 2 and ntree =200 and then the default predict method to estimate the outcome. If "BART" and cv = T, k and ntree will be best values from cross validation. Defaults to 'BART'. There will be multiple choices about black_box in the future.
cv
logical, optional (default = T)
If TURE, do cross-validation on the train set to generate outcome model Y
C
A positive scalar, optional (default = 0.1)
Determines the stopping condition for Fast AHB. When the variance in a newly expanded region exceeds C times the variance in the previous expansion region, the algorithm stops. Thus, higher C encourages coarser bins while lower C encourages finer ones. The user should analyze the data with multiple values of C to see how robust results are to its choice.

Returns

$data:
dataframe
Data set that was matched by AHB_fast_match(). If holdout is not a numeric value, then $data is the same as the data input into AHB_fast_match(). If holdout is a numeric scalar between 0 and 1, $data is the remaining proportion of data that were matched.
$units_id:
integer vector
A integer vector with unit_id for test treated units
$CATE:
numeric vector
A numeric vector with the conditional average treatment effect estimates for every test treated unit in its matched group in $MGs
$bins:
numeric vector
An array of two lists where the first list contains the lower bounds and the second list contains the upper bounds for each hyper-box. Each row of each list corresponds to the hyper-box for a test treated unit in $units_id.
$MGs:
list
A list of all the matched groups formed by AHB_fast_match(). For each test treated unit, each row contains all unit_id of the other units that fall into its box, including itself.