dame_flame.matching.FLAME

The FLAME algorithm class

class dame_flame.matching.FLAME(adaptive_weights='ridge', alpha=0.1, 
         repeats=True, verbose=2, early_stop_iterations=float('inf'), 
         stop_unmatched_c=False, early_stop_un_c_frac=False, 
         stop_unmatched_t=False, early_stop_un_t_frac=False, 
         early_stop_pe=0.05, 
         missing_indicator=np.nan, missing_data_replace=0, 
         missing_holdout_replace=0, missing_holdout_imputations=10, 
         missing_data_imputations=1, want_pe=False, want_bf=False)    

Source Code

This class creates the matches based on the FLAME “Fast Large-Scale Almost Matching Exactly” algorithm. It has built in support for stopping criteria and missing data handling.

Parameters

Parameter Name	Type	Default	Description
adaptive_weights	{bool, ‘ridge’, ‘decisiontree’, ‘ridgeCV’, ‘decisiontreeCV’}	‘ridge’	The method used to decide what covariate set should be dropped next.
alpha	float	0.1	If adaptive_weights is set to ridge, this is the alpha for ridge regression.
repeats	bool	True	Whether or not units for whom a main matched has been found can be used again, and placed in an auxiliary matched group.
verbose	int: {0,1,2,3}	2	Style of printout while algorithm runs. If 0, no output. If 1, provides iteration number. If 2, provides iteration number and additional information on the progress of the matching at every 10th iteration. If 3, provides iteration number and additional information on the progress of the matching at every iteration
early_stop_iterations	float,int	float(‘inf’)	A number of iterations after which to hard stop the algorithm. The default is infinite; i.e. no early stopping is done. Iterations start at 0 so setting early_stop_iterations to 0, for example, implies that only exact matches should be made.
stop_unmatched_c	bool	False	If True, then the algorithm terminates when there are no more control units to match.
stop_unmatched_t	bool	False	If True, then the algorithm terminates when there are no more treatment units to match.
early_stop_un_c_frac	float	0.1	Must be between 0.0 and 1.0. This provides a fraction of unmatched control units. When the threshold is met, the algorithm will stop iterating. For example, using an input dataset with 100 control units, the algorithm will stop when 10 control units are unmatched and 90 are matched (or earlier, depending on other stopping conditions).
early_stop_un_t_frac	float	0.1	Must be between 0.0 and 1.0. This provides a fraction of unmatched treatment units. When the threshold is met, the algorithm will stop iterating. For example, using an input dataset with 100 treatment units, the algorithm will stop when 10 control units are unmatched and 90 are matched (or earlier, depending on other stopping conditions).
early_stop_pe	float	0.05	If FLAME attempts to drop a covariate set that would raise the PE above (1 + early_stop_pe) times the baseline PE (the PE before any covariates have been dropped), DAME terminates before dropping this covariate set.
want_pe	bool	False	If true, the output of the algorithm will include the predictive error of the covariate sets used for matching in each iteration.
want_bf	bool	False	If true, the output will include the balancing factor for each iteration.
missing_indicator	{character, integer, numpy.nan}	numpy.nan	This is the indicator for missing data in the dataset.
missing_holdout_replace	int: {0,1,2}	0	If 0, assume no missing holdout data and proceed. If 1, the algorithm excludes units with missing values from the holdout dataset. If 2, do MICE on holdout dataset. If this option is selected, it will be done for a number of iterations equal to missing_holdout_imputations.
missing_data_replace	int: {0,1,2,3}	0	If 0, assume no missing data in matching data and proceed. If 1, the algorithm does not match on units that have missing values. If 2, prevent all missing_indicator values from being matched on. If 3, do MICE on matching dataset. This is not recommended. If this option is selected, it will be done for a number of iterations equal to missing_data_imputations.
missing_holdout_imputations	int	10	If missing_holdout_replace=2, the number of imputations.
missing_data_imputations	int	1	If missing_data_replace=3, the number of imputations.

Attributes

Attribute Name	Type	Description
units_per_group	Array	This is an array of arrays. Each sub-array is a matched group, and each item in each sub-array is an int, indicating the unit in that matched group. If matching is done with `repeats=False` then no unit will appear more than once. If `repeats=True` then the first group in which a unit appears is its main matched group.
df_units_and_covars_matched	dataframe	This is the resulting matches of FLAME. Each matched unit is in this array, and the covariates they were matched on have the value used to match. The covariates units were not matched on are indicated with a `*`
groups_per_unit	Array	The length of this is equal to the number of units in the input array. Each item in this array corresponds to the number of times that each item was matched. If matching is done with repeats=False, then this number will be either 0 or 1.
bf_each_iter	Array	if `want_bf` parameter is True, this will contain the balancing factor of the chosen covariate set at each iteration
pe_each_iter	Array	if `want_pe` parameter is True, this will contain the predictive error of the chosen covariate set at each iteration

Quick Example

import pandas as pd
import dame_flame
df = pd.read_csv("dame_flame/data/sample.csv")
model = dame_flame.matching.FLAME()
model.fit(df)
result = model.predict(df)
print(result)
#>    x1   x2   x3   x4
#> 0   *   1    1    1     
#> 2   *   *    1    1     
#> 3   *   0    1    1     

Methods

`fit(self, holdout_data, treatment_col....)`	Provide self with holdout training data
`predict(self, input_data...)`	Perform the match on the input data

__init__(adaptive_weights='ridge', alpha=0.1, repeats=True, verbose=2, early_stop_iterations=float('inf'), 
stop_unmatched_c=False, early_stop_un_c_frac=False, stop_unmatched_t=False, early_stop_un_t_frac=False,
early_stop_pe=0.05, 
missing_indicator=np.nan, missing_data_replace=0, missing_holdout_replace=0, 
missing_holdout_imputations=10, missing_data_imputations=1, want_pe=False, want_bf=False)

Source Code

Initialize self

fit(self, holdout_data=False, treatment_column_name='treated', outcome_column_name='outcome'
weight_array=False))

Source Code

Provide self with holdout data

`fit` Parameter Name	Type	Default	Description
holdout_data	{string, dataframe, float, False }	False	This is the holdout training dataset. If a string is given, that should be the location of a CSV file to input. If a float between 0.0 and 1.0 is given, that corresponds the percent of the input dataset to randomly select for holdout data. If False, the holdout data is equal to the entire input data.
treatment_column_name	string	“treated”	This is the name of the column with a binary indicator for whether a row is a treatment or control unit.
outcome_column_name	string	“outcome”	This is the name of the column with the outcome variable of each unit.
adaptive_weights	{bool, “ridge”, “decisiontree”, “ridgeCV”, “decisiontreeCV”}	“ridge”	The method used to decide what covariate set should be dropped next.
weight_array	array	optional	If adaptive_weights = False, these are the weights to the covariates in input_data, for the non-adaptive version of FLAME. Must sum to 1. In this case, we do not use machine learning for the weights, they are manually entered as weight_array.

predict(self, input_data)

Source Code

Perform match and return matches

`predict` Parameter Name	Type	Default	Description
input_data	{string, dataframe}	Required Parameter	The dataframe on which to perform the matching, or the location of the CSV with the dataframe
C	float	0.1	The tradeoff parameter between the balancing factor and the predictive error when deciding which covariates to match on
pre_dame	{float, int}	float(‘inf’)	FLAME will run for this many iterations prior to switching to DAME.

`predict` Return	Description
Result	Pandas dataframe of matched units and covariates matched on, with a “*” at each covariate that a unit did not use in matching

dame_flame.matching.FLAME

Source Code

Parameters

Attributes

Quick Example

Methods

Source Code

Source Code

Source Code

Further Readings