cat2cat.cat2cat_ml

Functions

cat2cat_ml_run(→ cat2cat_ml_run_results)

Automatic mapping in a panel dataset - cat2cat procedure

Module Contents

cat2cat.cat2cat_ml.cat2cat_ml_run(mappings: cat2cat.dataclass.cat2cat_mappings, ml: cat2cat.dataclass.cat2cat_ml, **kwargs: Any) → cat2cat_ml_run_results

Automatic mapping in a panel dataset - cat2cat procedure

Parameters:
  • mappings (cat2cat_mappings) – dataclass with mapping-related arguments. See cat2cat.dataclass.cat2cat_mappings for more information.

  • ml (cat2cat_ml) – dataclass with ML-related arguments. See cat2cat.dataclass.cat2cat_ml for more information.

  • **kwargs – additional keyword arguments accepted by cat2cat_ml_run:

      • min_match (float) – minimum share of categories from the base period that have to be matched in the mapping table. Between 0 and 1. Default 0.8.
      • test_prop (float) – share of the data used for testing. Between 0 and 1. Default 0.2.
      • split_seed (int) – random seed for the train_test_split function. Default 42.
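
To make the keyword arguments above more concrete, here is a minimal sketch in plain Python of what each one describes. It is not the library's implementation: the helpers `resolve_kwargs`, `matched_share` and `split_rows` are hypothetical names introduced for illustration, and the seeded split uses the standard library's `random` module as a stand-in for the train_test_split function mentioned above.

```python
import random

# Defaults as documented for the optional keyword arguments.
DEFAULTS = {"min_match": 0.8, "test_prop": 0.2, "split_seed": 42}


def resolve_kwargs(**kwargs):
    """Hypothetical helper: merge user kwargs with the documented defaults."""
    opts = {**DEFAULTS, **kwargs}
    # Both shares are documented as lying between 0 and 1.
    for name in ("min_match", "test_prop"):
        if not 0 <= opts[name] <= 1:
            raise ValueError(f"{name} must be between 0 and 1")
    return opts


def matched_share(base_categories, mapped_categories):
    """Share of distinct base-period categories found in the mapping table."""
    base = set(base_categories)
    return len(base & set(mapped_categories)) / len(base)


def split_rows(rows, test_prop, split_seed):
    """Seeded shuffle, then hold out a test_prop share of the rows (stand-in split)."""
    shuffled = list(rows)
    random.Random(split_seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_prop)
    return shuffled[n_test:], shuffled[:n_test]


# Override one argument; the others keep their documented defaults.
opts = resolve_kwargs(test_prop=0.3)

# Three of the four base-period categories appear in the mapping table.
share = matched_share(["a", "b", "c", "d"], ["a", "b", "c", "x"])

# The same seed always yields the same train/test partition.
train, test = split_rows(range(10), opts["test_prop"], opts["split_seed"])
print(opts, share, len(train), len(test))
```

In this toy case the matched share falls below the default min_match of 0.8; how cat2cat_ml_run reacts to that situation is not specified here, so check the library source before relying on any particular behaviour.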

Returns:

cat2cat_ml_run_results

Note

See cat2cat.cat2cat.cat2cat for more information.

>>> from cat2cat import cat2cat
>>> from cat2cat.cat2cat_ml import cat2cat_ml_run
>>> from cat2cat.dataclass import cat2cat_data, cat2cat_mappings, cat2cat_ml
>>> from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
>>> from sklearn.tree import DecisionTreeClassifier
>>> from cat2cat.datasets import load_trans, load_occup
>>> trans = load_trans()
>>> occup = load_occup()
>>> o_old = occup.loc[occup.year == 2008, :].copy()
>>> o_new = occup.loc[occup.year == 2010, :].copy()
>>> mappings = cat2cat_mappings(trans = trans, direction = "backward")
>>> ml = cat2cat_ml(
...    occup.loc[occup.year >= 2010, :].copy(),
...    "code",
...    ["salary", "age", "edu", "sex"],
...    [DecisionTreeClassifier(random_state=1234), LinearDiscriminantAnalysis()]
... )
>>> cat2cat_ml_run(mappings = mappings, ml = ml)
...