mindmeld.active_learning.classifiers module
This module contains classifiers for the Active Learning Pipeline.
-
class mindmeld.active_learning.classifiers.ALClassifier(app_path: str, tuning_level: list)
Bases: abc.ABC
Abstract class for Active Learning Classifiers.
-
class mindmeld.active_learning.classifiers.MindMeldALClassifier(app_path: str, tuning_level: list, n_classifiers: int, aggregate_statistic: str = None, class_level_statistic: str = None)
Bases: mindmeld.active_learning.classifiers.ALClassifier
Active Learning classifier that uses MindMeld classifiers internally. Handles training MindMeld components (Domain or Intent classifiers) and collecting performance statistics (eval_stats).
-
domain_classifier_fit_eval(sampled_queries: mindmeld.resource_loader.ProcessedQueryList, unsampled_queries: mindmeld.resource_loader.ProcessedQueryList, test_queries: mindmeld.resource_loader.ProcessedQueryList, domain2id: Dict)
Fit and evaluate the domain classifier.
Parameters: - sampled_queries (ProcessedQueryList) -- List of sampled queries.
- unsampled_queries (ProcessedQueryList) -- List of unsampled queries.
- test_queries (ProcessedQueryList) -- List of test queries.
- domain2id (Dict) -- Dictionary mapping domains to IDs.
Returns: - dc_queries_prob_vectors (List[List]): List of probability distributions for unsampled queries.
- dc_eval_test (mindmeld.models.model.StandardModelEvaluation): MindMeld evaluation object for the domain classifier.
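To make the shape of dc_queries_prob_vectors concrete, the following is a minimal sketch of how a mapping such as domain2id could fix the position of each domain inside a probability vector. It does not call MindMeld; the helper to_prob_vector, the domain names, and the probabilities are all hypothetical, and the library may order its vectors differently.

```python
from typing import Dict, List


def to_prob_vector(class_probs: Dict[str, float], class2id: Dict[str, int]) -> List[float]:
    """Order per-class probabilities by the integer ID assigned to each class.

    Hypothetical helper for illustration; classes missing from class_probs get 0.0.
    """
    vector = [0.0] * len(class2id)
    for label, prob in class_probs.items():
        vector[class2id[label]] = prob
    return vector


# Hypothetical domains and IDs (assumption, not taken from a real application).
domain2id = {"greeting": 0, "store_info": 1, "weather": 2}

# Hypothetical classifier output for two unsampled queries.
predictions = [
    {"weather": 0.7, "greeting": 0.1, "store_info": 0.2},
    {"store_info": 0.5, "greeting": 0.5},
]

# One probability vector per unsampled query, indexed by domain ID.
dc_queries_prob_vectors = [to_prob_vector(p, domain2id) for p in predictions]
```

With a fixed ID per domain, every row has the same length and column i always refers to the same domain, which is what lets rows from different queries be compared.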
-
entity_recognizers_fit_eval(sampled_queries: mindmeld.resource_loader.ProcessedQueryList, unsampled_queries: mindmeld.resource_loader.ProcessedQueryList, test_queries: mindmeld.resource_loader.ProcessedQueryList, domain_to_intents: Dict, entity2id: Dict)
Fit and evaluate the entity recognizer.
Parameters: - sampled_queries (ProcessedQueryList) -- List of sampled queries.
- unsampled_queries (ProcessedQueryList) -- List of unsampled queries.
- test_queries (ProcessedQueryList) -- List of test queries.
- domain_to_intents (Dict) -- Dictionary mapping domain to list of intents.
- entity2id (Dict) -- Dictionary mapping entities to IDs.
Returns: - ic_queries_prob_vectors (List[List]): List of probability distributions for unsampled queries.
- ic_eval_test_dict (Dict): Dictionary mapping a domain (str) to the associated ic_eval_test object.
-
intent_classifiers_fit_eval(sampled_queries: mindmeld.resource_loader.ProcessedQueryList, unsampled_queries: mindmeld.resource_loader.ProcessedQueryList, test_queries: mindmeld.resource_loader.ProcessedQueryList, domain_list: Dict, domain_to_intent2id: Dict)
Fit and evaluate the intent classifier.
Parameters: - sampled_queries (ProcessedQueryList) -- List of sampled queries.
- unsampled_queries (ProcessedQueryList) -- List of unsampled queries.
- test_queries (ProcessedQueryList) -- List of test queries.
- domain_list (List[str]) -- List of domains used by the application.
- domain_to_intent2id (Dict) -- Dictionary mapping intents to IDs.
Returns: - ic_queries_prob_vectors (List[List]): List of probability distributions for unsampled queries.
- ic_eval_test_dict (Dict): Dictionary mapping a domain (str) to the associated ic_eval_test object.
-
train(data_bucket: mindmeld.active_learning.data_loading.DataBucket, heuristic: mindmeld.active_learning.heuristics.Heuristic, tuning_type: mindmeld.constants.TuningType = <TuningType.CLASSIFIER: 'classifier'>)
Main training function.
Parameters: - data_bucket (DataBucket) -- DataBucket for current iteration
- heuristic (Heuristic) -- Current Heuristic.
- tuning_type (TuningType) -- Component to be tuned ("classifier" or "tagger")
Returns: - eval_stats (defaultdict): Evaluation metrics to be included in accuracies.json.
- confidences_2d (List[List]): 2D array with probability vectors for unsampled queries (returns a 3D output for tagger tuning).
- confidences_3d (List[List[List]]): 3D array with probability vectors for unsampled queries from multiple classifiers.
- domain_indices (Dict): Maps each domain to a tuple containing the start and end indexes of the intents in that domain.
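The domain_indices value is easier to picture with a small example. The sketch below builds such a mapping from a domain-to-intents dictionary and uses it to slice one row of a probability array. It is an illustration only: build_domain_indices is a hypothetical helper, the domains and intents are made up, and treating the end index as inclusive is an assumption rather than the library's documented convention.

```python
from typing import Dict, List, Tuple


def build_domain_indices(domain_to_intents: Dict[str, List[str]]) -> Dict[str, Tuple[int, int]]:
    """Map each domain to the (start, end) positions of its intents in a flat vector.

    Assumption: the end index is inclusive; the library may use another convention.
    """
    domain_indices = {}
    start = 0
    for domain, intents in domain_to_intents.items():
        end = start + len(intents) - 1
        domain_indices[domain] = (start, end)
        start = end + 1
    return domain_indices


# Hypothetical application layout (assumption).
domain_to_intents = {
    "greeting": ["greet", "exit"],
    "store_info": ["get_store_hours", "find_nearest_store", "help"],
}
domain_indices = build_domain_indices(domain_to_intents)

# One hypothetical row of intent probabilities, laid out domain by domain.
confidence_row = [0.1, 0.05, 0.5, 0.25, 0.1]

# Pull out only the probabilities that belong to the "store_info" intents.
start, end = domain_indices["store_info"]
store_info_probs = confidence_row[start : end + 1]
```

Keeping the intents of all domains in one flat vector and recording the boundaries separately lets a selection strategy work on whole rows while still being able to recover per-domain segments.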
-
train_multi(data_bucket: mindmeld.active_learning.data_loading.DataBucket)
Trains multiple models to get a 3D probability array for multi-model selection strategies.
Parameters: - data_bucket (DataBucket) -- DataBucket for current iteration.
Returns: - confidences_3d (List[List[List]]): 3D array with probability vectors for unsampled queries from multiple classifiers.
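As a rough illustration of what a multi-model strategy can do with confidences_3d, the sketch below averages the probability vectors of several classifiers into one consensus vector per query. The axis order [classifier][query][class], the helper name mean_over_classifiers, and the numbers are assumptions made for this example; the strategies shipped with the library may aggregate differently.

```python
from typing import List


def mean_over_classifiers(confidences_3d: List[List[List[float]]]) -> List[List[float]]:
    """Average class probabilities across classifiers for every query.

    Assumes confidences_3d is indexed as [classifier][query][class].
    """
    n_models = len(confidences_3d)
    n_queries = len(confidences_3d[0])
    n_classes = len(confidences_3d[0][0])
    return [
        [
            sum(confidences_3d[m][q][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        for q in range(n_queries)
    ]


# Two hypothetical classifiers scoring the same two unsampled queries.
confidences_3d = [
    [[0.5, 0.5], [1.0, 0.0]],    # classifier 0
    [[0.25, 0.75], [0.5, 0.5]],  # classifier 1
]

consensus_2d = mean_over_classifiers(confidences_3d)
```

A query on which the individual classifiers stray far from the consensus row is one they disagree about, which is the signal multi-model selection strategies typically look for.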
-
train_single(data_bucket: mindmeld.active_learning.data_loading.DataBucket, eval_stats: collections.defaultdict = None)
Trains a single model to get a 2D probability array for single-model selection strategies.
Parameters: - data_bucket (DataBucket) -- DataBucket for current iteration.
- eval_stats (defaultdict) -- Evaluation metrics to be included in accuracies.json.
Returns: - confidences_2d (List): 2D array with probability vectors for unsampled queries (returns a 3D output for tagger tuning).
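To show how a single-model strategy might consume confidences_2d, here is a minimal least-confidence sketch: each unsampled query is scored by one minus its highest class probability, and queries are ranked from most to least uncertain. The function rank_by_least_confidence and the sample values are hypothetical and stand in for whichever heuristic is actually configured.

```python
from typing import List


def rank_by_least_confidence(confidences_2d: List[List[float]]) -> List[int]:
    """Return query indices ordered from most to least uncertain.

    Uncertainty is 1 minus the top class probability of each row.
    """
    scores = [1.0 - max(row) for row in confidences_2d]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical probability vectors for three unsampled queries.
confidences_2d = [
    [0.9, 0.1],    # confident prediction
    [0.5, 0.5],    # maximally uncertain for two classes
    [0.75, 0.25],  # somewhere in between
]

ranking = rank_by_least_confidence(confidences_2d)
```

The queries at the front of the ranking are the ones a single model is least sure about, so they are natural candidates to label next.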
-