mindmeld.models.text_models module
This module contains all code required to perform multinomial classification of text.
-
class mindmeld.models.text_models.PytorchTextModel(config)
Bases: mindmeld.models.model.PytorchModel
-
evaluate(examples, labels)
Evaluates a model against the given examples and labels.
Parameters: - examples -- A list of examples to predict
- labels -- A list of expected labels
Returns: an object containing information about the evaluation
Return type: ModelEvaluation
-
ALLOWED_CLASSIFIER_TYPES = ['embedder', 'cnn', 'lstm']
-
-
class mindmeld.models.text_models.TextModel(config)
Bases: mindmeld.models.model.Model
-
evaluate(examples, labels)
Evaluates a model against the given examples and labels.
Parameters: - examples -- A list of examples to predict
- labels -- A list of expected labels
Returns: an object containing information about the evaluation
Return type: ModelEvaluation
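The fields of ModelEvaluation are not listed here. As a rough illustration of the kind of summary an evaluation can produce, the sketch below compares predicted and expected labels in plain Python. The helper name `evaluate_predictions` and the dictionary it returns are assumptions made for illustration, not part of the mindmeld API.

```python
from collections import Counter


def evaluate_predictions(predicted, expected):
    """Hypothetical helper: summarize predictions against expected labels."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    correct = sum(p == e for p, e in zip(predicted, expected))
    # Count (expected, predicted) pairs, i.e. a sparse confusion matrix.
    confusion = Counter(zip(expected, predicted))
    accuracy = correct / len(expected) if expected else 0.0
    return {"accuracy": accuracy, "confusion": confusion}


summary = evaluate_predictions(
    ["greet", "exit", "greet"],  # predicted labels
    ["greet", "exit", "exit"],   # expected labels
)
```

Here two of the three predictions match, so `summary["accuracy"]` is 2/3, and the single error is recorded under the key `("exit", "greet")`.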
-
fit(examples, labels, params=None)
Trains this model.
This method inspects instance attributes to determine the classifier object and cross-validation strategy, and then fits the model to the training examples passed in.
Parameters: - examples (ProcessedQueryList.*Iterator) -- A list of examples.
- labels (ProcessedQueryList.*Iterator) -- A parallel list to examples. The gold labels for each example.
- params (dict, optional) -- Parameters to use when training. Parameter selection is bypassed if this is provided.
Returns: self, to match the scikit-learn classifier interface.
Return type: (TextModel)
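The contract described for fit (explicit params bypass parameter selection, and the method returns self) can be sketched with a toy class. `SketchModel`, its attributes, and the placeholder parameter `C` are hypothetical; this is not the mindmeld implementation and it performs no real training.

```python
class SketchModel:
    """Hypothetical stand-in; not the mindmeld TextModel implementation."""

    def __init__(self):
        self.params = None
        self.selection_ran = False

    def select_params(self, examples, labels):
        # Placeholder for cross-validated parameter selection.
        self.selection_ran = True
        return {"C": 1.0}

    def fit(self, examples, labels, params=None):
        # Only search for parameters when the caller did not supply any.
        if params is None:
            params = self.select_params(examples, labels)
        self.params = params
        # A real model would train its classifier here.
        return self  # returning self allows call chaining


model = SketchModel().fit(["hi", "bye"], ["greet", "exit"], params={"C": 10})
```

Because fit returns the instance, construction and training can be chained in a single expression, as scikit-learn estimators allow.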
-
get_feature_matrix(examples, y=None, fit=False, dynamic_resource=None)
Transforms a list of examples into a feature matrix.
Parameters: - examples (list) -- The examples.
Returns: tuple containing:
- (numpy.matrix): The feature matrix.
- (numpy.array): The group labels for examples.
Return type: (tuple)
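As a minimal sketch of what turning examples into a feature matrix involves, the function below maps per-example feature dictionaries to dense rows. It assumes that features arrive as name-to-value dictionaries and that fitting means learning the column vocabulary; both are assumptions, and `to_feature_matrix` is a hypothetical name. It uses plain lists rather than numpy to stay self-contained.

```python
def to_feature_matrix(feature_dicts, vocabulary=None):
    """Hypothetical helper: convert feature dicts to a dense row-major matrix."""
    if vocabulary is None:
        # "Fit" step: assign one column to every feature name seen.
        names = sorted({name for feats in feature_dicts for name in feats})
        vocabulary = {name: i for i, name in enumerate(names)}
    matrix = []
    for feats in feature_dicts:
        row = [0.0] * len(vocabulary)
        for name, value in feats.items():
            if name in vocabulary:  # features unseen during fitting are dropped
                row[vocabulary[name]] = float(value)
        matrix.append(row)
    return matrix, vocabulary


train_matrix, vocab = to_feature_matrix(
    [{"word|hi": 1}, {"word|bye": 1, "length": 3}]
)
# Reuse the learned vocabulary for a new example.
new_matrix, _ = to_feature_matrix([{"word|hi": 1, "word|new": 1}], vocab)
```

Reusing the vocabulary keeps column positions stable between training and prediction, which is why the unseen feature `word|new` is ignored in the second call.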
-
inspect(example, gold_label=None, dynamic_resource=None)
Takes an example and returns a 2D list with one row per feature, giving the feature name, feature value, feature weight, and their product for the predicted label. If a gold label is passed in, the output also includes the feature value and weight for the gold label, and the log probability of the difference.
Returns: A 2D array that includes every feature with its value, weight, and probability
Return type: (list of lists)
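The row layout described above can be sketched for a linear model as follows. The function name `inspect_features`, the dictionary inputs, and the extra gold-label columns are assumptions for illustration. In particular, this sketch reports the plain difference between the two products rather than a log probability.

```python
def inspect_features(values, predicted_weights, gold_weights=None):
    """Hypothetical helper: one row per feature for a linear model."""
    rows = []
    for name in sorted(values):
        value = values[name]
        weight = predicted_weights.get(name, 0.0)
        # name, value, weight for the predicted label, and their product
        row = [name, value, weight, value * weight]
        if gold_weights is not None:
            gold_weight = gold_weights.get(name, 0.0)
            gold_product = value * gold_weight
            # Append gold-label weight, product, and predicted-minus-gold gap.
            row += [gold_weight, gold_product, value * weight - gold_product]
        rows.append(row)
    return rows


rows = inspect_features(
    {"length": 3.0, "has_hello": 1.0},   # feature values for one example
    {"length": 0.5, "has_hello": 2.0},   # weights for the predicted label
    {"length": 1.0},                     # weights for the gold label
)
```

A negative value in the last column marks a feature that favors the gold label over the predicted one, which is the kind of signal this view helps surface when debugging a misclassification.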
-
select_params(examples, labels, selection_settings=None)
Selects the best set of hyper-parameters for a given set of examples and true labels through cross-validation.
Parameters: - examples -- A list of example queries
- labels -- A list of labels associated with the queries
- selection_settings -- A dictionary of parameter lists to select from
Returns: A dictionary of optimized parameters to use
Return type: dict
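Selecting parameters through cross-validation can be sketched as a small grid search: every combination from a dictionary of parameter lists is scored by mean accuracy over k folds, and the best one is returned. All names below (`k_fold_indices`, `select_params_sketch`, `train_length_rule`) are hypothetical, the folds are contiguous and unstratified, and k is assumed not to exceed the number of examples. How mindmeld performs the search internally is not stated here.

```python
from itertools import product


def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def select_params_sketch(examples, labels, grid, train_fn, k=3):
    """Return the parameter combination with the best mean fold accuracy."""
    best_params, best_score = None, -1.0
    keys = sorted(grid)
    for combo in product(*(grid[key] for key in keys)):
        params = dict(zip(keys, combo))
        scores = []
        for train, test in k_fold_indices(len(examples), k):
            predict = train_fn(
                [examples[i] for i in train], [labels[i] for i in train], **params
            )
            hits = sum(predict(examples[i]) == labels[i] for i in test)
            scores.append(hits / len(test))
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params


def train_length_rule(train_x, train_y, threshold):
    # Toy classifier: ignores the training data and labels text by length.
    return lambda text: "long" if len(text) > threshold else "short"


examples = ["a", "bb", "ccc", "dddddd", "eeeeeee", "ffffffff"]
labels = ["short", "short", "short", "long", "long", "long"]
best = select_params_sketch(
    examples, labels, {"threshold": [1, 4, 7]}, train_length_rule
)
```

With this toy data only a threshold of 4 classifies every held-out example correctly, so `best` is `{"threshold": 4}`.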
-
ACCURACY_SCORING = 'accuracy'
-
ALLOWED_CLASSIFIER_TYPES = ['logreg', 'dtree', 'rforest', 'svm']
-
DECISION_TREE_TYPE = 'dtree'
-
LOG_REG_TYPE = 'logreg'
-
RANDOM_FOREST_TYPE = 'rforest'
-
SVM_TYPE = 'svm'
-
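One plausible use of these constants is validating a configured classifier type before training. The sketch below copies the values listed above into a standalone script. The function `check_classifier_type` is hypothetical, and whether mindmeld validates configurations this way is an assumption.

```python
# Values copied from the TextModel class attributes listed above.
LOG_REG_TYPE = "logreg"
DECISION_TREE_TYPE = "dtree"
RANDOM_FOREST_TYPE = "rforest"
SVM_TYPE = "svm"
ALLOWED_CLASSIFIER_TYPES = [
    LOG_REG_TYPE,
    DECISION_TREE_TYPE,
    RANDOM_FOREST_TYPE,
    SVM_TYPE,
]


def check_classifier_type(classifier_type):
    """Hypothetical helper: reject classifier types that are not allowed."""
    if classifier_type not in ALLOWED_CLASSIFIER_TYPES:
        allowed = ", ".join(ALLOWED_CLASSIFIER_TYPES)
        raise ValueError(
            f"Unknown classifier type {classifier_type!r}; expected one of: {allowed}"
        )
    return classifier_type


chosen = check_classifier_type(SVM_TYPE)
```

Passing a value such as `'cnn'`, which appears only in the PytorchTextModel list, raises a ValueError in this sketch.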