mindmeld.models.helpers module
This module contains helper functions for the models package.
-
class mindmeld.models.helpers.FileBackedList
Bases: object
FileBackedList implements a simple list interface backed by a temporary file on disk. This allows simple list processing in a memory-efficient way.
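The idea of a list backed by a temporary file can be illustrated with a minimal sketch. This is not MindMeld's implementation: the class name TempFileList, its methods, and the JSON-lines storage format are all assumptions made for illustration.

```python
import json
import os
import tempfile


class TempFileList:
    """Hypothetical append-then-iterate list stored as JSON lines in a temp file."""

    def __init__(self):
        # mkstemp returns an open descriptor; close it and reopen by path as needed.
        fd, self.path = tempfile.mkstemp(suffix=".jsonl")
        os.close(fd)
        self.num_items = 0

    def append(self, item):
        # Each item is serialized onto its own line so it can be read back lazily.
        with open(self.path, "a", encoding="utf-8") as handle:
            handle.write(json.dumps(item) + "\n")
        self.num_items += 1

    def __len__(self):
        return self.num_items

    def __iter__(self):
        # Stream items from disk one at a time instead of holding them in memory.
        with open(self.path, "r", encoding="utf-8") as handle:
            for line in handle:
                yield json.loads(line)

    def close(self):
        # Remove the backing file once the list is no longer needed.
        os.remove(self.path)


items = TempFileList()
for query in ["book a table", "order pizza"]:
    items.append(query)
loaded = list(items)
items.close()
```

Only the item count lives in memory; the items themselves are read from disk during iteration.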
-
class mindmeld.models.helpers.ModelType
Bases: enum.Enum
An enumeration.
-
TAGGER_MODEL = 'tagger'
-
TEXT_MODEL = 'text'
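As a rough illustration of how an enum with these two members behaves, the sketch below defines a stand-in class (ModelKind is a hypothetical name, not the library's class) and looks a member up by its string value using the standard enum module.

```python
from enum import Enum


class ModelKind(Enum):
    """Hypothetical stand-in mirroring the two members listed above."""

    TAGGER_MODEL = "tagger"
    TEXT_MODEL = "text"


# Enum members can be looked up by value, e.g. from a string in a config.
kind = ModelKind("tagger")
is_tagger = kind is ModelKind.TAGGER_MODEL
```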
-
-
mindmeld.models.helpers.create_annotator(config)
Creates an annotator instance using the provided configuration.
Parameters: config (dict) -- A model configuration
Returns: An Annotator class
Return type: Annotator
Raises: ValueError -- When model configuration is invalid or required key is missing
-
mindmeld.models.helpers.create_embedder_model(app_path, config)
Creates and loads an embedder model.
Parameters: config (dict) -- Model settings passed in as a dictionary with 'embedder_type' being a required key
Returns: An instance of appropriate embedder class
Return type: Embedder
Raises: ValueError -- When model configuration is invalid or required key is missing
-
mindmeld.models.helpers.create_model(config)
Creates a model instance using the provided configuration.
Parameters: config (ModelConfig) -- A model configuration
Returns: A configured model
Return type: Model
Raises: ValueError -- When model configuration is invalid
-
mindmeld.models.helpers.entity_seqs_equal(expected, predicted)
Returns True if the expected entities and predicted entities all match, and False otherwise. Entities are compared on span, text, and type, all of which must match.
Parameters:
- expected (list of core.Entity) -- A list of the expected entities for some query
- predicted (list of core.Entity) -- A list of the predicted entities for some query
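The comparison described above can be sketched as follows. SimpleEntity is a hypothetical stand-in for core.Entity (whose real structure is richer), and entity_seqs_match is an illustrative name; treating a length mismatch as "not equal" is also an assumption.

```python
from collections import namedtuple

# Hypothetical stand-in for core.Entity: span (start, end), text, and type.
SimpleEntity = namedtuple("SimpleEntity", ["start", "end", "text", "type"])


def entity_seqs_match(expected, predicted):
    """Return True only if both lists have the same length and each pair agrees."""
    if len(expected) != len(predicted):
        return False
    return all(
        (e.start, e.end, e.text, e.type) == (p.start, p.end, p.text, p.type)
        for e, p in zip(expected, predicted)
    )


gold = [SimpleEntity(8, 12, "pizza", "dish")]
same = entity_seqs_match(gold, [SimpleEntity(8, 12, "pizza", "dish")])
different = entity_seqs_match(gold, [SimpleEntity(8, 12, "pizza", "cuisine")])
```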
-
mindmeld.models.helpers.get_feature_extractor(example_type, name)
Gets a feature extractor given the example type and name.
Parameters:
Returns: A feature extractor wrapper
Return type: function
-
mindmeld.models.helpers.get_label_encoder(config)
Gets a label encoder given the label type from the config.
Parameters: config (ModelConfig) -- A model configuration
Returns: The appropriate LabelEncoder object for the given config
Return type: LabelEncoder
-
mindmeld.models.helpers.get_ngram(tokens, start, length)
Gets an n-gram from a list of tokens. Handles out-of-bounds token positions with a special character.
Parameters:
Returns: (str) An n-gram in the input token list.
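A minimal sketch of extracting an n-gram with out-of-bounds padding is shown below. The placeholder string, the space-joined output, and the function name ngram_at are assumptions; the library's actual special character and formatting may differ.

```python
OUT_OF_BOUNDS = "<$>"  # assumed placeholder; the library's actual marker may differ


def ngram_at(tokens, start, length):
    """Join `length` tokens beginning at `start`, padding positions outside the list."""
    parts = []
    for index in range(start, start + length):
        if 0 <= index < len(tokens):
            parts.append(tokens[index])
        else:
            # Position falls before the first or after the last token.
            parts.append(OUT_OF_BOUNDS)
    return " ".join(parts)


tokens = ["order", "a", "pizza"]
inside = ngram_at(tokens, 0, 2)
overflow = ngram_at(tokens, 2, 2)
```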
-
mindmeld.models.helpers.get_ngrams_upto_n(tokens, n)
Returns a generator that yields n-gram tuples of length up to n.
Parameters:
- tokens (list of str) -- Word tokens.
- n (int) -- The maximum n-gram length to generate
Returns: ngram, (token index start, token index end)
Return type:
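The generator behavior described above might look like the sketch below. The function name ngrams_up_to, the iteration order, and the inclusive end index are assumptions; the source does not state whether the end index is inclusive.

```python
def ngrams_up_to(tokens, n):
    """Yield (ngram_tuple, (start, end)) for every n-gram of length 1 through n."""
    for length in range(1, n + 1):
        for start in range(len(tokens) - length + 1):
            end = start + length - 1  # assumed inclusive end index
            yield tuple(tokens[start:start + length]), (start, end)


pairs = list(ngrams_up_to(["order", "a", "pizza"], 2))
```

With three tokens and n = 2, this yields three unigrams followed by two bigrams.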
-
mindmeld.models.helpers.get_seq_accuracy_scorer()
Returns a scorer that can be used by sklearn's GridSearchCV based on the sequence_accuracy_scoring method below.
-
mindmeld.models.helpers.get_seq_tag_accuracy_scorer()
Returns a scorer that can be used by sklearn's GridSearchCV based on the sequence_tag_accuracy_scoring method below.
-
mindmeld.models.helpers.ingest_dynamic_gazetteer(resource, dynamic_resource=None, text_preparation_pipeline=None)
Ingests dynamic gazetteers from the app and adds them to the resource.
Parameters:
Returns: A new resource with the ingested dynamic resource
Return type: (dict)
-
mindmeld.models.helpers.load_model(path)
Loads a model from a specified path.
Parameters: path (str) -- A path where the model configuration is pickled along with other metadata
Returns: Metadata loaded from the path, which contains the configured model in the 'model' key and the model configs in the 'model_config' key, along with other keys
Return type: dict
Raises: ValueError -- When model configuration is invalid
-
mindmeld.models.helpers.mask_numerics(token)
Masks digit characters in a token.
Parameters: token (str) -- A string
Returns: A masked string for digit characters
Return type: str
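Digit masking can be sketched with a regular expression. The mask character "0" and the function name mask_digits are assumptions; the source does not say which character the library substitutes.

```python
import re


def mask_digits(token, mask="0"):
    """Replace every ASCII digit in the token with the mask character."""
    return re.sub(r"[0-9]", mask, token)


masked = mask_digits("room42b")
unchanged = mask_digits("pizza")
```

Masking collapses tokens such as "room42b" and "room17b" into the same form, which reduces the number of distinct tokens a model sees.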
-
mindmeld.models.helpers.merge_gazetteer_resource(resource, dynamic_resource, text_preparation_pipeline)
Returns a new resource that merges the original resource with the dynamic resource passed in, for the gazetteer values only.
Parameters:
Returns: The merged resource
Return type:
-
mindmeld.models.helpers.register_annotator(annotator_class_name, annotator_class)
Registers an Annotator class for use with create_annotator().
Parameters:
- annotator_class_name (str) -- The annotator class name as specified in the config
- annotator_class (class) -- The annotator class to register
-
mindmeld.models.helpers.register_augmentor(augmentor_name, augmentor_class)
Registers an Augmentor class under the given name.
Parameters:
- augmentor_name (str) -- The augmentor name as specified in the config
- augmentor_class (class) -- The augmentor class to register
-
mindmeld.models.helpers.register_entity_feature(feature_name)
Registers an entity feature.
Parameters: feature_name (str) -- The name of the entity feature
Returns: The feature extractor
Return type: (func)
-
mindmeld.models.helpers.register_feature(feature_type, feature_name)
Decorator for adding feature extractor mappings to FEATURE_MAP.
Parameters:
- feature_type -- 'query' or 'entity'
- feature_name -- The name of the feature, used in config.py
Returns: The feature extractor
Return type: (func)
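The decorator-based registration pattern described above can be sketched as follows. FEATURE_REGISTRY is a hypothetical stand-in for FEATURE_MAP, and the register decorator, the char_length extractor, and its call signature are illustrative assumptions rather than the library's actual code.

```python
# Hypothetical stand-in for FEATURE_MAP, keyed by feature type then feature name.
FEATURE_REGISTRY = {"query": {}, "entity": {}}


def register(feature_type, feature_name):
    """Return a decorator that stores the extractor under (type, name)."""

    def decorator(func):
        if feature_type not in FEATURE_REGISTRY:
            raise ValueError(f"Unknown feature type: {feature_type!r}")
        FEATURE_REGISTRY[feature_type][feature_name] = func
        return func  # the extractor itself is returned unchanged

    return decorator


@register("query", "char-length")
def char_length(query_text):
    return {"char_length": len(query_text)}


# Look the extractor up by type and name, then apply it.
extractor = FEATURE_REGISTRY["query"]["char-length"]
features = extractor("order a pizza")
```

Because the decorator returns the original function, the extractor remains directly callable as well as retrievable by name.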
-
mindmeld.models.helpers.register_label(label_type, label_encoder)
Registers a label encoder for use with get_label_encoder().
Parameters:
- label_type (str) -- The label type of the label encoder
- label_encoder (LabelEncoder) -- The label encoder class to register
Raises: ValueError -- If the label type is already registered
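A registry that rejects duplicate label types, paired with a lookup by config, might be sketched like this. All names here (LABEL_ENCODERS, PassthroughEncoder, register_encoder, lookup_encoder) are hypothetical, and the config is a plain dict for simplicity, whereas the documented function takes a ModelConfig.

```python
LABEL_ENCODERS = {}  # hypothetical registry mapping label type to encoder class


class PassthroughEncoder:
    """Illustrative encoder that returns labels unchanged."""

    def __init__(self, config):
        self.config = config

    def encode(self, labels):
        return list(labels)


def register_encoder(label_type, encoder_class):
    # Refuse to overwrite an existing registration.
    if label_type in LABEL_ENCODERS:
        raise ValueError(f"Label type {label_type!r} is already registered")
    LABEL_ENCODERS[label_type] = encoder_class


def lookup_encoder(config):
    # Assumes the config exposes the label type under a 'label_type' key.
    return LABEL_ENCODERS[config["label_type"]](config)


register_encoder("class", PassthroughEncoder)
try:
    register_encoder("class", PassthroughEncoder)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

encoder = lookup_encoder({"label_type": "class"})
```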
-
mindmeld.models.helpers.register_model(model_type, model_class)
Registers a model for use with create_model().
Parameters:
- model_type (str) -- The model type as specified in model configs
- model_class (class) -- The model to register
-
mindmeld.models.helpers.register_query_feature(feature_name)
Registers a query feature.
Parameters: feature_name (str) -- The name of the query feature
Returns: The feature extractor
Return type: (func)
-
mindmeld.models.helpers.requires(resource)
Decorator to enforce the resource dependencies of the active feature extractors.
Parameters: resource (str) -- The key of a classifier resource which must be initialized before the given feature extractor is used
Returns: The feature extractor
Return type: (func)
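One way a decorator could record and enforce resource dependencies is sketched below. How MindMeld actually stores or checks these dependencies is not described here; the required_resources attribute, the needs decorator, and the run_extractor helper are all hypothetical.

```python
def needs(resource):
    """Return a decorator that records a required resource key on the extractor."""

    def decorator(func):
        # Accumulate keys so the decorator can be stacked.
        required = getattr(func, "required_resources", set())
        func.required_resources = required | {resource}
        return func

    return decorator


@needs("gazetteers")
@needs("word_freqs")
def in_gaz_feature(query_text, resources):
    return {"in_gaz": query_text in resources["gazetteers"]}


def run_extractor(extractor, query_text, resources):
    # Enforce the declared dependencies before calling the extractor.
    missing = extractor.required_resources - set(resources)
    if missing:
        raise ValueError(f"Missing resources: {sorted(missing)}")
    return extractor(query_text, resources)


result = run_extractor(
    in_gaz_feature, "pizza", {"gazetteers": {"pizza"}, "word_freqs": {}}
)
```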
-
mindmeld.models.helpers.sequence_accuracy_scoring(y_true, y_pred)
Accuracy score which calculates two sequences to be equal only if all of their predicted tags are equal.
Parameters:
Returns: The sequence-level accuracy when comparing the predicted labels against the true expected labels
Return type:
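Sequence-level accuracy as described above can be sketched in a few lines. The function name sequence_accuracy, the list-of-tag-lists input format, and returning 0.0 for empty input are assumptions made for illustration.

```python
def sequence_accuracy(y_true, y_pred):
    """Fraction of sequences whose predicted tags all match the expected tags."""
    if not y_true:
        return 0.0  # assumed convention for empty input
    correct = sum(
        1 for true_seq, pred_seq in zip(y_true, y_pred)
        if list(true_seq) == list(pred_seq)
    )
    return correct / len(y_true)


y_true = [["O", "B-dish"], ["O", "O"]]
y_pred = [["O", "B-dish"], ["O", "B-dish"]]
score = sequence_accuracy(y_true, y_pred)
```

In this example the first sequence matches tag for tag and the second has one wrong tag, so only one of the two sequences counts as correct.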