mindmeld.system_entity_recognizer module

exception mindmeld.system_entity_recognizer.SystemEntityError[source]

Bases: Exception

class mindmeld.system_entity_recognizer.DucklingDimension[source]

Bases: enum.Enum

An enumeration.

AMOUNT_OF_MONEY = 'amount-of-money'
CREDIT_CARD_NUMBER = 'credit-card-number'
DISTANCE = 'distance'
DURATION = 'duration'
EMAIL = 'email'
NUMERAL = 'numeral'
ORDINAL = 'ordinal'
PHONE_NUMBER = 'phone-number'
QUANTITY = 'quantity'
TEMPERATURE = 'temperature'
TIME = 'time'
URL = 'url'
VOLUME = 'volume'
class mindmeld.system_entity_recognizer.DucklingRecognizer(url='http://localhost:7151/parse')[source]

Bases: mindmeld.system_entity_recognizer.SystemEntityRecognizer

get_candidates(query, entity_types=None, locale=None, language=None, time_zone=None, timestamp=None)[source]

Identifies candidate system entities in the given query.

Parameters:
  • query (Query) -- The query to examine
  • entity_types (list of str) -- The entity types to consider
  • locale (str, optional) -- The locale, given as an ISO 639-1 language code and an ISO 3166-1 alpha-2 country code separated by an underscore character (for example, en_US).
  • language (str, optional) -- Language specified as an ISO 639-1/2 code.
  • time_zone (str, optional) -- An IANA time zone id such as 'America/Los_Angeles'. If not specified, the system time zone is used.
  • timestamp (long, optional) -- A unix timestamp used as the reference time. If not specified, the current system time is used. If time_zone is not also specified, this parameter is ignored.
Returns:

The system entities found in the query

Return type:

list of QueryEntity

get_candidates_for_text(text, entity_types=None, locale=None, language=None, time_zone=None, timestamp=None)[source]

Identifies candidate system entities in the given text.

Parameters:
  • text (str) -- The text to examine
  • entity_types (list of str) -- The entity types to consider
  • language (str) -- Language code
  • locale (str) -- Locale code
  • time_zone (str, optional) -- An IANA time zone id such as 'America/Los_Angeles'. If not specified, the system time zone is used.
  • timestamp (long, optional) -- A unix timestamp used as the reference time. If not specified, the current system time is used. If time_zone is not also specified, this parameter is ignored.
Returns:

The system entities found in the text

Return type:

list of dict

static get_instance(url=None)[source]

Static access method that returns an instance for the given Duckling URL. If no URL is passed, DEFAULT_DUCKLING_URL is used.
Parameters:url -- Duckling URL.
Returns:A DucklingRecognizer instance
Return type:(DucklingRecognizer)
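
The access pattern described above can be sketched in a few lines. This is an illustration only, not MindMeld's implementation: the class name RecognizerStub and the per-URL _instances cache are hypothetical, and the value of DEFAULT_DUCKLING_URL is assumed from the constructor default shown above.

```python
DEFAULT_DUCKLING_URL = "http://localhost:7151/parse"  # assumed from the constructor default


class RecognizerStub:
    """Hypothetical stand-in for a recognizer bound to one Duckling URL."""

    _instances = {}  # assumed cache: one instance per URL

    def __init__(self, url=DEFAULT_DUCKLING_URL):
        self.url = url

    @staticmethod
    def get_instance(url=None):
        # Fall back to the default URL when none is passed.
        url = url or DEFAULT_DUCKLING_URL
        if url not in RecognizerStub._instances:
            RecognizerStub._instances[url] = RecognizerStub(url)
        return RecognizerStub._instances[url]


default = RecognizerStub.get_instance()
same = RecognizerStub.get_instance("http://localhost:7151/parse")
other = RecognizerStub.get_instance("http://duckling.example:8000/parse")
```

Here default and same refer to one object, while other is a separate instance bound to a different URL.
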
get_response(data)[source]

Sends a POST request to Duckling. data is a dictionary with a text field. Returns a tuple consisting of the JSON response and the response code.

Parameters:data (dict) -- The request payload, a dictionary with a text field
Returns:(dict, int)
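
A request of this shape can be sketched with the standard library alone. The helper name post_to_duckling, the form-encoded body, the two-second timeout, and the injectable opener are assumptions made for illustration; the canned response is hand-written so the sketch runs without a Duckling server.

```python
import json
import urllib.parse
import urllib.request


def post_to_duckling(data, url="http://localhost:7151/parse",
                     opener=urllib.request.urlopen):
    """POST form-encoded `data` and return (parsed JSON, HTTP status code)."""
    body = urllib.parse.urlencode(data).encode("utf-8")
    request = urllib.request.Request(url, data=body, method="POST")
    with opener(request, timeout=2.0) as response:
        payload = json.loads(response.read().decode("utf-8"))
        return payload, response.status


class FakeResponse:
    """Canned response standing in for a live service (illustrative data)."""

    status = 200

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        return False

    def read(self):
        return b'[{"dim": "temperature", "body": "80F", "start": 0, "end": 3}]'


def fake_opener(request, timeout=None):
    # Ignores the request and returns the canned response.
    return FakeResponse()


result, code = post_to_duckling({"text": "80F"}, opener=fake_opener)
```

Dropping the opener argument sends the request to the URL for real, which requires a running service.
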
parse(sentence, dimensions=None, language=None, locale=None, time_zone=None, timestamp=None)[source]

Calls the System Entity Recognizer service API to extract numerical entities from a sentence.

Parameters:
  • sentence (str) -- A raw sentence.
  • dimensions (None or list of str) -- The list of types (e.g. volume, temperature) to restrict the output to. If None, include all types.
  • language (str, optional) -- Language of the sentence, specified as an ISO 639-1/2 code. If both locale and language are provided, the locale is used. If neither is provided, the EN language code is used.
  • locale (str, optional) -- The locale, given as an ISO 639-1 language code and an ISO 3166-1 alpha-2 country code separated by an underscore character (for example, en_US).
  • time_zone (str, optional) -- An IANA time zone id such as 'America/Los_Angeles'. If not specified, the system time zone is used.
  • timestamp (long, optional) -- A Unix millisecond timestamp used as the reference time. If not specified, the current system time is used. If time_zone is not also specified, this parameter is ignored.
Returns:

A tuple containing:
  • response (list, dict): Response from the System Entity Recognizer service: either a list of dicts, each corresponding to a single prediction, or a single dict corresponding to one prediction.
  • response_code (int): HTTP status code.

Return type:

(tuple)
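
The parameter rules above can be made concrete with a sketch that assembles the request fields. The helper build_parse_payload is hypothetical, and the field names lang, locale, tz, dims, and reftime are assumed from Duckling's example HTTP server rather than taken from this module. Dropping timestamp when time_zone is absent follows the rule stated for get_candidates and is assumed to apply here as well.

```python
import json

DEFAULT_LANGUAGE = "EN"  # documented fallback when neither locale nor language is given


def build_parse_payload(sentence, dimensions=None, language=None, locale=None,
                        time_zone=None, timestamp=None):
    """Hypothetical helper that assembles the fields for one parse request."""
    payload = {"text": sentence}
    if locale:
        # The locale takes precedence when both locale and language are given.
        payload["locale"] = locale
    else:
        payload["lang"] = language or DEFAULT_LANGUAGE
    if dimensions is not None:
        # None means "all types", so the field is simply omitted.
        payload["dims"] = json.dumps(dimensions)
    if time_zone:
        payload["tz"] = time_zone
        if timestamp is not None:
            # The reference time is only sent together with a time zone.
            payload["reftime"] = timestamp
    return payload


payload = build_parse_payload(
    "wake me at 7",
    dimensions=["time"],
    language="en",
    locale="en_US",
    time_zone="America/Los_Angeles",
    timestamp=1546300800000,
)
no_tz = build_parse_payload("wake me at 7", timestamp=1546300800000)
```

In the second call the timestamp is dropped because no time zone accompanies it, and the language falls back to EN.
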

resolve_system_entity(query, entity_type, span)[source]

Resolves a system entity in the provided query at the specified span.

Parameters:
  • query (Query) -- The query containing the entity
  • entity_type (str) -- The type of the entity
  • span (Span) -- The character span of the entity in the query
Returns:

The resolved entity

Return type:

Entity

Raises:

SystemEntityResolutionError

class mindmeld.system_entity_recognizer.NoOpSystemEntityRecognizer[source]

Bases: mindmeld.system_entity_recognizer.SystemEntityRecognizer

This is a no-op recognizer that returns an empty list and a 200 status code.

get_candidates(query, entity_types=None, **kwargs)[source]

Identifies candidate system entities in the given query.

Parameters:
  • query (Query) -- The query to examine
  • entity_types (list of str) -- The entity types to consider
Returns:

The system entities found in the query

Return type:

list of QueryEntity

get_candidates_for_text(text, entity_types=None, **kwargs)[source]

Identifies candidate system entities in the given text.

Parameters:
  • text (str) -- The text to examine
  • entity_types (list of str) -- The entity types to consider
Returns:

The system entities found in the text

Return type:

list of dict

static get_instance()[source]

Static access method. If no instance has been created yet, a NoOpSystemEntityRecognizer is instantiated.

Returns:A SystemEntityRecognizer instance
Return type:(SystemEntityRecognizer)
parse(sentence, **kwargs)[source]

Calls the System Entity Recognizer service API to extract numerical entities from a sentence.

Parameters:sentence (str) -- A raw sentence.
Returns:
A tuple containing:
  • response (list, dict): Response from the System Entity Recognizer service: either a list of dicts, each corresponding to a single prediction, or a single dict corresponding to one prediction.
  • response_code (int): HTTP status code.
Return type:(tuple)
resolve_system_entity(query, entity_type, span)[source]

Resolves a system entity in the provided query at the specified span.

Parameters:
  • query (Query) -- The query containing the entity
  • entity_type (str) -- The type of the entity
  • span (Span) -- The character span of the entity in the query
Returns:

The resolved entity

Return type:

Entity

Raises:

SystemEntityResolutionError

class mindmeld.system_entity_recognizer.SystemEntityRecognizer[source]

Bases: abc.ABC

SystemEntityRecognizer is the external parsing service used to extract system entities. It is intended to be used as a singleton, so it's initialized only once during NLP object construction.

get_candidates(query, entity_types=None, **kwargs)[source]

Identifies candidate system entities in the given query.

Parameters:
  • query (Query) -- The query to examine
  • entity_types (list of str) -- The entity types to consider
Returns:

The system entities found in the query

Return type:

list of QueryEntity

get_candidates_for_text(text, entity_types=None, **kwargs)[source]

Identifies candidate system entities in the given text.

Parameters:
  • text (str) -- The text to examine
  • entity_types (list of str) -- The entity types to consider
Returns:

The system entities found in the text

Return type:

list of dict

static get_instance()[source]

Static access method. If no instance has been created yet, a NoOpSystemEntityRecognizer is instantiated.

Returns:A SystemEntityRecognizer instance
Return type:(SystemEntityRecognizer)
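
The singleton access with a no-op fallback might look like the sketch below. It is not MindMeld's code: RecognizerBase, NoOpRecognizer, set_instance, and the _instance attribute are hypothetical names chosen for illustration.

```python
from abc import ABC, abstractmethod


class RecognizerBase(ABC):
    """Hypothetical stand-in for the abstract recognizer."""

    _instance = None  # assumed: a single shared instance

    @staticmethod
    def get_instance():
        # Fall back to the no-op recognizer when nothing has been set.
        if RecognizerBase._instance is None:
            RecognizerBase._instance = NoOpRecognizer()
        return RecognizerBase._instance

    @staticmethod
    def set_instance(recognizer):
        RecognizerBase._instance = recognizer
        return recognizer

    @abstractmethod
    def parse(self, sentence, **kwargs):
        """Return a (response, response_code) tuple."""


class NoOpRecognizer(RecognizerBase):
    def parse(self, sentence, **kwargs):
        # No service is contacted: empty result with a success code.
        return [], 200


recognizer = RecognizerBase.get_instance()
response, code = recognizer.parse("set a timer for 5 minutes")
```

Callers always receive an object with the same interface, so code that consumes parse results does not need a special case for applications that run without an external service.
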
static load_from_app_path(app_path)[source]

If the application configuration is empty, Duckling is not used. Otherwise, the Duckling recognizer is returned with the URL defined in the application's config, defaulting to DEFAULT_DUCKLING_URL.
Parameters:app_path (str) -- Application path
Returns:(SystemEntityRecognizer)
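
The selection rule can be sketched as a small function over a configuration dictionary. The helper select_recognizer, the url key, the type key in the usage lines, and the returned tuples are assumptions for illustration. The value of DEFAULT_DUCKLING_URL is assumed from the DucklingRecognizer constructor default.

```python
DEFAULT_DUCKLING_URL = "http://localhost:7151/parse"  # assumed from the constructor default


def select_recognizer(config):
    """Hypothetical helper: map a config dict to (recognizer kind, url)."""
    if not config:
        # Empty configuration: Duckling is not used.
        return ("noop", None)
    # Otherwise use the configured URL, or the default when none is given.
    return ("duckling", config.get("url", DEFAULT_DUCKLING_URL))


empty = select_recognizer({})
defaulted = select_recognizer({"type": "duckling"})
custom = select_recognizer({"type": "duckling", "url": "http://duckling.example:8000/parse"})
```
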
parse(sentence, **kwargs)[source]

Calls the System Entity Recognizer service API to extract numerical entities from a sentence.

Parameters:sentence (str) -- A raw sentence.
Returns:
A tuple containing:
  • response (list, dict): Response from the System Entity Recognizer service: either a list of dicts, each corresponding to a single prediction, or a single dict corresponding to one prediction.
  • response_code (int): HTTP status code.
Return type:(tuple)
resolve_system_entity(query, entity_type, span)[source]

Resolves a system entity in the provided query at the specified span.

Parameters:
  • query (Query) -- The query containing the entity
  • entity_type (str) -- The type of the entity
  • span (Span) -- The character span of the entity in the query
Returns:

The resolved entity

Return type:

Entity

Raises:

SystemEntityResolutionError

static set_system_entity_recognizer(system_entity_recognizer=None, app_path=None)[source]

Sets the global system entity recognizer to the one configured from the application's path.

Parameters:
  • system_entity_recognizer -- A system entity recognizer
  • app_path (str) -- The application path
Returns:

(SystemEntityRecognizer)

mindmeld.system_entity_recognizer.dimensions_from_entity_types(entity_types)[source]
Parameters:entity_types (list) --
Returns:(list)
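
The function has no description here, so the following is only one plausible reading: entity types that carry a sys_ prefix are mapped to the matching DucklingDimension value. The prefix convention, the helper name dimensions_for, and the choice to skip unknown types are all assumptions, and the enum below repeats only a subset of the values listed above.

```python
from enum import Enum


class Dimension(Enum):
    """Subset of the DucklingDimension values, repeated for self-containment."""

    AMOUNT_OF_MONEY = "amount-of-money"
    DURATION = "duration"
    TEMPERATURE = "temperature"
    TIME = "time"


def dimensions_for(entity_types, prefix="sys_"):
    """Hypothetical mapping from prefixed entity types to dimension values."""
    known = {dimension.value for dimension in Dimension}
    dims = []
    for entity_type in entity_types or []:
        # Strip the assumed prefix; types without it are not system entities.
        name = entity_type[len(prefix):] if entity_type.startswith(prefix) else None
        if name in known and name not in dims:
            dims.append(name)
    return dims


dims = dimensions_for(["sys_time", "sys_amount-of-money", "city", "sys_time"])
```
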
mindmeld.system_entity_recognizer.duckling_item_to_entity(item)[source]

Converts an item from the output of Duckling into an Entity.

Parameters:item (dict) -- The Duckling item
Returns:The entity described by the Duckling item
Return type:Entity
mindmeld.system_entity_recognizer.duckling_item_to_query_entity(query, item, offset=0)[source]

Converts an item from the output of Duckling into a QueryEntity.

Parameters:
  • query (Query) -- The query to construct the QueryEntity from
  • item (dict) -- The Duckling item
  • offset (int, optional) -- The offset into the query at which the item's indexing begins
Returns:

The query entity described by the Duckling item, or None if no item is present

Return type:

QueryEntity
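
The role of offset is easiest to see with a worked example. The sketch below uses a hypothetical SimpleQueryEntity dataclass in place of QueryEntity and a hand-written item. The item keys body, start, end, dim, and value and the exclusive end index are assumed from Duckling's usual JSON output, and the sys_ prefix on the entity type is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class SimpleQueryEntity:
    """Hypothetical stand-in for QueryEntity."""

    text: str
    entity_type: str
    start: int
    end: int  # exclusive, mirroring the item's own indices
    value: dict


def item_to_query_entity(query_text, item, offset=0):
    """Shift the item's indices by `offset` so they refer to the full query."""
    if not item:
        return None
    start = item["start"] + offset
    end = item["end"] + offset
    return SimpleQueryEntity(
        text=query_text[start:end],
        entity_type="sys_" + item["dim"],
        start=start,
        end=end,
        value=item.get("value", {}),
    )


query_text = "remind me in 5 minutes"
# Illustrative item, as if only the substring "5 minutes" had been parsed.
item = {
    "body": "5 minutes",
    "start": 0,
    "end": 9,
    "dim": "duration",
    "value": {"value": 5, "unit": "minute"},
}
entity = item_to_query_entity(query_text, item, offset=13)
```

Because the parsed substring begins at character 13 of the query, the item's indices 0 to 9 become 13 to 22 in the resulting entity.
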