Module a2t.tasks
The module tasks contains the code related to the Task definition.
The tasks in this module are organized by the number of spans to classify:
- ZeroaryTask: tasks like Text Classification, where the aim is to classify the given text into a set of predefined labels.
- UnaryTask: tasks like Named Entity Classification, where the object to classify is a span within a text.
- BinaryTask: tasks like Relation Classification, where what is actually classified is the relation between two spans in a text.
There are also more specific predefined task classes, such as TopicClassificationTask, that include helpful code and default values for the given task. You can either create a task-specific class or instantiate one of the predefined ones.
In addition to the Task class, a Features class must be defined. The Features class defines which type of information is used during classification. For example, a UnaryTask needs a context and some variable X for the span. This class is also used to instantiate the task data instances.
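As a rough illustration of how a features class feeds a task's templates, the sketch below defines a stand-in dataclass with a context and a span variable X, then fills one template. SpanFeatures and the template text are hypothetical and only mimic the shape of the library's UnaryFeatures; nothing here imports a2t.

```python
from dataclasses import dataclass


@dataclass
class SpanFeatures:
    """Hypothetical stand-in for a unary features class (not the a2t class)."""

    context: str
    X: str
    label: str = None
    inst_type: str = None


# A template refers to the span through the {X} placeholder.
template = "{X} is a person."
feature = SpanFeatures(context="Ada Lovelace wrote the first program.", X="Ada Lovelace")

# str.format ignores keyword arguments the template does not use
# (context, label and inst_type here).
hypothesis = template.format(**feature.__dict__)
print(hypothesis)
```

Because every field of the instance is passed to the template, a template may reference any variable that the features class defines.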
Expand source code
"""The module `tasks` contains the code related to the `Task` definition.
![Task taxonomy.](https://raw.githubusercontent.com/osainz59/Ask2Transformers/master/imgs/task_taxonomy.svg)
The tasks on this module are organized based on the number of spans to classify:
* `ZeroaryTask`: are tasks like Text Classification, where the aim is to classify the given text into a set of predefined
labels.
* `UnaryTask`: are tasks like Named Entity Classification, where the object to classify is an span within a text.
* `BinaryTask`: are tasks like Relation Classification, where what is actually classified is the relation between two spans in a text.
There are also more specific predefined task classes like `TopicClassificationTask` that includes helpful code and default values for
the given task. You can either create create a task specific class or instantiate one of the predefined ones.
In addition to the `Task` class a `Features` class must be defined. The `Features` class will define which type of information is
used during the classification. For example, for a `UnaryTask` a `context` and some variable `X` for the span are needed. This class
will also be used to instantiate the task data instances.
"""
from .base import Task, ZeroaryTask, UnaryTask, BinaryTask, Features, ZeroaryFeatures, UnaryFeatures, BinaryFeatures
from .text_classification import (
    TopicClassificationFeatures,
    TopicClassificationTask,
    TextClassificationFeatures,
    TextClassificationTask,
)
from .span_classification import NamedEntityClassificationFeatures, NamedEntityClassificationTask
from .tuple_classification import (
    RelationClassificationFeatures,
    RelationClassificationTask,
    EventArgumentClassificationFeatures,
    EventArgumentClassificationTask,
    TACREDRelationClassificationTask,
    TACREDFeatures,
)

PREDEFINED_TASKS = {
    "zero-ary": (ZeroaryTask, ZeroaryFeatures),
    "unary": (UnaryTask, UnaryFeatures),
    "binary": (BinaryTask, BinaryFeatures),
    "topic-classification": (TopicClassificationTask, TopicClassificationFeatures),
    "named-entity-classification": (NamedEntityClassificationTask, NamedEntityClassificationFeatures),
    "relation-classification": (RelationClassificationTask, RelationClassificationFeatures),
    "event-argument-classification": (EventArgumentClassificationTask, EventArgumentClassificationFeatures),
    "tacred": (TACREDRelationClassificationTask, TACREDFeatures),
}

__all__ = [
    "Task",
    "Features",
    "ZeroaryTask",
    "ZeroaryFeatures",
    "UnaryTask",
    "UnaryFeatures",
    "BinaryTask",
    "BinaryFeatures",
    "TopicClassificationFeatures",
    "TopicClassificationTask",
    "TextClassificationFeatures",
    "TextClassificationTask",
    "NamedEntityClassificationFeatures",
    "NamedEntityClassificationTask",
    "RelationClassificationFeatures",
    "RelationClassificationTask",
    "EventArgumentClassificationFeatures",
    "EventArgumentClassificationTask",
    "TACREDFeatures",
    "TACREDRelationClassificationTask",
    "PREDEFINED_TASKS",
]

# Ignore __dataclass_fields__ variables on documentation
__pdoc__ = {
    **{
        f"{_class}.{varname}": False
        for _class in __all__
        if hasattr(eval(_class), "__dataclass_fields__")
        for varname in eval(_class).__dataclass_fields__.keys()
    }
}
Classes
class Task (name: str = None, required_variables: List[str] = <factory>, additional_variables: List[str] = <factory>, labels: List[str] = <factory>, templates: Dict[str, List[str]] = <factory>, valid_conditions: Dict[str, List[str]] = None, negative_label_id: int = -1, multi_label: bool = False, features_class: type = a2t.tasks.base.Features)
Abstract class for Task definitions. The method _assert_constraints() must be overridden.
Args
name (str, optional): A name for the task, which may be used to differentiate tasks when saving. Defaults to None.
required_variables (List[str], optional): The variables that are required to perform the task and must be implemented by the Features class. Defaults to an empty list.
additional_variables (List[str], optional): The variables that are not required to perform the task but must still be implemented by the Features class. Defaults to an empty list.
labels (List[str], optional): The labels for the task. Defaults to an empty list.
templates (Dict[str, List[str]], optional): The templates/verbalizations for the task. Defaults to an empty dict.
valid_conditions (Dict[str, List[str]], optional): The valid conditions or constraints for the task. Defaults to None.
negative_label_id (int, optional): The index of the negative label, or -1 if no negative label exists. A negative label is, for example, the class Other in NER, which means that the specific token is not a named entity. Defaults to -1.
multi_label (bool, optional): Whether the task must be treated as multi-label. A task that contains a negative label should be treated as multi-label. Defaults to False.
features_class (type, optional): The Features class related to the task. Defaults to Features.
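To make the role of templates more concrete, here is a small sketch of the bookkeeping that __post_init__ performs in the source listing: each template is mapped back to its labels, and each label to the positions of its templates in a flat template list. The labels and template sentences are made up for illustration, and plain dictionaries stand in for the attributes of a real Task.

```python
from collections import defaultdict

# Hypothetical labels and verbalizations for a topic classification task.
labels = ["politics", "sports"]
templates = {
    "politics": ["The text is about politics.", "The text is about government."],
    "sports": ["The text is about sports."],
}

# Map every template back to the label(s) it verbalizes.
template2label = defaultdict(list)
for label, label_templates in templates.items():
    for template in label_templates:
        template2label[template].append(label)

# Flat list of unique templates, plus each template's position in it.
template_list = list(template2label.keys())
template2id = {template: i for i, template in enumerate(template_list)}

# For every label, the indices of its templates in template_list.
label2templateid = {
    label: [template2id[template] for template in label_templates]
    for label, label_templates in templates.items()
}
print(label2templateid)
```

The flat template list is what gets paired with each context, and the label-to-index mapping is what later lets template scores be folded back into label scores.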
Expand source code
@dataclass
class Task:
    """Abstract class for Tasks definition.
    The method `_assert_constraints()` must be overrided.

    Args:
        name (str, optional): A name for the task that may be used for to differentiate task when saving. Defaults to None.
        required_variables (List[str], optional): The variables required to perform the task and must be implemented by the `Features` class. Defaults to empty list.
        additional_variables (List[str], optional): The variables not required to perform the task and must be implemented by the `Features` class. Defaults to empty list.
        labels (List[str], optional): The labels for the task. Defaults to empty list.
        templates (Dict[str, List[str]], optional): The templates/verbalizations for the task. Defaults to empty dict.
        valid_conditions (Dict[str, List[str]], optional): The valid conditions or constraints for the task. Defaults to None.
        negative_label_id (int, optional): The index of the negative label or -1 if no negative label exist. A negative label is for example the class `Other` on NER, that means that the specific token is not a named entity. Defaults to -1.
        multi_label (bool, optional): Whether the task must be treated as multi-label or not. You should treat as multi-label task a task that contains a negative label. Defaults to False.
        features_class (type, optional): The `Features` class related to the task. Default to `Features`.
    """

    name: str = None
    required_variables: List[str] = field(default_factory=list)
    additional_variables: List[str] = field(default_factory=list)
    labels: List[str] = field(default_factory=list)
    templates: Dict[str, List[str]] = field(default_factory=dict)
    valid_conditions: Dict[str, List[str]] = None
    negative_label_id: int = -1  # -1 for no negative class
    multi_label: bool = False
    features_class: type = Features

    def __post_init__(self):
        self._assert_minimal_constraints()
        self._assert_constraints()

        self.label2id = {label: i for i, label in enumerate(self.labels)}
        self.n_labels = len(self.labels)

        if not self.templates:
            self.templates = {}

        # Create the templates to label mapping
        self.template2label = defaultdict(list)
        for label, templates in self.templates.items():
            for template in templates:
                self.template2label[template].append(label)

        self.template_list = list(self.template2label.keys())
        template2id = {template: i for i, template in enumerate(self.template_list)}

        self.label2templateid = defaultdict(list)
        for label, templates in self.templates.items():
            self.label2templateid[label].extend([template2id[template] for template in templates])

        # Create the valid_conditions matrix
        if self.valid_conditions:
            self._valid_conditions = {}
            self._always_valid_labels = np.zeros(self.n_labels)
            self._always_valid_labels[self.negative_label_id] = 1.0
            for label, conditions in self.valid_conditions.items():
                if label not in self.labels:
                    continue
                for condition in conditions:
                    if condition == "*":
                        self._always_valid_labels[self.label2id[label]] = 1.0
                        continue
                    if condition not in self._valid_conditions:
                        self._valid_conditions[condition] = np.zeros(self.n_labels)
                        if self.negative_label_id >= 0:
                            self._valid_conditions[condition][self.negative_label_id] = 1.0
                    self._valid_conditions[condition][self.label2id[label]] = 1.0
        else:
            self._valid_conditions = None

        def idx2label(idx):
            return self.labels[idx]

        self.idx2label = np.vectorize(idx2label)

    def __repr__(self) -> str:
        class_name = self.name if self.name else str(self.__class__)
        labels_repr = self.labels.__repr__()
        if len(labels_repr) > 89:
            labels_repr = self.labels[:3].__repr__().replace("]", ", ...]")
        templates_repr = len(self.template2label)
        feature_class_repr = str(self.features_class)

        return (
            f"{class_name} ("
            f"\n\tLabels: {labels_repr}"
            f"\n\tTemplates: {templates_repr}"
            f"\n\tFeatures: {feature_class_repr}"
            "\n)"
        )

    def _assert_constraints(self):
        raise NotImplementedError(f"{self.__class__} is an abstract class. This method should be implemented.")

    def _assert_minimal_constraints(self):
        assert len(self.labels) > 0, "The number of labels should be greather than 0."

        assert self.negative_label_id < len(
            self.labels
        ), "The id for the negative label should be lower than the amount of labels."

        if self.negative_label_id >= 0:
            assert self.templates is not None and len(
                [value for values in self.templates.values() for value in values]
            ), "`templates` parameter must not be None nor empty."

        # assert all(
        #     key in self.labels for key in self.templates.keys()
        # ), "All the keys of templates dicts must be defined on labels."
        for key in list(self.templates.keys()):
            if key not in self.labels:
                warnings.warn(f"Label {key} not found among valid labels. Templates for label {key} not loaded.")
                del self.templates[key]

        if self.valid_conditions:
            # assert all(
            #     key in self.labels for key in self.valid_conditions.keys()
            # ), "All the keys of valid_conditions dict must be defined on labels."
            for key in list(self.valid_conditions.keys()):
                if key not in self.labels:
                    warnings.warn(f"Label {key} not found among valid labels. Valid conditions for label {key} not loaded.")
                    del self.valid_conditions[key]

        assert all(
            var in self.features_class.__dict__["__dataclass_fields__"]
            for var in self.required_variables + self.additional_variables
        ), "All variables should be defined on the features_class."

        assert all(
            var.strip("{").strip("}") in [*self.required_variables, *self.additional_variables]
            for templates in self.templates.values()
            for template in templates
            for var in re.findall(r"{\w+}", template)
        )

    def assert_features_class(self, features: List[Features]) -> None:
        """Assert that all features are instance of the task specific `Features` class.

        Args:
            features (List[Features]): The list of features to check.

        Raises:
            IncorrectFeatureTypeError: Raised when any feature is not an instance of the task specific `Features` class.
        """
        for feature in features:
            if not isinstance(feature, self.features_class):
                raise IncorrectFeatureTypeError(
                    f"Incorrect feature type given. Expected {self.features_class} but obtained {type(feature)}."
                )

    def generate_premise_hypotheses_pairs(self, features: List[Features], sep_token: str = "</s>") -> List[str]:
        """Generate premise-hypothesis pairs based on the `Task` templates.

        Args:
            features (List[Features]): The list of features.
            sep_token (str, optional): The model specific separator token. Defaults to "</s>".

        Returns:
            List[str]: The list of premise-hypothesis pairs generated from the features and templates.
        """
        if not isinstance(features, list):
            features = [features]

        sentence_pairs = [
            f"{feature.context} {sep_token} {template.format(**feature.__dict__)}"
            for feature in features
            for template in self.template_list
        ]
        return sentence_pairs

    def reverse_to_labels(self, template_probs: np.ndarray, collate_fn: Callable = np.max) -> np.ndarray:
        """A function that maps template probabilities to label probabilites. By default, the maximum probabilities among
        label related templates is used.

        Args:
            template_probs (np.ndarray): (batch_size, n_templates) The templates probabilites.
            collate_fn (Callable, optional): The probabilites collate function. Defaults to np.max.

        Returns:
            np.ndarray: (batch_size, n_labels) The labels probabilities.
        """
        outputs = np.hstack(
            [
                collate_fn(template_probs[:, self.label2templateid[label]], axis=-1, keepdims=True)
                if label in self.label2templateid
                else np.zeros((template_probs.shape[0], 1))
                for label in self.labels
            ]
        )
        return outputs

    def apply_valid_conditions(self, features: List[Features], probs: np.ndarray) -> np.ndarray:
        """Applies the valid conditions to the labels probabilities. If a constraint is not satisfied the probability is set to 0.

        Args:
            features (List[Features]): (batch_size,) The list of features.
            probs (np.ndarray): (batch_size, n_labels) The labels probabilities.

        Returns:
            np.ndarray: (batch_size, n_labels) The labels probabilities.
        """
        if self._valid_conditions:
            mask_matrix = np.stack(
                [self._valid_conditions.get(feature.inst_type, np.zeros(self.n_labels)) for feature in features],
                axis=0,
            )
            probs = probs * np.logical_or(mask_matrix, self._always_valid_labels)  # TODO: Need a test
        return probs

    def compute_metrics(
        self, labels: np.ndarray, output: np.ndarray, threshold: Union[str, float] = "optimize"
    ) -> Dict[str, float]:
        """Compute the metrics for the given task. This method is abstract and needs to be overrided.

        Args:
            labels (np.ndarray): (batch_size,) The correct labels.
            output (np.ndarray): (batch_size, n_labels) The labels probabilities.
            threshold (Union[str, float], optional): The threshold to use on the evaluation. Options:

                * **"default"**: The threshold is set to 0.5.
                * **"optimize"**: Optimize the threshold with the `labels`. Intended to be used on the development split.
                * **`float`**: A specific float value for the threshold.

                Defaults to "optimize".

        Raises:
            NotImplementedError: Raise if not overrided.

        Returns:
            Dict[str, float]: Dict with the resulting metrics.
        """
        # TODO: Unittest
        raise NotImplementedError("This method must be implemented.")

    @classmethod
    def from_config(cls, file_path: str) -> object:
        """Loads the Task instance from a configuration file.

        Args:
            file_path (str): The path to the configuration file.

        Returns:
            Task: A `Task` instance based on the configuration file.
        """
        with open(file_path, "rt") as f:
            config = json.load(f)

        if "features_class" in config:
            components = config["features_class"].split(".")
            mod = __import__(components[0])
            for comp in components[1:]:
                mod = getattr(mod, comp)

            config["features_class"] = mod

        params = set([p.name for p in fields(cls)]) | set(inspect.signature(cls).parameters.keys())
        params = {key: config[key] for key in params if key in config}

        return cls(**params)

    def to_config(self, file_path: str) -> None:
        """Saves the task instance to a configuration file.

        Args:
            file_path (str): The path to the configuration file.
        """
        os.makedirs(os.path.dirname(file_path), exist_ok=True)
        with open(file_path, "wt") as f:
            values = {key: value for key, value in vars(self).items()}
            values["features_class"] = values["features_class"].__module__ + "." + values["features_class"].__name__
            for key in ["label2id", "idx2label", "n_labels", "template2label", "label2templateid", "_valid_conditions"]:
                del values[key]

            json.dump(values, f, indent=4)
Subclasses
- a2t.tasks.base.BinaryTask
- a2t.tasks.base.UnaryTask
- a2t.tasks.base.ZeroaryTask
Static methods
def from_config(file_path: str) ‑> object
Loads the Task instance from a configuration file.
Args
file_path (str): The path to the configuration file.
Returns
Task: A Task instance based on the configuration file.
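The configuration stores features_class as a dotted path string, which has to be turned back into a class object when loading. The sketch below shows that resolution step on its own, using the same __import__ plus getattr walk that appears in the source listing. The helper name resolve_dotted_name is hypothetical, and a standard-library class is used in place of an a2t features class so the example runs without the library.

```python
def resolve_dotted_name(dotted_name: str):
    """Resolve a dotted path such as "package.module.ClassName" to the object it names."""
    components = dotted_name.split(".")
    # __import__ returns the top-level package; the remaining parts are
    # looked up as attributes, one level at a time.
    obj = __import__(components[0])
    for component in components[1:]:
        obj = getattr(obj, component)
    return obj


# Demonstrated with a standard-library class instead of an a2t features class.
ordered_dict_class = resolve_dotted_name("collections.OrderedDict")
print(ordered_dict_class)
```

One caveat of this attribute walk is that a submodule is only reachable through getattr if it has already been imported, for example by the package's own __init__ module.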
Methods
def assert_features_class(self, features: List[a2t.tasks.base.Features]) ‑> None
Assert that all features are instances of the task-specific Features class.
Args
features (List[Features]): The list of features to check.
Raises
IncorrectFeatureTypeError: Raised when any feature is not an instance of the task-specific Features class.
def generate_premise_hypotheses_pairs(self, features: List[a2t.tasks.base.Features], sep_token: str = '</s>') ‑> List[str]
Generate premise-hypothesis pairs based on the Task templates.
Args
features (List[Features]): The list of features.
sep_token (str, optional): The model-specific separator token. Defaults to "</s>".
Returns
List[str]: The list of premise-hypothesis pairs generated from the features and templates.
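The pairing logic can be sketched outside the library as follows: every feature's context becomes the premise, and each template, filled with the feature's variables, becomes a hypothesis. ExampleFeatures, build_pairs and the sample sentences are hypothetical names and data chosen for illustration; the list comprehension itself follows the one in the source listing.

```python
from dataclasses import dataclass


@dataclass
class ExampleFeatures:
    """Hypothetical features with a context and one span variable."""

    context: str
    X: str


def build_pairs(features, template_list, sep_token="</s>"):
    """Pair every feature's context with every filled-in template."""
    return [
        f"{feature.context} {sep_token} {template.format(**feature.__dict__)}"
        for feature in features
        for template in template_list
    ]


features = [ExampleFeatures(context="Paris is lovely in spring.", X="Paris")]
templates = ["{X} is a location.", "{X} is a person."]
pairs = build_pairs(features, templates)
for pair in pairs:
    print(pair)
```

The output contains one string per (feature, template) combination, ordered feature by feature, so a batch of n features and m templates yields n * m pairs.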
def reverse_to_labels(self, template_probs: numpy.ndarray, collate_fn: Callable = <function amax>) ‑> numpy.ndarray
Maps template probabilities to label probabilities. By default, the maximum probability among the templates related to each label is used.
Args
template_probs (np.ndarray): (batch_size, n_templates) The template probabilities.
collate_fn (Callable, optional): The function used to collate the probabilities. Defaults to np.max.
Returns
np.ndarray: (batch_size, n_labels) The label probabilities.
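The following is a simplified, NumPy-free sketch of the same idea: for each row of template scores, every label receives the collated (by default maximum) score of its own templates, and a label without templates receives 0. The function name templates_to_labels and the sample numbers are assumptions made for illustration; the library method operates on NumPy arrays instead of nested lists.

```python
def templates_to_labels(template_probs, labels, label2templateid, collate_fn=max):
    """Collapse per-template scores into per-label scores, row by row."""
    outputs = []
    for row in template_probs:
        label_scores = []
        for label in labels:
            template_ids = label2templateid.get(label, [])
            if template_ids:
                label_scores.append(collate_fn(row[i] for i in template_ids))
            else:
                # A label without templates gets probability 0.
                label_scores.append(0.0)
        outputs.append(label_scores)
    return outputs


labels = ["politics", "sports", "other"]
label2templateid = {"politics": [0, 1], "sports": [2]}  # "other" has no template
template_probs = [
    [0.2, 0.7, 0.1],
    [0.05, 0.1, 0.9],
]
label_probs = templates_to_labels(template_probs, labels, label2templateid)
print(label_probs)
```

Taking the maximum means that a single well-matching template is enough for its label to score highly, regardless of how the label's other templates fare.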
def apply_valid_conditions(self, features: List[a2t.tasks.base.Features], probs: numpy.ndarray) ‑> numpy.ndarray
Applies the valid conditions to the label probabilities. If a constraint is not satisfied, the probability is set to 0.
Args
features (List[Features]): (batch_size,) The list of features.
probs (np.ndarray): (batch_size, n_labels) The label probabilities.
Returns
np.ndarray: (batch_size, n_labels) The label probabilities.
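A simplified sketch of the masking step, written with plain lists: each instance type maps to a 0/1 mask over the labels, labels flagged as always valid are kept regardless of the mask, and every other probability is zeroed. The function name apply_mask, the label order, and the instance-type strings such as "PERSON:LOCATION" are all hypothetical; the library builds comparable masks as NumPy arrays from valid_conditions.

```python
def apply_mask(inst_types, probs, valid_conditions, always_valid):
    """Zero out the probability of labels that are not allowed for an instance type."""
    masked = []
    for inst_type, row in zip(inst_types, probs):
        # Unknown instance types fall back to an all-zero mask.
        mask = valid_conditions.get(inst_type, [0] * len(row))
        masked.append([p if (m or a) else 0.0 for p, m, a in zip(row, mask, always_valid)])
    return masked


# Assumed label order: ["no_relation", "born_in", "works_for"]
valid_conditions = {
    "PERSON:LOCATION": [1, 1, 0],
    "PERSON:ORGANIZATION": [1, 0, 1],
}
always_valid = [1, 0, 0]  # the negative label is always allowed

probs = [[0.1, 0.6, 0.3], [0.2, 0.5, 0.3]]
inst_types = ["PERSON:LOCATION", "DATE:DATE"]
masked = apply_mask(inst_types, probs, valid_conditions, always_valid)
print(masked)
```

In this example the second instance has a type with no registered condition, so only the always-valid label keeps its probability.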
def compute_metrics(self, labels: numpy.ndarray, output: numpy.ndarray, threshold: Union[str, float] = 'optimize') ‑> Dict[str, float]
Compute the metrics for the given task. This method is abstract and needs to be overridden.
Args
labels (np.ndarray): (batch_size,) The correct labels.
output (np.ndarray): (batch_size, n_labels) The label probabilities.
threshold (Union[str, float], optional): The threshold to use in the evaluation. Options:
- "default": The threshold is set to 0.5.
- "optimize": Optimize the threshold with the labels. Intended to be used on the development split.
- float: A specific float value for the threshold.
Defaults to "optimize".
Raises
NotImplementedError: Raised if not overridden.
Returns
Dict[str, float]: Dict with the resulting metrics.
def to_config(self, file_path: str) ‑> None
Saves the task instance to a configuration file.
Args
file_path (str): The path to the configuration file.
class Features (context: str, label: str = None, inst_type: str = None)
A simple class to handle the features information.
Args
context (str): The context sentence.
label (str, optional): The label of the instance.
inst_type (str, optional): The type of the instance. This information is used for the valid_conditions constraints.
Expand source code
@dataclass
class Features:
    """A simple class to handle the features information.

    Args:
        context (str): The context sentence.
        label (str, optional): The label of the instance.
        inst_type (str, optional): The type of the instance. This information is used for the `valid_conditions' constraints.
    """

    context: str
    label: str = None
    inst_type: str = None
Subclasses
- a2t.tasks.base.BinaryFeatures
- a2t.tasks.base.UnaryFeatures
- a2t.tasks.base.ZeroaryFeatures
- a2t.tasks.text_classification.TextClassificationFeatures
- a2t.tasks.text_classification.TopicClassificationFeatures
- a2t.tasks.tuple_classification.EventArgumentClassificationFeatures
- a2t.tasks.tuple_classification.TACREDFeatures
class ZeroaryTask (name: str = None, required_variables: List[str] = <factory>, additional_variables: List[str] = <factory>, labels: List[str] = <factory>, templates: Dict[str, List[str]] = <factory>, valid_conditions: Dict[str, List[str]] = None, negative_label_id: int = -1, multi_label: bool = False, features_class: type = a2t.tasks.base.ZeroaryFeatures)
A Task implementation for Text Classification-like tasks.
Args
name (str, optional): A name for the task, which may be used to differentiate tasks when saving. Defaults to None.
required_variables (List[str], optional): The variables that are required to perform the task and must be implemented by the ZeroaryFeatures class. Defaults to an empty list.
additional_variables (List[str], optional): The variables that are not required to perform the task but must still be implemented by the ZeroaryFeatures class. Defaults to an empty list.
labels (List[str], optional): The labels for the task. Defaults to an empty list.
templates (Dict[str, List[str]], optional): The templates/verbalizations for the task. Defaults to an empty dict.
valid_conditions (Dict[str, List[str]], optional): The valid conditions or constraints for the task. Defaults to None.
multi_label (bool, optional): Whether the task must be treated as multi-label. A task that contains a negative label should be treated as multi-label. Defaults to False.
features_class (type, optional): The Features class related to the task. Defaults to ZeroaryFeatures.
negative_label_id (int, optional): The index of the negative label, or -1 if no negative label exists. A negative label is, for example, the class Other in NER, which means that the specific token is not a named entity. Defaults to -1.
Expand source code
@dataclass
class ZeroaryTask(Task):
    """A `Task` implementation for Text Classification like tasks.

    Args:
        name (str, optional): A name for the task that may be used for to differentiate task when saving. Defaults to None.
        required_variables (List[str], optional): The variables required to perform the task and must be implemented by the `ZeroaryFeatures` class. Defaults to empty list.
        additional_variables (List[str], optional): The variables not required to perform the task and must be implemented by the `ZeroaryFeatures` class. Defaults to empty list.
        labels (List[str], optional): The labels for the task. Defaults to empty list.
        templates (Dict[str, List[str]], optional): The templates/verbalizations for the task. Defaults to empty dict.
        valid_conditions (Dict[str, List[str]], optional): The valid conditions or constraints for the task. Defaults to None.
        multi_label (bool, optional): Whether the task must be treated as multi-label or not. You should treat as multi-label task a task that contains a negative label. Defaults to False.
        features_class (type, optional): The `Features` class related to the task. Defaults to `ZeroaryFeatures`.
        negative_label_id (int, optional): The index of the negative label or -1 if no negative label exist. A negative label is for example the class `Other` on NER, that means that the specific token is not a named entity. Defaults to -1.
    """

    features_class: type = ZeroaryFeatures

    def _assert_constraints(self):
        # Assert the number of required variables to be 0
        assert len(self.required_variables) == 0, "Zero-ary tasks like Text classifiation do not require any variable."

    def compute_metrics(self, labels: np.ndarray, output: np.ndarray, threshold: Union[str, float] = None) -> Dict[str, float]:
        """Compute the metrics for the given task. By default on `ZeroaryTask` the Accuracy is computed.

        Args:
            labels (np.ndarray): (batch_size,) The correct labels.
            output (np.ndarray): (batch_size, n_labels) The labels probabilities.
            threshold (Union[str, float], optional): No threshold is needed on `ZeroaryTask`.

        Returns:
            Dict[str, float]: Dict with the resulting metrics.
        """
        # TODO: Unittest
        if threshold:
            warnings.warn(f"{self.__class__} do not require 'threshold', ignored.")

        return {"accuracy_score": accuracy_score(labels, output.argmax(-1))}
Ancestors
- a2t.tasks.base.Task
Subclasses
- a2t.tasks.text_classification.TextClassificationTask
- a2t.tasks.text_classification.TopicClassificationTask
Methods
def compute_metrics(self, labels: numpy.ndarray, output: numpy.ndarray, threshold: Union[str, float] = None) ‑> Dict[str, float]
Compute the metrics for the given task. By default, ZeroaryTask computes the accuracy.
Args
labels (np.ndarray): (batch_size,) The correct labels.
output (np.ndarray): (batch_size, n_labels) The label probabilities.
threshold (Union[str, float], optional): No threshold is needed for ZeroaryTask.
Returns
Dict[str, float]: Dict with the resulting metrics.
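For intuition, the accuracy computation amounts to taking the highest-scoring label per row and comparing it with the gold label. The pure-Python stand-in below is only a sketch of that idea: the function name accuracy_from_probs and the sample data are hypothetical, and the source listing instead calls accuracy_score on output.argmax(-1).

```python
def accuracy_from_probs(labels, output):
    """Compare the arg-max prediction of each row with the gold label index."""
    predictions = [max(range(len(row)), key=row.__getitem__) for row in output]
    correct = sum(int(pred == gold) for pred, gold in zip(predictions, labels))
    return {"accuracy_score": correct / len(labels)}


labels = [0, 2, 1, 1]
output = [
    [0.8, 0.1, 0.1],
    [0.2, 0.2, 0.6],
    [0.3, 0.4, 0.3],
    [0.5, 0.3, 0.2],  # predicted 0, gold is 1
]
metrics = accuracy_from_probs(labels, output)
print(metrics)
```

No threshold is involved because exactly one label, the most probable one, is chosen for every instance.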
class ZeroaryFeatures (context: str, label: str = None, inst_type: str = None)
A features class for ZeroaryTask. It only requires a context argument.
Expand source code
class ZeroaryFeatures(Features):
    """A features class for `ZeroaryTask`. It only requires a `context` argument."""

    pass
Ancestors
- a2t.tasks.base.Features
class UnaryTask (name: str = None, required_variables: List[str] = <factory>, additional_variables: List[str] = <factory>, labels: List[str] = <factory>, templates: Dict[str, List[str]] = <factory>, valid_conditions: Dict[str, List[str]] = None, negative_label_id: int = -1, multi_label: bool = False, features_class: type = a2t.tasks.base.UnaryFeatures)
A Task implementation for Span Classification-like tasks.
Args
name (str, optional): A name for the task, which may be used to differentiate tasks when saving. Defaults to None.
required_variables (List[str], optional): The variables that are required to perform the task and must be implemented by the UnaryFeatures class. Defaults to ["X"].
additional_variables (List[str], optional): The variables that are not required to perform the task but must still be implemented by the UnaryFeatures class. Defaults to an empty list.
labels (List[str], optional): The labels for the task. Defaults to an empty list.
templates (Dict[str, List[str]], optional): The templates/verbalizations for the task. Defaults to an empty dict.
valid_conditions (Dict[str, List[str]], optional): The valid conditions or constraints for the task. Defaults to None.
multi_label (bool, optional): Whether the task must be treated as multi-label. A task that contains a negative label should be treated as multi-label. Defaults to False.
features_class (type, optional): The Features class related to the task. Defaults to UnaryFeatures.
negative_label_id (int, optional): The index of the negative label, or -1 if no negative label exists. A negative label is, for example, the class Other in NER, which means that the specific token is not a named entity. Defaults to -1.
@dataclass
class UnaryTask(Task):
    """A `Task` implementation for Span Classification like tasks.

    Args:
        name (str, optional): A name for the task that may be used to differentiate tasks when saving.
            Defaults to None.
        required_variables (List[str], optional): The variables required to perform the task; they must be
            implemented by the `UnaryFeatures` class. Defaults to `["X"]`.
        additional_variables (List[str], optional): The variables that are not required to perform the task but
            must be implemented by the `UnaryFeatures` class. Defaults to empty list.
        labels (List[str], optional): The labels for the task. Defaults to empty list.
        templates (Dict[str, List[str]], optional): The templates/verbalizations for the task. Defaults to empty dict.
        valid_conditions (Dict[str, List[str]], optional): The valid conditions or constraints for the task.
            Defaults to None.
        multi_label (bool, optional): Whether the task must be treated as multi-label or not. A task that contains
            a negative label should be treated as multi-label. Defaults to False.
        features_class (type, optional): The `Features` class related to the task. Defaults to `UnaryFeatures`.
        negative_label_id (int, optional): The index of the negative label or -1 if no negative label exists.
            A negative label is for example the class `Other` on NER, that means that the specific token is not
            a named entity. Defaults to -1.
    """

    required_variables: List[str] = field(default_factory=lambda: ["X"])
    features_class: type = UnaryFeatures

    def _assert_constraints(self):
        # Assert the number of required variables to be 1
        assert len(self.required_variables) == 1, "Unary-ary tasks like Span classifiation requires 1 variable."

    def compute_metrics(self, labels: np.ndarray, output: np.ndarray, threshold: Union[str, float] = "optimize"):
        """Compute the metrics for the given task. By default on `UnaryTask` the Accuracy is computed if
        the `negative_label_id` is < 0, otherwise the Precision, Recall, F1-Score and positive Accuracy are
        computed.

        Args:
            labels (np.ndarray): (batch_size,) The correct labels.
            output (np.ndarray): (batch_size, n_labels) The labels probabilities.
            threshold (Union[str, float], optional): The threshold to use on the evaluation. Options:

                * "default": The threshold is set to 0.5.
                * "optimize": Optimize the threshold with the `labels`. Intended to be used on the development split.
                * `float`: A specific float value for the threshold.

                Defaults to "optimize".

        Returns:
            Dict[str, float]: Dict with the resulting metrics.
        """
        # TODO: Unittest

        if threshold not in ["default", "optimize"] and not isinstance(threshold, float):
            raise ValueError("Threshold must be either 'default', 'optimize' or a float value.")

        if threshold == "default":
            threshold = 0.5

        if threshold == "optimize":
            threshold, _ = find_optimal_threshold(labels, output, negative_label_id=self.negative_label_id)

        results = {"optimal_threshold": threshold}
        if self.negative_label_id < 0:
            results["accuracy_score"] = accuracy_score(labels, output.argmax(-1))
        else:
            output_ = apply_threshold(output, threshold=threshold, negative_label_id=self.negative_label_id)
            positive_labels = list(set(range(len(self.labels))) - set([self.negative_label_id]))
            output_pos = output.copy()
            output_pos[:, self.negative_label_id] = 0.0

            results["positive_accuracy"] = accuracy_score(
                labels[labels != self.negative_label_id], output_pos[labels != self.negative_label_id, :].argmax(-1)
            )

            pre, rec, f1, _ = precision_recall_fscore_support(labels, output_, labels=positive_labels, average="micro")

            results["precision"] = pre
            results["recall"] = rec
            results["f1-score"] = f1

        return results
Ancestors
- a2t.tasks.base.Task
Subclasses
- a2t.tasks.span_classification.NamedEntityClassificationTask
Methods
def compute_metrics(self, labels: numpy.ndarray, output: numpy.ndarray, threshold: Union[str, float] = 'optimize')
- Compute the metrics for the given task. By default on UnaryTask, the accuracy is computed if negative_label_id is < 0; otherwise the precision, recall, F1-score and positive accuracy are computed.
Args
labels : np.ndarray
- (batch_size,) The correct labels.
output : np.ndarray
- (batch_size, n_labels) The label probabilities.
threshold : Union[str, float], optional
- The threshold to use on the evaluation. Options:
- "default": The threshold is set to 0.5.
- "optimize": Optimize the threshold with the labels. Intended to be used on the development split.
- float: A specific float value for the threshold.
Defaults to "optimize".
Returns
Dict[str, float]
- Dict with the resulting metrics.
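The handling of the `threshold` argument can be isolated into a small function. This is a sketch only: `resolve_threshold` is a hypothetical name, and the `optimize` callable stands in for `find_optimal_threshold`, whose implementation is not shown on this page.

```python
from typing import Callable, Union


def resolve_threshold(threshold: Union[str, float], optimize: Callable[[], float]) -> float:
    """Map the `threshold` argument to a concrete float value."""
    # Same validation as in compute_metrics: only two keywords or a float are accepted.
    if threshold not in ("default", "optimize") and not isinstance(threshold, float):
        raise ValueError("Threshold must be either 'default', 'optimize' or a float value.")
    if threshold == "default":
        return 0.5
    if threshold == "optimize":
        # Placeholder for a search over the development split.
        return optimize()
    return threshold


default_value = resolve_threshold("default", optimize=lambda: 0.0)
fixed_value = resolve_threshold(0.3, optimize=lambda: 0.0)
tuned_value = resolve_threshold("optimize", optimize=lambda: 0.42)
```

Note that the `isinstance(threshold, float)` check rejects integers, so passing `1` instead of `1.0` raises a `ValueError`.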
class UnaryFeatures (context: str, label: str = None, inst_type: str = None, X: str = None)
- A features class for UnaryTask. It requires context and X arguments.
@dataclass
class UnaryFeatures(Features):
    """A features class for `UnaryTask`. It requires `context` and `X` arguments."""

    X: str = None
Ancestors
- a2t.tasks.base.Features
Subclasses
- a2t.tasks.span_classification.NamedEntityClassificationFeatures
class BinaryTask (name: str = None, required_variables: List[str] = <factory>, additional_variables: List[str] = <factory>, labels: List[str] = <factory>, templates: Dict[str, List[str]] = <factory>, valid_conditions: Dict[str, List[str]] = None, negative_label_id: int = -1, multi_label: bool = False, features_class: type = a2t.tasks.base.BinaryFeatures)
- A Task implementation for Relation Classification-like tasks.
Args
name : str, optional
- A name for the task, which may be used to differentiate tasks when saving. Defaults to None.
required_variables : List[str], optional
- The variables required to perform the task; they must be implemented by the BinaryFeatures class. Defaults to ["X", "Y"].
additional_variables : List[str], optional
- The variables that are not required to perform the task but must be implemented by the BinaryFeatures class. Defaults to an empty list.
labels : List[str], optional
- The labels for the task. Defaults to an empty list.
templates : Dict[str, List[str]], optional
- The templates/verbalizations for the task. Defaults to an empty dict.
valid_conditions : Dict[str, List[str]], optional
- The valid conditions or constraints for the task. Defaults to None.
multi_label : bool, optional
- Whether the task must be treated as multi-label or not. A task that contains a negative label should be treated as multi-label. Defaults to False.
features_class : type, optional
- The Features class related to the task. Defaults to BinaryFeatures.
negative_label_id : int, optional
- The index of the negative label, or -1 if no negative label exists. A negative label is, for example, the class Other in NER, which means that the specific token is not a named entity. Defaults to -1.
@dataclass
class BinaryTask(Task):
    """A `Task` implementation for Relation Classification like tasks.

    Args:
        name (str, optional): A name for the task that may be used to differentiate tasks when saving.
            Defaults to None.
        required_variables (List[str], optional): The variables required to perform the task; they must be
            implemented by the `BinaryFeatures` class. Defaults to `["X", "Y"]`.
        additional_variables (List[str], optional): The variables that are not required to perform the task but
            must be implemented by the `BinaryFeatures` class. Defaults to empty list.
        labels (List[str], optional): The labels for the task. Defaults to empty list.
        templates (Dict[str, List[str]], optional): The templates/verbalizations for the task. Defaults to empty dict.
        valid_conditions (Dict[str, List[str]], optional): The valid conditions or constraints for the task.
            Defaults to None.
        multi_label (bool, optional): Whether the task must be treated as multi-label or not. A task that contains
            a negative label should be treated as multi-label. Defaults to False.
        features_class (type, optional): The `Features` class related to the task. Defaults to `BinaryFeatures`.
        negative_label_id (int, optional): The index of the negative label or -1 if no negative label exists.
            A negative label is for example the class `Other` on NER, that means that the specific token is not
            a named entity. Defaults to -1.
    """

    required_variables: List[str] = field(default_factory=lambda: ["X", "Y"])
    features_class: type = BinaryFeatures

    def _assert_constraints(self):
        # Assert the number of required variables to be 2
        assert len(self.required_variables) == 2, "Binary-ary tasks like Tuple classifiation require 2 variable."

    def compute_metrics(self, labels: np.ndarray, output: np.ndarray, threshold: Union[str, float] = "optimize"):
        """Compute the metrics for the given task. By default on `BinaryTask` the Accuracy is computed if
        the `negative_label_id` is < 0, otherwise the Precision, Recall, F1-Score and positive Accuracy are
        computed.

        Args:
            labels (np.ndarray): (batch_size,) The correct labels.
            output (np.ndarray): (batch_size, n_labels) The labels probabilities.
            threshold (Union[str, float], optional): The threshold to use on the evaluation. Options:

                * "default": The threshold is set to 0.5.
                * "optimize": Optimize the threshold with the `labels`. Intended to be used on the development split.
                * `float`: A specific float value for the threshold.

                Defaults to "optimize".

        Returns:
            Dict[str, float]: Dict with the resulting metrics.
        """
        # TODO: Unittest + documentation
        if threshold not in ["default", "optimize"] and not isinstance(threshold, float):
            raise ValueError("Threshold must be either 'default', 'optimize' or a float value.")

        if threshold == "default":
            threshold = 0.5

        if threshold == "optimize":
            threshold, _ = find_optimal_threshold(labels, output, negative_label_id=self.negative_label_id)

        results = {"optimal_threshold": threshold}
        if self.negative_label_id < 0:
            results["accuracy_score"] = accuracy_score(labels, output.argmax(-1))
        else:
            output_ = apply_threshold(output, threshold=threshold, negative_label_id=self.negative_label_id)
            positive_labels = list(set(range(len(self.labels))) - set([self.negative_label_id]))
            output_pos = output.copy()
            output_pos[:, self.negative_label_id] = 0.0

            results["positive_accuracy"] = accuracy_score(
                labels[labels != self.negative_label_id], output_pos[labels != self.negative_label_id, :].argmax(-1)
            )

            pre, rec, f1, _ = precision_recall_fscore_support(labels, output_, labels=positive_labels, average="micro")

            results["precision"] = pre
            results["recall"] = rec
            results["f1-score"] = f1

        return results
Ancestors
- a2t.tasks.base.Task
Subclasses
- a2t.tasks.tuple_classification.EventArgumentClassificationTask
- a2t.tasks.tuple_classification.RelationClassificationTask
Methods
def compute_metrics(self, labels: numpy.ndarray, output: numpy.ndarray, threshold: Union[str, float] = 'optimize')
- Compute the metrics for the given task. By default on BinaryTask, the accuracy is computed if negative_label_id is < 0; otherwise the precision, recall, F1-score and positive accuracy are computed.
Args
labels : np.ndarray
- (batch_size,) The correct labels.
output : np.ndarray
- (batch_size, n_labels) The label probabilities.
threshold : Union[str, float], optional
- The threshold to use on the evaluation. Options:
- "default": The threshold is set to 0.5.
- "optimize": Optimize the threshold with the labels. Intended to be used on the development split.
- float: A specific float value for the threshold.
Defaults to "optimize".
Returns
Dict[str, float]
- Dict with the resulting metrics.
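When a negative label is present, precision, recall and F1 are micro-averaged over the positive labels only. The sketch below reproduces that arithmetic on already-thresholded label ids in plain Python. `micro_prf_positive` is a hypothetical helper, and returning 0.0 for empty denominators is a simplification chosen here, not necessarily what the library reports.

```python
from typing import Dict, Sequence


def micro_prf_positive(gold: Sequence[int], pred: Sequence[int], negative_label_id: int) -> Dict[str, float]:
    """Micro-averaged precision/recall/F1 restricted to the non-negative labels."""
    # A true positive is a correct prediction of a positive (non-negative) label.
    true_pos = sum(1 for g, p in zip(gold, pred) if g == p and g != negative_label_id)
    pred_pos = sum(1 for p in pred if p != negative_label_id)  # everything predicted as positive
    gold_pos = sum(1 for g in gold if g != negative_label_id)  # everything that is actually positive
    precision = true_pos / pred_pos if pred_pos else 0.0
    recall = true_pos / gold_pos if gold_pos else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1-score": f1}


gold = [0, 1, 2, 1, 0, 2]
pred = [0, 1, 1, 0, 2, 2]
scores = micro_prf_positive(gold, pred, negative_label_id=0)
```

In this toy example two of the four positive predictions are correct and two of the four positive gold labels are recovered, so precision, recall and F1 are all 0.5. Correctly predicting the negative label (the first instance) does not contribute to any of the three scores.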
class BinaryFeatures (context: str, label: str = None, inst_type: str = None, X: str = None, Y: str = None)
- A features class for BinaryTask. It requires context, X and Y arguments.
@dataclass
class BinaryFeatures(Features):
    """A features class for `BinaryTask`. It requires `context`, `X` and `Y` arguments."""

    X: str = None
    Y: str = None
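The features classes form a small dataclass hierarchy in which each level adds the span variables it needs. The sketch below rebuilds that hierarchy with local stand-in classes, with field names and defaults taken from the signatures on this page, so it runs without a2t installed. The sentence and the label are illustrative values only.

```python
from dataclasses import dataclass, fields


@dataclass
class Features:
    """Stand-in for a2t.tasks.base.Features (fields copied from the signatures above)."""

    context: str
    label: str = None
    inst_type: str = None


@dataclass
class UnaryFeatures(Features):
    X: str = None  # the single span to classify


@dataclass
class BinaryFeatures(Features):
    X: str = None  # first span
    Y: str = None  # second span


instance = BinaryFeatures(
    context="Billy Mays, the bearded pitchman, died at his home in Tampa.",
    X="Billy Mays",
    Y="Tampa",
    label="per:city_of_death",  # illustrative label
)
field_names = [f.name for f in fields(instance)]
```

Because dataclass fields are inherited in definition order, `field_names` lists `context`, `label` and `inst_type` first, followed by `X` and `Y`.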
Ancestors
- a2t.tasks.base.Features
Subclasses
- a2t.tasks.tuple_classification.RelationClassificationFeatures
class TopicClassificationFeatures (context: str, label: str = None, inst_type: str = None)
- A class handler for the Topic Classification features. It inherits from Features.
@dataclass
class TopicClassificationFeatures(Features):
    """A class handler for the Topic Classification features. It inherits from `Features`."""

    pass
Ancestors
- a2t.tasks.base.Features
class TopicClassificationTask (name: str = None, labels: List[str] = None, templates: Dict[str, List[str]] = None, hypothesis_template: str = 'The domain of the sentence is about {label}.', features_class: type = a2t.tasks.text_classification.TopicClassificationFeatures, preprocess_labels: bool = False, preprocess_fn: Callable = None, **kwargs)
- A class handler for the Topic Classification task. It inherits from the ZeroaryTask class.
Initialization of a TopicClassification task.
Args
name : str, optional
- A name for the task, which may be used to differentiate tasks when saving. Defaults to None.
labels : List[str]
- The labels for the task. Defaults to None, but they are required when templates is not provided.
templates : Dict[str, List[str]], optional
- The templates/verbalizations for the task. Defaults to None.
hypothesis_template : str, optional
- A meta template to generate hypothesis templates. If templates is None, the templates will be the combinations of the hypothesis_template and the labels. It must contain the '{label}' placeholder. Defaults to "The domain of the sentence is about {label}.".
features_class : type, optional
- The Features class related to the task. Defaults to TopicClassificationFeatures.
preprocess_labels : bool, optional
- Whether to split the topic labels. Defaults to False.
preprocess_fn : Callable, optional
- The function that is applied to each label if preprocess_labels is True. If None, TopicClassificationTask._split_and_extend_labels_fn is applied. Defaults to None.
Raises
IncorrectHypothesisTemplateError
- Raised when the hypothesis_template argument does not contain the '{label}' placeholder.
class TopicClassificationTask(ZeroaryTask):
    """A class handler for Topic Classification task. It inherits from `ZeroaryTask` class."""

    def __init__(
        self,
        name: str = None,
        labels: List[str] = None,
        templates: Dict[str, List[str]] = None,
        hypothesis_template: str = "The domain of the sentence is about {label}.",
        features_class: type = TopicClassificationFeatures,
        preprocess_labels: bool = False,
        preprocess_fn: Callable = None,
        **kwargs,
    ) -> None:
        """Initialization of a TopicClassification task.

        Args:
            name (str, optional): A name for the task that may be used to differentiate tasks when saving.
                Defaults to None.
            labels (List[str]): The labels for the task. Defaults to None, but they are required when
                `templates` is not provided.
            templates (Dict[str, List[str]], optional): The templates/verbalizations for the task. Defaults to None.
            hypothesis_template (str, optional): A meta template to generate hypothesis templates. If `templates`
                is None, the templates will be the combinations of the `hypothesis_template` and the `labels`.
                It must contain the '{label}' placeholder. Defaults to "The domain of the sentence is about {label}.".
            features_class (type, optional): The `Features` class related to the task.
                Defaults to TopicClassificationFeatures.
            preprocess_labels (bool, optional): Whether to split the topic labels. Defaults to False.
            preprocess_fn (Callable, optional): The function that is applied if `preprocess_labels` is True.
                If None then `TopicClassificationTask._split_and_extend_labels_fn` is applied. Defaults to None.

        Raises:
            IncorrectHypothesisTemplateError: Raised when the `hypothesis_template` argument does not contain
                the '{label}' placeholder.
        """
        if not templates:

            if "{label}" not in hypothesis_template:
                raise IncorrectHypothesisTemplateError(
                    "The hypothesis_template argument must contain the '{label}' placeholder."
                )

            if preprocess_labels:
                split_labels_fn = preprocess_fn if preprocess_fn is not None else self._split_and_extend_labels_fn
                templates = {
                    label: [hypothesis_template.format(label=partial_label) for partial_label in split_labels_fn(label)]
                    for label in labels
                }
            else:
                templates = {label: [hypothesis_template.format(label=label)] for label in labels}

        super().__init__(
            name=name,
            required_variables=[],
            additional_variables=[],
            labels=labels,
            templates=templates,
            valid_conditions=None,
            features_class=features_class,
            **kwargs,
        )

    @staticmethod
    def _split_labels_fn(label: str) -> List[str]:
        labels = [
            partial_label.strip().capitalize()
            for partial_label in label.split(",")
            for partial_label in partial_label.split("and")
            if len(partial_label.strip())
        ]
        return list(set(labels))

    @staticmethod
    def _split_and_extend_labels_fn(label: str) -> List[str]:
        labels = [label] + [
            partial_label.strip().capitalize()
            for partial_label in label.split(",")
            for partial_label in partial_label.split("and")
            if len(partial_label.strip())
        ]
        return list(set(labels))
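To see what `preprocess_labels=True` produces, the following sketch re-implements the split-and-extend logic as a free function and builds the templates for one compound label. `split_and_extend` is a hypothetical stand-in for the private static method, and it sorts its result so the output is deterministic (the original returns `list(set(...))`, whose order is not guaranteed).

```python
from typing import Dict, List


def split_and_extend(label: str) -> List[str]:
    """Keep the full label and add its comma- and 'and'-separated parts."""
    parts = [label] + [
        piece.strip().capitalize()
        for chunk in label.split(",")
        for piece in chunk.split("and")
        if len(piece.strip())
    ]
    # Sorted only to make the example reproducible.
    return sorted(set(parts))


hypothesis_template = "The domain of the sentence is about {label}."
labels = ["Science and technology"]
templates: Dict[str, List[str]] = {
    label: [hypothesis_template.format(label=part) for part in split_and_extend(label)]
    for label in labels
}
```

The single label yields three hypotheses, one each for "Science", "Science and technology" and "Technology". Because `str.split("and")` matches substrings rather than whole words, a label such as "Iceland" would also be split (into "Icel"), which is worth keeping in mind when enabling this option or supplying a custom `preprocess_fn`.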
Ancestors
- a2t.tasks.base.ZeroaryTask
- a2t.tasks.base.Task
class TextClassificationFeatures (context: str, label: str = None, inst_type: str = None)
- A class handler for the Text Classification features. It inherits from Features.
@dataclass
class TextClassificationFeatures(Features):
    """A class handler for the Text Classification features. It inherits from `Features`."""

    pass
Ancestors
- a2t.tasks.base.Features
class TextClassificationTask (name: str = None, labels: List[str] = None, templates: Dict[str, List[str]] = None, hypothesis_template: str = 'It was {label}.', features_class: type = a2t.tasks.text_classification.TextClassificationFeatures, multi_label: bool = False, **kwargs)
- A class handler for Text Classification tasks. It inherits from the ZeroaryTask class.
Args
name : str, optional
- A name for the task, which may be used to differentiate tasks when saving. Defaults to None.
labels : List[str], optional
- The labels for the task. Defaults to None, but they are required when templates is not provided.
templates : Dict[str, List[str]], optional
- The templates/verbalizations for the task. Defaults to None.
hypothesis_template : str, optional
- A meta template to generate hypothesis templates. If templates is None, the templates will be the combinations of the hypothesis_template and the labels. It must contain the '{label}' placeholder. Defaults to "It was {label}.".
features_class : type, optional
- The Features class related to the task. Defaults to TextClassificationFeatures.
multi_label : bool, optional
- Whether the task must be treated as multi-label or not. Defaults to False.
Raises
IncorrectHypothesisTemplateError
- Raised when the hypothesis_template argument does not contain the '{label}' placeholder.
class TextClassificationTask(ZeroaryTask):
    """A class handler for Text Classification tasks. It inherits from `ZeroaryTask` class."""

    def __init__(
        self,
        name: str = None,
        labels: List[str] = None,
        templates: Dict[str, List[str]] = None,
        hypothesis_template: str = "It was {label}.",
        features_class: type = TextClassificationFeatures,
        multi_label: bool = False,
        **kwargs,
    ):
        """Initialization of a TextClassificationTask.

        Args:
            name (str, optional): A name for the task that may be used to differentiate tasks when saving.
                Defaults to None.
            labels (List[str], optional): The labels for the task. Defaults to None, but they are required when
                `templates` is not provided.
            templates (Dict[str, List[str]], optional): The templates/verbalizations for the task. Defaults to None.
            hypothesis_template (str, optional): A meta template to generate hypothesis templates. If `templates`
                is None, the templates will be the combinations of the `hypothesis_template` and the `labels`.
                It must contain the '{label}' placeholder. Defaults to "It was {label}.".
            features_class (type, optional): The `Features` class related to the task.
                Defaults to TextClassificationFeatures.
            multi_label (bool, optional): Whether the task must be treated as multi-label or not. Defaults to False.

        Raises:
            IncorrectHypothesisTemplateError: Raised when the `hypothesis_template` argument does not contain
                the '{label}' placeholder.
        """
        if not templates:

            if "{label}" not in hypothesis_template:
                raise IncorrectHypothesisTemplateError(
                    "The hypothesis_template argument must contain the '{label}' placeholder."
                )

            templates = {label: [hypothesis_template.format(label=label)] for label in labels}

        super().__init__(
            name=name,
            required_variables=[],
            additional_variables=[],
            labels=labels,
            templates=templates,
            valid_conditions=None,
            features_class=features_class,
            multi_label=multi_label,
            **kwargs,
        )
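The template handling in the constructor has two paths: explicit `templates` are used untouched, and otherwise one template per label is generated after validating the placeholder. A minimal sketch of that branch follows; `build_templates` is a hypothetical function, and the exception class is a local stand-in for the a2t exception of the same name.

```python
from typing import Dict, List, Optional


class IncorrectHypothesisTemplateError(Exception):
    """Local stand-in for the a2t exception of the same name."""


def build_templates(
    labels: List[str],
    hypothesis_template: str = "It was {label}.",
    templates: Optional[Dict[str, List[str]]] = None,
) -> Dict[str, List[str]]:
    """Return explicit templates if given, else one generated template per label."""
    if not templates:
        # The placeholder is only validated when templates must be generated.
        if "{label}" not in hypothesis_template:
            raise IncorrectHypothesisTemplateError(
                "The hypothesis_template argument must contain the '{label}' placeholder."
            )
        templates = {label: [hypothesis_template.format(label=label)] for label in labels}
    return templates


generated = build_templates(["great", "terrible"])
explicit = build_templates(["great"], hypothesis_template="no placeholder", templates={"great": ["I loved it."]})
```

The second call shows that an invalid `hypothesis_template` goes unnoticed when explicit templates are supplied, since the check sits inside the `if not templates` branch.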
Ancestors
- a2t.tasks.base.ZeroaryTask
- a2t.tasks.base.Task
class NamedEntityClassificationFeatures (context: str, label: str = None, inst_type: str = None, X: str = None)
- A class handler for the Named Entity Classification features. It inherits from UnaryFeatures.
@dataclass
class NamedEntityClassificationFeatures(UnaryFeatures):
    """A class handler for the Named Entity Classification features. It inherits from `UnaryFeatures`."""
Ancestors
- a2t.tasks.base.UnaryFeatures
- a2t.tasks.base.Features
class NamedEntityClassificationTask (name: str, labels: List[str], *args, required_variables: List[str] = ['X'], additional_variables: List[str] = ['inst_type'], templates: Dict[str, List[str]] = None, valid_conditions: Dict[str, List[str]] = None, hypothesis_template: str = '{X} is a {label}.', features_class: type = a2t.tasks.span_classification.NamedEntityClassificationFeatures, multi_label: bool = True, negative_label_id: int = 0, **kwargs)
- A class handler for the Named Entity Classification task. It inherits from the UnaryTask class.
Initialization of a NamedEntityClassificationTask.
Args
name : str
- A name for the task, which may be used to differentiate tasks when saving.
labels : List[str]
- The labels for the task.
required_variables : List[str], optional
- The variables required to perform the task; they must be implemented by the NamedEntityClassificationFeatures class. Defaults to ["X"].
additional_variables : List[str], optional
- The variables that are not required to perform the task but must be implemented by the NamedEntityClassificationFeatures class. Defaults to ["inst_type"].
templates : Dict[str, List[str]], optional
- The templates/verbalizations for the task. Defaults to None.
valid_conditions : Dict[str, List[str]], optional
- The valid conditions or constraints for the task. Defaults to None.
hypothesis_template : str, optional
- A meta template to generate hypothesis templates. If templates is None, the templates will be the combinations of the hypothesis_template and the labels, excluding the negative label. It must contain the '{label}' placeholder. Defaults to "{X} is a {label}.".
features_class : type, optional
- The Features class related to the task. Defaults to NamedEntityClassificationFeatures.
multi_label : bool, optional
- Whether the task must be treated as multi-label or not. Defaults to True.
negative_label_id : int, optional
- The index of the negative label, or -1 if no negative label exists. A negative label is, for example, the class Other in NER, which means that the specific token is not a named entity. Defaults to 0.
Raises
IncorrectHypothesisTemplateError
- Raised when the hypothesis_template argument does not contain the '{label}' placeholder.
class NamedEntityClassificationTask(UnaryTask):
    """A class handler for Named Entity Classification task. It inherits from `UnaryTask` class."""

    def __init__(
        self,
        name: str,
        labels: List[str],
        *args,
        required_variables: List[str] = ["X"],
        additional_variables: List[str] = ["inst_type"],
        templates: Dict[str, List[str]] = None,
        valid_conditions: Dict[str, List[str]] = None,
        hypothesis_template: str = "{X} is a {label}.",
        features_class: type = NamedEntityClassificationFeatures,
        multi_label: bool = True,
        negative_label_id: int = 0,
        **kwargs
    ) -> None:
        """Initialization of a NamedEntityClassificationTask task.

        Args:
            name (str): A name for the task that may be used to differentiate tasks when saving.
            labels (List[str]): The labels for the task.
            required_variables (List[str], optional): The variables required to perform the task; they must be
                implemented by the `NamedEntityClassificationFeatures` class. Defaults to `["X"]`.
            additional_variables (List[str], optional): The variables that are not required to perform the task but
                must be implemented by the `NamedEntityClassificationFeatures` class. Defaults to ["inst_type"].
            templates (Dict[str, List[str]], optional): The templates/verbalizations for the task. Defaults to None.
            valid_conditions (Dict[str, List[str]], optional): The valid conditions or constraints for the task.
                Defaults to None.
            hypothesis_template (str, optional): A meta template to generate hypothesis templates. If `templates`
                is None, the templates will be the combinations of the `hypothesis_template` and the `labels`.
                It must contain the '{label}' placeholder. Defaults to "{X} is a {label}.".
            features_class (type, optional): The `Features` class related to the task.
                Defaults to NamedEntityClassificationFeatures.
            multi_label (bool, optional): Whether the task must be treated as multi-label or not. Defaults to True.
            negative_label_id (int, optional): The index of the negative label or -1 if no negative label exists.
                A negative label is for example the class `Other` on NER, that means that the specific token is not
                a named entity. Defaults to 0.

        Raises:
            IncorrectHypothesisTemplateError: Raised when the `hypothesis_template` argument does not contain
                the '{label}' placeholder.
        """
        if not templates:

            if "{label}" not in hypothesis_template:
                raise IncorrectHypothesisTemplateError(
                    "The hypothesis_template argument must contain the '{label}' placeholder."
                )

            templates = {
                label: [hypothesis_template.replace("{label}", label)]
                for i, label in enumerate(labels)
                if i != negative_label_id
            }

        super().__init__(
            *args,
            name=name,
            required_variables=required_variables,
            additional_variables=additional_variables,
            labels=labels,
            templates=templates,
            valid_conditions=valid_conditions,
            features_class=features_class,
            multi_label=multi_label,
            negative_label_id=negative_label_id,
            **kwargs
        )
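Two details of the template generation in this constructor are easy to miss: the negative label gets no template, and `str.replace` is used instead of `str.format` so that the `{X}` placeholder survives for later substitution. The sketch below illustrates both with illustrative labels. Filling `{X}` with `str.format` at the end is an assumption about how the span is inserted later, not something this page confirms.

```python
labels = ["O", "Person", "Location"]  # illustrative labels; index 0 is the negative label
negative_label_id = 0
hypothesis_template = "{X} is a {label}."

# Only '{label}' is substituted here; '{X}' is left in place for each instance.
templates = {
    label: [hypothesis_template.replace("{label}", label)]
    for i, label in enumerate(labels)
    if i != negative_label_id
}

# Assumed later step: fill the span placeholder for a concrete instance.
hypothesis = templates["Person"][0].format(X="Ada Lovelace")

# Using str.format for the label would fail because '{X}' has no value yet.
try:
    hypothesis_template.format(label="Person")
    format_failed = False
except KeyError:
    format_failed = True
```

The resulting `templates` dict has entries for "Person" and "Location" only, each still containing the `{X}` placeholder.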
Ancestors
- a2t.tasks.base.UnaryTask
- a2t.tasks.base.Task
class RelationClassificationFeatures (context: str, label: str = None, inst_type: str = None, X: str = None, Y: str = None)
- A class handler for the Relation Classification features. It inherits from BinaryFeatures.
@dataclass
class RelationClassificationFeatures(BinaryFeatures):
    """A class handler for the Relation Classification features. It inherits from `BinaryFeatures`."""
Ancestors
- a2t.tasks.base.BinaryFeatures
- a2t.tasks.base.Features
class RelationClassificationTask (name: str, labels: List[str], required_variables: List[str] = ['X', 'Y'], additional_variables: List[str] = ['inst_type'], templates: Dict[str, List[str]] = None, valid_conditions: Dict[str, List[str]] = None, features_class: type = a2t.tasks.tuple_classification.RelationClassificationFeatures, multi_label: bool = True, negative_label_id: int = 0, **kwargs)
- A class handler for the Relation Classification task. It inherits from the BinaryTask class.
Initialization of a RelationClassificationTask.
Args
name : str
- A name for the task, which may be used to differentiate tasks when saving.
labels : List[str]
- The labels for the task.
required_variables : List[str], optional
- The variables required to perform the task; they must be implemented by the RelationClassificationFeatures class. Defaults to ["X", "Y"].
additional_variables : List[str], optional
- The variables that are not required to perform the task but must be implemented by the RelationClassificationFeatures class. Defaults to ["inst_type"].
templates : Dict[str, List[str]], optional
- The templates/verbalizations for the task. Defaults to None.
valid_conditions : Dict[str, List[str]], optional
- The valid conditions or constraints for the task. Defaults to None.
features_class : type, optional
- The Features class related to the task. Defaults to RelationClassificationFeatures.
multi_label : bool, optional
- Whether the task must be treated as multi-label or not. Defaults to True.
negative_label_id : int, optional
- The index of the negative label, or -1 if no negative label exists. A negative label is, for example, the class Other in NER, which means that the specific token is not a named entity. Defaults to 0.
class RelationClassificationTask(BinaryTask):
    """A class handler for the Relation Classification task. It inherits from the `BinaryTask` class."""

    def __init__(
        self,
        name: str,
        labels: List[str],
        # *args,
        required_variables: List[str] = ["X", "Y"],
        additional_variables: List[str] = ["inst_type"],
        templates: Dict[str, List[str]] = None,
        valid_conditions: Dict[str, List[str]] = None,
        features_class: type = RelationClassificationFeatures,
        multi_label: bool = True,
        negative_label_id: int = 0,
        **kwargs
    ) -> None:
        """Initialization of a RelationClassificationTask.

        Args:
            name (str): A name for the task, which may be used to differentiate tasks when saving.
            labels (List[str]): The labels for the task.
            required_variables (List[str], optional): The variables required to perform the task; they must be implemented by the `RelationClassificationFeatures` class. Defaults to `["X", "Y"]`.
            additional_variables (List[str], optional): The variables that are not required to perform the task but must still be implemented by the `RelationClassificationFeatures` class. Defaults to `["inst_type"]`.
            templates (Dict[str, List[str]], optional): The templates/verbalizations for the task. Defaults to None.
            valid_conditions (Dict[str, List[str]], optional): The valid conditions or constraints for the task. Defaults to None.
            features_class (type, optional): The `Features` class related to the task. Defaults to RelationClassificationFeatures.
            multi_label (bool, optional): Whether the task must be treated as multi-label or not. Defaults to True.
            negative_label_id (int, optional): The index of the negative label, or -1 if no negative label exists. A negative label is, for example, the class `Other` in NER, which means that the specific token is not a named entity. Defaults to 0.
        """
        super().__init__(
            # *args,
            name=name,
            required_variables=required_variables,
            additional_variables=additional_variables,
            labels=labels,
            templates=templates,
            valid_conditions=valid_conditions,
            features_class=features_class,
            multi_label=multi_label,
            negative_label_id=negative_label_id,
            **kwargs
        )
Ancestors
- a2t.tasks.base.BinaryTask
- a2t.tasks.base.Task
Subclasses
- a2t.tasks.tuple_classification.TACREDRelationClassificationTask
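As a rough illustration of how the `templates` and `valid_conditions` arguments could fit together, the sketch below fills each template with the `X` and `Y` spans of an instance and keeps only the labels whose conditions match its `inst_type`. The `Instance` dataclass, the `candidate_hypotheses` helper, the example labels, and the `"PERSON:DATE"`-style condition strings are all assumptions made for this illustration. None of them is part of `a2t`, and the library's own handling may differ.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Instance:
    """Hypothetical stand-in for RelationClassificationFeatures."""

    context: str
    label: Optional[str] = None
    inst_type: Optional[str] = None
    X: Optional[str] = None
    Y: Optional[str] = None


def candidate_hypotheses(
    instance: Instance,
    templates: Dict[str, List[str]],
    valid_conditions: Optional[Dict[str, List[str]]] = None,
) -> Dict[str, List[str]]:
    """Fill the templates of every label that is allowed for this instance.

    Assumption: a label is allowed when it has no entry in valid_conditions
    or when the instance's inst_type is listed under that label.
    """
    hypotheses = {}
    for label, label_templates in templates.items():
        allowed = None if valid_conditions is None else valid_conditions.get(label)
        if allowed is not None and instance.inst_type not in allowed:
            continue  # this label is ruled out by its conditions
        hypotheses[label] = [t.format(X=instance.X, Y=instance.Y) for t in label_templates]
    return hypotheses


# Illustrative labels, templates, and conditions (not taken from a2t).
templates = {
    "born_on": ["{X} was born on {Y}."],
    "works_for": ["{X} works for {Y}.", "{X} is an employee of {Y}."],
}
valid_conditions = {
    "born_on": ["PERSON:DATE"],
    "works_for": ["PERSON:ORGANIZATION"],
}
instance = Instance(
    context="Ada Lovelace was born on 10 December 1815.",
    inst_type="PERSON:DATE",
    X="Ada Lovelace",
    Y="10 December 1815",
)
hypotheses = candidate_hypotheses(instance, templates, valid_conditions)
print(hypotheses)
```

Under these assumptions only the `born_on` template survives for the example instance, because `works_for` is restricted to a different `inst_type`.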
class EventArgumentClassificationFeatures (context: str, label: str = None, inst_type: str = None, trg: str = None, arg: str = None, trg_type: str = None, trg_subtype: str = None)
A class handler for the Event Argument Classification features. It inherits from Features.
@dataclass
class EventArgumentClassificationFeatures(Features):
    """A class handler for the Event Argument Classification features. It inherits from `Features`."""

    trg: str = None
    arg: str = None
    trg_type: str = None
    trg_subtype: str = None
Ancestors
- a2t.tasks.base.Features
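Judging from the signatures above, every field of the base `Features` class after `context` has a default value, so fields added by a dataclass subclass need defaults as well: Python dataclasses reject a field without a default placed after one that has a default. The sketch below mirrors that field layout with stand-in classes. The real base class lives in `a2t.tasks.base` and is not reproduced here, and the example values are purely illustrative.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Features:
    """Stand-in for a2t.tasks.base.Features, reduced to its fields."""

    context: str
    label: Optional[str] = None
    inst_type: Optional[str] = None


@dataclass
class EventArgumentFeaturesSketch(Features):
    """Stand-in mirroring the fields of EventArgumentClassificationFeatures."""

    trg: Optional[str] = None
    arg: Optional[str] = None
    trg_type: Optional[str] = None
    trg_subtype: Optional[str] = None


# Inherited fields come first, followed by the fields added in the subclass.
field_names = [f.name for f in fields(EventArgumentFeaturesSketch)]
print(field_names)

example = EventArgumentFeaturesSketch(
    context="The company hired a new engineer in Paris.",
    trg="hired",
    arg="Paris",
    trg_type="Personnel",
    trg_subtype="Start-Position",
)
print(example.label)  # stays None unless a label is passed in
```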
class EventArgumentClassificationTask (name: str, labels: List[str], required_variables: List[str] = ['trg', 'arg'], additional_variables: List[str] = ['inst_type', 'trg_type', 'trg_subtype'], templates: Dict[str, List[str]] = None, valid_conditions: Dict[str, List[str]] = None, features_class: type = a2t.tasks.tuple_classification.EventArgumentClassificationFeatures, multi_label: bool = True, negative_label_id: int = 0, **kwargs)
A class handler for the Event Argument Classification task. It inherits from the BinaryTask class.
Initialization of an EventArgumentClassificationTask.
Args
name : str
- A name for the task, which may be used to differentiate tasks when saving.
labels : List[str]
- The labels for the task.
required_variables : List[str], optional
- The variables required to perform the task; they must be implemented by the EventArgumentClassificationFeatures class. Defaults to ["trg", "arg"].
additional_variables : List[str], optional
- The variables that are not required to perform the task but must still be implemented by the EventArgumentClassificationFeatures class. Defaults to ["inst_type", "trg_type", "trg_subtype"].
templates : Dict[str, List[str]], optional
- The templates/verbalizations for the task. Defaults to None.
valid_conditions : Dict[str, List[str]], optional
- The valid conditions or constraints for the task. Defaults to None.
features_class : type, optional
- The Features class related to the task. Defaults to EventArgumentClassificationFeatures.
multi_label : bool, optional
- Whether the task must be treated as multi-label or not. Defaults to True.
negative_label_id : int, optional
- The index of the negative label, or -1 if no negative label exists. A negative label is, for example, the class Other in NER, which means that the specific token is not a named entity. Defaults to 0.
class EventArgumentClassificationTask(BinaryTask):
    """A class handler for the Event Argument Classification task. It inherits from the `BinaryTask` class."""

    def __init__(
        self,
        name: str,
        labels: List[str],
        required_variables: List[str] = ["trg", "arg"],
        additional_variables: List[str] = ["inst_type", "trg_type", "trg_subtype"],
        templates: Dict[str, List[str]] = None,
        valid_conditions: Dict[str, List[str]] = None,
        features_class: type = EventArgumentClassificationFeatures,
        multi_label: bool = True,
        negative_label_id: int = 0,
        **kwargs
    ) -> None:
        """Initialization of an EventArgumentClassificationTask.

        Args:
            name (str): A name for the task, which may be used to differentiate tasks when saving.
            labels (List[str]): The labels for the task.
            required_variables (List[str], optional): The variables required to perform the task; they must be implemented by the `EventArgumentClassificationFeatures` class. Defaults to `["trg", "arg"]`.
            additional_variables (List[str], optional): The variables that are not required to perform the task but must still be implemented by the `EventArgumentClassificationFeatures` class. Defaults to `["inst_type", "trg_type", "trg_subtype"]`.
            templates (Dict[str, List[str]], optional): The templates/verbalizations for the task. Defaults to None.
            valid_conditions (Dict[str, List[str]], optional): The valid conditions or constraints for the task. Defaults to None.
            features_class (type, optional): The `Features` class related to the task. Defaults to EventArgumentClassificationFeatures.
            multi_label (bool, optional): Whether the task must be treated as multi-label or not. Defaults to True.
            negative_label_id (int, optional): The index of the negative label, or -1 if no negative label exists. A negative label is, for example, the class `Other` in NER, which means that the specific token is not a named entity. Defaults to 0.
        """
        super().__init__(
            name=name,
            required_variables=required_variables,
            additional_variables=additional_variables,
            labels=labels,
            templates=templates,
            valid_conditions=valid_conditions,
            features_class=features_class,
            multi_label=multi_label,
            negative_label_id=negative_label_id,
            **kwargs
        )
Ancestors
- a2t.tasks.base.BinaryTask
- a2t.tasks.base.Task
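The requirement that the variables "must be implemented by" the features class can be read as: every name in `required_variables` and `additional_variables` should be a field of `features_class`. The check below is a minimal sketch of that reading. The `missing_variables` helper and the stand-in dataclass are hypothetical and not part of `a2t`, which may validate this differently or not at all.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class EventFeaturesSketch:
    """Hypothetical stand-in with the fields of EventArgumentClassificationFeatures."""

    context: str
    label: Optional[str] = None
    inst_type: Optional[str] = None
    trg: Optional[str] = None
    arg: Optional[str] = None
    trg_type: Optional[str] = None
    trg_subtype: Optional[str] = None


def missing_variables(features_class: type, variables: List[str]) -> List[str]:
    """Return the variable names that are not dataclass fields of features_class."""
    available = {f.name for f in fields(features_class)}
    return [name for name in variables if name not in available]


required = ["trg", "arg"]
additional = ["inst_type", "trg_type", "trg_subtype"]

# Both default variable lists are covered by the stand-in features class.
print(missing_variables(EventFeaturesSketch, required + additional))

# The defaults of RelationClassificationTask would not be covered by it.
print(missing_variables(EventFeaturesSketch, ["X", "Y"]))
```

A check of this kind would catch a mismatch between a custom features class and the variable names used in the templates before any instance is processed.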
class TACREDFeatures (context: str, label: str = None, inst_type: str = None, subj: str = None, obj: str = None)
A class handler for the TACRED features. It inherits from Features.
@dataclass
class TACREDFeatures(Features):
    """A class handler for the TACRED features. It inherits from `Features`."""

    subj: str = None
    obj: str = None
Ancestors
- a2t.tasks.base.Features
class TACREDRelationClassificationTask (labels: List[str], templates: Dict[str, List[str]], valid_conditions: Dict[str, List[str]], **kwargs)
A class handler for the TACRED Relation Classification task. It inherits from the RelationClassificationTask class.
Initialization of the TACRED RelationClassification task.
Args
labels : List[str]
- The labels for the task.
templates : Dict[str, List[str]]
- The templates/verbalizations for the task.
valid_conditions : Dict[str, List[str]]
- The valid conditions or constraints for the task.
class TACREDRelationClassificationTask(RelationClassificationTask):
    """A class handler for TACRED Relation Classification task. It inherits from `RelationClassificationTask` class."""

    def __init__(
        self, labels: List[str], templates: Dict[str, List[str]], valid_conditions: Dict[str, List[str]], **kwargs
    ) -> None:
        """Initialization of the TACRED RelationClassification task

        Args:
            labels (List[str]): The labels for the task.
            templates (Dict[str, List[str]]): The templates/verbalizations for the task.
            valid_conditions (Dict[str, List[str]]): The valid conditions or constraints for the task.
        """
        for key in ["name", "required_variables", "additional_variables", "features_class", "multi_label", "negative_label_id"]:
            kwargs.pop(key, None)
        super().__init__(
            "TACRED Relation Classification task",
            labels=labels,
            required_variables=["subj", "obj"],
            additional_variables=["inst_type"],
            templates=templates,
            valid_conditions=valid_conditions,
            features_class=TACREDFeatures,
            multi_label=True,
            negative_label_id=0,
            **kwargs
        )
Ancestors
- a2t.tasks.tuple_classification.RelationClassificationTask
- a2t.tasks.base.BinaryTask
- a2t.tasks.base.Task
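One consequence of the constructor shown above is that its fixed arguments cannot be overridden: keys such as `name` or `features_class` are popped from `kwargs` before the parent constructor is called, so caller-supplied values are silently discarded instead of colliding with the explicit arguments. The sketch below reproduces only that pattern with hypothetical stand-in classes (`BaseTaskSketch`, `FixedTaskSketch`) and a made-up `threshold` keyword. It does not import `a2t`.

```python
from typing import Any, List


class BaseTaskSketch:
    """Hypothetical stand-in for a task base class that accepts extra kwargs."""

    def __init__(self, name: str, labels: List[str], multi_label: bool = True, **kwargs: Any) -> None:
        self.name = name
        self.labels = labels
        self.multi_label = multi_label
        self.extra = kwargs


class FixedTaskSketch(BaseTaskSketch):
    """Hypothetical subclass that pins name and multi_label, as the TACRED task does."""

    def __init__(self, labels: List[str], **kwargs: Any) -> None:
        # Drop caller-supplied values for the pinned arguments; without this,
        # passing name=... would raise "got multiple values for argument 'name'".
        for key in ["name", "multi_label"]:
            kwargs.pop(key, None)
        super().__init__("Fixed task", labels=labels, multi_label=True, **kwargs)


task = FixedTaskSketch(
    labels=["no_relation", "born_on"],
    name="ignored",      # discarded by the pop loop
    multi_label=False,   # discarded by the pop loop
    threshold=0.5,       # unrelated keyword, forwarded to the base class
)
print(task.name, task.multi_label, task.extra)
```

The trade-off in this pattern is that a misplaced override fails silently, so callers who need a different name or features class would have to subclass or use the parent task directly.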