TokenSearcherTextCleaner

Task for text cleaning

Task for removing uninformative data from text. This task uses TokenSearcherPredictor by default. For more details, see:

TokenSearcherPredictor

Subclass of NERTask.

Module: implementation.tasks

Methods and properties

Main methods and properties

__init__

Arguments:

predictor (Predictor[Any, Any], optional): Predictor used by the task. If None, the default TokenSearcherPredictor is used. Defaults to None.
preprocess (Optional[Component], optional): Component executed before the predictor. If None, the default component is used. Defaults to None. Default component: TokenSearcherTextCleanerPreprocessor
postprocess (Optional[Component], optional): Component executed after the predictor. If None, the default component is used. Defaults to None. Default component: TokenSearcherTextCleanerPostprocessor
input_class (Type[Input], optional): Class for input validation. Defaults to TokenSearcherTextCleanerInput.
output_class (Type[NEROutputType], optional): Class for output validation. Defaults to TokenSearcherTextCleanerOutput.
name (Optional[str], optional): Name for identification. If None, the class name is used. Defaults to None.

TokenSearcherTextCleanerInput

Subclass of IOModel.

__init__

Arguments:

text (str): Text to clean.

TokenSearcherTextCleanerOutput

Subclass of NEROutput. Type of NEROutput[Entity].

__init__

Arguments:

text (str): Input text.
cleaned_text (Optional[str], optional): Cleaned text. None if clean was set to False in the default postprocessor.
output (List[Entity]): Uninformative data.
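
The shape of this output can be pictured with plain dataclasses. This is only an illustrative sketch, not the library's own classes: EntitySketch, TextCleanerOutputSketch, and the entity fields start, end, span, and score are assumptions, since the Entity fields are not listed on this page.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EntitySketch:
    # Hypothetical entity fields; the real Entity class may differ.
    start: int
    end: int
    span: str
    score: float


@dataclass
class TextCleanerOutputSketch:
    # Mirrors the three documented arguments: text, output, cleaned_text.
    text: str
    output: List[EntitySketch] = field(default_factory=list)
    cleaned_text: Optional[str] = None  # stays None when cleaning is disabled


example = TextCleanerOutputSketch(
    text="Click here! Facts follow.",
    output=[EntitySketch(start=0, end=11, span="Click here!", score=0.8)],
)
```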

TokenSearcherTextCleanerPreprocessor

Create prompt with provided text. Subclass of Action. Type of Action[Dict[str, Any], Dict[str, Any]].

execute

Arguments:

input_data (Dict[str, Any]): Expected keys:
- "text" (str): Text to process;

Returns:

Dict[str, Any]: Expected keys:
- "inputs" (List[str]): Model inputs;
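
The input and output contract of execute can be sketched as a plain function. The function name build_inputs and the prompt template below are hypothetical; the wording of the prompt actually used by the library is not given here and may differ.

```python
from typing import Any, Dict

# Hypothetical prompt template, used only to illustrate the contract.
PROMPT_TEMPLATE = "Find uninformative parts of the following text:\n\n{text}"


def build_inputs(input_data: Dict[str, Any]) -> Dict[str, Any]:
    """Read the "text" key and return a single-element "inputs" list."""
    prompt = PROMPT_TEMPLATE.format(text=input_data["text"])
    return {"inputs": [prompt]}


result = build_inputs({"text": "Buy now! The report covers Q3 revenue."})
```

The returned dictionary carries one model input per text, which is why "inputs" is a list even for a single string.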

TokenSearcherTextCleanerPostprocessor

Format output and clean text if specified. Subclass of Action. Type of Action[Dict[str, Any], Dict[str, Any]].

__init__

Arguments:

clean (bool): Whether to remove uninformative data from the text. Defaults to False.
threshold (float): Score threshold for uninformative data. Defaults to 0.
name (Optional[str], optional): Name for identification. If None, the class name is used. Defaults to None.

execute

Arguments:

input_data (Dict[str, Any]): Expected keys:
- "output" (List[List[Dict[str, Any]]]): Model output;
- "inputs" (List[str]): Model inputs;
- "text" (str): Processed text;

Returns:

Dict[str, Any]: Expected keys:
- "text" (str): Processed text;
- "output" (List[Entity]): Uninformative data;
- "cleaned_text" (Optional[str], optional): Cleaned text. None if clean was set to False.
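
How clean and threshold interact can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the library's implementation: the function name postprocess is hypothetical, spans are plain dictionaries with assumed keys "start", "end", and "score", offsets are assumed to be relative to the original text already, and the comparison against the threshold is assumed to be inclusive.

```python
from typing import Any, Dict, List, Optional


def postprocess(
    text: str,
    spans: List[Dict[str, Any]],
    clean: bool = False,
    threshold: float = 0.0,
) -> Dict[str, Any]:
    # Keep spans scoring at or above the threshold (inclusive by assumption),
    # ordered by their start offset.
    kept = sorted(
        (s for s in spans if s["score"] >= threshold),
        key=lambda s: s["start"],
    )

    cleaned_text: Optional[str] = None
    if clean:
        pieces: List[str] = []
        cursor = 0
        for span in kept:
            # Copy the informative text that precedes this span.
            if span["start"] > cursor:
                pieces.append(text[cursor:span["start"]])
            # Skip the span itself; max() handles overlapping spans.
            cursor = max(cursor, span["end"])
        pieces.append(text[cursor:])
        cleaned_text = "".join(pieces)

    return {"text": text, "output": kept, "cleaned_text": cleaned_text}


text = "Subscribe now! Revenue grew 12% in Q3."
spans = [
    {"start": 0, "end": 15, "score": 0.9},   # covers "Subscribe now! "
    {"start": 15, "end": 22, "score": 0.1},  # covers "Revenue"
]
result = postprocess(text, spans, clean=True, threshold=0.5)
```

With threshold=0.5 only the first span survives, so result["cleaned_text"] is "Revenue grew 12% in Q3."; with clean left at False, "cleaned_text" stays None and only the filtered spans are returned.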
