TokenSearcherTextCleaner
Task for text cleaning
Task for removing uninformative data from text. This task uses TokenSearcherPredictor by default. For more details, see:
Subclass of NERTask.
Main methods and properties
predictor (Predictor[Any, Any], optional): Predictor used by the task. If None, the default TokenSearcherPredictor is used. Defaults to None.
preprocess (Optional[Component], optional): Component executed before the predictor. If None, the default component is used. Defaults to None. Default component: TokenSearcherTextCleanerPreprocessor
postprocess (Optional[Component], optional): Component executed after the predictor. If None, the default component is used. Defaults to None. Default component: TokenSearcherTextCleanerPostprocessor
input_class (Type[Input], optional): Class for input validation. Defaults to TokenSearcherTextCleanerInput.
output_class (Type[NEROutputType], optional): Class for output validation. Defaults to TokenSearcherTextCleanerOutput.
name (Optional[str], optional): Name for identification. If None, the class name is used. Defaults to None.
TokenSearcherTextCleanerInput
Subclass of IOModel.
text (str): Text to clean.
TokenSearcherTextCleanerOutput
Subclass of NEROutput. Type of NEROutput[Entity].
text (str): Input text.
cleaned_text (Optional[str], optional): Cleaned text. None if clean was set to False in the default postprocessor.
output (List[Entity]): Uninformative data.
TokenSearcherTextCleanerPreprocessor
Creates a prompt from the provided text. Subclass of Action. Type of Action[Dict[str, Any], Dict[str, Any]].
input_data (Dict[str, Any]): Expected keys:
"text" (str): Text to process;
Returns Dict[str, Any]. Expected keys:
"inputs" (List[str]): Model inputs;
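The input and output contract of the preprocessor can be illustrated with a minimal sketch. This is not the library's implementation: the function name `preprocess_sketch` and the prompt template are hypothetical, since the actual prompt used by TokenSearcherTextCleanerPreprocessor is not stated here. Only the key names "text" and "inputs" come from the description above.

```python
from typing import Any, Dict

# Hypothetical template; the prompt used by the real preprocessor is not
# documented in this section.
HYPOTHETICAL_PROMPT = "Clean the following text:\n{text}"


def preprocess_sketch(
    input_data: Dict[str, Any],
    prompt_template: str = HYPOTHETICAL_PROMPT,
) -> Dict[str, Any]:
    """Build model inputs from a dict holding the "text" key."""
    text = input_data["text"]
    # One input text produces one model input string.
    return {"inputs": [prompt_template.format(text=text)]}
```

The returned "inputs" list is what the predictor would consume in this sketch.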
TokenSearcherTextCleanerPostprocessor
Formats the output and cleans the text if specified. Subclass of Action. Type of Action[Dict[str, Any], Dict[str, Any]].
clean (bool): Remove uninformative data from text. Defaults to False.
threshold (float): Data threshold score. Defaults to 0.
name (Optional[str], optional): Name for identification. If None, the class name is used. Defaults to None.
input_data (Dict[str, Any]): Expected keys:
"output" (List[List[Dict[str, Any]]]): Model output;
"inputs" (List[str]): Model inputs;
"text" (str): Processed text;
Returns Dict[str, Any]. Expected keys:
"text" (str): Processed text;
"output" (List[Entity]): Uninformative data;
"cleaned_text" (Optional[str], optional): Cleaned text. None if clean was set to False.
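The interplay of clean and threshold can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: the function name `postprocess_sketch` is hypothetical, each raw model entity is assumed to be a dict with "start", "end", and "score" keys, offsets are assumed to refer to the model input (a prompt prefix followed by the text), and entities are assumed to be kept when their score is at least threshold. Plain dicts stand in for Entity objects.

```python
from typing import Any, Dict, List, Optional


def postprocess_sketch(
    input_data: Dict[str, Any],
    clean: bool = False,
    threshold: float = 0.0,
) -> Dict[str, Any]:
    """Filter spans by score and optionally cut them out of the text."""
    text: str = input_data["text"]
    model_input: str = input_data["inputs"][0]
    # Assumption: the model input is a prompt prefix followed by the text,
    # so model offsets are shifted back by the prefix length.
    offset = len(model_input) - len(text)

    spans: List[Dict[str, Any]] = []
    for ent in input_data["output"][0]:
        start, end = ent["start"] - offset, ent["end"] - offset
        # Drop low-scoring spans and spans that fall inside the prompt.
        if ent["score"] < threshold or start < 0:
            continue
        spans.append(
            {"start": start, "end": end,
             "span": text[start:end], "score": ent["score"]}
        )

    cleaned_text: Optional[str] = None
    if clean:
        parts: List[str] = []
        cursor = 0
        for span in sorted(spans, key=lambda s: s["start"]):
            # Keep the text between the previous span and this one.
            if span["start"] > cursor:
                parts.append(text[cursor:span["start"]])
            # max() handles overlapping spans.
            cursor = max(cursor, span["end"])
        parts.append(text[cursor:])
        cleaned_text = "".join(parts)

    return {"text": text, "output": spans, "cleaned_text": cleaned_text}
```

With clean left at False, the sketch only reports the detected spans and leaves "cleaned_text" as None, which matches the behavior described for the output keys above.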