TransformersVisualQandA
Visual Q&A task
Subclass of Task.
This task uses by default with this configuration:
Main methods and properties
predictor (Predictor[Any, Any], optional): Predictor used by the task. If None, the default predictor is used. Defaults to None.
preprocess (Optional[Component], optional): Component executed before the predictor. If None, the default component, VisualQandAPreprocessor, is used. Defaults to None. When the default chain is used, VisualQandAPreprocessor uses the ViltProcessor of the model used in the predictor.
postprocess (Optional[Component], optional): Component executed after the predictor. If None, the default component, VisualQandASingleAnswerPostprocessor, is used. Defaults to None. When the default chain is used, VisualQandASingleAnswerPostprocessor uses the labels of the model used in the predictor.
name (Optional[str], optional): Name used for identification. If None, the class name is used. Defaults to None.
answer (Optional[Tuple[str, float]])
answers (Dict[str, float])
Prepare model input. Subclass of Action. Type of Action[Dict[str, Any], Dict[str, Any]].
processor (Processor): Feature extractor.
name (Optional[str], optional): Name used for identification. If None, the class name is used. Defaults to None.
input_data (Dict[str, Any]): Expected keys:
"image" (Image.Image): Image to analyze;
"question" (str): Question to answer;
Dict[str, Any]: Expected keys:
"input_ids" (Any);
"token_type_ids" (Any);
"attention_mask" (Any);
"pixel_values" (Any);
"pixel_mask" (Any);
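The input and output contract above can be illustrated with a minimal sketch. It does not reproduce the library's VisualQandAPreprocessor: the `preprocess` function, the `EXPECTED_OUTPUT_KEYS` constant, and `stub_processor` are hypothetical names introduced only for illustration, and the stub stands in for a real processor so the example runs without downloading a model.

```python
from typing import Any, Callable, Dict

# Output keys listed in the reference above.
EXPECTED_OUTPUT_KEYS = (
    "input_ids",
    "token_type_ids",
    "attention_mask",
    "pixel_values",
    "pixel_mask",
)


def preprocess(
    input_data: Dict[str, Any],
    processor: Callable[[Any, str], Dict[str, Any]],
) -> Dict[str, Any]:
    """Hypothetical sketch: check the input keys, then delegate to a processor."""
    missing = {"image", "question"} - set(input_data)
    if missing:
        raise KeyError(f"missing input keys: {sorted(missing)}")
    encoded = processor(input_data["image"], input_data["question"])
    # Keep only the keys the model is documented to expect.
    return {key: encoded[key] for key in EXPECTED_OUTPUT_KEYS}


def stub_processor(image: Any, question: str) -> Dict[str, Any]:
    """Hypothetical stand-in for a real processor; returns placeholder values."""
    return {key: f"<{key}>" for key in EXPECTED_OUTPUT_KEYS}


result = preprocess({"image": object(), "question": "What is shown?"}, stub_processor)
print(list(result))
```

In real use, the stub would be replaced by the processor belonging to the model used in the predictor.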
Process model output. Subclass of VisualQandAMultianswerPostprocessor.
input_data (Dict[str, Any]): Expected keys:
"logits" (Any): Model output;
Dict[str, Any]: Expected keys:
"answer" (Optional[Tuple[str, float]]): Answer with the highest score if that score is greater than or equal to the threshold; otherwise None.
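The single-answer rule described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the function name `single_answer` is hypothetical, logits are taken as a flat list of floats, and a sigmoid is assumed for turning logits into scores (the source does not say which function is used).

```python
import math
from typing import Dict, List, Optional, Tuple


def single_answer(
    logits: List[float],
    labels: List[str],
    threshold: float = 0.0,
) -> Dict[str, Optional[Tuple[str, float]]]:
    """Hypothetical sketch: return the top-scoring label, or None below threshold."""
    # Assumption: scores are obtained with a sigmoid over each logit.
    scores = [1.0 / (1.0 + math.exp(-value)) for value in logits]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] >= threshold:
        return {"answer": (labels[best], scores[best])}
    return {"answer": None}


confident = single_answer([0.0, 2.0], ["no", "yes"], threshold=0.5)
uncertain = single_answer([-5.0, -4.0], ["no", "yes"], threshold=0.5)
print(confident, uncertain)
```

With the default threshold of 0, every sigmoid score passes, so the result is None only when a stricter threshold is set.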
Process model output. Subclass of Action. Type of Action[Dict[str, Any], Dict[str, Any]].
labels (List[str]): Labels for classification.
threshold (float): Score threshold for labels. Defaults to 0.
name (Optional[str], optional): Name used for identification. If None, the class name is used. Defaults to None.
input_data (Dict[str, Any]): Expected keys:
"logits" (Any): Model output;
Dict[str, Any]: Expected keys:
"answers" (Dict[str, float]): Classified labels and scores.
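A minimal sketch of how logits, labels, and a threshold could combine into the "answers" dictionary follows. The function name `multi_answer` is hypothetical, the sigmoid scoring is an assumption, and so is the reading that labels scoring below the threshold are dropped; the source only states that the output holds classified labels and scores.

```python
import math
from typing import Dict, List


def multi_answer(
    logits: List[float],
    labels: List[str],
    threshold: float = 0.0,
) -> Dict[str, Dict[str, float]]:
    """Hypothetical sketch: map each label to its score, filtered by threshold."""
    answers: Dict[str, float] = {}
    for label, value in zip(labels, logits):
        # Assumption: a sigmoid converts each logit into a score in (0, 1).
        score = 1.0 / (1.0 + math.exp(-value))
        # Assumption: labels scoring below the threshold are omitted.
        if score >= threshold:
            answers[label] = score
    return {"answers": answers}


result = multi_answer([0.0, 2.0, -3.0], ["cat", "dog", "bird"], threshold=0.5)
print(result)
```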
input_class (Type[], optional): Class for input validation. Defaults to .
output_class (Type[], optional): Class for output validation. Defaults to TransformersVisualQandAOutput.
Subclass of .
Subclass of .