Actions for handling PDF documents
Read PDF document. Subclass of . Type of [Dict[str, Any], Dict[int, Page]]
input_data (Dict[str, Any]): Expected keys:
"path_to_file" (str): Path to PDF file;
"pages" (List[int], optional): Pages to read. If omitted, the complete document is read.
Dict[int, Page]: PDF document pages;
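A minimal sketch of how the input_data dictionary for the read action could be assembled. Only the two keys documented above are used; the helper name build_read_input and the file name are hypothetical, and whether page numbers are 0- or 1-based is not stated in this reference.

```python
from typing import Any, Dict, List, Optional


def build_read_input(path_to_file: str, pages: Optional[List[int]] = None) -> Dict[str, Any]:
    """Hypothetical helper: assemble the input_data dict for the read action."""
    if not path_to_file:
        raise ValueError("path_to_file must be a non-empty string")
    input_data: Dict[str, Any] = {"path_to_file": path_to_file}
    # "pages" is optional; leaving it out means the complete document is read.
    if pages is not None:
        input_data["pages"] = list(pages)
    return input_data


# Read the whole document.
whole_document = build_read_input("report.pdf")
# Read only selected pages (the page numbers here are illustrative).
selected_pages = build_read_input("report.pdf", pages=[1, 3])
```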
tables (bool, optional): If True, include text from tables. Defaults to True.
name (Optional[str], optional): Name used for identification. If None, the class name is used. Defaults to None.
input_data (Dict[int, Page]): PDF document pages.
Dict[int, str]: Extracted texts.
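Because the extracted texts are keyed by page number, a consumer usually has to restore page order before combining them. The sketch below assumes only the documented Dict[int, str] shape; join_page_texts and the sample data are hypothetical and not part of the library.

```python
from typing import Dict


def join_page_texts(texts: Dict[int, str], separator: str = "\n\n") -> str:
    """Hypothetical helper: concatenate per-page texts in ascending page order."""
    return separator.join(texts[page] for page in sorted(texts))


# Made-up sample data in the documented Dict[int, str] shape.
extracted = {2: "Second page.", 1: "First page."}
full_text = join_page_texts(extracted)
```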
input_data (Dict[int, Page]): PDF document pages.
Dict[int, List[Table]]: Found tables.
input_data (Dict[int, Page]): PDF document pages.
Dict[int, Any]: Extracted tables.
input_data (Dict[int, Page]): PDF document pages.
input_data (Dict[str, Any]): Data to process. Expected keys:
"path_to_file" (str): Path to PDF file;
"page_width" (float, optional): Page width in cm;
"page_height" (float, optional): Page height in cm;
"x_padding" (float, optional): x padding in cm;
"y_padding" (float, optional): y padding in cm;
"text" (str): text to write;
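The keys above give page geometry in centimetres. The sketch below shows an input_data dictionary for the write action and two hypothetical helpers: one converts centimetres to PDF points (72 points per inch, 2.54 cm per inch), the other computes the area left for text. It assumes that padding is applied on both sides of each axis, which this reference does not confirm, and it does not call the library itself.

```python
from typing import Any, Dict, Tuple

CM_PER_INCH = 2.54
POINTS_PER_INCH = 72.0


def cm_to_points(cm: float) -> float:
    """Convert centimetres to PDF points (1 inch = 72 points = 2.54 cm)."""
    return cm / CM_PER_INCH * POINTS_PER_INCH


def text_box_cm(input_data: Dict[str, Any]) -> Tuple[float, float]:
    """Hypothetical helper: width and height left for text, in cm.

    Assumption: x_padding and y_padding apply on both sides of the page.
    """
    width = input_data["page_width"] - 2 * input_data["x_padding"]
    height = input_data["page_height"] - 2 * input_data["y_padding"]
    if width <= 0 or height <= 0:
        raise ValueError("padding leaves no room for text")
    return width, height


# Example input using the documented keys; the values describe an A4 page.
write_input: Dict[str, Any] = {
    "path_to_file": "output.pdf",
    "page_width": 21.0,
    "page_height": 29.7,
    "x_padding": 2.0,
    "y_padding": 2.0,
    "text": "Hello, PDF!",
}

box_width_cm, box_height_cm = text_box_cm(write_input)
page_width_pt = cm_to_points(write_input["page_width"])
```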
Extract texts from pages. Subclass of . Type of [Dict[int, Page], Dict[int, str]]
Find tables on pages. Subclass of . Type of [Dict[int, Page], Dict[int, List[Table]]]
Extract tables from pages. Subclass of . Type of [Dict[int, Page], Dict[int, List[Table]]]
Extract images from pages. Subclass of . Type of [Dict[int, Page], Dict[int, List[]]]
Dict[int, List[]]: Extracted images.
Write PDF file. Subclass of . Type of [Dict[str, Any], None]