RequestsHTML

Basic requests-html scraper

Subclass of Executable. Type of Executable[Web2MeaningInput, Web2MeaningOutput]

Module: implementation.schemas

Methods and properties

Main methods and properties


__init__

Arguments:

  • js_rendering (bool, optional): Whether to render the page's JavaScript before extracting content. Defaults to False.

  • input_class (Type[RequestsHTMLInput], optional): Class for input validation. Defaults to RequestsHTMLInput.

  • output_class (Type[RequestsHTMLOutput], optional): Class for output validation. Defaults to RequestsHTMLOutput.

  • name (Optional[str], optional): Name used for identification. If None, the class name is used. Defaults to None.
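
As a rough illustration of how these constructor arguments fit together, the sketch below uses a hypothetical stand-in class. `ScraperSketch`, `SketchInput` and `SketchOutput` are placeholder names, not library classes, and nothing is imported from the library itself. It only mirrors the documented defaults, including the fallback from `name=None` to the class name.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Type


@dataclass
class SketchInput:
    """Placeholder for RequestsHTMLInput (hypothetical, not the library class)."""
    url: str


@dataclass
class SketchOutput:
    """Placeholder for RequestsHTMLOutput (hypothetical, not the library class)."""
    text: str
    links: List[str] = field(default_factory=list)


class ScraperSketch:
    """Hypothetical stand-in that mirrors the documented constructor arguments."""

    def __init__(
        self,
        js_rendering: bool = False,
        input_class: Type[SketchInput] = SketchInput,
        output_class: Type[SketchOutput] = SketchOutput,
        name: Optional[str] = None,
    ) -> None:
        self.js_rendering = js_rendering
        self.input_class = input_class
        self.output_class = output_class
        # Documented behaviour: when name is None, the class name is used.
        self.name = name if name is not None else type(self).__name__


default_scraper = ScraperSketch()
named_scraper = ScraperSketch(js_rendering=True, name="news_scraper")
```

Here `default_scraper.name` falls back to the class name, while `named_scraper` keeps the explicit name it was given.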




RequestsHTMLInput

Subclass of IOModel.


__init__

Arguments:

  • url (str): The URL of the page to be processed.
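
The validation rules that IOModel applies to `url` are not described here. The sketch below shows one plausible check under a stated assumption: that a usable URL is an absolute http(s) address. `UrlInputSketch` and `is_valid_url` are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class UrlInputSketch:
    """Hypothetical stand-in for an input model with a single url field."""
    url: str

    def __post_init__(self) -> None:
        parsed = urlparse(self.url)
        # Assumed rule for this sketch only: require an http(s) scheme and a host.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not an absolute http(s) URL: {self.url!r}")


def is_valid_url(url: str) -> bool:
    """Return True when UrlInputSketch accepts the URL, False otherwise."""
    try:
        UrlInputSketch(url=url)
    except ValueError:
        return False
    return True
```

Under this assumed rule, `https://example.com/page` passes, while a bare `example.com/page` or an `ftp://` address is rejected.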




RequestsHTMLOutput

Subclass of IOModel.


__init__

Arguments:

  • text (str): Text extracted from the page.

  • links (List[str]): Links found on the page.
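
To give a sense of what the `text` and `links` fields might hold, here is a minimal standard-library sketch that pulls visible text and `href` values out of an HTML string. It is not the scraper's actual implementation: `extract_text_and_links` and the parser class are hypothetical, and details such as whitespace handling or resolving relative links are assumptions that may differ from the real output.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class _TextAndLinks(HTMLParser):
    """Collects visible text chunks and anchor hrefs (illustrative only)."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: List[str] = []
        self.links: List[str] = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a script or style element.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text_and_links(html: str) -> Tuple[str, List[str]]:
    """Return (text, links) for an HTML string; links are left as written."""
    parser = _TextAndLinks()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks), parser.links


sample = (
    "<html><body><h1>Title</h1>"
    '<p>Hello <a href="https://example.com/a">link</a></p>'
    "<script>var x = 1;</script></body></html>"
)
text, links = extract_text_and_links(sample)
```

For the sample above, `text` is `"Title Hello link"` and `links` is `["https://example.com/a"]`; the script contents are skipped.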
