diff --git a/docs/page-objects/frameworks.rst b/docs/page-objects/frameworks.rst
index 10a0854f..214df0c8 100644
--- a/docs/page-objects/frameworks.rst
+++ b/docs/page-objects/frameworks.rst
@@ -7,7 +7,7 @@ Frameworks
 :ref:`Page objects ` are not meant to be used in isolation with web-poet. They
 are meant to be used with a web-poet framework.
 
-A web-poet framework is a Python web scraping framework, library, of plugin
+A web-poet framework is a Python web scraping framework, library, or plugin
 that implements the :ref:`web-poet specification `.
 
 At the moment, the only production-ready web-poet framework that exists is
diff --git a/docs/page-objects/page-params.rst b/docs/page-objects/page-params.rst
index 16279ac7..1d7177f8 100644
--- a/docs/page-objects/page-params.rst
+++ b/docs/page-objects/page-params.rst
@@ -4,45 +4,35 @@
 Using page params
 =================
 
-In some cases, Page Objects might require additional information to be passed to
-them. Such information can dictate the behavior of the Page Object or affect its
-data entirely depending on the needs of the developer.
+In some cases, :ref:`page object classes ` might require or allow
+parameters from the calling code, e.g. to change their behavior or make
+optimizations.
 
-If you can recall from the previous basic tutorials, one essential requirement of
-Page Objects that inherit from :class:`~.WebPage` would
-be :class:`~.HttpResponse`. This holds the HTTP response information that the
-Page Object is trying to represent.
-
-In order to standardize how to pass arbitrary information inside Page Objects,
-we'll need to use :class:`~.PageParams` similar on how we use
-:class:`~.HttpResponse` as a requirement to instantiate Page Objects:
+To support parameters, add :class:`~.PageParams` to your :ref:`inputs
+`:
 
 .. code-block:: python
 
     import attrs
-    import web_poet
+    from web_poet import PageParams, WebPage
 
     @attrs.define
-    class SomePage(web_poet.WebPage):
-        # The HttpResponse attribute is inherited from WebPage
-        page_params: web_poet.PageParams
+    class MyPage(WebPage):
+        page_params: PageParams
 
-    # Assume that it's constructed with the necessary arguments taken somewhere.
-    response = web_poet.HttpResponse(...)
+In your page object class, you can read parameters from a :class:`~.PageParams`
+object as you would from a :class:`dict`:
 
-    # It uses Python's dict interface.
-    page_params = web_poet.PageParams({"arbitrary_value": 1234, "cool": True})
-
-    page = SomePage(response=response, page_params=page_params)
+.. code-block:: python
 
-However, similar with :class:`~.HttpResponse`, developers using
-:class:`~.PageParams` shouldn't care about how they are being passed into Page
-Objects. This will depend on the framework that would use **web-poet**.
+    foo = self.page_params["foo"]
+    bar = self.page_params.get("bar", "default")
 
-Let's checkout some examples on how to use it inside a Page Object.
+The way the calling code sets those parameters depends on your :ref:`web-poet
+framework `.
 
-Controlling item values
------------------------
+Example: Controlling item values
+================================
 
 .. code-block:: python
 
@@ -69,7 +59,7 @@ Controlling item values
 
         @staticmethod
         def calculate_price_with_tax(item):
-            tax_rate = self.page_params.get("tax_rate") or self.default_tax_rate
+            tax_rate = self.page_params.get("tax_rate", self.default_tax_rate)
             item["price_with_tax"] = item["price"] * (1 + tax_rate)
 
 
@@ -80,12 +70,13 @@ the **tax_rate** as optional information, notice that we also have a
 the ``default_tax_rate`` as a backup value just in case it's not available.
 
 
-Controlling Page Object behavior
---------------------------------
+Example: Controlling page object behavior
+=========================================
 
 Let's try an example wherein :class:`~.PageParams` is able to control how
-:ref:`additional-requests` are being used. Specifically, we are going to use
-:class:`~.PageParams` to control the number of paginations being made.
+:ref:`additional requests ` are being used. Specifically,
+we are going to use :class:`~.PageParams` to control the number of pages
+visited.
 
 .. code-block:: python
 
@@ -110,7 +101,7 @@ Let's try an example wherein :class:`~.PageParams` is able to control how
         async def get_product_urls(self) -> List[str]:
             # Simulates scrolling to the bottom of the page to load the next
             # set of items in an "Infinite Scrolling" category list page.
-            max_pages = self.page_params.get("max_pages") or self.default_max_pages
+            max_pages = self.page_params.get("max_pages", self.default_max_pages)
             requests = [
                 self.create_next_page_request(page_num)
                 for page_num in range(2, max_pages + 1)
@@ -134,5 +125,5 @@ Let's try an example wherein :class:`~.PageParams` is able to control how
 
 From the example above, we can see how :class:`~.PageParams` is able to
 arbitrarily limit the pagination behavior by passing an optional **max_pages**
-info. Take note that a ``default_max_pages`` value is also present in the Page
-Object in case the :class:`~.PageParams` instance did not provide it.
+info. Take note that a ``default_max_pages`` value is also present in the page
+object class in case the :class:`~.PageParams` instance did not provide it.
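
Note on the two ``.get()`` changes above: replacing ``params.get(key) or default`` with ``params.get(key, default)`` changes behavior for falsy and ``None`` values, so it is more than a style cleanup. The sketch below illustrates the difference. It uses a plain ``dict`` as a stand-in for ``PageParams`` (an assumption, relying only on the dict-style reads described in the docs change), and the ``DEFAULT_TAX_RATE`` constant and helper function names are hypothetical.

```python
# A plain dict stands in for web_poet.PageParams here (assumption: only
# dict-style reads are needed, as the docs change above describes).
DEFAULT_TAX_RATE = 0.2  # hypothetical default, for illustration only


def tax_rate_with_or(page_params: dict):
    # Old form: any falsy value (missing key, 0, 0.0, None) falls back
    # to the default.
    return page_params.get("tax_rate") or DEFAULT_TAX_RATE


def tax_rate_with_get_default(page_params: dict):
    # New form: the default is used only when the key is missing.
    return page_params.get("tax_rate", DEFAULT_TAX_RATE)


missing = {}
zero = {"tax_rate": 0.0}
explicit_none = {"tax_rate": None}

for label, params in [("missing", missing), ("zero", zero), ("None", explicit_none)]:
    print(label, tax_rate_with_or(params), tax_rate_with_get_default(params))
```

With the new form a caller can pass a tax rate of ``0.0`` and have it respected. An explicit ``None`` is no longer replaced by the default, though, so ``1 + tax_rate`` would raise a ``TypeError`` in that case. Whether callers ever pass ``None`` depends on the framework and is not shown in this diff.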