
Update Example Model Used in Documentation #18327

Closed · 2 of 11 tasks

jxtngx opened this issue Aug 16, 2023 · 14 comments
Labels: docs (Documentation related), good first issue (Good for newcomers), help wanted (Open to be worked on)

Comments

@jxtngx (Contributor) commented Aug 16, 2023

📚 Documentation

Overview

This issue serves to track multiple updates to examples provided in PyTorch Lightning and Lightning Fabric documentation.

Examples will be updated to reflect a common model called LightningTransformer. This common model will replace all occurrences of LitAutoEncoder and similar encoder-decoder modules.

The purpose of these updates is to modernize the examples. However, care should be taken to provide examples that will run on most machines; for instance, pretraining or finetuning an LLM is not suitable as the default example, since new users may not have a machine capable of such a task.

Each update should have a PR of its own, or be grouped with related updates in a cohesive manner to consolidate review.

cc @Borda

Additional Resources

Follow the docs/README and documentation guidelines for information on docstring conventions and using doctest.

Tasks

Please comment on this issue to have an example added to the to-do list or to be assigned to a particular task.

PyTorch Lightning

Must look in docs/source-pytorch/.

Priorities

Secondary

Lightning Fabric

Must look in docs/source-fabric/.

LightningTransformer Demo for PyTorch Lightning

Below is the Transformer that will be used. The model exists in `lightning.pytorch.demos`, and can be abstracted from the docs in order to keep the example high-level.

Note: this is the same demo Transformer used on Lightning Fabric's home page.

```python
import lightning.pytorch as pl
import torch

from lightning.pytorch.demos import Transformer


class LightningTransformer(pl.LightningModule):
    def __init__(self, vocab_size):
        super().__init__()
        self.model = Transformer(vocab_size=vocab_size)

    def forward(self, batch):
        input, target = batch
        return self.model(input.view(1, -1), target.view(1, -1))

    def training_step(self, batch, batch_idx):
        input, target = batch
        output = self.model(input, target)
        loss = torch.nn.functional.nll_loss(output, target.view(-1))
        return loss

    def predict_step(self, batch):
        return self(batch)

    def configure_optimizers(self):
        return torch.optim.SGD(self.model.parameters(), lr=0.1)


if __name__ == "__main__":
    from lightning.pytorch.demos import WikiText2
    from torch.utils.data import DataLoader

    dataset = WikiText2()
    dataloader = DataLoader(dataset)
    model = LightningTransformer(vocab_size=dataset.vocab_size)

    trainer = pl.Trainer(fast_dev_run=True)
    trainer.fit(model=model, train_dataloaders=dataloader)
```

WikiText2DataModule Demo for PyTorch Lightning

```python
from pathlib import Path

from torch.utils.data import DataLoader, random_split

import lightning.pytorch as pl
from lightning.pytorch.utilities.types import EVAL_DATALOADERS, TRAIN_DATALOADERS
from lightning.pytorch.demos.transformer import WikiText2


class WikiText2DataModule(pl.LightningDataModule):
    def __init__(
        self,
        num_workers: int = 2,
        data_dir: Path = Path("./data"),
        block_size: int = 35,
        download: bool = True,
        train_size: float = 0.8,
    ) -> None:
        super().__init__()
        self.data_dir = data_dir
        self.block_size = block_size
        self.download = download
        self.num_workers = num_workers
        self.train_size = train_size
        self.dataset = None

    def prepare_data(self) -> None:
        # runs on a single process: download only, do not assign state here
        WikiText2(data_dir=self.data_dir, block_size=self.block_size, download=self.download)

    def setup(self, stage: str) -> None:
        # runs on every process: safe to assign state here
        if self.dataset is None:
            self.dataset = WikiText2(data_dir=self.data_dir, block_size=self.block_size, download=False)
        if stage == "fit" or stage is None:
            train_size = int(len(self.dataset) * self.train_size)
            val_size = len(self.dataset) - train_size
            self.train_data, self.val_data = random_split(self.dataset, lengths=[train_size, val_size])
        if stage == "test" or stage is None:
            self.test_data = self.val_data

    def train_dataloader(self) -> TRAIN_DATALOADERS:
        return DataLoader(self.train_data, num_workers=self.num_workers)

    def val_dataloader(self) -> EVAL_DATALOADERS:
        return DataLoader(self.val_data, num_workers=self.num_workers)

    def test_dataloader(self) -> EVAL_DATALOADERS:
        return DataLoader(self.test_data, num_workers=self.num_workers)
```
@jxtngx jxtngx added docs Documentation related needs triage Waiting to be triaged by maintainers labels Aug 16, 2023
@Borda Borda added help wanted Open to be worked on good first issue Good for newcomers and removed needs triage Waiting to be triaged by maintainers labels Aug 16, 2023
@aniketmaurya (Contributor)

AutoEncoder is a great example for image classification tasks and for beginner-to-intermediate folks. I'd propose replacing it only where we really need a more advanced demo. We can use Stable Diffusion and LitGPT as examples for GenAI models.

@Dev-Khant

Can I pick this up @JustinGoheen?

@jxtngx (Contributor, Author) commented Oct 5, 2023

Hi @Dev-Khant 👋 let's check with @carmocca and @awaelchli where they'd like you to assist.

@carmocca (Contributor) commented Oct 6, 2023

Hi @JustinGoheen, I don't have context on the objective of this issue, but I can see there's a long list of "secondary" items in the top post. Since you created it, I think it's up to you to decide whether they should be changed or not.

@sbshah97

Hey @JustinGoheen, I'd love to contribute, but I'm a new contributor. Can I help out?

@jxtngx (Contributor, Author) commented Oct 30, 2023

@sbshah97 would you like to work on lightning/pytorch/core/module?

@jxtngx (Contributor, Author) commented Oct 30, 2023

@Dev-Khant would you like to work on lightning/pytorch/trainer/trainer?

@sbshah97

Hello, yes, I can give it a try.

@Dev-Khant

> @Dev-Khant would you like to work on lightning/pytorch/trainer/trainer?

Yes, sure @JustinGoheen

@jxtngx (Contributor, Author) commented Oct 31, 2023

@Dev-Khant & @sbshah97: here are some clarifying instructions for you, since these tasks each involve docstrings.

The basic steps are:

  1. Fork and clone the Lightning repo.
  2. Create a working branch. If you don't want to manage your gitops from the command line, GitKraken is a really nice tool for managing git repos and PRs.
  3. Go to the respective pages for your particular task: LightningModule, Trainer.
  4. Read through the API section, scanning for grammar and syntax errors and for errors in the example code snippets.
  5. If there are no errors, that is completely okay – just let me know and I will do one final check before marking off the task and working with you to select your next contribution.
  6. If there are errors, make the changes on your working branch and push them.
  7. Once you have pushed changes to your working branch, open a draft PR to submit them. I can help review the changes before the PR's status is changed to ready for review.
  8. Once the PR is marked as ready for review, the core maintainers will either approve and merge it or suggest additional changes.

Thank you for your willingness to help, and definitely let me know if you have any questions 🙂

@sbshah97 commented Nov 8, 2023

Hey Justin. I wanted to understand how to approach this task. From what I understand there are two parts to it:

  1. Check for any grammatical mistakes. I put the page through ChatGPT, and from the looks of it there don't seem to be any errors.
  2. Check for any function errors in the documentation. I am not sure how to proceed on this. Any pointers?

@JustinGoheen

@jxtngx (Contributor, Author) commented Nov 9, 2023

@sbshah97 I'll put together a guide for doctest for you.
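As a teaser, here is a minimal sketch of what checking a documentation snippet with the standard-library `doctest` module looks like; the snippet itself is a made-up example, not from the Lightning docs:

```python
import doctest

# A documentation example written in doctest form (hypothetical snippet).
snippet = """
>>> values = [3, 1, 2]
>>> sorted(values)
[1, 2, 3]
"""

# Parse the snippet into a DocTest and execute it; any mismatch between
# expected and actual output is reported as a failure.
parser = doctest.DocTestParser()
test = parser.get_doctest(snippet, globs={}, name="snippet", filename="<snippet>", lineno=0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, results.attempted)  # 0 failures out of 2 examples
```

The same mechanism underlies `make doctest` in the docs build: each `>>>` example is executed and its output compared against what the docs claim.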

@sbshah97 commented Nov 9, 2023

Thank you Justin. Looking forward to that.

@sbshah97

Hey Justin, anything on this?

@jxtngx jxtngx closed this as completed Jul 9, 2024