Add a JupyterLite-powered interactive shell for the pandas website (reprise of #47428) #60758

Open
wants to merge 26 commits into
base: main
from

Conversation

agriyakhetarpal
Contributor

@agriyakhetarpal agriyakhetarpal commented Jan 22, 2025

This pull request aims to reinstate the machinery for adding a JupyterLite shell to the pandas website, based on discussions in #60747 and on Slack; please see them for the rationale behind this change (short answer: it's been a while, it's 2025, and things might be smoother now).

An interactive shell was first discussed in #46682 and later removed in #49807. There was a request to reinstate this in #49807 (comment) which might not have been noticed, so I hope #60747 and this PR bring more visibility to the proposal.

In particular, this PR adds back the previous changes from #47428, with only minor differences in the configuration. I've used (and pinned) jupyterlite-pyodide-kernel 0.5.2 for the REPL, which in turn provides a Pyodide distribution that comes with pandas version 2.2.3.

The differences in the changes from the previous PR are as follows:

  • newer versions of JupyterLite and jupyterlite-pyodide-kernel have been incorporated, i.e., 0.5.0 and 0.5.2 respectively
  • source maps have been disabled when building JupyterLite, which should reduce the size of static assets significantly and allow for faster load times
  • the word "Experimental" has been added to the REPL's Markdown heading to indicate it as such
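As a rough illustration of the source-map change listed above, the following sketch writes a JupyterLite build configuration from Python. It assumes the build picks up a `jupyter_lite_config.json` file and that `no_sourcemaps` under `LiteBuildConfig` is the relevant option; the helper name `write_lite_build_config` is hypothetical, and the configuration files actually used in this PR may differ.

```python
import json
from pathlib import Path


def write_lite_build_config(directory: str) -> Path:
    """Write a jupyter_lite_config.json that disables source maps.

    Assumption: the key names mirror JupyterLite's LiteBuildConfig options.
    This helper is illustrative only and is not part of the PR.
    """
    config = {
        "LiteBuildConfig": {
            # Skip generating .map files so the static assets are smaller.
            "no_sourcemaps": True,
        }
    }
    path = Path(directory) / "jupyter_lite_config.json"
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return path
```

Running `jupyter lite build` from the directory that contains this file should then, under the same assumption, produce a site without source maps.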

As mentioned in #60747, it is also possible to use the https://jupyterlite.github.io/demo/ REPL in the iframe to remove dependencies; however, that would come at the cost of losing these optimisations and having less control over how the REPL is built.

cc: @jtpio, please feel free to add any suggestions!

@mroeschke mroeschke added the Web pandas website label Jan 22, 2025
@agriyakhetarpal
Contributor Author

Well, looking at the website from the downloaded artifact, having it at 100% width definitely doesn't look great:

pandas website displayed from the documentation build artifact from the build from gh-60758. This image is that of the Getting Started section, where a heading 'Experimental, try pandas in your browser' is displayed, with an interactive shell to run pandas code in the browser. It occupies the entire width of the website area from the left margin to the right

and reducing the width to 50% would affect mobile devices and/or narrower screen sizes. I'll try to fix this by adding some inline CSS.

@agriyakhetarpal
Contributor Author

I hope it looks better now:

pandas website displayed from the documentation build artifact from the build from gh-60758. This image is that of the Getting Started section, where a heading 'Experimental, try pandas in your browser' is displayed, with an interactive shell to run pandas code in the browser. It now occupies a smaller portion of the screen and stays true to its given width, with a maximum width of 650px. This retains the aspect ratio and makes the REPL look cleaner. There is also a dark blue border around the REPL, as it is

I also fixed a scaling issue for narrower screens for the YouTube video a few sections above this, which also affected how the REPL looked on mobile devices.
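The sizing approach described in the two comments above (full width on narrow screens, capped at 650px on wide ones) can be sketched as a small helper that builds the iframe markup. This is only an illustration: the `lite/repl/index.html` path, the 600px height, and the helper name `repl_iframe` are assumptions, and the `kernel`, `toolbar`, and `code` query parameters are used on the assumption that the REPL reads them from the URL. The markup in the PR itself may look different.

```python
from html import escape
from urllib.parse import urlencode


def repl_iframe(base_url: str, max_width_px: int = 650, height_px: int = 600) -> str:
    """Return an <iframe> tag embedding a JupyterLite REPL.

    Assumptions: `base_url` is a hypothetical path to the built REPL page,
    and the query parameters below are ones the REPL understands.
    """
    query = urlencode(
        {"kernel": "python", "toolbar": "1", "code": "import pandas as pd"}
    )
    # width: 100% lets the frame shrink with narrow (mobile) screens,
    # while max-width stops it from spanning the whole page on wide ones.
    style = f"width: 100%; max-width: {max_width_px}px; height: {height_px}px;"
    src = escape(f"{base_url}?{query}", quote=True)
    return f'<iframe src="{src}" style="{style}"></iframe>'


print(repl_iframe("lite/repl/index.html"))
```

Combining `width: 100%` with `max-width` avoids the trade-off mentioned earlier, where a fixed 50% width would look too small on mobile devices.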

@rhshadrach
Member

cc @datapythonista if you're interested in taking a look, I plan to review in the next few days.

Member

@rhshadrach rhshadrach left a comment

Thanks for the PR!

@agriyakhetarpal
Contributor Author

Many thanks for the review, @rhshadrach! I'm sorry for not having responded earlier; I was on vacation recently. This should be ready for another look whenever you get a chance.

@agriyakhetarpal
Contributor Author

Also, re-requesting a review from @jtpio, just in case I missed something.

@jtpio
Contributor

jtpio commented Feb 14, 2025

> Well, looking at the website from the downloaded artifact, having it at 100% width definitely doesn't look great:

Maybe this could be taking 100% of the width in a future iteration, when JupyterLite 0.6.0 final is released, which will let users configure where the code console prompt cell should be placed:

image

Contributor

@jtpio jtpio left a comment

Thanks @agriyakhetarpal, tested locally and it looks great!

Co-authored-by: Jeremy Tuloup <[email protected]>
@agriyakhetarpal
Contributor Author

agriyakhetarpal commented Feb 14, 2025

Thanks for the review, @jtpio – it would be nice to customise the location of the prompts (through URL parameters, I assume – I'm waiting for jupyterlite/jupyterlite#148 and jupyterlite/jupyterlite#1573 to land)!

@rhshadrach
Member

@pandas-dev/pandas-core - Is anyone concerned about the increased load on the server? This was tried previously, so I think we should be okay, but wanted to make sure before merging.

@Dr-Irv
Contributor

Dr-Irv commented Feb 20, 2025

> @pandas-dev/pandas-core - Is anyone concerned about the increased load on the server? This was tried previously, so I think we should be okay, but wanted to make sure before merging.

If I understand how this would work, every time you opened up the Getting Started page, it would load an iframe with the interactive shell, and there is a lot of bandwidth used to do that (70 MB?). If that's the case, I think this should have its own page, i.e., people who are just looking at the Getting Started page shouldn't have to load the JupyterLite iframe. Then the Getting Started page could point to the page with the shell, and we can measure how often people are using it.

@agriyakhetarpal
Contributor Author

agriyakhetarpal commented Feb 20, 2025

> > @pandas-dev/pandas-core - Is anyone concerned about the increased load on the server? This was tried previously, so I think we should be okay, but wanted to make sure before merging.
>
> If I understand how this would work, every time you opened up the Getting Started page, it would load an iframe with the interactive shell, and there is a lot of bandwidth used to do that (70 MB?). If that's the case, I think this should have its own page, i.e., people who are just looking at the Getting Started page shouldn't have to load the JupyterLite iframe. Then the Getting Started page could point to the page with the shell, and we can measure how often people are using it.

Actually, the bandwidth usage will be in two steps. The first iteration will use up to ~7 MiB of bandwidth, and it is only when one presses Enter in the import pandas as pd code prompt that it will start loading pandas and its dependencies.

I would be happy to move to a dedicated page, though, based on the consensus that will be achieved here – that also helps Plausible do its job better.

@rhshadrach
Member

@Dr-Irv - it doesn't load with the page. See #60758 (comment)

@Dr-Irv
Contributor

Dr-Irv commented Feb 20, 2025

> The first iteration will use up to ~7 MiB of bandwidth

That's still a fairly big load for a web page (or at least it seems that way to me).

> that also helps Plausible do its job better.

Can you explain what you mean here by "Plausible do its job better"?

@agriyakhetarpal
Contributor Author

> > The first iteration will use up to ~7 MiB of bandwidth
>
> That's still a fairly big load for a web page (or at least it seems to me to be a big load)

I agree – I can't deny it :) Though, from a quick test just now, I noticed that https://pandas.pydata.org/ without the shell accumulates 4.35 MiB of bandwidth, so it might not be too bad.

> > that also helps Plausible do its job better.
>
> Can you explain what you mean here by "Plausible do its job better" ?

Oh, I meant that a dedicated page would show up separately in the dashboard on https://views.scientific-python.org/pandas.pydata.org/, so the pandas team would be able to see better metrics; viz. entry/exit links, the time spent on the page, and so on, to see how much the shell would be utilised in the wild. Having it on the Getting Started page makes gaining said metrics a bit more complex, as pageview goals need to be set up.

I'd probably rephrase my statement from "helps Plausible do its job better" to "helps the pandas website team that makes use of Plausible do their job better" – this seems to better convey what I wanted to say.
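To put the bandwidth figures quoted in this exchange side by side, here is a back-of-the-envelope calculation. It assumes the two numbers can simply be added, i.e., that a page weighing about as much as the measured homepage (4.35 MiB) would additionally pull in the shell's initial load (up to ~7 MiB). The Getting Started page itself was not measured in this thread, so treat the result as an estimate rather than a measurement.

```python
HOMEPAGE_MIB = 4.35          # pandas.pydata.org without the shell, as quoted above
SHELL_FIRST_STEP_MIB = 7.0   # upper bound for the shell's initial load, as quoted above


def combined_load(page_mib: float, shell_mib: float) -> tuple[float, float]:
    """Return (total MiB, total relative to the page alone).

    Assumption: the two loads are independent and simply add up.
    """
    total = page_mib + shell_mib
    return total, total / page_mib


total, ratio = combined_load(HOMEPAGE_MIB, SHELL_FIRST_STEP_MIB)
print(f"up to {total:.2f} MiB in total, about {ratio:.1f}x the page alone")
```

Loading pandas and its dependencies after pressing Enter is a separate, later step and is not included in this estimate.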

@datapythonista
Member

I agree with @Dr-Irv. The terminal is very cool, but at the same time tricky. It didn't work very well when we tried it in the past. And even in a best-case scenario, it can be a problem, for example, for mobile users with a limited data plan. Users should be able to access our Getting Started page without exposure to the terminal, in my opinion. And personally, I'd also prefer if the terminal is hosted elsewhere and we just link to it. No strong objection to having it in this repo and our hosting, but I think it's better if it has its own space (I think another repo in our organization and GitHub Pages, for example). It allows us to iterate quickly on it, and puts less pressure on having to deliver a fast and reliable experience to the users if it's perceived as an external resource. Just my opinion.

@agriyakhetarpal
Contributor Author

agriyakhetarpal commented Feb 21, 2025

Thanks for your thoughts, @datapythonista. As two people have now suggested moving it to a separate page, I'm happy to do so. I agree with moving the shell to a separate repository in pandas wired up with GitHub Pages, and I can help with the maintenance of such a repository if one could be set up in the organisation from the template at https://github.com/jupyterlite/demo.

One option is to use an external deployment from that repository at https://jupyterlite.github.io/demo, but I should note that it comes with source maps enabled (jupyterlite/demo#151), which means that the JupyterLite assets are slightly bigger in size. I could switch to that if you would like, as NumPy uses it as well for https://numpy.org/, but I would suggest not doing so at the moment. Setting up a shell of our own involves extra infrastructure changes, and those will help me with a follow-up PR, after this one is merged, that will enable interactivity for the "API reference" examples for pandas, which relies on the same tooling. For example, here's a parallel attempt for NumPy from a while ago: numpy/numpy#26745.

As with the approach here, I plan to change the NumPy website's shell to a deployment that doesn't enable source maps in the short term.

@Dr-Irv
Contributor

Dr-Irv commented Feb 21, 2025

@agriyakhetarpal thanks for separating the page.

Member

@rhshadrach rhshadrach left a comment

lgtm - will merge in a week if there are no further reviews / discussion.

Labels
Enhancement Web pandas website
Development

Successfully merging this pull request may close these issues.

DOC: Reinstate the JupyterLite-based live shell for the pandas website
6 participants