docs: simplify the Quickstart guide #27612

Merged
merged 6 commits into from
Mar 29, 2024

@mistercrunch
Member

mistercrunch commented Mar 21, 2024

For a true quickstart, let's direct people towards docker-compose,
where they can get a multi-container instance set up with two shell
commands.

The current docker setup is limited and forces the user to run
an extra set of commands to load examples, create a user, ...
and they end up in a slower single-container instance.

With docker compose, all of this is already set up, and it points
to the latest release by default.

It also looks like the current setup uses sqlite(?) and isn't persisted(?).
Having Postgres and some persistence in the image should be
a positive upgrade here too.

@github-actions github-actions bot added the doc Namespace | Anything related to documentation label Mar 21, 2024
@rusackas
Member

@artofcomputing @michael-s-molina @eschutho @sfirke for more opinions. I wonder if there's a place for both methods.

@mistercrunch
Member Author

My take is that we need to simplify things, and use and expose higher-level constructs. If people want to go lower level, they can refer to the underlying scripts, where all the information resides.

Relatedly, I would love to strip "external" information that is not Superset-specific and is better maintained elsewhere. You need to install docker on a Mac? On Ubuntu? Look at the docker docs!

@artofcomputing
Contributor

artofcomputing commented Mar 22, 2024

@mistercrunch I agree on the need for simplification, which was one of the main reasons the Quickstart was created, as users weren't able to get an instance of Superset up and running easily beforehand.

It's possible to have both installation methods in place and let the user choose which one best fits their needs, while explaining the tradeoffs of each.

> It also looks like the current setup uses sqlite(?) and isn't persisted(?).
> Having Postgres and some persistence in the image should be
> a positive upgrade here too.

That's right: currently the standalone instance uses SQLite, but the data persists as long as the container is stopped rather than removed, as the documentation explains.

@mistercrunch
Member Author

Whenever we offer more options it dilutes what is truly supported and creates confusion. "Just tell me what I should use" is the way I'd feel entering a new community. There shouldn't be many options for a quick start.

We could factor out this method as yet another way to run Superset into the "installation and configuration" under "shaky single host docker setup with SQLite"

@mistercrunch
Member Author

Oh one more thing about the current quickstart is that AFAIK it cannot be configured easily, meaning it's a configurability dead end. If someone wanted to install a database driver to connect to their database, or simply flip a feature flag, they'd probably have to go inside the docker container and mess around with things. On the docker-compose side of the house, things can evolve and configurability is well documented. There are clear, prescribed ways to do these things.
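
To illustrate the feature-flag case (a minimal sketch, not taken from this PR): Superset reads overrides from a Python config module, and feature flags are set through a `FEATURE_FLAGS` dictionary. The file name and the flag chosen here are assumptions for the example; the docker-compose docs describe where a given setup expects the override file.

```python
# superset_config.py -- illustrative override module.
# The file location is an assumption; check the docker-compose docs
# for where your setup picks up configuration overrides.

# Feature flags are toggled through a plain dict keyed by flag name.
FEATURE_FLAGS = {
    "ALERT_REPORTS": True,  # example flag, chosen only for illustration
}
```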

@artofcomputing
Contributor

> Whenever we offer more options it dilutes what is truly supported and creates confusion

I can't find in the documentation what the official installation methods are. Would you mind telling me where they are listed?

If not, "truly supported" means nothing if the user can't find it.

> We could factor out this method as yet another way to run Superset into the "installation and configuration" under "shaky single host docker setup with SQLite"

Again, simply declare the officially supported installation methods and we can go from there. Only Kubernetes is supported? Let's do a Quickstart with Kubernetes. Both Kubernetes and Compose? Let's offer both.

> Oh one more thing about the current quickstart is that AFAIK it cannot be configured easily, meaning it's a configurability dead end.

I agree, that was a big oversight when it was written, since Superset configuration is geared towards a declarative method. Yet there's no mention of Superset being declarative at the start; it should actually be documented in the intro.

@mistercrunch
Member Author

Oh for quickstart what I'm suggesting is in this PR (docker-compose pointing to the latest official release)

For installation there's a whole section, starting now with k8s, but getting into all sorts of nooks and crannies... ->
https://superset.apache.org/docs/installation/running-on-kubernetes

For development setups -> https://github.com/apache/superset/blob/master/CONTRIBUTING.md

What's clear is that this documentation evolved bit by bit through different generations of contributors. This quickstart re-write is a first take towards re-thinking / merging / sorting through all this.

I think it's becoming clear to me that for quickstart/sandbox/development docker-compose is the preferred way and for production Helm/k8s is the way, supported by minikube for smaller, single-host-type setups.

@artofcomputing
Contributor

> Oh for quickstart what I'm suggesting is in this PR (docker-compose pointing to the latest official release)
> For installation there's a whole section, starting now with k8s, but getting into all sorts of nooks and crannies... ->
> https://superset.apache.org/docs/installation/running-on-kubernetes
> ...
> For development setups -> https://github.com/apache/superset/blob/master/CONTRIBUTING.md

That wasn't the question. Which are the official and supported installation methods, and where in the documentation are they declared? Yes, there are multiple installation pages; however, Docker Compose is now declared at the top of its page as a method not suitable for production deployment. Is it still suitable as the basis for a Quickstart? And what about PyPI? If so, there needs to be a page that tells users which methods the Superset community supports.

Again, without it, "truly supported" means nothing.

> What's clear is that this documentation evolved bit by bit through different generations of contributors. This quickstart re-write is a first take towards re-thinking / merging / sorting through all this.

That's true. IMHO Superset has advanced quite a lot over the years; however, the documentation hasn't so far been able to keep up with the pace of development, making it Superset's weakest selling point. There's a lot of work to be done.

> I think it's becoming clear to me that for quickstart/sandbox/development docker-compose is the preferred way and for production Helm/k8s is the way, supported by minikube for smaller, single-host-type setups.

I agree, the documentation can definitely be geared towards these methods, but they must be agreed upon by the community and also stated in the documentation, so as not to confuse users. That was another oversight when the Quickstart was written (which is fixed in this PR).

@mistercrunch
Member Author

mistercrunch commented Mar 25, 2024

> which are the official and supported installation methods

This is beyond the scope of this PR. This PR is about improving quick start. Can we agree that this approach is better than the previous single-docker/sqlite/5-steps approach? I'm planning on improving the rest of the docs to clarify installation, but let's start here.

One big assumption here is that "quick start" is about sandboxing (as in standing up something quickly that someone can play with, in a non-production environment). This fits the scope and direction I'm advocating for our usage of docker-compose:

  • easy and quick to fire up
  • opinionated (picks postgres/redis)
  • reproducible / deterministic
  • lightly configurable, but suitable for a test drive (feature flags)

If "quick start" were to point towards our preferred production method (Helm/k8s), I think it'd be harder to make it a good experience, since the concerns on that side of the house are around getting a solid and secure setup: thinking through setting up your SECRET_KEY and secret management for it, setting up and configuring a database outside of docker, ...

About the question that's beyond the scope of this PR of "what we support", I want to say that it's a complex multi-dimensional matrix with different levels of support. No silver bullet here. The architecture gives people lots of flexibility, and you're free to use anything as your metadata database (MySQL, Oracle, SQL Server, ...), as well as different solutions for caching, for your message queue, and for observability. I think overall this flexibility hurts us and creates more burden for maintainers. Clarifying preferred solutions would help here, but we have to strike a balance between flexibility and making it clearer where we draw the line.

@artofcomputing
Contributor

artofcomputing commented Mar 25, 2024

> Can we agree that this approach is better than the previous single-docker/sqlite/5-steps approach? I'm planning on improving the rest of the docs to clarify installation, but let's start here.
>
> One big assumption here is that "quick start" is about sandboxing (as in standing up something quickly that someone can play with, in a non-production environment). This fits the scope and direction I'm advocating for our usage of docker-compose...

IMHO your direction and view are much more tangible and easier to grasp now, and I fully agree with them. It's also possible to use this direction to rewrite other pages, with the quick start serving as an example/baseline for simplifying instructions.

> About the question that's beyond the scope of this PR of "what we support"...

I agree, there's no silver bullet. However, I think that the direction you're advocating for the quick start can be a good starting ground to provide better instructions on how to change between different solutions.

@artofcomputing
Contributor

artofcomputing left a comment

LGTM, will follow the step-by-step to check for any issues.

After configuring your fresh instance, head over to [http://localhost:8080](http://localhost:8080) and
log in with the default created account:
### 3. Log into Superset
Now head over to [http://localhost:8080](http://localhost:8080) and log in with the default created account:
Contributor

Superset doesn't start on port 8080; when using the docker-compose-image-tag.yml file, it starts on 8088.
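
One quick way to confirm which port is actually accepting connections, as a minimal sketch: the helper name `is_port_open` is hypothetical and only the Python standard library is used. It checks TCP reachability only, not that Superset itself is answering.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False
```

With the compose stack running, calling `is_port_open("localhost", 8088)` and `is_port_open("localhost", 8080)` would show which of the two ports is listening.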

Member Author

good catch!

@artofcomputing
Contributor

artofcomputing left a comment

Looks great and it's simple to follow, IMO ready to merge.

@mistercrunch mistercrunch merged commit 79cf206 into master Mar 29, 2024
25 checks passed
EandrewJones pushed a commit to UMD-ARLIS/superset that referenced this pull request Apr 5, 2024
EnxDev pushed a commit to EnxDev/superset that referenced this pull request Apr 12, 2024
@rusackas rusackas deleted the docs_quickstart branch April 16, 2024 16:52
qleroy pushed a commit to qleroy/superset that referenced this pull request Apr 28, 2024
vinothkumar66 pushed a commit to vinothkumar66/superset that referenced this pull request Nov 11, 2024
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 4.1.0 labels Nov 27, 2024
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels doc Namespace | Anything related to documentation preset-io size/M 🚢 4.1.0

4 participants