
Make openHABian less dependent of upstreams #655

Closed
mstormi opened this issue Aug 7, 2019 · 18 comments

Comments

@mstormi
Contributor

mstormi commented Aug 7, 2019

We have a couple of places where - in addition to the Debian repos we use - we wget some URL or git clone GitHub repos.
We usually retrieve the latest version, which means it is not known to work at install time. Some of these instances even require compiling code; others in turn use yet another upstream.

I suggest we increase reliability by any of these methods, or combinations thereof:

  • fetch only tagged git versions that we know to work
    (instead of master or anything that's subject to change quickly and without our noticing).
  • create a cache via GitHub mirror repos (or mirror them into the openhabian repo)
  • fetch packages and store them in openhabian repo for download
  • compile binaries at image creation time and store them in openhabian repo for download
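The first option can be sketched as a small shell helper. This is only an illustration: the function name is made up, and the tag name follows the "openhabian_v1.5" example used later for the zram fix.

```shell
#!/usr/bin/env bash
# Sketch: pin a component to a known-good git tag instead of cloning HEAD.
# Function name and arguments are illustrative; the tag naming follows the
# "openhabian_v1.5" convention mentioned in this issue.
set -e

clone_pinned() {
  local repo_url="$1" tag="$2" dest="$3"
  # --branch also accepts tag names; --depth 1 keeps the download small
  git clone --quiet --branch "$tag" --depth 1 "$repo_url" "$dest"
}
```

Installs then always get the tagged state, no matter what lands on master afterwards.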

Instances to work on (will track here, let me know more via comments)

Critical (any user needs it):
openhabian repo itself
openhabian-zram, overlayfstools
Zulu Java, Adopt Java, (OpenJDK ?)

Optional:
knxd
Tellstick
miflora daemon

Fixed for zram in PR #656
Created 2 nested(!) mirror repos and added/used the "openhabian_v1.5" tag for working code.

@mstormi changed the title to "Make openHABian independent of upstreams" Aug 7, 2019
@EliasGabrielsson
Contributor

EliasGabrielsson commented Aug 8, 2019

I actually discussed this topic with @ThomDietrich not long ago. First of all, I don't see "one" correct way of solving this matter. I was exploring the idea of providing a verified configuration, a pre-installed image, which is somewhat what you are suggesting. We ended the discussion concluding that this is about where to put the maintenance resources, and that when issues have occurred they have been solved rather quickly.

But maybe we need to cherry-pick some unstable upstream packages to get a stable build. Which things have been broken long-term so far because upstream patches weren't merged?

@mstormi
Contributor Author

mstormi commented Aug 8, 2019

zram, knxd and Tellstick, I believe, all of which were not ready for Buster.
OK, not long-term, but proactive 'safe' work is better than needing to react fast.
Agree there's no single "correct" way but I think what I did for zram is a good solution for Github repos.
Mirror repos effectively don't change unless we actively pull/merge the upstream so they cost no maintenance resources unless we're willing to spend them as part of component or openHABian upgrade activities. Meanwhile no untested upstream changes can break installs.

@EliasGabrielsson
Contributor

Hmm, I think we should stay with the original repos as long as possible, to easily track work. We should only use packages which are well maintained. It's always possible to check out a single commit as well.
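The "check out a single commit" idea mentioned above could look something like this; the function name and arguments are illustrative, not existing openHABian code.

```shell
#!/usr/bin/env bash
# Sketch: clone an original upstream repo, then detach HEAD at a single
# known-good commit so later upstream pushes have no effect on installs.
# Function name and arguments are illustrative.
set -e

checkout_commit() {
  local repo_url="$1" commit="$2" dest="$3"
  git clone --quiet "$repo_url" "$dest"
  # --detach pins the working tree at exactly this commit
  git -C "$dest" checkout --quiet --detach "$commit"
}
```

This keeps the original repo as the source of truth while still giving a reproducible install.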

@mstormi
Contributor Author

mstormi commented Aug 8, 2019

I see no contradiction. Our mirror is just that, a mirror of the original repo. Albeit a more reliable, less volatile one. We could change something ourselves in there if needed or at least determine when we expose openHABian users to changes in the original. Cloning the original repo during a user's install will not provide that level of control.

@EliasGabrielsson
Contributor

> Let's continue discussion on #655 for that - we don't want updates to happen automatically.
> But we can enforce them at any time by updating the repo mirror.

I don't agree with this approach. I would say that the normal use case should be to fetch upstream patches as fast as possible, and rather fix bugs upstream, or integration issues downstream in our codebase. It is important to be in sync with upstream sources so we don't end up in a state with a lot of technical debt.

There can of course be exceptions to this as well.

@BClark09 and @ThomDietrich, do you have any reflections on this?

@BClark09
Member

BClark09 commented Aug 10, 2019

I agree with @EliasGabrielsson. Original upstreams should be used where possible.

The workload of knowing which fixes/patches/features are maintained where becomes greater. Some upstream libraries may be slow to respond, but I would much rather have that than figuring out how bugfix B from the upstream can be merged with additional feature A from a mirror.

To add, the zram feature has been moved from the original upstream (which looks well maintained) to a different upstream, and this does not really solve the issue of making the openHABian project independent. Say we encounter an error that needs fixing: it's just more work to send a bugfix in two directions. (We shouldn't just patch our version of the code. I don't feel this is very friendly to an open source ecosystem.)

As @EliasGabrielsson mentioned, a specific commit from an upstream can be used if volatility is an issue.

@mstormi
Contributor Author

mstormi commented Aug 10, 2019

> To add, the zram feature has been moved from the original upstream (which looks well maintained) to a different upstream and this does not really solve the issue of making the openHABian project independent.

Err, no, it isn't really well maintained. That may not be visible at first, but that was in fact the reason for me to introduce that mirror level.

> Say we encounter an error that needs fixing, it's just more work to send a bugfix in two directions. (We shouldn't just patch our version of the code. I don't feel this is very friendly to an open source ecosystem.)

Agreed, insofar as in the normal case we should send our fixes to the upstream first. But if the maintainer is slow or unwilling to react, we have a means of overriding and mitigating the impact that we do not have today.
And remember, most upstreams don't "produce" for openHABian, or are even aware that we use their code. A change to the upstream does not mean we need it, or that it benefits us at all; it can even be the opposite. The consequence and downside of forwarding unvalidated changes to our users is that you take a fair risk of installing something broken, which shows up in openHABian first or only there. And to most of our own users this will look like an openHABian problem.
Most of the time it just doesn't matter whether the old or new version is used, so there's no benefit in changing. Meanwhile, any change has the potential to break or interrupt operations, and only in very few cases is it a fix we benefit from. In those cases we will of course take over the change by simply copying the repo (git rebase). Note this also reduces the number of different versions in use, which is desirable from a professional point of view.
We should aim to professionalize our release management. It's a known best practice in professional environments not to deploy software you have not previously tested yourself, and not to deploy versions you don't need.

@EliasGabrielsson
Contributor

> In those cases we of course will take over the change by simply copying the repo (git rebase).

Merging diverged codebases is not as simple as you describe and requires an extensive amount of work. I do not agree that your proposed method of release management is suitable for this project. I don't say it is wrong, just not suitable. To give the end user a good first-time experience using openHAB, they should have the latest and greatest of all optional packages etc.

To achieve quality with that goal in mind, I would say we should focus on integrating and deploying often.
(Check this slide deck for why: http://frequentdeploys.club/)

If a user wants a super stable conservative installation for running critical automation they should not use openHABian as base installation.

@mstormi
Contributor Author

mstormi commented Aug 10, 2019

> Merging diverged codebases is not as simple as you describe and requires an extensive amount of work.

That makes me believe we're not talking about exactly the same thing. I mean to use the origin repo as the only source (except in emergencies). All fixes should be sent and applied upstream and will trickle back down to us when we mirror (a merge, but since it all comes from that single source it's not really a merge but rather a pull). Note that I of course mean to merge ALL commits that have accumulated at that point in time, not just the ones we consider beneficial.
Having this intermediate mirror gives you full flexibility on a per-mirror (per-feature) basis: you can mirror any commit immediately, but you don't have to, and you can do it in batches, or not at all if there's nothing wrong with the old (= proven for our application) version. Remember, upstreams don't build for exclusive use in openHABian; they may not even be aware that, or how, we use their tool.
If for your feature you believe in the upstream's capabilities and really believe that the latest is always the greatest, you can auto-replicate every change of the upstream repo into openHABian (for your feature only).
But I don't believe in that - it's not only openHAB that introduces breaking changes at times, and as long as we don't have stable and HEAD versions of openHABian like we have in openHAB, the Levenshtein distance between leading edge and bleeding edge is just 2 ...
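The batched mirror update described above could be as small as this sketch. The function name, the "upstream" remote name, and the branch argument are illustrative; it assumes a mirror repo that already has the original repo configured as a remote.

```shell
#!/usr/bin/env bash
# Sketch: pull ALL upstream commits accumulated so far into the mirror,
# but only when a maintainer decides the new state is worth shipping.
# Function name, remote name ("upstream") and branch argument are illustrative.
set -e

sync_mirror() {
  local mirror_dir="$1" branch="$2"
  git -C "$mirror_dir" fetch --quiet upstream
  # fast-forward only: the mirror never diverges, it just lags by choice
  git -C "$mirror_dir" merge --quiet --ff-only "upstream/$branch"
}
```

Between runs, the mirror is frozen, so user installs keep getting the proven state.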

> If a user wants a super stable conservative installation for running critical automation they should not use openHABian as base installation.

Disagree. What an openHABian user expects is hassle-free-ness; first and foremost, that means reliability. That means driving a conservative approach for base units such as the kernel, the Raspbian release, install routines etc. They MUST work stably and need not be the latest.
At the same time you can add innovative features that mirror the upstream developer's version 1:1 if you feel a need to include them to attract users.

PS: this approach already was pretty helpful in quickly solving the first zram issue.

@mstormi changed the title from "Make openHABian independent of upstreams" to "Make openHABian less dependent of upstreams" Aug 13, 2019
@mstormi
Contributor Author

mstormi commented Sep 7, 2019

@EliasGabrielsson @holgerfriedrich @ThomDietrich please spend another thought or two on this proposal.
The recent problems with Azul Java are a good example of why we really need this, IMHO.
While I still hold the firm view that latest isn't automatically greatest - i.e. we shouldn't install HEAD from remote repos - it isn't one-or-the-other. We should at least increase reliability, e.g. like this:
We could download and 'cache' a known-to-work package by storing it in the openhabian repo.
The install routine could then be reworked to try retrieving it from the source, but fall back to our copy if the source is unavailable.
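The try-source-then-fall-back idea could be sketched like this. The function name, URL, and cache path are illustrative, not existing openHABian code; wget is used since that's what the thread mentions the install scripts already use.

```shell
#!/usr/bin/env bash
# Sketch: try the upstream URL first, and fall back to a known-to-work copy
# pre-stored in the openhabian repo when the download fails.
# Function name, URL and cache path are illustrative.
set -e

fetch_with_fallback() {
  local url="$1" cached_copy="$2" dest="$3"
  # the -s check guards against a zero-byte file left by a failed download
  if ! wget -q -O "$dest" "$url" || ! [ -s "$dest" ]; then
    cp "$cached_copy" "$dest"
  fi
}
```

With this, an upstream outage at install time degrades to the cached version instead of a failed install.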

@EliasGabrielsson
Contributor

The downtime during the Azul Java change was around ~1 day before it was fixed. I would say that is an acceptable tradeoff for always using the latest possible SW artifacts. Better to integrate often so we don't lag behind.

@mstormi
Contributor Author

mstormi commented May 8, 2020

FYI, we have another problem right now: the latest Azul package is broken.
We really need to come up with a solution to cache "good" releases of upstreams.

Needless to say that the "integrate often" approach doesn't work when maintainers are absent.

@ecdye
Member

ecdye commented May 12, 2020

I don't believe this is entirely necessary; in the Azul instance you refer to, it was more of an oversight than anything else. Additionally, the fix you suggested does not work in the case outlined, as the system still downloads the binary and then bypasses the cache.

Also, the whole principle of openHABian is to create a lightweight system to be run on a small system like an RPi. The principle of caching does not hold true to the intended goal of the project. Furthermore, as it states in the installation article, "The good news: openHABian helps you to stay away from Linux - The bad news: Not for long..." which goes to show that it remains a user-involved process. That is not to say we should make it noob-unfriendly, but that there will be some errors and users should be able to report issues when they arise.

If the expectation is to provide an up-to-date system that is fully featured, then I think the approach we are implementing is perfectly fine, particularly after the creation of a stable branch. Otherwise, we will effectively be changing the scope of the project.

@mstormi
Contributor Author

mstormi commented May 13, 2020

> I don't believe that this is entirely necessary,
> Additionally, the fix you suggested does not work in the case outlined

Sure, any implementation would need to be tailored to match the source, but that needs to be done anyway. Not sure what you mean here; if, say, the apt install method runs a script to download the file, we should catch the download. Not to be discussed here but in the respective issue, please.
Please don't mix implementation questions (#878 etc.) with strategic ones (this issue, #655).
I still fail to understand why in principle we cannot (pre-)store a proven working copy.

> Also, the whole principle of openHABian is to create a lightweight system to be run on a small system like a RPi. The principle of caching is not holding true to the intended goal of the project.

That's just an interpretation of yours and, to be frank, it's wrong. System size does not matter much; there's even space left on an 8GB SD card to do that. And don't read too much into the wording ("caching"); it's just one way of describing my intention.
I see no conflict with project goals in using caching/pre-storage/whatever ... that's what the image already does today as well; it's a (partial) cache of packages so you don't have to download at install time. On the contrary, this is a contribution to "hassle-free-ness" in its best possible sense.

> Which goes to show that it remains a user involved process. That is not to say that we should make it noob unfriendly but that there will be some errors and users should be able to report issues when they arise

That's yet another completely different thing; please don't mix all of these.
[BTW, the cited statement and a major part of the README are very old, so don't take everything in there too literally - it really needs reworking too. Supposed to be addressed in #730, I think.]

The first and foremost goal is a great openHABian UX (to be "hassle-free", as Thomas once put it), and we want to avoid any need for users to manually step in (and hence a need for them to be proficient in Linux) wherever possible.
Today's openHABian is less so than it would be with my proposal in place, because today it depends on external sources being available at installation time.
Installation time is the first time a user makes contact with openHABian and openHAB.
If that does not work right away, it's also often the last time they touch this stuff.
Remember, there's no chance for a second first impression!

Enhancements to help troubleshoot openHABian are yet another story, but this was quite recently addressed as well with the debug guide in #683 and continues to be relevant in everything we do.

@mstormi
Contributor Author

mstormi commented May 15, 2020

BTW, the latest Raspbian (lite) image available for download includes Oracle JDK and other packages that require licensing. For their handling, see here.

@ecdye
Member

ecdye commented May 15, 2020

Is it possible to remove licensed packages?

@mstormi
Contributor Author

mstormi commented May 16, 2020

Is it possible to remove licensed packages?

Of course; Raspbian is not a fixed bundle.

@mstormi mstormi added this to the Refinement and Refocus milestone May 23, 2020
@mstormi
Contributor Author

mstormi commented Jun 18, 2020

openHABian has essentially been made capable of a successful install even if Internet connectivity is unavailable.
Enhancements to make this work for optional components too, like Tellstick, are of course still welcome, but they're not crucial.
As the main goal is reached (and the main driver gone), I'll close this issue.

@mstormi mstormi closed this as completed Jun 18, 2020