release(prod): Update miner to latest GA 2021.11.22.0 #269
Create production PR automatically when pushing a miner release to testnet
Add testnet to name
Due to new multi architecture building in hm-miner we need to now add the arm64 tag here Relates-to: #NebraLtd/hm-miner#56
bump miner to latest arm64-version
fix: add arch type in name
fix environment variable
fix environment variable
release(testnet): Update miner to latest GA 2021.11.22.0
release(testnet): Update miner to latest GA 2021.11.22.0
Fetch the whole repo
release(testnet): Update miner to latest GA 2021.11.22.0
Let's get it out to production asap. Testnet looking good
Guys, with all respect for what you are doing, why is the PR not automerged to the production branch after two approvals? It looks like the step of an authorized user clicking the "Merge" button is redundant and creates a bottleneck.
@usalis I think they forgot how important this is, because many devices are stuck now while this PR is not merged yet.
@justkidding96, I do not think they forgot about the importance; the PR and related actions were done at night, and we appreciate it. I am questioning the release-to-production process. If all checks have passed (2 reviews + testnet) but we still wait for one person to do the merge, isn't that counterproductive?
Testnet checks are not always immediate, which is why we don't want to auto-merge. But thanks for the suggestions.
@shawaj, my understanding is that testnet checks are done by human observation: someone checks hotspot activity for the first beacon witnessed after deployment to testnet to confirm it is working. I am sure there will be volunteers in the community to help automate this if the team has no plans or resources to do it soon.
@usalis it isn't really something the community can help with per se, as we need someone on our team who knows the codebase quite intimately to perform tests internally on our testnet (the internal part isn't currently documented in the README). Having said that, we potentially could put some customers' devices onto our testnet to help verify releases. However, I don't think many people will be interested in that, as the devices will have considerably more downtime on testnet than in production, and people don't like that. Open to ideas though, for sure.
@shawaj We'd be happy to see if some of our hosts are open to running an internal beta hotspot in addition to their regular hotspot. They're all experiencing so much downtime due to firmware issues already that I don't think anyone would mind.
@rawrmaan will discuss this internally and see if there is an easy way to achieve something like this (although having multiple onboarded hotspots in the same location may not be good for hosts either?). Perhaps we could have two levels of testnet or something. Need to have a think and see what can fit in with the existing pipeline/workflow. And we will need to make sure that people are fully aware that it will be far more unreliable to be on the bleeding-edge update flow. Unfortunately though, the vast majority of issues at the moment are coming from the Helium side - either due to the excessive number of issues on the network (including a GA every 2 days on average over the last few weeks, 2 major chain halts and a "slow down"!) or due to the fact that GAs are not well documented with breaking changes, and breaking changes are often dropped with no prior warning. To be honest, it feels like there needs to be something more substantial from the Helium side - building a robust CI/CD pipeline with lots of automated testing, but also a better / more standardised release flow as per helium/HIP#309. Whilst the manufacturer has responsibility for a lot of stuff on the hardware and software side, AFAIK Helium right now does not test on all vendors' hardware/software stacks when pushing GAs, which is bound to cause issues, and these will only increase as the network (and the number of different types of hardware) grows. Helium also has the resources and the recurring incentives (large % of HNT earnings) to fix this - more so than any other network stakeholder IMO. Probably worth further discussion though, I guess, from all aspects.
I understand that Helium has been in massive flux and nothing is stable right now. It has to be very frustrating to deal with this as you're trying to improve your own software and systems. However, writing 3 paragraphs about how Helium needs to fix things when Nebra is uniquely failing to get their hotspots working is not a good look. No other major manufacturer has such an awful PoC success rate. We have first-hand experience with a large sample size that shows Nebra hotspots are failing to even boot at a 10-30% rate. This happens before any of Helium's code comes into the picture. Look, I get it. Software is hard. I've built companies. Sometimes you ship to production 20 times in a row and it's still broken and you don't know why. We need an acknowledgement that you guys are having issues stabilizing the firmware, and openness about what issues you're seeing and how you're planning to fix them. Since this is an open source project, maybe we could even help fix the issues if we know what's wrong. What will it be, @shawaj? Continued denial and gaslighting your customers, or acknowledging the situation so we can all move forward constructively?
We aren't having issues stabilising the software (over and above the huge number of outages and GAs lately). In fact, many others operating large quantities of our miners are telling us that our software is far superior to others. We use balenaCloud, which is extremely reliable (I believe Emrit and others use it too). We have had a small percentage of hotspots facing the issues described in #266 (there are about 4 separate issues, which relate to balena-engine not killing containers properly) after the large number of GAs in recent weeks. But other than that, there are no issues we are aware of. If you are having particular problems, please reach out to our support at [email protected]
Update miner to latest GA 2021.11.22.0
Ref #268
Pushed to testnet at Tue Nov 23 06:15:44 UTC 2021