Scraper 2.0 improvements - Part I #1481
to support retaining team and host files for the explorer while not including them in CScraperManifests. Also maintains backward compatibility with version 1 file manifests.
This adds support for the -explorer flag, which changes the behavior of the scraper to hold files for a longer period of time and also download team and host files. The publishing of manifests is not affected. This is the initial implementation of the explorer flag, team and host file downloading and retention.
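The retention behavior described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the struct, the function names, and both retention periods are assumptions made for the example.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// All names and values here are hypothetical, for illustration only.
struct ScraperFileEntry
{
    std::string sFilename;
    int64_t nTimestamp = 0;             // download time in seconds
    bool fExcludeFromManifest = false;  // true for files kept only for the explorer
};

constexpr int64_t NORMAL_RETENTION_SECS = 48 * 60 * 60;        // assumed value
constexpr int64_t EXPLORER_RETENTION_SECS = 7 * 24 * 60 * 60;  // assumed value

// Explorer mode holds files for a longer period than normal operation.
int64_t RetentionSeconds(bool fExplorer)
{
    return fExplorer ? EXPLORER_RETENTION_SECS : NORMAL_RETENTION_SECS;
}

// A file is eligible for deletion once it is older than the retention period.
bool IsExpired(const ScraperFileEntry& entry, int64_t nNow, bool fExplorer)
{
    return nNow - entry.nTimestamp > RetentionSeconds(fExplorer);
}

// Publishing is unaffected by explorer mode: excluded files are never published.
std::vector<ScraperFileEntry> EntriesToPublish(const std::vector<ScraperFileEntry>& entries)
{
    std::vector<ScraperFileEntry> result;
    for (const auto& entry : entries) {
        if (!entry.fExcludeFromManifest) result.push_back(entry);
    }
    return result;
}
```

Keeping the exclusion as a per-entry flag means the retained team and host files can sit in the same file manifest as the published files without leaking into the published manifests.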
Also skip the hash check for files excluded from publishing, since these files are very large and hashing them is expensive and unnecessary.
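A minimal sketch of the idea of skipping the hash check for excluded files, under stated assumptions: `FileRecord`, `CountInvalidFiles`, and the use of `std::hash` over an in-memory string are stand-ins chosen for the example, not the scraper's actual structures or hash function.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical record; sContent stands in for the file bytes on disk.
struct FileRecord
{
    std::string sContent;
    std::size_t nExpectedHash = 0;
    bool fExcludeFromManifest = false;
};

// Returns the number of files whose hash does not match the expected value.
// nHashesComputed reports how many files were actually hashed.
std::size_t CountInvalidFiles(const std::vector<FileRecord>& files, std::size_t& nHashesComputed)
{
    nHashesComputed = 0;
    std::size_t nInvalid = 0;

    for (const auto& file : files) {
        // Excluded files are large and never published, so skip the costly hash.
        if (file.fExcludeFromManifest) continue;

        ++nHashesComputed;
        if (std::hash<std::string>{}(file.sContent) != file.nExpectedHash) {
            ++nInvalid;
        }
    }

    return nInvalid;
}
```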
Also other minor cleanup. Some structures in ConvergedManifest and the cache added here may be eliminated after testing and fine-tuning.
This implements a bClean boolean in the cache that is marked false in scraper_net when manifests are received from the network or published locally. It is marked true when a new set of statistics and SBContract core is computed. The rule is that the cached contract will be used when the cache age is younger than nScraperSleep in seconds OR the cache is clean, i.e. no new manifests have been published (if a scraper node) or received (if a normal node). This will avoid the statistics calculation pulse seen on mainnet every 300 seconds during times when the scrapers are not active and not publishing new manifests.
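The cache rule stated above could be expressed along these lines. This is only a sketch: `ContractCacheState` and `UseCachedContract` are hypothetical names, and the real cache global holds more than the two fields shown.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the cached statistics / SB contract core state.
struct ContractCacheState
{
    int64_t nTime = 0;    // time the cached contract was computed, in seconds
    bool bClean = false;  // set false when a manifest is received or published,
                          // set true when statistics and contract are recomputed
};

// Returns true if the cached contract may be reused instead of recomputed:
// either the cache is younger than the scraper sleep interval, or no new
// manifests have arrived since it was computed.
bool UseCachedContract(const ContractCacheState& cache, int64_t nNow, int64_t nScraperSleepSecs)
{
    assert(nNow >= cache.nTime);
    const int64_t nAge = nNow - cache.nTime;
    return nAge < nScraperSleepSecs || cache.bClean;
}
```

With a 300 second sleep interval, a clean cache is reused no matter how old it is, which is what removes the periodic recalculation while the scrapers are idle.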
Also remove unnecessary bByParts flag check in GUI tooltip
Force-pushed from 17bce0f to ca85096.
I can't comment on it inline: is this line removing the unprocessed user.gz file for a project? Do we want to check fExplorer before deleting it like with the team file?
Edit: that's line 2047 if the link doesn't work.
Just give me the line number. The link didn't work.
Line 2047
I didn't change the behavior for the user files, because I didn't think startail needed to process the full user file. He and I only talked about the team and host files. If he wants the user file too, I will have to make modifications.
I have pinged him to clarify.
Force-pushed from ca85096 to 5ac717c.
Use boost::algorithm::join to compress joining of vector elements in strings for tooltip.
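For readers unfamiliar with it, `boost::algorithm::join(parts, separator)` concatenates the strings in a container with a separator between them. The sketch below shows the same behavior with the standard library only, so that it compiles without Boost; `JoinStrings` is a hypothetical stand-in, not the function used in the PR.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Std-only stand-in for joining vector elements with a separator.
// The separator is placed between elements, never at the start or end.
std::string JoinStrings(const std::vector<std::string>& parts, const std::string& separator)
{
    std::string result;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i > 0) result += separator;
        result += parts[i];
    }
    return result;
}
```

A single join call replaces a hand-written loop that has to special-case the first or last element, which is the kind of compression the commit message refers to.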
Force-pushed from 5ac717c to 7e4ce4d.
Ok. I think we are good pending @startailcoon's clarification.
I'm interested in processing the full user files as well. Sorry if this was overlooked in our previous talks, @jamescowens.
for explorer mode. Normalized common code for aligning scraper file manifest entries into a separate function, AlignScraperFileManifestEntries, to eliminate repeated code.
Both of those vectors must only include scrapers marked active in the appcache.
Ok. I think we are ready to merge this. Please take a last look.
Running with -explorer: looks like it's downloading and retaining the unprocessed export files as expected and the manifest looks correct.
Gotta keep an eye on disk space in explorer mode. 8.2 GB after one day. 🙂
Perhaps a future optimization keeps only the latest unprocessed stats files. I wonder if explorers will need the same etag versions to match the converged stats. The extra space is probably minor after all.
I am not sure what @startailcoon is going to need with these unprocessed files. He wanted a week's worth, so I have a feeling just keeping the latest is not going to work. We may want to save just one per day, since several of the projects update their files multiple times per day. I think for right now, we should stick to keeping the unprocessed files for each and every etag change...
It eats up a lot of disk space, but I think he is prepared for that. No telling what his explorer is already using disk-space wise. I imagine quite a bit.
Force-pushed from 6e86f12 to b4dfde6.
Added:
- Add freedesktop.org desktop file and icon set #1438 (@a123b)
- Add warning in help for blockchain scan for importprivkey #1469 (@jamescowens)
- Consolidateunspent rpc function #1472 (@jamescowens)
- Scraper 2.0 improvements #1481, #1488, #1509, and #1514 (@jamescowens, @cyrossignol)
    - explorer mode operation
    - simplified explainmagnitude output
    - improved convergence reporting, including scraper information in the tooltip when fDebug3 is set
    - improved statistics and SB contract core caching based on a bClean flag in the cache global
    - new SB format and packing for bv11
    - new SB contract hashing (native) for bv11
    - changes to accommodate new beacon approach
    - Implement in memory versioning for team file ETags
- Implement local dynamic team requirement removal and whitelist #1502 (@cyrossignol)

Changed:
- Quiet logging for getmininginfo and scraper INFO logging level #1460 (@jamescowens)
- Spelling corrections #1461, #1462 (@caraka)
- Update crypto module #1453 (@denravonska)
- Update .travis.yml for Bionic #1475 (@jamescowens)
- Create CPID classes and clean up CPID code #1477 (@cyrossignol)
- Refactor researcher context and CPID harvesting #1480 (@cyrossignol)
- Remove boinckey export RPC method and import handler
- Notify when wallet locked in advertisebeacon RPC method #1504 (@cyrossignol)
- Notify when wallet locked in beaconstatus RPC method #1506 (@cyrossignol)
- Change spacer minimum height hint #1511 (@jamescowens)

Removed:
- Remove safe mode #1434 (@denravonska)
- Remove bitcoin.moc in Makefile.qt.include #1444 (@RoboticMind)
- Clean up legacy Proof-of-Work functions #1497 (@cyrossignol)

Fixed:
- Constrain walletpassphrase to 10000000 seconds #1459 (@jamescowens)
- Straighten out localization in the scraper. #1471 (@jamescowens)
- Quick fix for rainbymagnitude #1473 (@jamescowens)
- Correct negation error in scraper tooltip for vScrapersNotPublishing #1484 (@jamescowens)
- Fix staked block rejection when active researcher #1485 (@cyrossignol)
- Add back informational magnitude to generated blocks #1489 (@cyrossignol)
- Add back in the in sync check in ScraperGetNeuralContract #1492 (@jamescowens)
- Scraper correct team file processing. #1501 (@jamescowens)
- Have importwallet file path default to datadir #1508 (@jamescowens)
- Scraper add Beacon Map size check to ensure convergence #1515 (@jamescowens)
This Part I of Scraper 2.0 implements a collection of improvements including
Part II is anticipated to have