refactor the indexer #726
Conversation
* separate commit for easier review *
- organize the schema file & add new indexer tables
- add a generator for the new indexer tables
- use gorm instead of plain sql in the crafter
…-sdk-go into development_proxy_indexer
grid-proxy/cmds/proxy_server/main.go
Outdated
healthIndexerInterval uint
healthIndexerWorkers uint
healthIndexerInterval uint
dmiWatcherWorkers uint
Why did we move from the word "indexer" to "watcher"?
grid-proxy/pkg/types/nodes.go
Outdated
@@ -53,6 +55,11 @@ type CapacityResult struct {
Used Capacity `json:"used_resources"`
}

type Speed struct {
I noticed there are multiple speeds for upload/download; there are ipv4 and ipv6 versions? Which is it? And are …
Yes, we have ipv4 and ipv6 for both tcp and udp, so that's four tests per iperf server. For example, devnet has two iperf servers on the two public nodes, so we have a total of 8 tests.
A discussion was raised in the chat group and I remember the decision was to show only the non-zero value. Should we do something different, or show them all?
@@ -0,0 +1,34 @@
package indexer
rename the file to filter_up_nodes.go
It was planned to have more util methods for filtering up/healthy nodes or calling rmb.
grid-proxy/internal/indexer/types.go
Outdated
// item2
// ...
// ]
type DMI struct {
Why didn't you bring the dmidecoder from zos?
)

const (
indexerCallTimeout = 10 * time.Second
bump that to 30
func NewIndexer(
ctx context.Context,
paused bool,
This shouldn't be here; if it's for internal state, it should default to the zero value, false.
"github.com/threefoldtech/tfgrid-sdk-go/grid-proxy/internal/explorer/db" | ||
"github.com/threefoldtech/tfgrid-sdk-go/grid-proxy/pkg/types" | ||
"github.com/threefoldtech/tfgrid-sdk-go/rmb-sdk-go/peer" | ||
rmbTypes "github.com/threefoldtech/tfgrid-sdk-go/rmb-sdk-go/peer/types" | ||
) | ||
|
||
const ( |
Prefix all of the consts with gpu (given there are multiple indexers in the same package).
grid-proxy/internal/indexer/gpu.go
Outdated
db: db,
rpcClient: rpcClient,
Is that client a reconnecting one? In long-running processes you should be careful not to lose that single connection you have.
AFAIK yes, the rpcClient takes a substrate manager and handles opening/closing connections when needed.
grid-proxy/internal/indexer/gpu.go
Outdated
@@ -130,11 +108,25 @@ func (n *NodeGPUIndexer) runQueryGridNodes(ctx context.Context) {
}

func (n *NodeGPUIndexer) getNodeGPUInfo(ctx context.Context, nodeTwinID uint32) error {
ctx, cancel := context.WithTimeout(ctx, 10*time.Second)
subCtx, cancel := context.WithTimeout(ctx, 10*time.Second)
bump to 30
grid-proxy/internal/indexer/dmi.go
Outdated
}
}

func parseDmiResponse(dmiResponse DMI) types.DmiInfo {
Replace everything related to parsing with the zos dmidecode package; don't do it yourself.
I checked the dmi pkg in zos, and the DMI response is already parsed by that pkg. What I am doing here is just selecting what we want to store instead of storing the whole dmi response.
I had to stop the review.
1- Please make sure all of the indexers have a similar interface.
2- Remove everything related to any manual parsing of the dmidecode output.
3- Prefix all of the constants related to each indexer with the indexer name, e.g. healthIndexerNumWorkers.
4- For every indexer there should be:
- a Start method, which can have its own interface Starter
- a Stop method, which can have its own interface Stopper
- a FindHealthyNodes method
- a ResultsBatcher method
- a DBBatchesUpserter method
All of them can be converted to functions whenever needed.
Each indexer file should have its logic laid out top to bottom, e.g.:
- consts
- vars
- types
- factories
- entry point Start function
- then each of the methods asked for
Note: consts and vars should be prefixed with the indexer name.
Use the same terms everywhere, e.g. status, state, numWorkers, workersCount, etc. All of these need to be unified.
In an upcoming improvement we can use singleflight to avoid expensive DB/network calls, e.g. when all indexers ask for the healthy nodes within the same timeframe.
Aside from that, the PR already has too many concerns: supporting the dmi, refactoring the indexers, introducing more dependencies. Please make sure future PRs are scoped.
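A minimal sketch of the kind of per-indexer interface being asked for here; the method names come from the comment above, while the signatures and result types are assumptions rather than the actual ones in the codebase:

```go
package indexer

import "context"

// Starter and Stopper are split out so an indexer can satisfy only one of them if needed.
type Starter interface {
	Start(ctx context.Context) error
}

type Stopper interface {
	Stop(ctx context.Context) error
}

// Indexer groups the per-indexer methods requested in the review.
// The []uint32 result (node twin ids) is an assumed placeholder.
type Indexer interface {
	Starter
	Stopper
	FindHealthyNodes(ctx context.Context) ([]uint32, error)
	ResultsBatcher(ctx context.Context)
	DBBatchesUpserter(ctx context.Context)
}
```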
…-sdk-go into development_proxy_indexer
…exers:
- configure a flag to stop or start the indexer manager on startup. It is useful to use the `-no-indexer` flag while running integration tests, to keep the loaded data in the mock client consistent with the actual data in the DB.
- organize the database calls in the `postgres.go` file by separating the indexer calls into a new file. Also, keep the numeric `NodeTwinId` field typed as `uint32` in all methods.
- remove the duplicated types across different pkgs and use only the ones in `pkg/types`.

pkgs db/tests/generator:
- add a new trigger on the `node_gpu` table; it should fire on insert/delete to update the `num_gpu` field on the `resources_cache` table.
- remove all implementations of the `Scanner`/`Valuer` interfaces for jsonb fields in the gorm models. Instead, as an easier and cleaner approach, use the registered json serializer. Apply this for the dmi fields and the node power object.
- modify the generation of `healthy_reports` mock data to use `gorm` functionality instead of building a plain SQL string.
- update and organize the schema file used for the db mock generation so it is aligned with the newly defined struct models.

Indexer pkg changes:
- import types directly from zos when needed instead of re-implementing them; applied this for IperfTestResult and DMI.
- create util methods, generic methods that can be used by all indexers to avoid repeating code.
- create a manager that registers/controls all the indexers.
- define an `Indexer` interface with the methods needed by each indexer.
- document the indexer package along with the registered indexers and their metadata.
- implement the `Indexer` interface for each registered indexer; this required some refactoring, especially in the GPU indexer.
- make a better decision on old GPU node deletion by adding a timestamp expiration and checking whether it has expired with each upsert. The allowed interval is the same as the check interval in the indexer, an hour for now. This is better than checking the lastAdded twin id because we have multiple callers/batchers whose results may conflict.
- add a new trigger for the dmi indexer, to start querying newly added nodes without waiting for the next rotation.
…-sdk-go into development_proxy_indexer
- correctly handle starting the indexer manager when !noIndexer, after debugging
- change the updated_at field on the gpu table in the schema to a bigint to align with the struct model
- use gorm instead of plain sql for scanning in the loaders for both load nodes/dmi, to benefit from the registered json serializer instead of implementing the scanning logic for the jsonb fields
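As a minimal illustration of that approach (the model and field names below are hypothetical, not the ones in the PR), gorm's registered JSON serializer lets a jsonb column be declared with a `serializer:json` tag instead of hand-written Scanner/Valuer or scanning logic:

```go
package models

// DmiInfo is stored as a single jsonb column.
type DmiInfo struct {
	BIOS      map[string]string `json:"bios"`
	Baseboard map[string]string `json:"baseboard"`
}

// Node declares the jsonb field with gorm's `serializer:json` tag, so gorm
// marshals/unmarshals it with the registered JSON serializer automatically.
type Node struct {
	NodeTwinID uint32  `gorm:"primaryKey"`
	Dmi        DmiInfo `gorm:"type:jsonb;serializer:json"`
}
```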
flag.UintVar(&f.dmiIndexerIntervalMins, "dmi-indexer-interval", 60*24, "node dmi check interval in min")
flag.UintVar(&f.dmiIndexerNumWorkers, "dmi-indexer-workers", 1, "number of workers checking on node dmi")
flag.UintVar(&f.speedIndexerIntervalMins, "speed-indexer-interval", 5, "node speed check interval in min")
flag.UintVar(&f.speedIndexerNumWorkers, "speed-indexer-workers", 100, "number of workers checking on node speed")
Just wondering: why do we need multiple sets of indexer workers to query things from nodes? Can't we have a single set of workers that ask the node for everything?
I mean, I see here dmi-indexer-workers, gpu-indexer-workers, speed-indexer-workers (whatever that means), etc.
This is just a question, not a request for change.
grid-proxy/internal/indexer/gpu.go
Outdated
go n.StartNodeFinder(ctx)
go n.startNodeTableWatcher(ctx)
Do these methods need to be public? I believe they are intended for private use by the indexer object. I suggest renaming them to be private instead.
This applies to all other indexers here
batchSize uint
}

func NewNodeHealthIndexer(
I can see that an indexer mainly does the same operation, so I was wondering whether the indexer logic can be abstracted more.
From what I see, an indexer has a set of options (say workers), runs indexing every certain amount of time, and then does some sort of batching to update the db.
All this logic can then be unified, and each indexer needs to define only 2 things:
- The function that does the "fetch" of the data.
- The logic of the "batch" db update.
Those 2 functions can then be part of an IndexerWork interface, which can have multiple implementations (say gpu and dmi). An Indexer instance can then be created for each, something like NewIndexer(GPUIndexerWork()) for example, then call indexer.Start().
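A rough sketch of that suggested shape; the names IndexerWork, Fetch, and UpsertBatch are placeholders for illustration, not an agreed API:

```go
package indexer

import "context"

// IndexerWork carries only the per-resource logic (gpu, dmi, speed, ...);
// the shared Indexer would own the workers, intervals, and batching.
type IndexerWork interface {
	// Fetch queries a single node for this resource.
	Fetch(ctx context.Context, nodeTwinID uint32) (any, error)
	// UpsertBatch writes a collected batch of results to the db.
	UpsertBatch(ctx context.Context, batch []any) error
}
```

Usage would then look roughly like `indexer := NewIndexer(NewGPUIndexerWork(...))` followed by `indexer.Start(ctx)`, with the shared loop living entirely inside the generic indexer.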
Description:
- redefine the indexer structs as a single generic indexer without a manager. Each indexer does the generic work of find/get/batch/upsert, while each indexer work has a different job based on the type defined when creating the indexer.
- each indexer work should only handle its specific job:
  - defining its finders or triggers
  - preparing the rmb call to the node, parsing the response, and returning a ready db model that can be stored
  - upserting its results by updating, discarding, or inserting the passed arguments
- the indexer code is responsible for the shared logic:
  - executing the finder methods for each indexer, as defined in its work, based on the interval
  - starting the getter workers that invoke the get method on the work
  - starting a batcher for each indexer, which has two channels as its ends
  - starting the upserter workers that invoke the upsert method on the work

Changes:
- remove the indexer manager in favor of easily starting a separate generic indexer per resource type
- replace the redundant `NewIndexer`/`Start`/`Find`/`Batch` methods on each indexer with generic methods on the `Indexer[T any]` struct
- update the return type of the get method on the work to `[]T` instead of `T`, to cover all indexer responses, such as the gpu call which returns a slice of gpus
- create the finders map for easy reuse of the finder methods on new/up/healthy nodes
- check the uniqueness of the nodes in each batch before inserting, to prevent duplicated data coming from two different triggers such as healthy/up or up/new
- modify the default sql-log-level in the makefile
- add more debug logging statements
- update docs
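A compact sketch of what that generic pipeline could look like; the type and method names here are illustrative, and the channel wiring is simplified (single worker per stage, no graceful shutdown of the channels):

```go
package indexer

import (
	"context"
	"time"
)

// Work is the per-resource job: finding which nodes to query, getting data
// from one node, and upserting a batch of results into the db.
type Work[T any] interface {
	Find(ctx context.Context) []uint32                   // twin ids of nodes to query
	Get(ctx context.Context, twinID uint32) ([]T, error) // a slice, e.g. one node can report several gpus
	Upsert(ctx context.Context, batch []T) error
}

// Indexer runs the shared pipeline: finder -> getter -> batcher -> upserter.
type Indexer[T any] struct {
	work      Work[T]
	interval  time.Duration
	batchSize int
	idChan    chan uint32
	resChan   chan T
}

func (i *Indexer[T]) Start(ctx context.Context) {
	// finder: push twin ids into idChan on every interval tick
	go func() {
		ticker := time.NewTicker(i.interval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				for _, id := range i.work.Find(ctx) {
					i.idChan <- id
				}
			}
		}
	}()

	// getter: query each node and push its results into resChan
	go func() {
		for id := range i.idChan {
			items, err := i.work.Get(ctx, id)
			if err != nil {
				continue
			}
			for _, item := range items {
				i.resChan <- item
			}
		}
	}()

	// batcher + upserter: collect results and flush them in batches
	go func() {
		batch := make([]T, 0, i.batchSize)
		for item := range i.resChan {
			batch = append(batch, item)
			if len(batch) >= i.batchSize {
				_ = i.work.Upsert(ctx, batch)
				batch = make([]T, 0, i.batchSize)
			}
		}
	}()
}
```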
Force-pushed from b7916c1 to 2db2a21.
Converting to draft so it is not merged until 3.13 reaches mainnet.
…-sdk-go into development_proxy_indexer
Force-pushed from c8f7886 to 119a62a.
Let's go!
Description & Changes
- add `-no-indexer` and `no-modify` to the commands & update the workflow to stop the indexer while running integration tests
- use `gorm` on the crafter for easier mock data generation for the custom model types

Related issues

Checklist