refactor the indexer #726

Merged: 24 commits merged into development from development_proxy_indexer on Mar 20, 2024

Conversation

@Omarabdul3ziz (Contributor) commented on Jan 31, 2024

Description & Changes

  • combine the indexers into a single indexer with different watchers
  • use a single RpcRmbClient across the proxy
  • add the -no-indexer and no-modify flags to the commands & update the workflow to stop the indexer during integration tests
  • add watchers for dmi/speed
  • implement Scanner/Valuer for the model structs (see the sketch after this list)
  • add triggers on the new table to update the resources cache
  • use gorm in the crafter for easier mock data generation for the custom model types
  • add loaders/tests for the new fields
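A minimal sketch of what a Scanner/Valuer pair for a jsonb-backed model struct can look like (the field names are illustrative, not the PR's actual definitions):

```go
package types

import (
	"database/sql/driver"
	"encoding/json"
	"fmt"
)

// Speed is stored as a jsonb column; the field names are illustrative.
type Speed struct {
	Upload   float64 `json:"upload"`
	Download float64 `json:"download"`
}

// Value implements driver.Valuer so the struct is serialized to json on writes.
func (s Speed) Value() (driver.Value, error) {
	return json.Marshal(s)
}

// Scan implements sql.Scanner so a jsonb column is decoded back into the struct on reads.
func (s *Speed) Scan(value interface{}) error {
	b, ok := value.([]byte)
	if !ok {
		return fmt.Errorf("unexpected type %T for jsonb column", value)
	}
	return json.Unmarshal(b, s)
}
```

As the later commits note, these hand-written implementations are eventually dropped in favor of gorm's registered json serializer.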

Related issues

Checklist

  • Tests included
  • Build pass
  • Documentation
  • Code format and docstring

healthIndexerInterval uint
healthIndexerWorkers uint
healthIndexerInterval uint
dmiWatcherWorkers uint
Contributor:

why did we move from the word "indexer" to watcher?

@@ -53,6 +55,11 @@ type CapacityResult struct {
Used Capacity `json:"used_resources"`
}

type Speed struct {
Contributor:

I noticed there are multiple speeds for upload/download, and there are ipv4 and ipv6 versions? Which is it? and are

Contributor Author:

Yes, we have ipv4 and ipv6 for both tcp and udp, so that's four tests per iperf server.
For example, devnet has two iperf servers on the two public nodes, so we have a total of 8 tests.

A discussion was raised in the chat group, and I remember the decision was to show only the non-zero value. Should we do something different, or show them all?
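For illustration, a minimal sketch of the "show only the non-zero value" choice over the ipv4/ipv6 x tcp/udp measurements; the result shape here is hypothetical (the real type is imported from zos later in this PR):

```go
package indexer

// IperfResult is a hypothetical, trimmed-down shape of a single iperf
// measurement, for illustration only.
type IperfResult struct {
	UploadSpeed   float64
	DownloadSpeed float64
}

// pickSpeed keeps the first non-zero upload/download values out of the
// per-protocol results, matching the decision mentioned above.
func pickSpeed(results []IperfResult) (upload, download float64) {
	for _, r := range results {
		if upload == 0 && r.UploadSpeed != 0 {
			upload = r.UploadSpeed
		}
		if download == 0 && r.DownloadSpeed != 0 {
			download = r.DownloadSpeed
		}
	}
	return upload, download
}
```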

@@ -0,0 +1,34 @@
package indexer

Contributor:

rename the file to filter_up_nodes.go

Contributor Author:

it was planned to have more util methods here for filtering up/healthy nodes or calling rmb

// item2
// ...
// ]
type DMI struct {
Contributor:

Why didn't you bring the dmidecoder from zos?

)

const (
indexerCallTimeout = 10 * time.Second
Contributor:

bump that to 30


func NewIndexer(
ctx context.Context,
paused bool,
Contributor:

This shouldn't be here; if it's for internal state, it should default to the zero value, false.

"github.com/threefoldtech/tfgrid-sdk-go/grid-proxy/internal/explorer/db"
"github.com/threefoldtech/tfgrid-sdk-go/grid-proxy/pkg/types"
"github.com/threefoldtech/tfgrid-sdk-go/rmb-sdk-go/peer"
rmbTypes "github.com/threefoldtech/tfgrid-sdk-go/rmb-sdk-go/peer/types"
)

const (
Contributor:

prefix all of the consts with gpu (given there are multiple indexers in the same package)

db: db,
rpcClient: rpcClient,
Contributor:

Is that client a reconnecting one? In long-running processes you should be careful not to lose that single connection you have.

Contributor Author:

afaik yes, the rpcClient takes a substrate manager and handles opening/closing connections when needed

@@ -130,11 +108,25 @@ func (n *NodeGPUIndexer) runQueryGridNodes(ctx context.Context) {
}

func (n *NodeGPUIndexer) getNodeGPUInfo(ctx context.Context, nodeTwinID uint32) error {
ctx, cancel := context.WithTimeout(ctx, 10*time.Second)
subCtx, cancel := context.WithTimeout(ctx, 10*time.Second)
Contributor:

bump to 30

}
}

func parseDmiResponse(dmiResponse DMI) types.DmiInfo {
Contributor:

replace everything related to parsing with the zos dmidecode package; don't do it yourself

Contributor Author:

I checked the dmi pkg in zos, and the DMI response is already parsed by that pkg.
What I am doing here is just selecting what we want to store instead of storing the whole dmi response.

@xmonader (Contributor) left a comment:

I had to stop the review.

1- Please make sure all of the indexers have a similar interface
2- Remove everything related to any manual parsing of the dmidecode
3- Prefix all of the constants related to each indexer with the indexer name, e.g. healthIndexerNumWorkers
4- For every indexer there should be:

  • a Start method, can have its own interface Starter
  • a Stop method, can have its own interface Stopper
  • a FindHealthyNodes method
  • a ResultsBatcher method
  • a DBBatchesUpserter method

All of them can be converted to functions whenever needed.

Each indexer file should have its logic laid out top to bottom, e.g.:

  • consts
  • vars
  • types
  • factories
  • entry point Start function
  • then each of the methods asked for

Note: consts and vars should be prefixed with the indexer name

Use the same terms everywhere, e.g. status/state, numWorkers/workersCount, etc. All of these need to be unified.

In an upcoming improvement we can use singleflight to avoid expensive DB/network calls, e.g. when all indexers are asking for the healthy nodes within the same timeframe.

Aside from that, the PR already has too many concerns: supporting dmi, refactoring the indexers, introducing more dependencies. Please make sure future PRs are scoped.
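A rough sketch of the per-indexer shape asked for above; the method names follow the review, everything else (struct name, fields) is illustrative:

```go
package indexer

import "context"

// Starter and Stopper can be split into their own small interfaces, per the review.
type Starter interface {
	Start(ctx context.Context)
}

type Stopper interface {
	Stop(ctx context.Context)
}

// healthIndexer is an illustrative indexer laid out as requested:
// consts, vars, types, factories, the Start entry point, then the worker methods.
type healthIndexer struct {
	// db handle, rmb client, interval, worker count, channels, ...
}

func (h *healthIndexer) Start(ctx context.Context)             { /* launch the loops below */ }
func (h *healthIndexer) Stop(ctx context.Context)              { /* cancel and drain */ }
func (h *healthIndexer) FindHealthyNodes(ctx context.Context)  { /* query candidate nodes */ }
func (h *healthIndexer) ResultsBatcher(ctx context.Context)    { /* group results into batches */ }
func (h *healthIndexer) DBBatchesUpserter(ctx context.Context) { /* upsert batches into the DB */ }
```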

…exers:

- configure a flag to stop or start the indexer manager on startup. It is useful to use the `-no-indexer` flag while running integration tests, to keep the loaded data in the mock client consistent with the actual data in the DB.
- organize the database calls in the `postgres.go` file by separating the indexer calls into a new file. Also, keep the numeric `NodeTwinId` field consistently `uint32` in all methods
- remove the duplicated types across the different pkgs (db/tests/generator) and use only the ones in `pkg/types`
- add a new trigger on the `node_gpu` table; this trigger is fired on insert/delete to update the `num_gpu` field on the `resources_cache` table
- remove all implementations of the `Scanner/Valuer` interfaces for jsonb fields in the gorm models. Instead use the registered json serializer, which is easier and cleaner. Apply this to the dmi fields/node power object
- modify the generation of `healthy_reports` mock data to use `gorm` functionality instead of creating a plain SQL string
- update and organize the schema file used for the db mock generation so it is aligned with the newly defined struct models.

Indexer pkg changes:

- import types directly from zos when needed instead of re-implementing them; applied for IperfTestResult and DMI
- create util methods, generic methods that can be used by all indexers to avoid repeating code
- create a manager that registers/controls all the indexers
- define an `Indexer` interface that defines the needed methods for each indexer
- document the indexer package along with the registered indexers and their metadata
- implement the `Indexer` interface for each registered indexer; this requires some refactoring, especially for the GPU indexer
- a better decision on old GPU node deletion by adding a timestamp expiration and checking whether it has expired on each upsert; the allowed interval is the same as the check interval in the indexer, an hour for now. This is better than checking the lastAdded twin id because we have multiple callers/batchers that may conflict on the results.
- add a new trigger for the dmi indexer to start querying newly added nodes without waiting for the next rotation
- correctly handle the indexer manager start condition (!noIndexer) after debugging
- change the updated_at field on the gpu table in the schema to a bigint to align with the struct model
- use gorm instead of plain sql for scanning in the loaders for both nodes/dmi, to benefit from the registered json serializer instead of implementing the scanning logic for the jsonb fields
@Omarabdul3ziz (Contributor Author):

Update changes:

  • configure a flag to stop or start the indexer manager on startup. It is useful to use the -no-indexer flag while running integration tests, to keep the loaded data in the mock client consistent with the actual data in the DB.
  • organize the database calls in the postgres.go file by separating the indexer calls into a new file. Also, keep the numeric NodeTwinId field consistently uint32 in all methods
  • remove the duplicated types across the different pkgs (db/tests/generator) and use only the ones in pkg/types
  • add a new trigger on the node_gpu table; this trigger is fired on insert/delete to update the num_gpu field on the resources_cache table
  • remove all implementations of the Scanner/Valuer interfaces for jsonb fields in the gorm models. Instead use the registered json serializer, which is easier and cleaner. Apply this to the dmi fields/node power object (see the sketch after this comment)
  • modify the generation of healthy_reports mock data to use gorm functionality instead of creating a plain SQL string
  • update and organize the schema file used for the db mock generation so it is aligned with the newly defined struct models.

Indexer pkg changes:

  • import types directly from zos when needed instead of re-implementing them; applied for IperfTestResult and DMI
  • create util methods, generic methods that can be used by all indexers to avoid repeating code
  • create a manager that registers/controls all the indexers
  • define an Indexer interface that defines the needed methods for each indexer
  • document the indexer package along with the registered indexers and their metadata
  • implement the Indexer interface for each registered indexer; this requires some refactoring, especially for the GPU indexer
  • a better decision on old GPU node deletion by adding a timestamp expiration and checking whether it has expired on each upsert; the allowed interval is the same as the check interval in the indexer, an hour for now. This is better than checking the lastAdded twin id because we have multiple callers/batchers that may conflict on the results.
  • add a new trigger for the dmi indexer to start querying newly added nodes without waiting for the next rotation
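For reference, a minimal sketch of the registered-json-serializer approach that replaces the hand-written Scanner/Valuer pair; the model and field names are illustrative, not the PR's actual schema:

```go
package types

// Node is an illustrative gorm model: tagging a struct field with
// serializer:json makes gorm (de)serialize it as jsonb using its built-in
// json serializer, with no hand-written Scanner/Valuer needed.
type Node struct {
	NodeTwinID uint32  `gorm:"primaryKey"`
	Dmi        DmiInfo `gorm:"type:jsonb;serializer:json"`
}

// DmiInfo holds only the selected dmi fields (illustrative shape).
type DmiInfo struct {
	BIOS      string `json:"bios"`
	Baseboard string `json:"baseboard"`
}
```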

flag.UintVar(&f.dmiIndexerIntervalMins, "dmi-indexer-interval", 60*24, "node dmi check interval in min")
flag.UintVar(&f.dmiIndexerNumWorkers, "dmi-indexer-workers", 1, "number of workers checking on node dmi")
flag.UintVar(&f.speedIndexerIntervalMins, "speed-indexer-interval", 5, "node speed check interval in min")
flag.UintVar(&f.speedIndexerNumWorkers, "speed-indexer-workers", 100, "number of workers checking on node speed")
Member:

Just wondering why we need multiple sets of indexer workers to query things from nodes; can't we have a single set of workers that asks the node for everything?

I mean, I see here dmi-indexer-workers, gpu-indexer-workers, speed-indexer-workers (whatever that means), etc...

This is just a question, not a request for change.

Comment on lines 48 to 49
go n.StartNodeFinder(ctx)
go n.startNodeTableWatcher(ctx)
Member:

Do these methods need to be public? I believe they are intended for private use by the indexer object. I suggest they are renamed to be private instead.

Member:

This applies to all other indexers here

batchSize uint
}

func NewNodeHealthIndexer(
Member:

I can see that an indexer mainly does the same operations, so I was wondering whether the indexer logic can be abstracted more.
From what I see, an indexer has a set of options (say workers), it runs indexing every certain amount of time, then does some sort of batching to update the db.
All this logic can then be unified. Then each indexer needs to define 2 things:

  • The function that does the "fetch" of the data.
  • The logic of "batch" db update.

Those 2 functions can then be part of an IndexerWork interface, which can have multiple implementations (say gpu and dmi); an Indexer instance can then be created for each, something like NewIndexer(GPUIndexerWork()) for example. Then call indexer.Start().
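A compressed sketch of that shape, roughly matching the generic indexer described in the follow-up commits; the Work interface and the field names here are illustrative:

```go
package indexer

import (
	"context"
	"time"
)

// Work is the per-resource job: how to fetch data from a node and how to
// upsert a batch into the DB (illustrative, based on the suggestion above).
type Work[T any] interface {
	Get(ctx context.Context, twinID uint32) ([]T, error)
	Upsert(ctx context.Context, batch []T) error
}

// Indexer drives any Work on an interval with a worker pool and batching.
type Indexer[T any] struct {
	work      Work[T]
	interval  time.Duration
	batchSize int
	twinsChan chan uint32
	batchChan chan T
}

func NewIndexer[T any](work Work[T], interval time.Duration, batchSize int) *Indexer[T] {
	return &Indexer[T]{
		work:      work,
		interval:  interval,
		batchSize: batchSize,
		twinsChan: make(chan uint32),
		batchChan: make(chan T),
	}
}

// Start launches the finder, getter, and upserter loops (bodies elided).
func (i *Indexer[T]) Start(ctx context.Context) {
	go i.runFinders(ctx)
	go i.runGetters(ctx)
	go i.runUpserters(ctx)
}

func (i *Indexer[T]) runFinders(ctx context.Context)   { /* tick every interval, push twin ids */ }
func (i *Indexer[T]) runGetters(ctx context.Context)   { /* call work.Get, push results */ }
func (i *Indexer[T]) runUpserters(ctx context.Context) { /* batch results, call work.Upsert */ }
```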

Description:

- redefine the indexer structs into a single generic indexer without a manager; every indexer does the generic work of find/get/batch/upsert results, with a different job based on the type defined when creating the indexer.
- each indexer job should only handle its specific concerns:
    - defining its finders or triggers
    - preparing the rmb call to the node, parsing the response, and returning a ready db model that can be stored
    - upserting its results by updating, discarding, or inserting the passed arguments
- the indexer code is the one responsible for the shared logic, by:
    - executing the finder methods for each indexer, as defined in its job, based on the interval
    - starting the getter workers that invoke the get method on the work
    - starting a batcher for each indexer that has two channels as its ends
    - starting the upserter workers that invoke the upsert method on the work

Changes:

- remove the indexer manager in favor of easily starting a separate generic indexer per resource type.
- remove the redundant `NewIndexer` `Start` `Find` `Batch` methods from each indexer in favor of generic methods on the `Indexer[T any]` struct
- update the return of the get method on the work to be `[]T` instead of `T`, to cover all indexer response types, like the gpu call which returns a slice of gpus
- create a finders map for easy reusability of the finder methods for new/up/healthy nodes
- check the uniqueness of the nodes in a batch before inserting, to prevent duplicated data coming from two different triggers like healthy/up or up/new (see the sketch after this list)
- modify the default sql-log-level in the makefile
- add more debug logging statements
- update docs
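A small sketch of that uniqueness check on a batch before it is upserted; GetTwinID is a hypothetical accessor, not the PR's actual model API:

```go
package indexer

// dedupe keeps one entry per twin id in a batch before it is upserted, so
// results arriving from two triggers (e.g. healthy/up) don't collide.
func dedupe[T interface{ GetTwinID() uint32 }](batch []T) []T {
	seen := make(map[uint32]int, len(batch))
	out := make([]T, 0, len(batch))
	for _, item := range batch {
		if idx, ok := seen[item.GetTwinID()]; ok {
			out[idx] = item // keep the most recent result for this twin
			continue
		}
		seen[item.GetTwinID()] = len(out)
		out = append(out, item)
	}
	return out
}
```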
@Omarabdul3ziz force-pushed the development_proxy_indexer branch from b7916c1 to 2db2a21 on February 25, 2024 12:22
@Omarabdul3ziz (Contributor Author):

Converting to draft so it is not merged until 3.13 reaches mainnet.

@Omarabdul3ziz marked this pull request as draft on February 27, 2024 10:06
@Omarabdul3ziz force-pushed the development_proxy_indexer branch from c8f7886 to 119a62a on March 12, 2024 08:26
@xmonader (Contributor):

Let's go! ("yalla beena")

@Omarabdul3ziz merged commit 3f788cb into development on Mar 20, 2024
37 checks passed
@Omarabdul3ziz deleted the development_proxy_indexer branch on March 20, 2024 08:21