
Add persistent storage to agent #131

Merged 1 commit into kubev2v:main on Feb 6, 2025

Conversation

borod108
Collaborator

Add persistent storage to the agent and move the credentials and agent-id there.

return string(bytes.TrimSpace(content)), nil
}

fmt.Println("datadir: ", a.config.DataDir)
Collaborator

Please use logging instead of fmt.Println
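For example, something along these lines (just a sketch; it assumes the global zap sugared logger installed via zap.ReplaceGlobals above, and the log level is a guess):

```go
// Sketch: use the global zap sugared logger instead of fmt.Println.
zap.S().Infof("datadir: %s", a.config.DataDir)
```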

Collaborator Author

oops :)


storage:
  filesystems:
    - path: /var/lib/data
      device: /dev/sda
Collaborator

I think it's safe for now, but maybe in the future we should improve it to use the ID?

Collaborator Author

good point, future improvement!

@@ -77,9 +76,9 @@ func (a *agentCmd) Execute() error {
undo := zap.ReplaceGlobals(logger)
defer undo()

agentID, err := a.readFile(agentFilename)
agentID, err := a.readPersistentFile(agentFilename)
Collaborator

Why the change of name here? What value does Persistent bring?

Collaborator

Why not just change the initial method and add a parameter with the folder, or better, pass the whole path and do the join there?

Collaborator Author

There is some code duplication here indeed; let me think about how to refactor it. Obviously a common underlying function with a parameter for the base dir, but I still want two separate names to call it, so that I do not need to think about the right config option and only need to know whether I am reading from persistent storage or the volatile one.
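A minimal sketch of what that refactor could look like (helper and field names such as readFileFrom and PersistentDataDir are assumptions, not the final code, and the existing bytes/os/path/filepath imports are assumed):

```go
// readFileFrom is the hypothetical shared helper: it joins a base directory
// with the filename, reads the file and trims surrounding whitespace.
func (a *agentCmd) readFileFrom(baseDir, filename string) (string, error) {
	content, err := os.ReadFile(filepath.Join(baseDir, filename))
	if err != nil {
		return "", err
	}
	return string(bytes.TrimSpace(content)), nil
}

// readPersistentFile reads from the persistent data dir (survives VM restarts).
func (a *agentCmd) readPersistentFile(filename string) (string, error) {
	return a.readFileFrom(a.config.PersistentDataDir, filename)
}

// readFile keeps its current name and reads from the volatile data dir.
func (a *agentCmd) readFile(filename string) (string, error) {
	return a.readFileFrom(a.config.DataDir, filename)
}
```

That way each call site only says which kind of storage it reads from, and the base-dir choice lives in one place.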

if err != nil {
zap.S().Fatalf("failed to retreive agent_id: %v", err)
zap.S().Fatalf("failed to retrieve agent_id: %v", err)
Collaborator

in another PR

Collaborator Author
@borod108 Jan 22, 2025

:/ I thought I would sneak in this one-line change, which is in the execution path of what I am working on - but ok.

@@ -179,6 +187,20 @@ func (o *Ova) ovfSize() (int, error) {
return calculateTarSize(int(fileInfo.Size())), nil
}

func (o *Ova) diskSize() (int, error) {
file, err := os.Open("data/persistence-disk.vmdk")
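For context, a hedged guess at the rest of this function, mirroring ovfSize above (not necessarily the exact code in the PR):

```go
func (o *Ova) diskSize() (int, error) {
	file, err := os.Open("data/persistence-disk.vmdk")
	if err != nil {
		return 0, err
	}
	defer file.Close()

	fileInfo, err := file.Stat()
	if err != nil {
		return 0, err
	}
	// As in ovfSize, the raw file size is rounded up to the space the disk
	// will occupy inside the tar-format OVA.
	return calculateTarSize(int(fileInfo.Size())), nil
}
```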
Collaborator

I just wonder if we really need to bring the disk here; couldn't the user create it when deploying the OVA?

Collaborator

Can't we just leave the option to the user? If they decide to add a disk, we move the data folder there; if not, we proceed as planned.

Collaborator Author

I am not sure what the benefit would be. Currently it is a lazily provisioned, very small file; it is now part of the OVA and will be deployed on the same storage the user selects during installation.
There is higher complexity in allowing the user to add a disk: we do not know whether it is SATA or iSCSI, and we are not sure of its ID or name. What if the user wants to add a disk for a different purpose and wipes it, so we lose the credentials and the ID? Do we require that they add a specific label? What if they remove the disk later (it is their disk...)?
I think, at least for persistence of critical elements like credentials, this is the better solution, but I am open to being convinced.

Collaborator

Yep, makes sense. I just wasn't sure whether, during deployment, the user still needs to select a disk for the second VMDK, but I understand they don't have to and they pick the storage just once for both, which is great. So let's do it your way for sure.

Collaborator

I wasn't sure how the deployment process would look. If we automate the disk, it's fine then.

Just a remark: we never lose the ID; the agent_id is created when the agent service starts.

@tupyy
Collaborator

tupyy commented Jan 22, 2025

Is this optional for the user? Why not move the data folder onto the new disk and avoid refactoring the code?
@machacekondra

@machacekondra
Collaborator

> Is this optional for the user? Why not move the data folder onto the new disk and avoid refactoring the code? @machacekondra

Well, right, yeah, I think all the data could now be stored on the disk and be persistent, unless I'm missing something.

@borod108
Collaborator Author

borod108 commented Jan 22, 2025

> Well, right, yeah, I think all the data could now be stored on the disk and be persistent, unless I'm missing something.

Well, my thought was that we want to keep the disk minimal and the collected data may be large, and that on reset we want to restart the collection process and the sending of the inventory, right?

@tupyy
Collaborator

tupyy commented Jan 22, 2025

> Well, my thought was that we want to keep the disk minimal and the collected data may be large, and that on reset we want to restart the collection process and the sending of the inventory, right?

Will the disk be smaller than 4 GB? Because if the data is not on the disk, you fill up the memory.
The user cannot know about this "restart" behavior. In the future, I would expect to collect data continuously, not just once. What's the point of keeping the agent running if we don't collect anything?

@borod108
Collaborator Author

> Will the disk be smaller than 4 GB? Because if the data is not on the disk, you fill up the memory. The user cannot know about this "restart" behavior. In the future, I would expect to collect data continuously, not just once. What's the point of keeping the agent running if we don't collect anything?

It is currently 50 MB; it could actually be 5 for what we are doing now. I'm not sure what "the user cannot know this restart behavior" means exactly; the point of this entire change is to improve restart behavior, which is our current way of restarting collection.

I fully agree we should have collection for a running agent. We do not have it now, and it is not the goal of this PR to add it; do we have a timeline for this?

@machacekondra
Collaborator

machacekondra commented Jan 23, 2025

Well, the truth is that currently the only way to collect the inventory again is to restart the VM, unless the user has SSH access to the VM and restarts the agent. We don't have any option to re-collect data via a button or something, which we can of course add ;)
But for now let's persist only the credentials, so we can say the user can re-collect data via a VM restart.

@app-sre-bot
Collaborator

Can one of the admins verify this patch?

@machacekondra
Collaborator

@borod108 Can you please fix the tests?

@borod108 force-pushed the task/unique-vm-id branch 2 times, most recently from c528f1a to 469c025 on February 3, 2025 at 21:59
    kubectl wait --for=condition=Ready pods --all --timeout=240s
    kubectl port-forward --address 0.0.0.0 service/migration-planner-agent 7443:7443 &
    kubectl port-forward --address 0.0.0.0 service/migration-planner 3443:3443 &

- name: Run test
  run: |
    mkdir /tmp/untarova
    cp data/persistence-disk.qcow2 /tmp/untarova/persistence-disk.qcow2
Collaborator

Can we create this file on the fly instead of storing it in git, please?

Collaborator Author

done!

Add persistent storage to the agent and move the credentials and agent-id there.

Signed-off-by: borod108 <[email protected]>
@machacekondra
Collaborator

It would be cool if we had an e2e test for the VM reboot in the future ;) But let's wait for single source.

@machacekondra
Collaborator

@tupyy @nirarg PTAL

Collaborator

@nirarg left a comment

Looks good

@borod108 merged commit 405bac1 into kubev2v:main on Feb 6, 2025
9 checks passed