Add AzureSQL short term retention policies #1355

Merged

matthchr
Member

Closes #1302

What this PR does / why we need it:
Adds support for Azure SQL short term retention alongside existing support for long term retention on the AzureSQLDatabase object.

If applicable:

  • this PR contains documentation
  • this PR contains tests

@matthchr matthchr force-pushed the feature/azure-sql-short-term-retention branch from 50c6570 to cacaf32 Compare January 12, 2021 23:11
Member

@babbageclunk babbageclunk left a comment

Mostly looks good, but shouldn't we be calling Result on the policy futures coming back from CreateOrUpdate?

pkg/resourcemanager/azuresql/azuresqldb/azuresqldb.go
return nil, err
}

return &future, err
Member

I was trying to find the corresponding change in the call to AddLongTermRetention but couldn't see it in the diff - I'm guessing because the calling code just ignores the response/future. Is there any problem with not calling .Response on the future? I guess it's not waiting until the operation has finished without that. Shouldn't we be calling Result on them so we can see any error from the operation?

Member Author

Calling Response doesn't wait until the operation has finished either, as the implementation for Response just does:

// Response returns the last HTTP response.
func (f Future) Response() *http.Response {
	if f.pt == nil {
		return nil
	}
	return f.pt.latestResponse()
}

and the comment on f.pt.latestResponse() just says: // returns the cached HTTP response after a call to pollForStatus(), can be nil

So if, for example, the operation took more than a single polling interval, or we called Response too quickly after starting it, the result would be nil - so I'm pretty sure we were doing the wrong thing before as well, and what I have here is effectively the same as what we had before. Doubly so because the result of Response() was always being ignored anyway.
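To make the distinction concrete, here is a minimal sketch of the behavior described above. The fakeFuture type and its Poll method are hypothetical stand-ins written for this illustration, not the real SDK future; the only thing taken from the snippet above is that Response returns a cached value that can be nil. The sketch assumes that an operation's error only becomes visible after polling has run to completion.

```go
package main

import (
	"errors"
	"fmt"
)

// errOutOfRange is a made-up operation failure used only in this sketch.
var errOutOfRange = errors.New("retention days out of range")

// fakeFuture is a hypothetical stand-in for a long-running-operation future.
// It is NOT the SDK type; it only mimics the shape discussed above: a cached
// last response that stays nil until a poll has happened.
type fakeFuture struct {
	pollsLeft    int     // polls remaining before the operation finishes
	lastResponse *string // cached status from the most recent poll, may be nil
	opErr        error   // error the operation eventually reports
}

// Response returns the cached status without polling or waiting.
func (f *fakeFuture) Response() *string { return f.lastResponse }

// Poll performs one status check and reports whether the operation is done.
func (f *fakeFuture) Poll() bool {
	if f.pollsLeft > 0 {
		f.pollsLeft--
	}
	status := "InProgress"
	if f.pollsLeft == 0 {
		status = "Completed"
	}
	f.lastResponse = &status
	return f.pollsLeft == 0
}

// Result surfaces the operation's outcome, but only once polling has finished.
func (f *fakeFuture) Result() error {
	if f.pollsLeft > 0 {
		return errors.New("operation not finished")
	}
	return f.opErr
}

func main() {
	f := &fakeFuture{pollsLeft: 2, opErr: errOutOfRange}
	fmt.Println("response before polling is nil:", f.Response() == nil)
	for !f.Poll() {
		// A real controller would requeue here rather than spin.
	}
	fmt.Println("result after polling:", f.Result())
}
```

The point of the sketch is only that reading the cached response neither waits nor reports the operation's error; something has to drive the polling before a result can be checked.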

I see a few paths towards fixing this...

  1. Do the wait inline. This would work, and it'd probably be fast as I don't think these operations take a long time, but it has the disadvantage of breaking a "rule" of Kubernetes controllers, which seems to be that you don't loop inside the Reconcile function; instead you set a variable and let Reconcile call you again (respecting the backoff, etc. configured for the operator as a whole).
  2. Set up a state machine infrastructure so that we can go through the required workflow for the DB. The workflow is something like: Create DB -> poll create DB LRO -> Set LongTermRetention -> wait for LongTermRetention LRO -> Set ShortTermRetention -> wait for ShortTermRetention LRO -> Set "complete".
  3. Similar to 2 above but rather than thinking of it as states (which I think Kubernetes doesn't really love), just do a delta comparison to each entity in Azure and set them one at a time. I think the workflow would be something like this:
    a. Poll LRO if we have one - if not done just keep waiting, if done check result. Will need error handling for each type of LRO.
    b. Does DB exist? If not, create and store LRO. If yes, compare with Spec. If different post and store LRO. If same continue.
    c. Does LongTermRetention match spec? If no, post and store LRO. If yes continue.
    d. Does ShortTermRetention match spec? If no, post and store LRO. If yes continue.
    e. Set provisioned = true
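As a rough illustration of option 3, the sketch below walks steps a through e with at most one action per reconcile call. Everything in it is hypothetical: the spec and fakeAzure types, the reconcileOnce function, and the assumption that every LRO completes on its first poll are all invented for this example, and the per-LRO error handling and the DB-versus-Spec comparison from step b are left out to keep it short.

```go
package main

import "fmt"

// All types here are hypothetical and exist only for illustration; they are
// not the operator's real types.

// spec is the desired state taken from the Kubernetes object.
type spec struct {
	LongTermWeeks int
	ShortTermDays int
}

// fakeAzure models the observed state in Azure plus at most one in-flight
// long-running operation (LRO).
type fakeAzure struct {
	dbExists      bool
	longTermWeeks int
	shortTermDays int
	pending       func() // applies the in-flight operation when it completes
}

// reconcileOnce performs at most one action per call and reports whether the
// resource is fully provisioned. The caller requeues until it returns true,
// so nothing loops or blocks inside the function.
func reconcileOnce(az *fakeAzure, want spec) (provisioned bool, action string) {
	// a. An LRO is outstanding: poll it and wait for the next reconcile.
	if az.pending != nil {
		az.pending() // assumption: every LRO completes on its first poll
		az.pending = nil
		return false, "polled LRO"
	}
	// b. Create the database if it does not exist yet.
	if !az.dbExists {
		az.pending = func() { az.dbExists = true }
		return false, "create DB"
	}
	// c. Bring long term retention in line with the spec.
	if az.longTermWeeks != want.LongTermWeeks {
		az.pending = func() { az.longTermWeeks = want.LongTermWeeks }
		return false, "set long term retention"
	}
	// d. Bring short term retention in line with the spec.
	if az.shortTermDays != want.ShortTermDays {
		az.pending = func() { az.shortTermDays = want.ShortTermDays }
		return false, "set short term retention"
	}
	// e. Nothing left to change.
	return true, "provisioned"
}

func main() {
	az := &fakeAzure{}
	want := spec{LongTermWeeks: 4, ShortTermDays: 7}
	for {
		done, action := reconcileOnce(az, want)
		fmt.Println(action)
		if done {
			break
		}
	}
}
```

The loop in main only stands in for the controller calling Reconcile repeatedly; in a real operator each iteration would be a separate requeue with backoff.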

I think the right thing to do is technically option 3, which also does away with the spec JSON hash checking in favor of an actual diff with Azure (which has the added benefit of allowing us to correct differences in Azure that Kubernetes didn't know about). The issue is that options 2 and 3 (the ones that fix this the "right" way) are both big undertakings that would effectively require a full rewrite of the SQL DB reconciler. That introduces more risk and also duplicates effort, given we're tracking towards a generic implementation that does exactly the above in the code-generated path.
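A small sketch of why a diff with Azure can catch things a spec hash cannot. The retentionSpec type and the three helper functions are hypothetical, and hashing the JSON with SHA-256 is an assumption made for the example rather than a description of how the operator computes its hash.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
)

// retentionSpec is a hypothetical, trimmed-down spec used only for this sketch.
type retentionSpec struct {
	ShortTermDays int `json:"shortTermDays"`
}

// specHash hashes the JSON form of the spec. SHA-256 is an assumption here.
func specHash(s retentionSpec) string {
	b, _ := json.Marshal(s) // cannot fail for a struct of plain ints
	return fmt.Sprintf("%x", sha256.Sum256(b))
}

// needsUpdateByHash only notices that the spec changed since it was last applied.
func needsUpdateByHash(s retentionSpec, lastAppliedHash string) bool {
	return specHash(s) != lastAppliedHash
}

// needsUpdateByDiff compares the spec with what is actually observed in Azure.
func needsUpdateByDiff(s retentionSpec, observed retentionSpec) bool {
	return s != observed
}

func main() {
	want := retentionSpec{ShortTermDays: 7}
	lastApplied := specHash(want) // recorded when the spec was last applied

	// Someone changed the policy in Azure without touching the Kubernetes object.
	observed := retentionSpec{ShortTermDays: 1}

	fmt.Println("hash check sees a change:", needsUpdateByHash(want, lastApplied))
	fmt.Println("diff sees a change:", needsUpdateByDiff(want, observed))
}
```

With an unchanged spec the hash check has nothing to react to, while the diff against the observed state still reports that an update is needed.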

Member Author

Effectively, I think this is a situation where, yes, things are not ideal, but this is far from the only place in the operator where that's true. It's not clear to me that building bespoke infrastructure to solve this problem in ASO is the right thing when we have a generic one coming, so it might just be best to live with it for now?

Member

I see - yeah, if this is an existing issue then I think you're right that we can land this as is and fix it in the generic case. I think option 3 would be the right plan as well if we were doing that.

@matthchr matthchr merged commit 10e3c6e into Azure:master Jan 13, 2021
@matthchr matthchr deleted the feature/azure-sql-short-term-retention branch January 13, 2021 22:17
babbageclunk pushed a commit to babbageclunk/azure-service-operator that referenced this pull request Jan 18, 2021
* Add AzureSQL short term retention policies
Successfully merging this pull request may close these issues.

Feature: Add support for SQL short term retention