Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cloudwatch: Consolidate client logic #25555

Merged
merged 77 commits into from
Jul 23, 2020

Conversation

patstrom
Copy link
Contributor

@patstrom patstrom commented Jun 11, 2020

What this PR does / why we need it:
Consolidate logic for getting clients in CloudWatch data source. Also clean up the CloudWatch test suite.

Special notes for your reviewer:

This is part 1 of PRs to replace #23439

It is probably appropriate for @aknuds1 to review this

This commit explicitly specifies and internal interface for the
cloudwatch executor to fetch the different AWS clients.

Previously each kind of client has been implemented in a different way
which was especially painful when implementing tests. With the commit
there is an explicit way to handle creation, fetching and testing of AWS
clients and interactions.

This lays the groundwork for reworking the caches in the cloudwatch
implementation altogether. The idea is to cache everything on a data
source id basis rather than the current cache keys, which are somewhat
arbitrary.
@patstrom patstrom requested a review from a team as a code owner June 11, 2020 21:34
@patstrom patstrom requested review from papagian and removed request for a team June 11, 2020 21:34
@aknuds1 aknuds1 self-assigned this Jun 13, 2020
@aknuds1 aknuds1 self-requested a review June 13, 2020 15:46
@aknuds1 aknuds1 added datasource/CloudWatch pr/external This PR is from external contributor labels Jun 13, 2020
@stale
Copy link

stale bot commented Jun 27, 2020

This pull request has been automatically marked as stale because it has not had activity in the last 2 weeks. It will be closed in 30 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added the stale Issue with no recent activity label Jun 27, 2020
@patstrom
Copy link
Contributor Author

@aknuds1

@stale stale bot removed the stale Issue with no recent activity label Jun 27, 2020
aknuds1 added 16 commits July 3, 2020 11:52
Signed-off-by: Arve Knudsen <[email protected]>
Signed-off-by: Arve Knudsen <[email protected]>
Signed-off-by: Arve Knudsen <[email protected]>
Signed-off-by: Arve Knudsen <[email protected]>
Signed-off-by: Arve Knudsen <[email protected]>
Signed-off-by: Arve Knudsen <[email protected]>
Signed-off-by: Arve Knudsen <[email protected]>
@aknuds1 aknuds1 requested review from kaydelaney and aocenas July 22, 2020 11:50
@patstrom
Copy link
Contributor Author

I had to remove the "caching" of EC2 and RGTA clients in order to unify the interface for fetching in the first place. The other alternative would've been to add a cwSvc cloudwatchiface.CloudWatchAPI field to the cloudwatchExecutor and a corresponding ensureCloudwatchClient function that does a nil-check and I felt that was the wrong way forward.

The Cloudwatch logs client cache I removed because it did a cloudwatchlogs-specific caching on a region basis. While that is what we want (eventually) I thought the proper way was to remove it for now in order to be able to unify the interface and then later on start caching the sessions on a region-basis instead. That way all the different clients can use the same cache-implementation.

So the idea here moving forward is that once there is a unified interface for fetching AWS clients we can move on to fix the now buggy cache by using the data source id as key and start caching the actual Session (on a region-basis) instead of Credentials, which will be a performance improvement.

@aknuds1
Copy link
Contributor

aknuds1 commented Jul 22, 2020

@patstrom So if we ignore the reasons why you had to remove the caching in the first place, what would break if we re-introduce client caching (according to the original naive scheme) in this PR? I'm leaning towards re-introducing client caching in this PR, so we don't break anything and instead propose an improved(?) caching system in another PR.

I think technically, reintroducing client caching will be easy.

@patstrom
Copy link
Contributor Author

patstrom commented Jul 22, 2020

The reasons for actually removing the caching in this PR is purely in order to de-clutter the code base. Re-introducing the very same ec2Svc, rgtaSvc and logsClientsByRegion and instead manipulating them in the corresponding "get" function shouldn't break anything when compared to the previous behavior (as far as I can tell).

Do note that all caching in this package is buggy though. Nothing is properly invalidated. E.g if you provide a working configuration, do some queries (and stuff is cached) and then update the configuration to something that shouldn't work, it still will (in some cases).

I guess it is a matter of taste if you want to remove it now or later (either when replacing them with something proper or just removing them to fix the bugs). Personally I felt that given that they are buggy and that there are two different client cache implementations I deemed it better to just remove them since they made the code more difficult to reason about.

@aknuds1
Copy link
Contributor

aknuds1 commented Jul 22, 2020

Thanks for clarifying that @patstrom. It sounds as if I might just code up a reintroduction of caching for this PR, to keep the current scheme (for the time being?), unless my colleagues think differently.

@mtanda
Copy link
Collaborator

mtanda commented Jul 22, 2020

@mtanda Do you know why CloudWatch logs, EC2 and RGTA clients are cached (while the CloudWatch client is not)? FYI, this PR proposes to drop the caching of clients.

I didn't intend to cache client, I need to use mock client for test code.
5255c9c
(Sorry, I didn't read discussion)

@aknuds1
Copy link
Contributor

aknuds1 commented Jul 22, 2020

@mtanda Aha, thanks for clarifying that.

@aknuds1
Copy link
Contributor

aknuds1 commented Jul 23, 2020

@patstrom I've re-introduced client caching in this branch. I also removed initialization of cloudWatchExecutor.DataSource when it gets instantiated, since I realized it always gets initialized in the Query method anyway. Also fixed a merge conflict. Please look my changes over and merge them into your branch, or let me know if you see any problems.

Copy link
Contributor

@aknuds1 aknuds1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now, but there's a remaining minor merge conflict. I've fixed it in this branch.

@aknuds1 aknuds1 merged commit 43ef052 into grafana:master Jul 23, 2020
@aknuds1 aknuds1 changed the title cloudwatch: Explicitly specify internal client interface cloudwatch: Consolidate client logic Jul 23, 2020
@aknuds1 aknuds1 added this to the 7.2 milestone Jul 23, 2020
@aknuds1
Copy link
Contributor

aknuds1 commented Jul 23, 2020

Thank you for contributing to Grafana!

@patstrom
Copy link
Contributor Author

Cheers! 🎉

Do you want me to update the other MRs to be based on this instead?

@aknuds1
Copy link
Contributor

aknuds1 commented Jul 23, 2020

@patstrom Just base them (rebase, merge, whatever) on master, since this PR has been merged. We'll evaluate them as we find the time.

@patstrom
Copy link
Contributor Author

Ah yeah that's what I meant 😅

xlson added a commit to xlson/grafana that referenced this pull request Jul 24, 2020
* master: (195 commits)
  update latest to 7.1.1 (grafana#26588)
  changelog 7.1.1 (grafana#26586)
  Drone: Reduce parallelism of certain steps to conserve memory (grafana#26582)
  CloudWatch: Fix a few API status codes (grafana#26578)
  Drone: Identify Drone runner for each pipeline (grafana#26573)
  DashboardImport: Fix variable interpolation when property contains multiple variable expressions (grafana#26574)
  Update TextArea.mdx (grafana#26225)
  CloudWatch: Fix passing of namespace for getting custom metrics (grafana#26515)
  Alerting: Change settings column type to mediumtext (grafana#26549)
  Dashboard: Restore panel edit permission check, fixes 26555' (grafana#26556)
  Docs: Add variable examples (grafana#26565)
  Graph: support setting field units (grafana#26529)
  cloudwatch: Consolidate client logic (grafana#25555)
  Docs: Add dependencies for debian buster to image rendering plugin (grafana#26452)
  Fix self cloosing bracket to display input (grafana#26552)
  SQLite: Set 0640 permissions on SQLite database file (grafana#26339)
  Docs: Remove TODO from the CloudWatch docs (grafana#26543)
  Cloudwatch: Add af-south-1 region (grafana#26513)
  LDAP: fix LDAP test with special chars in username (grafana#26539)
  HTTPServer: add possibility to use additional middlewares (grafana#26514)
  ...
njvrzm pushed a commit to grafana/grafana-cloudwatch-datasource that referenced this pull request Apr 2, 2025
* cloudwatch: Consolidate client logic

Co-authored-by: Arve Knudsen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
datasource/CloudWatch pr/external This PR is from external contributor
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants