-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Restrict cron execution for single pod in clustered environment #39171
base: release
Are you sure you want to change the base?
Conversation
WalkthroughThis pull request introduces several enhancements across the codebase. It adds detailed logging in exception handling, refactors lambdas to method references, and incorporates new field constants using annotations. Dependency injection is enhanced with additional services, and repository methods are expanded to support reactive data retrieval. Moreover, method signatures in services and tests are updated to improve reactive programming flows. Finally, distributed locking is implemented via a new annotation and aspect for managing cross-instance coordination. Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant Aspect as DistributedLockAspect
participant Redis
participant Target as TargetMethod
Caller->>Aspect: Call method annotated with @DistributedLock
Aspect->>Redis: setIfAbsent(lockKey, ttl)
Redis-->>Aspect: Lock Acquired?
alt Lock Acquired
Aspect->>Target: Invoke target method
Target-->>Aspect: Return result
Aspect->>Redis: releaseLock(lockKey)
Aspect-->>Caller: Return result
else Lock Not Acquired
Aspect-->>Caller: Return error or fallback
end
sequenceDiagram
participant Scheduler as ScheduledTaskCEImpl
participant OrgService as OrganizationService
participant FFService as FeatureFlagServiceCE
Scheduler->>OrgService: retrieveAll() (returns Flux<Organization>)
OrgService-->>Scheduler: List of Organizations
loop For Each Organization
Scheduler->>FFService: getAllRemoteFeaturesForOrganizationAndUpdateFeatureFlagsWithPendingMigrations(org)
FFService-->>Scheduler: Updated Organization
end
Scheduler-->>Scheduler: Complete fetchFeatures
Suggested labels
Suggested reviewers
Poem
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
…run-cron-by-single-pod
app/server/appsmith-server/src/main/java/com/appsmith/server/domains/Tenant.java
Outdated
Show resolved
Hide resolved
app/server/appsmith-server/src/main/java/com/appsmith/server/domains/TenantConfiguration.java
Outdated
Show resolved
Hide resolved
...rver/appsmith-server/src/main/java/com/appsmith/server/domains/ce/TenantConfigurationCE.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🔭 Outside diff range comments (1)
app/server/appsmith-server/src/main/java/com/appsmith/server/aspect/FeatureFlaggedMethodInvokerAspect.java (1)
87-91
: 🛠️ Refactor suggestionRemove redundant exception logging.
The exception is being logged twice:
- Line 87:
log.error("Exception while invoking super class method", e)
- Line 90:
log.error(exception.getMessage(), e)
This creates duplicate entries in the logs with the same stack trace.
Apply this diff to remove the redundant logging:
- log.error("Exception while invoking super class method", e); String errorMessage = "Exception while invoking super class method"; AppsmithException exception = getInvalidAnnotationUsageException(method, errorMessage); log.error(exception.getMessage(), e); throw exception;
🧹 Nitpick comments (17)
app/server/appsmith-server/src/main/java/com/appsmith/server/configurations/InstanceConfig.java (2)
67-69
: Consider adding @todo for multi-tenancy implementation.The comment suggests this is related to multi-tenancy, but there's no tracking TODO. Consider adding a TODO comment similar to the one on line 64 to ensure consistent tracking of multi-tenancy related changes.
.then(Mono.defer(instanceConfigHelper::isLicenseValid) - // Ensure that the org feature flags are refreshed with the latest values after completing the - // license verification process. + // TODO: Ensure feature flags update handles multiple organizations when multi-tenancy is implemented .flatMap(isValid -> instanceConfigHelper.updateCacheForOrganizationFeatureFlags()));
69-69
: Verify error handling for feature flags update.The feature flags update operation should include error handling to prevent startup failures.
Consider adding error handling:
- .flatMap(isValid -> instanceConfigHelper.updateCacheForOrganizationFeatureFlags())); + .flatMap(isValid -> instanceConfigHelper.updateCacheForOrganizationFeatureFlags()) + .onErrorResume(error -> { + log.error("Failed to update organization feature flags: {}", error.getMessage()); + return Mono.empty(); + }));app/server/reactive-caching/src/main/java/com/appsmith/caching/aspects/DistributedLockAspect.java (3)
78-79
: Ensure subscription is guaranteed for releasing locks.The
.subscribe()
call is essential for lock release, but if the subscription never executes, locks remain held. Consider guaranteeing a subscription at a controlled lifecycle stage or returning a reactive pipeline to the caller to manage consistently.
104-105
: Reevaluate returningMono.empty()
when the lock is not acquired.Returning an empty
Mono
can mask the “lock unavailable” scenario. You could consider returning an error or a distinct signal if the lock is held by another process, helping callers handle locking conflicts more transparently.
164-165
: Avoid returningnull
in blocking flow.Returning
null
may lead to runtime issues if callers do not handle this case. Returning a fallback object or throwing an exception could make the contract clearer.app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/FeatureFlagServiceCEImpl.java (2)
42-44
: Consider removing the TODO comment and implementing proper documentation.The TODO comment suggests removing this field once helper methods are converted to reactive. This should be tracked in a proper issue and the field should be properly documented until then.
- // TODO remove once all the helper methods consuming @FeatureFlagged are converted to reactive + /** + * Cached organization feature flags. + * Note: This field will be removed once all helper methods consuming @FeatureFlagged are converted to reactive. + */
42-43
: Consider removing the TODO or converting to a tracked issue.
The TODO comment indicates pending technical debt. If the reactive refactor is nearly complete, remove the TODO or continue tracking it in an open issue for clarity.app/server/appsmith-server/src/main/java/com/appsmith/server/solutions/ce/PingScheduledTaskCEImpl.java (2)
139-141
: Consider extracting cloud hosting check to a constant.The cloud hosting check could be moved to a constant for better maintainability.
+ private static final String CLOUD_HOSTING_TODO_COMMENT = "TODO @CloudBilling remove cloud hosting check and migrate the cron to report organization level stats"; + - // TODO @CloudBilling remove cloud hosting check and migrate the cron to report organization level stats if (commonConfig.isTelemetryDisabled() || commonConfig.isCloudHosting()) {
139-140
: Follow up on the TODO comment.
Removing the cloud hosting check eventually may affect telemetry data flows; ensure it’s tracked.app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/OrganizationServiceCEImpl.java (1)
333-346
: Consider using takeUntil instead of take for better readability.The current implementation using
take(1)
could be more clearly expressed usingtakeUntil
.- return this.retrieveAll() - .filter(organization -> - TRUE.equals(organization.getOrganizationConfiguration().getIsRestartRequired())) - .take(1) - .hasElements() + return this.retrieveAll() + .filter(organization -> + TRUE.equals(organization.getOrganizationConfiguration().getIsRestartRequired())) + .takeUntil(org -> TRUE.equals(org.getOrganizationConfiguration().getIsRestartRequired())) + .hasElements()app/server/reactive-caching/src/main/java/com/appsmith/caching/annotations/DistributedLock.java (2)
13-17
: Consider reducing the default TTL for cron-specific scenariosThe default TTL of 5 minutes might be too long for frequently running cron jobs. Consider adding a separate constant for cron-specific TTL or documenting recommended values for different scenarios.
- long ttl() default 5 * 60; // Default TTL: 5 minutes + // Default TTL: 1 minute - suitable for most cron jobs + long ttl() default 60;
19-19
: Document the implications of shouldReleaseLock parameterThe
shouldReleaseLock
parameter needs documentation to explain when it should be set to false and the implications of not releasing the lock.app/server/appsmith-server/src/main/java/com/appsmith/server/repositories/ce/CustomOrganizationRepositoryCEImpl.java (1)
16-21
: Consider adding error handling for the bulk update operationWhile the implementation is clean, consider adding error handling to log and handle potential database operation failures.
@Override public Mono<Integer> disableRestartForAllTenants() { log.info("Disabling restart for all tenants"); return queryBuilder() .criteria(Bridge.isTrue(organizationConfiguration_isRestartRequired)) - .updateAll(Bridge.update().set(organizationConfiguration_isRestartRequired, false)); + .updateAll(Bridge.update().set(organizationConfiguration_isRestartRequired, false)) + .doOnError(error -> log.error("Error while disabling restart for all tenants", error)); }app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/FeatureFlagServiceCE.java (1)
41-41
: Consider adding Javadoc for the new method.The new method lacks documentation unlike other methods in the interface.
+ /** + * To get features of a specific organization by ID. + * @param orgId The organization ID + * @return Mono of Map containing feature flags + */ Mono<Map<String, Boolean>> getOrganizationFeatures(String orgId);app/server/appsmith-server/src/main/java/com/appsmith/server/solutions/ce/ScheduledTaskCEImpl.java (1)
32-40
: Consider optimizing the reactive chain.The reactive chain could be simplified by using
flatMapMany
instead of separateflatMap
operations.- organizationFlux - .flatMap( - featureFlagService - ::getAllRemoteFeaturesForOrganizationAndUpdateFeatureFlagsWithPendingMigrations) - .flatMap(featureFlagService::checkAndExecuteMigrationsForOrganizationFeatureFlags) + organizationFlux.flatMapMany(org -> + featureFlagService + .getAllRemoteFeaturesForOrganizationAndUpdateFeatureFlagsWithPendingMigrations(org) + .flatMap(featureFlagService::checkAndExecuteMigrationsForOrganizationFeatureFlags))app/server/appsmith-server/src/main/java/com/appsmith/server/repositories/BaseRepositoryImpl.java (1)
104-109
: Add security documentation for retrieveAll method.The method bypasses security context checks. Add documentation to clarify this is intended for internal use only.
+ /** + * Retrieves all non-deleted entities without security context checks. + * IMPORTANT: This method is intended for internal use only (e.g., cron jobs, system tasks). + * Do not expose this method through public APIs. + * @return Flux of all non-deleted entities + */ @Override public Flux<T> retrieveAll() {app/server/appsmith-server/src/main/java/com/appsmith/server/helpers/ce/InstanceConfigHelperCEImpl.java (1)
242-245
: Enhance error logging for feature flag cache updates.Add more context to the error log to help with debugging.
- log.error("Error while updating cache for org feature flags", error); + log.error("Error while updating cache for org feature flags. Organization features might be in stale state.", error);
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (22)
app/server/appsmith-server/src/main/java/com/appsmith/server/aspect/FeatureFlaggedMethodInvokerAspect.java
(1 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/configurations/InstanceConfig.java
(2 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/domains/Organization.java
(2 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/domains/OrganizationConfiguration.java
(1 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/domains/ce/OrganizationConfigurationCE.java
(3 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/helpers/InstanceConfigHelperImpl.java
(3 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/helpers/ce/InstanceConfigHelperCEImpl.java
(3 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/migrations/JsonSchemaVersions.java
(1 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/repositories/BaseRepository.java
(2 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/repositories/BaseRepositoryImpl.java
(1 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/repositories/ce/CustomOrganizationRepositoryCE.java
(1 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/repositories/ce/CustomOrganizationRepositoryCEImpl.java
(1 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/FeatureFlagServiceCE.java
(1 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/FeatureFlagServiceCEImpl.java
(4 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/OrganizationServiceCE.java
(2 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/OrganizationServiceCEImpl.java
(3 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/solutions/PingScheduledTaskImpl.java
(3 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/solutions/ce/PingScheduledTaskCEImpl.java
(7 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/solutions/ce/ScheduledTaskCEImpl.java
(3 hunks)app/server/appsmith-server/src/test/java/com/appsmith/server/services/ce/FeatureFlagServiceCETest.java
(5 hunks)app/server/reactive-caching/src/main/java/com/appsmith/caching/annotations/DistributedLock.java
(1 hunks)app/server/reactive-caching/src/main/java/com/appsmith/caching/aspects/DistributedLockAspect.java
(1 hunks)
✅ Files skipped from review due to trivial changes (2)
- app/server/appsmith-server/src/main/java/com/appsmith/server/migrations/JsonSchemaVersions.java
- app/server/appsmith-server/src/main/java/com/appsmith/server/domains/ce/OrganizationConfigurationCE.java
🔇 Additional comments (33)
app/server/appsmith-server/src/main/java/com/appsmith/server/domains/OrganizationConfiguration.java (1)
6-6
: LGTM! Clean implementation of field name constants.The addition of
@FieldNameConstants
and the nestedFields
class provides a type-safe way to reference field names. This is a good practice for maintaining field name references.However, let's verify if this change aligns with the PR's stated objective of restricting cron execution. Please clarify the relationship between these changes and the cron execution functionality.
Also applies to: 10-12
app/server/appsmith-server/src/test/java/com/appsmith/server/services/ce/FeatureFlagServiceCETest.java (2)
208-213
: LGTM! Improved reactive chain implementation.The change from direct blocking call to a reactive chain using
retrieveAll().flatMap()
aligns well with reactive programming principles and the PR's objective of managing feature updates across pods.Also applies to: 231-236, 254-259
271-271
: LGTM! Added user context for test methods.The addition of
@WithUserDetails(value = "api_user")
ensures consistent user context during test execution, matching the pattern used in other test methods.Also applies to: 288-288
app/server/appsmith-server/src/main/java/com/appsmith/server/configurations/InstanceConfig.java (1)
50-50
: LGTM! Good use of method reference.The change from lambda to method reference improves code readability while maintaining the same functionality.
app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/FeatureFlagServiceCEImpl.java (8)
161-172
: LGTM! Improved error handling for missing user context.The error handling for missing user context is well implemented, with a clear error message guiding developers to use the organizationId-based method instead.
174-183
: LGTM! Well-structured method for organization features retrieval.The method properly handles caching and fallback to empty map if no features are found.
16-16
: Lombok import looks fine.
Adding the@Getter
import is standard practice for Lombok usage.
63-65
: Doc comment update is good.
This clarification about user-level and organization-level feature flags is helpful.
122-122
: Return type doc clarifies behavior.
The updated return mention “Mono updated org” aligns with the new implementation.
124-153
: Implementation for updating feature flags looks correct.
This method properly updates the organization configuration and manages migration statuses, aligning well with existing logic.
160-172
: Edge case: Verify behavior when user lacks an organization ID.
If the current user’s organization ID is null, this method will end up returning an empty set of features. Validate this scenario is intentional.
174-183
: Potential concurrency concern with the shared field.
cachedOrganizationFeatureFlags
is set here. If multiple threads call this method simultaneously, consider thread safety if you rely on that field in subsequent operations.app/server/appsmith-server/src/main/java/com/appsmith/server/solutions/ce/PingScheduledTaskCEImpl.java (9)
88-94
: LGTM! Efficient caching of instance and IP data.Good use of caching for instance ID and IP address to avoid redundant calls.
3-3
: Distributed lock import is appropriate.
The import ensures we can use the lock mechanism in a clustered environment.
17-17
: New OrganizationService dependency.
InjectingOrganizationService
is consistent if you need organization data in the scheduled task.
63-63
: Dependency injection for organizationService is noted.
Be mindful of any circular dependencies.
78-81
: Locking thepingSchedule
method is a valid approach.
Ensures that only one pod runs this cron across the cluster.
88-89
: Caching the instance ID and IP.
Both are cached. Confirm it doesn’t expire too soon if needed by subsequent calls.
90-93
: Pinging for every organization.
This currently pings once per organization. Verify if the requirement is to ping multiple times or to combine data.
106-106
: Method signature updated to include organization ID.
This helps track pings at an organization level.
136-136
: Lock forpingStats
is consistent with the approach.
Prevents overlapping job executions in a cluster.app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/OrganizationServiceCEImpl.java (4)
360-363
: LGTM! Clean implementation of retrieveAll method.The method is simple and delegates properly to the repository layer.
29-29
: Flux import is in line with reactive patterns.
No concerns here.
332-345
: Restart logic now covers all organizations.
If multiple orgs require restarts, you stop after the first one. Confirm that’s the intended behavior or if you should handle every flagged organization.
360-363
: NewretrieveAll
method.
Exposing a Flux is a good fit for reactive retrieval of organizations.app/server/appsmith-server/src/main/java/com/appsmith/server/domains/Organization.java (1)
41-44
: LGTM! Clean implementation of field constantsThe use of
StringUtils.dotted
for field name generation is a good practice for maintaining consistent field references.app/server/appsmith-server/src/main/java/com/appsmith/server/repositories/ce/CustomOrganizationRepositoryCE.java (1)
7-9
: LGTM! Clear and concise interface definitionThe method signature follows reactive programming patterns and clearly describes its purpose.
app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/OrganizationServiceCE.java (1)
35-35
: LGTM! Method addition supports clustered environment requirements.The new
retrieveAll()
method using Flux is well-suited for retrieving all organizations in a reactive manner, which is essential for cron job coordination across pods.app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/FeatureFlagServiceCE.java (1)
32-33
: Method signature change improves organization context handling.The updated signature with Organization parameter provides better context and aligns with the reactive pattern.
app/server/appsmith-server/src/main/java/com/appsmith/server/repositories/BaseRepository.java (1)
39-44
: LGTM! Well-documented method for internal workflows.The new
retrieveAll()
method is properly documented and clearly intended for internal server workflows like crons, which aligns perfectly with the PR objective.app/server/appsmith-server/src/main/java/com/appsmith/server/helpers/InstanceConfigHelperImpl.java (1)
28-41
: LGTM! Constructor properly updated with new dependency.The addition of OrganizationService follows proper dependency injection pattern and is correctly passed to the superclass.
app/server/appsmith-server/src/main/java/com/appsmith/server/solutions/ce/ScheduledTaskCEImpl.java (1)
25-28
: LGTM! Lock configuration looks appropriate.The distributed lock configuration with a 20-minute TTL for a 30-minute scheduled task provides sufficient overlap, and
shouldReleaseLock=false
ensures proper lock handling across pods.app/server/appsmith-server/src/main/java/com/appsmith/server/solutions/PingScheduledTaskImpl.java (1)
45-46
: LGTM! Constructor changes look good.Clean addition of OrganizationService dependency.
Also applies to: 60-61
private void releaseLock(String lockKey) { | ||
redisOperations | ||
.delete(lockKey) | ||
.doOnSuccess(deleted -> { | ||
log.info("Released lock for: {}", lockKey); | ||
}) | ||
.onErrorResume(error -> { | ||
log.error("Error while releasing lock: {}", lockKey, error); | ||
return Mono.empty(); | ||
}) | ||
.subscribeOn(Schedulers.boundedElastic()) | ||
.subscribe(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implement lock ownership check in releaseLock
.
Currently, the method just deletes the key without verifying that the lock belongs to the same process/thread. This risks deleting another holder’s valid lock if the TTL has expired or was renewed. Incorporate a lock ownership check mechanism (storing a unique value and comparing it before deletion) to ensure safe lock release in distributed environments.
.../appsmith-server/src/main/java/com/appsmith/server/solutions/ce/PingScheduledTaskCEImpl.java
Show resolved
Hide resolved
…ilable in the feature flag check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/FeatureFlagServiceCEImpl.java (1)
141-153
: Consider simplifying the conditional logic.The nested conditions for feature flag updates could be simplified for better readability.
Consider this alternative structure:
- if (featureFlagWithPendingMigrations != null - && !featureFlagWithPendingMigrations.equals( - organizationConfiguration.getFeaturesWithPendingMigration())) { - organizationConfiguration.setFeaturesWithPendingMigration(featureFlagWithPendingMigrations); - if (!featureFlagWithPendingMigrations.isEmpty()) { - organizationConfiguration.setMigrationStatus(MigrationStatus.PENDING); - } else { - organizationConfiguration.setMigrationStatus(MigrationStatus.COMPLETED); - } - return organizationService.update(organization.getId(), organization); - } - return Mono.just(organization); + if (featureFlagWithPendingMigrations == null + || featureFlagWithPendingMigrations.equals( + organizationConfiguration.getFeaturesWithPendingMigration())) { + return Mono.just(organization); + } + + organizationConfiguration.setFeaturesWithPendingMigration(featureFlagWithPendingMigrations); + organizationConfiguration.setMigrationStatus( + featureFlagWithPendingMigrations.isEmpty() + ? MigrationStatus.COMPLETED + : MigrationStatus.PENDING); + + return organizationService.update(organization.getId(), organization);
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/FeatureFlagServiceCEImpl.java
(4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: server-unit-tests / server-unit-tests
- GitHub Check: server-spotless / spotless-check
🔇 Additional comments (2)
app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/FeatureFlagServiceCEImpl.java (2)
124-154
: LGTM! Method signature change improves control flow.The updated method signature now accepts an Organization parameter and returns Mono, which:
- Eliminates unnecessary database queries
- Provides better control over the organization context
- Aligns with reactive programming principles
165-174
:❓ Verification inconclusive
Reconsider the default organization fallback strategy.
While the TODO comment explains the temporary nature of this solution, falling back to a default organization could mask potential issues and lead to unexpected behavior.
Consider tracking usage of this fallback:
🏁 Script executed:
#!/bin/bash # Search for similar fallback patterns that might need attention rg "getDefaultOrganizationId" -A 5Length of output: 22641
Default Organization Fallback Strategy – Evaluate and Monitor Its Impact
While the temporary fallback to a default organization is consistently used across the codebase (as seen in FeatureFlagServiceCEImpl and other modules), relying on it may inadvertently hide issues with missing user context. Given that a TODO already flags this as a temporary fix, please consider one or more of the following actions:
- Instrument the fallback: Add telemetry or logging to track how frequently this fallback is exercised, so you can assess its impact.
- Review fallback usage: Verify that its use doesn’t mask underlying errors, especially in scenarios where an explicit user context is expected.
- Plan for removal: Revisit this strategy when a robust signup flow and organization initialization are in place.
.../appsmith-server/src/main/java/com/appsmith/server/services/ce/FeatureFlagServiceCEImpl.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
app/server/appsmith-server/src/main/java/com/appsmith/server/solutions/ce/ScheduledTaskCEImpl.java (1)
31-42
: Consider adding metrics for monitoring task execution.While the error handling and logging are good additions, consider adding metrics to track:
- Number of organizations processed
- Success/failure rates
- Processing duration
Example addition:
@Observed(name = "fetchFeatures") public void fetchFeatures() { log.info("Fetching features for organizations"); + Timer.Sample sample = Timer.start(); Flux<Organization> organizationFlux = organizationService.retrieveAll(); organizationFlux .flatMap( featureFlagService ::getAllRemoteFeaturesForOrganizationAndUpdateFeatureFlagsWithPendingMigrations) .flatMap(featureFlagService::checkAndExecuteMigrationsForOrganizationFeatureFlags) + .doOnComplete(() -> { + sample.stop(Timer.builder("features.fetch.duration") + .description("Time taken to fetch and process features") + .register(meterRegistry)); + }) .onErrorResume(error -> { log.error("Error while fetching organization feature flags", error); + Counter.builder("features.fetch.errors") + .description("Number of errors during feature fetch") + .register(meterRegistry) + .increment(); return Flux.empty(); })app/server/appsmith-server/src/main/java/com/appsmith/server/helpers/ce/InstanceConfigHelperCEImpl.java (1)
241-243
: Consider optimizing reactive chain.The current implementation processes organizations sequentially. Consider using
flatMapSequential
orflatMapConcurrent
for better throughput.- return organizationService - .retrieveAll() - .flatMap(org -> featureFlagService.getOrganizationFeatures(org.getId())) + return organizationService + .retrieveAll() + .flatMapConcurrent(org -> featureFlagService.getOrganizationFeatures(org.getId()), 5)
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
app/server/appsmith-server/src/main/java/com/appsmith/server/helpers/ce/InstanceConfigHelperCEImpl.java
(3 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/solutions/ce/ScheduledTaskCEImpl.java
(3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: perform-test / client-build / client-build
- GitHub Check: perform-test / server-build / server-unit-tests
- GitHub Check: perform-test / rts-build / build
- GitHub Check: server-unit-tests / server-unit-tests
- GitHub Check: server-spotless / spotless-check
🔇 Additional comments (5)
app/server/appsmith-server/src/main/java/com/appsmith/server/solutions/ce/ScheduledTaskCEImpl.java (2)
3-13
: LGTM! Required imports added for distributed locking and reactive programming.The new imports align with the implementation changes for distributed locking and reactive streams.
25-28
: LGTM! Well-configured distributed lock for cron execution.The
@DistributedLock
configuration effectively ensures single-pod execution:
- 20-minute TTL provides sufficient time for task completion
shouldReleaseLock=false
prevents other pods from acquiring the lock during the fixed rate intervalapp/server/appsmith-server/src/main/java/com/appsmith/server/helpers/ce/InstanceConfigHelperCEImpl.java (3)
18-18
: LGTM! Clean dependency injection.The OrganizationService integration follows the established dependency injection pattern.
Also applies to: 59-59
229-236
: Well-documented method with clear purpose and behavior.The Javadoc thoroughly explains the method's startup behavior and fallback mechanism.
239-240
: Address TODO comment before merging.The TODO comment indicates incomplete implementation for updating feature flags across all organizations.
Could you clarify the intended behavior for updating feature flags during startup?
...appsmith-server/src/main/java/com/appsmith/server/helpers/ce/InstanceConfigHelperCEImpl.java
Show resolved
Hide resolved
…run-cron-by-single-pod
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🔭 Outside diff range comments (1)
app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/FeatureFlagServiceCEImpl.java (1)
184-192
:⚠️ Potential issueAddress potential thread safety issue.
The method mutates the
cachedOrganizationFeatureFlags
field which could lead to race conditions in a concurrent environment.Consider making the field volatile or using atomic references:
- private CachedFeatures cachedOrganizationFeatureFlags; + private volatile CachedFeatures cachedOrganizationFeatureFlags;
🧹 Nitpick comments (1)
app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/FeatureFlagServiceCEImpl.java (1)
126-154
: Consider simplifying the nested conditionals.The method could be more readable by extracting the feature flag comparison logic into a separate method.
- public Mono<Organization> getAllRemoteFeaturesForOrganizationAndUpdateFeatureFlagsWithPendingMigrations( - Organization organization) { - return featureFlagMigrationHelper - .getUpdatedFlagsWithPendingMigration(organization) - .flatMap(featureFlagWithPendingMigrations -> { - OrganizationConfiguration organizationConfiguration = - organization.getOrganizationConfiguration() == null - ? new OrganizationConfiguration() - : organization.getOrganizationConfiguration(); - if (featureFlagWithPendingMigrations != null - && !featureFlagWithPendingMigrations.equals( - organizationConfiguration.getFeaturesWithPendingMigration())) { - organizationConfiguration.setFeaturesWithPendingMigration(featureFlagWithPendingMigrations); - if (!featureFlagWithPendingMigrations.isEmpty()) { - organizationConfiguration.setMigrationStatus(MigrationStatus.PENDING); - } else { - organizationConfiguration.setMigrationStatus(MigrationStatus.COMPLETED); - } - return organizationService.update(organization.getId(), organization); - } - return Mono.just(organization); - }); + public Mono<Organization> getAllRemoteFeaturesForOrganizationAndUpdateFeatureFlagsWithPendingMigrations( + Organization organization) { + return featureFlagMigrationHelper + .getUpdatedFlagsWithPendingMigration(organization) + .flatMap(flags -> updateOrganizationWithFeatureFlags(organization, flags)); + } + + private Mono<Organization> updateOrganizationWithFeatureFlags(Organization organization, Map<String, Boolean> flags) { + OrganizationConfiguration config = organization.getOrganizationConfiguration(); + if (config == null) { + config = new OrganizationConfiguration(); + organization.setOrganizationConfiguration(config); + } + + if (flags == null || flags.equals(config.getFeaturesWithPendingMigration())) { + return Mono.just(organization); + } + + config.setFeaturesWithPendingMigration(flags); + config.setMigrationStatus(flags.isEmpty() ? MigrationStatus.COMPLETED : MigrationStatus.PENDING); + + return organizationService.update(organization.getId(), organization); + }
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/FeatureFlagServiceCEImpl.java
(4 hunks)
🧰 Additional context used
🧠 Learnings (1)
app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/FeatureFlagServiceCEImpl.java (1)
Learnt from: abhvsn
PR: appsmithorg/appsmith#39171
File: app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/FeatureFlagServiceCEImpl.java:42-44
Timestamp: 2025-02-24T05:59:14.021Z
Learning: The reactive transition for helper methods consuming @FeatureFlagged annotations will be handled in a separate PR to maintain focused changes and proper separation of concerns.
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: perform-test / server-build / server-unit-tests
- GitHub Check: perform-test / rts-build / build
- GitHub Check: perform-test / client-build / client-build
- GitHub Check: server-spotless / spotless-check
- GitHub Check: server-unit-tests / server-unit-tests
🔇 Additional comments (2)
app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/FeatureFlagServiceCEImpl.java (2)
43-45
: TODO comment indicates incomplete reactive transition.Based on the retrieved learning, this will be handled in a separate PR. However, consider marking the field as deprecated to discourage new usages until the reactive transition is complete.
162-180
:❓ Verification inconclusive
Verify the default organization fallback behavior.
The TODO comments indicate potential data corruption risks with the current fallback mechanism. This needs to be addressed before the signup flow changes.
🏁 Script executed:
#!/bin/bash # Search for other occurrences of default organization usage rg "getDefaultOrganizationId|isCreateWorkspaceAllowed" -A 3Length of output: 19394
Action Required: Verify Default Organization Fallback Consistency
The usage of
organizationService.getDefaultOrganizationId()
as a fallback for empty organization IDs is consistent across the codebase. However, as noted in the TODO comments, this temporary mechanism could lead to data corruption during signup if not handled carefully. Please verify that:
- The fallback mechanism in
FeatureFlagServiceCEImpl
(lines 162–180) is in line with the overall strategy.- The default organization returned is properly configured to support subsequent actions like workspace creation.
- The implications flagged in the TODOs (e.g., potential data corruption and the reliance on
WorkspaceServiceHelperImpl.isCreateWorkspaceAllowed
) are thoroughly assessed before proceeding with any signup flow changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (2)
app/server/appsmith-server/src/test/java/com/appsmith/server/aspect/DistributedLockAspectTest.java (1)
31-41
: Add verification for lock acquisition.Consider adding a check to verify that the lock is acquired before the operation executes.
@Test void testMonoOperation() { + // Verify lock is acquired + StepVerifier.create(redisOperations.hasKey(LOCK_PREFIX + "mono-test")) + .expectNext(true) + .verifyComplete(); + StepVerifier.create(testLockService.monoOperation()) .expectNext("mono-success") .verifyComplete();app/server/appsmith-server/src/test/java/com/appsmith/server/aspect/TestLockService.java (1)
28-31
: Extract TTL value to a constant.The TTL value should be configurable to accommodate different environment requirements.
+ private static final int LONG_RUNNING_OPERATION_TTL = 5; + - @DistributedLock(key = "long-running-mono", ttl = 5) + @DistributedLock(key = "long-running-mono", ttl = LONG_RUNNING_OPERATION_TTL) public Mono<String> longRunningMonoOperation() { return Mono.just("long-running-success").delayElement(Duration.ofSeconds(2)); }
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
app/server/appsmith-server/src/main/java/com/appsmith/server/configurations/InstanceConfig.java
(2 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/FeatureFlagServiceCEImpl.java
(4 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/OrganizationServiceCEImpl.java
(3 hunks)app/server/appsmith-server/src/main/java/com/appsmith/server/solutions/ce/PingScheduledTaskCEImpl.java
(9 hunks)app/server/appsmith-server/src/test/java/com/appsmith/server/aspect/DistributedLockAspectTest.java
(1 hunks)app/server/appsmith-server/src/test/java/com/appsmith/server/aspect/TestLockService.java
(1 hunks)app/server/reactive-caching/src/main/java/com/appsmith/caching/aspects/DistributedLockAspect.java
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- app/server/reactive-caching/src/main/java/com/appsmith/caching/aspects/DistributedLockAspect.java
- app/server/appsmith-server/src/main/java/com/appsmith/server/configurations/InstanceConfig.java
🧰 Additional context used
🧠 Learnings (2)
app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/FeatureFlagServiceCEImpl.java (1)
Learnt from: abhvsn
PR: appsmithorg/appsmith#39171
File: app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/FeatureFlagServiceCEImpl.java:42-44
Timestamp: 2025-02-24T05:59:14.021Z
Learning: The reactive transition for helper methods consuming @FeatureFlagged annotations will be handled in a separate PR to maintain focused changes and proper separation of concerns.
app/server/appsmith-server/src/main/java/com/appsmith/server/solutions/ce/PingScheduledTaskCEImpl.java (1)
Learnt from: abhvsn
PR: appsmithorg/appsmith#39171
File: app/server/appsmith-server/src/main/java/com/appsmith/server/solutions/ce/PingScheduledTaskCEImpl.java:78-81
Timestamp: 2025-02-24T06:01:18.807Z
Learning: In distributed lock patterns for scheduled tasks, setting the TTL slightly shorter than the schedule interval (e.g. 5h TTL for 6h schedule) is a deliberate design choice to allow other available pods to take control, improving failover capabilities in clustered environments.
⏰ Context from checks skipped due to timeout of 90000ms (5)
- GitHub Check: perform-test / client-build / client-build
- GitHub Check: perform-test / rts-build / build
- GitHub Check: perform-test / server-build / server-unit-tests
- GitHub Check: server-spotless / spotless-check
- GitHub Check: server-unit-tests / server-unit-tests
🔇 Additional comments (20)
app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/OrganizationServiceCEImpl.java (5)
325-330
: Documentation updated to reflect feature flag migration dependency.The comment has been updated to clarify that restart occurs after feature flag migrations are completed.
332-346
: Improved restart logic for clustered environments.The implementation now efficiently handles organization restarts in a clustered environment by:
- Retrieving all organizations
- Filtering for those requiring restart
- Disabling restart flags before executing the restart
360-363
: LGTM: New method to retrieve all organizations.The implementation is clean and follows reactive patterns.
332-346
: LGTM! The refactoring improves cluster safety.The changes properly handle restart requirements across multiple organizations in a clustered environment.
Note: There's a TODO comment indicating temporary code that needs cleanup once form login env is moved to DB variable.
361-363
: LGTM! Clean implementation of retrieveAll method.The method follows single responsibility principle and uses reactive types appropriately.
app/server/appsmith-server/src/main/java/com/appsmith/server/services/ce/FeatureFlagServiceCEImpl.java (7)
43-44
: TODO comment indicates pending reactive conversion.Based on the retrieved learning, this TODO is intentionally left for a separate PR to maintain focused changes.
125-155
: Enhanced feature flag migration with organization context.The implementation properly handles organization configuration updates and migration status tracking.
161-181
: Improved error handling for organization features.The implementation now includes proper error logging and fallback to default organization when needed.
183-192
: LGTM: New method for organization-specific feature flags.Clean implementation using reactive patterns and proper caching.
126-155
: LGTM! Method signature and return type improvements.The changes better handle feature flag migrations and provide more useful return data.
162-181
: Consider improving anonymous user handling.The current implementation uses a temporary fix for anonymous users by falling back to the default organization.
Please track the following TODOs:
- Update organization ID retrieval based on request origin for anonymous users
- Review the temporary fix for workspace creation in EE
184-192
: LGTM! Clean implementation of feature caching.The method properly handles caching and fallback to empty map.
app/server/appsmith-server/src/main/java/com/appsmith/server/solutions/ce/PingScheduledTaskCEImpl.java (7)
67-68
: Good practice: Added delay to prevent rate limiting.The constant
DELAY_BETWEEN_PINGS
helps avoid 429 errors between analytics calls.
83-86
: Distributed lock configuration aligns with failover strategy.Based on the retrieved learning, the 5-hour TTL (shorter than 6-hour schedule) is intentional to improve failover capabilities in clustered environments.
93-100
: Efficient caching and organization processing.Good practices implemented:
- Caching instance and IP monos to avoid redundant calls
- Using
delayElements
for rate limiting- Proper reactive chaining with
flatMap
145-146
: TODO indicates future migration of organization-level stats.The cloud hosting check will be removed once the cron is migrated to report organization-level stats.
83-86
: LGTM! Distributed lock pattern properly implemented.The TTL (5 hours) is deliberately set shorter than the schedule interval (6 hours) to improve failover capabilities in clustered environments.
93-100
: LGTM! Performance improvements with caching and rate limiting.Good use of caching for instance and IP, and proper rate limiting with delay between pings.
142-146
: Consider cleanup of cloud hosting check.TODO indicates that cloud hosting check and cron job need to be migrated for organization-level stats.
Track the TODO for removing cloud hosting check and migrating the cron to report organization-level stats.
app/server/appsmith-server/src/test/java/com/appsmith/server/aspect/TestLockService.java (1)
1-59
: Well-structured implementation with comprehensive test coverage.The distributed locking implementation effectively addresses the PR objective of restricting cron execution to a single pod. The test coverage is thorough, covering various scenarios including concurrent access, lock persistence, and error handling.
@Test | ||
void testConcurrentAccess() throws InterruptedException { | ||
AtomicReference<String> thread1Result = new AtomicReference<>(); | ||
AtomicReference<String> thread2Result = new AtomicReference<>(); | ||
CountDownLatch thread1Started = new CountDownLatch(1); | ||
|
||
// Thread 1: Long running operation | ||
Thread thread1 = new Thread( | ||
() -> { | ||
thread1Started.countDown(); | ||
thread1Result.set(testLockService.longRunningMonoOperation().block()); | ||
}, | ||
"Thread-1"); | ||
|
||
// Thread 2: Tries to execute while Thread 1 is running | ||
Thread thread2 = new Thread( | ||
() -> { | ||
try { | ||
thread1Started.await(); // Wait for Thread 1 to start | ||
Thread.sleep(100); // Small delay to ensure Thread 1 has acquired lock | ||
thread2Result.set( | ||
testLockService.longRunningMonoOperation().block()); | ||
} catch (InterruptedException e) { | ||
throw new RuntimeException(e); | ||
} | ||
}, | ||
"Thread-2"); | ||
|
||
// Start both threads | ||
thread1.start(); | ||
thread2.start(); | ||
|
||
// Wait for both threads to complete | ||
thread1.join(5000); | ||
thread2.join(5000); | ||
|
||
// Verify results | ||
assertEquals("long-running-success", thread1Result.get()); | ||
assertNull(thread2Result.get()); // Thread 2 should not get the lock | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Improve test reliability by using better synchronization.
The fixed sleep duration of 100ms might cause flaky tests. Consider using a more reliable synchronization mechanism.
- Thread.sleep(100); // Small delay to ensure Thread 1 has acquired lock
+ // Wait for lock to be acquired by checking Redis
+ StepVerifier.create(redisOperations.hasKey(LOCK_PREFIX + "long-running-mono"))
+ .expectNext(true)
+ .verifyComplete();
Committable suggestion skipped: line range outside the PR's diff.
try { | ||
Thread.sleep(1100); // Wait just over 1 second | ||
} catch (InterruptedException e) { | ||
throw new RuntimeException(e); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Replace Thread.sleep with reactive delay.
Using Thread.sleep in reactive tests is not recommended. Consider using Mono.delay instead.
- try {
- Thread.sleep(1100); // Wait just over 1 second
- } catch (InterruptedException e) {
- throw new RuntimeException(e);
- }
+ return Mono.delay(Duration.ofMillis(1100))
+ .then();
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
try { | |
Thread.sleep(1100); // Wait just over 1 second | |
} catch (InterruptedException e) { | |
throw new RuntimeException(e); | |
} | |
return Mono.delay(Duration.ofMillis(1100)) | |
.then(); |
// Method to manually release the lock (for testing cleanup) | ||
public Mono<Long> releaseLock(String lockKey, ReactiveRedisOperations<String, String> redisOperations) { | ||
return redisOperations.delete("lock:" + lockKey); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Encapsulate Redis key manipulation.
Direct manipulation of Redis keys could lead to maintenance issues. Consider encapsulating the key prefix logic.
+ private static final String LOCK_PREFIX = "lock:";
+
// Method to manually release the lock (for testing cleanup)
public Mono<Long> releaseLock(String lockKey, ReactiveRedisOperations<String, String> redisOperations) {
- return redisOperations.delete("lock:" + lockKey);
+ return redisOperations.delete(LOCK_PREFIX + lockKey);
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// Method to manually release the lock (for testing cleanup) | |
public Mono<Long> releaseLock(String lockKey, ReactiveRedisOperations<String, String> redisOperations) { | |
return redisOperations.delete("lock:" + lockKey); | |
} | |
private static final String LOCK_PREFIX = "lock:"; | |
// Method to manually release the lock (for testing cleanup) | |
public Mono<Long> releaseLock(String lockKey, ReactiveRedisOperations<String, String> redisOperations) { | |
return redisOperations.delete(LOCK_PREFIX + lockKey); | |
} |
Description
/test All
🔍 Cypress test results
Important
🟣 🟣 🟣 Your tests are running.
Tests running at: https://github.com/appsmithorg/appsmith/actions/runs/13505240517
Commit: f179756
Workflow:
PR Automation test suite
Tags:
@tag.All
Spec: ``
Mon, 24 Feb 2025 18:34:46 UTC
Communication
Should the DevRel and Marketing teams inform users about this change?
Summary by CodeRabbit
Summary by CodeRabbit
New Features
Refactor