-
Notifications
You must be signed in to change notification settings - Fork 206
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use the port file and dynamic port generation in client/server tests. #10604
Conversation
This creates a runner named `runner_with_port_file` which knows how to interpolate two variables, `%PORT_FILE%` and `%PORT%`. This allows us to use the `port-file` argument to the kvutils runner rather than hard-coding a port for conformance tests. For now, we only use this for generating the kvutils reference ledger export. CHANGELOG_BEGIN CHANGELOG_END
callProcess clientExe clientArgs' | ||
forM_ portFile removeFile | ||
|
||
interpolateVariable :: String -> String -> String -> String |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we already depend on extra
which has replace
on String.
|
||
generatePortFile :: [String] -> IO (Maybe String, [String]) | ||
generatePortFile args = do | ||
if any (Text.isInfixOf "%PORT_FILE%" . Text.pack) args |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a good reason to not do this in all cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but we use this information to decide whether to wait for the port to open.
That said, we can probably just delete that behavior, as the Ledger API Test Tool also waits.
Nothing -> splitArgs clientArgs | ||
Just port -> map (interpolateVariable "PORT" (show port)) (splitArgs clientArgs) | ||
callProcess clientExe clientArgs' | ||
forM_ portFile removeFile |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not exception safe. If your code fails before this the file might persist. I recommend to use withTempDir
to create a directory and then chose a file path in there. Then you also don’t have to worry about deleting the temp file you just created. We use that a fair bit throughout the codebase, e.g.,
withSandbox (SandboxClassic classic) mbPortSpec extraArgs a = withTempDir $ \tempDir -> do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense; I'll do that. Thanks.
Right portFileContents -> | ||
case reads @Int portFileContents of | ||
[(port, "")] -> do | ||
addr : _ <- getAddrInfo (Just hints) (Just "127.0.0.1") (Just $ show port) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need this? the rest of our codebase assumes that the portfile is only written when the connection is available and afaik that has never caused issues so this seems like a sensible pattern to rely on.
case eitherPortFileContents of | ||
Left _ -> sleep *> pure Nothing | ||
Right portFileContents -> | ||
case reads @Int portFileContents of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
readMay
might be a better choice and you don’t have to strip for that to work so you can simplifyf the code a bit. See how readPortFile
works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or maybe try actually using that here if possible.
@@ -238,8 +238,6 @@ proto_jars( | |||
|
|||
REFERENCE_LEDGER_EXPORT_NAME = "reference-ledger-export" | |||
|
|||
REFERENCE_LEDGER_EXPORT_PORT = 65102 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just move this out of emphemeral range and be done? Want to "solve it for once and for all" to avoid chance of conflict with other fixed ports?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m not sure what you’re suggesting? In this PR, @SamirTalwar-DA switched to using port 0 which choses a free open port so there is no chance of it colliding with anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct me if I'm wrong, but the collision observed in practice was likely due to the 65102 being taken by an outgoing connection, whereas if it were out of the ephemeral range (i.e., in 1025-32767) this, at least, would not happen. I just wanted to understand the motivation for going the extra mile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don’t know how Samir ended up with the collision but generally just making sure that our own tests don’t conflict with each other is already painful and we have a lot of tests that cannot run in parallel with other tests exactly because they hardcode the same port number. I’m absolutely in favor of using port 0 everywhere and I think the complexity for this is justified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I started by changing the port and then decided to go overboard. I am happy to go with the simpler change but my real intent with this is to use it across the conformance tests so they don't accidentally collide.
It really bugs me when I have a ledger running locally for testing and then try running some conformance tests, only to watch them fail because they picked the same default port.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right.. local running, parallel tests, oh my – thanks for the explanations. Now I understand a bit about the differences between testing here and in LR. :-)
import System.Process | ||
import qualified Data.Text as Text | ||
|
||
main :: IO () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taking the other feedback into account, about only waiting on the port-file, but not also the connection, how significantly does this runner still differ from //bazel_tools/client_server/runner
?
That other runner also supports port 0 and a port-file, it also waits for the creation of that port-file.
One difference that stands out is that the other runner hard-codes the shape of the --port-file
and --port
arguments. If that's the only remaining difference, perhaps the other runner could be generalized to make these args configurable instead of adding an additional runner?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, they're similar. My goal is to eventually merge all three runners together. However, I wanted to start small and just change a single target.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarifying. 👍 Makes sense.
It doesn't need to check if the port is open; we trust that the process will do it. This also makes sure the port file will be cleaned up, and reduces the number of dependencies by making use of more functions in `extra`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! two small comments
let interpolatedClientArgs = | ||
case maybePort of | ||
Nothing -> splitClientArgs | ||
Just port -> map (replace "%PORT%" (show port)) splitClientArgs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just always doing that now that you dropped the wait seems like it might simplify things a bit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dropped the wait, sure, but we still have no guarantee that the server actually wrote a port file.
I kept this as a Maybe
because it allows us to use one runner everywhere. Do you think we're better off only using it where we definitely have a port file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before your PR we have two types of client_server runners:
- Based on port files which waits for the server to write a port file.
- Based on hardcoded ports which tests the connection until the server is up.
The client needs to know the port the server is listening on so if you don’t get it from the port file you don’t have a lot of other options than hardcoding it in some form.
I understood this PR as introducing a variation of 1 where the port file is specified differently meaning it doesn’t make sense to use it if the server does not write a port file. It sounds like you have something different in mind but I think I don’t quite understand what.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My end goal is that we have one runner. I can see an argument for one (everything is done through Bazel configuration), and an argument for two (two different ways of starting the server).
Which would you prefer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely in favor of one runner but that doesn’t really answer the question I clearly failed to ask properly :)
We currently have waiting for a port file and waiting for a connection. You’re adding a third option in here where we wait for nothing. What do we need this for? In my ideal world everything runs on ephemeral ports for which we need the port file (otherwise we can’t tell the client tell what to connect to). Where that’s not possible, we hardcode and wait for the connection. I don’t see the need for the third option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I was getting ahead of myself. For now I'll simplify, though I think this logic will need to come back if we merge this runner with runner_with_port_check
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The simplified version worked.
Co-authored-by: Moritz Kiefer <[email protected]>
43d7dcb
to
a3eb1b4
Compare
This doesn't need to work if the server doesn't take a port file.
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
…#10604) * Use the port file and dynamic port generation in client/server tests. This creates a runner named `runner_with_port_file` which knows how to interpolate two variables, `%PORT_FILE%` and `%PORT%`. This allows us to use the `port-file` argument to the kvutils runner rather than hard-coding a port for conformance tests. For now, we only use this for generating the kvutils reference ledger export. CHANGELOG_BEGIN CHANGELOG_END * Simplify the runner_with_port_file considerably. It doesn't need to check if the port is open; we trust that the process will do it. This also makes sure the port file will be cleaned up, and reduces the number of dependencies by making use of more functions in `extra`. * Simplify port file generation in the new client-server runner. Co-authored-by: Moritz Kiefer <[email protected]> * Simplify the runner_with_port_file further. This doesn't need to work if the server doesn't take a port file. Co-authored-by: Moritz Kiefer <[email protected]>
This PR has been created by a script, which is not very smart and does not have all the context. Please do double-check that the version prefix is correct before merging. @SamirTalwar-DA is in charge of this release. Commit log: ``` 640fb68 Make Index DB enable multiple party additions [DPP-546] (#10623) b22de68 LF: Contract ID suffix check in Preprocessor (#10642) 7b94b06 Map shortened scala test suite names to long names on Windows (#10628) 6e4a24c participant-state: Generate correct gRPC error codes by v2 `WriteService` [KVL-1081] (#10656) 663781a Update curl 7.73.0 --> 7.78.0 (#10655) a471f15 Dpp-558 Fix startexclusive queries on oracle (#10649) e99254f Augment `completion.proto` with deduplication-related info [KVL-1057] (#10619) a00608c participant-integration-api: Accommodate changes to max dedup time. (#10650) 29c546c [Divulgence pruning] Added `prune_all_divulged_contracts` to PruneRequest [DPP-534] (#10635) dea57ca In-memory fan-out optimizations (#10558) 77eb366 [JSON-API] key_hash field to speed up fetchByKey queries (#10631) 5001329 LF: Comparisons fail at runtime if comparing local vs global CIDs (#10630) 055be4b Disable deprecation warnings for data-dependencies (#10647) c155935 clean-up gsg README (#10641) 8501832 DPP-468 StorageBackend tests (#10529) 4e08b47 update NOTICES file (#10645) 733590d ledger-api-health: Use the Scala health status values everywhere. (#10640) 5b837ec Ledger API: add `buf` checks [KVL-1045] (#10625) f77cd0a participant-integration-api: Attempt to fix RecoveringIndexerSpec. (#10639) 9d7f60f participant-integration-api: Fix a flaky test. (#10637) 4a9331c Upgrade Nixpkgs [KVL-1045] (#10624) 01b6e89 update compat versions for 1.17.0-snapshot.20210817.7604.0.0c187853 (#10610) b578b0e Reminder to put an empty line between subject and body (#10638) c0fbad1 participant-integration-api: Remove `limitMaxCommandsInFlight`. (#10626) b4af6d1 Canton testing: Mark one more DeeplyNestedValueIT test flaky (#10636) e807f4a Upgrade to a newer canton version (post 0.27.0 snapshot version) (#10632) c4513f2 Oracle append-only schema: enable contract id index on participant_events_xxxx tables (#10633) 3598e09 LF: Drop contract ID Freshness check (#10620) 37c999e ledger-on-sql: Increase the concurrency for conformance tests. (#10622) 46e8c7d DPP-460 Extract constant for event sequential IDs (#10564) 121047e DPP-460 Parameter storage consolidation (#10472) 569612a Drop broken symlink config (#10616) 6d0109f Support $ in daml-lf identifiers in the parser (#10609) c38703e participant-integration-api: Store a status gRPC protobuf. [KVL-1005] (#10600) 0af5b49 make FinalReason a case class (#10614) 8dd136f bazel-tools: Replace `runner` with either `runner_with_port_check` or `runner_with_port_file`. (#10615) 19c3d28 Remove GenMissingString class because it is not used (#10608) 3227e86 Use the port file and dynamic port generation in client/server tests. (#10604) fb19bcb fix gsg-trigger template (#10611) b207702 rotate release duty after 1.17.0-snapshot.20210817.7604.0.0c187853 (#10606) 975a5fb Move DeduplicationPeriod to ledger-api-domain [KVL-1047] (#10590) f8a1820 release 1.17.0-snapshot.20210817.7604.0.0c187853 (#10605) 64abf8a update NOTICES file (#10607) ``` Changelog: ``` [Integration Kit] Corrected gRPC error codes returned by v2 `WriteService` adaptor. - [Ledger API Server] The API server manages a single command tracker per (application ID × submitters) pair. This tracker would read the current ledger configuration's maximum deduplication time on creation, but never updated it, leading to trackers that might inadvertently reject a submission when it should have been accepted. The tracker now reads the latest ledger configuration. - Update schema version for http-json-api query store with new key_hash field - Improved performance for fetchByKey query which now uses key_hash field participant-state - move `DeduplicationPeriod` to ledger-api-domain dc4629f update NOTICES file (#10580) 8b0a0e7 update NOTICES file (#10578) 4b8b67a Upgrade Scalatest to v3.2.9. (#10576) 41e60f7 Upgrade to Scala 2.12.14 and 2.13.6. (#10573) c447898 Fix display of unhandled exceptions in the scenario service (#10572) 4e1a90d Enable --incompatible_remote_results_ignore_disk (#10571) d183ecc rotate release duty after 1.17.0-snapshot.20210811.7560.0.4f9de4ba (#10556) 76ecb44 update compat versions for 1.17.0-snapshot.20210811.7565.0.f1a55aa4 (#10563) 86a03fa Bump bazel max jvm memory (#10569) 1cc136c update NOTICES file (#10565) c69880c ledger-api-test-tool: Enforce test naming standards. (#10562) ee34d0f Track command - use types for error handling instead of grpc statuses [KVL-1005] (#10503) 93c25f3 Release 1.17 snapshot (#10560) ``` Changelog: ``` - [Ledger API Test Tool] The ``TransactionServiceIT`` test suite has been split into many test suites. If you are including or excluding it, you will need to use the new test suite names, or you can use "TransactionService" as a prefix for all of them. If you are including or excluding individual tests, you will need to update your arguments with the new test suite. You can find the new test suite by running the test tool with the ``--list-all`` flag and looking for the test's short identifier. The short identifiers have not changed, with the exception of ``TXNoContractKey``, which has been renamed to ``CKNoContractKey`` and is now in the ``ContractKeysIT`` test suite. * [Daml export] You can now set the ``--all-parties`` option to generate a ledger export as seen by all known parties. ledger-api-client - Propagate definite_answer as metadata in the GRPC response for submit/submitAndWait [JSON API] Ledger connection errors are now logged at every attempt akka-bindings: `LedgerClientBinding.commands` now returns a flow of `Either[CompletionFailure, CompletionSuccess]` instead of `Completion` for clearer error handling. For backwards compatiblity the new return type can be turned back into a `Completion` using `CompletionResponse.toCompletion` ``` ``` CHANGELOG_BEGIN CHANGELOG_END
This PR has been created by a script, which is not very smart and does not have all the context. Please do double-check that the version prefix is correct before merging. @SamirTalwar-DA is in charge of this release. Commit log: ``` 640fb68 Make Index DB enable multiple party additions [DPP-546] (#10623) b22de68 LF: Contract ID suffix check in Preprocessor (#10642) 7b94b06 Map shortened scala test suite names to long names on Windows (#10628) 6e4a24c participant-state: Generate correct gRPC error codes by v2 `WriteService` [KVL-1081] (#10656) 663781a Update curl 7.73.0 --> 7.78.0 (#10655) a471f15 Dpp-558 Fix startexclusive queries on oracle (#10649) e99254f Augment `completion.proto` with deduplication-related info [KVL-1057] (#10619) a00608c participant-integration-api: Accommodate changes to max dedup time. (#10650) 29c546c [Divulgence pruning] Added `prune_all_divulged_contracts` to PruneRequest [DPP-534] (#10635) dea57ca In-memory fan-out optimizations (#10558) 77eb366 [JSON-API] key_hash field to speed up fetchByKey queries (#10631) 5001329 LF: Comparisons fail at runtime if comparing local vs global CIDs (#10630) 055be4b Disable deprecation warnings for data-dependencies (#10647) c155935 clean-up gsg README (#10641) 8501832 DPP-468 StorageBackend tests (#10529) 4e08b47 update NOTICES file (#10645) 733590d ledger-api-health: Use the Scala health status values everywhere. (#10640) 5b837ec Ledger API: add `buf` checks [KVL-1045] (#10625) f77cd0a participant-integration-api: Attempt to fix RecoveringIndexerSpec. (#10639) 9d7f60f participant-integration-api: Fix a flaky test. (#10637) 4a9331c Upgrade Nixpkgs [KVL-1045] (#10624) 01b6e89 update compat versions for 1.17.0-snapshot.20210817.7604.0.0c187853 (#10610) b578b0e Reminder to put an empty line between subject and body (#10638) c0fbad1 participant-integration-api: Remove `limitMaxCommandsInFlight`. (#10626) b4af6d1 Canton testing: Mark one more DeeplyNestedValueIT test flaky (#10636) e807f4a Upgrade to a newer canton version (post 0.27.0 snapshot version) (#10632) c4513f2 Oracle append-only schema: enable contract id index on participant_events_xxxx tables (#10633) 3598e09 LF: Drop contract ID Freshness check (#10620) 37c999e ledger-on-sql: Increase the concurrency for conformance tests. (#10622) 46e8c7d DPP-460 Extract constant for event sequential IDs (#10564) 121047e DPP-460 Parameter storage consolidation (#10472) 569612a Drop broken symlink config (#10616) 6d0109f Support $ in daml-lf identifiers in the parser (#10609) c38703e participant-integration-api: Store a status gRPC protobuf. [KVL-1005] (#10600) 0af5b49 make FinalReason a case class (#10614) 8dd136f bazel-tools: Replace `runner` with either `runner_with_port_check` or `runner_with_port_file`. (#10615) 19c3d28 Remove GenMissingString class because it is not used (#10608) 3227e86 Use the port file and dynamic port generation in client/server tests. (#10604) fb19bcb fix gsg-trigger template (#10611) b207702 rotate release duty after 1.17.0-snapshot.20210817.7604.0.0c187853 (#10606) 975a5fb Move DeduplicationPeriod to ledger-api-domain [KVL-1047] (#10590) f8a1820 release 1.17.0-snapshot.20210817.7604.0.0c187853 (#10605) 64abf8a update NOTICES file (#10607) ``` Changelog: ``` [Integration Kit] Corrected gRPC error codes returned by v2 `WriteService` adaptor. - [Ledger API Server] The API server manages a single command tracker per (application ID × submitters) pair. This tracker would read the current ledger configuration's maximum deduplication time on creation, but never updated it, leading to trackers that might inadvertently reject a submission when it should have been accepted. The tracker now reads the latest ledger configuration. - Update schema version for http-json-api query store with new key_hash field - Improved performance for fetchByKey query which now uses key_hash field participant-state - move `DeduplicationPeriod` to ledger-api-domain dc4629f update NOTICES file (#10580) 8b0a0e7 update NOTICES file (#10578) 4b8b67a Upgrade Scalatest to v3.2.9. (#10576) 41e60f7 Upgrade to Scala 2.12.14 and 2.13.6. (#10573) c447898 Fix display of unhandled exceptions in the scenario service (#10572) 4e1a90d Enable --incompatible_remote_results_ignore_disk (#10571) d183ecc rotate release duty after 1.17.0-snapshot.20210811.7560.0.4f9de4ba (#10556) 76ecb44 update compat versions for 1.17.0-snapshot.20210811.7565.0.f1a55aa4 (#10563) 86a03fa Bump bazel max jvm memory (#10569) 1cc136c update NOTICES file (#10565) c69880c ledger-api-test-tool: Enforce test naming standards. (#10562) ee34d0f Track command - use types for error handling instead of grpc statuses [KVL-1005] (#10503) 93c25f3 Release 1.17 snapshot (#10560) ``` Changelog: ``` - [Ledger API Test Tool] The ``TransactionServiceIT`` test suite has been split into many test suites. If you are including or excluding it, you will need to use the new test suite names, or you can use "TransactionService" as a prefix for all of them. If you are including or excluding individual tests, you will need to update your arguments with the new test suite. You can find the new test suite by running the test tool with the ``--list-all`` flag and looking for the test's short identifier. The short identifiers have not changed, with the exception of ``TXNoContractKey``, which has been renamed to ``CKNoContractKey`` and is now in the ``ContractKeysIT`` test suite. * [Daml export] You can now set the ``--all-parties`` option to generate a ledger export as seen by all known parties. ledger-api-client - Propagate definite_answer as metadata in the GRPC response for submit/submitAndWait [JSON API] Ledger connection errors are now logged at every attempt akka-bindings: `LedgerClientBinding.commands` now returns a flow of `Either[CompletionFailure, CompletionSuccess]` instead of `Completion` for clearer error handling. For backwards compatiblity the new return type can be turned back into a `Completion` using `CompletionResponse.toCompletion` ``` ``` CHANGELOG_BEGIN CHANGELOG_END Co-authored-by: Azure Pipelines DAML Build <[email protected]>
This creates a runner named
runner_with_port_file
which knows how to interpolate two variables,%PORT_FILE%
and%PORT%
. This allows us to use theport-file
argument to the kvutils runner rather than hard-coding a port for conformance tests.For now, we only use this for generating the kvutils reference ledger export.
This pull request brought to you by me getting annoyed that the assigned port was already taken, breaking my build on CI.
Pull Request Checklist
CHANGELOG_BEGIN
andCHANGELOG_END
tagsNOTE: CI is not automatically run on non-members pull-requests for security
reasons. The reviewer will have to comment with
/AzurePipelines run
totrigger the build.