Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Created EmptyQuickstart command #8024

Merged
merged 4 commits into from
Jan 19, 2022
Merged

Created EmptyQuickstart command #8024

merged 4 commits into from
Jan 19, 2022

Conversation

kbastani
Copy link
Contributor

Description

Created a new Pinot quickstart command called EmptyQuickstart.

The entire point of this command is that we previously had no quickstart that would save your data after a restart. This will be used to update the Homebrew formula so that users can run Pinot as a single component local service that will save your data/tables/schemas/segments after a restart. It can also be used for Docker when specifying a host volume data directory.

See the Documentation section below for usage notes.

Upgrade Notes

Does this PR prevent a zero down-time upgrade? (Assume upgrade order: Controller, Broker, Server, Minion)

  • Yes (Please label as backward-incompat, and complete the section below on Release Notes)

Does this PR fix a zero-downtime upgrade introduced earlier?

  • Yes (Please label this as backward-incompat, and complete the section below on Release Notes)

Does this PR otherwise need attention when creating release notes? Things to consider:

  • New configuration options
  • Deprecation of configurations
  • Signature changes to public methods/interfaces
  • New plugins added or old plugins removed
  • Yes (Please label this PR as release-notes and complete the section on Release Notes)

Documentation

To launch a single component installation of Pinot that saves data after a restart, you can use the EmptyQuickstart command.

  • Allows a user to use an external or embedded Zookeeper instance (-zkAddress).
    -- If no argument is provided, an embedded Zookeeper instance will be launched.
    -- When an argument is provided, no embedded Zookeeper instance will be launched and controller/broker/server will use the provided ZK URL.
  • Allows a user to specify a non-temporary data directory (-dataDir).
    -- If no argument is provided, a temporary data directory will be generated.
    -- When an argument is provided, embedded Zookeeper files are stored and reused on restart.
    -- When an argument is provided, Pinot controller/broker/server data files are stored and reused on restart.

To start the quickstart command, run the following in your terminal.

# Reuses the provided dataDir on restart but launches an embedded ZK instance
$ bin/pinot-admin.sh QuickStart -type EMPTY -dataDir "/usr/local/var/lib/pinot/data"

# Reuses the provided dataDir on restart and uses an external ZK instance
$ bin/pinot-admin.sh QuickStart -type EMPTY -dataDir "/usr/local/var/lib/pinot/data" -zkAddress "localhost:2181"

Thread.sleep(5000);
}

public static void printStatus(Color color, String message) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should go to QuickStartBase.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

return null;
}

protected void waitForBootstrapToComplete(QuickstartRunner runner)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

put into QuickStartBase?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in all referencing quickstarts

@codecov-commenter
Copy link

codecov-commenter commented Jan 14, 2022

Codecov Report

Merging #8024 (4a26bc1) into master (eaa5180) will decrease coverage by 3.27%.
The diff coverage is 33.33%.

❗ Current head 4a26bc1 differs from pull request most recent head 26f86d1. Consider uploading reports for the commit 26f86d1 to get more accurate results
Impacted file tree graph

@@             Coverage Diff              @@
##             master    #8024      +/-   ##
============================================
- Coverage     71.39%   68.12%   -3.28%     
+ Complexity     4222     4142      -80     
============================================
  Files          1597     1202     -395     
  Lines         82903    60120   -22783     
  Branches      12369     9267    -3102     
============================================
- Hits          59189    40956   -18233     
+ Misses        19707    16271    -3436     
+ Partials       4007     2893    -1114     
Flag Coverage Δ
integration1 ?
integration2 ?
unittests1 68.12% <33.33%> (-0.02%) ⬇️
unittests2 ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...a/org/apache/pinot/common/utils/ServiceStatus.java 50.95% <0.00%> (-11.30%) ⬇️
...local/indexsegment/mutable/MutableSegmentImpl.java 59.08% <ø> (ø)
.../java/org/apache/pinot/spi/data/TimeFieldSpec.java 88.63% <ø> (-2.28%) ⬇️
...core/query/executor/ServerQueryExecutorV1Impl.java 60.55% <69.23%> (-22.38%) ⬇️
...a/org/apache/pinot/common/metrics/MinionMeter.java 0.00% <0.00%> (-100.00%) ⬇️
...g/apache/pinot/common/metrics/ControllerMeter.java 0.00% <0.00%> (-100.00%) ⬇️
.../apache/pinot/common/metrics/BrokerQueryPhase.java 0.00% <0.00%> (-100.00%) ⬇️
.../apache/pinot/common/metrics/MinionQueryPhase.java 0.00% <0.00%> (-100.00%) ⬇️
...he/pinot/common/messages/SegmentReloadMessage.java 0.00% <0.00%> (-100.00%) ⬇️
...pinot/core/data/manager/realtime/TimerService.java 0.00% <0.00%> (-100.00%) ⬇️
... and 613 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update eaa5180...26f86d1. Read the comment docs.

@@ -75,6 +75,8 @@
private final boolean _enableTenantIsolation;
private final String _authToken;
private final Map<String, Object> _configOverrides;
private final String _zkAddress;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

call it _externalZkAddress? And add comment that if this is set, Quickstart won't start an embedded zk.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@@ -93,9 +95,31 @@ public QuickstartRunner(List<QuickstartTableRequest> tableRequests, int numContr
_enableTenantIsolation = enableIsolation;
_authToken = authToken;
_configOverrides = configOverrides;
_zkAddress = null;
_deleteExistingData = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be added as a command-line parameter for QuickStartCommand?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've refactored the constructor to make this simpler. For deleting existing data, if the flag is set to false for any of the other quickstart commands, there will be an exception on startup when attempting to create a data directory that already exists. I've added _deleteExistingData as a constructor argument for future quickstart commands.

@amrishlal
Copy link
Contributor

This is going to be super useful. I am wondering if we can have the option to set certain parameters as well to make debugging using Quickstart easier:

CommonConstants.java:
public static final long DEFAULT_BROKER_TIMEOUT_MS = 10_000L
public static final long DEFAULT_QUERY_EXECUTOR_TIMEOUT_MS = 15_000L

  • setting to higher value will prevent execution from terminating while debugging using Quickstart.

Also, if possible:
QueryException.java:
private static int _maxLinesOfStackTracePerFrame = 5;

  • setting to higher value while launching quickstart will allow for seeing full stack trace in console when query fails.

@kbastani
Copy link
Contributor Author

This is going to be super useful. I am wondering if we can have the option to set certain parameters as well to make debugging using Quickstart easier:

CommonConstants.java: public static final long DEFAULT_BROKER_TIMEOUT_MS = 10_000L public static final long DEFAULT_QUERY_EXECUTOR_TIMEOUT_MS = 15_000L

* setting to higher value will prevent execution from terminating while debugging using Quickstart.

Also, if possible: QueryException.java: private static int _maxLinesOfStackTracePerFrame = 5;

* setting to higher value while launching quickstart will allow for seeing full stack trace in console when query fails.

Yes, let me look into this. We're discussing adding configuration overrides to QuickStarts as another PR. We're still sorting out the best design philosophy for this, as configuration keys are applied to separate components. I think ideally we would be able to provide an easier zero-downtime configuration manager, such as using a config server with auto-refresh scope. This would allow a user to use a git repository as a way to version control and manage configurations for each of the different components. Any thoughts on this @amrishlal?

@xiangfu0
Copy link
Contributor

This is going to be super useful. I am wondering if we can have the option to set certain parameters as well to make debugging using Quickstart easier:

CommonConstants.java: public static final long DEFAULT_BROKER_TIMEOUT_MS = 10_000L public static final long DEFAULT_QUERY_EXECUTOR_TIMEOUT_MS = 15_000L

  • setting to higher value will prevent execution from terminating while debugging using Quickstart.

Also, if possible: QueryException.java: private static int _maxLinesOfStackTracePerFrame = 5;

  • setting to higher value while launching quickstart will allow for seeing full stack trace in console when query fails.

We should add support for quickstart to take extra configs or a config file to override existing configs. E.g. enabling grpc or auth for quick test.

@xiangfu0 xiangfu0 merged commit b118600 into apache:master Jan 19, 2022
KKcorps pushed a commit to KKcorps/incubator-pinot that referenced this pull request Jan 21, 2022
* Added empty quickstart command.

* Quickstart fixes per reviewer feedback.

* Fixed typo in QuickStartCommand

* Fixing bug with minion quickstart
@amrishlal
Copy link
Contributor

This would allow a user to use a git repository as a way to version control and manage configurations for each of the different components. Any thoughts on this @amrishlal?

This is more generalized (and probably more useful) approach than what I had in mind, but I was mainly thinking about something very simple like specifying a command line parameter ("-debug" or "-dev") that would increase the value of
DEFAULT_BROKER_TIMEOUT_MS, DEFAULT_QUERY_EXECUTOR_TIMEOUT_MS, and _maxLinesOfStackTracePerFrame) so that one could easily debug code through Quickstart without timing out in the middle of a debug session.

@xiangfu0
Copy link
Contributor

This would allow a user to use a git repository as a way to version control and manage configurations for each of the different components. Any thoughts on this @amrishlal?

This is more generalized (and probably more useful) approach than what I had in mind, but I was mainly thinking about something very simple like specifying a command line parameter ("-debug" or "-dev") that would increase the value of DEFAULT_BROKER_TIMEOUT_MS, DEFAULT_QUERY_EXECUTOR_TIMEOUT_MS, and _maxLinesOfStackTracePerFrame) so that one could easily debug code through Quickstart without timing out in the middle of a debug session.

Added in #8059

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants