PIP-251 Enhancing Transaction Buffer Stats and Introducing TransactionBufferInternalStats API #20291

Closed
liangyepianzhou opened this issue May 10, 2023 · 3 comments

Comments

@liangyepianzhou
Contributor

liangyepianzhou commented May 10, 2023

Background knowledge

In the Pulsar transaction implementation, each topic has a Transaction Buffer that prevents consumers from reading uncommitted messages; these messages stay invisible until their transaction is committed. The Transaction Buffer relies on two pieces of state: a position (maxReadPosition) and a set of transaction IDs (aborts). maxReadPosition is a position (ledgerId + entry sequence ID) up to which every message is guaranteed to belong to a finished transaction (committed or aborted). Any message belonging to an in-flight transaction lies after maxReadPosition. The broker dispatches to consumers only the messages before maxReadPosition, and while doing so the Transaction Buffer filters out the messages sent by aborted transactions. It does this by keeping the IDs of all aborted transactions in memory.
In short, the Transaction Buffer works mainly with maxReadPosition and the set of aborted transaction IDs.
Both change in real time as producers send transactional messages to the topic. After a broker restart, the Transaction Buffer would therefore have to replay every message in the original topic to rebuild them, which makes its startup very slow. To avoid this, the Transaction Buffer periodically writes both values as a message into a system topic.
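The dispatch rule described above can be sketched as a small filter. This is an illustrative model only, not broker code: the `Entry` class and the `dispatchable` helper are hypothetical, and treating maxReadPosition as inclusive is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Illustrative model only; none of these types are Pulsar classes. */
final class DispatchFilterSketch {

    /** A stored entry; txnId is null for messages sent outside a transaction. */
    static final class Entry {
        final long ledgerId;
        final long entryId;
        final String txnId;

        Entry(long ledgerId, long entryId, String txnId) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
            this.txnId = txnId;
        }
    }

    /** True if the entry is at or before the max read position (assumed inclusive). */
    static boolean isReadable(Entry e, long maxLedgerId, long maxEntryId) {
        if (e.ledgerId != maxLedgerId) {
            return e.ledgerId < maxLedgerId;
        }
        return e.entryId <= maxEntryId;
    }

    /** Keeps entries that are readable and do not belong to an aborted transaction. */
    static List<Entry> dispatchable(List<Entry> entries, long maxLedgerId, long maxEntryId,
                                    Set<String> aborts) {
        List<Entry> result = new ArrayList<>();
        for (Entry e : entries) {
            boolean aborted = e.txnId != null && aborts.contains(e.txnId);
            if (isReadable(e, maxLedgerId, maxEntryId) && !aborted) {
                result.add(e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Entry> entries = List.of(
                new Entry(5, 0, "txn-1"),   // committed
                new Entry(5, 1, "txn-2"),   // aborted
                new Entry(5, 2, null),      // non-transactional
                new Entry(5, 3, "txn-3"));  // in flight, after maxReadPosition
        List<Entry> out = dispatchable(entries, 5, 2, Set.of("txn-2"));
        System.out.println(out.size() + " entries dispatched");
    }
}
```

Both inputs to the filter (the position and the abort set) are exactly the state that a snapshot has to preserve across restarts.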
The number of aborted transaction IDs stored for a topic may be unbounded, but the amount of data that fits in a single bookie entry is limited. So we introduced the concept of multiple snapshots (PIP-196):

  1. The aborted transaction IDs are divided into equally-sized segments (size is configurable).
  2. Each segment is written as an individual message to a dedicated system topic.
  3. The positions of those messages are grouped into another message, called the segment index, which is written into a separate system topic. The segment index also contains the remaining transaction IDs (the last segment, which is smaller than the configured size).
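The three steps above can be sketched as a simple partitioning routine. This is a minimal sketch under stated assumptions: `SegmentSplitSketch`, `Split`, and `split` are hypothetical names, transaction IDs are modelled as plain strings, and writing to the system topics is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative model only; not the Pulsar snapshot implementation. */
final class SegmentSplitSketch {

    /** Result of a split: full (sealed) segments plus the smaller unsealed remainder. */
    static final class Split {
        final List<List<String>> sealedSegments;
        final List<String> unsealed;

        Split(List<List<String>> sealedSegments, List<String> unsealed) {
            this.sealedSegments = sealedSegments;
            this.unsealed = unsealed;
        }
    }

    /** Divides aborted txn IDs into equally sized segments; the remainder stays unsealed. */
    static Split split(List<String> abortedTxnIds, int segmentSize) {
        if (segmentSize <= 0) {
            throw new IllegalArgumentException("segmentSize must be positive");
        }
        List<List<String>> sealed = new ArrayList<>();
        int fullCount = abortedTxnIds.size() / segmentSize;
        for (int i = 0; i < fullCount; i++) {
            int from = i * segmentSize;
            sealed.add(new ArrayList<>(abortedTxnIds.subList(from, from + segmentSize)));
        }
        List<String> unsealed = new ArrayList<>(
                abortedTxnIds.subList(fullCount * segmentSize, abortedTxnIds.size()));
        return new Split(sealed, unsealed);
    }

    public static void main(String[] args) {
        List<String> ids = List.of("t1", "t2", "t3", "t4", "t5", "t6", "t7");
        Split result = split(ids, 3);
        System.out.println("sealed segments: " + result.sealedSegments.size()
                + ", unsealed IDs: " + result.unsealed.size());
    }
}
```

In this model each sealed segment would become one message in the segment topic, while the unsealed remainder would travel with the segment index.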

Motivation

Our primary goal is to improve the visibility and troubleshooting capabilities of the Pulsar Transaction Buffer by providing more detailed information about the snapshot stats and system topic internal status.
Currently, stats are available via the Admin REST API (/transactionBufferStats/{tenant}/{namespace}/{topic}), but it exposes only the state, maxReadPosition, and lastSnapshotTimestamps. To better understand and troubleshoot the transaction buffer, we need additional information, such as the segment size and the number of unsealed aborted transaction IDs corresponding to the maxReadPosition.
Additionally, there is no API for accessing the TransactionBufferInternalStats, which would provide valuable insight into the state of the system topics used for storing snapshots. Having this information available would facilitate problem investigation and resolution.

Goals

In Scope

  1. Enhance the existing TransactionBufferStats by adding information about snapshot stats, including the capacity of the current segment, the number of unsealed aborted transaction IDs, and other related data. This will provide better visibility and troubleshooting capabilities for the Pulsar Transaction Buffer.
  2. Introduce a new API for obtaining TransactionBufferInternalStats, allowing users to access the state of the system topic used for storing snapshots. This will facilitate problem investigation and resolution when issues arise with the transaction buffer.

Out of Scope

None.

High Level Design

The high-level design for this proposal consists of two main components:

  1. Enhancing the existing TransactionBufferStats by incorporating snapshot stats information.
  2. Introducing a new API to obtain TransactionBufferInternalStats.

Detailed Design

Design & Implementation Details

Enhancing TransactionBufferStats

The current TransactionBufferStats class will be extended to include snapshot segment stats related to the transaction buffer. The existing class is as follows:

public class TransactionBufferStats {

    /** The state of this transaction buffer. */
    public String state;

    /** The max read position of this transaction buffer. */
    public String maxReadPosition;

    /** The last snapshot timestamps of this transaction buffer. */
    public long lastSnapshotTimestamps;
}

The enhanced class will contain the following additional fields:

public class TransactionBufferStats {
    /** The state of this transaction buffer. */
    public String state;

    /** The max read position of this transaction buffer. */
    public String maxReadPosition;

    /** The last snapshot timestamps of this transaction buffer. */
    public long lastSnapshotTimestamps;

    // The total number of aborted transactions.
    public long totalAbortedTransactions;

    // The type of snapshot being used: either "Single" or "Segment".
    public String snapshotType;

    // If snapshotType is "Segment", this field provides additional segment-related statistics.
    public SegmentsStats segmentsStats;
}

public class SegmentsStats {
    // The current number of snapshot segments.
    public long segmentsSize;

    // The capacity of a snapshot segment, calculated from the current config (transactionBufferSnapshotSegmentSize).
    public long currentSegmentCapacity;

    // The number of latest aborted txn IDs not yet sealed into a segment (fewer than currentSegmentCapacity).
    public long unsealedAbortTxnIDSize;

    // A list of individual segment stats.
    public List<SegmentStats> segmentStats;

    // The timestamp at which the last snapshot segment was taken.
    public long lastTookSnapshotSegmentTimestamp;
}

public class SegmentStats {
    public String lastTxnID;
    public String persistentPosition;
}
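
As a rough illustration of how a caller might read these fields, the sketch below turns a stats object into a one-line summary. The nested classes are simplified local stand-ins for the classes above (not the Pulsar admin classes), and `summarize` and its output format are hypothetical.

```java
/** Illustrative only; the nested classes are local stand-ins, not Pulsar classes. */
final class BufferStatsSummarySketch {

    static final class SegmentsStats {
        long segmentsSize;
        long currentSegmentCapacity;
        long unsealedAbortTxnIDSize;
    }

    static final class TransactionBufferStats {
        String state;
        String maxReadPosition;
        long totalAbortedTransactions;
        String snapshotType;
        SegmentsStats segmentsStats;
    }

    /** Builds a summary; segment details are appended only for the "Segment" snapshot type. */
    static String summarize(TransactionBufferStats stats) {
        StringBuilder sb = new StringBuilder()
                .append("state=").append(stats.state)
                .append(", maxReadPosition=").append(stats.maxReadPosition)
                .append(", aborted=").append(stats.totalAbortedTransactions)
                .append(", snapshotType=").append(stats.snapshotType);
        SegmentsStats seg = stats.segmentsStats;
        if ("Segment".equals(stats.snapshotType) && seg != null) {
            sb.append(", segments=").append(seg.segmentsSize)
              .append(", unsealed=").append(seg.unsealedAbortTxnIDSize)
              .append('/').append(seg.currentSegmentCapacity);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        TransactionBufferStats stats = new TransactionBufferStats();
        stats.state = "Ready";
        stats.maxReadPosition = "5:10";
        stats.totalAbortedTransactions = 7;
        stats.snapshotType = "Segment";
        stats.segmentsStats = new SegmentsStats();
        stats.segmentsStats.segmentsSize = 2;
        stats.segmentsStats.currentSegmentCapacity = 3;
        stats.segmentsStats.unsealedAbortTxnIDSize = 1;
        System.out.println(summarize(stats));
    }
}
```

The null check matters because the segment-related field is only meaningful when snapshotType is "Segment".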

Adding a New API for TransactionBufferInternalStats

A new API will be introduced to obtain TransactionBufferInternalStats, which will provide information about the state of the system topic used for storing snapshots and indexes. The new API will allow users to access detailed internal stats about the transaction buffer, aiding in problem investigation and resolution.
The TransactionBufferInternalStats class will contain the following fields:

public class TransactionBufferInternalStats {
    // The type of snapshot being used: either "Single" or "Segment"
    public String snapshotType;

    // If snapshotType is "Single", this field will provide the statistics of single snapshot log.
    public SnapshotInternalStats singleSnapshotInternalStats;

    // If snapshotType is "Segment", this field will provide the statistics of snapshot segment topic.
    public SnapshotInternalStats segmentInternalStats;

    // If snapshotType is "Segment", this field will provide the statistics of snapshot segment index topic.
    public SnapshotInternalStats segmentIndexInternalStats;
}


public class SnapshotInternalStats {
    // The managed ledger name of the snapshot system topic.
    public String managedLedgerName;

    // The managed ledger internal stats of the snapshot system topic.
    public ManagedLedgerInternalStats managedLedgerInternalStats;
}
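
Which of the three fields is populated depends on snapshotType. A minimal sketch of that selection, assuming simplified local stand-ins for the classes above (the managedLedgerInternalStats field is omitted, and `populatedLedgers` is a hypothetical helper):

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; the nested classes are local stand-ins, not Pulsar classes. */
final class InternalStatsSketch {

    static final class SnapshotInternalStats {
        String managedLedgerName;

        SnapshotInternalStats(String managedLedgerName) {
            this.managedLedgerName = managedLedgerName;
        }
    }

    static final class TransactionBufferInternalStats {
        String snapshotType;
        SnapshotInternalStats singleSnapshotInternalStats;
        SnapshotInternalStats segmentInternalStats;
        SnapshotInternalStats segmentIndexInternalStats;
    }

    /** Returns the managed ledger names that are relevant for the reported snapshot type. */
    static List<String> populatedLedgers(TransactionBufferInternalStats stats) {
        List<String> names = new ArrayList<>();
        if ("Single".equals(stats.snapshotType)) {
            addIfPresent(names, stats.singleSnapshotInternalStats);
        } else if ("Segment".equals(stats.snapshotType)) {
            addIfPresent(names, stats.segmentInternalStats);
            addIfPresent(names, stats.segmentIndexInternalStats);
        }
        return names;
    }

    private static void addIfPresent(List<String> names, SnapshotInternalStats s) {
        if (s != null) {
            names.add(s.managedLedgerName);
        }
    }

    public static void main(String[] args) {
        TransactionBufferInternalStats stats = new TransactionBufferInternalStats();
        stats.snapshotType = "Segment";
        stats.segmentInternalStats = new SnapshotInternalStats("example-segments-ledger");
        stats.segmentIndexInternalStats = new SnapshotInternalStats("example-indexes-ledger");
        System.out.println(populatedLedgers(stats));
    }
}
```

The ledger names used in `main` are placeholders chosen for the example, not real managed ledger names.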

Public-facing Changes

Public API

We will enhance the existing admin API getTransactionBufferStats and add a new admin API getTransactionBufferInternalStats.
The existing admin API, getTransactionBufferStats, retrieves a TransactionBufferStats object containing the statistics of the TopicTransactionBuffer.

admin.transactions().getTransactionBufferStats(yourTopicName);

This proposal adds a SegmentsStats field to the TransactionBufferStats object. Details of the content changes for TransactionBufferStats can be found in the Design & Implementation Details section. Since the new segmentStats list within SegmentsStats can be large, a flag is introduced to control its retrieval. Therefore, the proposal adds an overload of the existing method:

// Enhanced API
// The existing method, which now returns additional statistics.
admin.transactions().getTransactionBufferStats(yourTopicName);

// The new overload, which controls whether segment stats are retrieved.
admin.transactions().getTransactionBufferStats(yourTopicName, acquireSegmentStats);
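
One way to picture the relationship between the two calls is a default method that delegates to the flagged overload. This is a sketch under stated assumptions, not the Pulsar admin client: `TransactionsApi`, `FakeTransactionsApi`, and the simplified `Stats` class are hypothetical, and the choice of `false` as the default mirrors the CLI behaviour described in this proposal rather than a confirmed client behaviour.

```java
import java.util.List;

/** Illustrative only; a stand-in interface, not the Pulsar admin API. */
final class StatsOverloadSketch {

    static final class Stats {
        String maxReadPosition;
        List<String> segmentStats;   // null when segment stats were not requested
    }

    interface TransactionsApi {
        Stats getTransactionBufferStats(String topic, boolean acquireSegmentStats);

        /** Assumed default: the one-argument form skips the potentially large segment list. */
        default Stats getTransactionBufferStats(String topic) {
            return getTransactionBufferStats(topic, false);
        }
    }

    /** In-memory fake that returns canned data so the sketch can run without a broker. */
    static final class FakeTransactionsApi implements TransactionsApi {
        @Override
        public Stats getTransactionBufferStats(String topic, boolean acquireSegmentStats) {
            Stats stats = new Stats();
            stats.maxReadPosition = "5:10";
            if (acquireSegmentStats) {
                stats.segmentStats = List.of("segment-0", "segment-1");
            }
            return stats;
        }
    }

    public static void main(String[] args) {
        TransactionsApi api = new FakeTransactionsApi();
        System.out.println(api.getTransactionBufferStats("my-topic").segmentStats);
        System.out.println(api.getTransactionBufferStats("my-topic", true).segmentStats);
    }
}
```

Keeping the segment list opt-in means existing callers do not pay for a payload that grows with the number of segments.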

This proposal also adds a new API to provide valuable insight into the stats of the snapshot storage-related system topic.

// New API
admin.transactions().getTransactionBufferInternalStats(yourTopicName);

Binary protocol

Configuration

CLI

In addition to the changes in the Admin API, we will also introduce new CLI commands to utilize the enhanced and newly added APIs. These CLI commands will allow users to interact with the APIs more conveniently and efficiently.

Enhanced getTransactionBufferStats API

For the enhanced getTransactionBufferStats API, which now includes SegmentsStats in the TransactionBufferStats, we will add a flag to control the retrieval of segment stats.

# Usage of the enhanced getTransactionBufferStats API with the new flag:
pulsar-admin transactions get-transaction-buffer-stats <yourTopicName> [--include-segment-stats]

The --include-segment-stats flag will allow users to decide whether to include the segment stats in the result. If not specified, the segment stats will not be included.

Newly Added getTransactionBufferInternalStats API

For the newly added getTransactionBufferInternalStats API, we will introduce a new CLI command to obtain the TransactionBufferInternalStats. This will provide users with the state of the system topics used for storing snapshots and indexes, aiding in problem investigation and resolution.

# Usage of the new getTransactionBufferInternalStats API:
pulsar-admin transactions get-transaction-buffer-internal-stats <yourTopicName>

This command will return the TransactionBufferInternalStats object, which includes the snapshot type, and, if applicable, the statistics of the single snapshot topic, the snapshot segment topic, and the snapshot segment index topic.
With these new CLI commands, users can more easily access and manage the enhanced and newly added transaction buffer APIs, improving their overall experience with the system.

Metrics

None

Monitoring

Users can monitor the internal state of the Transaction Buffer using the new getTransactionBufferInternalStats API and the enhanced getTransactionBufferStats API.

Security Considerations

No security considerations are needed.

Backward & Forward Compatibility

The new API will not affect backward or forward compatibility.

Revert

None.

Upgrade

None.

Alternatives

None.

General Notes

Links

@hpvd

hpvd commented May 10, 2023

this PIP Number seems to be already in use, see:
https://github.com/apache/pulsar/issues?q=is%3Aissue+is%3Aopen+PIP+251

@liangyepianzhou
Contributor Author

this PIP Number seems to be already in use, see:
https://github.com/apache/pulsar/issues?q=is%3Aissue+is%3Aopen+PIP+251

This is the same one. I have closed that issue.

@github-actions

The issue had no activity for 30 days, mark with Stale label.

@github-actions github-actions bot added the Stale label Jun 16, 2023
liangyepianzhou added a commit that referenced this issue Jul 22, 2023
…ransactionBufferInternalStats API (#20330)

master #20291

### Motivation

Our primary goal is to improve the visibility and troubleshooting capabilities of the Pulsar Transaction Buffer by providing more detailed information about the snapshot stats and system topic internal status.
### Modifications

1. Enhance the existing TransactionBufferStats by adding information about snapshot stats, including the capacity of the current segment, the number of unsealed aborted transaction IDs, and other related data. This will provide better visibility and troubleshooting capabilities for the Pulsar Transaction Buffer.
2. Introduce a new API for obtaining TransactionBufferInternalStats, allowing users to access the state of the system topic used for storing snapshots. This will facilitate problem investigation and resolution when issues arise with the transaction buffer.