Background knowledge
In the Pulsar transaction implementation, each topic is configured with a Transaction Buffer that prevents consumers from reading uncommitted messages; those messages stay invisible until the transaction is committed. The Transaction Buffer works with a Position (maxReadPosition) and a TxnID Set (aborts).
maxReadPosition is a position (ledgerId + entry sequence ID) for which it is guaranteed that all messages up to it belong to a finished transaction (committed or aborted). Any message belonging to an in-flight transaction lies after maxReadPosition. The broker dispatches only messages before maxReadPosition to consumers. While doing so, the Transaction Buffer filters out the messages sent by aborted transactions, which it does by keeping all aborted transaction IDs in memory.
In short, the Transaction Buffer relies mainly on maxReadPosition and the set of aborted transaction IDs.
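The dispatch rule described above can be sketched roughly as follows. This is a minimal illustration, not Pulsar's implementation: the class `DispatchFilterSketch`, its `Entry` type, and the method names are hypothetical, and the sketch assumes that maxReadPosition itself is readable (inclusive comparison).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Illustrative sketch only; none of these types are Pulsar classes. */
final class DispatchFilterSketch {

    /** A stored message: its position plus the owning transaction, if any. */
    static final class Entry {
        final long ledgerId;
        final long entryId;
        final String txnId; // null for a message sent outside a transaction

        Entry(long ledgerId, long entryId, String txnId) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
            this.txnId = txnId;
        }
    }

    /** True when the entry is at or before the max read position (assumed inclusive). */
    static boolean notAfter(Entry e, long maxLedgerId, long maxEntryId) {
        if (e.ledgerId != maxLedgerId) {
            return e.ledgerId < maxLedgerId;
        }
        return e.entryId <= maxEntryId;
    }

    /** Keeps entries that are readable and do not belong to an aborted transaction. */
    static List<Entry> dispatchable(List<Entry> entries, long maxLedgerId, long maxEntryId,
                                    Set<String> aborts) {
        List<Entry> visible = new ArrayList<>();
        for (Entry e : entries) {
            boolean readable = notAfter(e, maxLedgerId, maxEntryId);
            boolean aborted = e.txnId != null && aborts.contains(e.txnId);
            if (readable && !aborted) {
                visible.add(e);
            }
        }
        return visible;
    }
}
```

Both inputs to the filter, the max read position and the aborted set, are exactly the state that has to be restored after a broker restart.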
These two variables change in real time while producers send transactional messages to the topic. After a broker restart, the Transaction Buffer would therefore need to replay all the messages in the original topic to restore them. Because that makes the Transaction Buffer very slow to start, it periodically writes the two variables as a snapshot message into a system topic.
The number of aborted transaction IDs for a topic is unbounded, but the amount of data that fits in a single bookie entry is limited. PIP-196 therefore introduced the concept of multiple snapshots:
The aborted transaction IDs are divided into equally-sized segments (size is configurable).
Each segment is written as an individual message to a dedicated system topic.
The positions of those messages are grouped into another message, called the segment index, which is written into a separate system topic. The segment index also contains the remaining aborted transaction IDs (the last, unsealed segment, which is smaller than the configured size).
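The segmentation scheme in the list above can be sketched as follows. This is a simplified illustration under stated assumptions: `AbortedTxnSegmentSketch` and `Snapshot` are hypothetical names, and the segment capacity is expressed directly as a number of transaction IDs, whereas the proposal says the capacity is calculated from a configured segment size.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the Pulsar snapshot implementation. */
final class AbortedTxnSegmentSketch {

    /** Full (sealed) segments plus the smaller unsealed remainder kept with the index. */
    static final class Snapshot {
        final List<List<String>> sealedSegments;
        final List<String> unsealedTxnIds;

        Snapshot(List<List<String>> sealedSegments, List<String> unsealedTxnIds) {
            this.sealedSegments = sealedSegments;
            this.unsealedTxnIds = unsealedTxnIds;
        }
    }

    /** Splits the aborted IDs into equally sized segments; the tail stays unsealed. */
    static Snapshot split(List<String> abortedTxnIds, int segmentCapacity) {
        if (segmentCapacity <= 0) {
            throw new IllegalArgumentException("segmentCapacity must be positive");
        }
        // Index of the first ID that does not fit into a full segment.
        int fullEnd = (abortedTxnIds.size() / segmentCapacity) * segmentCapacity;
        List<List<String>> sealed = new ArrayList<>();
        for (int start = 0; start < fullEnd; start += segmentCapacity) {
            sealed.add(new ArrayList<>(abortedTxnIds.subList(start, start + segmentCapacity)));
        }
        List<String> unsealed = new ArrayList<>(abortedTxnIds.subList(fullEnd, abortedTxnIds.size()));
        return new Snapshot(sealed, unsealed);
    }
}
```

In this sketch each sealed segment would become one message in the segment topic, while the unsealed remainder travels with the segment index.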
Motivation
Our primary goal is to improve the visibility and troubleshooting capabilities of the Pulsar Transaction Buffer by providing more detailed information about the snapshot stats and system topic internal status.
Currently, stats are available via the Admin REST API (/transactionBufferStats/{tenant}/{namespace}/{topic}), but it only exposes maxReadPosition and lastSnapshotTimestamps. To better understand and troubleshoot the transaction buffer, we need additional information, such as the segment size and the number of unsealed aborted transaction IDs corresponding to the maxReadPosition.
Additionally, there is no API for accessing the TransactionBufferInternalStats, which would provide valuable insights into the stats of the system topic used for storing snapshots. Having this information available would facilitate problem investigation and resolution.
Goals
In Scope
Enhance the existing TransactionBufferStats by adding information about snapshot stats, including the capacity of the current segment, the number of unsealed aborted transaction IDs, and other related data. This will provide better visibility and troubleshooting capabilities for the Pulsar Transaction Buffer.
Introduce a new API for obtaining TransactionBufferInternalStats, allowing users to access the state of the system topic used for storing snapshots. This will facilitate problem investigation and resolution when issues arise with the transaction buffer.
Out of Scope
None.
High Level Design
The high-level design for this proposal consists of two main components:
Enhancing the existing TransactionBufferStats by incorporating snapshot stats information.
Introducing a new API to obtain TransactionBufferInternalStats.
Detailed Design
Design & Implementation Details
Enhancing TransactionBufferStats
The current TransactionBufferStats class will be extended to include snapshot segment stats related to the transaction buffer. The class currently looks as follows:
    public class TransactionBufferStats {
        /** The state of this transaction buffer. */
        public String state;
        /** The max read position of this transaction buffer. */
        public String maxReadPosition;
        /** The last snapshot timestamps of this transaction buffer. */
        public long lastSnapshotTimestamps;
    }
The enhanced class will contain the following additional fields:
    public class TransactionBufferStats {
        /** The state of this transaction buffer. */
        public String state;
        /** The max read position of this transaction buffer. */
        public String maxReadPosition;
        /** The last snapshot timestamps of this transaction buffer. */
        public long lastSnapshotTimestamps;
        // The total number of aborted transactions.
        public long totalAbortedTransactions;
        // The type of snapshot being used: either "Single" or "Segment"
        public String snapshotType;
        // If snapshotType is "Segment", this field will provide additional segment-related statistics
        public SegmentsStats SegmentsStats;
    }

    public class SegmentsStats {
        // The current number of the snapshot segments.
        public long segmentsSize;
        // The capacity of snapshot segment calculated by the current config (transactionBufferSnapshotSegmentSize)
        public long currentSegmentCapacity;
        // The number of latest aborted txn IDs not yet sealed into a segment; always less than currentSegmentCapacity
        public long unsealedAbortTxnIDSize;
        // A list of individual segment stats
        public List<SegmentStats> segmentStats;
        /** The last snapshot segment timestamps of this transaction buffer. */
        public long lastTookSnapshotSegmentTimestamp;
    }

    public class SegmentStats {
        public String lastTxnID;
        public String persistentPosition;
    }
Adding a New API for TransactionBufferInternalStats
A new API will be introduced to obtain TransactionBufferInternalStats, which will provide information about the state of the system topic used for storing snapshots and indexes. The new API will allow users to access detailed internal stats about the transaction buffer, aiding in problem investigation and resolution.
The TransactionBufferInternalStats class will contain the following fields:
    public class TransactionBufferInternalStats {
        // The type of snapshot being used: either "Single" or "Segment"
        public String snapshotType;
        // If snapshotType is "Single", this field will provide the statistics of single snapshot log.
        public SnapshotInternalStats singleSnapshotInternalStats;
        // If snapshotType is "Segment", this field will provide the statistics of snapshot segment topic.
        public SnapshotInternalStats segmentInternalStats;
        // If snapshotType is "Segment", this field will provide the statistics of snapshot segment index topic.
        public SnapshotInternalStats segmentIndexInternalStats;
    }

    public class SnapshotInternalStats {
        // The managed ledger name for the snapshot segment index topic.
        public String managedLedgerName;
        // The managed ledger internal stats for the snapshot segment index topic.
        public ManagedLedgerInternalStats managedLedgerInternalStats;
    }
Public-facing Changes
Public API
We will enhance the existing admin API getTransactionBufferStats and add a new admin API getTransactionBufferInternalStats.
The existing admin API, getTransactionBufferStats, retrieves a TransactionBufferStats object containing the statistics of the TopicTransactionBuffer.
This proposal adds a SegmentsStats field to the TransactionBufferStats object. Details of the changes to TransactionBufferStats can be found in the Design & Implementation Details section. Because the new segmentStats field within SegmentsStats can be large, a flag is introduced to control its retrieval, so the proposal adds a new overload of the API:
    // Enhanced API
    // The existing API, which now includes additional statistics.
    admin.transactions().getTransactionBufferStats(yourTopicName);
    // The new enhanced API allows control over whether to retrieve segment stats.
    admin.transactions().getTransactionBufferStats(yourTopicName, acquireSegmentStats);
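One way to read the acquireSegmentStats flag is sketched below. This is an assumption-laden illustration, not the broker code: `SegmentStatsFlagSketch` and `build` are hypothetical, the two stats classes are trimmed copies of the ones proposed above, and the sketch assumes that the summary counters are always returned while only the potentially large per-segment list is gated by the flag.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; trimmed, hypothetical copies of the proposed stats classes. */
final class SegmentStatsFlagSketch {

    static final class SegmentStats {
        final String lastTxnID;
        final String persistentPosition;

        SegmentStats(String lastTxnID, String persistentPosition) {
            this.lastTxnID = lastTxnID;
            this.persistentPosition = persistentPosition;
        }
    }

    static final class SegmentsStats {
        long segmentsSize;
        long currentSegmentCapacity;
        long unsealedAbortTxnIDSize;
        List<SegmentStats> segmentStats; // stays null unless explicitly requested
    }

    /** Fills the cheap counters always; copies the per-segment list only on request. */
    static SegmentsStats build(List<SegmentStats> sealedSegments, long capacity,
                               long unsealedCount, boolean acquireSegmentStats) {
        SegmentsStats stats = new SegmentsStats();
        stats.segmentsSize = sealedSegments.size();
        stats.currentSegmentCapacity = capacity;
        stats.unsealedAbortTxnIDSize = unsealedCount;
        if (acquireSegmentStats) {
            stats.segmentStats = new ArrayList<>(sealedSegments);
        }
        return stats;
    }
}
```

Keeping the list optional means a routine stats call stays small even for a topic with many sealed segments.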
This proposal also adds a new API to provide valuable insight into the stats of the snapshot storage-related system topic.
    // New API
    admin.transactions().getTransactionBufferInternalStats(yourTopicName);
Binary protocol
Configuration
CLI
In addition to the changes in the Admin API, we will also introduce new CLI commands to utilize the enhanced and newly added APIs. These CLI commands will allow users to interact with the APIs more conveniently and efficiently.
Enhanced getTransactionBufferStats API
For the enhanced getTransactionBufferStats API, which now includes SegmentsStats in the TransactionBufferStats, we will add a flag to control the retrieval of segment stats.
# Usage of the enhanced getTransactionBufferStats API with the new flag:
pulsar-admin transactions get-transaction-buffer-stats <yourTopicName> [--include-segment-stats]
The --include-segment-stats flag will allow users to decide whether to include the segment stats in the result. If not specified, the segment stats will not be included.
Newly Added getTransactionBufferInternalStats API
For the newly added getTransactionBufferInternalStats API, we will introduce a new CLI command to obtain the TransactionBufferInternalStats. This will provide users with the state of the system topics used for storing snapshots and indexes, aiding in problem investigation and resolution.
This command will return the TransactionBufferInternalStats object, which includes the snapshot type, and, if applicable, the statistics of the single snapshot topic, the snapshot segment topic, and the snapshot segment index topic.
With these new CLI commands, users can more easily access and manage the enhanced and newly added transaction buffer APIs, improving their overall experience with the system.
Metrics
None
Monitoring
Users can monitor the internal state of the Transaction Buffer using the new getTransactionBufferInternalStats API and the enhanced getTransactionBufferStats API.
Security Considerations
No security considerations are needed.
Backward & Forward Compatibility
The new API will not affect backward or forward compatibility.
Revert
None.
Upgrade
None.
Alternatives
None.
General Notes
Links