[question] Even if I recreate the Kafka topic or modify the topic properties, how can consuming continue? #7100
Comments
It doesn't matter how much data is lost. Consuming just needs to be restored and operate normally. :)
If you don't care about data loss, then you can do the following:
@mcvsubbu Is there any workaround? :)
@mcvsubbu
As of now, there is no workaround. If you think about it, Pinot consumes from Kafka using offsets. Kafka guarantees a unique offset for each message within a partition of a topic. Across topics, the offsets and partitions will not match in general, so the semantics of an identifier for a 'row' are lost.
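To illustrate why this breaks consumption, here is a minimal sketch (not Pinot code) of what happens when a saved offset checkpoint is compared against a recreated topic. The topic name, checkpoint values, and broker address are made-up assumptions; it only assumes the kafka-python client.

```python
from kafka import KafkaConsumer, TopicPartition

TOPIC = "my-topic"  # hypothetical topic name
checkpoint = {0: 15230, 1: 14987}  # partition -> last consumed offset (made-up values)

consumer = KafkaConsumer(bootstrap_servers="localhost:9092")
partitions = [TopicPartition(TOPIC, p) for p in checkpoint]
end_offsets = consumer.end_offsets(partitions)  # latest offsets on the broker

for tp in partitions:
    saved = checkpoint[tp.partition]
    latest = end_offsets[tp]
    if saved > latest:
        # After the topic is deleted and recreated, partitions start again at
        # offset 0, so a checkpoint taken before the recreation now points past
        # the end of the partition and consumption cannot simply continue from it.
        print(f"partition {tp.partition}: checkpoint {saved} > latest {latest}; consumer is stuck")
```

In other words, the recreated topic restarts at offset 0, so any checkpoint taken before the recreation no longer identifies a valid position in the new stream.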
@mcvsubbu
1. Is there any development in progress to solve this problem?
2. If I follow the approach of creating a new table, the existing data must be preserved, so I would create a new table and insert new data there. For example, to continue using the data in both tables, I would route queries like: if (searchtime < createTableBtime) select from table A; else select from table B. Is that a reasonable approach? Any advice would be appreciated. :) (A rough sketch of this routing idea follows this comment.)
3. I've seen this issue: #6555, for reference.
4. We have trouble because we want to store customer data in Pinot long-term (6 months to 1 year), rather than storing, verifying, and erasing user data one time.
We know you are busy, but we look forward to your answers.
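A rough sketch of the routing idea in question 2, assuming the application picks the table (or tables) per query. The table names, the cutover timestamp, and the eventTime column are hypothetical:

```python
CREATE_TABLE_B_TIME_MS = 1625097600000  # hypothetical cutover timestamp (createTableBtime)

def tables_for_range(start_ms: int, end_ms: int) -> list[str]:
    """Return the table(s) that cover the requested time range."""
    tables = []
    if start_ms < CREATE_TABLE_B_TIME_MS:
        tables.append("tableA")  # data ingested before the cutover
    if end_ms >= CREATE_TABLE_B_TIME_MS:
        tables.append("tableB")  # data ingested after the cutover
    return tables

def build_queries(start_ms: int, end_ms: int) -> list[str]:
    # If the range straddles the cutover, query both tables and merge the
    # results in the application layer.
    return [
        f"SELECT * FROM {t} WHERE eventTime BETWEEN {start_ms} AND {end_ms}"
        for t in tables_for_range(start_ms, end_ms)
    ]

print(build_queries(1620000000000, 1630000000000))  # spans the cutover: two queries
```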
Hi @mcvsubbu, I am writing to get answers to the questions above. I am a developer working on the open-source Pinpoint APM. If you can answer the questions above, I think we can build good features using Pinot.
There are two types of properties when it comes to changing the stream configs: incompatible changes, where the underlying stream itself changes (and with it the set of offsets), and compatible changes, where the existing offsets remain valid.
The problem with the first type is that the offsets of the different partitions change completely when the underlying stream changes. The pause/resume feature recently merged into master (#8986 and #9289) can help here: for these incompatible parameter changes, the resume request has an option to handle a completely new set of offsets. Operators can now follow a three-step process: first, issue a pause request; second, change the consumption parameters; finally, issue the resume request with the appropriate option. These steps preserve the old data and allow the new data to be consumed immediately, and queries continue to be served throughout the operation. For the second type, the force-commit endpoint (#9197) can be used: the current consuming segments, which hold the previous values of the stream config, are immediately completed and new consuming segments are spun off, and these new consuming segments pick up the new values in the stream config.
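A sketch of that operator workflow against the Pinot controller's REST API, using python-requests. The endpoint paths, the resume option parameter, the controller address, and the table name are assumptions based on the PRs referenced above; confirm them against your controller's Swagger UI for your Pinot version.

```python
import requests

CONTROLLER = "http://localhost:9000"  # assumed controller address
TABLE = "myTable_REALTIME"            # hypothetical realtime table name

# 1. Pause consumption before making an incompatible stream-config change.
requests.post(f"{CONTROLLER}/tables/{TABLE}/pauseConsumption").raise_for_status()

# 2. Update the stream configs in the table config (e.g. point the table at the
#    recreated topic) through the usual table-config update endpoint.

# 3. Resume, telling Pinot where to start on the completely new set of offsets
#    ("smallest" or "largest"; the parameter name is an assumption).
requests.post(
    f"{CONTROLLER}/tables/{TABLE}/resumeConsumption",
    params={"consumeFrom": "smallest"},
).raise_for_status()

# For compatible changes only, force-committing the consuming segments is enough:
# requests.post(f"{CONTROLLER}/tables/{TABLE}/forceCommit").raise_for_status()
```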
Hello~
I am using Pinot and it works well. Thank you for making a great product. :)
I have a question because something didn't work the way I expected.
Even if I recreate the Kafka topic or modify the topic properties, how can consuming continue?
I am storing data via stream ingestion.
For some reason I need to delete the Kafka topic and recreate it. After that, the consuming segment seems to have stopped.
That is, no more data is stored.
I've tested the same scenario several times, but the consuming segment still stops and no data is saved.
So I tried several methods to solve this problem. Among the various attempts, when I
1. disable the data table,
2. delete and recreate the Kafka topic, and
3. enable the data table,
sometimes the consuming segment recovers its operation.
However, it does not always work normally. (A rough sketch of the disable/enable controller calls is shown below.)
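For reference, a sketch of that disable / recreate-topic / enable sequence through the controller REST API with python-requests. The state-toggle endpoint has changed shape across Pinot releases, so the path, parameters, controller address, and table name here are assumptions to verify against your version's API docs.

```python
import requests

CONTROLLER = "http://localhost:9000"  # assumed controller address
TABLE = "myTable"                     # hypothetical table name

def set_table_state(state: str) -> None:
    # Assumed endpoint: PUT /tables/{tableName}/state?state=enable|disable&type=realtime
    resp = requests.put(
        f"{CONTROLLER}/tables/{TABLE}/state",
        params={"state": state, "type": "realtime"},
    )
    resp.raise_for_status()

set_table_state("disable")
# ... delete and recreate the Kafka topic here (outside Pinot) ...
set_table_state("enable")
```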
I also tried reloading segments after the consuming segment stopped, but it still didn't work.
In addition, I tried various other methods, but consuming stayed stopped.
I also restarted the Docker cluster several times, but the consuming segment still did not work.
I tried various other methods besides these, but couldn't find a solution.
My guess is that when the Kafka topic is recreated, the data offsets change and this is what happens.
In my opinion, if the offsets were properly reset within the Pinot consumer, the consuming segment would continue to accumulate data normally even after the Kafka topic is recreated.
Even if I recreate the Kafka topic or modify the properties, how can consuming continue?
I don't know the internal logic well, but I looked at the org.apache.pinot.core.realtime.impl.kafka2.KafkaConsumerFactory, KafkaPartitionLevelConsumer, and KafkaStreamLevelConsumer classes and couldn't find any problem.
If the consuming segment stops, re-creating the table may be one way,
but since the previously stored data would be lost, I am looking for a way to keep the consuming segment operating normally without losing data and without re-creating the table.
Note that the Docker image version currently used is as follows.
table config
I know you are busy developing, but I hope you can help. I've been looking for a solution for a week but can't find a way.