Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TTreeProcessorMP processes events multiple times when there are more threads than entries #15425

Closed
1 task done
iguinn opened this issue May 6, 2024 · 8 comments · Fixed by #16147
Closed
1 task done
Assignees
Labels

Comments

@iguinn
Copy link

iguinn commented May 6, 2024

Check duplicate issues.

  • Checked for duplicates

Description

TTreeProcessorMP process events multiple times when you read multiple files using TChain, and some of the files have fewer entries than there are threads. This has been observed using TChain and TSelector, but I haven't tried it out with any other methods of using TTreeProcessorMP.

The attached reproducer minimally shows this bug by showing that the Process method is run more than it should be. Another observations that are not shown in the reproducer is that the additional events are repeats of existing events. In the code where I first observed this, we frequently have files where one thread is processing multiple events (5-10ish), and most of them are real events, with the last few being repeats. It also seems like it always tries to process at least 15 events for each file if you use 16 threads, even if those events don't exist; if there are more than 15 events than it seems to behave as expected.

Reproducer

mp_bug.zip
Run:

mkdir files
root FillRootFiles.C
root TestMP.C+

This will create 100 root files with 1 entry each called files/f_0###.root. It will then use TTreeProcessorMP with 16 threads to read through the files and count the number of events read. This should be 100; instead it is 1500.

ROOT version

v6.28/06 and 6.26/10

Installation method

Built from source

Operating system

Ubuntu 22.04

Additional context

No response

@iguinn iguinn added the bug label May 6, 2024
@dpiparo dpiparo self-assigned this May 9, 2024
@bellenot
Copy link
Member

bellenot commented Jun 5, 2024

Not sure if it's a bug or a feature... With TTreeProcessorMP, if the number of files to process is larger than the number of worker processes, the process is invoked once per file. Otherwise entries are divided equally between workers.
To see this, change your code to have 10 files and 1000 events per file:

  ULong_t n_files = 10;
  ULong_t ev_per_file = 1000;

you'll see:

TParameter<int> ct = 10000

But I might be wrong and @pcanal, @dpiparo, or @guitargeek can correct me...

@dpiparo
Copy link
Member

dpiparo commented Jun 13, 2024

But isn't the report about the opposite case, i.e. when there are more workers than events to process?

@bellenot
Copy link
Member

100 root files with 16 threads, so more files than workers. Or did I miss something?

@iguinn
Copy link
Author

iguinn commented Jun 13, 2024

It's about having more threads in a file than events (there's one event per file). As mentioned above, it seems like the process is getting invoked once per thread per file, but if there aren't enough events, the extra threads seem to reprocess an old event. This is what we had noticed with the much more complex code that led us to identify this, which is that certain events were getting processed and histogrammed multiple times in cases like this. We were able to work around this successfully by checking if the same event is being processed multiple times in a row.

@dpiparo
Copy link
Member

dpiparo commented Jul 30, 2024

In this case, the behaviour is confirmed:

void fill(){
    ROOT::RDataFrame(3).Define("a", [](){return 1;}).Snapshot("t", "f.root");
}

class TestSelector : public TSelector {
public:
  TH1F *h;
  
  virtual void SlaveBegin(TTree*) {
    h = new TH1F("h", "h", 8, -1, 1);
    h->SetDirectory(0);
  }
  virtual bool Process(Long64_t) {
    h->Fill(0);
    return true;
  }
  virtual void SlaveTerminate() {
    GetOutputList()->Add(h);
  }
};

void b(){
    //fill();
    ROOT::TTreeProcessorMP pool(12);
    TestSelector sel;
    auto h = pool.Process("f.root", sel);
    h->At(0)->Print();
}

prints TH1.Print Name = h, Entries= 11, Total sum= 11

dpiparo added a commit to dpiparo/root that referenced this issue Jul 31, 2024
When processing trees with less entries than workers with TTreeProcessorMP
some entries were processed multiple times because of a mistake in the
algorithm calculating the event ranges.

Thanks to @hageboeck for the help with the output management of the test.

Fixes root-project#15425
@dpiparo
Copy link
Member

dpiparo commented Jul 31, 2024

Thanks a lot for the report. Indeed it was a problem, which is hopefully fixed by #16147 . I will backport the change to 6.32, too.

dpiparo added a commit to dpiparo/root that referenced this issue Jul 31, 2024
When processing trees with less entries than workers with TTreeProcessorMP
some entries were processed multiple times because of a mistake in the
algorithm calculating the event ranges.

Fixes root-project#15425
dpiparo added a commit to dpiparo/root that referenced this issue Jul 31, 2024
When processing trees with less entries than workers with TTreeProcessorMP
some entries were processed multiple times because of a mistake in the
algorithm calculating the event ranges.

Fixes root-project#15425
dpiparo added a commit to dpiparo/root that referenced this issue Aug 7, 2024
When processing trees with less entries than workers with TTreeProcessorMP
some entries were processed multiple times because of a mistake in the
algorithm calculating the event ranges.

Fixes root-project#15425
dpiparo added a commit to dpiparo/root that referenced this issue Aug 7, 2024
When processing trees with less entries than workers with TTreeProcessorMP
some entries were processed multiple times because of a mistake in the
algorithm calculating the event ranges.

Fixes root-project#15425
dpiparo added a commit that referenced this issue Aug 7, 2024
When processing trees with less entries than workers with TTreeProcessorMP
some entries were processed multiple times because of a mistake in the
algorithm calculating the event ranges.

Fixes #15425
@dpiparo
Copy link
Member

dpiparo commented Aug 7, 2024

Reopening until the BP is merged in the 6.32 branch.

@dpiparo dpiparo reopened this Aug 7, 2024
dpiparo added a commit to dpiparo/root that referenced this issue Aug 7, 2024
When processing trees with less entries than workers with TTreeProcessorMP
some entries were processed multiple times because of a mistake in the
algorithm calculating the event ranges.

Fixes root-project#15425
dpiparo added a commit that referenced this issue Aug 7, 2024
When processing trees with less entries than workers with TTreeProcessorMP
some entries were processed multiple times because of a mistake in the
algorithm calculating the event ranges.

Fixes #15425
@dpiparo
Copy link
Member

dpiparo commented Aug 7, 2024

BP also merged.

@dpiparo dpiparo closed this as completed Aug 7, 2024
silverweed pushed a commit to silverweed/root that referenced this issue Aug 14, 2024
When processing trees with less entries than workers with TTreeProcessorMP
some entries were processed multiple times because of a mistake in the
algorithm calculating the event ranges.

Fixes root-project#15425
silverweed pushed a commit to silverweed/root that referenced this issue Aug 19, 2024
When processing trees with less entries than workers with TTreeProcessorMP
some entries were processed multiple times because of a mistake in the
algorithm calculating the event ranges.

Fixes root-project#15425
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
No open projects
Status: Done
Status: Issues
Development

Successfully merging a pull request may close this issue.

3 participants