Add with_split to DatasetDict.map #7368
Can you check this out, @lhoestq?
src/datasets/dataset_dict.py
if with_split and "split" in fn_kwargs:
    raise ValueError("The key 'split' is reserved for the split name and cannot be passed in fn_kwargs.")
To avoid this issue, we pass indices/rank as positional arguments (this way users can name their parameters arbitrarily). Maybe we can use this trick: https://stackoverflow.com/a/66274908 ?
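To illustrate the point about positional arguments, here is a minimal sketch in plain Python. It does not use the datasets library; the helper names (call_with_keyword, call_with_positional, tag) are hypothetical and exist only for this example.

```python
def call_with_keyword(function, example, split_name):
    # Keyword passing: the user's function must have a parameter
    # literally named "split", otherwise the call raises TypeError.
    return function(example, split=split_name)


def call_with_positional(function, example, split_name):
    # Positional passing: the user is free to name the parameter anything.
    return function(example, split_name)


def tag(example, which):
    # A user function that names its split parameter "which", not "split".
    return {**example, "origin": which}


positional_result = call_with_positional(tag, {"text": "hi"}, "train")

try:
    call_with_keyword(tag, {"text": "hi"}, "train")
    keyword_failed = False
except TypeError:
    keyword_failed = True
```

With positional passing the call succeeds regardless of the parameter name, while the keyword variant fails for any function that does not declare a parameter called split.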
@lhoestq
Ah, I see what you mean.
So what you're saying is that some people may already be passing a "split" key through fn_kwargs, and my implementation could conflict with that? I agree.
So let's use partial instead of fn_kwargs to solve the problem? Sounds like a good idea!
So I subclassed partial so that the bound arguments are appended after the call-time arguments. Can you take a look?
class bind(partial):
    def __call__(self, *fn_args, **fn_kwargs):
        return self.func(*fn_args, *self.args, **fn_kwargs)
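For readers following along, here is a small self-contained sketch of how this bind class behaves compared with functools.partial. The describe function and the "train" value are assumptions made up for the example; how the class is wired into DatasetDict.map is not shown here.

```python
from functools import partial


class bind(partial):
    # Same class as proposed above: bound positional arguments are
    # appended AFTER the arguments supplied at call time.
    def __call__(self, *fn_args, **fn_kwargs):
        return self.func(*fn_args, *self.args, **fn_kwargs)


def describe(example, split_name):
    # Hypothetical user function; the second parameter can have any name.
    return (example, split_name)


# Plain partial prepends the bound argument, so it lands in "example".
prepended = partial(describe, "train")("row")

# bind appends it, so it lands in the trailing "split_name" slot.
appended = bind(describe, "train")("row")
```

Note that, as written, this __call__ does not forward self.keywords, so keyword arguments given when constructing bind would be ignored; whether that matters depends on how the class is used in the final implementation.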
#7356