Features/context #109
625 tests passing for each Python version :)
from .base import Dataset
from ..selector import Selector

# DO WE NEED INTERNAL LINKS??
I see that you define the page sections as internal_links; I believe that is more than enough for our use case. Do we still need this comment?
month = self.rng.randint(1, 12)

max_days = 31 if month in (1, 3, 5, 7, 8, 10, 12) else 30
max_days = max_days if month != 2 else 29
It seems that we are not handling leap years here: February is always allowed 29 days, so in a non-leap year this could break the code with `ValueError: day is out of range for month`, even though the probability of that happening is small.
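One possible way to avoid this, sketched under the assumption that the year is drawn before the day: ask the standard library how many days the chosen month has in that year instead of hard-coding the limits. The `random_date` helper and its `year_range` parameter are hypothetical names used for illustration only; they are not part of this PR.

```python
import calendar
import random


def random_date(rng, year_range=(1900, 2024)):
    """Return a (year, month, day) tuple that is always a valid calendar date.

    Hypothetical helper: the name and the year_range default are assumptions.
    """
    year = rng.randint(*year_range)
    month = rng.randint(1, 12)
    # calendar.monthrange returns (weekday of the 1st, number of days in month),
    # so February yields 29 only when `year` is a leap year.
    max_days = calendar.monthrange(year, month)[1]
    day = rng.randint(1, max_days)
    return year, month, day


rng = random.Random(0)  # seeded so the draw is reproducible
year, month, day = random_date(rng)
```

Because the upper bound comes from `calendar.monthrange`, constructing `datetime.date(year, month, day)` from the result should not raise, and the month-specific tuple of 31-day months is no longer needed.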
Formatting is a bit weird (which will be solved once the black PR is merged), but overall this PR adds a very nice structure around the organization of tasks and rewards, not to mention the dataset modularization and extra refactoring.
My only concern is that date QA does not take leap years into consideration, which could cause an exception. It shouldn't affect the flow much, though, since we have a task creation retry policy in place.
- `Dataset` class which enables context generation via 3 methods:
  - `get` (specific context retrieval, fully deterministic)
  - `search` (dataset-specific search algorithm, generally deterministic)
  - `random` (produces a random context, can be seeded)
- `requests`-based Wikipedia dataclass dropped in favor of the wiki Python API. Closes Improve wikipedia retrieval mechanism #76
- `Selector` class and variants, which enable customizable selection from sets of link-like objects
- `Context` dataclass which all datasets now return
- `MaxRetryError` exception class for handling API call errors
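
To make the three-method contract concrete, here is a minimal sketch of how such an interface could look. It is not the code from this PR: the `Context` fields, the `InMemoryDataset` class, and its `PAGES` data are all assumptions made for illustration. Only the names `Dataset`, `Context`, `get`, `search`, `random`, and `internal_links` come from the conversation.

```python
import random
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Context:
    # Field names are assumed; the PR only says all datasets return a Context.
    title: str
    content: str
    internal_links: List[str] = field(default_factory=list)


class Dataset(ABC):
    def __init__(self, seed: Optional[int] = None):
        # A per-instance RNG keeps random() reproducible when a seed is given.
        self.rng = random.Random(seed)

    @abstractmethod
    def get(self, name: str) -> Context:
        """Retrieve one specific context (fully deterministic)."""

    @abstractmethod
    def search(self, name: str) -> Context:
        """Run a dataset-specific search (generally deterministic)."""

    @abstractmethod
    def random(self) -> Context:
        """Produce a random context (reproducible when seeded)."""


class InMemoryDataset(Dataset):
    """Hypothetical toy dataset backed by a dict, for illustration only."""

    PAGES = {
        "Leap year": ("A year with an extra day.", ["Gregorian calendar"]),
        "Monty Python": ("A British comedy troupe.", ["Members", "Films"]),
        "Python": ("A programming language.", ["History", "Syntax"]),
    }

    def get(self, name: str) -> Context:
        content, links = self.PAGES[name]
        return Context(title=name, content=content, internal_links=list(links))

    def search(self, name: str) -> Context:
        # Case-insensitive substring match; first hit in sorted title order.
        matches = sorted(t for t in self.PAGES if name.lower() in t.lower())
        if not matches:
            raise KeyError(name)
        return self.get(matches[0])

    def random(self) -> Context:
        return self.get(self.rng.choice(sorted(self.PAGES)))


dataset = InMemoryDataset(seed=42)
page = dataset.search("leap")
```

Two instances built with the same seed return the same sequence from `random()`, while `get` and `search` do not depend on the RNG at all, which mirrors the determinism notes in the list above.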