Reduce resource footprint of data-driven tests #3736
Comments
Out of curiosity, does it mean you will have access to the test case properties set in the |
Hi again @theofidry :)
The first is a matter of simply moving the Getting the runtime context right is trickier. And then there are generators. Turning a Thanks for the comment! You always have a keen eye for detail and I know you will be one of the first to test the pull request when it's up. Much appreciated. :) |
I have no idea about internal phpunit structure, but maybe there is a one-off standing tasks, that doesn't require phpunit knowledge to solve? I could contribute :) |
Hi @danon!
Refactoring the data provider mechanism is all about untangling parts of the core of PHPUnit and updating it to support on-demand data initialization. At the same time I want to add generator-based providers to the developers' toolbox and maintain backwards compatibility. It is a bit of a mess.
Just reading your positive message already contributes so much! Often when people make the effort to get into the Github issues something doesn't work as expected and they have become frustrated enough to write about it. This is a good reminder to create some extra ticket triage labels like |
@epdenouden Right now my only issue is that I have a custom @dataProvider, which loads data from files. The parsing of files can be difficult, and I'd like to debug it. But I can't debug a single dataprovider, without also debugging all of them - since data providers are resolved before running tests. I need to manually edit the dataProvider and add But I can imagine that's not a trivial task at the best of circumstances, and made extra hard by me not knowing anything about phpunit :D |
See, this is why the comments are just gold: I had not thought about it this way. Yes, the data provider refactoring would help you out with its improved row-by-row loading. However, when a data provider fails it should/could already give you an accurate code location. I have a request: can you With a bit of luck I can help you out with a small improvement along the lines of the extra protection against tear down exceptions or the improved code location hints. |
Please: no screenshots. Copy&Paste instead (of course using the right markup). |
Good point, text is much easier to work with. I've been guilty of posting screenshots as the highlighting is quite handy. Just found out how to use Markdown formatting in collapsible blocks like this one: github collapsible content with formatting
|
I'm the author of T-Regx library. I've also written a documentation which has a lot of code markups. For example here: https://t-regx.com/docs/match I don't want ever for the code markups to be invalid or return unexpected values. So I've created a parser that iterates through each of the snippets asserting that they return expected values: https://github.com/T-Regx/T-Regx.github.io/blob/source/test/Docs/MarkupResultConsistencyTest.php |
Thanks for the example, I'll have a look later this week. It is data provider related and I will create a separate ticket for it to keep the comments here for lazy-loading specific discussion. In any case: @danon, thanks for bringing up another use case. The whole array vs row by row handling is something that rears its head in multiple places in the code. |
@epdenouden but yea, basically I'd like the phpunit with iterator to call |
Yes, that will be a very nice improvement and it will get there. It will take a few steps to get every detail sorted as development on PHPUnit has to keep backwards compatibility in mind. BTW, T-Regx looks really nice; I grew up on too much Perl and it just looks so... clean. Will give it a go in a private project. Also, the Cross Data Providers project I saw is something that I have implemented myself more than once. Reimplementing something like that as an iterator with proper memory management would be really sweet. If you don't mind, I'll ping you once I have finished laying the groundwork here. |
Can't wait :) If I can be of help, please ask.
Thank you, I really appreciate it :D You have no idea what it means to me. Do you know this from checking out the docs?
We could merge it and create one common repo for users to use.
Waiting :) |
@epdenouden Hello, what's the status of the data provider handling revamp? What's new? :) Do you need contributors? |
🏭 Data-driven tests reengineered for large-scale automation
The current implementation of running data sets through tests using @dataprovider has not been designed for efficiency. Optimizing this part of PHPUnit would greatly benefit build pipelines and is a regular request from developers. The proposed changes provide a solid improvement to data providers in general and generators in particular.
🚀 Benefits of improved data provider handling
A picture is worth a thousand data points:

prefetched is the @dataprovider implementation from 8.1
just-in-time is the experimental just-in-time loading mechanism
JIT with unload is the JIT combined with data unloading during the run
simulated Generator are data providers that are (un)loaded row-by-row
Data gathered with a custom TestCase and Xdebug. See the technical notes below for details.
🏗 Design considerations and implementation
The change that makes everything possible is postponing the loading of data sets from the initialization phase to just before running. Immediately after each TestCase finishes the associated data is removed from memory. For all this to work a lot of plumbing and wiring has to be reviewed.
\PHPUnit\Util\Test to DataProviderTestSuite
DataProviderTestSuite is created as an empty TestSuite
DataProviderTestSuite::run()
TestCase::run() there is now a call to TestCase::unloadData()
--list-tests that require the whole test collection to be available for traversal can force-load all data providers by poking the root TestSuite
--filter selector requires access to both the keys and data of a @dataprovider to be able to pick out the right ones. The MVP handles this by reverting back to the old load on init behaviour when a filter is detected
⚗️ Research
@dataprovider
--filter (and other selectors) work in detail and see what efficiency gains are possible without introducing new edge cases
which requires more logic in DataProviderTestSuite::run()
which already work but are currently unrolled into an Array
📐 Testing and benchmarking the prototype
8.1, behaviour in newer versions is the same
DataProviderTestSuite implementations I've created SyntheticDataprovidersTest
Generator row-by-row testing
This benchmark is accurately simulated by unrolling the SyntheticDataprovidersTest into a series of tests with single-row @dataprovider. See DummyGeneratorDataprovidersTest
🔬 Technical notes
master based development branch
xdebug_trace() and mangled into the graph above with some scripting and a spreadsheet
📚 References
@dataProvider using iterator_to_array()
Generator
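
The iterator_to_array() reference is about a Generator returned by a data provider being unrolled into an array before any test runs. A minimal PHP sketch of the difference between that and pulling rows one at a time; this is not PHPUnit code, and lazyRows() and the Counter class are hypothetical names made up for illustration:

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: counts how many rows a provider has built so far. */
final class Counter
{
    public $built = 0;
}

/** Hypothetical provider: builds each row only when it is asked for. */
function lazyRows(int $count, Counter $counter): Generator
{
    for ($i = 0; $i < $count; $i++) {
        $counter->built++;
        yield "row #$i" => [$i, $i * 2];
    }
}

// Eager: iterator_to_array() drains the generator, so every row is built
// and held in memory before the first one is used.
$eager = new Counter();
$allRows = iterator_to_array(lazyRows(5, $eager));

// Lazy: foreach pulls one row at a time. Stopping after the first row
// means only that row was ever built.
$lazy = new Counter();
foreach (lazyRows(5, $lazy) as $name => $row) {
    break;
}
```

With the eager call all five rows exist up front; with the foreach only the first row has been produced when the loop stops, which is the property row-by-row loading relies on.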
📋 Tasks
master based MVP
prefetch vs jit @dataprovider loading mechanism
DataProviderTestSuite
@testWith to return a Generator
Generator row-by-row loading
DataProviderTestSuite::run()
injectFilter functionality to DataProviderTestSuite
@group
loadAllDataProviders() to make --list-tests work
TestSuiteSorter to handle sorting unloaded DataProviderTestSuite with defects
ResultPrinter
(more tasks to be added)
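
For readers who want a concrete picture of the design considerations (load a data set just before its test runs, unload it right after), here is a rough sketch under stated assumptions. It is not the PHPUnit implementation: LazyCase, its loader callback and isLoaded() are hypothetical names, and the actual plumbing lives in DataProviderTestSuite::run() and TestCase::unloadData() as listed in the issue.

```php
<?php
declare(strict_types=1);

/** Hypothetical stand-in for a test case with provided data. */
final class LazyCase
{
    /** @var callable */
    private $loader;

    /** @var array|null */
    private $data = null;

    public function __construct(callable $loader)
    {
        // Only the loader is stored; no data is built at construction time.
        $this->loader = $loader;
    }

    public function isLoaded(): bool
    {
        return $this->data !== null;
    }

    public function run(callable $test): bool
    {
        // Load the row just before running the test.
        $this->data = ($this->loader)();
        try {
            return (bool) $test(...$this->data);
        } finally {
            // Unload right after the run so memory can be reclaimed.
            $this->data = null;
        }
    }
}

$loads = 0;
$cases = [];
foreach ([1, 2, 3] as $n) {
    $cases[] = new LazyCase(function () use ($n, &$loads): array {
        $loads++;
        return [$n, $n, 2 * $n]; // pretend this row is expensive to build
    });
}
$loadsBeforeRun = $loads; // nothing has been loaded during setup

$results = [];
foreach ($cases as $case) {
    $results[] = $case->run(function (int $a, int $b, int $sum): bool {
        return $a + $b === $sum;
    });
}
```

Each row is built once, exactly when its case runs, and at most one row is held at a time. Selectors that need every key and row in advance would have to trigger all loaders first, which matches the load on init fallback the issue describes for --filter.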