Tests on test_services_types.py fail randomly on Jenkins #1018
Comments
@wesleybl thanks for bringing this up! I wasn't aware of those 500 errors. As far as I know Jenkins nodes all run on the same machine which could cause load problems. We run performance tests on REST API (on our company Jenkins) and I never saw those 500 errors there. Maybe we should include this in our jMeter performance tests: https://github.com/plone/plone.restapi/blob/master/performance.jmx (Anyone can run those tests. The only reason I do not run them on the Plone Jenkins is that I do not have permissions on Jenkins any longer, since I stepped down as testing/ci team lead) |
@tisto @mauritsvanrees is there any parallelism in the execution of tests in Jenkins? I suspect that these errors occur because of parallelism. For example, one test creates a field while another test expects that field not to exist. |
There is no parallelism as far as I know. In the future it might be possible to add this in the test runner. Note: for a Python 3 PR job, this is the main line in the job description. The only "sort-of" parallelism is that each node has several workers (node 1 has 6, the others have 4), so multiple jobs can run at the same time. I guess the effect is mostly race conditions like this: a robot test tries a random port to run a server, the port is free, but before it can start the server, another job has taken the port. I am not sure if this is what happens, but I can imagine it does. |
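The port race described above can be sketched in a few lines of Python. This is only an illustration of the general pattern, not code from plone.restapi or the robot test setup; the function names `find_free_port`, `bind_and_hold` and `port_is_taken` are made up for this example.

```python
import socket


def find_free_port():
    """Ask the OS for a free port, then release it again.

    This is the racy pattern: between returning the number and starting
    the real server, another job may bind to the same port.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.bind(("127.0.0.1", 0))
        return probe.getsockname()[1]


def bind_and_hold():
    """Bind to port 0 and keep the socket open, so the port stays ours."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    return server, server.getsockname()[1]


def port_is_taken(port):
    """Return True if binding to the port fails because it is in use."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind(("127.0.0.1", port))
        except OSError:
            return True
    return False
```

A job that uses `find_free_port()` and only later starts its server leaves a window in which a second job can take the port. Holding the bound socket, as in `bind_and_hold()`, closes that window.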
@mauritsvanrees according to the situation you described, my theory makes sense. This situation can occur not only in robot tests but also in tests such as: plone.restapi/src/plone/restapi/tests/test_services_types.py Lines 246 to 256 in 4bca687
Is always:
So, when we have jobs in parallel in Jenkins, the This explains why these errors don't occur in Travis, where jobs are run on separate machines. Is it possible that these tests run on random ports, like robot tests? |
In reality I saw that in the Line 69 in 93b6e59
Anyway, even being random in Jenkins, collisions can occur, as @mauritsvanrees suspected. |
Aha, so Travis uses the
I am not sure though when the port randomisation occurs, especially: does this happen multiple times during a test run, which could explain a random failure in a single test, or does this happen once for the entire run. Ah: all files in the |
If I recall correctly the zope testrunner on jenkins.plone.org does run in parallel. The REST API tests on Jenkins do not use the buildout in the REST API repo but instead the buildout.coredev one. We have to make sure we do not have any hardcoded ports in plone.restapi (which we don't have as far as I know). I looked into running stuff in parallel myself a long time ago and dropped the idea exactly because of very hard-to-track edge cases like this. Just my 2c. Maybe that helps a little bit... |
@mauritsvanrees in |
@tisto @mauritsvanrees I noticed that these errors only occur in Python 2. Does this give you any hint of what it might be? Is the way Python 3 handles threads different from Python 2? |
I was able to simulate the error locally in
for run in {1..5}; do bin/test -m plone.restapi -t test_services_types & done
The error does not occur on every run, but if you run the command a few times, you will see it. That is, parallel executions actually cause the error. |
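For anyone who prefers to collect the exit codes of the parallel runs, the shell loop above could be approximated in Python roughly as follows. This is a sketch; `run_in_parallel` is a hypothetical helper, and the command is passed in by the caller.

```python
import subprocess


def run_in_parallel(command, runs=5):
    """Start `runs` copies of `command` at the same time.

    Returns the list of exit codes, one per process, so that a failing
    run (non-zero exit code) is easy to spot.
    """
    procs = [subprocess.Popen(command) for _ in range(runs)]
    return [proc.wait() for proc in procs]
```

With the command from the comment above, the call would look like `run_in_parallel(["bin/test", "-m", "plone.restapi", "-t", "test_services_types"])`, run from the buildout directory.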
I don't think the port randomisation or parallelisation is a problem. When I open two terminals and run Looking at
That is this line. The second thing that sometimes goes wrong is a few lines further: the PUT gives a 400. My guess: this is caused by our good friend the dexterity schema cache. So we might need to call:
or:
That could be done at the start and/or end of the test. Or maybe the test failures are a real bug and this needs to be fixed in the actual code. @wesleybl Are you up for creating a PR for this? |
@mauritsvanrees when I have time I’ll take a look. Thanks! |
@mauritsvanrees the strange thing is that this error only occurs with Python 2. So I think it is not a plone.restapi error. |
@mauritsvanrees I can confirm that calling:
or
in The plone.restapi/src/plone/restapi/types/utils.py Lines 281 to 300 in 0c24548
and it didn't fix the problem. |
I found this occurs because of Strangely, it only occurs in Python 2. The error occurred in Jenkins, but it did not occur in After the merge of PR #1218 , the error can happen in GA as well. |
I looked at this in the coredev buildout on Python 2.7, adapting it to have two zeoclients. I checked out Still, it occasionally goes wrong on Jenkins... One suspect change is that in Another thing is that maybe some changes go too fast: when the FTI is changed, Huh? A
So it looks like doing a request with |
Using my branch
The middle one does an explicit transaction commit.
|
When using A similar change in
Then after each request, I see the change reflected in I can make a PR later, but first there is another PR of mine open, which should be merged first. |
This should fix some random test failures. Fixes #1018
Summary of my latest comments there:
- In a functional test layer using zope.testbrowser, a request that changes something in Plone leads to a transaction commit.
- plone.restapi uses RelativeSession instead, and this missed such an integration: changes did not really end up in the database. It somehow mostly worked so far, but to me that seems luck.
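The idea in the summary above, committing after every request made through the test session, might look roughly like this. It is a simplified sketch and not the actual change from the PR: `CommittingSession` and `FakeSession` are hypothetical names, and the `commit` callable stands in for something like `transaction.commit` in a real Zope functional test (an assumption on my part).

```python
class CommittingSession(object):
    """Wrap a session-like object and run a commit hook after each request."""

    def __init__(self, session, commit):
        self._session = session
        # In a real functional test this could be transaction.commit (assumption).
        self._commit = commit

    def request(self, method, url, **kwargs):
        response = self._session.request(method, url, **kwargs)
        # Sync the transaction so later code sees the change made by the request.
        self._commit()
        return response


class FakeSession(object):
    """Hypothetical stand-in for a real HTTP session, used for illustration only."""

    def __init__(self, log):
        self._log = log

    def request(self, method, url, **kwargs):
        self._log.append("request {} {}".format(method, url))
        return 204
```

The point of the wrapper is ordering: the commit hook always runs after the request and before the test looks at the result.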
Fix in #1232 |
Reopening because I still see failures. For example:
2 != None
File "/srv/python2.7/lib/python2.7/unittest/case.py", line 329, in run
testMethod()
File "/home/jenkins/workspace/pull-request-5.2-2.7/src/plone.restapi/src/plone/restapi/tests/test_services_types.py", line 379, in test_types_document_update_min_max
self.assertEqual(2, response.json().get("minLength"))
File "/srv/python2.7/lib/python2.7/unittest/case.py", line 513, in assertEqual
assertion_func(first, second, msg=msg)
File "/srv/python2.7/lib/python2.7/unittest/case.py", line 506, in _baseAssertEqual
raise self.failureException(msg) 'author_name' not found in {u'exclude_from_nav': {u'description': u'If selected, this item will not appear in the navigation tree', u'title': u'Exclude from navigation', u'default': False, u'factory': u'Yes/No', u'behavior': u'plone.excludefromnavigation', u'type': u'boolean'}, u'table_of_contents': {u'title': u'Table of contents', u'type': u'boolean', u'description': u'If selected, this will show a table of contents at the top of the page.', u'behavior': u'plone.tableofcontents', u'factory': u'Yes/No'}, u'contributors': {u'additionalItems': True, u'description': u'The names of people that have contributed to this item. Each contributor should be on a separate line.', u'title': u'Contributors', u'items': {u'title': u'', u'type': u'string', u'description': u'', u'factory': u'Text line (String)'}, u'factory': u'Tuple', u'behavior': u'plone.dublincore', u'uniqueItems': True, u'widgetOptions': {u'vocabulary': {u'@id': u'http://localhost:48059/plone/@vocabularies/plone.app.vocabularies.Users'}}, u'type': u'array'}, u'effective': {u'widget': u'datetime', u'description': u'If this date is in the future, the content will not show up in listings and searches until this date.', u'title': u'Publishing Date', u'factory': u'Date/Time', u'behavior': u'plone.dublincore', u'type': u'string'}, u'rights': {u'widget': u'textarea', u'description': u'Copyright statement or other rights information on this item.', u'title': u'Rights', u'factory': u'Text', u'behavior': u'plone.dublincore', u'type': u'string'}, u'text': {u'widget': u'richtext', u'description': u'', u'title': u'Text', u'factory': u'Rich Text', u'behavior': u'plone.richtext', u'type': u'string'}, u'expires': {u'widget': u'datetime', u'description': u'When this date is reached, the content will no longer be visible in listings and searches.', u'title': u'Expiration Date', u'factory': u'Date/Time', u'behavior': u'plone.dublincore', u'type': u'string'}, u'allow_discussion': {u'description': 
u'Allow discussion for this content object.', u'vocabulary': {u'@id': u'http://localhost:48059/plone/@sources/allow_discussion'}, u'title': u'Allow discussion', u'enum': [u'True', u'False'], u'factory': u'Choice', u'choices': [[u'True', u'Yes'], [u'False', u'No']], u'enumNames': [u'Yes', u'No'], u'behavior': u'plone.allowdiscussion', u'type': u'string'}, u'changeNote': {u'title': u'Change Note', u'type': u'string', u'description': u'Enter a comment that describes the changes you made.', u'behavior': u'plone.versioning', u'factory': u'Text line (String)'}, u'language': {u'description': u'', u'vocabulary': {u'@id': u'http://localhost:48059/plone/@vocabularies/plone.app.vocabularies.SupportedContentLanguages'}, u'title': u'Language', u'default': u'en', u'factory': u'Choice', u'behavior': u'plone.dublincore', u'type': u'string'}, u'relatedItems': {u'additionalItems': True, u'description': u'', u'title': u'Related Items', u'default': [], u'items': {u'title': u'Related', u'type': u'string', u'description': u'', u'vocabulary': {u'@id': u'http://localhost:48059/plone/@vocabularies/plone.app.vocabularies.Catalog'}, u'factory': u'Relation Choice'}, u'factory': u'Relation List', u'behavior': u'plone.relateditems', u'uniqueItems': True, u'widgetOptions': {u'vocabulary': {u'@id': u'http://localhost:48059/plone/@vocabularies/plone.app.vocabularies.Catalog'}, u'pattern_options': {u'recentlyUsed': True}}, u'type': u'array'}, u'versioning_enabled': {u'description': u'Enable/disable versioning for this document.', u'title': u'Versioning enabled', u'default': True, u'factory': u'Yes/No', u'behavior': u'plone.versioning', u'type': u'boolean'}, u'title': {u'title': u'Title', u'type': u'string', u'description': u'', u'behavior': u'plone.dublincore', u'factory': u'Text line (String)'}, u'subjects': {u'additionalItems': True, u'description': u'Tags are commonly used for ad-hoc organization of content.', u'title': u'Tags', u'items': {u'title': u'', u'type': u'string', u'description': u'', 
u'factory': u'Text line (String)'}, u'factory': u'Tuple', u'behavior': u'plone.dublincore', u'uniqueItems': True, u'widgetOptions': {u'vocabulary': {u'@id': u'http://localhost:48059/plone/@vocabularies/plone.app.vocabularies.Keywords'}}, u'type': u'array'}, u'creators': {u'additionalItems': True, u'description': u'Persons responsible for creating the content of this item. Please enter a list of user names, one per line. The principal creator should come first.', u'title': u'Creators', u'items': {u'title': u'', u'type': u'string', u'description': u'', u'factory': u'Text line (String)'}, u'factory': u'Tuple', u'behavior': u'plone.dublincore', u'uniqueItems': True, u'widgetOptions': {u'vocabulary': {u'@id': u'http://localhost:48059/plone/@vocabularies/plone.app.vocabularies.Users'}}, u'type': u'array'}, u'id': {u'title': u'Short name', u'type': u'string', u'description': u'This name will be displayed in the URL.', u'behavior': u'plone.shortname', u'factory': u'Text line (String)'}, u'description': {u'widget': u'textarea', u'description': u'Used in item listings and search results.', u'title': u'Summary', u'factory': u'Text', u'behavior': u'plone.dublincore', u'type': u'string'}}
File "/srv/python2.7/lib/python2.7/unittest/case.py", line 329, in run
testMethod()
File "/home/jenkins/workspace/pull-request-5.2-2.7/src/plone.restapi/src/plone/restapi/tests/test_services_types.py", line 426, in test_types_document_put
self.assertIn("author_name", response.json().get("properties"))
File "/srv/python2.7/lib/python2.7/unittest/case.py", line 804, in assertIn
self.fail(self._formatMessage(msg, standardMsg))
File "/srv/python2.7/lib/python2.7/unittest/case.py", line 410, in fail
raise self.failureException(msg)
See: https://jenkins.plone.org/job/pull-request-5.2-2.7/2236 |
I noticed that the tests in test_services_types.py are failing with considerable frequency in Jenkins. Errors like:
Error message:
Traceback
For example:
https://jenkins.plone.org/job/pull-request-5.2/1712/testReport/junit/plone.restapi.tests.test_services_types/TestServicesTypes/test_types_document_get_field/
https://jenkins.plone.org/job/pull-request-5.2/1714/testReport/junit/plone.restapi.tests.test_services_types/TestServicesTypes/test_types_document_get_fieldset/
https://jenkins.plone.org/job/pull-request-5.2/1729/testReport/junit/plone.restapi.tests.test_services_types/TestServicesTypes/test_types_document_patch_fieldsets/
https://jenkins.plone.org/job/pull-request-5.2/1730/testReport/junit/plone.restapi.tests.test_services_types/TestServicesTypes/test_types_document_patch_create_missing/
https://jenkins.plone.org/job/pull-request-5.2/1736/testReport/junit/plone.restapi.tests.test_services_types/TestServicesTypes/test_types_document_get_field/
https://jenkins.plone.org/job/pull-request-5.2/1737/testReport/junit/plone.restapi.tests.test_services_types/TestServicesTypes/test_types_document_patch_create_missing/
These errors are likely due to high load on the Jenkins server. But why do only tests on test_services_types.py fail? Are type services heavier than others? Messages like 500 != 204 do not help to understand these errors. Perhaps it is good to print the body of the response when the return code is 500, to see the backend traceback.
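One possible shape for such a check is a small assertion helper that puts the response body into the failure message. This is only a sketch: `assert_status` is a hypothetical helper, not something that exists in plone.restapi, and it assumes a requests-style response object with `status_code` and `text` attributes.

```python
def assert_status(response, expected):
    """Fail with the response body in the message if the status is unexpected.

    A bare "500 != 204" hides the backend traceback; including the body
    makes it visible in the Jenkins test report.
    """
    if response.status_code != expected:
        raise AssertionError(
            "{} != {}\nResponse body:\n{}".format(
                response.status_code, expected, response.text
            )
        )
```

In a test this would replace a plain `self.assertEqual(204, response.status_code)` with `assert_status(response, 204)`.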