repairing parentid for destroy rev, fixes #1448 #1455
Conversation
also updating maint-validate-metadata, final fix for moinwiki#1402
* updating index when fixing items
* not replacing revision numbers for gaps in sequence
@@ -113,6 +115,14 @@ def test_destroy_revision(self):
        with pytest.raises(AccessDenied):
            item.destroy_revision(revid_protected)

    def test_destroy_middle_revision(self):
        item, item_name, revid0, revid1, revid2 = self._store_three_revs('joe:read,write,destroy')
destroying a middle revision now requires write permission as well as destroy, in order to allow updating of the parentid
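The parentid repair being discussed works like unlinking a node from a singly linked list: the destroyed revision's child inherits its parentid. A minimal sketch of that idea (the dict-based revision records and the helper name are illustrative assumptions, not moin's actual storage API):

```python
def repair_parentid(revs, destroyed_revid):
    """Remove one revision and re-point its child's parentid.

    revs: list of dicts with 'revid' and 'parentid' keys, hypothetical
    stand-ins for moin's revision metadata records.
    """
    destroyed = next(r for r in revs if r['revid'] == destroyed_revid)
    for rev in revs:
        # the child of the destroyed revision inherits its parentid,
        # exactly like unlinking a node from a singly linked list
        if rev.get('parentid') == destroyed_revid:
            rev['parentid'] = destroyed.get('parentid')
    return [r for r in revs if r['revid'] != destroyed_revid]
```

If the middle revision of a three-revision chain is destroyed, the newest revision ends up parented to the oldest, keeping the chain intact.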
If I create an item with 3 revisions and destroy revision 1, then create a new revision, the revision number of the new revision is 3; it should be 4. Item History shows revisions 2, 3, 3. moin maint-validate-metadata shows no errors.
I will reinstate some of the revision_number fixing in maint-validate-metadata. Sounds like the rule is that revision_numbers are allowed to have gaps but not repeats?
right, revision numbers may have gaps, not repeats |
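The gaps-but-not-repeats rule can be checked with a single scan over the revision numbers. A sketch of the kind of validation maint-validate-metadata could perform (the function name is an assumption, not moin's actual code):

```python
def find_repeated_rev_numbers(numbers):
    """Return revision numbers that occur more than once.

    Gaps in the sequence are allowed; repeats are metadata errors.
    """
    seen = set()
    repeats = []
    for n in numbers:
        if n in seen and n not in repeats:
            repeats.append(n)
        seen.add(n)
    return repeats
```

So a history of 2, 4, 5 is valid, while 2, 3, 3 flags revision number 3 for repair.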
also correcting index update to search in ALL_REVS existing_item only searches LATEST_REVS
latest commit adds validation for repeated revision_number. Also fixed an issue with the way I was querying the index: the previous version only got hits from LATEST_REVS; this commit finds revs in ALL_REVS.
Oops, updated the issue, meant to update here:

Testing with an item with 4 revisions... If rev-number 1 (or the lowest rev-number) is destroyed, rev-number 2 is updated and becomes the current revision with an updated timestamp. Item history shows rev-numbers 2, 4, 3.

Testing with another item with 4 revisions... If rev-number 2 is destroyed, then the timestamp for rev-number 3 is updated and it becomes current. Item history shows rev-numbers 3, 4, 1.

I think the metadata timestamp (or MTIME) should refer to the update time of the revision data. Updating the metadata should not change the metadata MTIME. Is there a way to update the parentid without updating the MTIME?
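One way to frame the MTIME question is to treat the metadata fix as a pure copy that leaves the timestamp field untouched. A hypothetical sketch (the 'parentid'/'mtime' key names are illustrative, not moin's actual metadata constants):

```python
def fix_parentid_keep_mtime(meta, new_parentid):
    """Return repaired metadata with parentid updated but mtime preserved,
    so the timestamp keeps describing when the revision *data* changed."""
    fixed = dict(meta)              # shallow copy; original left untouched
    fixed['parentid'] = new_parentid
    # deliberately NOT refreshing fixed['mtime'] here: repairing
    # metadata is not a data change and should not look like one
    return fixed
```

Whether the storage layer allows writing metadata back without stamping a new MTIME is the real question being asked above; this only shows the intended invariant.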
Hang in there, I know this is a hard one...
Now struggling with index files left open by whoosh, but I have the failure reproducible on Linux now, so hoping to have a clean test run soon.
latest test fail: https://github.com/bylsmad/moin/actions/runs/5102359184/jobs/9171871358
previous code: when destroying a middle revision, the update to PARENTID was making the next revision into the latest
applying same fix to ValidateMetadata
force-pushed from f2d368c to ee698f2
all tests passing, ready for review
@@ -1240,7 +1254,8 @@ def store_revision(self, meta, data, overwrite=False,
        data.seek(0)  # rewind file
        backend_name, revid = backend.store(meta, data)
        meta[REVID] = revid
-       self.indexer.index_revision(meta, content, backend_name)
+       self.indexer.index_revision(meta, content, backend_name, force_latest=not overwrite)
+       gc.collect()  # triggers close of index files from is_latest search
Let me know what you think of adding gc.collect() here; it solves the test failure seen in https://github.com/bylsmad/moin/actions/runs/5102359184/jobs/9171871358. It is also possible to solve the test failure by putting the gc.collect() just before destroy_app() in src.moin.conftest, but I was worried this would mask potential issues where unclosed files could lead to a "too many open files" exception.
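For context on why gc.collect() helps here: CPython closes a file as soon as its reference count hits zero, but an object caught in a reference cycle is only freed by the cycle collector, so any file it holds stays open until a collection runs. A standalone demonstration of that behavior (unrelated to moin's or whoosh's actual classes):

```python
import gc
import os
import tempfile
import weakref

class Holder:
    """Keeps a file open and forms a reference cycle, so plain
    refcounting alone can never free it (or close its file)."""
    def __init__(self, path):
        self.f = open(path)
        self.cycle = self  # self-reference creates the cycle

fd, path = tempfile.mkstemp()
os.close(fd)

h = Holder(path)
ref = weakref.ref(h.f)
del h                       # cycle keeps Holder and its open file alive
assert ref() is not None    # file object still exists...
gc.collect()                # ...until the cycle collector runs
assert ref() is None        # Holder freed, file closed and collected

os.remove(path)
```

If search/reader objects end up in cycles like this, an explicit gc.collect() after the search is what actually releases the underlying index files.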
I will defer to you as to where to put gc.collect().
The code seems to work well, but tests on Windows 10 came up with one error:
================================== FAILURES ===================================
________________________ TestSiteCrawl.test_home_page _________________________
self = <moin.cli._tests.test_scrapy_crawl.TestSiteCrawl object at 0x000001F70A1768F0>
crawl_results = []
def test_home_page(self, crawl_results):
> assert len(crawl_results) > 0
E assert 0 > 0
E + where 0 = len([])
C:\git-david\moin\src\moin\cli\_tests\test_scrapy_crawl.py:54: AssertionError
----------------------------- Captured log setup ------------------------------
Could you attach these files from the failed run:
_test_artifacts/crawl.log
_test_artifacts/server-crawl.log
_test_artifacts/crawl.csv
Try the latest commit; the previous one had a typo in the logging setup for crawl.log, and I've added logging to make timeouts easier to see. Hopefully you will get the same error and we can see what happened.
Still no clues to the test failure in the logs; something is stopping the spider before it completes. In both of the failures, the spider stopped at right around the 2 minute point. Do you have a custom pytest.ini with a timeout setting? Could you attach the full output of pytest (i.e. m-tox.txt)?
No pytest.ini, but I have a tox.ini under version control. GitHub does not support attaching .ini, so here is a txt version. Here are fresh copies of
I'm still stumped... what could be different between your workstation setup and the Windows box that runs the test in the GitHub action? Does the test fail when invoked directly from the command line?
@RogerHaase another troubleshooting method: manually run the commands from test_scrapy. NOTE: my commands are from a Linux terminal, you will need to adjust for Windows. In one window:
NOTE: the test currently does an index-build after the load-help; I just realized this is not needed and have filed a new ticket to eliminate the redundant index-build commands #1458. Validate that your server is running by going to http://localhost:9080/Home in a browser. In a second window:
Once the scrapy command finishes, check for the existence of _test_artifacts/crawl.csv; there should be something like 421 lines in the file. If this does not succeed, please attach the console output of the scrapy command.
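The final check above can be scripted. A small sketch that verifies the crawl output looks complete (the ~421-line figure comes from the comment above; the function name and threshold are assumptions):

```python
from pathlib import Path

def crawl_looks_complete(artifacts_dir='_test_artifacts', expected_min=400):
    """Return True if crawl.csv exists and has at least roughly the
    expected number of lines (~421 for a full crawl of the help pages)."""
    csv_path = Path(artifacts_dir) / 'crawl.csv'
    if not csv_path.exists():
        return False
    with csv_path.open() as f:
        return sum(1 for _ in f) >= expected_min
```

A missing or truncated crawl.csv then points at the spider stopping early rather than at the crawl itself failing.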
I am confused and must rerun your tests. I was about to claim user error because I created a fresh clone and ran
After correcting the problem by doing
I ran your suggested tests above and they seem to have worked successfully. I am distracted by other tasks, so I may need a day or two to rerun everything.
fixing a couple more bad file paths
Pull down the latest from this branch before you resume testing; I've added more exception traps which should hopefully leave us some clues in crawl.log.

One thing I encountered as I was testing error conditions: I had scrapy installed in my venv, my user dir, and the system dir, and I got strange errors when scrapy was coming from my user dir or the system dir.

I took a closer look at your logs and it appears that the entire crawl is getting completed, but something is going wrong with the output of crawl.csv in spider_closed. Hopefully the increased logging will show what is happening.
error was caused by AsyncWriter, which is not used when overwrite is True
Sorry for all the commits, but the final test failure was a race condition which only happened on the GitHub host. I can squash them down once we have the scrapy test passing on your workstation.
Here are the latest logs:
Here is the server log from the manual run. Need anything else?
OK, should be fixed now. The difference is you are running the tests via tox via
BTW I'm heading out on a bicycle tour for a week; this should be ready for merge now, assuming the fix for
force-pushed from c4fa6d0 to 7fe23f9
Hooray, all tests pass! Have a fun trip.
repairing parentid for destroy rev, fixes #1448
also updating maint-validate-metadata, final fix for #1402