Fix bug in class used by OAI Harvest #35

Merged
merged 1 commit into from
Feb 8, 2015

Conversation

nathan-wallis

The original method detected a record only by its closing tag, which occasionally corrupted raw records because surrounding OAI-PMH harvest data was included in them. The improved, though not foolproof, method waits until the opening tag is hit before writing to the buffer that generates the record. I have not tested this with multiple tags to split on. That situation would complicate the harvest, especially if the tags were nested inside each other.
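For readers following along, the idea can be sketched roughly as below. This is a minimal illustration in Python, not the code from this pull request: the function name `split_records`, the default tag `record`, and the line-based input are all assumptions made for the example. It buffers text only after an opening tag has been seen, so envelope data outside the records is discarded. Like the change described above, it is not foolproof: it does not handle nested tags of the same name, self-closing tags, or an opening tag whose name is split across lines.

```python
import re
from typing import Iterable, Iterator


def split_records(lines: Iterable[str], tag: str = "record") -> Iterator[str]:
    """Yield each <tag>...</tag> span, ignoring any text outside the tags.

    Hypothetical helper for illustration only.
    """
    # Match "<record>" or "<record attr=...>", but not "<records>".
    open_re = re.compile(r"<" + re.escape(tag) + r"[\s>]")
    close_tag = "</" + tag + ">"
    buffer = []      # pieces of the record currently being collected
    inside = False   # becomes True only once an opening tag has been seen

    for line in lines:
        rest = line
        while rest:
            if not inside:
                match = open_re.search(rest)
                if match is None:
                    break  # envelope text outside any record: discard it
                rest = rest[match.start():]
                inside = True
            end = rest.find(close_tag)
            if end == -1:
                buffer.append(rest)  # record continues on the next line
                break
            end += len(close_tag)
            buffer.append(rest[:end])
            yield "".join(buffer)
            buffer = []
            inside = False
            rest = rest[end:]  # keep scanning the remainder of the line
```

A closing tag that arrives without a preceding opening tag is simply skipped, which is the behaviour that keeps the surrounding harvest data out of the emitted records.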

@drspeedo
Contributor

drspeedo commented Nov 5, 2014

Nathan, this looks good to me, but I need to test it first. As soon as I get my PC back together at home (just moved), I'll run it and merge it. If you can provide output from your own testing, I'll go ahead and merge it without testing it myself.

@drspeedo drspeedo self-assigned this Feb 5, 2015
@drspeedo
Contributor

drspeedo commented Feb 5, 2015

My first attempt at testing this freezes at the output step of the NLMJournalFetch tests. These passed on the master branch, so I'm wondering what is happening. I'll look into it later this week and try to help you get this merged. Sorry it's taken almost 4 months for me to finally get around to testing this.

2015-02-04 20:58:59.854 INFO [o.v.h.f.n.NLMJournalFetch] Fetching 500 to 1000 records from search
2015-02-04 20:59:04.910 DEBUG [o.v.h.f.n.NLMJournalFetch] Sanitizing Output
2015-02-04 20:59:04.910 DEBUG [o.v.h.f.n.NLMJournalFetch] XML File Length - Pre Sanitize: 1776643
2015-02-04 20:59:04.963 DEBUG [o.v.h.f.n.NLMJournalFetch] XML File Length - Post Sanitze: 1777078
2015-02-04 20:59:04.964 DEBUG [o.v.h.f.n.NLMJournalFetch] Sanitization Complete
2015-02-04 20:59:04.964 TRACE [o.v.h.f.n.NLMJournalFetch] Writing to output

drspeedo added a commit that referenced this pull request Feb 8, 2015
Fix bug in class used by OAI Harvest
@drspeedo drspeedo merged commit 7e73947 into vivo-community:master Feb 8, 2015
@drspeedo
Contributor

drspeedo commented Feb 8, 2015

It was a memory issue that occasionally happens when testing the harvester. The tests complete now: the standard tests for PubMed failed, but the rest passed.


3 participants