Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace dictzip with idzip #512

Merged
merged 6 commits into from
Sep 28, 2023
Merged

Replace dictzip with idzip #512

merged 6 commits into from
Sep 28, 2023

Conversation

bergentroll
Copy link
Contributor

@bergentroll bergentroll commented Sep 16, 2023

Alternative

izip is maintained Python alternative to dictzip utility.

Rationale

The pure python dependency instead of a binary, easy to obtain, no need to subprocess it.

Changes

  • dictzip call replaced with idzip, while runDictzip contract are kept as close to dictzip version as possible.
  • stardict writer compresses also a syn file when exists and dictzip opt is True for consistence.
  • Few simple test cases for runDictzip.

Notes

idzip had been tried on the DPD stardict. It does the job as same efficient as the dictzip, even a few percents faster. On a 1G random binary blob it shows a bit worse performance than dictzip do.

P. S.: I will be out of a PC for a few next days, so sorry for possible long response time.

@bergentroll
Copy link
Contributor Author

bergentroll commented Sep 17, 2023

There are OSError exceptions possible on open (PremissionError particularly). It is very probably an object to except and fire a log.error().

Upd: Done.

@ilius
Copy link
Owner

ilius commented Sep 21, 2023

Let's use both. Check for idzip first, then check for dictzip. Because the whole point of this function is convenience (user can compress the files himself after convert). If user has dictzip, why bother installing anything else? We also don't like to break compatibility.

Also please:

  • Rebase.
  • Add idzip to scripts/test-deps.sh if you rely on it in tests (they run as github action now)
  • Install ruff and ruff --fix ./pyglossary (also checked in a github action)

@ilius ilius merged commit f5e9681 into ilius:master Sep 28, 2023
@ilius
Copy link
Owner

ilius commented Sep 28, 2023

Thanks.

@bergentroll
Copy link
Contributor Author

bergentroll commented Sep 29, 2023

Oh, I thought to update the test to test an idzip variant explicitly.

Also I would suggest to include python-idzip to extra_required in the setup.py.

And one other thing. It seems reasonable to me to get an error when dictzip flag of the stardict writer is True and files are not compresses because of missing dependencies. "Explicit is better than implicit" as said the Zen of Python. In such case compression may be conditional when no explicit dictzip value by a user.

@ilius

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants