Bug in DPSEG? #57

Open
connormayer opened this issue Mar 24, 2025 · 3 comments

Comments

@connormayer

Hello,

I'm working on a paper that compares different word segmentation algorithms, and wordseg has been extremely helpful. The wordseg documentation says that the DPSEG algorithm has a bug in it and "is not fully functional at present". I can't find any information about what the bug is/was and whether it has been fixed. I noticed in the source code that there's a function called _dpseg_bugfix to correct an issue with certain types of input. Is this the bug? I'm just hoping to confirm that DPSEG works properly before we report any results.

Thanks!

@mmmaat
Collaborator

mmmaat commented Mar 24, 2025

Hello,

I worked on that about 8 years ago and now I'm doing completely different things... As I remember it, the bug was related to some numerical issues with the Monte Carlo process, on the C++ side rather than the Python side (which should just be a wrapper over the C++ code). The _dpseg_bugfix you mention is not related to that bug; it is a fix for another bug that has since been solved.

@alecristia
Collaborator

Hi Connor,
Apologies! To my knowledge, that bug still has not been fixed. Like Mathieu, I've also moved on to other topics... I wonder if the people who created dpseg may be more helpful?

@connormayer
Author

Hi Mathieu and Alex,

Thanks for the replies! I actually work with one of the authors of the dpseg paper, so I'll reach out to her.
