Unpredictable load order for Ruby files causing randomness in undocumented reports. #1134

Closed
mensfeld opened this issue Nov 8, 2017 · 8 comments

Comments

@mensfeld

mensfeld commented Nov 8, 2017

Unpredictable load order for Ruby files causing randomness with undocumented reports.

This problem is tricky to reproduce because it involves different operating systems, so let me lay out the context first.

I use yard stats --list-undoc to track undocumented objects in the code base. This is really helpful for refactoring and documenting bigger systems. I have several parsers and tools that wrap the raw yard output so I can format it more nicely. However, the code I'm working on is developed on both Mac and Linux, and I've noticed that when the same undocumented object is defined in more than one file, the report attributes it to a different file depending on the OS, even though the command is executed in exactly the same way (and sometimes on the same OS, after changes were applied on the other one).

This happens because the behavior of Dir.glob cannot be relied upon to be the same across different OSes. I'm not sure whether this is by design; it looks more like an artifact of the file systems.

On Windows and Linux the results are sorted by hierarchy and then alphabetically; on Mac OS X the results are sorted alphabetically.

It seems to be related to this area of the code (not the particular line though):

https://github.com/lsegal/yard/blob/master/lib/yard/cli/yardoc.rb#L406
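
To make the difference between the two orderings concrete, here is a small pure-Ruby simulation using the file layout from the reproduction steps. It does not call Dir.glob at all; the helper names hierarchy_first and alphabetical are hypothetical and only mimic the two behaviors described in this report.

```ruby
# Hypothetical helpers that mimic the two orderings described in this report.
# This is a simulation only; neither helper is how Dir.glob is implemented.

# Shallower paths first, then alphabetical among paths of the same depth.
def hierarchy_first(paths)
  paths.sort_by { |path| [path.count("/"), path] }
end

# Plain alphabetical order of the full path string.
def alphabetical(paths)
  paths.sort
end

paths = ["lib/a/b/test.rb", "lib/b/test.rb"]

puts hierarchy_first(paths).inspect
puts alphabetical(paths).inspect
```

With this input the first ordering puts lib/b/test.rb ahead of lib/a/b/test.rb, while the second puts lib/a/b/test.rb first, so the file that gets parsed first (and reported as the location of A) differs.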

Steps to reproduce

Note: because the Dir.glob order is more or less unpredictable, it can be hard to replicate this issue without some experimenting across OSes; the Dir.glob problem itself is widely known, however. It can be simulated on a single OS by changing the file hierarchy between executions, and that is how the reproduction below works.

This is the minimal reproduction for the issue. I've done my best to remove
all extraneous code and unique environment state on my machine before providing
these steps:

  1. The content of both files can be a module or a class that is used as a namespace and exists in multiple files:
module A
end
  1. On Linux, create a repository with two files that have the same content, using the following structure:
  • lib/a/test.rb
  • lib/b/test.rb
  1. Run yard stats --list-undoc --debug
[test]$ be yard stats --list-undoc --debug
[debug]: Parsing ["{lib,app}/**/*.rb", "ext/**/*.{c,cc,cxx,cpp}"] with `ruby` parser
[debug]: Parsing lib/a/test.rb
[debug]: Parsing lib/b/test.rb
[debug]: Serializing to .yardoc/objects/root.dat
Files:           1
Modules:         1 (    1 undocumented)
Classes:         0 (    0 undocumented)
Constants:       0 (    0 undocumented)
Attributes:      0 (    0 undocumented)
Methods:         0 (    0 undocumented)
 0.00% documented

Undocumented Objects:

(in file: lib/a/test.rb)
A
  1. Create a nested directory b within a and move your test.rb file there, so the structure looks like this:
  • lib/a/b/test.rb
  • lib/b/test.rb
  1. Run yard stats --list-undoc --debug
[test]$ be yard stats --list-undoc --debug
[debug]: Parsing ["{lib,app}/**/*.rb", "ext/**/*.{c,cc,cxx,cpp}"] with `ruby` parser
[debug]: Parsing lib/b/test.rb
[debug]: Parsing lib/a/b/test.rb
[debug]: Serializing to .yardoc/objects/root.dat
Files:           1
Modules:         1 (    1 undocumented)
Classes:         0 (    0 undocumented)
Constants:       0 (    0 undocumented)
Attributes:      0 (    0 undocumented)
Methods:         0 (    0 undocumented)
 0.00% documented

Undocumented Objects:

(in file: lib/b/test.rb)
A

Actual Output

When the execution results are diffed:

2d1
< [debug]: Parsing lib/a/test.rb
3a3
> [debug]: Parsing lib/a/b/test.rb
15c15
< (in file: lib/a/test.rb)
---
> (in file: lib/b/test.rb)

Expected Output

I would expect it to always report the first occurrence in alphabetical order.

Environment details:

  • OS: Ubuntu 14.04
  • Ruby version (ruby -v): ruby 2.4.0p0 (2016-12-24 revision 57164) [x86_64-linux]
  • YARD version (yard -v): yard 0.9.9

Recommended way of fixing this issue

The Dir.glob output should be sorted. That's all that is required. Then it will be predictable across OSes.
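
As a minimal sketch of that suggestion (not the actual change made in YARD), the glob result can be sorted before it is used. The helper name sorted_glob and its base: parameter are hypothetical; the pattern is the one shown in the debug output above.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical helper: expand the glob patterns relative to `base` and return
# the matches alphabetically, whatever order the OS happened to yield them in.
def sorted_glob(patterns, base:)
  Dir.chdir(base) { Dir.glob(patterns).sort }
end

# Recreate the second layout from the reproduction steps in a temp directory.
Dir.mktmpdir do |root|
  %w[lib/a/b/test.rb lib/b/test.rb].each do |relative|
    path = File.join(root, relative)
    FileUtils.mkdir_p(File.dirname(path))
    File.write(path, "module A\nend\n")
  end

  puts sorted_glob(["{lib,app}/**/*.rb"], base: root).inspect
end
```

Because the sort is applied to the full relative path, lib/a/b/test.rb is listed before lib/b/test.rb on every platform.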

Ideal way of fixing this issue

Ideally, there would be a way to inject a custom order into yard. This would allow people like me to prioritize files by their creation date (oldest first), so I would get fairly stable results even when new files are added to the structure and introduce additional noise.
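
A rough sketch of what such a hook could look like, purely as an illustration: order_files is a hypothetical helper, not an existing YARD option, and the timestamps are made up rather than read from the file system.

```ruby
# Hypothetical hook: order files by a caller-supplied key. The path itself is
# used as a tie-breaker (and as the default key), so the result stays
# deterministic even when two files share the same key.
def order_files(paths, &key)
  key ||= ->(path) { path }
  paths.sort_by { |path| [key.call(path), path] }
end

# Made-up creation times for illustration; a real caller might take them from
# version control history or from the file system instead.
created_at = {
  "lib/a/b/test.rb" => Time.utc(2017, 11, 8),
  "lib/b/test.rb"   => Time.utc(2016, 1, 1)
}

by_name = order_files(created_at.keys)
oldest_first = order_files(created_at.keys) { |path| created_at[path] }

puts by_name.inspect
puts oldest_first.inspect
```

With no block the order is alphabetical; with the creation-time key, lib/b/test.rb comes first because it is the older file in this made-up data.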

I have read the Contributing Guide.

@lsegal
Owner

lsegal commented Nov 17, 2017

I believe this might be fixed by #1123

@mensfeld
Author

Hello @lsegal, yes, this would indeed fix it (or at least minimise the impact, since a real fix would be a custom sorting option). Is there a chance of a release with this change?

@lsegal
Owner

lsegal commented Nov 18, 2017

@mensfeld the PR is still in review. It looks like it's blocked on PR updates. If you want to pick them up and run with it, feel free!

@MSP-Greg
Contributor

MSP-Greg commented Nov 18, 2017

@mensfeld @lsegal

Somewhat unrelated, but in the PR I mentioned some commits to RubyGems re unstable sort. I did one of them (PR), and I think both were related to Linux vs Windows differences. You've got differences between MacOS/OSX and Windows/Linux. Upon checking, RubyGems is not tested against MacOS/OSX, just Linux and Windows.

EDIT: RubyCI.org does have MacOS/OSX testing.

Another reason to never expect a stable sort...

@lsegal
Owner

lsegal commented Nov 20, 2017

This should be fixed and released in 0.9.10. Thanks for reporting and thanks @MSP-Greg for the fix!

@lsegal lsegal closed this as completed Nov 20, 2017
@mensfeld
Author

@lsegal thank you.

@mensfeld
Author

@lsegal it's weird, but I still observe this behavior :/ Is there a way to impose a custom order, for example to sort by name?

@mensfeld
Author

@lsegal re-pinging you in case you have an answer to my question :)
