Feature Request: Visualize File Age (in the whole Subtree?) #118

Closed
shundhammer opened this issue Jul 30, 2019 · 12 comments

@shundhammer
Owner

shundhammer commented Jul 30, 2019

Original subject (from me; I misunderstood the intentions):

Feature Request: Option to Sort by Age (atime, mtime) instead of Size

Received via mail from Nicolas Brisset:

Maybe this is something that has already been discussed; it reminds me of something I read somewhere, but I can't seem to recall exactly where. I think it would be very interesting to have a way to visualize something other than the size of files. Typically, the last access or modification time may make sense when you want to archive old, unused files.

@shundhammer
Owner Author

shundhammer commented Jul 30, 2019

That has been there literally forever:

[Screenshot: QDirStat-main-win]

The Last Modified column is the latest mtime in that subtree or, for files, the mtime of that file. Of course you can also sort by that column.
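The "latest mtime in that subtree" rule can be illustrated with a minimal Python sketch. This is not QDirStat's code (QDirStat keeps this in its own in-memory tree); the function name `latest_mtime` is made up for this example.

```python
import os
import stat


def latest_mtime(path):
    """Return the newest mtime in the subtree rooted at path.

    For a plain file this is simply the file's own mtime.
    """
    st = os.lstat(path)
    newest = st.st_mtime
    if not stat.S_ISDIR(st.st_mode):
        return newest
    for dirpath, dirnames, filenames in os.walk(path):
        for name in dirnames + filenames:
            try:
                entry = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            newest = max(newest, entry.st_mtime)
    return newest
```

Sorting directories by this value brings the most recently touched subtrees to the top.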

The atime is not even kept in the internal data structures since that timestamp is completely useless; in my 25+ years of dealing with Unix/Linux systems I have yet to see the first useful application of the atime. Each and every access to a file will change it; for a directory, each find command will update it to "now". For a file, even the simplest of operations like invoking the file command to find out what that file is or a recursive grep -R in that directory tree will update it to "now". The very idea behind that timestamp is broken by design.
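For reference, both timestamps come from the same `stat()` call. The following Python sketch (an illustration only, not part of QDirStat) sets known values on a scratch file and reads them back. Whether a later read actually bumps the atime depends on how the filesystem is mounted, so the sketch does not try to demonstrate that.

```python
import os
import tempfile

# Create a scratch file to inspect.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    path = f.name

# Set known timestamps: (atime, mtime) in seconds since the epoch.
os.utime(path, (1_000_000_000, 1_500_000_000))

st = os.stat(path)
atime, mtime = st.st_atime, st.st_mtime
print("atime:", atime, "mtime:", mtime)

os.remove(path)
```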

To summarize this: This has been there for mtime forever, and for atime it's not useful at all.

@brisset

brisset commented Jul 31, 2019

It seems I wasn't clear enough, sorry for that. In fact I meant the treemap: I would like to be able to identify in one glance the oldest files by having an area proportional to their age.
If atime is not usable as you explain, then too bad... But mtime could still be interesting, couldn't it?

@shundhammer
Owner Author

shundhammer commented Jul 31, 2019

OK, sorry, then I misunderstood you.

So this could come in different stages:

  • Introduce a new summary field oldest file and a corresponding (optional?) column in the tree view that shows the age (in days?) of the oldest file (not directory, symlink etc.) in a subtree.

    For a file, this is the same as the last modified field / column (the mtime), but it might make sense to display it in days to make it easier to grasp visually (larger number -> older).

    For a directory, this would show the oldest age anywhere in the subtree.

  • Come up with a suitable visualization in addition to that. You are right that this could be done with a treemap, just not using the size as the value, but the age. But I see that as problematic because the treemap has a learning curve as it is, and having to deal with two different kinds of treemap and being able to tell the difference is a challenge unto itself. Now that QDirStat / KDirStat / WinDirStat users know what a treemap is and how to interpret it, I'd rather not introduce new confusion.

    Maybe there are better ways to visualize this.

@shundhammer changed the title from "Feature Request: Option to Sort by Age (atime, mtime) instead of Size" to "Feature Request: Visualize File Age (in the whole Subtree?)" on Jul 31, 2019
@shundhammer reopened this on Jul 31, 2019
@shundhammer
Owner Author

shundhammer commented Jul 31, 2019

How Others are Doing This

Agedu

I thought about this overnight and remembered that there was a program agedu that claims to do this:

https://www.chiark.greenend.org.uk/~sgtatham/agedu/

[Screenshot: agedu]

(I was a bit surprised to find out that this is not a GUI program, but a program to index a subtree and generate HTML from that index that you can load into your web browser)

Hrmph. I am not impressed. That visualization does not really tell me much. It's colored bars, but those bars don't give away much information, not even with the legend above them.

ArcheoloGit

https://marmelab.com/blog/2014/05/15/archeologit.html

(Click for screenshots; not sure if it's okay with them to repost those screenshots here)

This is not exactly about file age; this is about source code age and change frequency. But it goes in a similar general direction.

They are also using a treemap. They visualize change frequency of a file (commits per time) with the treemap value (where QDirStat uses the size), i.e. the more commits a file has, the larger the corresponding treemap tile (the rectangle).

The color represents the age; from green (fresh) to red / brown (rotten?). I like that idea.

Independent of this visualization, I had thought about using different color schemes; like the brown-yellowish sepia you might know from old photographs for old stuff, and something more crisp for new stuff.

@brisset

brisset commented Jul 31, 2019

Yes, the ArcheoloGit is very close to what I had in mind (just that for qdirstat it may be difficult to get the number of times a file was changed, unless it is somehow a working copy or clone which then may offer this specific view - but then it becomes even more complicated).
I also like your idea of the Sepia-ish color scale.
Regarding how to avoid confusion: how about adding the treemaps in different tabs with an explicit name (Size map and Age map or Activity map)? Possibly make the list of shown tabs configurable in the settings, or alternatively add a menu/button in the toolbar to select the view(s).

@shundhammer
Owner Author

shundhammer commented Jul 31, 2019

Yes, the number of times changed is out of the question; it also wouldn't be very useful in the context of QDirStat.

I thought about it some more: Both extremes might be useful to know. Sometimes I want to spot files that changed just a moment ago (or today), sometimes I am interested in very old stuff. It depends on the specific use case.

So, just throwing crazy ideas into the room: In an ideal world, I'd like to see the "normal" files shaded differently from the very old and from the very new files. Old ones might be brownish / sepia / red brown, new ones, say, bright green; the "normal" ones would get a neutral color like medium grey (like the treemap already does for files with unknown MIME categories).

The trouble starts with determining what exactly that means: "Normal" vs. "old" vs. "new". I'd like to pick the median for "normal" or maybe even the interval between the first and the third quartile.

Mind you, in an extreme example where I just downloaded or unpacked a thousand files where all files have pretty much the same age, I wouldn't want to see the extreme colors; the visualization should clearly show that they are of roughly the same age.
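The "normal" vs. "old" vs. "new" idea could be sketched like this in Python. This is only one possible reading of the comment above: it takes the interval between the first and third quartile as "normal", and it uses a made-up `min_spread` threshold so that a batch of files of roughly the same age all come out as "normal". Both the function name and the threshold are assumptions for illustration.

```python
import statistics


def classify_by_age(mtimes, min_spread=24 * 3600):
    """Label each mtime as 'old', 'normal' or 'new' based on the quartiles.

    min_spread (in seconds) is a hypothetical threshold: if the
    interquartile range is smaller than that, everything is 'normal'.
    """
    if len(mtimes) < 2:
        return ["normal"] * len(mtimes)

    q1, _median, q3 = statistics.quantiles(mtimes, n=4)
    if q3 - q1 < min_spread:
        # All files have roughly the same age: no extreme colors.
        return ["normal"] * len(mtimes)

    labels = []
    for t in mtimes:
        if t < q1:
            labels.append("old")      # older than the first quartile
        elif t > q3:
            labels.append("new")      # newer than the third quartile
        else:
            labels.append("normal")
    return labels
```

A treemap renderer could then map "old", "normal" and "new" to sepia, grey and green respectively.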

The cowardly solution would be to add two sliders like in the histogram to move those boundaries up and down; not sure if it's realistic to please most users (you can never please everybody anyway...).

Technically, determining things like a median and quartiles / percentiles over that many items is an expensive thing (but then, I already have that code for the histograms): It involves sorting all data and then counting the intervals between the percentiles (for the generic solution).
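The sort-then-pick approach can be sketched as follows. This uses the simple nearest-rank definition of a percentile; the existing QDirStat histogram code may well use a different definition (for example with interpolation), so treat this as an illustration of the cost structure (one sort over all items), not as a description of that code.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: sort all data, then pick the value at
    rank ceil(p / 100 * n), counting ranks from 1."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    if not 0 <= p <= 100:
        raise ValueError("p must be between 0 and 100")
    ordered = sorted(values)  # O(n log n); this is the expensive part
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]
```

For many percentiles over the same data, sorting once and reusing the sorted list avoids repeating the expensive step.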

Let me think about this some more. This may be turning out to be something really useful - and, more importantly, something unique. So far I have not seen any similar program do anything like that.

Can you elaborate some more about your specific use case? Maybe that might give some more insights. I bet there are more conceivable use cases, but since you started this discussion, you get to present yours first. 😄

@shundhammer
Owner Author

shundhammer commented Aug 1, 2019

The first very simple support for this use case is now in git master:

We now have a column Oldest File in the tree, and a row Oldest File in the details view for directories and packages:

[Screenshot: QDirStat-oldest-file]

Both show the oldest nonzero (i.e. not time_t 0, 1970-01-01 00:00) timestamp in the respective subtree for files only. Directories, symlinks and special files (block and character devices, FIFOs etc.) are disregarded.

Since this timestamp is the same for a plain file as the mtime, non-directory items do not show anything in that column.
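The rule described above (oldest nonzero mtime, plain files only) can be sketched in Python. This is not the actual QDirStat implementation, which works on its own in-memory tree; `oldest_file_mtime` and `age_in_days` are names invented for this example, and the conversion to days follows the earlier suggestion in this thread.

```python
import os
import stat
import time


def oldest_file_mtime(root):
    """Return the oldest nonzero mtime of any plain file below root,
    or None if there is no such file."""
    oldest = None
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # unreadable or vanished entry
            if not stat.S_ISREG(st.st_mode):
                continue  # disregard symlinks, devices, FIFOs etc.
            if st.st_mtime == 0:
                continue  # time_t 0 counts as "no timestamp"
            if oldest is None or st.st_mtime < oldest:
                oldest = st.st_mtime
    return oldest


def age_in_days(mtime, now=None):
    """Age of a timestamp in whole days (larger number -> older)."""
    if now is None:
        now = time.time()
    return int((now - mtime) // 86400)
```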

Of course you can also sort the tree by this column, and that makes it easy to spot old files. It's not quite as quick as with a treemap that does something similar because you still have to open the branches to get to the old files.

By default, this column is now enabled in layouts L2 and L3. I was not entirely sure if I really like it in L2 (the default layout); for the time being, it's there. One way or the other, as with (almost) all columns, the user is free to show or hide it in each of the layouts.

Please experiment with this first version to see if it is useful for you.

I was mildly surprised that this already showed me some old cruft in my home directory; some dot directories from ancient versions of flashplayer and whatnot. This alone might justify having that column.

@shundhammer
Owner Author

If you don't see that new column during testing, enable it in the context menu of the tree view column header.

@shundhammer
Owner Author

I tried different prototypes for treemaps with colors by file age, and so far I didn't find a single one that didn't look dead ugly or that confused the hell out of me -- or both.

I tried cushion treemaps and plain treemaps. With cushion shading, the colors get too dark to be easily distinguishable; the desired effect reminiscent of tree leaves (spring green -> new; dark green -> middle age; brown -> old) just does not work with that.

Plain treemaps (without cushion shading) just get primary colors, and that's just plain ugly, and the grouping effect that the cushion shading has (to indicate what files belong together in the hierarchy) is completely lost.

With treemaps in general, users have a hard time telling what they are seeing right now: is it the MIME type or the file age? So the learning curve for interpreting the treemap graphics and what to do with them gets considerably steeper, or the learning effect is lost entirely.

@shundhammer
Owner Author

shundhammer commented Aug 5, 2019

The best kind of shading that I could come up with so far is this (but I still don't like it very much):

[Screenshot: age-map-proto-01]

Shading base colors:

  • Default (middle age): #6ec300 (blueish grey)
  • New stuff: #00d900 (green)
  • Old stuff: #ce621a (copper)

Treemap parameters in ~/.config/QDirStat/QDirStat.conf:

[Treemaps]
AmbientLight=10
HeightScaleFactor=0.7
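
To check such values from a script, an INI fragment like the one above can be read with Python's `configparser`. This is only a sketch: it parses an inline copy of the fragment, and it assumes the real file is plain enough INI for `configparser` to handle, which may not hold for every entry in it.

```python
import configparser

# Inline copy of the fragment quoted above.
SAMPLE = """\
[Treemaps]
AmbientLight=10
HeightScaleFactor=0.7
"""

# interpolation=None: treat '%' literally instead of as a placeholder.
parser = configparser.ConfigParser(interpolation=None)
parser.optionxform = str  # keep key names case-sensitive as written
parser.read_string(SAMPLE)

ambient_light = parser.getint("Treemaps", "AmbientLight")
height_scale = parser.getfloat("Treemaps", "HeightScaleFactor")
print(ambient_light, height_scale)
```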

This only simulates the age treemap, and it only uses 3 shades. The real thing should use many more shades for old and for new stuff (ideally a continuous spectrum).
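A continuous spectrum between the three base colors could be built by linear interpolation in RGB space. The sketch below is just one way to do it, not what the prototype does (the prototype uses only the 3 fixed shades). The hex values are taken verbatim from the list above; the function names and the 0..1 age scale are assumptions for this example.

```python
def hex_to_rgb(color):
    """'#rrggbb' -> (r, g, b) with 0..255 components."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """(r, g, b) -> '#rrggbb'."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def lerp_color(c1, c2, t):
    """Blend from c1 (t = 0) to c2 (t = 1), component by component."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c1, c2))


# Base colors from the prototype: old, middle age, new.
OLD, MID, NEW = (hex_to_rgb(c) for c in ("#ce621a", "#6ec300", "#00d900"))


def age_color(x):
    """Map x in [0, 1] to a color: 0 = oldest, 0.5 = middle, 1 = newest."""
    x = min(1.0, max(0.0, x))
    if x < 0.5:
        return rgb_to_hex(lerp_color(OLD, MID, x / 0.5))
    return rgb_to_hex(lerp_color(MID, NEW, (x - 0.5) / 0.5))
```

Plain RGB interpolation is the simplest choice; blending in a different color space might give more evenly spaced shades.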

The analogy with tree leaves doesn't work well, thus the not-very-tree-like shade for the medium values. So there is another learning curve how to interpret this.

But as I mentioned before, this very likely defeats the users' learning process how to interpret the treemap in general, so right now, this entire approach appears counterproductive to me.

In the current state, this seems to be a user interface disaster.

@shundhammer
Owner Author

So this might need a completely different kind of visualization; I don't want to wreck the treemap and the users' perception of it for this.

@shundhammer
Owner Author

No feedback at all. Closing.

shundhammer added a commit that referenced this issue Aug 12, 2019
This reverts commit 5c1870d.
Don't overwhelm the user with information of little value;
the oldest file can be shown as a separate column in the tree view,
but in the details panel that information adds little value,
yet it contributes to the clutter in that panel.