Add directory tracking to sync #425
This deprecates usage of the `repofiles` package in favor of the `filer` package and consolidates the code paths into WSFS.

Note: one potentially breaking change here is the following. If a file at `foo/bar.txt` is created and removed, the directory `foo` is kept around because we do not perform directory tracking. If subsequently we need to write a file at `foo`, it will result in an `fs.ErrExist` because it is impossible to overwrite a directory. The previous implementation performed a recursive delete of the path if this happened, whereas this implementation will return the `fs.ErrExist` error to the user.

We can mitigate this in one of two ways:
* Track directories to remove as part of a `diff` and remove them
* Attempt to remove an empty directory tree if we see this error
* ...?
Sync currently doesn't clean up empty remote directories. This change computes the set of directories that have been removed during an incremental update and removes those as well.
Looks good to me, just needs integration tests for two cases:
- We delete empty directory trees on the workspace
- We do not delete if the directory tree is not empty (ie has a file in it)
Let's try this out
Looks good, just some minor suggestions
@shreyas-goenka Could you take a look at the |
LGTM
## Changes

CLI:
* Add directory tracking to sync ([#425](#425)).
* Add fs cat command for dbfs files ([#430](#430)).
* Add fs ls command for dbfs ([#429](#429)).
* Add fs mkdirs command for dbfs ([#432](#432)).
* Add fs rm command for dbfs ([#433](#433)).
* Add installation instructions ([#458](#458)).
* Add new line to cmdio JSON rendering ([#443](#443)).
* Add profile on `databricks auth login` ([#423](#423)).
* Add readable console logger ([#370](#370)).
* Add workspace export-dir command ([#449](#449)).
* Added secrets input prompt for secrets put-secret command ([#413](#413)).
* Added spinner when loading command prompts ([#420](#420)).
* Better error message if can not load prompts ([#437](#437)).
* Changed service template to correctly handle required positional arguments ([#405](#405)).
* Do not generate prompts for certain commands ([#438](#438)).
* Do not prompt for List methods ([#411](#411)).
* Do not use FgWhite and FgBlack for terminal output ([#435](#435)).
* Skip path translation of job task for jobs with a Git source ([#404](#404)).
* Tweak profile prompt ([#454](#454)).
* Update with the latest Go SDK ([#457](#457)).
* Use cmdio in version command for `--output` flag ([#419](#419)).

Bundles:
* Check for nil environment before accessing it ([#453](#453)).

Dependencies:
* Bump github.com/hashicorp/terraform-json from 0.16.0 to 0.17.0 ([#459](#459)).
* Bump github.com/mattn/go-isatty from 0.0.18 to 0.0.19 ([#412](#412)).

Internal:
* Add Mkdir and ReadDir functions to filer.Filer interface ([#414](#414)).
* Add Stat function to filer.Filer interface ([#421](#421)).
* Add check for path is a directory in filer.ReadDir ([#426](#426)).
* Add fs.FS adapter for the filer interface ([#422](#422)).
* Add implementation of filer.Filer for local filesystem ([#460](#460)).
* Allow equivalence checking of filer errors to fs errors ([#416](#416)).
* Fix locker integration test ([#417](#417)).
* Implement DBFS filer ([#139](#139)).
* Include recursive deletion in filer interface ([#442](#442)).
* Make filer.Filer return fs.DirEntry from ReadDir ([#415](#415)).
* Speed up sync integration tests ([#428](#428)).
Changes
This change replaces usage of the `repofiles` package with the `filer` package to consolidate WSFS code paths.

The `repofiles` package implemented the following behavior. If a file at `foo/bar.txt` was created and removed, the directory `foo` was kept around because we do not perform directory tracking. If subsequently, a file at `foo` was created, it resulted in an `fs.ErrExist` because it is impossible to overwrite a directory. It would then perform a recursive delete of the path if this happened and retry the file write.

To make this use case work without resorting to a recursive delete on conflict, we need to implement directory tracking as part of sync. The approach in this commit is as follows:
Tests