Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[filestorage]: Provide an option to the user to create a directory #34985

Merged

Conversation

VihasMakwana
Copy link
Contributor

Description:
This PR introduces a new option that will create a directory for the user if it doesn't already exist.

Link to tracking Issue: #34939

Testing: Added

@@ -24,6 +24,8 @@ The default timeout is `1s`.

`fsync` when set, will force the database to perform an fsync after each write. This helps to ensure database integrity if there is an interruption to the database process, but at the cost of performance. See [DB.NoSync](https://pkg.go.dev/go.etcd.io/bbolt#DB) for more information.

`create_directory` when set, will create the data storage directory if it does not already exist. The directory will be created with `0755 (rwxr-x--)` permissions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make permissions configurable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did that. PTAL

@djaglowski djaglowski added the Run Windows Enable running windows test on a PR label Sep 3, 2024
@VihasMakwana VihasMakwana requested a review from atoulme September 3, 2024 18:07
@djaglowski
Copy link
Member

Failing windows test is unrelated. See #35002

@VihasMakwana
Copy link
Contributor Author

There's another unrelated failure due to a dead link. It reports

https://linux.die.net/man/1/strings → Status: 403

the link works fine on my browser though

@@ -65,7 +65,7 @@ func (cfg *Config) Validate() error {
for _, dir := range dirs {
if info, err := os.Stat(dir); err != nil {
if !cfg.CreateDirectory && os.IsNotExist(err) {
return fmt.Errorf("directory must exist: %w", err)
return fmt.Errorf("directory must exist: %w. You can enable the create_directory option to automatically create it.", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this whole logic is used not only for the directory configuration option, but also for the compaction::directory option. Only the validation of the directory option should be modified. The validation for compaction::directory setting should be unchanged.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrzej-stencel we also create the compaction directory, if it doesn't exist. See here

if config.CreateDirectory {
var dirs []string
if config.Compaction.OnStart {
dirs = []string{config.Directory, config.Compaction.Directory}
} else {
dirs = []string{config.Directory}
}
perm, err := strconv.ParseInt(config.DirectoryPermissions, 8, 32)
if err != nil {
logger.Debug("Error while parsing permissions, switching to default mode (0750)", zap.String("DirectoryPerms", config.DirectoryPermissions))
perm = 0750
}
for _, dir := range dirs {
if err := createDirectory(dir, os.FileMode(perm)); err != nil {
return nil, err
}
}
}


On a side note, I've updated docs to mention the same

create_directory when set, will create the data storage and compaction directory if it does not already exist. The directory will be created with 0750 (rwxr-x--) permissions, by default. Use directory_permissions to customize directory creation permissions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, missed that, thanks. I suppose that's OK, thanks for updating the docs.

I see the following issue in this logic though. The compaction directory is only created when compaction::on_start is enabled, but not when compaction::on_rebound is enabled. This means that if I use the following configuration:

extensions:
  file_storage:
    compaction:
      directory: non-existent-compaction-dir
      on_rebound: true
    create_directory: true
    directory: non-existent-storage-dir 

then the collector will not create the compaction directory non-existent-compaction-dir either on start or later, when rebound compaction is being performed, resulting in an error trying to compact.

I think ideally in this case the compaction directory should be created not on start, but only if/when needed, if/when the rebound compaction is actually performed.

Alternatively, only have the create_directory option affect the root directory option, leaving space for creating a separate compaction::create_directory option controlling the creation of the compaction directory. I would be fine with either approach.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the following issue in this logic though. The compaction directory is only created when compaction::on_start is enabled, but not when compaction::on_rebound is enabled.

@andrzej-stencel I filed #35114 for this.

As per my understanding, we should check for compaction directory when either of the options (on_start or on_rebound) are enabled

Copy link
Contributor Author

@VihasMakwana VihasMakwana Sep 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrzej-stencel with #35114 merged, we now create compaction directory on rebound or start.

I think ideally in this case the compaction directory should be created not on start, but only if/when needed, if/when the rebound compaction is actually performed.

Rebound compaction takes place periodically, usually every 5 seconds by default. Checking the compaction directory with every iteration would be an inefficient. What do you think?

Copy link
Member

@andrzej-stencel andrzej-stencel Sep 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebound compaction takes place periodically, usually every 5 seconds by default.

This is inaccurate. The check takes place periodically, but the compaction is only triggered when the conditions are met.

But I suppose creating the directory on collector's start when rebound compaction is enabled is OK too.

Thank you for making these changes and for the tests Vihas! 🙏

I think one last change is needed, adding OnRebound in newLocalFileStorage function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is inaccurate. The check takes place periodically, but the compaction is only triggered when the conditions are met.

Yes, sorry I missed that.

extension/storage/filestorage/config.go Outdated Show resolved Hide resolved
extension/storage/filestorage/extension.go Outdated Show resolved Hide resolved
extension/storage/filestorage/config_test.go Show resolved Hide resolved
extension/storage/filestorage/config_test.go Show resolved Hide resolved
@@ -65,7 +65,7 @@ func (cfg *Config) Validate() error {
for _, dir := range dirs {
if info, err := os.Stat(dir); err != nil {
if !cfg.CreateDirectory && os.IsNotExist(err) {
return fmt.Errorf("directory must exist: %w", err)
return fmt.Errorf("directory must exist: %w. You can enable the create_directory option to automatically create it.", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, missed that, thanks. I suppose that's OK, thanks for updating the docs.

I see the following issue in this logic though. The compaction directory is only created when compaction::on_start is enabled, but not when compaction::on_rebound is enabled. This means that if I use the following configuration:

extensions:
  file_storage:
    compaction:
      directory: non-existent-compaction-dir
      on_rebound: true
    create_directory: true
    directory: non-existent-storage-dir 

then the collector will not create the compaction directory non-existent-compaction-dir either on start or later, when rebound compaction is being performed, resulting in an error trying to compact.

I think ideally in this case the compaction directory should be created not on start, but only if/when needed, if/when the rebound compaction is actually performed.

Alternatively, only have the create_directory option affect the root directory option, leaving space for creating a separate compaction::create_directory option controlling the creation of the compaction directory. I would be fine with either approach.

@VihasMakwana
Copy link
Contributor Author

@andrzej-stencel @djaglowski I've added more test coverage. PTAL!

@@ -124,7 +127,7 @@ following troubleshooting method works for bbolt-managed files. As such, there i

When troubleshooting components that use the File Storage extension, it is sometimes helpful to read the raw contents of
files created by the extension for the component. The simplest way to read files
created by the File Storage extension is to use the strings utility ([Linux](https://linux.die.net/man/1/strings),
created by the File Storage extension is to use the strings utility ([Linux](https://man7.org/linux/man-pages/man1/strings.1.html),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The check-links is failing, so I'm updating this just to see if it passes.
I'll raise another PR if that's more convenient.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed via #35145

@VihasMakwana
Copy link
Contributor Author

@djaglowski I think this is good to go. The prometheus test failure is a known issue.

@djaglowski djaglowski merged commit 0fb57cb into open-telemetry:main Sep 11, 2024
171 of 172 checks passed
@github-actions github-actions bot added this to the next release milestone Sep 11, 2024
jriguera pushed a commit to springernature/opentelemetry-collector-contrib that referenced this pull request Oct 4, 2024
…pen-telemetry#34985)

**Description:** 
This PR introduces a new option that will create a directory for the
user if it doesn't already exist.

**Link to tracking Issue:**
open-telemetry#34939

**Testing:** Added

---------

Co-authored-by: Daniel Jaglowski <[email protected]>
Co-authored-by: Andrzej Stencel <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
extension/storage Run Windows Enable running windows test on a PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants