Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid byte slice parameters #9

Closed
ahouene opened this issue Jul 13, 2022 · 3 comments · Fixed by #11
Closed

Avoid byte slice parameters #9

ahouene opened this issue Jul 13, 2022 · 3 comments · Fixed by #11

Comments

@ahouene
Copy link
Contributor

ahouene commented Jul 13, 2022

As we're passing potentially large amounts of data, providing functions dealing in byte slices can lead to storing a large amount of data in memory without need.

Thus, the functions provided by simpleblob should deal only in io.Reader and io.Writer. Turning byte slices into an io.Reader and back should be the user's concern, provided they need to.
We should take advantage of the package not being spread yet and Interface being explicitly about to change to do this work.

The details remain to be determined, but it could be shaped as:

// A Storage represents a backend handling file storage.
type Storage interface {
	// Blob fetches or creates a Blob.
	// If no data with given name exists upstream, create it at write and return no error.
	// An error shall be returned in case the storage's backend fails.
	Blob(name string) (Blob, error)

	// Blobs returns all known blobs for Storage.
	Blobs() []Blob

	// Other fields probably?
}

// A Blob is an entry, which purpose is to be stored on a backend.
type Blob interface {
	io.Reader
	io.Writer
	io.Closer
	Name() string
	Size() int64
	Remove() error
	Storage() Storage
	ModTime() time.Time
}

See also:

@wojas
Copy link
Member

wojas commented Jul 18, 2022

You are right about this being inefficient for large values, but the simple key-value interface was intentional.

It allows implementing simple backends like the memory backend, redis backends and memcached backends without every backend that does not support streaming values having to implement a temporary buffer and without every access having to go through an open/read-or-write/close step.

The target project would not have blobs larger that about 10 MB before inefficiencies in other places would become an issue. Even 100 MB would be workable in many use cases, even if less efficient.

For use cases where it does matter, I propose optional backend interfaces as described in #2 and #3. Simpleblob would use that backend interface to provide an efficient streaming API when implemented, and fall back to buffering with a Store on close.

Such an approach offers the following benefits:

  • A simple backend requires minimal code to implement.
  • A client that does not work with large values, or already holds the full value in memory, can use the simple API.
  • Efficient streaming of large values is opt-in for both sides.

@ahouene
Copy link
Contributor Author

ahouene commented Jul 20, 2022

You're right, this should be optional and probably regarded as fs.GlobFS, fs.SubFS etc., allowing an optimized implementation when available.

As such, would following interfaces be more suitable?

// A ReaderStorage provides an optimized io.ReadCloser.
type ReaderStorage interface {
	// NewReader returns an io.ReadCloser, allowing stream reading of named value from the underlying backend.
	NewReader(ctx context.Context, name string) (io.ReadCloser, error)
}

// A WriterStorage provides an optimized io.WriteCloser.
type WriterStorage interface {
	// NewWriter returns an io.WriteCloser, allowing stream writing to named key in the underlying backend.
	NewWriter(ctx context.Context, name string) (io.WriteCloser, error)
}

// NewReader returns an optimized io.ReadCloser for backend if available, else a basic buffered implementation.
func NewReader(ctx context.Context, st Interface, name string) (io.ReadCloser, error)

// NewWriter returns an optimized io.WriteCloser for backend if available, else a basic buffered implementation.
func NewWriter(ctx context.Context, st Interface, name string) (io.WriteCloser, error)

The default fs.FS wrapper (#2) would benefit from this as well, calling simpleblob.NewReader behind the scenes when no fs.FS implementation exists for backend.

@wojas
Copy link
Member

wojas commented Jul 20, 2022

As such, would following interfaces be more suitable?

That looks quite clean and consistent, I like it!

The default fs.FS wrapper (#2) would benefit from this as well, calling simpleblob.NewReader behind the scenes when no fs.FS implementation exists for backend.

You are right. This way most storage backends will not even need to implement fs.FS directly.

@ahouene ahouene linked a pull request Oct 16, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants