Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow struct method extensions #1170

Closed
lerno opened this issue Jun 28, 2018 · 40 comments
Closed

Allow struct method extensions #1170

lerno opened this issue Jun 28, 2018 · 40 comments
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@lerno
Copy link

lerno commented Jun 28, 2018

I would like to suggest that struct methods could be defined outside of the struct and even in different source files.

For example, assume we have a Vec3 struct, defined like this:

const Vec3 = struct {
   x: f32,
   y: f32,
   z: f32,

   pub fn init(x: f32, y: f32, z: f32) Vec3 {
     return Vec3 {
        .x = x,
        .y = y,
        .z = z,
     };
  }

  pub fn dot(self: *const Vec3, other: *const Vec3) f32 {
    return self.x * other.x + self.y * other.y + self.z * other.z;
  }
};

The proposal is that one could define extensions in other files like this:

pub fn Vec3.cross(self: *const Vec3, other: *const Vec3) f32 { ... }

One would obviously need to discuss how (if) this should be supported for generic structs.

This is not UFCS ( discussed in #148 ), but much more narrow change that might be a better fit for Zig.

@andrewrk andrewrk added this to the 0.4.0 milestone Jun 28, 2018
@andrewrk andrewrk added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Jun 28, 2018
@binary132
Copy link

binary132 commented Jun 29, 2018

Is it feasible to do this now via some two-step comptime function?

  1. Compose several types or methods together into some comptime collection in a few comptime invocations.
  2. Make a comptime call to elaborate that collection into a type.

(Please excuse this rambling comment, I elaborated on a few ideas and it's a little bigger than I intended now!)


comptime or @struct approach?

What about solving this with a comptime namespace, maybe via macro, or some kind of syntactic sugar, which would desugar or elaborate to a single struct namespace?

Or, maybe a simpler approach to multi-file namespace extension could work.

I'm imagining something a little like this:

vec.zig: @struct tells the compiler to find all @struct in the local tree for the name "Vec" before assignment to Vec3. Maybe specify filenames explicitly?

const Vec3 = @struct(Vec, {
  x: f32,
  y: f32,
  // other stuff
})

vec_dot.zig: Add to the namespace named Vec before the assignment to Vec3 in vec.zig.

@struct(Vec, {
  pub fn dot(self: ....)
})

Here's another way it could be done.

Allow a block to be named and elaborated in multiple files one way or another. A call to @block without a body elaborates the block named in all the other @block expansions into a single inline block.

vec.zig: use a named block.

const Vec3 = struct @block(Vec)

@block(Vec, {
  x: f32,
  y: f32,
})

vec_dot.zig: Expand Vec with a method.

@block(Vec, {
  pub fn dot(self: ...)
})

@thejoshwolfe
Copy link
Contributor

What's the motivation for this? Why not just cross(v1, v2);?

@isaachier
Copy link
Contributor

@thejoshwolfe I think it's about OOP-ness of the language. The only popular language I know to allow this is C# with its partial classes. My guess is that implementing this sort of thing involves potential slowdowns (vtable?) that aren't worth the benefits.

@lerno
Copy link
Author

lerno commented Jun 29, 2018

To resolve the inconsistency and tension between utility methods and packaged methods. The example here illustrates that we have v1.dot(v2) but then cross(v1, v2) (or more probable: vec3_cross(v1, v2) )

So essentially when you want to extend the operations on a struct (that is also tightly coupled to the struct itself) you need to use a function, whereas the struct author can use a struct method.

I want to argue that this is inconsistent and confusing. As I see it the struct methods are merely namespaced functions with slightly easier discovery than a free function.

In particular I feel that this will make the standard library much more useful and reusable.

@lerno
Copy link
Author

lerno commented Jun 29, 2018

@isaachier No, this is all resolved at compile time.

@isaachier
Copy link
Contributor

OK my fear here is that importing a different set of packages effectively gives you a different class. In order to use someone else's code with such a struct, you must determine which files you need. I believe a similar issue exists with Rust traits.

@lerno
Copy link
Author

lerno commented Jun 29, 2018

@isaachier I suggest that these functions act like any other function and need to be explicitly imported into the namespace.

Consequently, added methods would be local to the context into which they are added, but would not pollute the outer context.

So if you use two libraries, one which internally uses the function Vec3.foo(), and another which internally uses Vec3.bar(), you would see none of them unless they are public AND you explicitly pull them into your current context.

@lerno
Copy link
Author

lerno commented Jun 29, 2018

One more note: since all of this essentally is syntactic sugar, you could have libraries A and B, both defining Vec3.cross - and if you want import them into your scope with different names e.g. Vec3.cross_a and Vec3.cross_b

@bnoordhuis
Copy link
Contributor

My guess is that implementing this sort of thing involves potential slowdowns (vtable?) that aren't worth the benefits.

Switching from a closed-for-modification to open-for-modification assumption inhibits some optimizations that you don't get back until you turn on LTO, and don't get back at all with shared libraries. Feature requests for a final modifier that prevents extension are bound to come up. :-)

@lerno
Copy link
Author

lerno commented Jun 29, 2018

@bnoordhuis How do you mean?

pub fn Vec3.cross(self: *const Vec3, other: *const Vec3) f32 { ... }

Would be similar to defining:

pub fn Vec3__cross(self: *const Vec3, other: *const Vec3) f32 { ... }

The compiler would simply expand Vec3 struct v1, v1.cross(v2) into Vec3__cross(v1, v2). It is syntactic sugar only and it only expands in your current context.

So basically, when the compiler comes across v1.<some_method>(v2):

  1. Check if v1 type is known
  2. If so, expand the name into __<some_method>, and set v1 as the first argument
  3. Check if this function exists, if so rewrite as that function.

When the compiler comes across fn .( *x, ... ):

  1. Check that type1 == type2 (otherwise throw parse error)
  2. Rewrite the function name as ...( *x, ...);

Hope that this makes things clearer: this is not related to vtables or OO. There is no dispatch on the first argument.

@bnoordhuis
Copy link
Contributor

I assumed your proposal implies access to non-public methods and properties of the struct. That blocks off most closed-world optimizations.

If that isn't the proposal, there is no performance issue, just the philosophical issue of whether it aligns with zig's design goals.

@lerno
Copy link
Author

lerno commented Jun 29, 2018

@bnoordhuis Only public methods and properties of the struct should be allowed (exactly what you would get from a free function).

Allowing non-public method/property access would be a very, very bad idea – not just for performance reasons.

For a direct example, today File has the method pub fn close(self: *File) void

Let's say we wanted a "closeIfOpen" method, that only performed the close if the file was open. With this proposal we could write this as:

pub fn File.closeIfOpen(self: *File) bool {
  if (self.isOpen()) return false;
  self.close();
  return true;
}

We can imagine convenience functions for reading an entire file as a string or data e.g. "file.readAll()": convenient, but hardly library functionality that should be in the std lib.

@PavelVozenilek
Copy link

The proposal exists because stdlib is perceived as untouchable. It "solves" the problem by making the situation messier.

What would be ideal solution?

  1. Do you want to use some part of stdlib? Copy relevant source files into your project.
  2. Do you want to add a convenience function? Write it inside the copied source, in its proper place.

This assumes that stdlib is properly organized: good filenames, not a collection of tiny files.


Once there was related wish: to be able to replace certain functions (in regular code, outside mock testing context).

#500 (comment)

IMO it has the same solution.

@lerno
Copy link
Author

lerno commented Jun 30, 2018

@PavelVozenilek no, not really. Other usecases:

  1. Put domain specific code in another file. E.g. A graphics library converts Vec3 into a a different struct to interface with C code might add - if included - v1.toGL(). Similarly: file.loadAsJson() - from a Json library, file.loadAsXml() from an XML library, file.loadAsYaml() from a Yaml library etc.

  2. Not putting all functionality in the same file to avoid certain struct definitions from being way too big (ObjC extensions, slightly related, is often used this way within the libraries)

  3. The utility function may be unsafe / unsuitable for inclusion with the standard lib, but reasonable within the domain.

@raulgrell
Copy link
Contributor

raulgrell commented Jul 4, 2018

I'm not sure it's valuable to be able to define a struct over several files. We'd only need a way to be able to call user defined functions on a struct with method call syntax. It might also be useful to define variables in that struct's namespace. We probably don't want to be able to add fields.

We also don't want to affect method resolution all over the place - we should be able to affect only a single file or a clearly defined import chain.

I think use is a good candidate as a keyword for this purpose. At the moment it is the only thing that can implicitly change your 'local namespace'. My reasoning is that if someone finds a Vec3.cross() call but cannot find cross() in Vec3, they should be able to look for a use in the file they are in or a file they imported

// myMath.zig
pub const Vec3 = @import('somelibrary").Vec3;

pub use Vec3 {
    const Up = Vec3{ .x = 0, .y = 1, .z = 0};
    pub fn cross(self: *const Vec3, other: *const Vec3) f32 { ... }
}

I could import Vec3 from mymath.zig instead of somelibrary.zig in order to get the "extended type".

A more general solution is the following:

// myMath.zig
pub const Vec3 = @import('somelibrary").Vec3;

pub const ExtendedVec3 = Vec3 use {
    pub fn cross(self: *const Vec3, other: *const Vec3) f32 { ... }
}

pub const NotExtendedVec3 = ExtendedVec3 use {
    // Redeclaration error
    pub fn cross(self: *const Vec3, other: *const Vec3) f32 { ... }
}

Where A use B basically means add the functions in B to A

@bjornpagen
Copy link

I am worried that the proposals of @lerno and @raulgrell do nothing but add unwanted complexity to Zig.

I would like to suggest that struct methods could be defined outside of the struct and even in different source files.

Please elaborate why one would ever need to do this. If the programmer doesn't like a struct, they roll their own. Otherwise we have people writing horrible spaghetti code all over the place, adding weird extensions to structs in the std, or something like that.

What's the motivation for this? Why not just cross(v1, v2);?

This, very much so. Just because structs have local namespaced functions, doesn't mean we should start abusing it by adding superfluous features.

I think use is a good candidate as a keyword for this purpose.

I'm afraid this still has the same problem. Allowing a programmer to just easily change APIs of libraries they can't touch encourages bad programming. A language should not encourage bad programming.

@ghost
Copy link

ghost commented Jul 14, 2018

this would cover it #1214

you just make a new struct NEW and embed the old one OLD, plus add your methods
NEW.x

If zig has interfaces and all functions (foo( i : some_interface_of_OLD)) that take the Interface that struct OLD satisfies they will all work with struct NEW as well with no work.
Just how much better the language would be with interfaces and struct embedding is staggering IMO 😄 .

@lerno
Copy link
Author

lerno commented Jul 14, 2018

Not as elegant as extending, but a working solution. Also, embedded structs is something I'm all for.

@binary132
Copy link

It is really convenient, and often helpful to readability, to define the methods of a type across multiple files in a module. Since the type is simply acting as a namespace for its methods and contained values, isn't the real question namespace extension?

I.e., while it is useful to define a new namespace to contain extensions to an existing namespace, and that should be supported cleanly (yay Go embedded structs), it isn't the use-case of this issue, is it?

Unless you're really proposing that if a type wants to be defined across multiple files, it should be composed out of a few structs, one from each of those files?

@ghost
Copy link

ghost commented Jul 15, 2018

It is really convenient, and often helpful to readability, to define the methods of a type across multiple files in a module.

not sure I agree, I think I don't

for me this issue would be an edge use case and embedded struct would be good enough

@lerno
Copy link
Author

lerno commented Jul 15, 2018

Using multiple files gives a lot more flexibility. Extensibility suffers otherwise.

@BarabasGitHub
Copy link
Contributor

BarabasGitHub commented Jul 15, 2018

You can still make new functions in another file for the same struct. They'll just not be in the same namespace. (As long as you don't need anything private.)

@lerno
Copy link
Author

lerno commented Jul 15, 2018

This is a common issue in Java:

string.toUpper();
string.toLower();
string.split(":");
MyStringUtils.splitOnCharacter(string, ':')
MyEmailUtils.isValidAsEmail(string);

As opposed to:

string.splitOnCharacter(':') // From my own string lib.
string.isValidAsEmail(); // Available if I import my email lib.

@ghost
Copy link

ghost commented Jul 15, 2018

You can just pass string as an argument rather than defining the method on the string class/ Struct.
Not a big deal, go does this all the time in the standard library and its easy to read.

@BarabasGitHub
Copy link
Contributor

It's actually one reason why I don't like methods on classes and such. I prefer to have free functions. But that does get harder with duck typing and such and becomes pretty much impossible if you don't have function overloading.

@ghost
Copy link

ghost commented Jul 15, 2018

@BarabasGitHub
As long as they live in different namespaces there is no issue. And they should live in different namespaces otherwise no one knows what is currently used. This is precisely why overloading is bad because you never know if your split or the standard split is called, this makes things really hard to read.

@BarabasGitHub
Copy link
Contributor

BarabasGitHub commented Jul 15, 2018

@monouser7dig sure, I don't disagree with namespaces, but how do you make a generic function if you don't have methods on a struct to call and no function overloading?

(I'm not saying we should have function overloading. I'm wondering how to solve that issue if your functions aren't on the object itself.)

@bjornpagen
Copy link

bjornpagen commented Jul 15, 2018

It's actually one reason why I don't like methods on classes and such. I prefer to have free functions.

This. But it's way too hard to change this anymore in Zig. :/
But as long as struct namespaces are not abused, it should be fine. Being able to extend struct namespaces probably is considered abuse.

Using multiple files gives a lot more flexibility. Extensibility suffers otherwise.

Wrong. Andrew says this: #1239 (comment)

@lerno
Copy link
Author

lerno commented Jul 15, 2018

@monouser7dig For me personally at least this is a problem. If you have everything as free functions then everything is fine, we have:

stringUpper( ... )
stringLower( ... )

And then you can add your function

 stringIsValidEmail( ... )

Discoverability is fine, because you assume all functions relating to string shares a common prefix.

But if you then decide to allow "functions on structs" you kill that uniformity. Basically it comes down to naming. If the one designing struct Foo uses struct functions for all calls, then what should I use as naming standard?

Ok, so I pick one, like stringisValid( ... ), but then I use another library, which uses isStringEmail( ... ) or stringUtilsIsEmail( ... ). Consistency and discoverability suffers. Sure, I can start renaming every function, but wouldn't it to simplify consistency?

@karlguy The comment on #1239 is not relevant. Let's say we create a generic 3D vector: Vector3. We then create a few methods to it, with conversion etc. We're on OS X, so we want to use both Metal and (the now deprecated) OpenGL. Consequently, we want a vector3.toGL() and vector3.toMetal() for each conversion (let's assume they are different).

Now I obviously don't want to include both the GL and Metal code in my Vector3 source file, because I want to localize the actual implementation to the subsystems that actually do the rendering.

If there is no way to borrow the namespace, then obviously I'd have to use vector3ToGl(...) and vector3ToMetal(...), but I want to demonstrate that even if I own the struct itself, it's not obvious that I want to place all its functions in the same file.

@binary132
Copy link

binary132 commented Jul 15, 2018

I think there are multiple, differing use-cases being discussed here.

  1. Cross-file namespace extension in the defining module 👍
  2. Namespace extension outside of the defining module 👎
  3. Readability in single-file namespace (are you really telling me that a single, large, complex type should all be defined in a single large file for readability's sake?? @monouser7dig 🤣 and the alternative is worse. You'd never split File into readFile and writeFile -- at least, I hope you wouldn't.)

I am talking about C++ style definition of names in a namespace across multiple files in a single folder. For example, "read" and "write" might be implemented on a "file" struct in two files, read.zig and write.zig in the same file directory, where the file type is being defined. Both read.zig and write.zig use the namespace keyword and Class::Method syntax to define implementations of the type's methods. This should be supported somehow.

file
├── file.zig
├── read.zig
└── write.zig

I definitely do not favor extending a pre-existing / imported namespace with new methods (string.MyStringMethod()). That is bad. Aliasing or embedding is perfect for this. In Go I would just do something like this:

(of course, in practice I would just write a func(string) instead of expanding the string name)

type MyString string // or struct { internal string } or whatever

func (m MyString) Reverse() string { return /* do stuff */ }

Note that in Rust you can add new methods to existing types as traits, and I think that is also a separate conversation. For example, you could define a "Buffer" method, which operates on any implementation of a "Read" trait. Then, by importing the definition of your Buffer method, you get to call Buffer on any Read in your local scope.

It's weird and complicated to use, but arguably powerful. However, this is dealing with generics, and I think it's a totally separate issue. I personally would probably not favor it -- just wrap the generic type in a struct defining your extension method.

@lerno
Copy link
Author

lerno commented Jul 15, 2018

From my perspective it's been a powerful tool.

What's a bit weird for me here is that this is a "library user" functionality. It's a way for me as user of a library to make it fit seamlessly with the rest of my code. And if I don't like it I don't need to use it. The idea is still that this would have to be explicitly imported, so if I use some lib that someone else wrote I would not get their struct extensions unless I wanted to.

That this freedom is so frowned upon here is a bit disheartening, so I'm just going to quietly take a step back here.

@lerno lerno closed this as completed Jul 15, 2018
@bjornpagen
Copy link

bjornpagen commented Jul 15, 2018

@lerno
Very valid example, sorry for not seeing that. I think you are completely right, and I agree currently Zig is currently in the fault here by not letting you manipulate the namespace of the struct. This is actually making me think (no troll) that functions in a struct are a bad idea, and we should talk about it.

Why do structs even have function namespaces in the first place? It incentivises people to not use functions like vector3ToGl(). A struct is nothing but a data structure, and forcing people to decide how or where they put their interfacing functions is a waste of time (for example this github issue).

The freedom here kills inconsistancy, and violates zig zen bullet 4. I want a solution to this issue as much as you do, and extending namespaces across files just feels like the wrong way to do it.

And of course my opinion here is going to be unpopular, but I see no way out of this hole otherwise.

EDIT: If no one tries to kill me for this idea in a bit, I'm going to try to open a github issue on this.

@binary132
Copy link

binary132 commented Jul 15, 2018

Well, I still want to be allowed to define a namespace across multiple files in the same definition module. 😛

Also @lerno, you might be right that it's fine as a library user, but what happens when a library author starts extending string?

@isaachier
Copy link
Contributor

There already is namespace extension. Look at the use of index.zig files throughout the std directories.

@BarabasGitHub
Copy link
Contributor

@binary132

Note that in Rust you can add new methods to existing types as traits, and I think that is also a separate conversation. For example, you could define a "Buffer" method, which operates on any implementation of a "Read" trait. Then, by importing the definition of your Buffer method, you get to call Buffer on any Read in your local scope.

You can do this in Zig as well right? Unless I misundertand. You can write a generic function that calls Read on the object that you pass in.

@andrewrk
Copy link
Member

@lerno may I keep the issue open until I have fully evaluated it? Admittedly my attention has been elsewhere during this discussion.

@binary132
Copy link

binary132 commented Jul 16, 2018

@BarabasGitHub, not exactly: http://xion.io/post/code/rust-extension-traits.html

The point of this is that replace is not a method in Option, and that this does not touch any concrete type.

Relevant code from the blog post:

impl<T> OptionMutExt<T> for Option<T> {
    fn replace(&mut self, val: T) -> Option<T> {
        self.replace_with(move || val)
    }
    // ...
}

Because opt implements Option, and OptionMutExt is implemented on Option, replace is now in opt's method set:

use ext::rust::OptionMutExt;  // assuming you put it in ext/rust.rs

// ...somewhere...
let mut opt: Option<u32> = ...;
match opt.replace(42) {
    Some(x) => debug!("Option had a value of {} before replacement", x),
    None => assert_eq!(None, opt),
}

Not that I'm in love with this syntax, but it shows how a method set can be extended without actually touching the method set of a particular concrete type. Maybe something to think about if Zig adds a static interface concept.

@andrewrk
Copy link
Member

Here's what this comes down to for me:

This is a common issue in Java:

string.toUpper();
string.toLower();
string.split(":");
MyStringUtils.splitOnCharacter(string, ':')
MyEmailUtils.isValidAsEmail(string);

We have a fundamental disagreement here. I find this superior, as it's completely clear where all the identifiers are coming from. I'm strongly in favor of keeping the language simple, and allowing patterns such as the above code to emerge.

@ghost
Copy link

ghost commented Apr 22, 2020

Here's what this comes down to for me:

This is a common issue in Java:

string.toUpper();
string.toLower();
string.split(":");
MyStringUtils.splitOnCharacter(string, ':')
MyEmailUtils.isValidAsEmail(string);

We have a fundamental disagreement here. I find this superior, as it's completely clear where all the identifiers are coming from. I'm strongly in favor of keeping the language simple, and allowing patterns such as the above code to emerge.

Struct method extensions with typedef #5132 would give flexibility as to where method syntax can be used, without creating the problem that is brought up here.

The method is either in the base struct, or in the typedef. There are only these two possibilities, so methods cannot be scattered around in different scopes and files.

const Td_u32 = typedef(u32, .Alias) {
   fn method(self: @This()) @This() {
      return (self + 1);
   }
  
   fn init(value : u32) @This() {
      return @as(@This(),value);
   }
}

test "" {
    const my_u32 = Td_u32.init(0); // access top level declaration of Td_u32 namespace
    _ = my_u32.methodCall(); // works.
    // typedef namespace contained a method, 
    // irrespective of base type being a primitive
}

Or to take the example that was presented, and how it would look with typedef.

string.toUpper();
string.toLower();
string.split(":");

 // MyString is typedef wrapping the type of string
const mystring : MyString = string; // coerce to the typedef
mystring.splitOnCharacter( ':') // is a function in MyString
mystring.isValidAsEmail(); // is a function in MyString

@zjhken
Copy link

zjhken commented Mar 23, 2023

I think this proposal's concept is more like the extend function in Kotlin.
It's just a syntax sugar for the function that takes the first argument as the specified struct.
So of course user should import that function in order to use it.
and so it will not add any complexity, just like the normal function.

with this feature, we can have powerful builder type API like:

"unknow_string".lowerCase()
    .trimAfter(":")
    .isEmail();

without this feature, we have to

isEmail(trimAfter(LowerCase("unknow_string")));

Rust doesn't have this feature, so people made a lot of ugly(I think) XXXExt type just want to run that function after a dot.

@andrewrk Hence I highly suggest to reopen it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Projects
None yet
Development

No branches or pull requests