Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] get for container types returning an Option[T] #35

Closed
dom96 opened this issue Apr 2, 2018 · 25 comments
Closed

[RFC] get for container types returning an Option[T] #35

dom96 opened this issue Apr 2, 2018 · 25 comments
Labels

Comments

@dom96
Copy link
Contributor

dom96 commented Apr 2, 2018

If I'm using the options module in my program it sucks that not all modules use it. It would be nice to implement it for the containers in the stdlib, for example proc get*[K, V](t: Table[K, V], key: K): Option[V].

We should standardise on the name too, it should be consistent across the stdlib.

@Araq
Copy link
Member

Araq commented Apr 3, 2018

But I don't like Option[T], it's messy to work with.

@zah
Copy link
Member

zah commented Apr 3, 2018

Option[T] has a lot of potential for being very well integrated into the language and producing very elegant code.

1. Add existential operator:

someOpt?.foo.bar is an expression that returns Option[type(someOpt.foo.bar)]. If someOpt has a value, the field traversal will be executed, otherwise, None is returned.

2. Add existential call operator:

foo(10, someOpt?) is an expression that makes a call only when someOpt has a value. Otherwise, None is returned.

3. Special if handling:

if someOpt:
   foo(someOpt) # here `someOpt` is now resolved to the wrapped regular value
                # no need to use `.value` everywhere

@zah zah changed the title Proposal: get for container types returning an Option[T] [RFC] get for container types returning an Option[T] Apr 3, 2018
@dom96
Copy link
Contributor Author

dom96 commented Apr 3, 2018

Why add [RFC] instead of an RFC label?

@andreaferretti
Copy link

Please, do not treat options specially in the language (special if handling) - Nim is powerful enough to provide a nice syntax for options without special support. For instance, libraries (such as patty) could provide pattern matching for options

@dom96
Copy link
Contributor Author

dom96 commented Apr 4, 2018

An existential operator should extend to nil as well. I agree with @andreaferretti regarding a special if.

@Araq
Copy link
Member

Araq commented Apr 4, 2018

Option[T] is annoying, you often end up with if Some(x) and x.ignoreWhitespace.isNotEmpty. In practice it just adds one value that enforces special casing to a whole set of values which are already invalid in a larger context.

In math, 0 * 4 is 0. It's not an exceptional case, it's not if Some(x): x * 4 else: None()

Option[T] is against "data oriented design" for the lack of a better term. seq[Option[T]] needs a map operation to turn into seq[T]. if you instead return (seq[T], errors: seq[bool]) you can take the first component and continue to work with it. More efficient than a map that most likely ends up allocating and copying things around.

@alehander92
Copy link

@Araq can't you have just if x?.ignoreWhitespace.isNotEmpty ?

The problem is that currently there is no one good way to signify a value is actually optional: everything has its own way to somehow represent "value is missing" and even with ref values, you can't use reliably nil, as not nil checking is not perfect yet

@Araq
Copy link
Member

Araq commented Apr 5, 2018

It's not about syntax, you can make it as short as you want and I still wouldn't like Option[T]. Firstly, the overlap with exceptions is just too big. And we do have exceptions and raises tracking already. Secondly, Option[T] does not scale, in no language environment I'm aware of the widely used allocation operation returns Option[T] because it can fail with OOM, that would be unworkable. Not even Haskell with all its monad transformations does it. Thirdly, Option[T] lures one into the wrong designs. Try to work with a Matrix[Option[float]] and see how well that goes.

@zah
Copy link
Member

zah commented Apr 10, 2018

@Araq, it's always the responsibility of the API designer (or the data structures designer) to use the right tool for the job. If some API would be better off with a (seq[T], errors: seq[bool]) signature, this should be pointed out to its author.

But in the particular case being discussed here, the container types ought to support a way to search for a key that doesn't exist. The objective reality is that by using contains you'll have to perform the search twice (in case of success) and procs like getOrDefault may not be appropriate for types that don't have a proper default value or situations when the user must be able to detect that the value was indeed missing.

Option[T] is just one nice way to enhance such APIs with some additional sugar coming from a library or the compiler.

@bluenote10
Copy link

When it comes to error handling my best experiences stem from my Scala days where error handling is done via things like Option, Either, and Try, and I completely see why Rust decided against exceptions and did the same. I think we should not miss out on that just because of performance concerns, which in the majority of the use cases is irrelevant.

To address the performance aspect first: As long as there is a getOrDefault I can always opt out of Option[T] wrapping as an optimization. I never had a case where I had problems with either coming up with a sensible default value for getOrDefault or living with the tiny overhead of the wrapping. My Scala days were full of optimizing performance, but Option[T] wrapping was never an actual issue.

@Araq The fact that you are arguing against Option[T] with an expression like if Some(x) and x.ignoreWhitespace.isNotEmpty shows that we are not really leveraging its power in Nim. The whole point of these monads is that they are highly composable and allow a very elegant solution to nested error handling. I hope you don't mind adding a few ideas from Scala to show the benefits...

The following shows the most basic example for nested error handling, i.e., a database may or may not have a user, and a user may or may not have an address.

case class DB(handle: Int)
case class User(name: String)
case class Address(street: String)

def fetchUser(db: DB): Option[User] = Some(User("John"))
def fetchAddress(user: User): Option[Address] = Some(Address("SomeStreet"))

Now any nested lookup is fully composable via flatMap (for functions of the type A => Option[B]) or map (for functions of the type A => B):

def lookupStreet(db: Option[DB]) =
  db.flatMap(db => fetchUser(db))
    .flatMap(user => fetchAddress(user))
    .map(_.street)

// or if you perfer the 'for comprehension' syntax:
// (note that this is simply syntactic sugar for the above)
def lookupStreetFor(db: Option[DB]) =
  for (
    db <- db;
    user <- fetchUser(db);
    address <- fetchAddress(user)
  ) yield
    address.street

The obvious benefits:

  • Syntactical elegance: No nested if checks are needed as it would be necessary with traditional nested error handling...
  • Everything is fully typesafe, no way to produce null accesses.

There are also less obvious benefits of such a monadic approach because it can be extended nicely. For instance it would be possible to

  • ... easily extend the example to having fetchUser return a list of users instead of a single search result. Since the entire syntax already treats options a single-valued container, almost no code change is required to make this step. In practice this has been a big plus for me when I decided to refactor something from "single optional return value" to "zero or more return values" with almost no required changes to the affected code.
  • ... easily extend the example to having an additional information in the error case. This is where the related Try and Either monads come into play (the Either[Err, T] type allows to specify the error type Err explicitly, while the Try fixes it to exceptions). Again there almost nothing has to be changed in the examples to support this, and this was also very convenient experience during refactorings.
  • ... easily incorporate legacy (= exception throwing) code into this framework by doing something like Try(mayThrowExcpetion).toOption.

Sorry for the long post, but I just don't feel it is appropriate to discard Option[T] due to performance concerns -- there are other aspects that can make working with them very pleasurable.

@Araq
Copy link
Member

Araq commented Apr 14, 2018

The fact that you are arguing against Option[T] with an expression like if Some(x) and x.ignoreWhitespace.isNotEmpty shows that we are not really leveraging its power in Nim.

I told you you can make the syntax as short as you like and I still wouldn't buy it.

Syntactical elegance: No nested if checks are needed as it would be necessary with traditional nested error handling...

That is not true. Ok, instead of an if statement you can pass a lambda to some map operation but come on, the if is still there and you still need to care about it for correctness.

Everything is fully typesafe, no way to produce null accesses.

That is only true if you mapped every invalid value into the None state. And that's rarely done or possible between API boundaries.

Sorry for the long post, but I just don't feel it is appropriate to discard Option[T] due to performance concerns -- there are other aspects that can make working with them very pleasurable.

I don't argue about performance, Option[T]

a) is yet another way to do things.
b) does not compose/scale well. Proof: The non-existence of Matrix[Option[float]] in any kind of production setting.
c) Does not lead to more effective program logic. In practice every int that I use in my program contains plenty of values that are invalid in some bigger picture. I cannot use if some(x) and be sure it's valid good number, it becomes if some(x) and x in A..B.

Not too mention that flapMap and friends propagate the error state up the call stack very much like exceptions do...

@bluenote10
Copy link

I told you you can make the syntax as short as you like and I still wouldn't buy it.

But you are arguing by complaining that it's messy to work with and giving a syntax example which is an anti-pattern.

That is not true. [...] the if is still there and you still need to care about it for correctness.

I was referring to the if in spaggeti-code that you get in traditional nested lookups, i.e., the notorious: first lookup -> check validity -> second lookup -> check validity -> third lookup... With traditional approaches the code indentation grows linearly with the number of nested lookups. Indentation is constant for the monad approach, i.e., it does compose/scale well...

That is only true if you mapped every invalid value into the None state. And that's rarely done or possible between API boundaries.

Never had this issue in many years working with them. If you need a custom validation function just inject it into the chained functions (example below).

a) is yet another way to do things.

Why "yet another"? There is just exceptions and the standard library already provides Option[T], so the number of ways to do things is not overwhelming here.

b) does not compose/scale well. Proof: The non-existence of Matrix[Option[float]] in any kind of production setting.

What kind of proof is this? I just gave an example that demonstrates the composability of nested lookups. We are talking about "optional return values" from container lookups/searches here. I don't see what the non-existence of Matrix[Option[float]] has to do with this.

c) Does not lead to more effective program logic. In practice every int that I use in my program contains plenty of values that are invalid in some bigger picture. I cannot use if some(x) and be sure it's valid good number, it becomes if some(x) and x in A..B.

I think we are mixing up optional input with optional output here. For instance if you lookup an int from a hashtable and want to pass it to a compute function that has its own special notion of invalid ints, this can be expressed cleanly by pluggin a validation step between the two:

for (
  i <- map.get(key);                   // returns Some(Int)
  iValidated <- validate(i, A, B);     // returns Some(Int) if i in A .. B
  result <- compute(iValidated)
) yield
  result

Benefit: validate is reusable (=> composable) and can be unit tested on its own, in contrast to compound expressions like if some(x) and x in A..B.

@zah
Copy link
Member

zah commented Apr 15, 2018

This should be obvious, but the existential operator I proposed above is equivalent to the flatMap mumbo-jumbo, only much shorter and nicer.

@Araq, you still haven't addressed my comment that the API of the container types just doesn't have the same functionality without the proposed get proc.

@Araq
Copy link
Member

Araq commented Apr 15, 2018

What kind of proof is this?

Obviously a hand wavy one. ;-) But this was not the first time I mentioned these example. Consider a version of Scala where out-of-memory is dealt via Option[T]. By your argument it would work out just fine. I'm not buying that.

@Araq
Copy link
Member

Araq commented Apr 15, 2018

@Araq, you still haven't addressed my comment that the API of the container types just doesn't have the same functionality without the proposed get proc.

That can also be done with a template/macro that takes an onHit handler. And that is even more efficient then. It also almost never comes up in my code, getOrDefault is just fine, the only problem with it is its somewhat long name. ;-)

@dom96
Copy link
Contributor Author

dom96 commented Apr 21, 2018

That can also be done with a template/macro that takes an onHit handler. And that is even more efficient then. It also almost never comes up in my code, getOrDefault is just fine, the only problem with it is its somewhat long name. ;-)

The problem with these as far as I can see is that they do not compose well.

@zah
Copy link
Member

zah commented Apr 23, 2018

Can we agree to add the low-level efficient interface with the onHit template? Then everyone will be able to implement whatever interface they want on top of this.

@Araq
Copy link
Member

Araq commented Apr 23, 2018

Can we agree to add the low-level efficient interface with the onHit template?

Fine with me.

@dom96
Copy link
Contributor Author

dom96 commented Apr 23, 2018

What would that look like?

@zah
Copy link
Member

zah commented Apr 30, 2018

Turns out that the required work-around already exists as a template called withValue, although it has some flaws:

import
  options, tables

var tbl = initTable[string, int]()

tbl["foo"] = 10
tbl["bar"] = 20

proc getOpt[A,B](tbl: var Table[A, B], key: A): Option[B] =
  tbl.withValue(key, value) do:
    return some(value[])
  do:
    return none(B)

var
  o1 = tbl.getOpt("foo")
  o2 = tbl.getOpt("baz")

echo o1
echo o2

Currently, the injected value symbol from the template is a pointer type. It would have been nicer if it was a regular var location. This can be implemented by defining value as a template that performs the dereferencing.

@zah
Copy link
Member

zah commented Apr 30, 2018

Also, it would be nice if the template can work both as a statement and as an expression. This can be achieved by setting the result type to untyped. Then it would be possible to use it like this:

var x = tbl.withValue("foo", value, some(value), none(int)) 

var y = tbl.withValue("foo", value) do:
  some(value)
do:
  none(int)

@zah
Copy link
Member

zah commented Apr 30, 2018

Third flaw is that withValue is defined only over var Tables.

@PMunch
Copy link

PMunch commented Jul 20, 2018

I've added a kind of existential operator and a match procedure that works to easily unwrap options @zah. nim-lang/Nim#8358

@narimiran narimiran transferred this issue from nim-lang/Nim Jan 2, 2019
@arnetheduck
Copy link

Here's an example of what this syntax can look like with nim today:

https://github.com/status-im/nim-eth/blob/4492bae6a2eb71a62c794f09ba8b4cf7a2be9090/eth/keys/secp.nim#L173

proc fromHex*(T: type SkSecretKey, data: string): SkResult[T] =
  T.fromRaw(? seq[byte].fromHex(data))

this does early return in case of error in fromHex and passes it up the stack - making the potential for failure visible at the call site but in a minimally intrusive way.

@github-actions
Copy link

This RFC is stale because it has been open for 1095 days with no activity. Contribute a fix or comment on the issue, or it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Jun 18, 2023
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jul 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

8 participants