A function that returns the right constructor given a max length #20

Closed
oxinabox opened this issue Oct 29, 2020 · 7 comments

Comments

@oxinabox
Member

I would like a way to simplify this code.

function shorten_strings(vec::AbstractVector{<:AbstractString})
    isempty(vec) && return vec

    # choose a type big enough for everything, to avoid a large Union
    max_len = maximum(ncodeunits, skipmissing(vec))
    return if max_len <= 7
        map(ShortString7, vec)
    elseif max_len <= 15
        map(ShortString15, vec)
    elseif max_len <= 30
        map(ShortString30, vec)
    elseif max_len <= 62
        map(ShortString62, vec)
    elseif max_len <= 126
        map(ShortString126, vec)
    else
        vec
    end
end

Or even the more complex version that supports missings:

using Missings: passmissing

function shorten_strings(vec::AbstractVector{<:Union{Missing, AbstractString}})
    isempty(vec) && return vec

    # choose a type big enough for everything, to avoid a large Union;
    # `init = 0` (Julia 1.6+) avoids an error when every element is missing
    max_len = maximum(ncodeunits, skipmissing(vec); init = 0)
    return if max_len <= 7
        map(passmissing(ShortString7), vec)
    elseif max_len <= 15
        map(passmissing(ShortString15), vec)
    elseif max_len <= 30
        map(passmissing(ShortString30), vec)
    elseif max_len <= 62
        map(passmissing(ShortString62), vec)
    elseif max_len <= 126
        map(passmissing(ShortString126), vec)
    else
        vec
    end
end

I am not sure what the best solution is.

@xiaodaigh
Collaborator

This is very readable; why do you want to shorten it? I would add a function that returns a type-stable function, so that the type-stable version can be run.

@xiaodaigh changed the title from "A function that returns the right constructor a given max length" to "A function that returns the right constructor given a max length" on Oct 31, 2020

@ScottPJones
Member

Instead of having ss3, ss7, etc., where the number doesn't actually match the number of characters if they are not ASCII (note that you can't even store one emoji in an ss3), why not have a macro, ss"string", that picks the smallest power-of-2-sized UInt that fits? A string macro can take a second argument, so you could write ss"string"15 when you need to specify the size (in bytes) explicitly. That way, if you have a set of strings, you can still make them all the same type.
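
The mismatch between the number in the name and the number of characters comes from capacity being counted in UTF-8 code units (bytes) rather than characters. A minimal Base-only illustration follows; the 3-byte limit for ss3 is taken from the comment above, and `fits_in_3_bytes` is a hypothetical helper, not part of ShortStrings.

```julia
# `length` counts characters; `ncodeunits` counts UTF-8 bytes.
ascii_str = "abc"
accented  = "h\u00e9llo"   # "héllo": the 'é' takes 2 bytes in UTF-8
emoji     = "\U1F600"      # a single emoji, 4 bytes in UTF-8

for s in (ascii_str, accented, emoji)
    println(repr(s), ": ", length(s), " characters, ", ncodeunits(s), " bytes")
end

# Hypothetical check, assuming a 3-byte capacity as the name ss3 suggests.
fits_in_3_bytes(s::AbstractString) = ncodeunits(s) <= 3
```

Under that assumption, `fits_in_3_bytes(emoji)` is false even though `length(emoji)` is 1, which is why the original code measures strings with `ncodeunits`.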

@oxinabox
Member Author

oxinabox commented Oct 31, 2020

Cute.
Using the second argument to the string macro is a great idea.

I still want a function form.
I think it should be a function that takes a length and returns a constructor.
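
One possible shape for such a function, as a rough sketch rather than the package's actual API: a table of `limit => constructor` pairs is scanned in order and the first constructor whose limit covers the maximum length is returned. The names `constructor_for`, `shorten`, and the stand-in type `Fixed{N}` are all hypothetical; `Fixed{N}` only mimics a fixed-capacity string type so that the example runs without ShortStrings installed.

```julia
# Stand-in for a fixed-capacity string type (hypothetical, not from ShortStrings).
struct Fixed{N}
    s::String
    function Fixed{N}(s::AbstractString) where {N}
        ncodeunits(s) <= N || throw(ArgumentError("string exceeds $N bytes"))
        return new{N}(String(s))
    end
end

# Return the first constructor whose limit is at least `max_len`.
# `table` is assumed to be sorted by ascending limit.
function constructor_for(max_len::Integer, table; fallback = identity)
    for (limit, ctor) in table
        max_len <= limit && return ctor
    end
    return fallback  # nothing is big enough: leave the values unchanged
end

# The branching from the original function, collapsed into one lookup.
function shorten(vec::AbstractVector{<:AbstractString}, table)
    isempty(vec) && return vec
    ctor = constructor_for(maximum(ncodeunits, vec), table)
    return map(ctor, vec)
end

table = (7 => Fixed{7}, 15 => Fixed{15})
short = shorten(["apple", "blueberries"], table)  # longest string is 11 bytes
```

With ShortStrings loaded, the table would presumably hold pairs such as `7 => ShortString7` and `15 => ShortString15`. The missing-aware variant could wrap the returned constructor in `passmissing` before mapping.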

@xiaodaigh
Collaborator

> ss"string"15

definitely cute.

@xiaodaigh
Collaborator

> think a function that takes a length and returns a constructor.

agreed

@ScottPJones
Member

Yes, it should also have a simple constructor, like ss"..." with an optional size (that would probably be the easiest way to implement the macro anyway).

@ScottPJones
Member

I can easily do a version that picks one of the UInt*s used in ShortStrings, but I think you should have the ability to pick your own (possibly larger, or possibly not a power of 2 size, such as a UInt24 or UInt48). I think to handle that, what is needed is a function that, given a vector or tuple of Unsigned types, returns the smallest one that can contain the given maxlen.
That type can then be used in the ShortString{T} constructor directly.
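
A rough sketch of that selection step, using only Base. The names `capacity` and `smallest_fitting` are hypothetical, and the assumption that each type reserves `overhead` bytes for bookkeeping (one byte by default) is made up for illustration; the actual ShortStrings layout may differ.

```julia
# Assumed usable bytes of an unsigned type: its size minus `overhead`
# bytes reserved for bookkeeping such as the stored length.
capacity(::Type{T}; overhead::Integer = 1) where {T<:Unsigned} = sizeof(T) - overhead

# Given a vector or tuple of Unsigned types, return the smallest one
# whose assumed capacity is at least `maxlen` bytes.
function smallest_fitting(types, maxlen::Integer; overhead::Integer = 1)
    best = nothing
    for T in types
        capacity(T; overhead = overhead) >= maxlen || continue
        if best === nothing || sizeof(T) < sizeof(best)
            best = T
        end
    end
    best === nothing &&
        throw(ArgumentError("no candidate type can hold $maxlen bytes"))
    return best
end

candidates = (UInt128, UInt32, UInt64)  # order does not matter
T = smallest_fitting(candidates, ncodeunits("shorter"))  # "shorter" is 7 bytes
```

Because the candidates are passed in, user-defined unsigned types such as a UInt24 or UInt48 could be included as long as `sizeof` is defined for them. The resulting `T` is what would then be handed to the `ShortString{T}` constructor mentioned above.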

ScottPJones added a commit that referenced this issue Nov 4, 2020
Fix #20, adds constructor given just maximum length