Optimize creation and serialization of closures #11938

Open
amitmurthy opened this issue Jun 29, 2015 · 5 comments
Labels: parallelism (Parallel or distributed computation), performance (Must go faster)

Comments

@amitmurthy
Contributor

amitmurthy commented Jun 29, 2015

Run

using Sockets, Distributed, Serialization

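# Echo server: replies to plain data with the same data, and to a
# function by calling it and sending back a fresh closure.
function srvr()
    ls = listen(10000)
    while true
        sock = accept(ls)
        # Handle each client connection in its own task.
        @async begin
            while true
                msg = deserialize(sock)
                if isa(msg, Function)
                    msg()
                    serialize(sock, ()->1)
                else
                    serialize(sock, msg)
                end
            end
        end
    end
end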

srvr()

in one process.

Then, in a REPL in a second process, define:

using Sockets, Distributed, Serialization

# Round-trip a plain integer n times.
function testc_data(n)
    c = connect("localhost", 10000)
    @time for i in 1:n
        serialize(c, 1)
        resp = deserialize(c)
        @assert resp == 1
    end
end

# Round-trip a closure n times; the server replies with a closure.
function testc_func(n)
    c = connect("localhost", 10000)
    @time for i in 1:n
        serialize(c, ()->1)
        resp = deserialize(c)
        @assert resp() == 1
    end
end

Results

julia> testc_data(1000)
  40.453 milliseconds (22002 allocations: 1298 KB, 4.31% gc time)

julia> testc_func(1000)
 743.730 milliseconds (734 k allocations: 24448 KB, 0.19% gc time)

julia> testc_data(1000)
  36.762 milliseconds (22001 allocations: 1298 KB)

julia> testc_func(1000)
 741.483 milliseconds (741 k allocations: 24614 KB, 0.19% gc time)

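In these runs, sending a closure is roughly 20x slower than sending plain data. Since the underlying model is function execution, as opposed to data messaging, optimizing closure creation and serialization alone should speed up parallel code that makes many remote calls.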
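To separate serialization cost from socket overhead, a minimal sketch along these lines could round-trip values through an in-memory `IOBuffer` instead of a connection. It is not part of the original report: the helper names `roundtrip` and `bench` are hypothetical, and the timings it prints will depend on the Julia version and machine.

```julia
using Serialization

# Hypothetical helper: serialize `x` into an in-memory buffer,
# rewind the buffer, and deserialize the value back.
function roundtrip(x)
    buf = IOBuffer()
    serialize(buf, x)
    seekstart(buf)
    return deserialize(buf)
end

# Hypothetical helper: time n round trips of `x` after one warm-up
# call, so compilation is not included in the measurement.
function bench(x, n)
    roundtrip(x)
    @time for _ in 1:n
        roundtrip(x)
    end
end

bench(1, 1000)        # plain data
bench(() -> 1, 1000)  # closure
```

If the closure case is still much slower here than the data case, the gap comes from serialization itself rather than from the network path.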

@carnaval, originally the server code was not inside a function. With it wrapped in a function, the closure case is still slower, but not as bad.

vtjnash added the priority (This should be addressed urgently) label on Jun 30, 2015
@vtjnash
Member

vtjnash commented Jun 30, 2015

marking this as priority, because serialization is critical to #8745

@malmaud
Contributor

malmaud commented Jun 30, 2015

Is someone actively working on this? I might take a stab at it today otherwise.

@JeffBezanson
Member

But doesn't #8745 use a different serializer? Also the performance of serialization itself is not important there.

@malmaud Cool, go ahead!

JeffBezanson added the performance (Must go faster) and parallelism (Parallel or distributed computation) labels on Jun 30, 2015
@vtjnash
Member

vtjnash commented Jun 30, 2015

it uses both

@ViralBShah
Member

testc_data(1000) works, but seems about 10x slower (although this is on a Mac, and I am sure the reported numbers were on Linux). testc_func(1) just hangs.
