Using Dagger in an epidemiological analysis: visualising an implied DAG and other examples of usage #512
To add to this, ideally we would be able to visualise the DAG both before and after running it - is that supported? Slightly unrelated, but are you aware of any real-world use cases of relatively complex Dagger workflows? We have had a look but not been very successful. We are particularly interested in modelling our pipeline (and the dependencies between jobs) and being able to abstract parallelization without pipeline changes (so we could switch from local -> Slurm -> something cloud-based like Azure Batch), so resources in that direction would be great.
Hey there! There is definitely a way to do this, but I realize the full example isn't documented. You can do something like this:
using Dagger, GraphViz
# Enable logging, with required logging events for GraphViz
Dagger.enable_logging!(;tasknames=true, taskdeps=true, taskargs=true, taskargmoves=true)
# Run your code that uses `Dagger.@spawn` (or use a `DArray` or `DTable`)
x = Dagger.@spawn 1+1
y = Dagger.@spawn x*2
z = Dagger.@spawn x/3
fetch(z)
# Fetch and show the logs (shows up automatically in Jupyter)
logs = Dagger.fetch_logs!()
Dagger.render_logs(logs, :graphviz)
However, I've noticed that the above example doesn't produce any viewable DAG, because there was an oversight in the dependency calculation that missed some task dependencies - I'm working on fixing this now, along with writing docs for all of this! There are also a number of options for the
Thanks for getting back to us so quickly! Nice idea to include
Each time you do You can also see the whole DAG since the start by manually combining logs from multiple
Depends on what you mean by "complex" - we have an active user who utilizes Dagger for their reinsurance pricing engine (https://www.youtube.com/watch?v=CD7U1IEXWBM), which is reasonably complicated. Our public user base is still small, but I've developed Dagger to support larger applications and use cases. If Dagger fails to work for your use case, it's likely just a bug, and is something I can help investigate and improve!
This is definitely something Dagger can support - local already works (just use
That sounds like it would be really handy (@SamuelBrand1 is good at Julia but I have no idea so any help is useful).
Thanks for this, will check it out. Something I have in packages that I think is useful is some kind of list of known uses - something like that in
We are really excited about this part of the functionality!
Docstrings and fixes are in #513 - please give these changes a try and let me know if it doesn't work for you!
Overall, these are not only fixes for the old API but for the modern one as well, because as of now, it looks like it was not possible to visualize modern API DAGs due to the multiple dispatch issue described earlier and
Nevertheless, there are still a few problems with visualizing the modern API, so I also tried to address them. First, current
using Distributed
using Colors
using GraphViz
using Cairo
using Dagger
using FileIO
function taskA(simple_arg, dependencies...)
    return "Result of task A"
end

function modernAPI_graph_setup()
    a = Dagger.@spawn taskA(1)
    b = Dagger.@spawn taskA(2, a)
    c = Dagger.@spawn taskA(3, a, b)
    return c
end

function dot_to_png(in, out, width=7000, height=2000)
    dot_code = read(in, String)
    graph = GraphViz.load(IOBuffer(dot_code))
    GraphViz.layout!(graph)
    surface = Cairo.CairoSVGSurface(IOBuffer(), width, height)
    context = Cairo.CairoContext(surface)
    GraphViz.render(context, graph)
    write_to_png(surface, out)
end

# Configure LocalEventLog
ctx = Dagger.Sch.eager_context()
ctx.log_sink = Dagger.TimespanLogging.LocalEventLog()
graph_thunk = modernAPI_graph_setup()
fetch(graph_thunk)
logs = Dagger.TimespanLogging.get_logs!(ctx)
open("graph.dot", "w") do io
    Dagger.show_logs(io, graph_thunk, logs, :graphviz_simple)
end
dot_to_png("graph.dot", "graph.png", 900, 200)
However, usage of:
or
still fails (the latter one fails for the old API too). I have also added a part of the code to infer the thunk arguments solely from the logs.
Old API
Modern API
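The graph.dot file written in the example above is plain Graphviz DOT text. As a rough sketch of what such a file can look like, and of one way to draw the intended dependencies before anything has run, the snippet below builds DOT text from a hand-written dependency map. This is not part of Dagger: the helper name deps_to_dot and the deps dictionary are hypothetical, chosen only to mirror the a, b, c tasks in modernAPI_graph_setup, and the output is not claimed to match what Dagger.show_logs produces.

```julia
# Hypothetical helper, NOT part of Dagger: emit Graphviz DOT text from a
# hand-written map of task name => names of the tasks it depends on.
function deps_to_dot(deps::Dict{String,Vector{String}})
    io = IOBuffer()
    println(io, "digraph pipeline {")
    # Sort task names so the output is deterministic.
    for task in sort(collect(keys(deps)))
        # Declare the node itself, so tasks without edges still appear.
        println(io, "    \"", task, "\";")
        # One edge per dependency, pointing from the dependency to the task.
        for dep in deps[task]
            println(io, "    \"", dep, "\" -> \"", task, "\";")
        end
    end
    println(io, "}")
    return String(take!(io))
end

# Assumed dependency map mirroring modernAPI_graph_setup:
# b depends on a, and c depends on both a and b.
deps = Dict(
    "a" => String[],
    "b" => ["a"],
    "c" => ["a", "b"],
)

dot = deps_to_dot(deps)
print(dot)
```

Writing the resulting string to a file (for example with write("planned.dot", dot)) should make it usable with the dot_to_png helper above, assuming GraphViz.jl accepts it like any other DOT input.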
Nice work on fixing this @SmalRat! I'm planning to merge your PR (maybe with some tweaks that I'll suggest in the PR comments), but I also want to provide a few comments for clarity on why this code was neglected:
Regardless, great work on this, I really appreciate the hard work with figuring it all out!
Looking forward to results of this PR!
Hi everyone,
I'm implementing an analysis batch for an epidemiological modelling package https://github.com/CDCgov/Rt-without-renewal/tree/main/pipeline .
We're wondering if there is functionality for plotting a DAG after it has been implied (e.g. by using Dagger.@spawn to generate a number of Thunks)? I've noted DaggerWebDash, but it would be much easier to parse if there was a complete tutorial example of a workflow and/or some links to example usage in the Julia community.