Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recompute flag in clojure math runner is ignored, and somewhat unclear in config #1914

Open
jucor opened this issue Feb 12, 2025 · 3 comments

Comments

@jucor
Copy link
Contributor

jucor commented Feb 12, 2025

TLDR: @metasoarous , do you remember please:

  • did you you ever managed to get the recompute flag working as you wanted?
  • if so, what cases did you use it for, please?
  • or is it vestigial code that could be removed?
    Thanks !

Details:

The clojure runner for math worker has a --recompute flags it advertises, which is supposed to "recompute conversations from scratch", rather than incrementally from last known position:

(def cli-options
"Has the same options as simulation if simulations are run"
[["-r" "--recompute" "Recompute conversations from scratch instead of starting from most recent values"]

I was hoping to use it for several tests.

However, it seems that while it is parsed, it is not used.
If I understand correctly the clojure code-base (with Cursor's help, so I'm treading carefully), it should be passed to conv-man/load-or-init, but it is not:

(defn update-conv
[conv-man zid]
(try
(let [conv (conv-man/load-or-init conv-man zid)

That function itself would then (were it called with that keyword, which it isn't) take it into account here:

(defn load-or-init
"Given a zid, either load a minimal set of information from postgres, or if a new zid, create a new conv"
[conv-man zid & {:keys [recompute]}]
;; TODO On recompute should try to preserve in conv and such
(log/info "Running load or init")
(if-let [conv (and (not recompute) (db/load-conv (:postgres conv-man) zid))]
(-> conv
;(->> (tr/trace "load-or-init (about to restructure):"))
restructure-json-conv
;(->> (tr/trace "load-or-init (post restructure):"))
;; What the fuck is this all about? Should this really be getting set here?
(assoc :recompute :reboot)
(assoc :raw-rating-mat
(-> (nm/named-matrix)
(nm/update-nmat (->> (db/conv-poll (:postgres conv-man) zid 0)
(map (fn [vote-row] (mapv (partial get vote-row) [:pid :tid :vote])))))))
(conv/mod-update
(db/conv-mod-poll (:postgres conv-man) zid 0)))
; would be nice to have :recompute :initial
(assoc (conv/new-conv) :zid zid :recompute :full)))

but the surprise expressed in the comments make me think it might not be entirely working as intended...

Beyond the runner, the load-or-init is only called in one place with potentially the :recompute flag, in conv-actor:

(defn conv-actor
[{:as conv-man :keys [conversations kill-chan config]} zid]
(log/info "Starting message batch queue and handler routine for conv zid:" zid)
(let [conv (load-or-init conv-man zid :recompute (:recompute config))

depending on the configuration defined in components/config.clj, which also come with comments showing some amount of uncertainty:
;; Need to think more about the semantics of a recompute; once; always; only if not booted; etc? XXX
:recompute {:parse ->boolean
:doc "Whether or not to perform a recompute"}

This would require the RECOMPUTE environment variable to set that flag. It is not set in example.env, nor can I see any trace of it anywhere else in the codebase. Therefore if it has ever been used, it has been done manually on an instance.

So it's unclear how much the recompute feature has been used, and for what, and whether the runner should be modified to allow for its use, or on the contrary whether the recompute functionality is vestigial code that could be removed.

@jucor
Copy link
Contributor Author

jucor commented Feb 12, 2025

For context: I was (thinking I was) using this functionality when tracing computations as part of #1893, to make sure I was using only the voting matrix regardless of whatever previous state the clojure math worker may have previously stored in various SQL blobs (I haven't gotten around to understanding those blobs yet, hence wanted isolation from them).

@jucor jucor changed the title Recompute flag in clojure math runner is ignored Recompute flag in clojure math runner is ignored, and somewhat unclear in config Feb 12, 2025
@metasoarous
Copy link
Member

I do write the best code comments, don't I?

Or did...

Anyway, this setting was something we used for fully recomputing conversation state from scratch. Looking back, I think I know what it's doing there, but my comments do collectively evidence that this setting was somewhat confusingly implemented. The main thing is that the config setting is just a boolean flag, but there are actually different levels of "recompute". The :recompute reboot (IIRC) effectively just rehydrates the conversation state from the math blob export, so that's the right thing to do if (not recompute). Meanwhile, :recompute :full starts the conversation from scratch, ignoring previous PCA loadings and K-means cluster centroids, and that method gets used when either the --recompute flag or RECOMPUTE env var are set.

This wasn't something we frequently used, but was implemented after we found an issue with the math results on certain conversations, and gave us a way to "reset" the results from scratch, to therefore avoid the iterative updates continually trying to build on an effectively broken conversation state. Whether it's still necessary is up for debate. If we're keeping iterative updates, it probably makes sense to keep as a safeguard, but ideally there would be a way to only run it on a select number of conversations.

An alternative might be to just design the implementation so that nuking the old math blob triggers a full recompute. That may be easier to trigger on a case by case basis, but would still require a process of stop math worker -> remove blob -> restart math worker.

@jucor
Copy link
Contributor Author

jucor commented Feb 12, 2025

You absolutely do, they're really fun to read, and show the passion that went into the code! It's a treat :)

Thanks for all the info and the quick reply! I'll digest it as I go through the full logic. Digesting 13 years of work is an endeavour 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants