Add handler for new lmi-dist #1595
Conversation
    :return: The same parameters dict, but with VLLM style parameter names.
    """
    parameters.pop('seed', None)
seed is supported by vllm now: vllm-project/vllm#2514
Okay, will remove. In this PR I kept what the vllm rolling batch currently does, and plan to tune params in the next PR.
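For reference, once seed is kept the translation could look like the following minimal sketch; the helper name and the max_new_tokens → max_tokens rename are illustrative assumptions, not code from this PR.

```python
# Hypothetical sketch, not the PR's code: stop popping 'seed' now that vLLM
# supports it (vllm-project/vllm#2514), and only rename parameters that differ.
def translate_vllm_params(parameters: dict) -> dict:
    """Translate LMI-style parameter names to vLLM-style names in place."""
    parameters.pop('do_sample', None)  # still dropped for now; see thread below
    if 'max_new_tokens' in parameters:  # assumed rename
        parameters['max_tokens'] = parameters.pop('max_new_tokens')
    return parameters
```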
    :return: The same parameters dict, but with VLLM style parameter names.
    """
    parameters.pop('seed', None)
    parameters.pop('do_sample', None)
Shouldn't do_sample=False map to temperature=0, basically? vllm does support greedy; it just uses temperature to accomplish that.
As I said above, I used the vllm config here. But it's a good point. I believe this is removed for vllm because it doesn't support the do_sample parameter, whereas lmi-dist should (for backwards compatibility), and we should set default sampling params.
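A minimal sketch of that mapping, assuming we want lmi-dist to keep v1's greedy default; the helper name is hypothetical.

```python
# Hypothetical sketch: translate do_sample into vLLM's temperature-based
# greedy mode instead of silently dropping it.
def apply_do_sample(parameters: dict) -> dict:
    do_sample = parameters.pop('do_sample', None)
    if do_sample is False:
        # vLLM performs greedy search when temperature == 0.
        parameters['temperature'] = 0.0
    elif do_sample is None:
        # Assumed default: lmi-dist v1 behaved like do_sample=False, so keep
        # greedy as the default for backwards compatibility.
        parameters.setdefault('temperature', 0.0)
    return parameters
```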
Sure, we can punt on this, but I think this is another point to bring up about how consistent an interface we want to provide across engines/backends, versus how closely the interface should change to match each engine/backend.
(Resolved review threads on engines/python/setup/djl_python/rolling_batch/vllm_rolling_batch.py, now outdated)
(Resolved review threads on engines/python/setup/djl_python/rolling_batch/lmi_dist_v2_rolling_batch.py)
    self.request_cache.pop(key)
    return self.postprocess_results()

    return random_uuid()
Is there a reason not to just be consistent and use req.id here as well?
I was not sure if there's a specific reason why req.id is not used; I asked internally but did not get a response, so I kept it as is.
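If it turns out there is no reason, the consistent version is just the following sketch (assuming req.id is unique per in-flight request):

```python
# Hypothetical sketch: reuse the incoming request's id instead of minting a
# fresh uuid, so both code paths key the request cache the same way.
def get_request_id(self, req) -> str:
    # Assumption: req.id is unique across concurrent requests.
    return str(req.id)
```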
    def _record_speculative_decoding_metrics(self, request_output, req_id):
        completion_output = request_output.outputs[0]
        if self.engine_config.record_acceptance_rate and request_output.finished and completion_output.acceptance_history:
Doesn't this need a hasattr guard somewhere to work with a vanilla copy of the library?
FYI, I did not write this, just moved existing code to a function for better readability. Do we anticipate multiple versions of the library in the container?
This is in the base class, is it not? I anticipate a vanilla copy of the library in the container. I've been very consistent in saying this. But, who knows? hasattr would make it work either way; the current code will fail if the container has a vanilla copy of vllm.
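For concreteness, a sketch of the guarded version; getattr expresses the same check as hasattr here, and the rest mirrors the existing code.

```python
import logging

# Hypothetical sketch of the guard: getattr keeps this working against a
# vanilla vLLM CompletionOutput, which has no acceptance_history attribute.
def _record_speculative_decoding_metrics(self, request_output, req_id):
    completion_output = request_output.outputs[0]
    acceptance_history = getattr(completion_output, 'acceptance_history', None)
    if (self.engine_config.record_acceptance_rate and request_output.finished
            and acceptance_history):
        record = {"req_id": req_id}  # remaining fields as in the existing code
        record["output_size"] = len(completion_output.token_ids)
        logging.info(f"Speculative Decoding {record}")
```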
Yeah, this is in the base class; it was part of the inference function in the vllm rolling batch. Agreed that this needs an update to support vanilla vllm, which needs updates in other places anyway, like imports. Keeping it as is, since it's not related to this specific change.
I still think some things should be different here:
- I think our code should remain compatible with a vanilla install of vllm without throwing errors.
- I think "lmi_dist" should route to the installed lmi_dist, and the handler should figure out whether that is v1 or v2 and act accordingly (see the sketch below).
However, I'm ok with punting on some of this to a separate PR.
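A sketch of that routing; the version probe and the v2 class name are assumptions, and only the module paths appear in this PR.

```python
# Hypothetical sketch: pick the handler for whichever lmi_dist is installed.
def get_lmi_dist_rolling_batch_class():
    import lmi_dist  # assumed importable in both v1 and v2 containers
    version = getattr(lmi_dist, '__version__', '1')  # assumed version attribute
    if version.startswith('2'):
        from djl_python.rolling_batch.lmi_dist_v2_rolling_batch import (
            LmiDistRollingBatch as LmiDistV2RollingBatch)  # assumed class name
        return LmiDistV2RollingBatch
    from djl_python.rolling_batch.lmi_dist_rolling_batch import LmiDistRollingBatch
    return LmiDistRollingBatch
```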
        raise AssertionError(
            f"Need python engine to start vLLM RollingBatcher")
    return engine

    if rolling_batch == RollingBatchEnum.lmidist_v2 and engine != "MPI":
Could we just have individual parameter checking for each rolling batch implementation?
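Something like a declarative table, for example; this sketch assumes the existing RollingBatchEnum and is not code from the PR.

```python
# Hypothetical sketch: each rolling batch option declares its supported
# engines once, instead of scattering ad hoc if-checks.
SUPPORTED_ENGINES = {
    RollingBatchEnum.vllm: {"Python"},     # vLLM needs the Python engine
    RollingBatchEnum.lmidist_v2: {"MPI"},  # lmi-dist v2 needs MPI
}

def validate_engine(rolling_batch, engine: str) -> None:
    supported = SUPPORTED_ENGINES.get(rolling_batch)
    if supported is not None and engine not in supported:
        raise AssertionError(
            f"{rolling_batch} requires one of {supported}, got engine={engine}")
```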
    @@ -96,6 +96,9 @@ def get_rolling_batch_class_from_str(rolling_batch_type: str, is_mpi: bool,
        elif rolling_batch_type == "lmi-dist":
            from djl_python.rolling_batch.lmi_dist_rolling_batch import LmiDistRollingBatch
            return LmiDistRollingBatch
        elif rolling_batch_type == "lmi-dist-v2":
why not just replace lmi-dist?
This will be done soon, once the wheel is built and added to the container.
(Resolved review thread on engines/python/setup/djl_python/rolling_batch/lmi_dist_v2_rolling_batch.py)
Here are the things I suggested working on:
- Clean up the properties and keep vLLM and LMI-Dist V2 separate, or provide a way to allow different default configurations in the vLLM/Rubikon Engine settings.
- We can try to keep the common parts of the different rolling batchers together, but this needs some further cleanup. Maybe instead of making a base class, just create a utils module to store the commonly shared functions (see the sketch below); the step and other logic may change as vLLM versions grow, so loose function sharing is safer.
- Replace the default LMI-Dist V1 class with the V2 content.
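A sketch of the utils-module alternative; the module and function names are illustrative assumptions.

```python
# djl_python/rolling_batch/vllm_utils.py (hypothetical module)
# Shared helpers live in a plain module instead of a base class, so each
# handler opts in per function and engine-specific drift stays local.

def update_request_cache(request_cache: dict, request_output) -> None:
    """Helper shared by the vLLM and LMI-Dist V2 rolling batch handlers."""
    entry = request_cache[request_output.request_id]
    entry["text"] = request_output.outputs[0].text
    entry["finished"] = request_output.finished
```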
record["output_size"] = len(completion_output.token_ids) | ||
logging.info(f"Speculative Decoding {record}") | ||
|
||
def _is_t5_with_lmi_dist(self): |
Why does this show up in a base function? Could we make it LMI-Dist V2 only?
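For instance (a sketch; the base class and config attribute names are assumptions, not this PR's code):

```python
# Hypothetical sketch: keep the T5 check on the LMI-Dist V2 subclass so the
# shared base class stays engine-agnostic. Names below are assumptions.
class VllmRollingBatchBase:  # stand-in for the shared base class in this PR
    ...

class LmiDistRollingBatch(VllmRollingBatchBase):

    def _is_t5_with_lmi_dist(self) -> bool:
        # Assumption: the loaded HF config exposes model_type.
        return self.model_config.model_type == "t5"
```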
I will create a follow-up PR to address these concerns.
Description
Added lmi_dist_v2_rolling_batch.py, which implements rolling batch for the Rubikon engine. The implementation is similar to the vllm rolling batch, so I created a vllm rolling batch base class that will be shared by the vllm and Rubikon engines. As a result, I had to refactor the existing vllm rolling batch.
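As a sketch, the resulting hierarchy looks roughly like this; the class names are illustrative, and only the module names appear in this PR.

```python
from djl_python.rolling_batch.rolling_batch import RollingBatch  # assumed existing base

class VllmRollingBatchBase(RollingBatch):
    """Shared logic for engines that speak the vLLM request/response API."""

class VLLMRollingBatch(VllmRollingBatchBase):
    """Backed by a vanilla vLLM engine (vllm_rolling_batch.py)."""

class LmiDistRollingBatch(VllmRollingBatchBase):
    """Backed by the Rubikon engine (lmi_dist_v2_rolling_batch.py)."""
```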