[Usage]: How to use vLLM with Tensor input (customized tokenizer).
#3655
Comments
I believe if you implement a tokenizer class that works with
Line 118 in 3492859
✨Thanks for your reply! It appears that my issue aligns closely with the following discussions:
Our tokenizer is actually a simple class:

    import torch.nn as nn
    from torch import Tensor


    class BaseOrderTokenizer(nn.Module):
        """Tokenizer for order info."""

        def __init__(
            self,
            max_order_index: int,
            emb_dim: int,
            num_max_orders: int,
        ) -> None:
            super().__init__()
            self.max_order_index = max_order_index
            self.num_max_orders = num_max_orders
            self.emb_dim = emb_dim

        def forward(self, features: Tensor) -> Tensor:
            # Subclasses map the order features to embeddings.
            raise NotImplementedError()

Essentially, it's an embedding layer. While I can implement a ... Moreover, the initialization of ...
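To make "essentially, it's an embedding layer" concrete, here is a minimal sketch of what a subclass might look like. `EmbeddingOrderTokenizer` is a hypothetical name, and both the index range (`0..max_order_index`) and the input shape `(batch, num_max_orders)` are assumptions rather than details from the original project. The base class is repeated so that the block runs on its own.

```python
import torch
from torch import Tensor, nn


class BaseOrderTokenizer(nn.Module):
    """Base class repeated from the comment above so this block runs alone."""

    def __init__(self, max_order_index: int, emb_dim: int, num_max_orders: int) -> None:
        super().__init__()
        self.max_order_index = max_order_index
        self.num_max_orders = num_max_orders
        self.emb_dim = emb_dim

    def forward(self, features: Tensor) -> Tensor:
        raise NotImplementedError()


class EmbeddingOrderTokenizer(BaseOrderTokenizer):
    """Hypothetical subclass: one embedding vector per order index."""

    def __init__(self, max_order_index: int, emb_dim: int, num_max_orders: int) -> None:
        super().__init__(max_order_index, emb_dim, num_max_orders)
        # Assumption: indices lie in [0, max_order_index], hence the +1.
        self.embedding = nn.Embedding(max_order_index + 1, emb_dim)

    def forward(self, features: Tensor) -> Tensor:
        # features: integer tensor of shape (batch, num_max_orders)
        # returns:  float tensor of shape (batch, num_max_orders, emb_dim)
        return self.embedding(features)


tokenizer = EmbeddingOrderTokenizer(max_order_index=99, emb_dim=16, num_max_orders=8)
features = torch.randint(0, 100, (2, 8))
embeddings = tokenizer(features)
```

Note that the output here is a float tensor of embeddings, not a sequence of integer token IDs, which is why a text-tokenizer interface does not map onto it directly.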
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!
This issue has been automatically closed due to inactivity. Please feel free to reopen if you feel it is still relevant. Thank you!
Your current environment
How would you like to use vllm
Hello,

I am currently working on a finance-related Large Language Model (LLM) project. In this project, I'm using a customized tokenizer which inherits from `nn.Module` instead of `transformers.PreTrainedTokenizer`. Our model employs the `Llama2` architecture for the decoder. However, I am uncertain about how to effectively integrate vLLM with our model that utilizes our customized tokenizer.

I would like to know if the following pipeline is feasible with the current version of vLLM: executing our `tokenize` method first, followed by using `LLM.generate` for generation tasks.

More specifically:
- Does `vLLM` currently support `Tensor` input?
- Is it possible to omit the `tokenizer`, or to only provide a dummy `tokenizer` without actually employing it in the process?

Thank you for your patience and assistance. I eagerly await a response from the vLLM team.
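For readers with the same question, the "tokenize first, then `LLM.generate`" pipeline might be sketched as follows. This is not a confirmed answer from the maintainers. It assumes the custom tokenizer yields integer token IDs (it does not cover a tokenizer that outputs embeddings), that `pad_id` never occurs as a real token, and that the installed vLLM release provides `skip_tokenizer_init` and accepts prompt dicts with a `prompt_token_ids` key; older releases took a `prompt_token_ids` keyword argument on `generate` instead. `to_token_prompts` and `generate_from_ids` are hypothetical helper names.

```python
from typing import Dict, List, Sequence


def to_token_prompts(
    batch: Sequence[Sequence[int]], pad_id: int
) -> List[Dict[str, List[int]]]:
    """Convert a padded batch of token IDs into one prompt dict per request.

    `batch` may be a nested list or any object with `.tolist()`, such as a
    2-D integer tensor. Padding is dropped, assuming `pad_id` is never a
    real token.
    """
    rows = batch.tolist() if hasattr(batch, "tolist") else batch
    return [
        {"prompt_token_ids": [int(t) for t in row if t != pad_id]}
        for row in rows
    ]


def generate_from_ids(model_path: str, batch, pad_id: int, max_tokens: int = 32):
    """Generate from pre-tokenized input and return the generated token IDs."""
    # Imported lazily so the helper above works without vLLM installed.
    from vllm import LLM, SamplingParams

    # Assumption: this vLLM release supports `skip_tokenizer_init`.
    llm = LLM(model=model_path, skip_tokenizer_init=True)
    # Without a tokenizer vLLM cannot detokenize, so ask for token IDs only.
    params = SamplingParams(max_tokens=max_tokens, detokenize=False)
    outputs = llm.generate(to_token_prompts(batch, pad_id), params)
    return [list(out.outputs[0].token_ids) for out in outputs]
```

For example, `to_token_prompts([[5, 6, 0, 0], [7, 0, 0, 0]], pad_id=0)` yields two prompts holding `[5, 6]` and `[7]`. Because no tokenizer is loaded in this setup, any end-of-sequence handling would likely have to be configured explicitly, for instance through `stop_token_ids` in `SamplingParams`.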