
Does the chatglm2-6B example support building and running codegeex2-6b? #93

Closed
thendwk opened this issue Oct 24, 2023 · 9 comments
Labels: bug (Something isn't working), triaged (Issue has been triaged by maintainers)


thendwk commented Oct 24, 2023

I tried building and running codegeex2-6b with the chatglm2-6B example, but the result is incorrect.
The result is as follows:
root@***:/app/tensorrt_llm/examples/chatglm2-6b# python3 run.py --input_text '# language: Python\n# write a bubble sort function\n'
[10/24/2023-08:24:41] [TRT] [I] Loaded engine size: 11921 MiB
[10/24/2023-08:24:43] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 12584, GPU 12164 (MiB)
[10/24/2023-08:24:43] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +2, GPU +10, now: CPU 12586, GPU 12174 (MiB)
[10/24/2023-08:24:43] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +11912, now: CPU 0, GPU 11912 (MiB)
[10/24/2023-08:24:43] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 12585, GPU 15910 (MiB)
[10/24/2023-08:24:43] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +1, GPU +8, now: CPU 12586, GPU 15918 (MiB)
[10/24/2023-08:24:44] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +0, now: CPU 0, GPU 11912 (MiB)
[10/24/2023-08:24:44] [TRT] [I] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 12680, GPU 15940 (MiB)
[10/24/2023-08:24:44] [TRT] [I] [MemUsageChange] Init cuDNN: CPU +0, GPU +10, now: CPU 12680, GPU 15950 (MiB)
[10/24/2023-08:24:44] [TRT] [I] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +0, now: CPU 0, GPU 11912 (MiB)


Input --->

# language: Python\n# write a bubble sort function\n

Output --->
\n#bubblesort\n#takesalistofnumbers\n#returnsasortedlist\n\ndefbubble_sort(numbers):\nforiinrange(len(numbers)):\nforjinrange(len(numbers)-1):\nifnumbers[j]>numbers[j+1]:\nnumbers[j],numbers[j+1]=numbers[j+1],numbers[j]\nreturnnumbers\n\nprint(bubble_sort([5,2,1,8,4]))\n\n#bubblesort\n#takesalistofnumbers\n#returnsasortedlist\n\ndefbubble_sort(numbers):\nforiinrange(len(numbers)):\nforjinrange(len(numbers)-1):\nifnumbers[j]>numbers[j+1]:\nnumbers[j],numbers[j+1]=numbers[j+1],numbers[j]\nreturnnumbers\n\nprint(bubble_sort([5,2,1,8,4]))\n\n#bubblesort\n#takesalistofnumbers\n#returnsasortedlist\n\ndefbubble_sort(numbers):\nforiinrange(len(numbers)):\nforjinrange(len(numbers)-1):\nifnumbers[j]>numbers[j+1]:\nnumbers[j],numbers[j+1]=numbers[j+1],numbers[j]\nreturnnumbers\n\nprint(bubble_sort([5,2,1,8,4]))\n\n#bubblesort\n#takesalistofnumbers\n#returnsasortedlist\n\ndefbubble_sort(numbers):\nforiinrange(len(numbers)):\nforjinrange(len(numbers)-1):\nifnumbers[j]>numbers[j+1]:\nnumbers[j],numbers[j+1]=numbers[j+1],numbers[j]\nreturnnumbers\n\nprint(bubble_sort([5,2,1,8,4]))\n\n#bubblesort\n#takesalistofnumbers\n#returnsasortedlist\n\ndefbubble_sort(numbers):\nforiinrange(len(numbers)):\nforjinrange(len(numbers)-1):\nifnumbers[j]>numbers[j+1]:\nnumbers[j],numbers[j+1]=numbers[j+1],numbers[j]\nreturnnumbers\n\nprint(bubble_sort([5,2,1,8,4]))\n\n#bubblesort\n#takesalistofnumbers\n#returnsasortedlist\n\ndefbubble_sort(numbers):\nforiinrange(len(numbers)):\nforjinrange(len(numbers)-1):\nifnumbers[j]>numbers[j+1]:\nnumbers[j],numbers[j+1]=numbers[j+1],numbers[j]\nreturnnumbers\n\nprint(bubble_sort([5,2,1,8,4]))\n\n#bubblesort\n#takesalistofnumbers\n#returnsasortedlist\n\ndefbubble_sort(numbers):\nforiinrange(len(numbers)):\nforjinrange(len(numbers)-1):\nifnumbers[j]>numbers[j+1]:\nnumbers[j],numbers[j+1]=numbers[j+1],numbers[j]\nreturnnumbers\n\nprint(bubble_sort([5,2,1,8,4]))\n\n#bubblesort\n#takesalistofnumbers\n#returnsasortedlist\n\ndefbubble_sort(numbers):\nforiinrange(len(numbers)):\nforjinrange(len(numbers)-1):\nifnumbers[j]>numbers[j+1]:\n


Finished!

byshiue (Collaborator) commented Oct 24, 2023

Can you share the expected results and the steps you used to generate the TRT-LLM results?

byshiue self-assigned this Oct 24, 2023
byshiue added the triaged (Issue has been triaged by maintainers) label Oct 24, 2023
thendwk (Author) commented Oct 24, 2023

> Can you share the expected results and the steps you used to generate the TRT-LLM results?

The expected result is something like this:

def bubble_sort(list):
    for i in range(len(list) - 1):
        for j in range(len(list) - 1):
            if list[j] > list[j + 1]:
                list[j], list[j + 1] = list[j + 1], list[j]
    return list

print(bubble_sort([5, 2, 1, 8, 4]))

My steps are as follows:

1. Clone the model: git clone https://huggingface.co/THUDM/codegeex2-6b

2. Run build.py:

python3 build.py --model_dir=/docker_storage/codegeex/codegeex2-6b/ \
    --dtype float16 \
    --use_gpt_attention_plugin float16 \
    --use_gemm_plugin float16 \
    --max_input_len 2048 \
    --max_output_len 1024

Doing the above throws an exception, so I modified the build.py source code as follows:

hf_model = transformers.AutoModel.from_pretrained(
    args.model_dir, trust_remote_code=True,
    torch_dtype=torch.float16).cpu()

Maybe codegeex2-6b is trained in bf16.

3. Run prediction:

python3 run.py --input_text '# language: Python\n# write a bubble sort function\n'
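Side note on the prompt: with single quotes, the shell passes the two-character sequence \n through literally, so unless run.py unescapes it itself (not verified here), the model sees a literal backslash-n rather than real newlines. In a bash-compatible shell, ANSI-C quoting would embed actual newlines:

python3 run.py --input_text $'# language: Python\n# write a bubble sort function\n'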

byshiue (Collaborator) commented Oct 24, 2023

The TRT-LLM result you shared at the beginning is not very different from the HF result. Can you try building the engine with FP32 first (chatglm2 does not support BF16 yet)?
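For reference, a minimal sketch of such an FP32 build, adapted from the FP16 command above (whether the attention and GEMM plugins accept float32 here is an assumption; if they do not, drop those two flags):

python3 build.py --model_dir=/docker_storage/codegeex/codegeex2-6b/ \
    --dtype float32 \
    --use_gpt_attention_plugin float32 \
    --use_gemm_plugin float32 \
    --max_input_len 2048 \
    --max_output_len 1024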

thendwk (Author) commented Oct 24, 2023

> The TRT-LLM result you shared at the beginning is not very different from the HF result. Can you try building the engine with FP32 first (chatglm2 does not support BF16 yet)?

Building in default mode throws an exception as follows:

[screenshot of the exception]

Does this indicate that codegeex2-6B is trained in bf16 and that TensorRT-LLM does not support building codegeex2-6B?
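One quick way to check the stored dtype (a side sketch, not a step from the thread): many Hugging Face checkpoints record it in config.json under "torch_dtype", so inspecting the local clone from step 1 can answer the bf16 question when that field is present:

import json

# Path assumed from the clone step above.
with open("/docker_storage/codegeex/codegeex2-6b/config.json") as f:
    config = json.load(f)

# "torch_dtype" is the dtype the weights were saved in, when recorded.
print(config.get("torch_dtype"))  # "bfloat16" would confirm the guess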

byshiue (Collaborator) commented Oct 24, 2023

> Building in default mode throws an exception as follows: [screenshot of the exception] Does this indicate that codegeex2-6B is trained in bf16 and that TensorRT-LLM does not support building codegeex2-6B?

TensorRT-LLM's chatglm2 does not support BF16 weights yet; supporting them requires adding some flags and a data-type converter. That's why I suggested trying FP32 first.
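Absent native BF16 support, the data-type converter mentioned above amounts to something like the following sketch (a hypothetical standalone helper; in this thread the same effect came from passing torch_dtype=torch.float16 to from_pretrained in build.py):

import torch

# Cast every BF16 tensor in a checkpoint's state dict to FP16,
# leaving tensors of other dtypes untouched.
def cast_bf16_to_fp16(state_dict):
    return {
        name: t.to(torch.float16) if t.dtype == torch.bfloat16 else t
        for name, t in state_dict.items()
    }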

thendwk (Author) commented Oct 24, 2023

> TensorRT-LLM's chatglm2 does not support BF16 weights yet; supporting them requires adding some flags and a data-type converter. That's why I suggested trying FP32 first.

OK, got it, thanks.

byshiue (Collaborator) commented Oct 25, 2023

> hf_model = transformers.AutoModel.from_pretrained(args.model_dir, trust_remote_code=True, torch_dtype=torch.float16).cpu()

I took a try and found that chatglm2 only supports FP16 now, so you cannot run it with FP32. We will fix it soon.

thendwk (Author) commented Oct 25, 2023

> I took a try and found that chatglm2 only supports FP16 now, so you cannot run it with FP32. We will fix it soon.

OK, thanks.

byshiue (Collaborator) commented Oct 27, 2023

This issue is fixed by MR #148; you can try the latest main branch. Closing this bug. Feel free to reopen if needed.

byshiue closed this as completed Oct 27, 2023
byshiue added the bug (Something isn't working) label Oct 27, 2023