I used a mixed Chinese and English dataset and another custom 7B model to train EAGLE, but the performance in Chinese is not good enough #81

Closed
2024WY opened this issue Jun 17, 2024 · 1 comment

Comments

2024WY commented Jun 17, 2024

I modified preprocess_function, and the training data is the model's SFT data.
The images below show the training-set and test-set results recorded during training:
[Image: train]
[Image: test]

However, when I used the Spec-Bench project to test on a Chinese dataset, the mean accepted tokens was only 2.5994782086414654, which is not good enough.

Is there anything else I should pay attention to? Can you give some advice? Thanks!

2024WY changed the title from "I used a Chinese dataset and another custom 7B model to train EAGLE, but the performance in Chinese is not good enough" to "I used a mixed Chinese and English dataset and another custom 7B model to train EAGLE, but the performance in Chinese is not good enough" on Jun 17, 2024
@Liyuhui-12
Collaborator

I noticed that your top-3 accuracy on the training set is only about 0.8, which is relatively low. What is your training accuracy on the English dataset? If it is close to the accuracy on the Chinese dataset, it could be that the structure or size of the draft model is not suitable. If the English accuracy is significantly higher than the Chinese accuracy, it is possible that your base model is not sufficiently trained on Chinese, and its features cannot effectively capture the semantic information of Chinese.
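
One rough way to run the comparison described above is to split the evaluation samples by language and compute top-3 accuracy for each subset. The snippet below is only a minimal sketch and is not code from the EAGLE repository: the function names, the `(text, logits, targets)` sample layout, and the CJK-character heuristic for labelling a sample as Chinese are all assumptions made for illustration.

```python
from collections import defaultdict


def contains_cjk(text):
    """Rough heuristic (assumption): a sample counts as Chinese if it has any CJK ideograph."""
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)


def topk_hit(logits, target, k=3):
    """Return True if `target` is among the k highest-scoring token ids."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    return target in top


def topk_accuracy_by_language(samples, k=3):
    """Compute top-k accuracy separately for Chinese ("zh") and other ("en") samples.

    `samples` is a hypothetical iterable of (text, logits_seq, targets), where
    logits_seq[i] holds the draft model's scores over the vocabulary at position i
    and targets[i] is the token id the base model actually produced there.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for text, logits_seq, targets in samples:
        lang = "zh" if contains_cjk(text) else "en"
        for logits, target in zip(logits_seq, targets):
            hits[lang] += topk_hit(logits, target, k)
            totals[lang] += 1
    return {lang: hits[lang] / totals[lang] for lang in totals}
```

If the two numbers come out close, that points toward the draft model's structure or size; a clearly higher English number points toward the base model's Chinese features.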

@hongyanz hongyanz closed this as completed Aug 6, 2024