for i, (word, pos) in enumerate(seg_cut):
    sub_initials = []
    sub_finals = []
    if not '\u4e00' <= word[0] <= '\u9fff':
        if i > 0 and not '\u4e00' <= seg_cut[i-1][0][0] <= '\u9fff':
            continue
        else:
            now_word_length = pre_word_length + 1
    else:
        now_word_length = pre_word_length + len(word)
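For anyone who wants to try the counting logic outside PaddleSpeech, here is a minimal self-contained sketch. The helper names `is_chinese` and `word_spans`, the `pre_word_length = now_word_length` update, and the hand-written `seg_cut` lists are assumptions for illustration; they are not taken from zh_frontend.py.

```python
def is_chinese(ch):
    """True if ch falls in the CJK Unified Ideographs range used above."""
    return '\u4e00' <= ch <= '\u9fff'


def word_spans(seg_cut):
    """Return (word, start, end) index ranges into a per-element pinyin list.

    A run of non-Chinese tokens is counted as one element, mirroring the
    loop above. Advancing pre_word_length at the end of each iteration is
    an assumption about the surrounding code.
    """
    spans = []
    pre_word_length = 0
    for i, (word, pos) in enumerate(seg_cut):
        if not is_chinese(word[0]):
            # Skip a non-Chinese token that directly follows another one.
            if i > 0 and not is_chinese(seg_cut[i - 1][0][0]):
                continue
            now_word_length = pre_word_length + 1
        else:
            now_word_length = pre_word_length + len(word)
        spans.append((word, pre_word_length, now_word_length))
        pre_word_length = now_word_length
    return spans


# Hand-written stand-ins for psg.lcut output (not real jieba results).
one_token = [("一边", "d"), ("...", "x"), ("一边", "d")]
split_tokens = [("一边", "d"), (".", "x"), (".", "x"), (".", "x"), ("一边", "d")]

spans_one = word_spans(one_token)
spans_split = word_spans(split_tokens)
print(spans_one)
print(spans_split)
```

Under these assumptions both segmentations end at index 5, which matches a pinyin list where `'...'` occupies a single slot.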
tmp = []
for x in pypinyin_result:
    if x[0].isalnum():
        tmp.append(x)
    else:
        tmp.extend(list(x[0]))
pypinyin_result = tmp
assert len(sent) == len(pypinyin_result)
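A standalone sketch of the same flattening step, so the length check can be tried without installing anything. The `pypinyin_result` list here is written by hand and only assumed to have the shape of a list of one-element lists. As a small variation on the snippet above, each expanded character is wrapped in its own list so that all elements keep the same shape.

```python
sent = "一边...一边"

# Hand-written stand-in: one inner list per element, with the punctuation
# run "..." kept as a single element (assumed shape, not real library output).
pypinyin_result = [["yi1"], ["bian1"], ["..."], ["yi1"], ["bian1"]]

tmp = []
for x in pypinyin_result:
    if x[0].isalnum():
        # Ordinary pinyin such as "yi1" is kept unchanged.
        tmp.append(x)
    else:
        # Expand a punctuation run into one entry per character.
        tmp.extend([ch] for ch in x[0])
pypinyin_result = tmp

# After expansion there is exactly one entry per character of sent.
assert len(sent) == len(pypinyin_result)
print(pypinyin_result)
```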
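I am using the r1.4 branch.
When I input the text “一边...一边...”写出脱离险境的劳累, an error is raised that points to line 89 of tone_sandhi.py.
After investigating, I found that the problem is at line 235 of zh_frontend.py. The underlying cause is that seg_cut = psg.lcut(seg) while pinyins = self.g2pW_model(seg)[0]: each element of pinyins may be either the pinyin of a single character or a run of consecutive punctuation such as '...', which causes the error further down.
The fix is to change lines 232 to 235 of zh_frontend.py to the following: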
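A minimal sketch of the mismatch, using hand-written stand-in lists rather than real jieba or g2pW output (the token boundaries, POS tags, and pinyin strings are assumptions for illustration):

```python
# Shaped like the (word, pos) pairs returned by psg.lcut(seg).
seg_cut = [("一边", "d"), ("...", "x"), ("一边", "d")]

# Shaped like self.g2pW_model(seg)[0], with the punctuation run "..."
# kept as a single element instead of one element per character.
pinyins = ["yi1", "bian1", "...", "yi1", "bian1"]

# Naive alignment: advance len(word) positions for every segmented word.
char_count = sum(len(word) for word, _ in seg_cut)

# The character count overshoots the pinyin list, so slicing pinyins by
# character offsets misaligns every word after the punctuation.
print(char_count, len(pinyins))
```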