Codex/create training and inference scripts #2791
base: main
Conversation
Summary of Changes

Hello @ZZUZSL1024, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request adds comprehensive support for LoRA fine-tuning of the Paraformer model, providing a complete toolchain from training through evaluation. It also introduces a new Flask-based web service that brings advanced audio-processing features such as voiceprint registration and speaker identification. In addition, the existing WebSocket client and server receive substantial enhancements focused on improved logging, latency measurement, and robust concurrency management via thread pools and semaphores, yielding a more stable and scalable real-time speech processing system.
Code Review
This pull request introduces training and inference scripts for LoRA fine-tuning of Paraformer, along with substantial improvements to the WebSocket client and server. The client gains logging and latency statistics, while the server significantly improves concurrency handling and stability through thread pools and semaphores. These changes are very beneficial for the system's observability, performance, and robustness. A few areas can still be optimized or corrected, mainly around data-processing efficiency and configuration consistency.
```python
for name, db_emb in speaker_db.items():
    # compute cosine similarity
    data_list = json.loads(db_emb)
```
In `process_cam_result_with_identify_speakers`, `json.loads(db_emb)` is called repeatedly inside the loop. If `speaker_db` contains many entries, this repeated JSON parsing becomes a performance bottleneck. Consider parsing `db_emb` into a list once when `speaker_db` is loaded, or at least parsing it outside the loop, to improve efficiency.
Suggested change:

```python
# compute cosine similarity
# Optimization: db_emb should be parsed into a list when speaker_db is
# loaded, to avoid repeated parsing inside the loop
arr = np.array(json.loads(db_emb), dtype=np.float32)
```
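The suggestion above can be sketched end to end: decode each stored embedding exactly once when the speaker database is loaded, so the identification loop only does numpy math. The function names `load_speaker_db` and `identify` and the threshold value are illustrative assumptions, not code from this PR.

```python
import json
import numpy as np

def load_speaker_db(raw_db: dict) -> dict:
    """Parse each JSON-encoded embedding exactly once at load time."""
    return {name: np.array(json.loads(emb), dtype=np.float32)
            for name, emb in raw_db.items()}

def identify(query_emb: np.ndarray, speaker_db: dict, threshold: float = 0.6):
    """Return the best-matching speaker by cosine similarity, or None."""
    best_name, best_score = None, -1.0
    q = query_emb / (np.linalg.norm(query_emb) + 1e-9)
    for name, emb in speaker_db.items():  # embeddings already parsed
        e = emb / (np.linalg.norm(emb) + 1e-9)
        score = float(np.dot(q, e))
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name, best_score
    return None, best_score
```

With this shape, the per-request cost of identification is a handful of vectorized operations per enrolled speaker, with no JSON parsing on the hot path.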
```python
with open(args.hotword, encoding="utf-8") as f_scp:
    hot_lines = f_scp.readlines()

hot_list = []
for line in hot_lines:
    words = line.strip().split(" ")
    if len(words) < 2:
        print("Please checkout format of hotwords")
    words = line.strip().split()
    if not words:
        continue
    try:
        fst_dict[" ".join(words[:-1])] = int(words[-1])
    except ValueError:
        print("Please checkout format of hotwords")
    hotword_msg = json.dumps(fst_dict)
    # Python AutoModel: separate multiple hotwords with commas
    hot_list.append(words[0])

hotword_msg = ",".join(hot_list)
```
In `record_from_scp`, the hotword handling has been changed to produce a comma-separated string of hotwords. However, `funasr_wss_server.py` (lines 418-421 of the `ws_serve` function) assigns the `hotwords` field of the JSON message directly to `websocket.status_dict_asr["hotword"]` and `websocket.status_dict_asr_online["hotword"]`. If the server's AutoModel expects a dictionary (as in the original client code and common hotword usage), sending a comma-separated string may lead to incorrect hotword recognition. Please make sure the `hotword_msg` format the client sends matches the format the server's AutoModel expects.
Suggested change (replace the hotword-parsing block above with):

```python
fst_dict = {}
for line in hot_lines:
    words = line.strip().split()
    if not words:
        continue
    # Assuming the format is "hotword score"
    if len(words) < 2:
        print("Please checkout format of hotwords file (e.g., 'word score')")
        continue
    try:
        fst_dict[" ".join(words[:-1])] = int(words[-1])
    except ValueError:
        print("Please checkout format of hotwords (score must be an integer)")
        continue
hotword_msg = json.dumps(fst_dict, ensure_ascii=False)
```
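For context, here is a minimal sketch of how the client-side handshake might carry that payload, based on the `ws_serve` behavior described in the comment above. The exact message schema (field names like `mode` and `wav_name`, and their values) is an assumption for illustration, not a verified part of the FunASR protocol.

```python
import json

def build_handshake(hotword_msg: str, chunk_size=(5, 10, 5)) -> str:
    """Build the initial JSON message carrying the hotword payload.

    The "hotwords" field is what ws_serve forwards into
    status_dict_asr["hotword"]; here it is assumed to hold a
    JSON-dict string such as '{"foo": 20}' produced by
    json.dumps(fst_dict).  All other fields are illustrative.
    """
    message = {
        "mode": "2pass",              # assumed mode name
        "chunk_size": list(chunk_size),
        "wav_name": "demo",           # assumed placeholder
        "is_speaking": True,
        "hotwords": hotword_msg,
    }
    return json.dumps(message, ensure_ascii=False)
```

The point of the sketch is the shape of the `hotwords` field: a single string, so the dict-vs-comma-separated choice is entirely a contract between client and server.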
```python
if key is None and isinstance(record.get("source"), dict):
    key = record["source"].get("key")
keys.append(key or "")
```
In `load_jsonl`, the key-extraction logic could be clearer and more robust. When no key is found, it currently defaults to an empty string, which makes later records harder to trace. For consistency with the output logic (line 72), consider generating a unique identifier when the key is missing.
Suggested change (replace the key-extraction lines above with):

```python
key = record.get("key")
if key is None and isinstance(record.get("source"), dict):
    key = record["source"].get("key")
if key is None:
    key = f"input_utt_{len(keys)}"
keys.append(key)
```
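Folding the suggestion into a self-contained helper gives something like the sketch below. The overall shape of `load_jsonl` (its signature and return value) is an assumption; only the key-fallback chain comes from the review.

```python
import json

def load_jsonl(path):
    """Load a JSONL manifest, returning (keys, records).

    Key lookup order (fallback chain from the review suggestion):
    1. top-level "key" field,
    2. nested source.key,
    3. a generated unique id, so every record stays traceable.
    """
    keys, records = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            key = record.get("key")
            if key is None and isinstance(record.get("source"), dict):
                key = record["source"].get("key")
            if key is None:
                key = f"input_utt_{len(keys)}"  # unique fallback id
            keys.append(key)
            records.append(record)
    return keys, records
```

Because the fallback id embeds the record's position, output lines can always be matched back to their input even when the manifest omits keys.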
```python
from werkzeug.utils import secure_filename
import torch
import os
from openai import OpenAI
```
```python
model="paraformer-zh",
model_revision="v2.0.4",
```
No description provided.