Greetings!

Welcome!

I am Tianyun, an MA student majoring in Interpreting at HKBU. “Entaro” is my research project: an experimental cloud speech recognition system built on the IBM Watson automatic speech recognition (ASR) engine. The system currently performs low-latency, real-time speech recognition with support for multiple languages and keyword detection, and it matches term translations while recognition is running. Building on this, I have completed a pilot study showing that real-time ASR can shift and reallocate the interpreter’s cognitive load when comprehending heavily accented speech, reducing that load, shortening the interpreter’s ear-voice span (EVS), and improving the interpreter’s output.

On the speech recognition side, the current Entaro system largely handles the scenario described above, although some problems remain: recognition sessions are not long enough, noise robustness is poor, and the term base has limited capacity. I am now finishing the last step, page adaptation (UX design) and bug testing. Apologies for the slow progress; this work takes a lot of time. In a later major release, I will try to build a specialized ASR engine (e.g. with TensorFlow or Kaldi) so that it performs better on less common accents, such as Middle Eastern or East African accents.

Entaro also includes a semi-automatic EVS analysis tool, which you can try by clicking here. The current workflow is somewhat cumbersome: you manually type in the (automatically recognized) clause timestamps of both the source text and the target text, and the tool then generates a line chart showing the time lag at each node and the average. I will improve the user experience in a later version. Stay tuned 😉
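The EVS calculation described above can be sketched roughly as follows. This is only a minimal illustration, not Entaro's actual code: the function name `evs_lags`, the sample timestamps, and the assumption that source and target clauses are paired one to one are all hypothetical, and the chart drawing step is left out.

```python
from statistics import mean


def evs_lags(source_times, target_times):
    """Return the lag in seconds between each source clause and its target clause.

    Both arguments are lists of clause onset times in seconds, paired by index
    (an assumption of this sketch).
    """
    if len(source_times) != len(target_times):
        raise ValueError("source and target must contain the same number of clauses")
    return [round(t - s, 3) for s, t in zip(source_times, target_times)]


# Hypothetical timestamps, as a user might type them in by hand.
source = [0.0, 4.2, 9.5, 15.0]
target = [2.5, 7.0, 13.0, 18.1]

lags = evs_lags(source, target)
average = round(mean(lags), 3)

for index, lag in enumerate(lags, start=1):
    print(f"node {index}: EVS = {lag:.2f} s")
print(f"average EVS = {average:.3f} s")
```

The per-node lags and their average are the values that a line chart of this kind would plot.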


Update History:


V3.Alpha 1

New ASR engine: switched from Microsoft Speech SDK to the IBM Watson engine.

New UX design.

The term base now supports .xlsx files.

V2.Alpha 2

Real-time recognition is now displayed on a single line, and the responsive display was adjusted.

V2.Alpha 2 video demo:

V2.Alpha 1

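Built the term base feature and substantially improved the interface.

Wrote a simple term base (JSON format);

The term base automatically pops up the corresponding Chinese translation once the matching English content is recognized;

Recognition mode and input type are now switched through the top hamburger menu and the right-hand buttons, respectively;

Added a screen-switch button that moves the recognition history to a separate screen;

Substantially refined the interface fonts, background, and button styles;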
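As a rough illustration of the term base lookup listed above, the sketch below loads a small JSON term base and returns the Chinese translations for any English terms found in a recognized sentence. The JSON layout (English term mapped to Chinese translation), the names `load_term_base` and `match_terms`, and the sample entries are assumptions made for this example; the actual Entaro term base format is not shown on this page.

```python
import json
import re

# Hypothetical term base: English term -> Chinese translation.
TERM_BASE_JSON = '{"cognitive load": "认知负荷", "speech recognition": "语音识别"}'


def load_term_base(raw_json):
    """Parse the JSON term base and lowercase the keys for case-insensitive lookup."""
    return {term.lower(): translation for term, translation in json.loads(raw_json).items()}


def match_terms(transcript, term_base):
    """Return (term, translation) pairs for every term that occurs in the transcript."""
    text = transcript.lower()
    hits = []
    for term, translation in term_base.items():
        # Word boundaries keep a term from matching inside a longer word.
        if re.search(r"\b" + re.escape(term) + r"\b", text):
            hits.append((term, translation))
    return hits


term_base = load_term_base(TERM_BASE_JSON)
hits = match_terms("Speech recognition may reduce cognitive load.", term_base)
for term, translation in hits:
    print(f"{term} -> {translation}")
```

A real-time system would run a lookup like this on each new recognition hypothesis; a plain scan over the dictionary is enough for a small term base, while a larger one would likely need an index.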

V2.Alpha video demo:

V1.RC

Designed the system interface.

Introduced the Bootstrap design framework;

Added a logo and icons;

V1.RC video demo:

V0: Prototype

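Implemented real-time recognition.

Supported languages: English-US, English-GB and Chinese-CN;

Recognition modes: Dictation, Interactive and Conversation;

Input: Audio File and Microphone;

Current Hypothesis panel for the recognition in progress;

Results: successfully recognized content is collected in the Results box on the left;

V0 video demo: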