Abstract: Lipreading is the task of understanding the speech of a speaker in a video and translating it into text. State-of-the-art lipreading methods excel at interpreting overlap speakers, i.e ...