Note: in the results below, the real photo and the label in each row are ground-truth translations of each other.