author | wm4 <wm4@nowhere> | 2013-08-15 19:29:42 +0200 |
---|---|---|
committer | wm4 <wm4@nowhere> | 2013-08-15 23:40:02 +0200 |
commit | acb51c9243c7861774af6ad592acc07490fa7e7c (patch) | |
tree | 128dd7ff703495cde69fe56a29bf7b8765a803dc /mpvcore/bstr.c | |
parent | 380fa71fc79ba40936ea073cfdd183c708141420 (diff) | |
sub: if charset detection fails, treat it as broken UTF-8
Broken UTF-8 in this context means we treat it as UTF-8, but we also
interpret broken UTF-8 sequences as Latin1.
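
A minimal sketch of what "broken UTF-8" handling could look like, assuming a simple byte-wise pass: valid UTF-8 sequences are copied through, and any byte that does not start a valid sequence is read as a Latin1 character and re-encoded as two UTF-8 bytes. The names `utf8_seq_len` and `broken_utf8_to_utf8` are hypothetical and are not taken from this commit.

```c
#include <stddef.h>

/* Length of the valid UTF-8 sequence starting at s (n bytes available),
 * or 0 if the bytes there do not form one. Hypothetical helper. */
static size_t utf8_seq_len(const unsigned char *s, size_t n)
{
    if (n == 0)
        return 0;
    unsigned char c = s[0];
    size_t len;
    unsigned long cp, min;
    if (c < 0x80)
        return 1;                       /* plain ASCII */
    else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; min = 0x80; }
    else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; min = 0x800; }
    else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; min = 0x10000; }
    else
        return 0;                       /* stray continuation or bad lead byte */
    if (n < len)
        return 0;                       /* sequence is cut off */
    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                   /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong encodings, surrogates and out-of-range values. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    return len;
}

/* Copy valid UTF-8 through; re-encode every other byte as Latin1.
 * out must have room for 2 * n bytes. Returns the number of bytes written. */
size_t broken_utf8_to_utf8(const unsigned char *in, size_t n,
                           unsigned char *out)
{
    size_t o = 0;
    for (size_t i = 0; i < n; ) {
        size_t len = utf8_seq_len(in + i, n - i);
        if (len > 0) {
            for (size_t k = 0; k < len; k++)
                out[o++] = in[i + k];
            i += len;
        } else {
            /* Only bytes >= 0x80 reach this branch: U+0080..U+00FF */
            unsigned char b = in[i++];
            out[o++] = (unsigned char)(0xC0 | (b >> 6));
            out[o++] = (unsigned char)(0x80 | (b & 0x3F));
        }
    }
    return o;
}
```

With this sketch, the Latin1 input `caf\xE9` becomes `caf\xC3\xA9`, while input that is already valid UTF-8 passes through unchanged.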
Also, run our own UTF-8 check function before the charset detectors.
This keeps ENCA's UTF-8 check from possibly messing up (like
detecting 7-bit clean UTF-8 as ASCII, or other things). It also takes
care of UTF-8 detection if no charset detector (ENCA, libguess) is
compiled in, and it lets us deal better with cut-off UTF-8 sequences.
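
The ordering described above could be sketched roughly as follows, under stated assumptions: a simplified UTF-8 check runs first and tolerates a sequence that is cut off at the end of the buffer, an optional detector is consulted only if that check fails, and a "broken UTF-8" label is the last resort. The names `looks_like_utf8`, `guess_charset` and `detect_fn`, and the returned label strings, are hypothetical; the real ENCA and libguess APIs are not shown.

```c
#include <stddef.h>

/* Hypothetical detector callback (standing in for ENCA or libguess):
 * returns a charset name, or NULL if detection fails. */
typedef const char *(*detect_fn)(const unsigned char *buf, size_t n);

/* Returns 1 if buf looks like UTF-8. A multi-byte sequence that is merely
 * cut off at the end of the buffer is accepted. Simplified: it does not
 * reject overlong 3/4-byte forms or surrogates. */
static int looks_like_utf8(const unsigned char *buf, size_t n)
{
    size_t i = 0;
    while (i < n) {
        unsigned char c = buf[i];
        size_t len;
        if (c < 0x80)                   len = 1;
        else if (c >= 0xC2 && c <= 0xDF) len = 2;
        else if (c >= 0xE0 && c <= 0xEF) len = 3;
        else if (c >= 0xF0 && c <= 0xF4) len = 4;
        else
            return 0;                   /* invalid lead byte */
        for (size_t k = 1; k < len; k++) {
            if (i + k >= n)
                return 1;               /* cut off at the end: tolerate */
            if ((buf[i + k] & 0xC0) != 0x80)
                return 0;               /* broken in the middle */
        }
        i += len;
    }
    return 1;
}

/* Own UTF-8 check first, then the detector (if any), then the fallback. */
const char *guess_charset(const unsigned char *buf, size_t n, detect_fn detect)
{
    if (looks_like_utf8(buf, n))
        return "utf-8";
    const char *cs = detect ? detect(buf, n) : NULL;
    return cs ? cs : "utf-8-broken";    /* hypothetical label */
}
```

Passing `NULL` as the detector models a build without ENCA or libguess: valid or merely truncated UTF-8 is still recognized, and everything else falls back to the broken-UTF-8 treatment.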