sub: if charset detection fails, treat it as broken UTF-8

Broken UTF-8 in this context means that we treat the data as UTF-8, but interpret invalid UTF-8 sequences as Latin1. Also, run our own UTF-8 check function before the charset detectors. This guards against ENCA's UTF-8 check messing up (such as detecting 7-bit clean UTF-8 as ASCII, among other things). It also takes care of UTF-8 detection if no charset detector (ENCA, libguess) is compiled in, and it lets us deal better with cut-off UTF-8 sequences.
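
To illustrate what "broken UTF-8" decoding amounts to, here is a minimal standalone sketch. It is not the code from this commit: the helper names `utf8_seq_len` and `broken_utf8_to_utf8` are hypothetical, and the sketch assumes the caller supplies an output buffer of at least `2 * len + 1` bytes. Well-formed UTF-8 sequences are copied unchanged; any other byte is read as Latin1 and re-encoded as a two-byte UTF-8 sequence.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper: return the length (1-4) of a well-formed UTF-8
// sequence starting at s, or 0 if the bytes at s do not form one within
// the remaining len bytes.
static size_t utf8_seq_len(const unsigned char *s, size_t len)
{
    if (len == 0)
        return 0;
    unsigned char c = s[0];
    size_t n;
    uint32_t cp, min;
    if (c < 0x80) {
        return 1;
    } else if ((c & 0xE0) == 0xC0) {
        n = 2; cp = c & 0x1F; min = 0x80;
    } else if ((c & 0xF0) == 0xE0) {
        n = 3; cp = c & 0x0F; min = 0x800;
    } else if ((c & 0xF8) == 0xF0) {
        n = 4; cp = c & 0x07; min = 0x10000;
    } else {
        return 0; // stray continuation byte or invalid lead byte
    }
    if (len < n)
        return 0; // cut-off sequence
    for (size_t i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    // Reject overlong forms, UTF-16 surrogates, and values above U+10FFFF.
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    return n;
}

// Hypothetical helper: copy valid UTF-8 through unchanged and re-encode
// every other byte as the Latin1 character with the same value.
// out must hold at least 2 * len + 1 bytes.
// Returns the number of bytes written, excluding the terminating NUL.
static size_t broken_utf8_to_utf8(const char *in, size_t len, char *out)
{
    const unsigned char *s = (const unsigned char *)in;
    size_t o = 0;
    for (size_t i = 0; i < len; ) {
        size_t n = utf8_seq_len(s + i, len - i);
        if (n) {
            memcpy(out + o, s + i, n);
            o += n;
            i += n;
        } else {
            // Only bytes 0x80..0xFF reach this branch; as Latin1 they map
            // to U+0080..U+00FF, which always encode as two UTF-8 bytes.
            out[o++] = (char)(0xC0 | (s[i] >> 6));
            out[o++] = (char)(0x80 | (s[i] & 0x3F));
            i++;
        }
    }
    out[o] = '\0';
    return o;
}
```

With this sketch, the Latin1 input `caf\xE9` comes out as `caf\xC3\xA9`, while input that is already `caf\xC3\xA9` passes through unchanged. A sequence cut off at the end of the buffer is treated like any other invalid byte here, which is a simplification.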
author: wm4 <wm4@nowhere> 2013-08-15 19:29:42 +0200
committer: wm4 <wm4@nowhere> 2013-08-15 23:40:02 +0200
commit: acb51c9243c7861774af6ad592acc07490fa7e7c (patch)
tree: 128dd7ff703495cde69fe56a29bf7b8765a803dc /mpvcore/charset_conv.h
parent: 380fa71fc79ba40936ea073cfdd183c708141420 (diff)
download: mpv-acb51c9243c7861774af6ad592acc07490fa7e7c.tar.bz2
mpv-acb51c9243c7861774af6ad592acc07490fa7e7c.tar.xz
1 files changed, 2 insertions, 1 deletions
diff --git a/mpvcore/charset_conv.h b/mpvcore/charset_conv.h
index ad10f010a0..171793ffab 100644
--- a/mpvcore/charset_conv.h
+++ b/mpvcore/charset_conv.h
@@ -7,10 +7,11 @@
 enum {
     MP_ICONV_VERBOSE = 1,       // print errors instead of failing silently
     MP_ICONV_ALLOW_CUTOFF = 2,  // allow partial input data
+    MP_STRICT_UTF8 = 4,         // don't fall back to UTF-8-BROKEN when guessing
 };
 
 bool mp_charset_requires_guess(const char *user_cp);
-const char *mp_charset_guess(bstr buf, const char *user_cp);
+const char *mp_charset_guess(bstr buf, const char *user_cp, int flags);
 bstr mp_charset_guess_and_conv_to_utf8(bstr buf, const char *user_cp, int flags);
 bstr mp_iconv_to_utf8(bstr buf, const char *cp, int flags);
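
The new flag only makes sense together with the fallback described in the commit message, so a rough standalone sketch of the control flow may help. Everything here is an assumption for illustration: `SKETCH_STRICT_UTF8`, `sketch_is_utf8` and `sketch_guess` are hypothetical names, the validity check is structural only (it does not reject overlong forms), and returning NULL in the strict case is a guess at a plausible contract, not something this diff shows.

```c
#include <stdbool.h>
#include <stddef.h>

// Hypothetical stand-in that mirrors the value of MP_STRICT_UTF8.
enum { SKETCH_STRICT_UTF8 = 4 };

// Structural UTF-8 check: lead bytes must be followed by the right number
// of continuation bytes. Pure 7-bit input passes as UTF-8.
static bool sketch_is_utf8(const unsigned char *s, size_t len)
{
    for (size_t i = 0; i < len; ) {
        size_t n = s[i] < 0x80 ? 1 :
                   (s[i] & 0xE0) == 0xC0 ? 2 :
                   (s[i] & 0xF0) == 0xE0 ? 3 :
                   (s[i] & 0xF8) == 0xF0 ? 4 : 0;
        if (n == 0 || n > len - i)
            return false;
        for (size_t k = 1; k < n; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return false;
        }
        i += n;
    }
    return true;
}

// Guess a charset name for buf. The UTF-8 check runs first; an external
// detector (ENCA, libguess) would be consulted where marked. If nothing
// matches, fall back to "UTF-8-BROKEN" unless the strict flag is set.
static const char *sketch_guess(const char *buf, size_t len, int flags)
{
    if (sketch_is_utf8((const unsigned char *)buf, len))
        return "UTF-8";
    // (a charset detector would be tried here, if one is compiled in)
    if (flags & SKETCH_STRICT_UTF8)
        return NULL; // assumption: strict callers get "no guess"
    return "UTF-8-BROKEN";
}
```

Under these assumptions, a caller that passes the strict flag can tell "not UTF-8 and nothing else detected" apart from a successful guess, while other callers always get a usable charset name.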