author Jehan <jehan@girinstud.io> 2015-08-04 14:06:22 +0200
committer wm4 <wm4@nowhere> 2015-08-04 17:51:00 +0200
commit e7897dfb9b0e2d1f9d37ccc7df131b62694603e6 (patch)
tree b4aa5cc514eb745ad5393ee2918f2842e069cdce
parent aca591ded3f7600882b57ec4526bc49b1377f203 (diff)
charset_conv: "auto" encoding detection now uses uchardet.
If mpv is not built with uchardet, "enca" remains the fallback for encoding detection.
-rw-r--r-- DOCS/man/options.rst 9
-rw-r--r-- misc/charset_conv.c 4
2 files changed, 8 insertions, 5 deletions
diff --git a/DOCS/man/options.rst b/DOCS/man/options.rst
index 72fc74108b..9ad67c6492 100644
--- a/DOCS/man/options.rst
+++ b/DOCS/man/options.rst
@@ -1438,10 +1438,11 @@ Subtitles
``--sub-codepage=<codepage>``
If your system supports ``iconv(3)``, you can use this option to specify
- the subtitle codepage. By default, ENCA will be used to guess the charset.
- If mpv is not compiled with ENCA, ``UTF-8:UTF-8-BROKEN`` is the default,
- which means it will try to use UTF-8, otherwise the ``UTF-8-BROKEN``
- pseudo codepage (see below).
+ the subtitle codepage. By default, uchardet will be used to guess the
+ charset. If mpv is not compiled with uchardet, enca will be used.
+ If mpv is compiled with neither uchardet nor enca, ``UTF-8:UTF-8-BROKEN``
+ is the default, which means it will try to use UTF-8, otherwise the
+ ``UTF-8-BROKEN`` pseudo codepage (see below).
The default value for this option is ``auto``, whose actual effect depends
on whether ENCA is compiled.
diff --git a/misc/charset_conv.c b/misc/charset_conv.c
index b611538bc1..cef9c4a9e7 100644
--- a/misc/charset_conv.c
+++ b/misc/charset_conv.c
@@ -193,7 +193,9 @@ const char *mp_charset_guess(void *talloc_ctx, struct mp_log *log, bstr buf,
bool use_auto = strcasecmp(user_cp, "auto") == 0;
if (use_auto) {
-#if HAVE_ENCA
+#if HAVE_UCHARDET
+ user_cp = "uchardet";
+#elif HAVE_ENCA
user_cp = "enca";
#else
user_cp = "UTF-8:UTF-8-BROKEN";
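The fallback order introduced by this hunk can be tried in isolation. The following is a minimal sketch, not mpv code: `resolve_auto_codepage` is a hypothetical helper, and defaulting `HAVE_UCHARDET` and `HAVE_ENCA` to 0 is an assumption made so the file compiles on its own (in mpv these macros come from the build configuration).

```c
#include <string.h>
#include <strings.h>  /* strcasecmp (POSIX) */

/* Build-time feature flags. Defaulting both to 0 is an assumption for
 * this standalone sketch; override with -DHAVE_UCHARDET=1 or
 * -DHAVE_ENCA=1 to exercise the other branches. */
#ifndef HAVE_UCHARDET
#define HAVE_UCHARDET 0
#endif
#ifndef HAVE_ENCA
#define HAVE_ENCA 0
#endif

/* Hypothetical helper (not part of mpv): maps the user-supplied codepage
 * to the detector name chosen at compile time. Any value other than
 * "auto" (compared case-insensitively) is returned unchanged. */
static const char *resolve_auto_codepage(const char *user_cp)
{
    if (strcasecmp(user_cp, "auto") != 0)
        return user_cp;
#if HAVE_UCHARDET
    return "uchardet";            /* preferred detector after this commit */
#elif HAVE_ENCA
    return "enca";                /* fallback when uchardet is unavailable */
#else
    return "UTF-8:UTF-8-BROKEN";  /* no detector compiled in */
#endif
}
```

Because the choice is made by the preprocessor, only one branch exists in any given binary; building the sketch with `-DHAVE_UCHARDET=1 -DHAVE_ENCA=1` would still select "uchardet", mirroring the `#if`/`#elif` order in the patch.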