| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Fixes e.g --sub-codepage=utf8:gb18030 if the subtitle us UTF-8.
This was broken in commit e5d31808.
Also log the detected charset in verbose mode.
|
|
|
|
|
|
|
|
|
|
| |
Some charsets can look like valid UTF-8, but aren't UTF-8. One example
is ISO-2022-JP. While ENCA apparently likes to get misdetect real UTF-8,
this is not the case with uchardet. uchardet can detect ISO-2022-JP
correctly, but didn't even get to try, because our own UTF-8 check
succeeded. So run the UTF-8 check when using ENCA only.
Fixes #2195.
|
|
|
|
|
| |
If mpv is not built with uchardet, "enca" is still the fallback default
encoding detection.
|
|
|
|
| |
Fixes #2186.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For now, it needs to be explicitly selected. ENCA is still the default.
This assumes uchardet returns iconv names. This doesn't seem to be
always the case, and the result are lots of iconv errors. So
explicitly check for this situation, and print a warning if it
occurs. It's entirely possible that uchardet support is actually
useless, because names are not necessarily iconv-compatible (but
uchardet doesn't seem to document whether it attempts to return
iconv-compatible names if possible).
Fixes #908.
|
|
|
|
|
|
|
|
|
| |
uchardet is written in C++, and thus doesn't appreciate the value of
using static strings, and internally stores the guessed charset as
allocated std::string. Add a minimal hack to deal with this. (I don't
appreciate that the code is potentially harder to understand by
returning either a static or allocated string, but I do appreciate for
not having to litter the existing code with strdups.)
|
|
|
|
|
|
|
|
|
|
|
| |
Useful for Windows stuff. Actually, ENCA support should catch this, but,
well, whatever, everyone seems to hate ENCA.
Detection with BOM is trivial, although it needs some hackery to
integrate it with the existing autodetection support. For one, change
the default value of --sub-codepage to make this easier.
Probably fixes issue #937 (the second part).
|
|
|
|
|
|
|
| |
It happens to work without strings.h on glibc or with _GNU_SOURCE, but
the POSIX standard requires including <strings.h>.
Hopefully fixes OSX build.
|
| |
|
|
|