path: root/demux
Commit message | Author | Age | Files | Lines
* demux: propagate streaming flag through demux_timeline | wm4 | 2019-09-20 | 3 | -3/+10
| Before this commit, EDL or CUE files did not properly enable the cache if they were on "slow" media (stream->streaming==true). This happened because the stream is unset for demux_timeline, so the streaming flag could not be queried anymore.
|
| Fix this by adding this flag to struct demuxer, and propagate it exactly like the is_network flag. is_network is not used for checking the cache options anymore, and its main function seems to be something else.
|
| Normal http streams set the streaming flag already.
|
| This should fix #6958.
* demux_lavf: document intentional FFmpeg API violation | wm4 | 2019-09-19 | 1 | -0/+4
| This field is documented as internal, so an API user should not access it. However, this is the only way to get some read statistics without replacing FFmpeg's entire HLS demuxer. (Using custom I/O as workaround doesn't work: the HLS code uses some weird internal APIs that cannot be provided by FFmpeg API users; I even got the author of the relevant patch to provide a public API, which was then shot down by another FFmpeg developer. So I take this as my right to access this field.)
|
| Mention this explicitly, as it affects ABI and API compatibility, and I don't want anyone to claim this was a "mistake". Add some explanations.
* packet: fix theoretical UB if called on "empty" packets | wm4 | 2019-09-19 | 1 | -2/+4
| In theory, a 0 size allocation could have made it memset() on a NULL pointer (with a non-0 size, which makes it crash in addition to theoretical UB). This should never happen, since even packets with size 0 should have an associated allocation, as FFmpeg currently does. But avoiding this makes the API slightly more orthogonal and less tricky, I guess.
* Revert "demux/packet: fix demux_packet_shorten" | wm4 | 2019-09-19 | 1 | -2/+2
| This reverts commit 95636c65e73c1d0d8cba43d8c230291d99962e88.
|
| This change shouldn't be needed, and in fact it's wrong. The FFmpeg API function could do anything it wants with the packet, including changing the packet data pointer. Likewise, it's not guaranteed that the referenced packet's fields mirror the current state of the mpv packet struct (the AVPacket is only kept for the AVBuffer and the side data stuff).
* demux: fix another incorrect BOF cache flag issue | wm4 | 2019-09-19 | 1 | -2/+5
|
* command, demux: add AB-loop keyframe cache align command | wm4 | 2019-09-19 | 2 | -0/+80
| Helper for the ab-loop-dump-cache command, see manpage additions.
|
| This is kind of shit. Not only is this a very "special" feature, but it also vomits more messy code into the big and already bloated demux.c, and the implementation is sort of duplicated with the dump-cache code. (Except it's different.) In addition, the results sort of depend on what a video player would do with the dump-cache output, or what the user wants (for example, a user might be more interested in the range of output audio, instead of the video).
|
| But hey, I don't actually need to justify it. I'm only justifying it for fun.
* demux, command: add a third stream recording mechanism | wm4 | 2019-09-19 | 2 | -0/+208
| That's right, and it's probably not the end of it. I'll just claim that I have no idea how to create a proper user interface for this, so I'm creating multiple partially-orthogonal mechanisms, of which some may work better in their special use cases.
|
| Until now, there was --record-file. You get relatively good control about what is muxed, and it can use the cache. But it sucks that it's bound to playback. If you pause while it's set, muxing stops. If you seek while it's set, the output will be sort-of trashed, and that's by design.
|
| Then --stream-record was added. This is a bit better (especially for live streams), but you can't really control well when muxing stops or ends. In particular, it can't use the cache (it just dumps whatever the underlying demuxer returns).
|
| Today, the idea is that the user should just be able to select a time range to dump to a file, and it should not be affected by the user seeking around in the cache. In addition, the stream may still be running, so there's some need to continue dumping, even if it's redundant to --stream-record.
|
| One notable thing is that it uses the async command shit. Not sure whether this is a good idea. Maybe not, but whatever. Also, a user can always use the "async" prefix to pretend it doesn't.
|
| Much of this was barely tested (especially the reinterleaving crap), let's just hope it mostly works. I'm sure you can tolerate the one or other crash?
* demux: move packet cache reading to a function | wm4 | 2019-09-19 | 1 | -14/+27
| Useful for a following commit.
* demux: move a seek helper to a separate function | wm4 | 2019-09-19 | 1 | -35/+47
| It makes some slight sense and helps with one of the following commits. Also rename that other function to make it sound less similar to find_seek_target().
* demux: minor simplification for backward cache size option | wm4 | 2019-09-19 | 1 | -2/+4
| Always set max_bytes_bw to 0 if seekable cache is disabled, instead of at the place of its use. This is the only use of it, so the commit should not change any behavior.
|
| (Alternatively, this could drop the max_bytes_bw variable, use the option directly, and keep the old code that resets it on use if the cache is disabled.)
* demux: allow backward cache to use unused forward cache | wm4 | 2019-09-19 | 1 | -1/+10
| Until now, the following could happen: if you set a 1GB forward cache, and a 1GB backward cache, and you opened a 2GB file, it would prune away the data cached at the start as playback progressed past the 50% mark. With this commit, nothing gets pruned, because the total memory usage will still be 2GB, which equals the total allowed memory usage of 1GB + 1GB.
|
| There are no explicit buffers (every packet is malloc'ed and put into a linked list), so it all comes down to buffer size computations. Both reader and prune code use these sizes to decide whether a new packet should be read / an old packet discarded. So just add the remaining free "space" from the forward buffer to the available backward buffer. Still respect if the back buffer is set to 0 (e.g. unseekable cache where it doesn't make sense to keep old packets).
|
| We need to make sure that the forward buffer can always append, as long as the forward buffer doesn't exceed the set size, even if the back buffer "borrows" free space from it. For this reason, always keep 1 byte free, which is enough to allow it to read a new packet. Also, it's now necessary to call pruning when adding a packet, to get back "borrowed" space that may need to be free'd up after a packet has been added.
|
| I refrained from doing the same for forward caching (making forward cache use unused backward cache). This would work, but has a disadvantage. Assume playback starts paused. Demuxing will stop once the total allowed cache size is reached. When unpausing, the forward buffer will slowly move to the back buffer. That alone will not change the total buffer size, so demuxing remains stopped. Playback would need to pass over data of the size of the back buffer until demuxing resumes; I consider this unacceptable. Live playback would break (or rather, would not resume in unintuitive ways), even normal streaming may break if the server invalidates the URL due to inactivity. As an alternative implementation, you could prune the back buffer immediately, so the forward buffer can grow, but then the back buffer would never grow. Also makes no sense.
|
| As far as the user interface is concerned, the idea is that the limits on their own aren't really meaningful, the purpose is merely to vaguely restrict the cache memory usage. There could be just a single option to set the total allowed memory usage, but the separate backward cache controls the default ratio of backward/forward cache sizes. From that perspective, it doesn't matter if the backward cache uses more of the total buffer than assigned, if the forward buffer is complete.
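The size computation described in the commit above can be sketched roughly as follows. This is an illustration in Python rather than mpv's C, and the names backward_budget, needs_prune, max_fw, max_bw, fw_used and bw_used are hypothetical, not taken from demux.c. It only models the rule stated in the message: the back buffer may borrow the forward buffer's free space, minus 1 byte, unless the back buffer is set to 0.

```python
def backward_budget(max_fw: int, max_bw: int, fw_used: int) -> int:
    """Bytes the backward cache may hold, including borrowed forward space.

    All parameter names are assumptions made for this sketch.
    """
    if max_bw == 0:
        # Back buffer disabled (e.g. unseekable cache): never borrow.
        return 0
    free_fw = max(max_fw - fw_used, 0)
    # Keep 1 byte of forward space unborrowed so the reader can always append.
    borrowed = max(free_fw - 1, 0)
    return max_bw + borrowed


def needs_prune(max_fw: int, max_bw: int, fw_used: int, bw_used: int) -> bool:
    """True if old packets must be discarded to give back borrowed space."""
    return bw_used > backward_budget(max_fw, max_bw, fw_used)
```

Because the budget shrinks whenever the forward buffer grows, a check like needs_prune() would have to run after every appended packet, which matches the message's remark that pruning is now called when adding a packet.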
* demux: don't clobber internal demuxer EOF state in cache seeks | wm4 | 2019-09-19 | 1 | -1/+1
| The last_eof field is the last known EOF state from the underlying demuxer. Normally, seeks reset it, because obviously if you seek back into the middle of the file, you don't want last_eof to have a "wrong" value for a short time window (until a packet is read, which would reset the field to its correct value). This shouldn't happen during cache seeks, because you don't touch the underlying demuxer state.
|
| At first, I made this change because some other work in progress required it. It turned out that it was unnecessary, but keep the change anyway, since it's still correct and makes the logic cleaner.
* packet: change memory estimation heuristics | wm4 | 2019-09-19 | 1 | -2/+5
| Determining how much memory something uses is very hard, especially in high level code (yes we call code using malloc high level). There's no way to get an exact amount, especially since the malloc arena is shared with the entire process anyway. So the demuxer packet cache tries to get by with an estimate using a number of rough guesses.
|
| It seems this wasn't quite good. In some ways, it was too optimistic, in others it seemed to account for too much data. Try to get it closer to what malloc and ta probably do. In particular, talloc adds some significant overhead (using talloc for mass-data was a mistake, and it's even my fault).
|
| The result appears to match better with measured memory usage. This is still extremely dependent on malloc implementation and so on. The effect is that you may need to adjust the demuxer cache limits to cache as much data as it did before this commit. In any case, seems to be better for me.
* packet: free some unnecessary memory in disk cache case | wm4 | 2019-09-19 | 1 | -1/+2
| If the disk cache is used, the AVPacket is not used anymore and is completely deallocated when the packet is written to disk. As a minor bug, the AVPacket allocation itself was not freed (although it wasn't a memory leak, since talloc still automatically freed it when the entire demux_packet was freed).
|
| For very large caches, this could easily add up to over a hundred MB, so actually free the unneeded allocation.
* demux: honor seek discontinuities with --stream-record | wm4 | 2019-09-19 | 1 | -0/+3
| Do the same thing --record-file does when seeks happen.
* demux: runtime option changing for cache and stream recording | wm4 | 2019-09-19 | 1 | -33/+92
| Make most of the demuxer options runtime-changeable. This includes the cache options and stream recording. The manpage documents some of the possibly weird issues related to this.
|
| In particular, the disk cache isn't shuffled around if the setting changes at runtime.
* demux: enable --stream-record for things using timeline | wm4 | 2019-09-19 | 1 | -0/+2
| Although this is not useful in general, it makes --stream-record work with a certain video streaming service by a large dystopian company.
|
| In the general case, this fails because normal muxing can, quite obviously, not handle the segmented metadata in the packets. (There isn't even a file format which could handle these, except possibly mp4.) On the other hand, ytdl merely uses timeline/EDL to emulate DASH streaming (unfortunately), which does not use the segmented stuff, and stream recording will actually work.
* demux_mkv: add hacks to avoid a single warning | wm4 | 2019-09-19 | 1 | -9/+26
| It prints "Unexpected end of file (no clusters found)" when opening a webm init fragment. The warning is correct, but unwanted in this case. Add tons of kludges to avoid it. (Actually it prints that twice, for audio and video each.)
|
| Also, suppress another warning about a seek head entry that points exactly to the end of the file. This is a MATROSKA_ID_CUES, which is harmless, and, very strangely, doesn't point at any cues when you concatenate the init fragment with a media fragment. No idea what that crap is supposed to be.
* demux: make webm dash work by using init fragment on all demuxers | wm4 | 2019-09-19 | 2 | -32/+23
| Retarded webshit streaming protocols (well, DASH) chop a stream into small fragments, and move unchanging header parts to an "init" fragment to save some bytes (in the case at hand about 300 bytes for each fragment that is 100KB-200KB, sure was worth it, fucking idiots).
|
| Since mpv uses an even more retarded hack to inefficiently emulate DASH through EDL, it opens a new demuxer for every fragment. Thus the fragment needs to be virtually concatenated with the init fragment. (To be fair, I'm not sure whether the alternative, reusing the demuxer and letting it see a stream of byte-wise concatenated fragments, would actually be saner.)
|
| demux_lavf.c contained a hack for this. Unfortunately, a certain shitty streaming site by an evil company, that will bestow dystopia upon us soon enough, sometimes serves webm based DASH instead of the expected mp4 DASH. And for some reason, libavformat's mkv demuxer can't handle the init fragment or rejects it for some reason. Since I'd rather eat mushrooms grown in Chernobyl than debugging, hacking, or (god no) contributing to FFmpeg, and since Chernobyl is so far away, make it work with our builtin mkv demuxer instead.
|
| This is not hard. We just need to copy the hack in demux_lavf.c to demux_mkv.c. Since I'm not _that_ much of a dumbfuck to actually do this, remove the shitty gross demux_lavf.c hack, and replace it by a slightly less bad generic implementation (stream_concat.c from the previous commit), and use it on all demuxers. Although this requires much more code, this frees demux_lavf.c from a hack, and doesn't require adding a duplicated one to demux_mkv.c, so to the naive eye this seems to be a much better outcome.
|
| Regarding the code, for some reason stream_memory_open() is never meant to fail, while stream_concat_open() can in extremely obscure situations, and (currently) not in this case, but we handle failure of it anyway. Yep.
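The "virtual concatenation" of an init fragment with a media fragment, as described in the commit above, can be illustrated with a small reader that presents several byte streams as one. This is only a sketch in Python under assumed names (ConcatReader is hypothetical); it does not reproduce stream_concat.c or any mpv interface, and it ignores seeking.

```python
import io


class ConcatReader:
    """Read several byte streams as if they were one contiguous stream.

    Hypothetical helper for illustration; sequential reads only.
    """

    def __init__(self, parts):
        self._parts = list(parts)  # file-like objects opened in binary mode
        self._index = 0            # index of the part currently being read

    def read(self, size: int) -> bytes:
        out = bytearray()
        while len(out) < size and self._index < len(self._parts):
            chunk = self._parts[self._index].read(size - len(out))
            if chunk:
                out += chunk
            else:
                # Current part is exhausted, continue with the next one.
                self._index += 1
        return bytes(out)


# The init fragment comes first, then the media fragment.
reader = ConcatReader([io.BytesIO(b"INIT"), io.BytesIO(b"media")])
header_and_start = reader.read(6)  # spans the boundary between both parts
```

A demuxer reading from such an object never sees the boundary, which is the property the message relies on when it opens a new demuxer per fragment.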
* demux: never set demux->stream for timeline mess | wm4 | 2019-09-19 | 1 | -27/+47
| Timeline (demux_timeline, for EDL and mkv ordered chapters) is a mess, because it's the only nested demuxer case. Part of the mess comes from shared struct stream pointers. This makes no sense, because the wrapper (demux_timeline) doesn't have any business setting it.
|
| Try to lessen it by not passing down streams. Instead, pass down NULL. This prevents unintended interference, and tightens the ownership rules. Now a demuxer always owns its stream.
|
| On the other hand, demuxer->stream can now be NULL. This was never the case before, and consequently there will be new bugs. At least they will be spotted, because they've been bugs before.
|
| struct stream is also used to access stream properties (such as whether something is considered a network stream). Most of these have been mirrored in struct demuxer (because the frontend has been forbidden to access struct stream because of threading). But during initialization, struct stream was still used, so introduce an awkward struct parent_stream_info, which unifies these.
|
| Commit e0419fb181b3d2 changed demux_is_network_cached() to use demuxer->stream->streaming instead of demuxer->is_network. To enable timeline stuff to use the cache anyway, change it so that both flags can contribute to it. The stream NULL-check is obviously due to changes in this commit.
* stream: create memory streams in more straightforward way | wm4 | 2019-09-19 | 4 | -4/+5
| Instead of having to rely on the protocol matching, make a function that creates a stream from a stream_info_t directly. Instead of going through a weird indirection with STREAM_CTRL, add a direct argument for non-text arguments to the open callback. Instead of creating a weird dummy mpv_global, just pass an existing one from all callers. (The latter one is just an artifact from the past, where mpv_global wasn't available everywhere.)
|
| Actually I just wanted a function that creates a stream without any of that bullshit. This goal was slightly missed, since you still need this heavy "constructor" just to setup a shitty struct with some shitty callbacks.
* demux_playlist: extend maximum line size | wm4 | 2019-09-19 | 1 | -1/+1
| Raise it from 8KB to 512KB. Do this because ytdl_hook.lua generated a 40KB EDL file (from 80KB youtube-dl JSON output), and putting it into a .m3u file for easier debugging failed due to the size limit.
* demux: fix backward demuxing not grabbing all audio packets | wm4 | 2019-09-19 | 1 | -5/+5
| The previous commit broke audio playback (maybe this is what 4. was about?). But it wasn't the fault of the commit; it just exposed pre-existing issues.
|
| If the packet queue search can't get all packets, it checked queue->is_bof to see whether there could be previous packets. But obviously, is_bof can be set, even if the search start packet wasn't the first one. This was especially observable with audio, because audio by default needs preroll packets, and plays artifacts if they're missing.
|
| Fix by using the BOF playback condition for this purpose too.
* demux: another questionable backwards playback mud party | wm4 | 2019-09-19 | 1 | -5/+41
| In theory, backward demuxing does way too much work by doing a full cache seek every time you need to step back through a packet queue. In theory, it would be exceedingly more efficient to just iterate backwards through the queue, but this is not possible because I'm too stingy to add 8 bytes per packet for backlinks.
|
| (In theory, you could probably come up with some sort of deque, that'd allow efficient iteration into any direction, but other requirements make this tricky, and I'm currently too dumb/lazy to do this. For example, the queue can grow to millions of elements, all while packet pointers need to stay valid.)
|
| Another possibility is to "locally" seek the queue. This still has less overhead than a full seek.
|
| Also, it just so happens that, as a side effect, this avoids performing range merging, which commit f4b0e7e942 "broke". That wasn't a bug at all, but since range joining is relatively slow, avoiding it is good. This is really just a coincidental side effect, I'm not even quite sure why it happens this way.
|
| There are 4 ugly things about this change:
|
| 1. To get a keyframe "before" a certain one, we recompute the target PTS, and then subtract 0.001 as arbitrary number to "fudge" it. This isn't the first place where this is done, and hey, it wasn't my damn idea that MPlayer should use floats for timestamps. (At first, it even used 32 bit timestamps.)
|
| 2. This is the first time reader_head is reset to an earlier position outside of the seek code. This might cause conceptual problems since this code is now "duplicated".
|
| 3. In theory, find_backward_restart_pos() needs to be run again after the backstep. Normally, the seek code calls it explicitly. We could call it right in the new code, but then the damn function would be recursive. We could shuffle the code around to make it a loop, but even then there'd be an off chance of running into an unexpected corner case (aka subtle bug), where it would loop forever. To avoid refactoring the code and having to think too hard about it, make it deferred - add some new state and the check_backward_seek() function for this. Great, even more subtle mutable state for this backwards shit.
|
| 4. I forgot this one, but I can assure you, it's bad.
|
| Without doubt someone will have to clean up this slightly in the future (or rip it out), but it won't be me.
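Point 1 of the list above (finding the keyframe "before" a given one by subtracting a small arbitrary amount from the target PTS) can be sketched as below. This is an assumption-laden illustration in Python: keyframe_before and its parameters are hypothetical names, and the real code searches a packet queue rather than a plain list of timestamps.

```python
from typing import Optional, Sequence


def keyframe_before(keyframe_pts: Sequence[float],
                    current_pts: float,
                    fudge: float = 0.001) -> Optional[float]:
    """Return the latest keyframe PTS strictly before current_pts.

    Subtracting `fudge` (0.001 in the commit message) pushes the seek
    target just below the current keyframe, so a "find keyframe at or
    before target" search lands on the previous one.
    """
    target = current_pts - fudge
    candidates = [pts for pts in keyframe_pts if pts <= target]
    return max(candidates) if candidates else None
```

With floating-point timestamps, the fudge value has to be smaller than the spacing between keyframes, otherwise a keyframe would be skipped; that is one reason the message calls the number arbitrary.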
* demux: remove some redundancy in backward playback code | wm4 | 2019-09-19 | 1 | -6/+5
| This code tries to determine the "current" position, which is used as base for the seek target when it needs to seek back more (the point is to prevent seeking back too far). But compute_keyframe_times() almost computes the same thing, so use that. Unfortunately needs a forward declaration.
|
| ("Almost", because it differs in some details that should not really matter.)
* demux_mkv: fix subtitle preroll in some cases | wm4 | 2019-09-19 | 1 | -7/+6
| Subtitle packets with a timestamp before the seek target may overlap with the seek target anyway. This is why this subtitle preroll crap exists: it needs to return packets before the seek target to ensure that the subtitle is displayed at the seek target.
|
| This didn't always work. Maybe it's a regression, but it must have been an old one. The breakage is triggered by a heuristic that is meant to prevent excessive queuing of packets in garbage files (this heuristic apparently became immediately necessary when this preroll mechanism was implemented). If a video keyframe packet was found, but no audio packet yet, then subtitle_preroll was set to 0, and since a_skip_to_keyframe was still 0, the subtitle packet was discarded.
|
| The dumb thing is that subtitle and video seeking is finished at this point, so the preroll crap should not be applied at all. Fix this by moving the preroll overflow code into the block that handles preroll.
* demux: turn some redundant assignments into asserts | wm4 | 2019-09-19 | 1 | -3/+5
| demux_packet.next should not be used outside of demux.c, and in this case it's a packet that was just passed to demux.c from the outside.
|
| demux_packet.stream is already set by the demuxer, and this is assured by the add_packet_locked() caller.
* demux: move a function | wm4 | 2019-09-19 | 1 | -14/+12
| The new location makes just as much sense (or more, since it's close to its per-stream companion function), and we don't need a forward declaration.
* demux: disable backward demuxing if it fatally fails | wm4 | 2019-09-19 | 1 | -0/+13
| We don't care much about this case, because backward playback can fail terribly without a good way to detect it, so this was fine. However, this froze in certain situations. Reading from a subtitle file for which backward demuxing failed could make it get stuck in demux_read_packet_async() in unthreaded mode. (That we don't support backwards subtitle decoding anyway doesn't matter for this.)
|
| So aggressively disable backward demuxing to prevent worse in these situations. The behavior will still be awful, because the frontend is still in backwards playback mode, but at least it won't freeze.
* demux: add a on-disk cache | wm4 | 2019-09-19 | 6 | -39/+453
| Somewhat similar to the old --cache-file, except for the demuxer cache. Instead of keeping packet data in memory, it's written to disk and read back when needed.
|
| The idea is to reduce main memory usage, while allowing fast seeking in large cached network streams (especially live streams). Keeping the packet metadata on disk would be rather hard (would use mmap or so, or rewrite the entire demux.c packet queue handling), and since it's relatively small, just keep it in memory.
|
| Also for simplicity, the disk cache is append-only. If you're watching really long livestreams, and need pruning, you're probably out of luck. This still could be improved by trying to free unused blocks with fallocate(), but since we're writing multiple streams in an interleaved manner, this is slightly hard.
|
| Some rather gross ugliness in packet.h: we want to store the file position of the cached data somewhere, but on 32 bit architectures, we don't have any usable 64 bit members for this, just the buf/len fields, which add up to 64 bit - so the shitty union aliases this memory.
|
| Error paths untested. Side data (the complicated part of trying to serialize ffmpeg packets) untested.
|
| Stream recording had to be adjusted. Some minor details change due to this, but probably nothing important.
|
| The change in attempt_range_joining() is because packets in cache have no valid len field. It was a useful check (heuristically finding broken cases), but not a necessary one.
|
| Various other approaches were tried. It would be interesting to list them and to mention the pros and cons, but I don't feel like it.
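The split described above (packet payloads appended to a file, small metadata kept in memory with the file position) can be sketched like this. It is a minimal Python illustration with hypothetical names (PacketMeta, DiskPacketCache); it does not model mpv's packet struct, side data serialization, or error handling.

```python
import tempfile
from dataclasses import dataclass


@dataclass
class PacketMeta:
    """In-memory metadata; only the payload itself lives on disk."""
    pts: float
    offset: int  # file position of the payload in the cache file
    length: int  # payload size in bytes


class DiskPacketCache:
    """Append-only cache: payloads are never overwritten or pruned."""

    def __init__(self) -> None:
        self._file = tempfile.TemporaryFile()  # binary read/write mode
        self._end = 0                          # current append position
        self.index: list[PacketMeta] = []      # metadata stays in memory

    def append(self, pts: float, payload: bytes) -> PacketMeta:
        self._file.seek(self._end)
        self._file.write(payload)
        meta = PacketMeta(pts, self._end, len(payload))
        self._end += len(payload)
        self.index.append(meta)
        return meta

    def read(self, meta: PacketMeta) -> bytes:
        self._file.seek(meta.offset)
        return self._file.read(meta.length)


cache = DiskPacketCache()
first = cache.append(0.0, b"video")
second = cache.append(0.04, b"audio!")
```

Since packets from several streams are appended in arrival order, their payloads end up interleaved in the file, which is why the message notes that freeing unused blocks would be slightly hard.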
* demux: move comment to slightly better location | wm4 | 2019-09-19 | 1 | -1/+1
|
* demux: fix excessive backwards seeking with backwards playback | wm4 | 2019-09-19 | 1 | -1/+2
| Backwards demuxing usually seeks back by a "random" amount (set by a user option) when it needs new preceding packets. It turns out a past change made these backwards seek amounts add up when it didn't need to (i.e. subtracting the amount from the seek pos without properly resetting it), which could possibly slow down playback as it went on.
|
| The reason for this was that back_seek_pos was set for every stream on every seek. This made the reset not affect other streams (in particular streams which weren't used and never were reset, or which didn't reset that often). But as the commit adding it showed, this is needed only to set the initial position. So do that.
|
| Fixes: "demux: fix initial backward demuxing state in some cases"
* demux: fix minor seek_preroll consistency issue | wm4 | 2019-09-19 | 1 | -0/+2
| When packet appending sets the start of the range, it adjusts the range by seek_preroll. Do this when packets are pruned from the start of the range too.
|
| (Yeah, seek_preroll handling is probably broken in some other cases. It was halfhearted to begin with.)
* demux: mess with seek range updates and pruning | wm4 | 2019-09-19 | 3 | -118/+156
| The main thing this commit does is removing demux_packet.kf_seek_pts. It gets rid of 8 bytes per packet. Which doesn't matter, but whatever.
|
| This field was involved with much of seek range updating and pruning, because it tracked the canonical seek PTS (i.e. start PTS) of a packet range. We have to deal with timestamp reordering, and assume the start PTS is the lowest PTS across all packets (not necessarily just the first packet). So knowing this PTS requires looping over all packets of a range (no, the demuxer isn't going to tell us, that would be too sane).
|
| Having this as packet field was perfectly fine. I'm just removing it because I started hating extra packet fields recently. Before this commit, this value was cached in the kf_seek_pts field (and computed "incrementally" when adding packets). This commit computes the value on demand (compute_keyframe_times()) by iterating over the packet list. There is some similarity with the state before 10d0963d851fa, where I introduced the kf_seek_pts field - maybe I'm just moving in circles. The commit message claims something about quadratic complexity, but if the code before that had this problem, this new commit doesn't reintroduce it, at least. (See below.)
|
| The pruning logic is simplified (I think?) - there is no "incremental" cached pruning decision anymore (next_prune_target is removed), and instead it simply prunes until the next keyframe like it's supposed to. I think this incremental stuff was only there because of very old code that got refactored away before. I don't even know what I was thinking there, it just seems complex.
|
| Now the seek range is updated when a keyframe packet is removed. Instead of using the kf_seek_pts field, queue->seek_start is used to determine the stream with the lowest timestamp, which should be pruned first. This is different, but should work well. Doing the same as the previous code would require compute_keyframe_times(), which would introduce quadratic complexity.
|
| On the other hand, it's fine to call compute_keyframe_times() when the seek range is recomputed on pruning, because this is called only once per removed keyframe packet. Effectively, this will iterate over the packet list twice instead of once, and with some locality. The same happens when packets are appended - it loops over the recently added packets once again. (And not more often, which would go above linear complexity.)
|
| This introduces some "cleverness" with avoiding calling update_seek_ranges() even when keyframe packets are added/removed, which is not really tightly coupled to the new code, and could have been in a separate commit. Removin