author     Uoti Urpala <uau@glyph.nonexistent.invalid>    2010-04-26 18:25:34 +0300
committer  Uoti Urpala <uau@glyph.nonexistent.invalid>    2010-04-26 18:25:34 +0300
commit     7795726e0f8c70edd6ecde7fd2137214af302f4f (patch)
tree       87a087e69a0e2912183736de409676f824fb2248 /DOCS/tech/general.txt
parent     ba3b65b92f3f822fa75b0210b841557f5b20f6d1 (diff)
parent     e16f02fe4001f3056b8efd1a099a563569b73f5d (diff)
Merge svn changes up to r31033
Diffstat (limited to 'DOCS/tech/general.txt')
-rw-r--r--  DOCS/tech/general.txt | 252
1 file changed, 126 insertions(+), 126 deletions(-)
diff --git a/DOCS/tech/general.txt b/DOCS/tech/general.txt
index 631ee3f9de..4ca3671cf3 100644
--- a/DOCS/tech/general.txt
+++ b/DOCS/tech/general.txt

@@ -14,64 +14,64 @@ The main modules:

2. demuxer.c: this does the demultiplexing (separating) of the input into
   audio, video or dvdsub channels, and their reading by buffered packages.
   demuxer.c is basically a framework, which is the same for all the
   input formats, and there are parsers for each of them (mpeg-es,
   mpeg-ps, avi, avi-ni, asf); these are in the demux_*.c files.
   The structure is demuxer_t. There is only one demuxer.

2.a. demux_packet_t, that is DP.
   Contains one chunk (avi) or packet (asf, mpg). The packets are stored
   in memory as a linked list, because of their differing sizes.

2.b. demuxer stream, that is DS. Struct: demux_stream_t
   Every channel (a/v/s) has one. This contains the packets for the
   stream (see 2.a). For now, there can be 3 for each demuxer:
   - audio (d_audio)
   - video (d_video)
   - DVD subtitle (d_dvdsub)

2.c. stream header. There are 2 types (for now): sh_audio_t and sh_video_t
   This contains every parameter essential for decoding, such as
   input/output buffers, chosen codec, fps, etc. There is one for each
   stream in the file: at least one for video, another one if sound is
   present, and if there are more streams, one structure for each.
   These are filled according to the header (avi/asf), or demux_mpg.c
   does it (mpg) when it finds a new stream. When a new stream is found,
   the ====> Found audio/video stream: <id> message is displayed.

   The chosen stream header and its demuxer stream are connected together
   (ds->sh and sh->ds) to simplify their usage, so it's enough to pass
   either the ds or the sh, depending on the function.

   For example: we have an asf file with 6 streams inside it, 1 audio and
   5 video. While reading the header, 6 sh structs are created, 1 audio
   and 5 video. When it starts reading the packets, it chooses the streams
   of the first found audio and video packet, and sets the sh pointers of
   d_audio and d_video according to them. So later it reads only these
   streams. Of course the user can force choosing a specific stream with
   the -vid and -aid switches.
   A good example of this is DVD, where the English stream is not always
   the first one, so every VOB has a different language :)
   That's when we have to use, for example, the -aid 128 switch.
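   Since 2.a-2.c name several structs, here is a condensed sketch in C of
   how DP, DS and the stream headers point at each other. The field names
   are illustrative only; the real definitions in the demuxer headers
   contain many more members:

       /* Simplified sketch of the relationships described in 2.a-2.c. */
       typedef struct demux_packet {
           int len;                    /* size of this chunk/packet in bytes */
           double pts;                 /* presentation timestamp, if known   */
           unsigned char *buffer;      /* the compressed data itself         */
           struct demux_packet *next;  /* packets form a singly linked list  */
       } demux_packet_t;

       typedef struct demux_stream {
           demux_packet_t *first;      /* oldest buffered packet             */
           demux_packet_t *last;       /* newest buffered packet             */
           int packs;                  /* number of packets in the list      */
           void *sh;                   /* ds->sh: the chosen stream header   */
       } demux_stream_t;

       typedef struct sh_video {
           float fps;                  /* frames per second, if fixed        */
           unsigned int format;        /* fourcc of the chosen codec         */
           demux_stream_t *ds;         /* sh->ds: back-pointer to the stream */
       } sh_video_t;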
   Now, how does this reading work?
   - demuxer.c/demux_read_data() is called; it is told how many bytes we
     would like to read, to where (memory address), and from which DS.
     The codecs call this.
   - This checks whether the given DS's buffer contains something; if so,
     it reads from there as much as needed. If there isn't enough, it
     calls ds_fill_buffer(), which:
   - checks whether the given DS has buffered packages (DP's); if so, it
     moves the oldest to the buffer and reads on. If the list is empty,
     it calls demux_fill_buffer():
   - this calls the parser for the input format, which reads the file
     onward and moves the found packages to their buffers.
   If we'd like an audio package but only a bunch of video packages are
   available, then sooner or later the
       DEMUXER: Too many (%d in %d bytes) audio packets in the buffer
   error shows up.
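   A condensed sketch of that three-level fallback. This is a
   self-contained variant of the earlier struct sketch, trimmed to the
   fields the read path needs; the real functions in demuxer.c take more
   parameters and handle seeking, flushing and EOF flags as well:

       #include <string.h>

       typedef struct demux_packet {
           int len;
           unsigned char *buffer;
           struct demux_packet *next;
       } demux_packet_t;

       typedef struct demux_stream {
           demux_packet_t *first;      /* buffered DPs, oldest first  */
           unsigned char *buffer;      /* data of the current packet  */
           int buffer_pos, buffer_len;
       } demux_stream_t;

       /* stand-in for the per-format parser: reads the file onward and
        * appends new DPs to the right stream; returns 0 at end of file */
       extern int demux_fill_buffer(demux_stream_t *ds);

       static int ds_fill_buffer(demux_stream_t *ds)
       {
           while (!ds->first)              /* DP list empty: parse onward */
               if (!demux_fill_buffer(ds))
                   return 0;               /* EOF (or error)              */
           demux_packet_t *dp = ds->first; /* pop the oldest packet ...   */
           ds->first = dp->next;
           ds->buffer = dp->buffer;        /* ... and make it the current */
           ds->buffer_len = dp->len;       /*     read buffer             */
           ds->buffer_pos = 0;             /* (freeing the DP is omitted) */
           return 1;
       }

       int demux_read_data(demux_stream_t *ds, unsigned char *mem, int len)
       {
           int bytes = 0;
           while (len > 0) {
               if (ds->buffer_pos >= ds->buffer_len && !ds_fill_buffer(ds))
                   break;                  /* nothing left anywhere */
               int x = ds->buffer_len - ds->buffer_pos;
               if (x > len)
                   x = len;
               memcpy(mem + bytes, ds->buffer + ds->buffer_pos, x);
               ds->buffer_pos += x;
               bytes += x;
               len -= x;
           }
           return bytes;                   /* may be short at EOF */
       }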
2.d. video.c: this file/function handles the reading and assembling of
   the video frames. Each call to video_read_frame() should read and
   return a single video frame [...]

@@ -101,7 +101,7 @@ Now, go on:

   The given stream's actual position is in the 'timer' field of the
   corresponding stream header (sh_audio / sh_video).

   The structure of the playing loop:

       while(not EOF) {
           fill audio buffer (read & decode audio) + increase a_frame
           read & decode a single video frame + increase v_frame
           [...]
           handle events (keys, lirc etc) -> pause, seek, ...
       }

@@ -111,89 +111,89 @@ Now, go on:

   When playing (a/v), it increases these variables by the duration of
   the played a/v:
   - with audio this is played bytes / sh_audio->o_bps
     Note: i_bps = number of compressed bytes for one second of audio
           o_bps = number of uncompressed bytes for one second of audio
           (this is = bps*samplerate*channels)
   - with video this is usually == 1.0/fps, but note that fps doesn't
     really matter for video; asf, for example, doesn't have it, but has
     a "duration" instead, which can change per frame. MPEG2 has
     "repeat_count", which delays the frame by 1-2.5 ...
     Maybe only AVI and MPEG1 have a fixed fps.

   So everything works fine as long as the audio and video are in
   perfect synchronity: the audio plays on, it gives the timing, and
   when the time of a frame has passed, the next frame is displayed.
   But what if these two aren't synchronized in the input file?
   PTS correction kicks in. The input demuxers read the PTS
   (presentation timestamp) of the packages, and from it we can see
   whether the streams are synchronized. MPlayer can then correct the
   a_frame, within a given maximal bound (see the -mc option). The sum
   of the corrections can be found in c_total.

   Of course this is not everything; several things suck.
   For example the soundcard's delay, which has to be corrected by
   MPlayer! The audio delay is the sum of all these:
   - bytes read since the last timestamp:
     t1 = d_audio->pts_bytes/sh_audio->i_bps
   - if Win32/ACM, the bytes stored in the audio input buffer:
     t2 = a_in_buffer_len/sh_audio->i_bps
   - uncompressed bytes in the audio out buffer:
     t3 = a_buffer_len/sh_audio->o_bps
   - not yet played bytes stored in the soundcard's (or DMA's) buffer:
     t4 = get_audio_delay()/sh_audio->o_bps

   From these we can calculate what PTS we need for the just played
   audio, and after comparing it with the video's PTS we have the
   difference!
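   Put as arithmetic, the four terms combine like this. A minimal
   sketch: the parameter names mirror the text, and none of this is the
   literal mplayer.c code:

       /* Total audio delay in seconds, as the sum described above. */
       double audio_delay_sec(int pts_bytes,       /* d_audio->pts_bytes      */
                              int a_in_buffer_len, /* ACM input buffer fill   */
                              int a_buffer_len,    /* decoded-audio buffer    */
                              int card_bytes,      /* get_audio_delay() value */
                              int i_bps,           /* compressed bytes/sec    */
                              int o_bps)           /* uncompressed bytes/sec  */
       {
           double t1 = (double)pts_bytes       / i_bps;
           double t2 = (double)a_in_buffer_len / i_bps;
           double t3 = (double)a_buffer_len    / o_bps;
           double t4 = (double)card_bytes      / o_bps;
           return t1 + t2 + t3 + t4;  /* the sum the text calls the audio delay */
       }

   The real code combines these terms with the last demuxed timestamp to
   get the PTS of the sample that is audible right now, and compares
   that against the video's PTS.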
   Life didn't get simpler with AVI. There is the "official" timing
   method, the BPS-based one: the header contains how many compressed
   audio bytes or chunks belong to one second of frames.
   In the AVI stream header there are 2 important fields, the
   dwSampleSize and the dwRate/dwScale pair:
   - If dwSampleSize is 0, then it's a VBR stream, so its bitrate isn't
     constant. It means that 1 chunk stores 1 sample, and dwRate/dwScale
     gives the chunks/sec value.
   - If dwSampleSize is >0, then it's constant bitrate, and the time can
     be measured this way: time = (bytepos/dwSampleSize) /
     (dwRate/dwScale) (so the sample's position is divided by the
     samplerate). Now the audio can be handled as a stream which can be
     cut into chunks, but can also be a single chunk.

   The other method can be used only for interleaved files: from the
   order of the chunks, a timestamp (PTS) value can be calculated.
   The PTS of the video chunks is simple: chunk number * fps.
   The audio gets the same PTS as the preceding video chunk.
   We have to pay attention to the so-called "audio preload", that is,
   the delay between the audio and video streams. This is usually
   0.5-1.0 sec, but can be totally different. Formerly the exact value
   had to be measured by hand; now demux_avi.c handles it: at the first
   audio chunk after the first video chunk, it calculates the A/V
   difference and takes this as the measure of the audio preload.
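   The two AVI timing rules can be written down directly. A sketch: the
   struct here just mirrors the two header fields the text names, it is
   not a real MPlayer type:

       #include <stdint.h>

       struct stream_hdr {
           uint32_t dwSampleSize;  /* 0 => VBR (1 chunk = 1 sample)        */
           uint32_t dwRate;        /* dwRate/dwScale = samples (or chunks) */
           uint32_t dwScale;       /*                  per second          */
       };

       /* time in seconds for a position in the audio stream */
       double avi_audio_time(const struct stream_hdr *sh,
                             uint64_t bytepos, uint64_t chunkno)
       {
           double rate = (double)sh->dwRate / sh->dwScale;
           if (sh->dwSampleSize == 0)
               return chunkno / rate;            /* VBR: chunks/sec */
           return (bytepos / (double)sh->dwSampleSize) / rate;  /* CBR */
       }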
3.a. audio playback:

   Some words on audio playback:
   It's not the playing that is hard, but:
   1. knowing when to write into the buffer, without blocking
   2. knowing how much of what we wrote has been played
   The first is needed for audio decoding and for keeping the buffer
   full (so the audio will never skip). The second is needed for correct
   timing, because some soundcards delay by as much as 3-7 seconds,
   which can't be ignored!
   To solve this, OSS gives several possibilities:
   - ioctl(SNDCTL_DSP_GETODELAY): tells how many unplayed bytes are in
     the soundcard's buffer -> perfect for timing, but not all drivers
     support it :(
   - ioctl(SNDCTL_DSP_GETOSPACE): tells how much we can write into the
     soundcard's buffer without blocking. If the driver doesn't support
     GETODELAY, we can use this to work out how large the delay is.
   - select(): should tell whether we can write into the buffer without
     blocking. Unfortunately it doesn't say how much we could :((
     Also, it doesn't work, or works badly, with some drivers.
     It is only used if neither of the above works.
   (A minimal sketch of these OSS queries follows after section 4.)

4. Codecs. They consist of libmpcodecs/* and separate files or libs,
   for example liba52, libmpeg2, xa/*, alaw.c, opendivx/*, loader,
   mp3lib.
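   As referenced in 3.a, here is a minimal sketch of the two OSS
   queries, assuming an already opened /dev/dsp file descriptor; the
   fallback order follows the text:

       #include <sys/ioctl.h>
       #include <sys/soundcard.h>

       /* Returns the number of unplayed bytes queued in the card's
        * buffer, trying GETODELAY first and falling back to GETOSPACE. */
       static int unplayed_bytes(int fd)
       {
           int odelay;
           audio_buf_info info;

           if (ioctl(fd, SNDCTL_DSP_GETODELAY, &odelay) != -1)
               return odelay;               /* exact: unplayed bytes */

           if (ioctl(fd, SNDCTL_DSP_GETOSPACE, &info) != -1)
               /* free space known => delay = total size - writable bytes */
               return info.fragstotal * info.fragsize - info.bytes;

           return -1;  /* neither works: caller must fall back to select() */
       }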