path: root/DOCS/tech/colorspaces.txt
In general
==========

There are planar and packed modes.
- Planar mode means: you have 3 separate images, one for each component,
each image 8 bits/pixel. To get the real colored pixel, you have to
mix the components from all planes. The resolution of planes may differ!
- Packed mode means: you have all components mixed/interleaved together,
so you have small "packs" of components in a single, big image.
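
A minimal sketch of the difference, in C. The names planar_image, planar_get
and packed_get are made up for this illustration (they are not MPlayer API),
and the packed case assumes a simple layout with 3 interleaved bytes per pixel:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical planar image: one separate 8-bit image per component,
 * each with its own line length (stride, in bytes). */
typedef struct {
    const uint8_t *plane[3];
    int stride[3];
} planar_image;

/* Planar: pick the plane first, then index inside that plane. */
uint8_t planar_get(const planar_image *img, int c, int x, int y)
{
    assert(c >= 0 && c < 3);
    return img->plane[c][(size_t)y * img->stride[c] + x];
}

/* Packed (assumed 3 bytes/pixel): all components of a pixel sit next to
 * each other in one big image, so the component index is a byte offset. */
uint8_t packed_get(const uint8_t *data, int stride, int c, int x, int y)
{
    assert(c >= 0 && c < 3);
    return data[(size_t)y * stride + 3 * (size_t)x + c];
}
```

Note that planar_get() assumes all three planes have the same resolution; with
subsampled planes, x and y have to be scaled down for the smaller planes.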

There are RGB and YUV colorspaces.
- RGB: Red, Green and Blue components. Used by analog VGA monitors.
- YUV: Luminance (Y) and Chrominance (U,V) components. Used by some
  video systems, like PAL. Also most m(j)peg/dct based codecs use this.

With YUV, the resolution of the U,V planes is usually reduced.
The most common YUV formats:
fourcc:    bpp: sampling:  plane sizes: (w=width h=height of original image)
?          24   YUV 4:4:4  Y: w * h  U,V: w * h
YUY2,UYVY  16   YUV 4:2:2  Y: w * h  U,V: (w/2) * h
YV12,I420  12   YUV 4:2:0  Y: w * h  U,V: (w/2) * (h/2)
YVU9        9   YUV 4:1:0  Y: w * h  U,V: (w/4) * (h/4)
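
The bpp column follows directly from the plane sizes. A small sketch in C
(function names are hypothetical, and it assumes that w and h are exact
multiples of the subsampling factors, so no rounding up is needed):

```c
#include <assert.h>
#include <stddef.h>

/* Bytes in one chroma (U or V) plane. hdiv and vdiv are the horizontal
 * and vertical subsampling factors (1, 2 or 4 in the table above). */
size_t chroma_plane_bytes(int w, int h, int hdiv, int vdiv)
{
    assert(w > 0 && h > 0 && hdiv > 0 && vdiv > 0);
    /* Assumption: dimensions divide evenly by the factors. */
    assert(w % hdiv == 0 && h % vdiv == 0);
    return (size_t)(w / hdiv) * (size_t)(h / vdiv);
}

/* One full-size Y plane plus two chroma planes, 8 bits per sample. */
size_t yuv_frame_bytes(int w, int h, int hdiv, int vdiv)
{
    return (size_t)w * (size_t)h + 2 * chroma_plane_bytes(w, h, hdiv, vdiv);
}

/* Average bits per pixel: 8 * total bytes / number of pixels. */
double yuv_bits_per_pixel(int w, int h, int hdiv, int vdiv)
{
    return 8.0 * (double)yuv_frame_bytes(w, h, hdiv, vdiv)
               / ((double)w * (double)h);
}
```

For a 16x16 image the factors (1,1), (2,1), (2,2) and (4,4) give 768, 512, 384
and 288 bytes, i.e. 24, 16, 12 and 9 bpp, matching the four rows of the table.
The packed formats (YUY2, UYVY) carry the same number of samples, only
interleaved in one image instead of stored as separate planes.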

Conversion (some cut'n'paste from the web and the mailing list):

RGB to YUV Conversion:
    Y  =      (0.257 * R) + (0.504 * G) + (0.098 * B) + 16
    Cr = V =  (0.439 * R) - (0.368 * G) - (0.071 * B) + 128
    Cb = U = -(0.148 * R) - (0.291 * G) + (0.439 * B) + 128
YUV to RGB Conversion:
    B = 1.164(Y - 16)                  + 2.018(U - 128)
    G = 1.164(Y - 16) - 0.813(V - 128) - 0.391(U - 128)
    R = 1.164(Y - 16) + 1.596(V - 128)

In both these cases, you have to clamp the output values to keep them in
the [0-255] range. The formulae above assume the reduced ITU-R BT.601 ranges
(Y in [16-235], U,V in [16-240]); I've also seen an RGB range of [16-235]
mentioned, but clamping the values into [0-255] seems to produce acceptable
results to me.
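
The formulae and the clamping can be sketched in C as follows. This is only
an illustration using floating point for readability (real converters usually
use fixed-point tables); the function names are made up for this example:

```c
#include <assert.h>
#include <stdint.h>

/* Round to nearest and clamp into [0-255]. */
uint8_t clamp_u8(double v)
{
    if (v < 0.0)   return 0;
    if (v > 255.0) return 255;
    return (uint8_t)(v + 0.5);
}

void rgb_to_yuv(uint8_t r, uint8_t g, uint8_t b,
                uint8_t *y, uint8_t *u, uint8_t *v)
{
    assert(y && u && v);
    *y = clamp_u8( 0.257 * r + 0.504 * g + 0.098 * b +  16.0);
    *v = clamp_u8( 0.439 * r - 0.368 * g - 0.071 * b + 128.0); /* Cr */
    *u = clamp_u8(-0.148 * r - 0.291 * g + 0.439 * b + 128.0); /* Cb */
}

void yuv_to_rgb(uint8_t y, uint8_t u, uint8_t v,
                uint8_t *r, uint8_t *g, uint8_t *b)
{
    double c = 1.164 * (y - 16);   /* luma term shared by R, G and B */

    assert(r && g && b);
    *b = clamp_u8(c                       + 2.018 * (u - 128));
    *g = clamp_u8(c - 0.813 * (v - 128) - 0.391 * (u - 128));
    *r = clamp_u8(c + 1.596 * (v - 128));
}
```

With these functions black (0,0,0) maps to Y=16, U=V=128 and white
(255,255,255) maps to Y=235, U=V=128. Inputs outside the nominal ranges (for
example Y=255 with V=255) are where the clamping actually kicks in.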

Julien (surname unknown) suggests that there are problems with the above
formulae and proposes the following instead:
         Y = 0.299R + 0.587G + 0.114B
    Cb = U'= (B-Y)*0.565
    Cr = V'= (R-Y)*0.713
with reciprocal versions:
    R = Y + 1.403V'
    G = Y - 0.344U' - 0.714V'
    B = Y + 1.770U'
Note: these formulae don't contain the +128 offsets of the U,V values!
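
Since U' and V' are signed here, storing them in unsigned bytes needs the
offset added back. A hedged sketch in C, assuming the usual +128 offset is
what is wanted (the function names are invented for this example):

```c
#include <assert.h>
#include <stdint.h>

/* Round to nearest and clamp into [0-255]. */
uint8_t to_u8(double v)
{
    if (v < 0.0)   return 0;
    if (v > 255.0) return 255;
    return (uint8_t)(v + 0.5);
}

void rgb_to_yuv_full(uint8_t r, uint8_t g, uint8_t b,
                     uint8_t *y, uint8_t *u, uint8_t *v)
{
    double yf = 0.299 * r + 0.587 * g + 0.114 * b;

    assert(y && u && v);
    *y = to_u8(yf);
    *u = to_u8((b - yf) * 0.565 + 128.0);  /* U' plus assumed offset */
    *v = to_u8((r - yf) * 0.713 + 128.0);  /* V' plus assumed offset */
}

void yuv_to_rgb_full(uint8_t y, uint8_t u, uint8_t v,
                     uint8_t *r, uint8_t *g, uint8_t *b)
{
    double uf = u - 128.0;  /* back to signed U' */
    double vf = v - 128.0;  /* back to signed V' */

    assert(r && g && b);
    *r = to_u8(y + 1.403 * vf);
    *g = to_u8(y - 0.344 * uf - 0.714 * vf);
    *b = to_u8(y + 1.770 * uf);
}
```

Unlike the first set of formulae, Y uses the full [0-255] range here: black
gives Y=0 and white gives Y=255, with U=V=128 in both cases.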

Conclusion:
Y = luminance, the weighted average of R G B components. (0=black 255=white)
U = Cb = blue component (0=green 128=grey 255=blue)
V = Cr = red component  (0=green 128=grey 255=red)


Huh. The planar YUV modes.
==========================

The most misunderstood thingie...

In MPlayer, we usually have 3 pointers to the Y, U and V planes, so it
doesn't matter in what order the planes are stored in memory:
    for mp_image_t and libvo's draw_slice():
	planes[0] = Y = luminance
	planes[1] = U = Cb = blue
	planes[2] = V = Cr = red
    Note: planes[1] is ALWAYS U, and planes[2] is V, the fourcc
    (YV12 vs. I420) doesn't matter here! So, every codec using all 3 pointers
    (not only the first one) normally supports YV12 and I420 (=IYUV) too!

But there are some codecs (vfw, dshow) and vo drivers (xv) that ignore the
2nd and 3rd pointers and use only a single pointer to the planar yuv image.
In this case we must know the right order and alignment of planes in memory!

from the webartz fourcc list:
YV12:  12 bpp, full sized Y plane followed by 2x2 subsampled V and U planes
I420:  12 bpp, full sized Y plane followed by 2x2 subsampled U and V planes
IYUV:  the same as I420
YVU9:   9 bpp, full sized Y plane followed by 4x4 subsampled V and U planes
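
Given such a single pointer, the three plane pointers can be recovered from
the layouts listed above. A minimal sketch in C, assuming the planes are
tightly packed (line length equal to plane width, no padding between planes)
and that w and h divide evenly; real buffers may need extra alignment. The
enum and function names are hypothetical, not MPlayer API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum planar_layout { LAYOUT_YV12, LAYOUT_I420, LAYOUT_YVU9 };

/* Fill planes[0..2] so that planes[1] is always U and planes[2] is
 * always V, whatever order the fourcc stores them in. */
void split_planes(uint8_t *base, int w, int h, enum planar_layout fmt,
                  uint8_t *planes[3])
{
    int sub = (fmt == LAYOUT_YVU9) ? 4 : 2;   /* 4x4 or 2x2 subsampling */
    size_t luma, chroma;
    uint8_t *first, *second;

    assert(base && planes && w > 0 && h > 0);
    assert(w % sub == 0 && h % sub == 0);     /* assumption: no rounding */

    luma   = (size_t)w * (size_t)h;
    chroma = (size_t)(w / sub) * (size_t)(h / sub);
    first  = base + luma;                     /* plane right after Y */
    second = first + chroma;                  /* plane after that one */

    planes[0] = base;
    if (fmt == LAYOUT_I420) {                 /* Y, then U, then V */
        planes[1] = first;
        planes[2] = second;
    } else {                                  /* YV12, YVU9: Y, then V, then U */
        planes[1] = second;
        planes[2] = first;
    }
}
```

For a 16x8 image, I420 puts U at offset 128 and V at offset 160, while YV12
puts V at 128 and U at 160; mixing them up swaps the blue and red components.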