summaryrefslogtreecommitdiffstats
path: root/DOCS/tech/realcodecs/audio-codecs.txt
blob: 8a0d958354eacaef8f6a9ea38693294cd4e404be (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
all audio codecs (cook,atrk,14_4,28_8,dnet,sipr) have the same interface,
but i have only analyzed the cook codec


audio properties (hex)

00 short text/description of the format (bitrate, when to use)
01 bitrate (bits/s) //avg. bytes/sec output
02 ulong: ?
   ulong: samples per second
   ushort: bits/sample
   ushort: number of channels
03 same as 02 //constant 2
04 long description
05 constant 1 (always?)
06 ulong: block align (input frame size for RADecode)
07 string: minimum player version
08 n/a
09 n/a
0A n/a
0B n/a
0C n/a
0D ?
0E ? leaf size
0F ?
10 ?
11 ?
12 ?
13 min. output buffer size? max. number of samples?
14 ?




functions:

ulong result=RAOpenCodec2(ra_main_t *raMain);

ulong result=RAInitDecoder(ra_main_t *raMain, ra_init_struct *raInit);
struct ra_init_struct {
	ulong sample_rate;
	ushort bits_per_sample;	// unused by RAInitDecoder
	ushort number_of_channels;
	ushort unknown1;	// 0
	ushort unknown2;	// also unused (100)
	ulong leaf_size;	// leaf size (used for interleaving, but
				// exists in audio stream description header (ASDH))
	ulong block_align;	// packet size
	ulong bits_per_sample;	// unused (always 16)
	char *ext_data;		// 16 bytes located at the end of the
				// ASDH
};

There are some information missing that you usually need for playback,
like bits per sample (the fileds aren't read by RAInitDecoder()). These
are hard coded in the "flavors", i.e. the sub formats. A flavor is an entry
in the list of available format variations like bitrate, number of channels,
decoding algorithm, and so on.We can get those information with the
following command:



void *GetRAFlavorProperty(ra_main_t *raMain, ulong flavor, ulong property,
	short *property_length_in_bytes);
returns property data for a specific data

This is not important, because it's just a read only function.
These flavor properties don't seem to exist in


ulong RADecode(ra_main_t *raMain, char *input_buffer,
	ulong input_buffer_size, char *output_buffer,
	ulong *decoded_bytes, ulong p6=-1);

RAFreeDecoder(ra_main_t *);

RACloseCodec(ra_main_t *);


ulong RASetFlavor(ra_main_t *ra_main, ulong flavor);

Set the flavor of the stream.

a flavor is an entry in the list of available format variations like
bitrate, number of channels, decoding algorithm, and so on


audio data storage:
-------------------

With Real Audio V5 (or earlier?), the audio streams can be interleaved,
i.e. the stream is striped amongst several data packets. The packets
(which have a fixed size packet_len) are split up into a fixed number
of num_parts equally sized parts - I call them leaves in lack of
better name. The leaves have the size leaf_size = packet_len / num_parts.

To create a bunch of packets, you need 2*num_parts stream packets.
The first part of the first stream packet is stored in leaf number 0,
the first part of the second into leaf number num_parts, the one of the
next one into leaf number 1 etc. The following part of a stream packet
is stored 2*num_packets behind the current part of the same stream packet.

In short words: when you have a matrix with the leaves as the values,
it's a transposition in conjunction with a permutation.

packet | leaf          | stream packet, part no.
-------+---------------+------------------------
0      | 0             | (0,0)
0      | 1             | (2,0)
.      | .             | .
.      | .             | .
0      | num_parts-1   | (2*num_parts-2,0)
0      | num_parts     | (1,0)
0      | num_parts+1   | (3,0)
.      | .             | .
.      | .             | .
0      | 2*num_parts-1 | (2*num_parts-1,0)
1      | 0             | (0,1)
.      | .             | .
.      | .             | .


sequence of calls:
------------------

RAOpenCodec2()

RAInitDecoder()

RASetFlavor()

RAGetFlavorProperty(0xE)

sequence of RADecode()s

once a RAGetFlavorProperty(0xE) after some RADecode()s

and occasionally the following sequence:
RAGetFlavorProperty(0)
RAGetFlavorProperty(7)
which is rather pointless because they only return
cleartext audio descriptions

RAFreeDecoder()

RACloseCodec()



RAFlush(ra_main_t *raMain, char *output_buffer, ulong *retval)
will be called when seeking
output_buffer points to the output buffer from the last
decode operation.
retval is unknown, returning always 0x18 in a specific sample
-> further investigation needed