| CELT is a very low delay audio codec designed for high-quality communications.
 | |
| 
 | |
| Traditional full-bandwidth  codecs such as Vorbis and AAC can offer high
 | |
| quality but they require codec delays of hundreds of milliseconds, which
 | |
| makes them unsuitable for real-time interactive applications like
 | |
| teleconferencing. Speech-targeted codecs, such as Speex or G.722, have lower
 | |
| 20-40 ms delays, but their speech focus and limited sampling rates
 | |
| restrict their quality, especially for music.
 | |
| 
 | |
| Additionally, the other mandatory components of a full network audio system
 | |
| (audio interfaces, routers, jitter buffers) each add their own delay. On
 | |
| lower-speed networks, serializing a packet onto the network cable takes
 | |
| considerable time, and over long distances the speed of light imposes a
 | |
| significant delay.
 | |
| 
 | |
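As a rough illustration of the two network effects just mentioned, the sketch below estimates serialization and propagation delay. The function names, the 100-byte packet, the 128 kbit/s link and the 5000 km distance are all assumptions chosen for illustration, not CELT parameters. Real signals travel slower than light in a vacuum, so the propagation figure is a lower bound.

```python
# Rough one-way delay contributions from the network path.
# Packet size, link speed and distance used below are illustrative assumptions.

SPEED_OF_LIGHT_M_PER_S = 299_792_458  # in a vacuum; fiber and copper are slower


def serialization_delay_ms(packet_bytes, link_bits_per_sec):
    """Time to clock one packet onto the link, in milliseconds."""
    return packet_bytes * 8 / link_bits_per_sec * 1000.0


def propagation_delay_ms(distance_m, velocity_factor=1.0):
    """Time for the signal to cover the distance, in milliseconds.

    velocity_factor is the fraction of the vacuum speed of light at which
    the signal travels in the medium (1.0 gives the physical lower bound).
    """
    return distance_m / (SPEED_OF_LIGHT_M_PER_S * velocity_factor) * 1000.0


# Hypothetical example: 100-byte packet, 128 kbit/s link, 5000 km path.
print(serialization_delay_ms(100, 128_000))        # 6.25
print(round(propagation_delay_ms(5_000_000), 1))   # 16.7
```

With these assumed numbers the network alone already costs about 23 ms one way, before any codec or audio-interface delay is counted.
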
| In teleconferencing, it is important to keep delay low so that the participants
 | |
| can communicate fluidly without talking on top of each other and so that their
 | |
| own voices don't return after a round trip as an annoying echo.
 | |
| 
 | |
| For network music performance, research has shown that the total one-way delay
 | |
| must be kept under 25 ms to avoid degrading the musicians' performance.
 | |
| 
 | |
| Since many of the sources of delay in a complete system are outside of the
 | |
| user's control (such as the speed of light), it is often only possible to
 | |
| reduce the total delay by reducing the codec delay. 
 | |
| 
 | |
| Low delay has traditionally been considered a challenging area in audio codec
 | |
| design, because as a codec is forced to work on the smaller chunks of audio
 | |
| required for low delay it has access to less redundancy and less perceptual
 | |
| information which it can use to reduce the size of the transmitted audio.
 | |
| 
 | |
| CELT is designed to bridge the gap between "music" and "speech" codecs,
 | |
| permitting new very high quality teleconferencing applications, and to go
 | |
| further, permitting latencies much lower than speech codecs normally provide
 | |
| to enable applications such as remote musical collaboration even over long
 | |
| distances.  
 | |
| 
 | |
| In keeping with the Xiph.Org mission, CELT is also designed to accomplish
 | |
| this without copyright or patent encumbrance. Only by keeping the formats
 | |
| that drive our Internet communication free and unencumbered can we maximize
 | |
| innovation, collaboration, and interoperability.  Fortunately, CELT is ahead
 | |
| of the adoption curve in its target application space, so there should be 
 | |
| no reason for someone who needs what CELT provides to go with a proprietary
 | |
| codec.
 | |
| 
 | |
| CELT has been tested on x86, x86_64, ARM, and the TI C55x DSPs, and should
 | |
| be portable to any platform with a working C compiler and on the order of
 | |
| 100 MIPS of processing power. 
 | |
| 
 | |
| The code is still at an early stage, so it may be broken from time to time, and
 | |
| the bit-stream is not frozen yet, so it is different from one version to 
 | |
| another. Oh, and don't complain if it sets your house on fire.
 | |
| 
 | |
| Complaints and accolades can be directed to the CELT mailing list:
 | |
| http://lists.xiph.org/mailman/listinfo/celt-dev/
 | |
| 
 | |
| To compile:
 | |
| % ./configure
 | |
| % make
 | |
| 
 | |
| For platforms without fast floating point support (such as ARM) use the
 | |
| --enable-fixed argument to configure to build a fixed-point version of CELT.
 | |
| 
 | |
| There are Ogg-based encode/decode tools in tools/. These are quite similar to
 | |
| the speexenc/speexdec tools. Use the --help option for details.
 | |
| 
 | |
| There is also a basic tool for testing the encoder and decoder called
 | |
| "testcelt" located in libcelt/: 
 | |
| 
 | |
| % testcelt <rate> <channels> <frame size> <bytes per packet> input.sw output.sw
 | |
| 
 | |
| where input.sw is a 16-bit (machine endian) audio file sampled at 32000 Hz to 
 | |
| 96000 Hz. The output file is already decompressed.  
 | |
| 
 | |
| For example, for a 44.1 kHz mono stream at ~64kbit/sec and with 256 sample
 | |
| frames:
 | |
| 
 | |
| % testcelt 44100 1 256 46 input.sw output.sw
 | |
| 
 | |
| Since 44100/256*46*8 = 63393.75 bits/sec.
 | |
| 
 | |
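The bitrate arithmetic above can be sketched as a small helper, together with its inverse for picking a bytes-per-packet value for a target bitrate. This assumes one fixed-size packet per frame, as the formula implies. The function names are hypothetical and not part of CELT.

```python
import math


def bitrate_bits_per_sec(rate_hz, frame_size, bytes_per_packet):
    """Bitrate of a stream that sends one fixed-size packet per frame."""
    frames_per_sec = rate_hz / frame_size
    return frames_per_sec * bytes_per_packet * 8


def bytes_per_packet_for(rate_hz, frame_size, target_bits_per_sec):
    """Largest whole number of bytes per packet at or under the target bitrate."""
    return math.floor(target_bits_per_sec * frame_size / (rate_hz * 8))


print(bitrate_bits_per_sec(44100, 256, 46))      # 63393.75
print(bytes_per_packet_for(44100, 256, 64000))   # 46
```

The second call shows where the 46 in the example command comes from: it is the largest packet size that keeps a 44.1 kHz stream with 256 sample frames at or below 64 kbit/sec.
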
| All even frame sizes from 64 to 512 are currently supported, although
 | |
| power-of-two sizes are recommended  and most CELT development is done
 | |
| using a size of 256.  The delay imposed by CELT is  1.25x - 1.5x  the 
 | |
| frame duration depending on the frame size and some details of CELT's
 | |
| internal operation.  For 256 sample frames the delay is 1.5x  or  384
 | |
| samples, so the total codec delay in the above example is about 8.71 ms
 | |
| (1000/(44100/384)).   
 |
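The delay calculation above can be sketched as follows. The helper names are hypothetical. Only the 1.5x factor for 256 sample frames is stated above, so for other frame sizes the sketch reports just the 1.25x to 1.5x bounds rather than guessing the exact factor.

```python
MIN_FACTOR = 1.25  # lower end of the delay range quoted above
MAX_FACTOR = 1.5   # upper end; also the stated factor for 256 sample frames


def codec_delay_ms(rate_hz, frame_size, delay_factor):
    """Codec delay in milliseconds for a given frame size and delay factor."""
    delay_samples = frame_size * delay_factor
    return 1000.0 * delay_samples / rate_hz


def codec_delay_bounds_ms(rate_hz, frame_size):
    """(lowest, highest) possible codec delay when the exact factor is unknown."""
    return (codec_delay_ms(rate_hz, frame_size, MIN_FACTOR),
            codec_delay_ms(rate_hz, frame_size, MAX_FACTOR))


# 256 sample frames at 44.1 kHz: 384 samples of delay.
print(round(codec_delay_ms(44100, 256, 1.5), 2))   # 8.71
```

Calling codec_delay_bounds_ms(44100, 128) gives the possible delay range for 128 sample frames under the same assumptions.
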