Apr 27, 2012

ASIO Interface

There are several audio engines available under Windows.

Waveform Audio (MME): the oldest one. It still provides a simple and clean API and works everywhere from Windows 3.0 to Windows 7, and even on Windows CE.

DirectShow (DirectSound): a powerful architecture with hardware integration.

Core Audio: the new audio interface introduced in Windows Vista.

ASIO: a driver protocol specified by Steinberg. Its main purpose is to achieve the lowest possible latency, which is important for professional applications. Many hardware manufacturers provide ASIO drivers for their products, and if they don't, there is ASIO4ALL.

The ASIO API is based on COM technology, but according to Ross Bencina its methods were unfortunately declared with the thiscall calling convention (the object pointer is passed in ECX). Not many compilers support it, and in Delphi you have no option but to use the built-in assembler:

// --  --
function unaAsioDriver.controlPanel(): ASIOError;
begin
{$IFDEF CPU64 }
  result := f_asio.controlPanel();
{$ELSE }
  asm
    mov    eax, [self]
    mov    ecx, [eax][f_asio]
    mov    eax, [ecx]
    call   dword ptr [eax + cofs_controlPanel]
    //
    mov    result, eax
  end;
{$ENDIF CPU64 }
end;
Here cofs_controlPanel is the offset of the controlPanel entry in the virtual method table of the IASIO interface. Fortunately, the x64 target uses a single standard calling convention, so no assembler is needed there.

ASIO is based on I/O buffers and callbacks. When the buffers are filled with recorded audio, the BufferSwitch callback is called, which gives you a chance to read the fresh data and to provide ASIO with new data for playback. Here is the prototype:

procedure(index: long; processNow: ASIOBool); cdecl;
Here index is the index of the buffer half being reported: each channel is double-buffered, so it is 0 or 1.

There is no way to get the IASIO driver instance inside the callback, so if you open two or more devices at once, you will need two or more different callbacks.
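One way to organize this, sketched here in Python only to show the shape of the idea, is a small fixed pool of callbacks, each tied to one slot of a global device table. All names here (MAX_DEVICES, Device, open_device) are hypothetical and not part of ASIO; in Delphi each pool entry would be a separately declared cdecl procedure.

```python
MAX_DEVICES = 4                      # assumed upper bound on simultaneously open drivers
_devices = [None] * MAX_DEVICES      # slot number -> device object


class Device:
    """Stand-in for a per-driver wrapper object (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.switches = []           # (index, process_now) pairs seen so far

    def on_buffer_switch(self, index, process_now):
        self.switches.append((index, process_now))


def _make_trampoline(slot):
    # The callback receives no driver pointer, so the slot number
    # is baked into each callback when the pool is built.
    def buffer_switch(index, process_now):
        _devices[slot].on_buffer_switch(index, process_now)
    return buffer_switch


# One distinct callback per slot.
TRAMPOLINES = [_make_trampoline(slot) for slot in range(MAX_DEVICES)]


def open_device(name):
    """Claim a free slot; return the device and the callback to register for it."""
    slot = _devices.index(None)      # raises ValueError when all slots are taken
    _devices[slot] = Device(name)
    return _devices[slot], TRAMPOLINES[slot]
```

The cost of this approach is a hard limit on the number of devices open at the same time, which is usually acceptable for audio hardware.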

There are plenty of audio sample formats defined in ASIO, so you have to deal with all kinds of conversions from one format to another (16 to 24 bits, MSB to LSB and so on).
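To give an idea of what such a conversion involves, here is a minimal sketch in Python. It assumes packed signed integer samples and reads MSB/LSB as big-endian/little-endian byte order; the function name convert_pcm is made up for this example, and float or 32-bit-container formats are not covered.

```python
def convert_pcm(data, src_bytes, src_order, dst_bytes, dst_order):
    """Convert packed signed integer PCM between sample widths and byte orders.

    src_order / dst_order are "big" (MSB first) or "little" (LSB first).
    Widening pads with zero bits at the bottom; narrowing drops the low
    bits (no dithering).
    """
    if len(data) % src_bytes:
        raise ValueError("buffer does not hold a whole number of samples")
    shift = 8 * (dst_bytes - src_bytes)
    out = bytearray()
    for pos in range(0, len(data), src_bytes):
        sample = int.from_bytes(data[pos:pos + src_bytes], src_order, signed=True)
        # Scale so that full scale stays full scale in the new width.
        sample = sample << shift if shift >= 0 else sample >> -shift
        out += sample.to_bytes(dst_bytes, dst_order, signed=True)
    return bytes(out)
```

For example, convert_pcm(buf, 2, "little", 3, "big") turns 16-bit LSB samples into 24-bit MSB ones. A real implementation would do this in place on the ASIO buffers rather than build new byte strings.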

Samples are organized into buffers, and buffers belong to channels. Each channel is either an input or an output. That makes it easy to access any sample in any channel at any time (bearing in mind the original sample format).

Overall, despite the drawbacks mentioned above, ASIO is a simple and efficient way to record and play back multi-channel audio with minimal latency.

Feb 22, 2012

RTSP and STUN

RTSP is old but still good. Its flexible HTTP-like request/response model makes it easy to build custom applications on top of it.

There is one drawback though, which I found impossible to overcome by standard means. When a client sends a SETUP request, it should provide its transport options in the Transport header:

Transport: RTP/AVP;unicast;client_port=4588-4589
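For readers who have not dealt with this header before, here is a rough Python sketch of how its value could be taken apart. The function name parse_transport is hypothetical, and the sketch assumes a single transport specification (no comma-separated alternatives).

```python
def parse_transport(value):
    """Split an RTSP Transport header value into its spec and parameters."""
    spec, *params = [part.strip() for part in value.split(";")]
    result = {"spec": spec}
    for param in params:
        if not param:
            continue                          # tolerate a trailing ";"
        name, sep, arg = param.partition("=")
        # Flags such as "unicast" carry no value; store them as True.
        result[name.lower()] = arg if sep else True
    if "client_port" in result:
        # "4588-4589" is the RTP port followed by the RTCP port.
        low, _, high = result["client_port"].partition("-")
        result["client_port"] = (int(low), int(high) if high else None)
    return result
```

With the header above, the function returns the spec "RTP/AVP", the unicast flag and the port pair (4588, 4589).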

The problem is that the client may not know its own public ports, because a NAT may rewrite the ports of packets passing through it in either direction.

OK, that is why STUN was defined and should be used, right? But even with STUN it may be impossible. First, the STUN server should be running on the same host where the RTP source will be located. Second, the STUN server should be running on the same port as the RTP source! Again, that is because a NAT may change ports: even when you send a packet from the same socket to the same host but to a different port, the source port may change.
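As a reminder of what the STUN exchange itself looks like on the wire, here is a minimal Python sketch. It assumes the RFC 5389 message format with the XOR-MAPPED-ADDRESS attribute (older RFC 3489 servers answer with MAPPED-ADDRESS instead, which this sketch ignores). The function names are made up, and actually sending the request from the RTP and RTCP sockets is left out.

```python
import os
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """Return (packet, transaction_id) for a Binding Request without attributes."""
    tid = transaction_id or os.urandom(12)
    return struct.pack("!HHI12s", BINDING_REQUEST, 0, MAGIC_COOKIE, tid), tid


def parse_mapped_port(packet, transaction_id):
    """Return the public port from a Binding Success response, or None."""
    if len(packet) < 20:
        return None
    msg_type, length, cookie, tid = struct.unpack("!HHI12s", packet[:20])
    if (msg_type, cookie, tid) != (BINDING_SUCCESS, MAGIC_COOKIE, transaction_id):
        return None
    pos, end = 20, min(len(packet), 20 + length)
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", packet[pos:pos + 4])
        value = packet[pos + 4:pos + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 4:
            # Value layout: reserved byte, family byte, then the port
            # XOR-ed with the top 16 bits of the magic cookie.
            (xport,) = struct.unpack("!H", value[2:4])
            return xport ^ (MAGIC_COOKIE >> 16)
        pos += 4 + ((attr_len + 3) & ~3)      # attributes are padded to 4 bytes
    return None
```

The port returned here is the one the client would then put into client_port, once for the RTP socket and once for the RTCP socket.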

Here is what happens:

1) The client binds two sockets to local ports, say 25004 and 25005, and sends two packets to the STUN host at port 3478. The NAT changes the original client ports to something else, say 35004 and 35005, and those ports are reported back to the client by the STUN server.

2) The client sends a SETUP request to the RTSP server, saying it will use ports 35004 and 35005 for RTP/RTCP:

Transport: RTP/AVP;unicast;client_port=35004-35005

3) The RTSP server sets up an RTP source and starts streaming to the client host at ports 35004-35005. The client receives RTP/RTCP packets; so far so good.

4) Now it's time for the client to send an RTCP Receiver Report (RR) packet to the RTP source, so that the source will not time the client out. The client sends a packet from local port 25005 to remote port 5005, hoping the NAT will change the source port to 35005, the same as it did for the STUN packet. But some NATs are paranoid enough to pick a different source port when the destination port is different, even though the destination host is the same (this is what is usually called a symmetric NAT). I have no idea why they do that, but I have seen it happen a few times.

5) RTCP on the source host will receive the RR packet from the client host, but from an unexpected port (not 35005). Because there is no binding between the RTSP session and the RTP source, it may be impossible to map a packet from an unexpected port to an existing session. That leads to the client being removed from the list of destinations.

So, unless you are running STUN and RTCP on the same port, there is a chance the client will be timed out even if it sends RR packets.
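The difference between the two kinds of NAT in the steps above can be shown with a toy model. This is only an illustration in Python, not how any real NAT is implemented; the class name ToyNat and the host name server.example are invented for the example.

```python
import itertools


class ToyNat:
    """Toy model of NAT source-port mapping (not a real NAT implementation)."""

    def __init__(self, port_dependent, first_public_port=35005):
        self.port_dependent = port_dependent
        self._next_port = itertools.count(first_public_port)
        self._mappings = {}

    def public_port(self, local_port, dst_host, dst_port):
        # A well-behaved NAT keys the mapping on the local port only;
        # a "paranoid" one creates a new mapping for every destination.
        if self.port_dependent:
            key = (local_port, dst_host, dst_port)
        else:
            key = local_port
        if key not in self._mappings:
            self._mappings[key] = next(self._next_port)
        return self._mappings[key]


def ports_seen(nat):
    """Return the public ports seen by the STUN server and by the RTCP source."""
    via_stun = nat.public_port(25005, "server.example", 3478)
    via_rtcp = nat.public_port(25005, "server.example", 5005)
    return via_stun, via_rtcp
```

With ToyNat(port_dependent=False) both packets leave from public port 35005. With ToyNat(port_dependent=True) the RR packet leaves from 35006, which is exactly the situation in step 5.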

What we use is a simple (but non-standard!) additional field in the Transport header:

Transport: RTP/AVP;unicast;SSRC=1276485;client_port=35004-35005

The client reports its SSRC when establishing a session, so the RTP source gets a strong binding between the RTSP session and the RTP destination. Now, even when it receives an RR packet from an unexpected port, it can map the packet to an existing session by the SSRC included in the packet.

If this field is not provided, the source RTCP will simply assume that the port provided by the client in the SETUP request will not change.
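The lookup on the source side could be sketched like this in Python. This is not our actual implementation, just the idea: the class and function names are hypothetical, and the sketch assumes the RR is the first packet in the RTCP datagram, so the sender SSRC sits in bytes 4 to 7 as defined by RFC 3550.

```python
import struct

RTCP_RR = 201                                 # RTCP packet type of a Receiver Report


def rr_sender_ssrc(packet):
    """Return the sender SSRC of an RTCP Receiver Report, or None."""
    if len(packet) < 8:
        return None
    first, packet_type, _length, ssrc = struct.unpack("!BBHI", packet[:8])
    if first >> 6 != 2 or packet_type != RTCP_RR:
        return None                           # wrong RTP version or not an RR
    return ssrc


class Destinations:
    """Maps incoming RTCP packets to the sessions created by SETUP."""

    def __init__(self):
        self.by_ssrc = {}                     # (host, ssrc) -> session
        self.by_addr = {}                     # (host, rtcp_port) -> session

    def add(self, session, host, rtcp_port, ssrc=None):
        self.by_addr[(host, rtcp_port)] = session
        if ssrc is not None:
            self.by_ssrc[(host, ssrc)] = session

    def lookup(self, packet, src_host, src_port):
        # Prefer the SSRC binding: it survives a changed source port.
        session = self.by_ssrc.get((src_host, rr_sender_ssrc(packet)))
        if session is None:
            # No SSRC was given in SETUP: trust the announced port.
            session = self.by_addr.get((src_host, src_port))
        return session
```

A session added with an SSRC is found even when the RR arrives from an unexpected port on the same host; a session added without one is found only when the port matches the one given in SETUP.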

Some NATs recognize the RTSP protocol and may modify the original SETUP request themselves, which simply adds another level of confusion. Disable the STUN functionality in the client if your router is one of those.

Our new RTSP sample is running on our new development host avoxum.com on port 1500 and is ready to serve local files or re-broadcast a remote stream. There is no sound card there, so live recording will not work.

Jan 11, 2012

RegEx library

As part of a SIP NAPTR lookup you may need to do a regex replacement.

Since Delphi versions up to 2010 do not include a regex library, there was no other way but to write our own:

http://lakeofsoft.com/regex/

It was fun creating it, and I hope it will be fun using it.
Any comments and suggestions are welcome.
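
For those who have not seen it, the NAPTR replacement mentioned above works roughly as in the Python sketch below. This does not use our library; it relies on Python's re module, whose syntax is close to, but not the same as, the POSIX extended syntax NAPTR prescribes. The function name is made up, the delimiter is assumed not to occur inside the pattern or the replacement, and only patterns anchored with ^ and $ are considered.

```python
import re


def apply_naptr_regexp(field, aus):
    """Apply a NAPTR regexp field such as "!^.*$!sip:info@example.com!" to aus.

    Returns the rewritten string, or None when the pattern does not match.
    """
    if not field:
        raise ValueError("empty regexp field")
    delim = field[0]                          # the first character is the delimiter
    parts = field[1:].split(delim)
    if len(parts) != 3:
        raise ValueError("expected <delim>ere<delim>repl<delim>flags")
    ere, repl, flags = parts
    match = re.search(ere, aus, re.IGNORECASE if "i" in flags else 0)
    if match is None:
        return None
    # Backreferences \1..\9 in repl are filled in from the match groups.
    return match.expand(repl)
```

For example, the field "!^.*$!sip:info@example.com!" rewrites any input string to sip:info@example.com.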