Assignment 3: udp2sip (OPTIONAL)
--------------------------------

Again, this assignment is optional.  However, as it should be a
relatively limited amount of work, it is recommended that you pursue
it anyway to get an initial feel for SIP.

This assignment is a straightforward extension to udp2rtsp:
_in_addition_ to RTSP-based access to media streams, the new udp2sip
should also allow for SIP control connections.

The other tasks of udp2rtsp remain unchanged: like before, it shall
listen to UDP addresses for incoming RTP-based audio and/or video
streams.  The option of a TCP-based control connection for RTSP shall
remain and a SIP-based listening port shall be added, so that there
are two alternatives to obtain access to a UDP-based packet stream.
With SIP, you shall implement a small UA that accepts incoming calls
from a SIP-based IP phone (software client or hardware phone),
performs offer/answer negotiation, and then forwards media packets
accordingly until the IP phone is hung up.  Forwarding shall, again,
be done for both RTP and RTCP packets.  RTCP packets from the client
shall be forwarded back to the media sender.

udp2sip shall be controlled via the same set of command line
parameters as udp2rtsp, with one addition: -S <addr-spec> to indicate
the SIP UA listening address.

udp2sip -a <addr-spec> -v <addr-spec> -i <interface> -s -l <dumplen>
        -r <tcp-addr-spec> -f <output-file> -S <addr-spec>
	

       -a: used to specify an audio transport address that shall be
	   used for listening for incoming UDP packets.  Option -a may
	   be given exactly once and requires two UDP listening sockets
	   to be created (one for RTP and one for RTCP).  The
	   <addr-spec> may use any of the following formats:

	   a.b.c.d/p1[-p2]      with a.b.c.d being an IPv4 address in
				dotted decimal notation.
	   hostname/p1[-p2]	with hostname being a hostname that can
				be resolved into an IPv4 (or IPv6)
				address by using the name resolution
				mechanisms offered by the operating system.
	   /p1[-p2]		only specifies a port number so that
				INADDR_ANY is used as IP address; this
				implies unicast reception.  In this
				case, assume IPv4.
	   a:b:c:d:e:f:g:h/p1[-p2] (voluntary) with a:...:h representing
				an IPv6 address.

				Note: RFC2732 contains numerous
				examples for valid textual notations
				for IPv6 addresses.

	   p1 and p2 indicate the port numbers for RTP and RTCP,
	   respectively.  If p2 is not specified, p2=p1+1 is assumed.

	   As a simplification, an optional /<payload-type> may be
	   appended to provide the audio payload type (e.g. a.b.c.d/p1-p2/pt).  
	   You do not have to support this parameter; if you don't, you
	   just have to extract the payload type from the RTP headers.
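
	   The <addr-spec> formats above can be split with a few string
	   operations.  The following is only a sketch: the names
	   AddrSpec and parse_addr_spec are hypothetical, and a missing
	   host part is represented as None (meaning INADDR_ANY).  It
	   also tolerates a trailing '-' with no p2.

```python
from typing import NamedTuple, Optional


class AddrSpec(NamedTuple):
    host: Optional[str]          # None: no host given, use INADDR_ANY
    rtp_port: int
    rtcp_port: int
    payload_type: Optional[int]  # None: take it from the RTP headers


def parse_addr_spec(spec: str) -> AddrSpec:
    """Split host/p1[-p2][/pt] into its parts (hypothetical helper)."""
    parts = spec.split("/")
    if len(parts) not in (2, 3):
        raise ValueError(f"bad addr-spec: {spec!r}")
    host = parts[0] or None
    p1, _, p2 = parts[1].partition("-")
    rtp_port = int(p1)
    # If p2 is omitted, RTCP uses the next higher port.
    rtcp_port = int(p2) if p2 else rtp_port + 1
    for port in (rtp_port, rtcp_port):
        if not 0 < port < 65536:
            raise ValueError(f"port out of range in {spec!r}")
    payload_type = int(parts[2]) if len(parts) == 3 else None
    return AddrSpec(host, rtp_port, rtcp_port, payload_type)
```

	   IPv6 literals pass through unchanged because they contain no
	   '/' character.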

	   For the IP addresses, the assumptions from assignment 1 hold:

	   All addresses and hostnames may be or resolve to unicast as
	   well as multicast addresses.  Remember: multicast addresses
	   are class D IP addresses and are in the range 224.0.0.0 to
	   239.255.255.255.  Just determine the address type and then
	   handle the address accordingly.
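
	   Determining the address type does not require manual range
	   checks.  As a sketch, Python's standard ipaddress module
	   already knows the class D range; the helper name is_multicast
	   is hypothetical.

```python
import ipaddress


def is_multicast(addr: str) -> bool:
    """True for 224.0.0.0 to 239.255.255.255 (and for IPv6 ff00::/8)."""
    return ipaddress.ip_address(addr).is_multicast
```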

	   Note: Resolving a hostname using getaddrinfo() or
	   gethostbyname(), etc. may return multiple IP addresses.
	   You only need to process one of these but you need to make
	   sure that the address family is the "right" one.
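
	   One way to pick an address of the "right" family is to pass
	   the desired family to getaddrinfo() and take the first
	   matching result.  This is a minimal sketch; the function name
	   resolve_udp and the IPv4 default are assumptions.

```python
import socket


def resolve_udp(host: str, port: int, family: int = socket.AF_INET):
    """Return the first sockaddr of the requested address family."""
    infos = socket.getaddrinfo(host, port, family, socket.SOCK_DGRAM)
    # Each entry is (family, type, proto, canonname, sockaddr).
    for fam, _type, _proto, _canon, sockaddr in infos:
        if fam == family:
            return sockaddr
    raise OSError(f"no address of the requested family for {host!r}")
```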

	   The IP unicast address and the port number (or just the
	   port number) but _not_ the IP multicast address need to be
	   handed to the bind() system call in the data structure
	   struct sockaddr_in (for IPv4).  For multicast reception, do
	   not forget the setsockopt() with IP_ADD_MEMBERSHIP.

       -v: same as -a but for video

       -i: Used in conjunction with multicasting only and allowed to
	   be specified at most once, this option provides the
	   local IP address of the interface that shall be used for
	   joining all multicast groups.  This information needs to go
	   along with IP_ADD_MEMBERSHIP in "struct ip_mreq".
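
	   The bind() and IP_ADD_MEMBERSHIP steps described for -a and
	   -i could be sketched as follows.  The function names are
	   hypothetical, IPv4 is assumed, and "0.0.0.0" as interface
	   leaves the choice to the operating system.

```python
import socket


def make_ip_mreq(group: str, interface: str = "0.0.0.0") -> bytes:
    """Pack struct ip_mreq: multicast group, then local interface."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_receiver(addr: str, port: int, multicast: bool,
                  interface: str = "0.0.0.0") -> socket.socket:
    """Create one bound UDP socket (call once for RTP, once for RTCP)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    if multicast:
        # Bind to the port only, not to the group address, then join.
        sock.bind(("", port))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        make_ip_mreq(addr, interface))
    else:
        # Unicast: an empty address string stands for INADDR_ANY.
        sock.bind((addr, port))
    return sock
```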

       -f: Dump contents and/or summary of the received packets into a
           file.  The filename is specified as argument <output-file>.
	   If <output-file> consists of the character '-', stdout is
           used.  If the option -f is not specified, no output will be
           produced.

       -s: Short form: turns off the hexdump output and thus creates a
	   single line of output per packet containing a timestamp
	   with microsecond resolution, source and destination IP
	   address and port number and the packet size.

	   For RTP packets, it also shows the SSRC, sequence number,
	   RTP timestamp, payload type and the value of the M bit.

	   For RTCP packets, the SSRC and the CNAME of the sender are
	   printed.

	   If -s is not given, the output will also have the hexdump.
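
	   The RTP fields listed above all sit in the fixed 12-byte RTP
	   header, so they can be extracted with a single unpack.  A
	   sketch, with parse_rtp_header as a hypothetical name; CSRC
	   lists and header extensions are ignored here.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Extract the fields needed for the short-form output line."""
    if len(packet) < 12:
        raise ValueError("too short for an RTP header")
    # Byte 0: V(2) P(1) X(1) CC(4); byte 1: M(1) PT(7);
    # then sequence number, timestamp and SSRC in network byte order.
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    if b0 >> 6 != 2:
        raise ValueError("not RTP version 2")
    return {
        "marker": bool(b1 & 0x80),
        "pt": b1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }
```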

       -l: Limits the hexdump output length to the specified number of
	   bytes per packet (otherwise, the full packet contents shall
	   be dumped).
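
	   A length-limited hexdump could look like the sketch below.
	   The layout (offset followed by 16 bytes per row) is an
	   assumption; the assignment does not prescribe one.

```python
from typing import Optional


def hexdump(data: bytes, limit: Optional[int] = None) -> str:
    """Dump at most `limit` bytes (all bytes if limit is None)."""
    if limit is not None:
        data = data[:limit]
    rows = []
    for offset in range(0, len(data), 16):
        chunk = data[offset:offset + 16]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        rows.append(f"{offset:04x}  {hex_part}")
    return "\n".join(rows)
```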

       -r: Open a listening socket for incoming TCP-based RTSP control
	   connections.  A media player client shall be able to
	   connect to the respective <address> and <port> pair by
	   specifying the following RTSP URI:

	   rtsp://<address>:<port>/<local-resource>

	   udp2rtsp must receive, parse, and process RTSP 1.0 commands
	   from the client and generate proper responses.  The
	   commands to be understood include DESCRIBE, SETUP, PLAY,
	   PAUSE, and TEARDOWN.  OPTIONS should be understood (and it
	   is easy to respond to anyway).  SET_PARAMETER and
	   GET_PARAMETER may be understood and dealt with by means of
	   a minimal response.

	   You do not have to support all of the RTSP headers but
	   remember that responses need to have matching sequence
	   numbers and that the Session: header needs to be filled in
	   properly.  The Transport: header is key to session setup
	   in both requests and responses.

	   The DESCRIBE method needs to return an SDP-encoded session
	   description (Content-Type: application/sdp, don't forget to
	   set Content-Length: accordingly).

	   The SDP description shall contain m= lines depending on the
	   command line parameters -a and -v and the <local-resource>
	   of the RTSP URI.  If -a is specified, m=audio should be
	   included, if -v is specified, m=video should be included.
	   If both are present, we get two m= lines.
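
	   A DESCRIBE response with a matching CSeq, an SDP body and a
	   correct Content-Length could be assembled roughly as shown
	   here.  This is a sketch only: build_sdp and describe_response
	   are hypothetical names, the o=, s= and c= values are
	   placeholders, and the payload types are assumed to be known
	   already (from the command line or from the RTP headers).

```python
from typing import Optional


def build_sdp(audio_pt: Optional[int] = None,
              video_pt: Optional[int] = None,
              addr: str = "0.0.0.0") -> str:
    """One m= line per stream given on the command line (-a and/or -v)."""
    lines = [
        "v=0",
        f"o=- 0 0 IN IP4 {addr}",
        "s=udp2rtsp stream",
        f"c=IN IP4 {addr}",
        "t=0 0",
    ]
    if audio_pt is not None:
        lines.append(f"m=audio 0 RTP/AVP {audio_pt}")
    if video_pt is not None:
        lines.append(f"m=video 0 RTP/AVP {video_pt}")
    return "\r\n".join(lines) + "\r\n"


def describe_response(cseq: int, sdp: str) -> bytes:
    """Wrap the SDP in an RTSP 200 OK that echoes the request's CSeq."""
    body = sdp.encode("ascii")
    head = (
        "RTSP/1.0 200 OK\r\n"
        f"CSeq: {cseq}\r\n"
        "Content-Type: application/sdp\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body
```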

	   The <local-resource> parameter may -- voluntarily -- be
	   used to select just one out of two media streams: if -a and
	   -v are specified on the command line, <local-resource> may
	   choose just one of the streams if present:

	   rtsp://<address>:<port>/audio    for audio and
	   rtsp://<address>:<port>/video    for video.
	   
	   rtsp://<address>:<port>/all	    will select both streams.
	   
	   As soon as a PLAY command is received, media packets from
	   the respective addresses (-a and/or -v) shall be forwarded
	   to the media client.  Whenever PAUSE is received, the
	   forwarding shall be suspended.

	   TEARDOWN terminates the media session and the udp2rtsp process.

       -S: Open a listening socket for incoming TCP-based SIP control
	   connections.  A SIP phone shall be able to connect to the
	   respective <address> and <port> pair by specifying the
	   following SIP URI:

	   sip:<resource>@<address>:<port>

	   udp2sip must receive, parse, and process SIP 2.0 commands
	   from the client and generate proper responses.  The commands
	   to be understood include INVITE, ACK, and BYE.  OPTIONS may be
	   understood (and it is easy to respond to anyway).

	   You only have to support the minimal SIP header fields
	   needed.  Remember that responses need to have matching
	   sequence numbers, that To:-tags need to be added, etc.
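
	   Building a response mostly means echoing a handful of header
	   fields from the request.  The sketch below copies Via, From,
	   To, Call-ID and CSeq and appends a tag to To: if none is
	   present.  The function names are hypothetical, and header
	   folding and compact header names are deliberately ignored.

```python
def parse_message(message: str):
    """Split a SIP message into start line, header list and body."""
    head, _, body = message.partition("\r\n\r\n")
    lines = head.split("\r\n")
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return lines[0], headers, body


COPIED = ("via", "from", "to", "call-id", "cseq")


def build_response(request: str, status: int, reason: str,
                   to_tag: str, body: str = "") -> str:
    """Echo the mandatory header fields and add a To: tag."""
    _start, headers, _body = parse_message(request)
    out = [f"SIP/2.0 {status} {reason}"]
    for name, value in headers:
        if name.lower() not in COPIED:
            continue
        if name.lower() == "to" and ";tag=" not in value:
            value = f"{value};tag={to_tag}"
        out.append(f"{name}: {value}")
    if body:
        out.append("Content-Type: application/sdp")
    out.append(f"Content-Length: {len(body.encode('ascii'))}")
    return "\r\n".join(out) + "\r\n\r\n" + body
```

	   The same to_tag has to be reused for all responses within one
	   dialog, so it should be generated once per incoming INVITE.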
	   The SDP-based offer/answer negotiation is key to your media
	   streaming forwarding setup.

	   Depending on the IP phone's capabilities, you may receive
	   one or two m= lines for audio and one m= line for video.

	   One audio line usually lists a number of codecs understood
	   by the IP phone; for details you may need to consult the
	   a=rtpmap: line that maps a codec name to a particular
	   payload type.  You need to filter out the codecs you do not
	   want to use (you don't have to but you should) and if there
	   is no matching codec for your media stream format you MUST
	   reject the offer.  You may use the a=sendonly attribute to
	   indicate to the IP telephony client that you will send
	   media only (your IP phone client may react with a re-INVITE
	   to this).  The same handling applies to m=video.

	   Another audio line may indicate support for the generation
	   of telephone-events.  Just reject this in your answer by
	   setting the corresponding port number to 0.
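
	   The media part of the answer could be derived from the offer
	   as sketched below.  Assumptions: the function name is
	   hypothetical, the stream's payload type is a static one that
	   can be matched by number (dynamic types would need the
	   a=rtpmap: lines), and at most one m= line per media type is
	   accepted.  Every offered m= line gets an m= line in the
	   answer, in the same order; rejected ones carry port 0.

```python
def answer_media_lines(offer_sdp: str, wanted: dict, local_ports: dict):
    """
    wanted:      media name -> payload type we can deliver, e.g. {"audio": 0}
    local_ports: media name -> local RTP port to announce in the answer
    Returns (answer lines, set of accepted media names).
    """
    answer = []
    accepted = set()
    for line in offer_sdp.splitlines():
        if not line.startswith("m="):
            continue
        media, _port, proto, *formats = line[2:].split()
        pt = wanted.get(media)
        if pt is not None and media not in accepted and str(pt) in formats:
            accepted.add(media)
            # Keep only the one codec that matches the stream format.
            answer.append(f"m={media} {local_ports[media]} {proto} {pt}")
            answer.append("a=sendonly")
        else:
            # Port 0 rejects this line (e.g. a telephone-event line).
            first = formats[0] if formats else "0"
            answer.append(f"m={media} 0 {proto} {first}")
    # An empty `accepted` set means the offer as a whole must be rejected.
    return answer, accepted
```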

	   The username, termed <resource> above, may -- voluntarily
	   -- be used to select just one out of two possible media
	   streams: if -a and -v are specified on the command line,
	   <resource> may choose just one of the streams if
	   present:

	   sip:audio@<address>:<port>     for audio and
	   sip:video@<address>:<port>     for video.
	   
	   sip:all@<address>:<port>       will select both streams
					  (or whatever is possible 
					  based upon the offer/answer
					  exchange)
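
	   Selecting streams from the user part of the Request-URI is a
	   small string exercise.  In this sketch the name
	   select_streams is hypothetical, and treating an unknown or
	   missing user part like "all" is an assumption, not a
	   requirement of the assignment.

```python
def select_streams(request_uri: str, have_audio: bool, have_video: bool):
    """Return (forward_audio, forward_video) for sip:<resource>@host:port."""
    scheme, _, rest = request_uri.partition(":")
    if scheme.lower() != "sip":
        raise ValueError(f"not a SIP URI: {request_uri!r}")
    user, at, _hostport = rest.partition("@")
    resource = user.lower() if at else "all"
    if resource not in ("audio", "video"):
        resource = "all"
    audio = have_audio and resource in ("audio", "all")
    video = have_video and resource in ("video", "all")
    return audio, video
```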
	   
	   You may start forwarding packets immediately after sending
	   the 200 OK message (i.e., you do not have to wait for the
	   ACK).
	   
	   BYE terminates the media session and the udp2sip process.

Note that this is an exercise to get some feeling for parsing
text-based protocols and that you do not have to implement a full SIP
state machine (otherwise this will just grow too large).  So, you may
make simplifying assumptions.  In particular, you should work without
registering.  If you try to REGISTER with some SIP proxy, you first of
all have to set up or find one and then you will most likely need to
implement HTTP digest authentication for SIP -- which creates a
significant extra overhead at little gain from this exercise's point
of view.

Consult RFC 3261, RFC 3264, and RFC 2327 for details on SIP and SDP.

Also note that, again, different IP phones are likely to behave
differently.  Find out which one works best for you.  There are plenty
of SIP clients out there and quite a few are free to download.
Examples:

kphone	       http://www.wirlab.net/kphone/
X-lite	       http://www.xten.com/
MS Messenger   (but not all versions support SIP)

The program shall terminate in an orderly fashion when the user sends
a SIGINT (equivalent to pressing Control-C): it SHOULD then send a BYE
message to the peer, wait for the 200 OK, and print, as a final
action, the time elapsed since its invocation, the number of received
packets, and the total number of bytes received.  Furthermore, per RTP
stream (not for RTCP), the absolute number of received and missed
packets and the fraction of lost packets shall be printed.

This could look like:

    Duration: 123.123456 seconds, received 57 packets, 4530 bytes total
    SSRC:  0x1234aabb  received:  19345   lost: 123   fraction: 0.63%

This information shall also be output if the program terminates after
the media session was closed with a TEARDOWN command via the control
connection.
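
The per-stream figures in this summary can be derived from the RTP
sequence numbers alone.  The sketch below is one possible approach,
not a prescribed one: the class name RtpStats is hypothetical, the
16-bit sequence number is extended across wrap-arounds, and the
fraction is computed as lost / (received + lost), which matches the
0.63% in the example above.

```python
class RtpStats:
    """Per-SSRC packet counters based on RTP sequence numbers."""

    def __init__(self):
        self.received = 0
        self.first_seq = None
        self.ext_max = None  # highest sequence number seen, wrap-extended

    def update(self, seq: int) -> None:
        self.received += 1
        if self.first_seq is None:
            self.first_seq = self.ext_max = seq
            return
        # Forward distance from the highest number seen, modulo 2^16.
        delta = (seq - (self.ext_max & 0xFFFF)) & 0xFFFF
        if delta < 0x8000:
            self.ext_max += delta
        # Otherwise the packet is late or duplicated: maximum unchanged.

    def lost(self) -> int:
        if self.first_seq is None:
            return 0
        expected = self.ext_max - self.first_seq + 1
        return max(expected - self.received, 0)

    def summary(self, ssrc: int) -> str:
        lost = self.lost()
        expected = self.received + lost
        fraction = 100.0 * lost / expected if expected else 0.0
        return (f"SSRC:  0x{ssrc:08x}  received:  {self.received}"
                f"   lost: {lost}   fraction: {fraction:.2f}%")
```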


Output
------
Example for short form output:

14:09:00.123456 134.102.218.59:40000 -> 134.102.218.58:47000 [79] 0x1234AACC pt=00 T=00122000 #23456 M