Assignment 3: udp2sip (OPTIONAL)
--------------------------------

Again, this assignment is optional. However, as it should require only a
limited amount of work, it is recommended that you pursue it anyway to get
an initial feeling for SIP.

This assignment is a straightforward extension of udp2rtsp: _in_addition_ to
RTSP-based access to media streams, the new udp2sip shall also accept SIP
control connections.

The other tasks of udp2rtsp remain unchanged: as before, it shall listen on
UDP addresses for incoming RTP-based audio and/or video streams. The option
of a TCP-based control connection for RTSP shall remain and a SIP-based
listening port shall be added, so that there are two alternative ways to
obtain access to a UDP-based packet stream.

With SIP, you shall implement a small UA that accepts incoming calls from a
SIP-based IP phone (software client or hardware phone), performs offer/answer
negotiation, and then forwards media packets accordingly until the IP phone
hangs up. Forwarding shall, again, be done for both RTP and RTCP packets.
RTCP packets from the client shall be forwarded back to the media sender.

udp2sip shall be controlled via the same set of command line parameters as
udp2rtsp, with one addition: -S to indicate the SIP UA listening address.

    udp2sip -a -v -i -s -l -r -f -S

-a: Used to specify an audio transport address on which to listen for
    incoming UDP packets. Option -a may be given exactly once and requires
    two UDP listening sockets to be created (one for RTP and one for RTCP).

    The address argument may use any of the following formats:

    a.b.c.d/p1[-p2]
        with a.b.c.d being an IPv4 address in dotted decimal notation.

    hostname/p1[-p2]
        with hostname being a hostname that can be resolved into an IPv4
        (or IPv6) address by using the name resolution mechanisms offered
        by the operating system.

    /p1[-p2]
        only specifies a port number so that INADDR_ANY is used as IP
        address; this implies unicast reception. In this case, assume IPv4.
    a:b:c:d:e:f:g:h/p1[-p2]
        (voluntary) with a:...:h representing an IPv6 address. Note:
        RFC 2732 contains numerous examples of valid textual notations for
        IPv6 addresses.

    p1 and p2 indicate the port numbers for RTP and RTCP, respectively. If
    p2 is not specified, p2=p1+1 is assumed.

    As a simplification, an optional /pt may be appended to provide the
    audio payload type (e.g. a.b.c.d/p1-p2/pt). You do not have to support
    this parameter; if you don't, you just have to extract the payload type
    from the RTP headers.

    For the IP addresses, the assumptions from assignment 1 hold:

    All addresses and hostnames may be or resolve to unicast as well as
    multicast addresses. Remember: multicast addresses are class D IP
    addresses and are in the range 224.0.0.0 to 239.255.255.255. Just
    determine the address type and then handle the address accordingly.

    Note: Resolving a hostname using getaddrinfo() or gethostbyname(), etc.
    may return multiple IP addresses. You only need to process one of these
    but you need to make sure that the address family is the "right" one.

    The IP unicast address and the port number (or just the port number)
    but _not_ the IP multicast address need to be handed to the bind()
    system call in the data structure struct sockaddr_in (for IPv4). For
    multicast reception, do not forget the setsockopt() with
    IP_ADD_MEMBERSHIP.

-v: Same as -a but for video.

-i: Used in conjunction with multicasting only and allowed to be specified
    at most once, this option provides the local IP address of the
    interface that shall be used for joining all multicast groups. This
    information needs to go along with IP_ADD_MEMBERSHIP in
    "struct ip_mreq".

-f: Dump contents and/or summary of the received packets into a file. The
    filename is given as the option's argument. If the filename consists of
    the single character '-', stdout is used. If the option -f is not
    specified, no output will be produced.
-s: Short form: turns off the hexdump output and thus creates a single line
    of output per packet containing a timestamp with microsecond
    resolution, source and destination IP address and port number, and the
    packet size. For RTP packets, it also shows the SSRC, sequence number,
    RTP timestamp, payload type, and the value of the M bit. For RTCP
    packets, the SSRC and the CNAME of the sender are printed. If -s is not
    given, the output will also contain the hexdump.

-l: Limits the hexdump output length to the specified number of bytes per
    packet (otherwise, the full packet contents shall be dumped).

-r: Open a listening socket for incoming TCP-based RTSP control
    connections. A media player client shall be able to connect to the
    respective address and port pair by specifying the following RTSP URI:

        rtsp://<address>:<port>/<path>

    udp2rtsp must receive, parse, and process RTSP 1.0 commands from the
    client and generate proper responses. The commands to be understood
    include DESCRIBE, SETUP, PLAY, PAUSE, and TEARDOWN. OPTIONS should be
    understood (and is easy to respond to anyway). SET_PARAMETER and
    GET_PARAMETER may be understood and dealt with by means of a minimal
    response.

    You do not have to support all of the RTSP headers but remember that
    responses need to have matching sequence numbers and that the Session:
    header needs to be filled in properly. The Transport: header is key to
    session setup in both requests and responses.

    The DESCRIBE method needs to return an SDP-encoded session description
    (Content-Type: application/sdp; don't forget to set Content-Length:
    accordingly). The SDP description shall contain m= lines depending on
    the command line parameters -a and -v and on the path component of the
    RTSP URI. If -a is specified, m=audio should be included; if -v is
    specified, m=video should be included. If both are present, we get two
    m= lines.

    The path component may -- voluntarily -- be used to select just one out
    of two media streams. If both -a and -v are specified on the command
    line, the path may choose among the streams as follows:

        rtsp://<address>:<port>/audio   for audio
        rtsp://<address>:<port>/video   for video
        rtsp://<address>:<port>/all     for both streams

    As soon as a PLAY command is received, media packets from the
    respective addresses (-a and/or -v) shall be forwarded to the media
    client. Whenever PAUSE is received, the forwarding shall be suspended.
    TEARDOWN terminates the media session and the udp2rtsp process.

-S: Open a listening socket for incoming TCP-based SIP control
    connections. A SIP phone shall be able to connect to the respective
    address and port pair by specifying the following SIP URI:

        sip:<user>@<address>:<port>

    udp2sip must receive, parse, and process SIP 2.0 requests from the
    client and generate proper responses. The methods to be understood
    include INVITE, ACK, and BYE. OPTIONS may be understood (and is easy to
    respond to anyway).

    You only have to support the minimal set of SIP header fields needed.
    Remember that responses need to have matching sequence numbers (CSeq:),
    that To:-tags need to be added, etc.

    The SDP-based offer/answer negotiation is key to your media forwarding
    setup. Depending on the IP phone's capabilities, you may receive one or
    two m= lines for audio and one m= line for video. One audio line
    usually lists a number of codecs understood by the IP phone; for
    details you may need to consult the a=rtpmap: lines that map a codec
    name to a particular payload type. You should filter out the codecs you
    do not want to use (this is not strictly required), and if there is no
    matching codec for your media stream format you MUST reject the offer.
    You may use the a=sendonly attribute to indicate to the IP telephony
    client that you will send media only (your IP phone client may react to
    this with a re-INVITE). The same handling applies to m=video.

    Another audio line may indicate support for the generation of
    telephone-events. Just reject this in your answer by setting the
    corresponding port number to 0.

    The username part of the SIP URI may -- voluntarily -- be used to
    select just one out of two possible media streams. If both -a and -v
    are specified on the command line, the username may choose among the
    streams as follows:

        sip:audio@<address>:<port>   for audio
        sip:video@<address>:<port>   for video
        sip:all@<address>:<port>     for both streams (or whatever is
                                     possible based upon the offer/answer
                                     exchange)

    You may start forwarding packets immediately after sending the 200 OK
    message (i.e., you do not have to wait for the ACK). BYE terminates the
    media session and the udp2sip process.

    Note that this is an exercise to get some feeling for parsing
    text-based protocols and that you do not have to implement a full SIP
    state machine (otherwise this will just grow too large). So, you may
    make simplifying assumptions. In particular, you should work without
    registering. If you try to REGISTER with some SIP proxy, you first of
    all have to set up or find one, and then you will most likely need to
    implement HTTP digest authentication for SIP -- which creates
    significant extra overhead at little gain from this exercise's point of
    view.

    Consult RFC 3261, RFC 3264, and RFC 2327 for details on SIP and SDP.

    Also note that, again, different IP phones are likely to behave
    differently. Find out which one works best for you. There are plenty of
    SIP clients out there and quite a few are free to download. Examples:

        kphone         http://www.wirlab.net/kphone/
        X-lite         http://www.xten.com/
        MS Messenger   (but not all versions support SIP)

The program shall be terminated in an orderly fashion when the user sends a
SIGINT (equivalent to pressing Control-C), upon which it SHOULD send a BYE
message to the peer, wait for the 200 OK, and then print, as a final
action, the time elapsed since its invocation, the number of received
packets, and the total number of bytes received. Furthermore, per RTP
stream (not for RTCP), the absolute number of received and missed packets
and the fraction of lost packets shall be printed. This could look like:

    Duration: 123.123456 seconds, received 57 packets, 4530 bytes total
    SSRC: 0x1234aabb received: 19345 lost: 123 fraction: 0.63%

This information shall also be output if the program terminates after the
media session was closed with a TEARDOWN command via the control
connection.
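
The per-stream loss figures can be derived from the RTP sequence numbers
alone. The following C fragment is a minimal sketch, not a required design.
It assumes that "missed" means expected minus received, where expected is
the span from the first to the highest sequence number seen, with 16-bit
wraparound tracked in the spirit of RFC 3550, Appendix A.3. Under that
assumption the sample line is consistent: 123 / (19345 + 123) is about
0.63%. The names rtp_stats, rtp_stats_update, rtp_stats_expected,
rtp_stats_lost, and rtp_stats_format are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-SSRC counters; names are illustrative only. */
struct rtp_stats {
    uint32_t ssrc;
    uint16_t base_seq;  /* first sequence number seen */
    uint16_t max_seq;   /* highest sequence number seen so far */
    uint32_t cycles;    /* number of wraparounds, pre-multiplied by 65536 */
    uint64_t received;  /* packets actually received */
    int      started;   /* 0 until the first packet arrives */
};

/* Account for one received RTP packet with sequence number seq. */
static void rtp_stats_update(struct rtp_stats *s, uint32_t ssrc, uint16_t seq)
{
    assert(s != NULL);
    if (!s->started) {
        s->ssrc = ssrc;
        s->base_seq = seq;
        s->max_seq = seq;
        s->cycles = 0;
        s->received = 0;
        s->started = 1;
    } else {
        /* Distance from max_seq, modulo 2^16. */
        uint16_t delta = (uint16_t)(seq - s->max_seq);
        if (delta != 0 && delta < 0x8000) {
            /* Packet is ahead of max_seq; a smaller value means a wrap. */
            if (seq < s->max_seq)
                s->cycles += 1u << 16;
            s->max_seq = seq;
        }
        /* Otherwise: duplicate or reordered packet, only counted below. */
    }
    s->received++;
}

/* Number of packets that should have arrived so far. */
static uint64_t rtp_stats_expected(const struct rtp_stats *s)
{
    uint64_t ext_max = (uint64_t)s->cycles + s->max_seq;
    return ext_max - s->base_seq + 1;
}

/* Missed packets; clamped at 0 because duplicates can inflate received. */
static uint64_t rtp_stats_lost(const struct rtp_stats *s)
{
    uint64_t expected = rtp_stats_expected(s);
    return expected > s->received ? expected - s->received : 0;
}

/* Write the per-stream summary line into buf; returns snprintf's result. */
static int rtp_stats_format(const struct rtp_stats *s, char *buf, size_t len)
{
    uint64_t expected = rtp_stats_expected(s);
    uint64_t lost = rtp_stats_lost(s);
    double pct = expected ? 100.0 * (double)lost / (double)expected : 0.0;
    return snprintf(buf, len,
                    "SSRC: 0x%08x received: %llu lost: %llu fraction: %.2f%%",
                    (unsigned)s->ssrc, (unsigned long long)s->received,
                    (unsigned long long)lost, pct);
}
```

A receiver would call rtp_stats_update() once per RTP packet and
rtp_stats_format() from its termination path. Keeping one such structure
per SSRC gives the per-stream output; RTCP packets are not fed into it.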
Output
------

Example for short form output:

    14:09:00.123456 134.102.218.59:40000 -> 134.102.218.58:47000 [79] 0x1234AACC pt=00 T=00122000 #23456 M
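
The RTP-specific part of such a line comes straight from the fixed 12-byte
RTP header (RFC 3550). The C fragment below is a rough sketch of how it
could be extracted and formatted. It makes some assumptions that the
example does not settle: the payload type is printed as two decimal digits,
the RTP timestamp as eight hexadecimal digits, and " M" is appended only
when the marker bit is set. The names rtp_hdr_info, rtp_parse, and
rtp_format_short are hypothetical. The wall-clock timestamp and the address
pair are left out; they would come from gettimeofday() and the socket
addresses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the fixed RTP header fields. */
struct rtp_hdr_info {
    uint8_t  version;
    uint8_t  marker;
    uint8_t  pt;
    uint16_t seq;
    uint32_t ts;
    uint32_t ssrc;
};

/* Read a 32-bit big-endian (network byte order) value. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Parse the fixed RTP header; returns 0 on success, -1 if not RTP v2. */
static int rtp_parse(const uint8_t *p, size_t len, struct rtp_hdr_info *out)
{
    assert(p != NULL && out != NULL);
    if (len < 12)
        return -1;                 /* shorter than the fixed header */
    out->version = p[0] >> 6;
    if (out->version != 2)
        return -1;                 /* only RTP version 2 is handled */
    out->marker = p[1] >> 7;
    out->pt     = p[1] & 0x7f;
    out->seq    = (uint16_t)((p[2] << 8) | p[3]);
    out->ts     = be32(p + 4);
    out->ssrc   = be32(p + 8);
    return 0;
}

/* Format "[size] SSRC pt=.. T=.. #seq [M]"; field radixes are assumed. */
static int rtp_format_short(const struct rtp_hdr_info *h, size_t pkt_len,
                            char *buf, size_t buflen)
{
    return snprintf(buf, buflen, "[%zu] 0x%08X pt=%02u T=%08x #%u%s",
                    pkt_len, (unsigned)h->ssrc, (unsigned)h->pt,
                    (unsigned)h->ts, (unsigned)h->seq,
                    h->marker ? " M" : "");
}
```

Called on a 79-byte packet whose header carries SSRC 0x1234AACC, payload
type 0, timestamp 0x00122000, sequence number 23456, and a set marker bit,
rtp_format_short() yields the tail of the example line. RTCP packets need a
separate parser because their header layout differs.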