File: resip/stack/doc/resip-epoll-notes.txt
Author: Kennard White
Created: Nov 10, 2010
Updated: Dec 3, 2010

This memo describes the epoll() support within resiprocate.

NOTE: This only applies to platforms with epoll (most UNIX, but not Windows).

Overview
--------
The primary purpose of the first version of epoll() support is to allow
resiprocate to have more than 1024 sockets concurrently open. With
select() based system, the fd_set is compile-time limited to 1024 file
descriptors (starting at fd zero). Some systems (which?) allow this
to be increased at compile time, but there are several warnings in code
not to try this. So I didn't :-).

See resip/stack/test/testStack for example of using epoll mode.

For the curous, my immediate application for needing many concurrent sockets
is a server-side instance of repro that is handling TCP connections in
"outbound" mode; e.g. the connection stay open indefinately.

By default, epoll() support is disabled. A runtime call to
SipStack::setDefaultUseInternalPoll(true).  must be made to turn-on
epoll support.

This version uses epoll() internally, but doesn't change any external
interfaces. E.g., StackThread doesn't change at all. This is possible
for two reasons:
  * epoll maybe used hierarchically/recursively.
  * select works fine for handful of descriptors that are small

In particular, it is possible to select() on the epoll
file-descriptor(s). SipStack internally opens an epoll descriptor that
it uses to manage sockets resip-ares (for DNS lookup) and transports.
This epoll descriptor is passe up to the threads which can then use
select.  All the descriptors seen by select are opened up early and are
less than 1024.

Performance
-----------
This primary purpose of the change is to support a large number of
transports/connections, something that is just not possible with select().
The epoll support, in its current form, actually makes things slower!
This is because (I assume) there are two levels of system calls involved.

Changes closely related to epoll
--------------------------------
Added rutil/FdPoll files and related classes. This implements the system
calls for epoll. Note that this ONLY supports epoll, unlike Poll.hxx
which ONLY supports select and poll. (Aside from the system calls used,
the purpose and interfaces are different).

Changes to contrib/ares and rutil/dns/AresDns. The challenge here is
that while ares only uses a handful of concurrent socket, they are
dynamically opened. In particular, they can be opened after we have all
of our tranport sockets open, and thus ares will get fds > 1024 and the
fdset stuff will fail. Thus needed to make ares work with epoll. Added
new callback system from contrib/ares into AresDns class, which then
handles the epoll interface on its behalf. This is the trickiest
set of changes.

Plumb useInternalPoll flag from SipStack, ExternalDns, DnsStub, to AresDns.

For Transport, when using InternalPoll, work previously performed by
process() is now performed by combination of processPollEvent() and
processTransmmitQueue().  The former is called by the FdPoll dispatcher,
while the later is called by TransportSelector.

Restructure Connection objects to pass received messages up
to owning transport rather than using a fifo passed in via process().

resip/stack/test/testStack extended to have --epoll option to enable.
This doesn't test very much yet, but it is something.

Other Changes
-------------
Changed UdpTransport::hasDataToSend() to always return false. We use
the socket-writability callback (fdset or poll) to drive draining the
queue. Returning true here makes the select() timeout zero, which isn't
what we want. E.g., we will wait for writability, even if process() is
immediately invoked.

Fix some test programs (port selection & log levels)

Fix ares set non-blocking function. Added a lot of diags (now commented out)
to find this problem.

Add getSocketError() function to rutil/Socket.cxx

Clarify the multi-platform Socket support in rutil/Socket.hxx

Fix compiler warning in OpenSSLInit.cxx

Fix int vs long int warnings in ares

Questions
---------
Does any code use rutil/Poll.hxx?

Valgrind w/epoll
----------------
Valgrind has issues when the application is using more than ~1000 sockets.
See bug http://bugs.kde.org/show_bug.cgi?id=73146
As written in the bug report "it doesn't seem that important".
You cannot use resip::increaseLimitFds() (e.g., -n to testStack).
To work around, this you need to increase the number of allowed fds *before*
running valgrind. E.g.:
   % sudo bash
   # ulimit -Hn 200000
   # ulimit -Sn 200000
   # valgrind --leak-check=full ./testStack -n 10000

Vi editing
----------
(These probably belong elsewhere). Resip uses 3-space indends
without tabs. Suitable options for vi are:
   set shiftwidth=3
   set expandtab

The command
   :%retab
will get rid of all existing tabs

The following can be added at the bottom of each file:
   /* ex: set shiftwidth=3 expandtab: */
For above to work, you must have modelines enabled in your ~/.vimrc
   set modeline
It is disabled by default on many platforms for security reasons.
