DCL
===

This is the README file for DCL, the HashCaml distributed channel library.
DCL implements an asynchronous, message-passing model of communication over
typed channels.

What's so special about this library? The underlying HashCaml machinery
allows us to:

 - Guarantee type safety during communication by using type-safe marshalling
 - Generate fresh, typed channel names for communication

Other features:

 - The library is thread-safe
 - We suspect that the code might be reasonably stable
 - We get a throughput of around 750 messages / second over a fastish LAN,
   when bouncing a closure (actually, an environment and code pointer)
   between two machines. But don't read too much into this figure: we're
   using bytecode and no thread pooling, and things slow down a bit when GC
   hits. See the implementation section for more details.

Compiling DCL
=============

A simple `make' will do the job. This command produces both the DCL
(`dChan.cma') and a test program (`test').

Running the test
================

The DCL comes with a simple example to demonstrate usage. We bounce a lambda
(actually, just a code pointer and environment) between two hosts. One host
is initially listening, whilst the other is initially sending.

Example usage, with host A listening and host B sending:

  Host A: ./test
  Host B: ./test

or on a single machine via the loopback interface:

  ./test 127.0.0.1 3333
  ./test 127.0.0.1 4444 127.0.0.1 3333

Note that the listener should be started before the sender. Resolvable host
names are OK. The compiled code for the test must be identical on both
machines, otherwise an unmarshal failure will occur.

Using DCL
=========

How do we use the DCL? The following section is a basic exposition of the
dChan.mli file.

Initialisation
--------------

At the start of day, we initialise a server thread to listen for incoming
messages:

  val init : ?inet_addr:Unix.inet_addr -> port:int -> unit ->
             (Thread.t * com_handle_t)

This call returns an abstractly-typed pair of a thread identifier (tid) and a
communication handle (com_h). Description of these values:

 - tid is the thread identifier of the server. We may `Thread.join' on tid to
   wait for the server to return -- this shouldn't happen, but it's a
   convenient way for us to avoid spinning in the main program code. We may
   also kill the server using this identifier. An industrial version of this
   library would include a shutdown command.

 - com_h encapsulates the server state. We use this handle when adding or
   removing a message receiver to/from the server.

Note that if the socket address is omitted then the server defaults to
listening on TCP port 1024 on all internet addresses of the current host.

Sending messages
----------------

We may send a message to another server at a given address by using the
`send' command:

  val send : inet_addr:Unix.inet_addr -> port:int -> chan:'a name ->
             dat:'a -> unit

This function is synchronous. Any exceptions will propagate back to the
caller. TCP is used as the transport protocol.

Receiving messages
------------------

We register a receiver on a specified channel by invoking `register_recv' on
our communication handle:

  val register_recv : com_h:com_handle_t -> chan:'a name ->
                      recv:('a -> unit) -> replicate:bool ->
                      timeout:float option -> recv_handle_t

If `replicate' is set to true, then the receiver is effectively re-registered
after it has handled an incoming message. Otherwise, the receiver is
one-shot. The `timeout' parameter specifies if/when the receiver should be
unregistered. If the receiver is replicated and also has a timeout, then the
multi-shot receiver is simply de-registered after the timeout.

When a value arrives on `chan' it is passed to the receiver.
Note that the receiver is executed in its own thread, and therefore any
exception raised by the receiver will cause the termination of this thread
(but not the server).

We may register multiple receivers on the same channel; they are added in
LIFO order to the per-channel receiver queue. In order to ensure fairness,
replicated receivers are placed at the back of the queue after invocation.

The returned receiver handle can be used to unregister the receiver.

  val unregister_recv : com_h:com_handle_t -> recv_h:recv_handle_t -> unit

This call should always succeed, irrespective of whether the receiver has
already been unregistered.

Implementation
==============

The code is compact (circa 500 lines) and reasonably clean. The DCL is built
over local channels. The message coding and marshalling occurs inside
dChan.ml, whilst lChan.ml handles (decoded) message reception and receiver
registration.

Enable debugging output by setting Debug.enabled to true.

Files
-----

  server.ml   Message reception
  dChan.ml    Distributed channels
  lChan.ml    Local channels
  exist.ml    Functorised existentials
  debug.ml    General debugging output
  test.ml     Test code

Distributed channels
--------------------

When the server receives a connection, it invokes DChan.recv. This function
attempts to unmarshal the byte sequence to a message, raising an exception if
the unmarshalled value is not of the message type. Otherwise, the message is
unpacked and passed to the local channel module (LChan.send). Note that
DChan.recv itself performs the majority of its work inside a separate thread
to prevent any exceptions from killing the server thread.

Messages inhabit the type:

  exist T. {msg_chan = T name, msg_dat = T}

Although Caml lacks true existentials, we may encode them via the
universals-in-records feature. See exist.ml for the gory details, and
dChan.ml for the instantiation.

Message sending is straightforward.
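
The universals-in-records encoding mentioned above can be sketched in plain
OCaml as follows. This is only an illustration of the general technique, not
the code in exist.ml: the names `msg_user', `packed_msg', `pack' and `deliver'
are hypothetical, and because HashCaml's `'a name' type is not available in
plain OCaml, the channel field is replaced by a stand-in handler function.

```ocaml
(* Stand-in message type. In DCL the first field would be a 'a name;
   here it is a handler function, purely so the sketch runs in plain OCaml. *)
type 'a msg = { msg_chan : 'a -> string; msg_dat : 'a }

(* exist T. T msg is encoded as: forall R. (forall T. T msg -> R) -> R.
   Both quantifiers are expressed with polymorphic record fields. *)
type 'r msg_user = { use : 'a. 'a msg -> 'r }
type packed_msg = { open_with : 'r. 'r msg_user -> 'r }

(* Hide the payload type of a message. *)
let pack (m : 'a msg) : packed_msg = { open_with = fun u -> u.use m }

(* Open a packed message; the consumer must work for any payload type. *)
let deliver (p : packed_msg) : string =
  p.open_with { use = fun m -> m.msg_chan m.msg_dat }

(* Messages with different payload types can now share one list. *)
let queue =
  [ pack { msg_chan = string_of_int; msg_dat = 42 };
    pack { msg_chan = String.uppercase_ascii; msg_dat = "ping" } ]

let results = List.map deliver queue
```

The point of the encoding is that `queue' has the single type
`packed_msg list', even though its elements carry an int and a string.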

Local channels
--------------

The local channel module operates around a hash table, keyed by
existentially-coded channel names. Each row of the hash table effectively has
the following type (although we actually use records for the values, as well
as references to speed things up a little):

  (exist T. T name) * (exist T. T name * T list * (T -> unit) list)

Note that we'd like to give our hash table rows the following type:

  exist T. T name * (T list * (T -> unit) list)

However, there's no way to do this without creating a new hash table type
constructor and implementation.

We maintain the invariant that at least one of either the message list or the
receiver list is empty. The addition of a message or a receiver will cause a
receiver thread to be created if both queues are now non-empty.

A single mutex protects the whole hash table, ensuring that only one thread
can be mutating the table at a time. One might consider splitting the lock to
allow greater concurrency, but there probably wouldn't be a great performance
increase in typical use.

We use the `ifname' construct to compare channel names. This construct is
morally similar to a typecase construct; more details may be found in
HashCaml.README.

General restrictions:

 - Unmarshalled closures cannot rebind to local state.
 - We cannot marshal C values
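
The queue invariant described above (at least one of the message list and the
receiver list is empty) can be illustrated with a small single-channel model.
This is a simplified sketch, not the code in lChan.ml: all names are
hypothetical, there is no hash table or mutex, receivers are one-shot and kept
in a plain FIFO queue, and the receiver is called directly where DCL would
create a receiver thread.

```ocaml
(* State of one channel: pending messages and waiting receivers.
   Invariant: at least one of the two queues is empty. *)
type 'a chan_state = {
  msgs : 'a Queue.t;
  recvs : ('a -> unit) Queue.t;
}

let create () = { msgs = Queue.create (); recvs = Queue.create () }

(* Pair off messages and receivers until one queue is empty.
   Assumption: we call the receiver directly instead of spawning a thread. *)
let rec drain st =
  if not (Queue.is_empty st.msgs) && not (Queue.is_empty st.recvs) then begin
    let f = Queue.pop st.recvs in
    let v = Queue.pop st.msgs in
    f v;
    drain st
  end

(* Adding either a message or a receiver re-establishes the invariant. *)
let send st v = Queue.push v st.msgs; drain st
let register st f = Queue.push f st.recvs; drain st

let invariant st = Queue.is_empty st.msgs || Queue.is_empty st.recvs

(* Usage: one message arrives early, one receiver arrives early. *)
let log = ref []
let st : int chan_state = create ()

let () =
  send st 1;                                       (* queued: no receiver yet *)
  register st (fun v -> log := v :: !log);         (* consumes 1 at once *)
  register st (fun v -> log := (v * 10) :: !log);  (* waits for a message *)
  send st 2                                        (* handed to the waiter *)
```

After each call to `send' or `register', `invariant st' holds, because `drain'
only stops once one of the queues is empty.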