Update: renamed mcerlang to erlmc to avoid namespace conflicts with the model checker.
» http://github.com/JacobVorreuter/erlmc
the protocol
The binary protocol was introduced with memcached version 1.3. It provides a more efficient and extensible alternative to the text-based protocol that preceded it. All of the operations supported by the original protocol are available in the new protocol, and for this reason erlmc uses only the binary protocol for client-server communication.
download erlmc
download erlmc-0.2.tgz
-or-
git clone git://github.com/JacobVorreuter/erlmc.git
unpack and install erlmc
** ensure that memcached is running at this point (>= version 1.3)
tar xvf erlmc-0.2.tgz
cd erlmc-0.2
make
make test (optional)
sudo make install
start erlmc
Open an Erlang shell and start erlmc.
$jacobvorreuter> erl
Eshell V5.7.2 (abort with ^G)
1> erlmc:start().
ok
2> erlmc:stats().
[{{"localhost",11211},
[{evictions,"0"},
{total_items,"7"},
{curr_items,"0"},
{bytes,"0"},
{conn_yields,"0"},
{threads,"5"},
{cmd_set,[...]},
{cmd_get,...},
{...}|...]}]
By default, the client connects to the memcached instance running on localhost, port 11211. To connect to a different server, or to multiple server instances, pass a list of server configurations when starting erlmc:
1> erlmc:start([{"localhost", 22222, 1}]).
ok
2> erlmc:start([{"localhost", 11211, 5}, {"localhost", 22122, 5}]).
ok
The third element in the server config tuple is the connection pool size (number of open sockets).
basic operations
3> erlmc:get(hello).
<<>>
4> erlmc:set(hello, <<"World">>).
<<>>
5> erlmc:get(hello).
<<"World">>
Getting a key that has not been set returns an empty binary, and so does the set operation. Keys can be any Erlang term, but the client converts them to strings so that they can be hashed onto a continuum. Values to be cached must be passed in as binary data.
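Since values must be binaries, caching an arbitrary Erlang term means serializing it first. The sketch below shows one way to do that with the standard term_to_binary/1 and binary_to_term/1 BIFs. The module name erlmc_terms and its two functions are hypothetical helpers for illustration, not part of erlmc.

```erlang
-module(erlmc_terms).
-export([encode/1, decode/1]).

%% Hypothetical helper: serialize any Erlang term into a binary that
%% can be handed to erlmc:set/2 or erlmc:set/3.
encode(Term) ->
    term_to_binary(Term).

%% Hypothetical helper: erlmc:get/1 returns <<>> for a key that has
%% not been set, so an empty binary is treated as a cache miss.
decode(<<>>) ->
    not_found;
decode(Bin) when is_binary(Bin) ->
    {ok, binary_to_term(Bin)}.
```

With these helpers, erlmc:set(user_42, erlmc_terms:encode({user, "jacob"})) stores the tuple and erlmc_terms:decode(erlmc:get(user_42)) reads it back. A serialized term is never an empty binary, so <<>> can safely stand for a miss.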
cache expiration
A third argument can be passed to write operation functions to specify the expiration in seconds.
6> erlmc:set(abc, <<"def">>, 3).
<<>>
7> erlmc:get(abc).
<<"def">>
8> timer:sleep(4000).
ok
9> erlmc:get(abc).
<<>>
distributed memcached and connection pooling
Distributed caching is done by associating a given key with a server instance running memcached. That association must be consistent, so that you only have to perform a lookup on one server to determine whether data has been cached for that key. This is done with a consistent key hashing function and a continuum. You can imagine a continuum as a circle with placeholders for your memcached servers spaced around the edge. Each server instance gets multiple placeholders around the circle (100 is a good number). In erlmc, each placeholder's position is determined by hashing the server's host, port and a random number into a 128-bit unsigned integer:
1> <<Int:128/unsigned-integer>> = erlang:md5("localhost" ++ "11211" ++ integer_to_list(random:uniform(65000))).
<<101,189,232,82,109,65,199,7,215,240,94,78,204,56,14,55>>
2> Int.
135238083730090345086709718695252856375
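Repeating that hash once per placeholder for every server gives the whole continuum. The module below is a minimal sketch of the idea, not erlmc's actual code: the module name continuum_sketch and the functions build/2 and hash/1 are made up for illustration, and it uses the newer rand:uniform/1 where the shell example above uses random:uniform/1.

```erlang
-module(continuum_sketch).
-export([build/2, hash/1]).

%% Hash a string (or any iodata) to a 128-bit unsigned integer via MD5.
hash(Data) ->
    <<Int:128/unsigned-integer>> = erlang:md5(Data),
    Int.

%% Servers is a list of {Host, Port} pairs, with Host a string.
%% Every server gets PointsPerServer placeholders. The result is a list
%% of {Position, {Host, Port}} tuples sorted by Position.
build(Servers, PointsPerServer) ->
    Points = [{hash(Host ++ integer_to_list(Port) ++
                    integer_to_list(rand:uniform(65000))),
               {Host, Port}}
              || {Host, Port} <- Servers,
                 _ <- lists:seq(1, PointsPerServer)],
    lists:sort(Points).
```

Calling continuum_sketch:build([{"localhost", 11211}, {"localhost", 22122}], 100) returns 200 sorted placeholders. Because the positions include a random number, every call produces a different layout.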
server placement around the continuum
Here you can see a sample continuum with two servers and two placeholders each. Assume we've generated four keys by hashing the server hosts, ports and random ints and placed them around the continuum based on their values. The 128 bit uints are abbreviated to fit on the image:

If we were to add a third server to the continuum, it might look like the image below. Note that the placement of any new server on the continuum is random, and that it becomes more evenly distributed when more placeholders are used for each server.

using the continuum to determine which server a cache key belongs to
The point of the continuum is to reliably associate a given cache key with a server. The way to do this is to hash the cache key to a 128-bit uint, place that value on the continuum in the correct spot and move clockwise until the next larger placeholder value is encountered. That placeholder identifies the server to use. If the key's value is larger than every placeholder, wrap around to the smallest one, since the continuum is a circle.
1> <<Int:128/unsigned-integer>> = erlang:md5("zyx").
<<250,201,126,87,150,57,190,63,16,219,103,26,68,98,237,145>>
2> Int.
333353213137747226088668276759233490321

In the example above we've hashed the key "zyx" to the number 333...321 and placed that on the continuum. We then move clockwise around the continuum to find the next server, which happens to be "localhost" 11211 in this case. That is the server that we map the "zyx" key to.
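The clockwise walk can be sketched in a few lines of Erlang. This is an illustration of the technique under stated assumptions, not necessarily how erlmc implements it: the module name continuum_lookup and the function find_server/2 are hypothetical, and the continuum is assumed to be a non-empty list of {Position, Server} tuples sorted by position.

```erlang
-module(continuum_lookup).
-export([find_server/2, hash/1]).

%% Hash a key (a string or binary) to a 128-bit unsigned integer via MD5.
hash(Key) ->
    <<Int:128/unsigned-integer>> = erlang:md5(Key),
    Int.

%% Continuum: non-empty list of {Position, Server}, sorted by Position.
%% Skip every placeholder smaller than the hashed key and take the first
%% one that remains. If none remains, wrap around to the first placeholder.
find_server(Key, [{_, First} | _] = Continuum) ->
    Point = hash(Key),
    case lists:dropwhile(fun({Pos, _}) -> Pos < Point end, Continuum) of
        [{_, Server} | _] -> Server;
        [] -> First
    end.
```

The linear scan keeps the sketch short. With hundreds of placeholders, an ordered structure such as gb_trees would make the lookup faster without changing the result.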
So let's fire up another memcached server instance on a different port and try some distributed caching:
$jacobvorreuter> memcached -d -m 1024 -p 22122 -l localhost
$jacobvorreuter> erl
Eshell V5.7.2 (abort with ^G)
1> erlmc:start([{"localhost", 11211, 1}, {"localhost", 22122, 1}]).
ok
2> erlmc:stats().
[{{"localhost",11211},
[{evictions,"0"},
{total_items,"0"},
{curr_items,"0"},
{bytes,"0"},
{cmd_flush,"0"},
{cmd_set,[...]},
{cmd_get,...},
{...}|...]},
{{"localhost",22122},
[{evictions,"0"},
{total_items,"0"},
{curr_items,"0"},
{bytes,"0"},
{cmd_flush,[...]},
{cmd_set,...},
{...}|...]}]
The erlmc:stats/0 function will return a proplist of [{{Host, Port}, Stats}].
3> erlmc:set("abc", <<"abc">>).
<<>>
4> erlmc:set("zyx", <<"zyx">>).
<<>>
5> erlmc:stats().
[{{"localhost",11211},
[{evictions,"0"},
{total_items,"1"},
{curr_items,"1"},
{bytes,"64"},
{cmd_set,[...]},
{cmd_get,...},
{...}|...]},
{{"localhost",22122},
[{evictions,"0"},
{total_items,"1"},
{curr_items,"1"},
{bytes,"64"},
{cmd_flush,[...]},
{cmd_set,...},
{...}|...]}]
With two servers there is roughly a 50% chance that two keys will land on different servers, assuming the placeholders split the continuum evenly. In this run they did: the stats output above shows a total_items value of "1" for both servers. Success!
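To compare item counts across servers without reading the stats output by eye, the proplist can be reduced to one number per server. The helper below is a hypothetical sketch (stats_sketch and items_per_server/1 are not part of erlmc) and assumes the stat values are strings, as in the output shown above.

```erlang
-module(stats_sketch).
-export([items_per_server/1]).

%% Stats has the shape returned by erlmc:stats/0:
%%   [{{Host, Port}, [{StatName, StringValue}, ...]}, ...]
%% Returns [{{Host, Port}, TotalItems}] with TotalItems as an integer.
%% A server without a total_items entry is counted as 0.
items_per_server(Stats) ->
    [{Server,
      list_to_integer(proplists:get_value(total_items, ServerStats, "0"))}
     || {Server, ServerStats} <- Stats].
```

In the shell, stats_sketch:items_per_server(erlmc:stats()) then gives a compact list of {{Host, Port}, Count} pairs.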
Documentation for erlmc is available at http://github.com/JacobVorreuter/erlmc