Designing for Actor Based Systems

Many people are intrigued and excited about Erlang style concurrency. Once they get the capability in their hands, though, they realize that they don’t know how to take advantage of what processes or actors provide. To do this we need to understand how to decompose systems with process based concurrency in mind. Keep in mind that this material works equally well for actors in Scala or agents in F#. Differences between actors and processes don’t much matter for the sake of this discussion. Before we dive into process based design it will be helpful to look at a more familiar approach so we can contrast the two.

If you come from an OO background your natural instinct is to design much like you do when decomposing a problem for OO programming. After all, processes are much like objects in that they send messages to one another and they hold state. It goes something like this:

  1. Determine your use cases
  2. Create a narrative of what it is you are trying to design
  3. Run through the narrative and pull out the nouns as potential classes
  4. Do the same for the verbs acting on the nouns as potential methods on the classes
  5. Clean all this up, consolidating any duplication

For example, let’s say you were trying to build software to run a vending machine. The use cases might be paying for and getting a soda. Another one might be paying too little and getting change back. So one of the narratives there might be:

As a customer I put sufficient coins into the vending machine and then press the selection button for coke and then press the button to vend and the robotic arm fetches a coke and dumps it into the pickup tray. The coke is nice and cold because the cooling system keeps the air in the vending machine at 50 degrees.

Now we think about all the unique nouns in our narrative which are: customer, coins, vending machine, selection button, coke, vend button, robotic arm and cooling system. And we generally turn them into objects. Next we consider the verbs that act on those nouns and consider them for methods.

selection button : push

vend button : push

robotic arm : pickup (coke)

etcetera… After this you apply some lovely object oriented design principles and voila – you have a system that nicely models your narrative but does not take advantage of more than a single core on your system and is positively undistributable.

“Oh come on,” you say, “don’t be daft, substitute the word actor for object and you are good to go.” Well, as it turns out, not quite. Do we really need a coin process? How about a vend button process? And let’s not forget the coke process! Darnit, this makes no sense! Let’s back up and see what we can do about this.

Designing for Process Based Concurrency

The first thing you must do before we move on is say this three times

“Processes are not threads. Processes are really cheap. Ohmmmm”

“Processes are not threads. Processes are really cheap. Ohmmmm”

“Processes are not threads. Processes are really cheap. Ohmmmm”

This trips up folks new to process based systems. They want to be stingy with processes, worrying that they will take a long time to create, have massive context switching times, pollute L1 cache, etc… Remember that in almost all such systems, certainly for Erlang, Scala and F#, processes/actors/agents are lightweight and are scheduled by the runtime or library rather than mapped one to one onto OS threads. In Erlang they are green threads with their own schedulers built into the VM; Scala actors and F# agents are multiplexed over a pool of threads. Either way, you do not have to swap out an OS thread to switch between running one process and another. With Erlang based systems you usually run one Erlang scheduler per core on the system, and the number of schedulers stays relatively constant no matter how many processes you spawn.
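If you want to convince yourself of just how cheap they are, here is a minimal sketch you can try in an Erlang shell. The module name cheap_procs and the function run/1 are made up for this illustration, and the timing you see will depend entirely on your machine.

```erlang
-module(cheap_procs).
-export([run/1]).

%% Spawn N processes that each sit waiting for a stop message, print how
%% long the spawning took, then tell them all to stop.
%% (Hypothetical module, for illustration only.)
run(N) ->
    {Micros, Pids} = timer:tc(fun() ->
        [spawn(fun() -> receive stop -> ok end end) || _ <- lists:seq(1, N)]
    end),
    io:format("spawned ~p processes in ~p microseconds~n", [N, Micros]),
    lists:foreach(fun(Pid) -> Pid ! stop end, Pids),
    ok.
```

Try cheap_procs:run(100000) and compare the result with what you would expect from creating the same number of OS threads.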

With that in mind we can get past two of the sticking points many people new to process based systems have: not taking advantage of all the concurrency in the system, and resorting to complex “process pooling”.

One process for each truly concurrent activity in the system.

That is the rule. Going back to our vending machine, what do we have that is really concurrent in that system? Coins? Not really. Slots? Not really. Buttons? Not really. Those are not activities, they are things. What are the truly concurrent activities, the activities that do not have to happen in synchronous lock step?

  • Putting coins into the slot
  • Handling coins
  • Handling selections
  • Fetching the coke and putting it into the pickup tray
  • Cooling the soda

We can use a process for each of these activities. Name them for the nouns that perform the activities if you want, but remember we are not making them processes because they are nouns. Notice how granular we go here: we did not just create a process for the customer and one for the vending machine. We created one for each of the truly concurrent activities in our narrative, and in this way we leverage more of the concurrency available to us. Now that we know what processes we need, the next step is to organize them.
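To make that concrete, here is a rough sketch of two of those activities as plain Erlang processes. Everything in it (the vending_sketch module, the message shapes, the one second cooling tick) is an assumption made up for illustration, not a real vending machine design.

```erlang
-module(vending_sketch).
-export([start/0, coin_handler/1, cooler/0]).

%% Start one process per concurrent activity (only two shown here).
start() ->
    Coins = spawn(?MODULE, coin_handler, [0]),
    Cooler = spawn(?MODULE, cooler, []),
    {Coins, Cooler}.

%% Handling coins: the running credit is private state of this process.
coin_handler(Credit) ->
    receive
        {coin, Value} ->
            coin_handler(Credit + Value);
        {get_credit, From} ->
            From ! {credit, Credit},
            coin_handler(Credit);
        stop ->
            ok
    end.

%% Cooling the soda: ticks along on its own clock whether or not a
%% customer is around.
cooler() ->
    receive
        stop -> ok
    after 1000 ->
        %% a real system would read a sensor and drive the compressor here
        cooler()
    end.
```

Notice that the cooler never talks to the coin handler. The two activities run side by side and only share what they explicitly send each other.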

Organizing Processes

The various languages that use process based concurrency have differing levels of sophistication here. I am going to draw on the concepts from the Erlang language which have been used in the Akka system for Scala and which I have rolled successfully myself in F#.

Again, forget all of your OO modeling techniques. Processes are not objects, they are fundamental units of concurrency. Forget all your thread modeling techniques too; it’s not even close. Share nothing, copy everything changes the game. To get started, think about which of your processes have to cooperate with one another. In this case, what do we have?

  • Putting coins in a slot, cooperates with
  • Handling coins, cooperates with
  • Handling selections, cooperates with
  • Fetching the coke and putting it into the pickup tray

Nothing cooperates with cooling the soda; it happens whether or not other processes are there to support it. Putting coins in the slot, however, makes no sense if there is no way to handle them, and handling them makes no sense if you can’t make a selection, and making a selection… well, you get the drift.

To model this we are going to use a tree of “Supervisors”. Supervisors create and watch over processes. Because actors share nothing and copy everything, one can’t corrupt another. So a supervisor can watch over an actor and restart it when it blows up in the presence of some error. This means we get some incredible fault tolerance. But, that aside, let’s talk about how to model these dependencies. We do so in a tree. First, we set up a supervisor at the top of the tree which models no dependencies between any of the processes it starts. In this layer we add the cooling system, and then we add another supervisor which will start the group of dependent processes in the order in which they depend on one another. This supervisor will restart processes that die according to their dependencies. If a dependency dies, the supervisor will kill and restart the processes that depend on it, so that everything down the chain starts again from a known base state.


[Figure: process tree]
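In Erlang/OTP terms, a tree like this might be sketched as below. This is one possible reading, not the only way to do it: I am assuming a one_for_one strategy at the top (no dependencies between children) and rest_for_one for the dependent group (a dying child takes everything started after it down with it). The module name, the process names and the placeholder worker loop are all made up for the example, and both supervisors live in a single module only to keep the sketch in one file.

```erlang
-module(vending_sup).
-behaviour(supervisor).
-export([start_link/0, start_chain/0, start_worker/1, init/1]).

%% Top of the tree: children here do not depend on one another.
start_link() ->
    supervisor:start_link({local, vending_sup}, ?MODULE, top).

%% Second supervisor: children here depend on one another in start order.
start_chain() ->
    supervisor:start_link({local, vending_chain_sup}, ?MODULE, chain).

init(top) ->
    SupFlags = #{strategy => one_for_one, intensity => 5, period => 10},
    Children = [worker(cooling),
                #{id => chain_sup,
                  start => {?MODULE, start_chain, []},
                  type => supervisor}],
    {ok, {SupFlags, Children}};
init(chain) ->
    %% rest_for_one: when a child dies, the children started after it are
    %% terminated and restarted as well. So the thing everyone depends on
    %% (fetching) starts first and the coin slot starts last.
    SupFlags = #{strategy => rest_for_one, intensity => 5, period => 10},
    Children = [worker(fetcher),
                worker(selection_handler),
                worker(coin_handler),
                worker(coin_slot)],
    {ok, {SupFlags, Children}}.

worker(Name) ->
    #{id => Name, start => {?MODULE, start_worker, [Name]}}.

%% Placeholder worker: a registered process that just swallows messages.
%% A real system would use a gen_server or similar here.
start_worker(Name) ->
    Pid = spawn_link(fun loop/0),
    register(Name, Pid),
    {ok, Pid}.

loop() ->
    receive _ -> loop() end.
```

To play with it, call vending_sup:start_link() and then exit(whereis(fetcher), kill). The selection, coin handling and coin slot processes should come back with new pids, while the cooling process carries on untouched.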

Now with things decomposed into processes, dependencies fleshed out and placed into a supervision hierarchy, you are basically ready to go. Is there more to designing for actor based concurrency? Yes, of course there is, but here you have the fundamentals. Now it’s time to go and play with it and generate questions. Feel free to ask them here or on twitter at @martinjlogan. I may do a second installment on some more advanced topics based on feedback.

If you want to learn more come to Erlang Camp Oct 10 and 11, 2014 in Austin!

How to use Vim for Erlang Development


This post sponsored by ErlangCamp 2013 in Nashville which was epic!

You are about to learn to use Vim as your editor for Erlang development. You will learn how to install and use a variety of really powerful Vim plugins to make Erlang dev with Vim smooth and satisfying!

I have been developing Erlang now for about 13 years, many of them full time, and even wrote a book on Erlang: Erlang & OTP in Action. I have loved every minute of it but there was always one thing that made me sad, probably makes you sad too: Emacs. Emacs is the de facto editor for Erlang. The emacs mode included with the Erlang distro is quite wonderful. The fact still remains, Emacs, we do not like it. ctrl ~, ctrl x ctrl f etc… Nope!

Setting up Vim for Erlang

Let’s get started setting up Vim for Erlang development. The first thing we need to do is set up pathogen so that installing subsequent packages is really simple. Start by creating the directory $HOME/.vim/autoload. Download pathogen.vim (from tpope’s vim-pathogen project) and place it into this directory. Now add the following 2 commands to your $HOME/.vimrc file.


call pathogen#infect()
call pathogen#helptags()

At this point pathogen will install and generate help documentation for any plugin you place into the $HOME/.vim/bundle directory – which you should of course create.

With this created, we are now ready to start installing plugins to make your life easier. Try these on for size by cloning their git repos directly into the $HOME/.vim/bundle directory. They will simply work next time you start vim.

vimerl.vim: Indenting, autocomplete and more for Erlang.
ctrlp.vim: Press ctrl p to open a powerful fuzzy file finder. Makes navigating file trees a thing of the past.
NERDTree: Powerful file tree navigator right in vim. I don’t use it much since I installed ctrlp though.
NERDTree Tabs: Adds the NERDTree file finder to all tabs you have open in vim.

Before we get into basics on how to use all these plugins to create Erlang magic I want to show you two bonus tricks I really love. First, get a better color scheme. To do this create the directory $HOME/.vim/colors and find yourself a slick color scheme to drop into it. I recommend vividchalk.vim by TPope.

Pro Tip
For Dropbox or other file sync users: you can keep all your vim installs in sync easily. Take your .vim and your .vimrc and move them into your Dropbox directory. Then run:


ln -s ~/Dropbox/.vim ~/.vim
ln -s ~/Dropbox/.vimrc ~/.vimrc

Now all your machines’ vim installs will run just the same. If you have compatibility problems on any one of them, well, then just skip this for that machine.

Ok, so now on to how to use these plugins for Erlang/Vim greatness.

How to Use our Vim Plugins for Erlang Dev

I am going to use the source for Erlware Commons as an example. So I clone it first and then change into the erlware_commons directory and run vim. Now let’s say I know what file I want to update, specifically the “ec_date.erl” file. The first thing I do is type <ctrl> p and then start typing ec_date.erl.

                                                                                                                                                          
~                                                                               
[No Name] [TYPE= unix] [0/1 (100%)]                                             
> test/ec_dictionary_proper.erl
> src/ec_dictionary.erl
> src/ec_date.erl                                                               
 prt  path  ={ files }=  >> ec_da

You can see that as I start typing and get to “ec_da” ctrlp has already displayed a narrowed down list of files in the directory tree under where I have opened vim that match. The file on the bottom ec_date.erl is the one selected and so just pressing enter here will open it. If I wanted to select “test/ec_dictionary_proper.erl” then I could simply press the up arrow and select it or keep typing until it was the only selection.

Now, what if I don’t know what file I want to select? This is where NERDTree comes into play. Run :NERDTree and you will pop open the file browser. Like this:

  Press ? for help             |
                               |~                                               
.. (up a dir)                  |~                                               
<lang-projects/erlware_commons/|~                                               
▸ doc/                         |~                                               
▸ priv/                        |~                                               
▸ src/                         |~                                               
▸ test/                        |~                                               
  CONTRIBUTING.md              |~                                               
  COPYING                      |~                                               
  Makefile                     |~                                               
  README.md                    |~                                               
  rebar.config                 |~                                               
  rebar.config.script          |~                                               
~                              |~                                               
~                              |~                                               
~                              |~                                                                                                                                       

Here we can see the directory tree for Erlware Commons. Each of the directories can be easily selected and expanded. Individual files can be selected and opened. There are a variety of ways to open a file. Below are the most common:

  • <enter> will open the file in the right pane
  • T will open in a new tab within vim and keep focus in NERDTree
  • t will open in a new tab and bring focus to the new tab

If you want to see the NERDTree browser in all your tabs, use :NERDTreeTabsToggle to toggle it on and off. It will be the exact same NERDTree in the exact same state and cursor position on all tabs, nice! Once you are focused on the code in a given tab and you want to jump back to the left and into the NERDTree pane, use <ctrl> ww.

Once you have a load of tabs open you need to switch between them, and to do this you need only two commands:

  • gt will go to the next tab
  • gT will go to the previous tab

Pro Tip
Map the tab commands and the NERDTreeTabsToggle command by adding the following to your vimrc.


map <C-t> :tabn<Enter>
map <C-n> :tabnew<Enter>
map nt :NERDTreeTabsToggle<Enter>

Ok, now on to editing Erlang with vimerl.

Editing with vimerl

This is not going to be an exhaustive list of vimerl editing commands but just a few of the goodies. The 20% you will use 80% of the time.

Auto-indenting

vimerl will auto-indent for you as you type. But if you come across a line that you want to indent, try typing ==. Let’s say you want to indent a block of code. Simple: mark the line that starts the block with ma, then go to the end of the block and tell vimerl to indent to the mark as such: ='a. Now if your whole file is a mess then try gg to go to the beginning of your file and then =G to indent all the way to the end. You can do this all in one step as in gg=G.

Code Completion

ctrl-x ctrl-o after typing a module name and a : will cause vimerl to suggest function names for you. It does this by searching the .beam and .erl files in the Erlang code path (run code:get_path() in an Erlang shell to see what they are) as well as looking at your rebar deps_dir if you are using rebar.config as part of your project.

Skeletons

This is the feature that I loved most about the emacs mode for Erlang, well this and the auto indenting (most of the time, the fun() indenting still feels like a kick in the teeth). Here is a list of the most useful skeletons and the commands to generate them from within vimerl.

  • :ErlangApplication generate the skeleton for an OTP application behaviour.
  • :ErlangSupervisor generate the skeleton for an OTP supervisor behaviour.
  • :ErlangGen[Server|Fsm|Event] skeletons for gen server, fsm and event – yay!

Brilliant, isn’t it? Before I let you go there is one more invaluable command you should know about, which is :help vimerl. It will give you a list of all the other useful commands you may want to use. Remember, to get it working be sure to add call pathogen#helptags() to the top of your .vimrc file. Goodbye Emacs, welcome back old friend Vim.

Follow me on twitter @martinjlogan

<esc>:wq

Mixed Erlang and Scala with Scalang

This is a summary of a talk by Cliff Moon (@moonpolysoft) given at Strangeloop about building mixed Erlang and Scala systems with Scalang. Boundary does network analytics as a service. Their architecture uses a mixture of Erlang and Scala. Erlang is very, very good at doing things like no downtime deploys. We can have very low downtime on public facing parts of the system and we don’t even have to go down for deploys. On the data processing side, one of the things that Erlang is very bad at is dealing with numbers, and generally anything where mutability has a high value.

Making the Scala side talk to the Erlang side was required given the language choices for the system. It turns out that Erlang ships with Jinterface, which is just the thing, or so it seemed. Unfortunately it ended up being really, really cumbersome. Jinterface is at the wrong level of abstraction. Erlang is all about actors and Jinterface only exposes mailboxes. All the rich interface you get in Erlang with actors goes away when you are stuck with only mailboxes. The other problem is that it is not performant. Primitives end up getting wrapped twice, first by Jinterface and then by a case class in Scala, which is just way too heavyweight when trying to process millions of pieces of data.

They decided to take a step back and build something that would be easier to use. They were looking for more correctness in behavior, meaning things that behave more or less like Erlang actors. They wanted performance, and then simplicity: not having to deal with custom serializers and other such cruft. The internal architecture is built on NIO sockets and Netty. There are also a bunch of codecs to do encoding and decoding between Erlang and Scala. There is also a delivery system which deals with registration, and actors which run in Jetlang, an actor framework for the JVM.

The main interface into the system on the JVM side is something called a Node – this should be very familiar to Erlangers. It takes a node name and a magic cookie. So, pretty much exactly what you would expect.

Once you have a node you want to make a process. Processes are spawned and messages are sent with the ! operator, just like in Erlang. You can send messages to a Pid, you can send messages to a local registered name, and you can send messages to a remote registered name by supplying a tuple of name and node. So basically just like Erlang.
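For anyone who has not seen the Erlang side of this, here are the same three kinds of send written in plain Erlang. This is not Scalang code and was not part of the talk; the module name send_targets and the echo process are made up for the example, and the "remote" send simply targets the local node so the sketch runs on a single machine.

```erlang
-module(send_targets).
-export([demo/0]).

%% Send one message by pid, one by locally registered name, and one by
%% {Name, Node}, then collect the three echoed replies.
demo() ->
    Pid = spawn(fun echo/0),
    register(echo_proc, Pid),
    Pid ! {self(), to_pid},
    echo_proc ! {self(), to_local_name},
    %% node() is the local node here; a real remote send would name another node
    {echo_proc, node()} ! {self(), to_name_on_node},
    Replies = [receive {echo, M} -> M after 1000 -> timeout end
               || _ <- lists:seq(1, 3)],
    Pid ! stop,
    Replies.

%% Reply to whoever sent the message, until told to stop.
echo() ->
    receive
        {From, Msg} -> From ! {echo, Msg}, echo();
        stop -> ok
    end.
```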

[Photo: Cliff Moon talking about processes in Scalang]

Error Handling in Scalang

Scalang fires a link breakage exit signal any time a Scalang process throws an uncaught exception. It works between Erlang and the JVM. The one problem is that this is not preemptive on the JVM side, as lightweight preemptive actors on the JVM seem hard to do.

Erlang to Scala Type Mappings

Most things are a one to one mapping for primitives. Anything that does not fit, like numbers, will be turned into something reasonable on the Scala side. If that is not quite good enough for you and you want to do rich type mappings, you can use a rich type mapping plugin to turn rich types into records and vice-versa.

Scalang Services

One of the big things about Erlang is OTP. You typically use behaviors such as gen_server. These behaviors give you messaging primitives for sync and async and lots of other good stuff. Scalang wants to be able to interact with gen_servers transparently on the other side. So three functions are implemented for Scalang processes:

handleCall
handleCast
handleInfo

These will look very familiar for most Erlangers (if you are coding in OTP like you should be). Scalang also supports anonymous processes. You can just spawn processes with funs for those times when you don’t want a gen_server.
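In case the names do not ring a bell, these are the Erlang callbacks they mirror. Below is a minimal gen_server sketch on the Erlang side; the counter_server module and its API are invented for illustration and have nothing to do with Scalang itself.

```erlang
-module(counter_server).
-behaviour(gen_server).
-export([start_link/0, increment/0, value/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() -> gen_server:start_link({local, ?MODULE}, ?MODULE, 0, []).

%% Async: fire and forget, handled by handle_cast/2.
increment() -> gen_server:cast(?MODULE, increment).

%% Sync: blocks for the reply, handled by handle_call/3.
value() -> gen_server:call(?MODULE, value).

init(Count) -> {ok, Count}.

handle_call(value, _From, Count) -> {reply, Count, Count}.

handle_cast(increment, Count) -> {noreply, Count + 1}.

%% Anything sent with a plain ! ends up here.
handle_info(_Msg, Count) -> {noreply, Count}.
```

After counter_server:start_link(), two calls to counter_server:increment() followed by counter_server:value() should give you 2.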

Runtime Metrics

This is what Boundary does, so they wanted to bake metrics into all of their JVM stuff. Scalang has a full suite of runtime metrics. You get things like meters showing how many messages have come across the wire for each process. Histograms for process performance. Time spent in serialization. Message queue sizes and quite a number of other metrics. The idea was to make it similar to pulling up a remote shell into an Erlang instance and being able to query to see where the bottlenecks are.

Scalang JVM Performance Tuning

We are all about running fast here. Scalang aims to make things easily tunable. It turns out one of the best ways to performance tune is to screw around with the thread pools. The ThreadPoolFactory lets you screw around with different implementations. There are 4 kinds of pool:

Boss Pool – initial connection and accept handling
Worker Pool – non blocking reads and writes
Actor Pool – process callbacks
Batch executor – per process execution logic

Editorial: The system really looks to be quite powerful. It allows for the features I describe above as well as easy remote shell invocation of JVM nodes. It actually interacts nicely with EPMD for native feeling messaging. The system seems to be abstracted more appropriately than any Erlang to X intercoms library I have run across. I look forward to hearing about experiences using it.

Martin Logan (@martinjlogan) also, if you are into distributed systems and metrics you should check out Camp DevOps Conf in Chicago this Oct

Here is where you can find the code and example usage information: https://github.com/boundary/scalang