Messaging Systems

Intro

Why use messaging at all?

Recoverability: if all else fails, you should still have your messages, and once normal service has been restored you can work through them and get your systems back online, hopefully with minimal damage done.

Dealing with spikes in traffic: you want your systems to cope with load above their normal capacity, even if it then takes longer to process each message. The queue holds the backlog until the consumers catch up.
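To make the spike point concrete, here is a minimal sketch of a queue absorbing a burst. It is a toy simulation, not any particular broker: the function name `simulate`, the arrival numbers and the fixed capacity are all made up for illustration.

```python
from collections import deque


def simulate(arrivals_per_tick, capacity_per_tick):
    """Return (queue depth after each tick, wait in ticks for each message)."""
    queue = deque()
    depths = []
    waits = []
    tick = 0
    # Keep ticking until every arrival is in and the backlog has drained.
    while tick < len(arrivals_per_tick) or queue:
        arrived = arrivals_per_tick[tick] if tick < len(arrivals_per_tick) else 0
        for _ in range(arrived):
            queue.append(tick)  # remember the tick each message arrived on
        # The consumer can only process a fixed number of messages per tick.
        for _ in range(min(capacity_per_tick, len(queue))):
            waits.append(tick - queue.popleft())
        depths.append(len(queue))
        tick += 1
    return depths, waits


# Steady traffic of 2 per tick, with one spike of 10, against a capacity of 3.
depths, waits = simulate([2, 2, 10, 2, 2], capacity_per_tick=3)
print("queue depth per tick:", depths)
print("longest wait in ticks:", max(waits))
```

Nothing is dropped during the spike. The backlog grows, messages wait longer, and the queue empties again a couple of ticks after traffic returns to normal.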

There are probably other reasons too, but those are the two main ones I can think of right now.

Options

I have used a few, so here is a brief description of each and a comparison of their value.

RabbitMQ

RabbitMQ is written in Erlang. It is an interesting language and, according to Wikipedia, it is functional and designed for fault-tolerant, highly available, soft real-time systems (soft meaning a delay in processing is tolerable): https://en.wikipedia.org/wiki/Erlang_(programming_language)

Erlang is also known for its pattern-matching capabilities, which I guess are useful with topics (more on those later). Above all, I guess concurrency is its principal strength, which is great when you are dealing with millions of messages.

I once spoke to a guy who programmed in Erlang, and he seemed so engrossed in the language that I am sure he had swapped English for Erlang and was speaking to me in it.

RabbitMQ uses a protocol called AMQP.

It also has client libraries for JavaScript, C# and many other languages, and they are pretty well implemented. This shouldn’t be taken for granted!

Messages are published to exchanges, and queues are bound to those exchanges. An exchange can feed a single queue (one-to-one) or several queues (one-to-many).
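The exchange and queue relationship can be sketched in a few lines. This is a toy in-memory model, not the RabbitMQ client API: the `Exchange` class and the queue names are hypothetical, and it only shows the one-to-many case where every bound queue gets its own copy.

```python
from collections import deque


class Exchange:
    """Toy exchange that copies each published message to every bound queue."""

    def __init__(self, name):
        self.name = name
        self.queues = []

    def bind(self, queue):
        # Binding is what connects a queue to an exchange.
        self.queues.append(queue)

    def publish(self, message):
        # Publishers only know about the exchange, never the queues behind it.
        for queue in self.queues:
            queue.append(message)


orders = Exchange("orders")
billing, shipping = deque(), deque()
orders.bind(billing)
orders.bind(shipping)
orders.publish("order-1 created")

print(list(billing))
print(list(shipping))
```

The useful part of the design is that the publisher does not change when you add another consumer. You bind one more queue and it starts receiving copies.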

Topics

You can publish messages with a topic (the routing key), which ensures that only queues bound with a matching topic receive them. Binding your queue with the right topic pattern ensures that you only get the messages you want.

You can host RabbitMQ yourself or use a hosted service such as CloudAMQP.
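Topic matching in AMQP works on dot-separated words, where `*` in a binding pattern stands for exactly one word and `#` for zero or more words. Below is a minimal sketch of those rules. The function `topic_matches` and the example keys are my own illustration, not RabbitMQ's implementation.

```python
def topic_matches(pattern, routing_key):
    """Return True if an AMQP-style topic pattern matches a routing key."""
    p = pattern.split(".")
    k = routing_key.split(".")

    def match(i, j):
        if i == len(p):
            # Pattern used up: it matches only if the key is used up too.
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb zero or more words, so try every possible split.
            return any(match(i + 1, jj) for jj in range(j, len(k) + 1))
        if j == len(k):
            return False
        if p[i] == "*" or p[i] == k[j]:
            # '*' matches exactly one word; a literal must match exactly.
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


print(topic_matches("orders.*", "orders.created"))
print(topic_matches("orders.*", "orders.eu.created"))
print(topic_matches("orders.#", "orders.eu.created"))
```

So a queue bound with `orders.*` sees `orders.created` but not `orders.eu.created`, while a queue bound with `orders.#` sees both.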

SQS

To SNS or not to SNS

This is Amazon’s messaging system. It can be used in conjunction with SNS or on its own. When the two are used together, SNS works as a sort of exchange and SQS as the queue.

SQS uses standard HTTP underneath and offers a polling model for consuming messages, which is a little primitive in my opinion. The client libraries are also pretty basic, although people are starting to build better libraries on top. Taking dotnet as an example, EasyNetQ (a RabbitMQ client) is much more complete than any of the client offerings for SQS, so we have essentially had to roll our own.

It’s not as sophisticated as something like RabbitMQ. Its strengths lie in its simplicity and its ability to scale. Hosting is taken care of by AWS, so we don’t have to worry about that side of things, and it plays well with other AWS technologies such as Lambda and DynamoDB.
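To show what the polling model looks like in practice, here is a rough sketch of one receive, handle and delete cycle. It is written in Python rather than dotnet, and `poll_once` is my own helper name. The `receive_message` and `delete_message` calls follow the shape of the boto3 SQS client, while `FakeSqs` and the queue URL are hypothetical stand-ins so the example runs without an AWS account.

```python
def poll_once(sqs, queue_url, handle, wait_seconds=20):
    """Receive one batch, handle each message, then delete it."""
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,        # SQS returns at most 10 messages per call
        WaitTimeSeconds=wait_seconds,  # a value above 0 requests long polling
    )
    messages = response.get("Messages", [])  # key is absent when nothing arrived
    for message in messages:
        handle(message["Body"])
        # A message stays on the queue until it is deleted explicitly.
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])
    return len(messages)


class FakeSqs:
    """Hypothetical in-memory stand-in that mimics the two calls used above."""

    def __init__(self, bodies):
        self.pending = {f"rh-{i}": body for i, body in enumerate(bodies)}

    def receive_message(self, QueueUrl, MaxNumberOfMessages=1, WaitTimeSeconds=0):
        batch = list(self.pending.items())[:MaxNumberOfMessages]
        if not batch:
            return {}
        return {"Messages": [{"ReceiptHandle": rh, "Body": body} for rh, body in batch]}

    def delete_message(self, QueueUrl, ReceiptHandle):
        del self.pending[ReceiptHandle]


seen = []
fake = FakeSqs(["a", "b", "c"])
handled = poll_once(fake, "https://example.invalid/my-queue", seen.append)
print(handled, seen)
```

With the real SDK you would pass `boto3.client("sqs")` and your own queue URL instead of the stand-in, and call `poll_once` in a loop. Everything a richer client library gives you (retries, serialisation, dispatching by message type) has to be built around that loop.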

JMS

JMS is from the Java community and is an API specification, as opposed to a protocol like AMQP. An implementation can use a number of different protocols underneath; we use TCP in ours.

Coming from a dotnet background, I found it a little difficult, as I had to consume messages using a C# library called NMS.

JMS has various implementations, and the one I used was ActiveMQ. It is open source and relatively easy to install, although creating and consuming messages is a little trickier.

Summary

All three options are useful depending on your needs. RabbitMQ probably has the nicest feature set but is not the easiest to set up and get started with. SQS is a better option if you are already using AWS, and I guess JMS with ActiveMQ is nice for those who use Java.

There are other options too, such as Kafka, but I would encourage you to think about whether you really need a messaging system before implementing one, as it does add its own complexity.

AWS Musings

I wasn’t sure what to title this post. I woke this morning with a sort of urge to write about AWS and my experiences so far developing applications and using its tools, so I settled on the above.

I guess I have been using AWS for 4 or 5 years now and have used a wide variety of the tools on offer. Initially, like a lot of people, I could appreciate its advantages in terms of flexibility and scalability, but what I realise more and more is just how far its effects spread beyond that.

I have been using Lambda quite a bit recently, and SNS and SQS too. They give you great control over your costs, just like EC2, auto scaling and so on. Likewise DynamoDB. Balanced against that, though, is the old ‘with great power comes great responsibility’ Peter Parker chestnut.

It is possible to get your auto scaling wrong and end up with far too many server instances. You also have to be aware that you are charged per request when polling SQS, even when a poll comes back empty; long polling helps with this, but it is not the default. There are many pitfalls, and it is worth bearing this in mind before you go too far down the AWS path. S3 costs can also really mount up if you are not deleting objects you no longer need.
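A bit of arithmetic shows why the polling mode matters. This sketch counts the receive requests one idle consumer makes in a 30-day month. The one-second short-poll interval is my assumption, the 20-second wait is the maximum long-poll time SQS allows, and the price is a parameter you would fill in from the current SQS pricing page rather than a figure I am quoting.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000 seconds in a 30-day month


def empty_receives_per_month(seconds_between_requests):
    """Requests made by one consumer polling an empty queue all month."""
    return SECONDS_PER_MONTH // seconds_between_requests


def monthly_cost(requests, price_per_million_requests):
    """Cost of those requests; the price is supplied by the caller."""
    return requests * price_per_million_requests / 1_000_000


short_polling = empty_receives_per_month(1)   # assumed: one short poll per second
long_polling = empty_receives_per_month(20)   # each long poll waits up to 20 seconds

print("short polling requests:", short_polling)
print("long polling requests:", long_polling)
print("reduction factor:", short_polling // long_polling)
```

That is 2,592,000 requests against 129,600, per consumer, for a queue with nothing in it. Multiply by the number of consumers and queues you run and the difference becomes visible on the bill.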

The point, though, is that developers need to be more aware of costs than before, as they have direct control over them. I think it changes your role a bit, and the things you think about from day to day. I guess in the past developers still had a lot of input on software and infrastructure selection and its costs, but not to the same degree.

Another thing I found, or maybe did not find, is documentation. AWS has grown rapidly, and it does appeal to our imagination, in that things like Lambda bring a certain amount of excitement at the thought of the possibilities. The other side of that is that sometimes I feel like shortcuts have been taken.

Documentation is probably one of the first things to go when time is of the essence. There is a lack of it when it comes to AWS in general, particularly with some of the newer technologies.