Generally, there are two ways to have microservices communicate with each other [1]. One technique uses REST to have one service directly query or act on another service. The other has services pass messages over a message bus. There is plenty of information online about what makes a good REST endpoint, but what makes a good message? There are two ways to think about that question: what does a good message look like, and what is a good message about?

A good message doesn’t need much; this is definitely a less-is-more situation. A message needs some identifying information: who sent it, what it’s about, some verb. A message also needs a payload. If you separate that metadata from the content, you can encrypt the message body itself without impacting a consumer’s ability to decide whether the message is worth paying attention to.
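As a sketch, a message along these lines would do (the field names here are just an illustration, not a standard):

{
  "source": "billing-service",
  "type": "invoice.created",
  "id": "f81d4fae-7dec-11d0-a765-00a0c91e6bf6",
  "payload": "<body goes here, possibly encrypted>"
}

Everything a consumer needs to route or discard the message sits in the first three fields; only interested consumers ever have to decrypt or parse the payload.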

What goes in a good message? This is a bit more interesting to me and probably a lot more subjective. I believe this is the true difference between REST and message passing. In REST, one service asks something of another service; in message passing, one service tells all the others that something happened. In REST, service A would ask service B for some information, but in message passing service B would send a message letting everyone know about the new state of the world, and that message happens to be consumed by service A.

The downside, and hidden upside, to this is that information will be duplicated. When service B sends an event saying a resource was added and service A consumes that event, service A likely needs some place to store that information. This duplication is unfortunate. A positive, though, is that service A no longer depends on service B running, which increases the stability of your system. Another positive is that service A can store the information in a way that suits its needs (I would go so far as to say it is a bad smell if a resource looks exactly the same in two different services).

By keeping all messages in the system informative, rather than requests for action, you should also end up with a system with faster response times. Because each service keeps the information it needs close at hand, it doesn’t have to call out to other services, cutting out extra network time.

What makes a good message? Keep the message itself separate from information about the message. Keep it simple. Stick to messages that inform other services of events that occurred. Avoid messages that ask another service to do something.

 


  1. http://martinfowler.com/articles/microservices.html#SmartEndpointsAndDumbPipes

Recently, I found myself having some significant issues writing unit tests. I couldn’t quite get the class I was testing into the correct state for some method calls. The class in question, through initializing with request data, was used to determine if a request was authorized or not. At about the time I started googling ways to mock or disable the constructor I realized my issue was that my constructor did too much. This got me thinking, what makes a good constructor?

Wikipedia says that a constructor “prepares the new object for use, often accepting arguments that the constructor uses to set required member variables.” One of the books on my shelf, which I either learned Java from or simply collected over the years, Java Software Solutions by Lewis and Loftus, 3rd edition, says a constructor “is similar to a method that is invoked when an object is instantiated. … we often use a constructor to initialize the variables associated with each object.” Simula 67, the first language with classes, doesn’t so much have a constructor as allow the programmer to place logic in the class body to be executed. Smalltalk, the first OO language, doesn’t have constructors, though there are conventions. A quote from the first edition of The C++ Programming Language, by Bjarne Stroustrup, might add to this narrative, but I don’t have a copy. Regardless, perhaps history is not the best way to settle this question.

Thankfully, the quotes from the Wikipedia article and from Lewis and Loftus agree: a constructor is for setting up object variables. This sounds straightforward enough, but what about my earlier example? Is the authorized status of the request a property that should be set by the constructor? Clearly not, based on my experience. Perhaps authorized isn’t even a property, but merely an opportunistic caching of a calculation that, once determined, isn’t going to change during the lifetime of the object? I’m inclined to say no; authorized is a state of the object, which implies the existence of an object variable to hold that state.

If that is the case, then what states can my object exist in? For my stripped-down example, the request object has the states of authorized and unauthorized. Which is where I got into trouble: determining whether a request was authorized required calling a healthy proportion of the object’s methods, and that made it hard to unit test. Do I mock half of the class’s methods, nearly all of which are private? Find a way to disable the constructor? No is the answer to both of those questions. So I came up with a third state, a base state where I don’t yet know whether the request is authorized. Now my constructor is nothing but assignment statements and it’s easy to set an object up for unit testing. Unfortunately, I’ve had to push logic down into various other methods to handle the situations where the information isn’t known yet and throw appropriate errors.

Whoops, turns out that the state of not knowing violates a class invariant. Requests aren’t quantum; they actually are either authorized or not. And now I’ve got a problem. Do I make a class whose objects always obey their invariants but are not testable, or do I allow what amounts to two methods being needed to fully construct the object, but keep it fully testable? The latter is annoying but the clear winner. An alternate solution I haven’t explored is passing the authorized status to the constructor as an optional parameter or override.
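To make that trade-off concrete, here is a rough sketch of the shape I ended up with (the names, and the trivial check standing in for the real authorization logic, are made up for illustration):

#!/usr/bin/perl
use strict;
use warnings;

package Request;

# The constructor only assigns; authorized stays unknown until authorize() runs.
sub new {
    my ($class, %args) = @_;
    my $self = {
        headers    => $args{headers} || {},
        authorized => undef,
    };
    return bless $self, $class;
}

# The second step of construction; the real version calls a pile of private helpers.
sub authorize {
    my ($self) = @_;
    $self->{authorized} = exists $self->{headers}{'Authorization'} ? 1 : 0;
    return $self->{authorized};
}

package main;

# In a unit test the object can now be built without triggering any of the checks.
my $request = Request->new(headers => { 'Authorization' => 'token' });
print $request->authorize(), "\n";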

What does this mean for the role of the constructor? It means that the constructor’s role is to set up a state of the object, meaning setting the object’s variables, in a way that obeys the class invariants. In a perfect world it would, anyway. More practically, it means picking the maximal subset of expert advice and best practices, where maximal is based on the legibility and maintainability of the class. Which is what Wikipedia said all along, “prepares the new object for use,” emphasis on prepares.

Everyone loves JSON. I love JSON. JSON is great at storing complex data structures to be written and read by programs. It can even be written so that it’s fairly easily human readable. Yet the JSON format is missing comments, those things we mostly agree we need for anything non-trivial. This is completely fine for anything written by and for programs (I might have named this article “Don’t use JSON for your config files which humans might have to edit,” but I opted for brevity), but not so great when our meat computers are involved.

Some context is in order. Lately, I have been working on some CloudFormation templates. CloudFormation is a way to textually describe to AWS what you want to build. A template might contain a VPC object, a couple of subnet objects, a couple of objects tying the VPC and subnets together, and then you fill that out with security groups, autoscaling groups or EC2 instances, and you end up with a very long file. These files can easily become hundreds of lines long. I would really love to leave some comments for three-months-from-now me, but I can’t, because it’s JSON.

I was going to spend a few paragraphs discussing the history of comments in an effort to explain why they are important, but there really isn’t much to discuss. If you look at the popular early programming languages, COBOL, LISP, FORTRAN, they have comments. Assembly has comments. If you examine Ada Lovelace’s notes on Babbage’s Analytical Engine, you find her describing Note G (regarded as the first program). I will take this as evidence that the value of comments is self-evident.

The question, then, is if comments are so prevalent, why doesn’t JSON have them? The answer is that JSON is a data encoding format, meant to encode complex data structures for storage or transfer between programs (javascript <-> web server). It wasn’t designed for hand-edited config files. So let’s please stop using a screwdriver to hammer in a nail. Our future selves will appreciate it.

Might I suggest YAML instead?
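To illustrate what I’m missing, here is a made-up YAML fragment (not a real CloudFormation resource, just the shape of the thing):

# Web tier subnet; the CIDR has to stay in sync with the office VPN routes.
web_subnet:
  cidr_block: 10.0.1.0/24
  availability_zone: us-east-1a

Same data, but three months from now I’ll know why that CIDR is what it is.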

We, being the company I work for, recently set up a mysql galera cluster and haproxy to load balance connections between the nodes. Haproxy has a mysql health check, but it only logs into the server, and we wanted a bit more than that (galera’s rsync option puts the server being synced from into read_only mode). What I didn’t want to do was install apache or something similar, because I wanted to leave as much of the system’s resources as possible available to mysql. I solved the problem with a perl script.

Before I move on, I should mention that I don’t like perl. Other languages, such as Go, provide just as easy a solution, but perl is installed on pretty much all linux distros and, therefore, required less setup. The backbone of the script is the HTTP::Server::Simple::CGI package. My version of the script weighs in at a whole 26 lines of code. Here it is, mostly.

#!/usr/bin/perl
use strict;
use warnings;

use File::Pid;

{
    package MyWebServer;

    use HTTP::Server::Simple::CGI;
    use base qw(HTTP::Server::Simple::CGI);

    sub handle_request {
        my ($self, $cgi) = @_;
        my $isFine = 0;
        #-----
        # do your checking logic here
        #-----
        if ($isFine) {
            print "HTTP/1.0 200 OK\r\n";
            print $cgi->header;
        } else {
            print "HTTP/1.0 503 Service Unavailable\r\n";
            print $cgi->header;
        }
    }
}

# Only start a new server if one isn't already running.
my $pidfile = File::Pid->new();
if (!$pidfile->running) {
    my $server = MyWebServer->new(12345);
    $server->host('YOUR SERVERS IP GOES HERE');
    my $pid = $server->background();
    $pidfile->pid($pid);
    $pidfile->write;
}

The above code checks whether a running PID for the script already exists and does nothing if it does (the if around the block towards the bottom). It then sets the server up to listen on port 12345; use whatever port you want. The next line tells it to listen on a specific ip address. I set that from chef as part of the .erb that builds this script; you could instead pass it as a parameter to the script ($ARGV[0]). It then starts the server in the background and writes the PID file.

Of course, the real action is in the handle_request function in the package. That function gets called every time the script receives an http request. All mine does, and you could do a lot more here, is collect some information about the state of the server (a bit more on that in a second) and return a status of 200 or 503, which is all haproxy cares about. If your load balancer checks for actual content in the response, you would add some prints after the $cgi->header calls.
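On the haproxy side, the health check then looks something like this (a sketch; the backend name, server names, and IPs are examples, and 12345 is whatever port you chose above):

backend galera_cluster
    mode tcp
    option httpchk GET /
    server db1 10.0.0.11:3306 check port 12345 inter 2000
    server db2 10.0.0.12:3306 check port 12345 inter 2000

The check port directive sends the http check to the perl script instead of to mysql itself.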

As I mentioned in the first paragraph, the reason we set this up was to discover whether the server happens to be in read_only mode. Thus, all my check does is shell out to mysql with a -e option to show global variables, and then run a regex over the output looking for read_only being set to off.
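The checking logic itself is roughly the following (a sketch: it assumes the mysql client can log in without prompting, via a ~/.my.cnf or similar, and error handling is left out):

sub mysql_is_writable {
    my $variables = `mysql -e "SHOW GLOBAL VARIABLES"`;
    return ($variables =~ /read_only\s+OFF/) ? 1 : 0;
}

handle_request then sets $isFine from that.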

I’ve also set cron up to run the script every minute, which is why the PID handling is in there. Pretty simple really.
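The crontab entry is just the following (the path is wherever you put the script; mine here is hypothetical):

* * * * * /usr/local/bin/mysql_health_check.pl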

With the reliance most web apps place on databases, making sure the database is always available goes a long way towards improving your reliability. I consider MySQL to be finicky (a manual master-to-slave fail-over at 5am is not my idea of fun), but it is what I’m stuck with. Here is my understanding of the options for keeping your app up when mysql isn’t.

Master – Slave

This is a pretty basic and common pattern. You have one server that gets all of the writes (the master) and another server that replicates those writes (the slave). The replication is asynchronous and, therefore, can fall behind. Generally, if you keep the load on the slave under control, it should keep up. You can send all of your traffic to the master, or you can send reads to the slave, spreading out your read load. Writing to the slave will probably break replication (it will definitely break it if you insert into an auto-incrementing column).

Cleaning up a broken slave can be difficult. You either need to hunt down and undo any changes made to the slave or pull a dump from the master and import it. Care must be taken when importing to make sure that you know what position in the binlog to resume replication from.
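Resuming replication after an import boils down to pointing the slave at the binlog position recorded when the dump was taken, roughly like this (host, credentials, file name, and position are all example values):

CHANGE MASTER TO
    MASTER_HOST='192.0.2.10',
    MASTER_USER='repl',
    MASTER_PASSWORD='secret',
    MASTER_LOG_FILE='mysql-bin.000123',
    MASTER_LOG_POS=4567;
START SLAVE;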

If the master fails you can fail over to the slave, though this is a manual process. You’ll need to stop anything being written to the master (if it isn’t completely dead), stop the slave process on the slave, tell your application to write to the slave, get the master back up and running, get the master up to date (you’re probably not going to know the binlog position, which means you’ll likely need to do a full import), set up what was the master as the new slave, and start it replicating.

The benefit here is that it’s easy to set up, mysql is pretty stable so you’ll rarely have to fix it, and as long as 1 server can give you enough write throughput, you can be reasonably happy. If you ever need more read throughput you can add additional slaves. The replication overhead on the master is low.

Master – Master

Similar to master-slave, but now both servers are configured to be slave and master for each other. This allows you to read from and write to either server, because the writes will be replicated to the other. Of course, all of the same problems regarding repairing broken slaves, knowing binlog positions, and so on still apply. Both servers can’t be allowed to hand out the same auto increment id, so you’ll need to do something along the lines of configuring one server to only use even numbers and the other to only use odd. If the servers come under load they might start to fall behind in replication, and the order in which updates and inserts are applied might differ between the servers, which can leave them with different data.
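The even/odd trick is a pair of settings, something like this (a two-node example; with more nodes you would raise the increment):

# server 1 my.cnf
auto_increment_increment = 2
auto_increment_offset    = 1

# server 2 my.cnf
auto_increment_increment = 2
auto_increment_offset    = 2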

One way to resolve some of these problems is to only send traffic to one server.

With a VIP

If you are running on linux then you can use a virtual IP. This requires a bit of network wizardry. What you end up with is a system where the passive server (the not-in-use server) polls the active server (the in-use server) to make sure it’s alive. If it discovers that the active server is down, it steals the VIP and in doing so promotes itself to active. Your application doesn’t need to know anything about it, as the ip it connects to never changes, only the machine answering it. You’ll still need to figure out what was wrong with the broken server and get it working again, but in theory there can be no downtime for the user.

With a distributed file system

The idea here is that the file system the servers write to is shared between them (a NAS or SAN or the like). This is really more an active-passive solution than master-master, as one of your servers will need to be turned off or you risk corrupting data. You can also combine this with the VIP method, though you’ll need something to mount the drive (depending on the sharing method) and start mysql. What you get is the knowledge that the data on the active and passive nodes will be the same (it’s the same mount), at the expense of a little downtime while the passive mysql starts.

NDB

This is mysql’s cluster offering. It’s its own engine, so you can’t use innodb or myisam, it has many moving parts, and it requires at least 3 servers, but it gives you a system where you can read from and write to any node without any of the data integrity complications inherent in the previous patterns. The system is composed of API nodes (generally mysql), data nodes, and management nodes. These processes can live on distinct machines or all on the same machine. Unlike the previous examples, all of your data does not live on all of your servers but is distributed across the cluster. This has the benefit of increasing your throughput as the number of nodes increases, though individual query performance can be impacted.

Data can be mirrored between the data nodes, meaning the loss of any individual node will not result in the loss of data. Nodes can be added and dropped without fuss or harm. For example, NDB updates itself through a rolling update where each node, one at a time, is dropped out of the pool, updated, and entered back into the pool.

You will probably need to make some application changes in order to use NDB. One set of concerns relates to security, as NDB’s internal communication is not secured, requiring proper use of DMZs; please take a look at mysql’s documentation for more information. Large joins, sorts, and the like can also perform badly, as the rows involved will likely be spread across the data nodes.

Galera

Galera is a solution for MariaDB or Percona, which are forks of mysql. Galera is also a clustered solution, one that replaces mysql’s asynchronous replication with synchronous replication. Galera combines recent research with Percona’s XtraDB fork of innodb (myisam support is in beta, I believe, but isn’t production ready) to provide solid performance for synchronous replication. As with NDB, Galera allows you to read from and write to any node, and to add or remove nodes with ease. When you add a node, the cluster will automatically get it synced with the rest of the cluster.
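Configuration-wise, turning a MariaDB or Percona node into a Galera node mostly comes down to a handful of wsrep settings, roughly like this (the library path varies by distro and the addresses are examples):

# my.cnf fragment
binlog_format            = ROW
default_storage_engine   = InnoDB
innodb_autoinc_lock_mode = 2
wsrep_provider           = /usr/lib64/galera/libgalera_smm.so
wsrep_cluster_name       = my_cluster
wsrep_cluster_address    = gcomm://10.0.0.11,10.0.0.12,10.0.0.13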

Unlike NDB, all data lives on all nodes. This has benefits and drawbacks for performance. Reads are fast, and joins, sorts, and the like are fast, because everything is on the node receiving the request. Insert and update speed, though, will depend on the speed of the slowest node in your cluster. This is important to consider given that you will likely run a Galera cluster on commodity hardware. You can find benchmark data online.

I favor this solution but suspect that it isn’t suitable for solutions that require a lot of mysql servers to meet throughput demands. Using NDB with a caching layer to speed up frequent reads might be a better solution in that scenario.

Tungsten

Like Galera, Tungsten is a cluster solution that replaces mysql’s built-in replication. It allows for complex replication strategies and replication between different versions of mysql, or between different databases altogether. Replication happens through Tungsten Replicator, which is a separate process from mysql, so the solution is not as simple as Galera but probably makes up for it in its flexibility.

RDS

Amazon’s RDS (Relational Database Service) is part of its AWS offerings. You define what type of throughput you need and they handle the rest. The only drawback I know of is that they don’t yet support encryption at rest, so if you have PHI or other data you need to encrypt, you are SOL. If you are in AWS and don’t require encryption at rest, this is probably the right place to start.

I just spent an embarrassing amount of time trying to figure out why some resources in one of my puppet classes were not being applied. All of them were inside an if branch that checked whether another class was defined. It was defined (it wouldn’t be much of a blog post if that weren’t the case), there were no errors, it just wasn’t happening. Turns out that defined does not work the way I thought it did. From Puppet’s documentation (https://docs.puppetlabs.com/references/latest/function.html#defined):

Checking whether a given resource has been declared is, unfortunately, dependent on the parse order of the configuration

Because defined is evaluated during the parse step, the order in which your resources are declared matters. To give you a concrete example, I have two classes, php and imagemagick. The imagemagick class installs the imagick php extension for you if php is defined. How nice of it, right? This has always worked perfectly, until my most recent manifest, where I had something like:

class{'imagemagick': }
class{'php':
  before => Class['imagemagick'],
}
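For context, the check inside the imagemagick class looks roughly like this (simplified, and the package name is illustrative; the real class does a bit more):

if defined(Class['php']) {
  package { 'php-pecl-imagick':
    ensure => installed,
  }
}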

That manifest looks fine at a glance. Php has to happen before imagemagick, so it should be defined. But because imagemagick is parsed before php, php isn’t actually defined yet when imagemagick is evaluated, so nothing inside my if was run during the apply. To make it actually work, it needed to look like:

class{'php':
  before => Class['imagemagick'],
}
class{'imagemagick': }

And now it works. Ridiculous. Hope this saves someone some time.

Recently, I needed to add https support to our dev installs of our web app. The app itself needed to know it was using https, to generate proper urls and the like, so terminating the ssl connection at the proxy was not a viable solution for me. HAProxy added support for SSL in 1.5 but this article isn’t about that because I’m using CentOS and therefore am stuck with HAProxy 1.4.

First up, how not to solve this problem. My first thought was that if I put HAProxy in tcp mode, it shouldn’t need to know anything about whether the connection was SSL or not. This did not work. Unfortunately, my notes don’t say why, but I assume either HAProxy was spitting out BADREQ with PR-- in the logs or the payload was getting mangled and causing errors during negotiation.

Enter stunnel. Stunnel is an SSL tunnel and is what I used to handle the https requests. Stunnel can be configured in either a server mode, which terminates SSL connections, or a client mode, which initiates SSL connections. This solution uses both. The general approach, which I found here, is to have the https connection received by a stunnel server, which forwards the now plain http connection to haproxy, which forwards it to a stunnel client, which changes it back to https and forwards it along to the web server.

--https--> stunnel server --http--> haproxy --http--> stunnel client --https--> web server

Not pretty, but effective, and because all of the traffic between stunnel and haproxy is on localhost, it’s relatively fast.

The first thing needed is to get stunnel installed. It’s in yum.

Now to set up stunnel. I made a folder at /etc/stunnel to hold my configs and the .pem file, which is the .key and .crt files concatenated together. I placed the .pem file in that folder. Next you will need a config file for the stunnel server and one for the client. I named mine server.conf and client.conf. You might be able to do this with one config file; I’m not that familiar with stunnel. In both config files you will need/want 5 global settings defined:

cert=<path to your .pem file>
pid=/var/run/stunnel_(server|client).pid or something similar
output=/var/log/stunnel_(server|client).log or something similar
socket=l:TCP_NODELAY=1
socket=r:TCP_NODELAY=1

Basically: define where the pem lives because we need that for the SSL handshake, define a pid file and a log file because those are handy, and include two socket lines that I mostly copied from other examples (they set TCP_NODELAY on the local and remote sockets, which keeps latency down). I would also suggest adding foreground=yes while you test the config files so that you can easily see what is happening and kill (ctrl-c) the process to make changes. The next bits define what stunnel is actually going to be doing. For the server:

[https]
accept=443
connect=8081

That basically says to listen on port 443 and, if you get something there, forward the connection to port 8081 on localhost. Port 443 is important, but you could change 8081 to something different. The client config looks a lot the same:

[https]
client=yes
accept=8082
connect=<server ip address>:443

There we tell stunnel that it’s to operate in client mode (client=no is default which is why it wasn’t in the server config), to listen on 8082 (which you could change to something else), and to connect to our webserver on 443. If you had multiple web servers you could put multiple connect lines in and it will round robin the connections.

The last thing we need to change is our HAProxy config. This, at its most basic, would look something like:

frontend my_ssl_webpage
    bind :8081
    default_backend my_ssl_webpage_backend

backend my_ssl_webpage_backend
    reqadd X-Forwarded-Proto:\ https
    server stunnel1 127.0.0.1:8082

Now, if you restart HAProxy and start stunnel with both config files (sudo stunnel <path to config file>), you should have https requests arriving at your webserver.
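On my CentOS box that boils down to something like the following (assuming the /etc/stunnel paths from earlier):

sudo stunnel /etc/stunnel/server.conf
sudo stunnel /etc/stunnel/client.conf
sudo service haproxy restart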

 

Bonus:
Found a great manual for HAProxy 1.4. Here is the link:
http://cbonte.github.io/haproxy-dconv/configuration-1.4.html