Development

Webhook Support for Product Availability

The Requirement

Currently, the Make it Social Product API (PAPI) provides a production-ready booking service for experience providers, with the following features.

  • It is a product repository accessible through a RESTful API;
  • It has a built-in booking engine, also accessible via the RESTful API;
  • It uses RRule to specify product availability by date and time;
  • It supports group-size and vacancy control;
  • It notifies the product owner of booking status via a webhook.

While this is enough for the Make it Social hosted products, it does not support those hosted on third-party booking engines. In our previous monolithic system, we built plugins to support as many third-party booking engines as necessary. But in the new PAPI system, we use a different approach: webhooks. Instead of calling various APIs with their specified data formats, we let the third-party engines, or a middleman service, handle our requests.

The Webhooks

The client or a middleman service must listen on an endpoint for availability and booking requests made by PAPI on behalf of the users. The requests will be HTTP POST calls to the endpoint, for example:

curl -X POST -d '{"start_date":"2016-01-01T09:30:00Z",..}' https://example.com/products/availability -H "Content-Type:application/json"

It’s the product provider’s responsibility to configure these webhooks and set up a web service listening on these endpoints. This can be done by adding the following to either the provider or the product dataset.

{
	"webhooks": {
		"availability": "https://..",
		"prepayment": "https://..",
		"postpayment": "https://..",
		"shared_secret": ".."
	}
}

The shared_secret is a string used by PAPI to hash the POST request into a signature.

Availability endpoint

This endpoint will be called when a user joins the activity. The POST body will be a JSON string like the following example, which shows the minimal request data.

{
	"provider_id": "..",
	"product_id": "..",
	"group_size": 2,
	"start_date": "yyyy-mm-ddTHH:MM:SSZ",
	"end_date": "yyyy-mm-ddTHH:MM:SSZ",
	"time_code": "",
	"rate_code": "",
	"hash": "hash(..)"
}

The product_id is the unique product ID on PAPI.

Expected response

{
	"available": true,
	"hash": "hash(..)"
}

or

{
	"available": false,
	"reason": "Error calling the booking engine",
	"hash": "hash(..)"
}
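
For illustration, a minimal availability handler could look like the following Node.js/Express sketch. The endpoint path matches the curl example above; checkAvailability() is a hypothetical call into the provider's booking engine, and hash() is the signature function defined in The Signature section below.

var express = require('express');
var bodyParser = require('body-parser');
var app = express();
app.use(bodyParser.json());

var SECRET = process.env.SHARED_SECRET; // the shared_secret configured on PAPI

// Called by PAPI when a user joins the activity.
app.post('/products/availability', function (req, res) {
	// checkAvailability() is a hypothetical lookup against the booking engine
	checkAvailability(req.body, function (err, ok) {
		var resp = err
			? { available: false, reason: 'Error calling the booking engine' }
			: { available: ok };
		resp.hash = hash(resp, SECRET); // sign the response
		res.json(resp);
	});
});

app.listen(8080);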

Pre-payment endpoint

This will be called immediately before the user starts the payment.

The Request

{
	"id": "booking_id",
	"provider_id": "..",
	"product_id": "..",
	"currency": "GBP",
	"amount": 199.99,
	"user": {
		"uid": "user_id",
		"first_name": "<first name>",
		"last_name": "<surname>",
		"email": "email@example.com",
		"address": [
			"s1": "street",
			"s2": "",
			"city": "City",
			"postcode": "postcode",
			"country": "GB"
		],
		"data_capture": {
			"name": "value"
		}
	}
	"group_id": "group_id",
	"start_date": "yyyy-mm-ddTHH:MM:SSZ",
	"end_date": "..",
	"group_size": 2,
	"rate_code": "",
	"time_code": "",
	"voucher": "",
	"extras": [
		{
			"xid": "product_id",
			"name": "product name",
			"price": 10.00
		}
	],
	"timestamp": 1445526011477,
	"hash": "hash(..)"
}

Expected response

The webhook handler should return an availability status, and a booking_id if a reservation was made. The booking_id will be used to confirm or cancel the booking after payment.

{
	"available": true,
	"reserved": 1,
	"booking_id": "",
	"expire_at": "ISO time",
	"hash": "hash(..)"
}

The reserved and expire_at properties are optional. The response for a failed reservation:

{
	"available": false, 
	"reason": "No longer available",
	"hash": "hash(..)"
}
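
Continuing the sketch from the availability endpoint, a pre-payment handler would place the reservation and return the engine's booking ID; reserveBooking() is again a hypothetical call into the booking engine.

// Called by PAPI immediately before the user starts the payment.
app.post('/products/prepayment', function (req, res) {
	// reserveBooking() is a hypothetical reservation call to the booking engine
	reserveBooking(req.body, function (err, booking) {
		var resp = err
			? { available: false, reason: 'No longer available' }
			: { available: true, booking_id: booking.id, expire_at: booking.expireAt };
		resp.hash = hash(resp, SECRET);
		res.json(resp);
	});
});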

Post-payment endpoint

This will be called after a successful or failed payment. To confirm a successful payment, the request body will contain the property "action": "CONFIRM"; otherwise it will contain "action": "CANCEL".

Request data

The request body will be the same as that of the pre-payment call, plus a "prepayment" object holding the response of the pre-payment call, if one was made.

{
	"action": "CONFIRM",
	"prepayment": {
		"booking_id": "1234"
	},
	...
}

Expected response

{
	"status": "OK",
	"ref_no": "the reference number",
	"link": "URL for user to collect the ticket or to read the guide",
	"hash": "hash(..)
}

or for failure:

{
	"status": "FAILED",
	"reason": "..",
	"hash": "hash(..)
}

Note on pre- and post-payment endpoints

Depending on the booking engine, it is not necessary to implement both endpoints, but at least one must be available for booking. If pre-payment booking (reservation) is available, a booking_id in some format must be returned so that later confirmation or cancellation can use it as the ID, unless confirmation and cancellation are not necessary.

The Signature

Both the POST data sent to the endpoint and the response data received from the webhook must carry a signature: a hash of the data content and a shared secret.
The hashed content is a string concatenation of all the properties flattened into a key=value array. For example:

{
	"action": "CONFIRM",
	"prepayment": {
		"booking_id": "a1223"
	}
}

will be converted into the following array of strings.

 ["action=CONFIRM","booking_id=a1223"]

The strings are then sorted and concatenated, the shared secret is appended, and the result is hashed with SHA-256 and encoded in Base64. The signature generation function can be something like this:

var crypto = require('crypto');

function hash(obj, secret) {
	var buf = [];
	// recursively flatten nested objects into "key=value" strings
	function flatten(obj) {
		for (var k in obj) {
			var v = obj[k];
			if (typeof v === 'object') {
				flatten(v);
			} else {
				buf.push(k + '=' + v);
			}
		}
	}
	flatten(obj);
	buf.sort();        // deterministic order regardless of property order
	buf.push(secret);  // append the shared secret
	return crypto.createHash('sha256')
		.update(buf.join(''))
		.digest('base64');
}
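
On the receiving side, the same function can be reused to verify an incoming payload: strip the hash property, recompute the signature over the rest, and compare. A minimal sketch:

// verify the signature of an incoming payload
function verify(payload, secret) {
	var data = {};
	for (var k in payload) {
		if (k !== 'hash') data[k] = payload[k];
	}
	return hash(data, secret) === payload.hash;
}
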
Development

Database Sharding for GAPI2

It’s a misconception that a cloud platform, or PaaS, provides out-of-the-box scalability. A data-centric application must be well planned in order to support massive growth of its userbase. This is why we are experimenting with sharding from the beginning.

What is sharding

Sharding is a way to break a huge database into smaller ones by storing a chunk of the data rows on each database server. Each such server is called a shard.
See [http://dbshards.com/database-sharding/] for a complete description.

What’s difficult with GAPI

The primary database table that requires sharding is the user table. This table has two indices, an integer ID as the primary key, and an email address as the user login name.

When a user tries to log in, the system queries the database with the user’s email address. Other related databases, including external ones, use the integer ID as a reference, like a foreign key.

The difficulty is how to efficiently find out which shard a user is stored on, given either the ID or the email address. While we could use a special table to store the index-to-shard map, we want to find a simpler solution.

Sharding by Email address

Email addresses are mostly alphanumeric strings, and we can use their prefixes as the shard keys. If we store all email addresses in lower-case letters, the first two characters are combinations of “0-9a-z”, so using one shard per pair would require about 1,300 shards (36 × 36 = 1296 combinations).
Because these prefixes are not evenly distributed in the real world, it may be necessary to split a heavily used prefix in two by adding a third letter, or to merge two rarely used prefixes into one.

Sharding by ID

The integer IDs as primary keys are normally auto-incremented values, thus data rows on different database servers may share the same sequence. In order to let users have unique IDs across shards, we can assign a range of different IDs to each shard. For example, ID numbers from 1 to 10000 fall into shard A, and 10001 to 20000 in shard B, etc.

How can we relate the IDs to the email address prefixes?

One map for both

With the Facebook userbase as a reference, we can use a signed 32-bit integer to represent our whole user ID space: 2,147,483,647 IDs.
If we pre-allocate 2 million IDs per shard, we can have 1073 shards, each with an email prefix and a range of IDs.
Our map looks like this:

{
	"field": "email",
	"pfx1": "aa",
	"pfx2": "ab",
	"id1": 0x10000,
	"id2": 0x20000,
	"shard": "AA"
}

On a query by email, if the prefix falls between pfx1 and pfx2, shard AA is selected. Likewise,
on a query by ID, if the ID falls between 0x10000 and 0x20000, shard AA is also selected.
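
As an illustration, resolving a shard from an array of such map entries can be a simple scan. A sketch: the field names follow the example above, and the maps array is assumed to be loaded from configuration.

// maps is an array of entries like the one shown above
function shardByEmail(maps, email) {
	var pfx = email.toLowerCase().slice(0, 2);
	for (var i = 0; i < maps.length; i++) {
		if (pfx >= maps[i].pfx1 && pfx <= maps[i].pfx2) return maps[i].shard;
	}
	return null; // no shard configured for this prefix
}

function shardById(maps, id) {
	for (var i = 0; i < maps.length; i++) {
		if (id >= maps[i].id1 && id < maps[i].id2) return maps[i].shard;
	}
	return null; // no shard configured for this ID range
}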

Virtual shards

We do not want to allocate so many shards from the start when it’s not necessary. Borrowing from Instagram’s “logical shards” design, we also use a virtual sharding solution. In other words, we let multiple virtual shards share one physical shard to save cost.

Because different shards use different ID ranges, it is easy to save multiple virtual shards in the same database. And it is also easy to migrate a whole virtual shard to a new physical server.

ID generator

With multiple ID ranges stored in the same database, we can no longer use MySQL’s auto-increment feature. We have to use a global ID generator for all shards, like the Flickr ticket server, and the generator must also guarantee that the generated IDs fall inside the specified range.
The generator relies on a small database table residing on a primary server.

CREATE TABLE IF NOT EXISTS shardids (
    shard VARCHAR(2) NOT NULL PRIMARY KEY,
    shid INT NOT NULL,
    cap INT NOT NULL
);

When getting a new ID for a shard, we use a MySQL INSERT ... ON DUPLICATE KEY UPDATE statement like this:

INSERT INTO shardids VALUES('AA', 10000, 20000) ON DUPLICATE KEY UPDATE shid = shid + 1;
SELECT shid FROM shardids WHERE shard = 'AA';

The first time, shid will be 10000 for shard AA and cap 20000; the next time, shid will be 10001.
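
One caveat: the UPDATE and the following SELECT are two separate statements, so under concurrent inserts another connection could bump shid in between. A sketch of the LAST_INSERT_ID(expr) trick used by the Flickr ticket server, which makes the read-back connection-local (assuming each shard row is seeded once up front):

-- seed the counter row once per shard
INSERT INTO shardids VALUES ('AA', 10000, 20000);

-- per new ID: LAST_INSERT_ID(expr) stores the value for this connection only
UPDATE shardids SET shid = LAST_INSERT_ID(shid + 1) WHERE shard = 'AA';
SELECT LAST_INSERT_ID();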

Range checker

To prevent shid from reaching the cap (20000 in the example above), which may be the start of another shard’s range, we must make the generator fail when shid reaches the cap. This can be done with a simple database trigger like this.

DELIMITER //
CREATE TRIGGER check_cap BEFORE UPDATE ON shardids
FOR EACH ROW
BEGIN
	DECLARE msg VARCHAR(100);
	IF NEW.shid >= NEW.cap THEN
		SET msg = CONCAT('IDs used up for shard ', NEW.shard);
		SIGNAL SQLSTATE VALUE '45000'
			SET MESSAGE_TEXT = msg;
	END IF;
END //
DELIMITER ;

MySQL triggers are not allowed to commit or roll back the update, so we use the SIGNAL command to raise an error and fail the update.
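
For example, once shid has climbed to 19999 for shard AA, the next increment trips the trigger and the statement fails:

-- the increment would set shid to the cap (20000), so the trigger aborts:
INSERT INTO shardids VALUES('AA', 10000, 20000) ON DUPLICATE KEY UPDATE shid = shid + 1;
-- ERROR 1644 (45000): IDs used up for shard AA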

The Bottleneck

The trigger and the ID table stay on the primary database, which is obviously a bottleneck because all shards rely on the ID generator. However, this generator is only used when inserting new user data into a shard, which is not a frequent operation.

Over-engineering or Premature optimisation?

At the end of the planning meeting, our CEO, Eddie Robb, asked whether sharding is necessary at this stage, intentionally pronouncing the “d” as a “t” to suggest over-engineering.
Even so, we still think it is worthwhile to plan ahead, because a system designed without scalability in mind would require big changes later on, when it could be too late for re-engineering.

Development

Docker-based Development and Unit-testing

Because our GAPI2 will be deployed on Amazon ECS, it is helpful to use a Docker-based development environment locally so that the team can share the same environment easily.
Firstly, GAPI2 is a web service that runs inside a Docker container, so we need to create a Docker image for Go, the language GAPI2 is written in.
A Dockerfile like the following is enough for creating our image.

FROM golang:latest

RUN go get github.com/gin-gonic/gin && \
	go get gopkg.in/redis.v3 && \
	go get github.com/go-sql-driver/mysql && \
	go get github.com/guregu/null

ADD . /go/src/bitbucket.org/makeitsocial/misauth
ADD ./public /go/public

RUN go install bitbucket.org/makeitsocial/misauth

EXPOSE 8080

# run the misauth binary installed above ("go install" puts it on PATH at /go/bin)
CMD ["misauth"]

Secondly, we need to set up a database for the web service. On the AWS platform we will use RDS to host the database, but for development it is better to have a local database server for easy testing.
We are going to use a ready-made Docker image for the MySQL server.

Another Docker image we need is a Redis server, which we use for caching. When deployed on AWS, the caching service will be ElastiCache; locally we can use a Docker image for it as well.

To link them together easily, we use Docker Compose. Here is the docker-compose.yml file.

web:
  build: .
  volumes:
    - .:/go/src/bitbucket.org/makeitsocial/misauth
    - ./public:/go/public
  ports:
    - "3000:8080"
  links:
    - db
    - cache
db:
  image: mysql/mysql-server
  ports:
    - "3306:3306"
  volumes:
    - ./rds:/data
cache:
  image: redis
  ports:
    - "6379:6379"

Once the environment is set up, we can start our web service using docker-compose, and the other two containers will also be started automatically.

docker-compose up

Before we can run the unit tests, we need to prepare the databases. This can be done with a simple SQL script executed via a docker exec command.

docker exec -it misauth_db_1 bash -c 'mysql --password=my-secret-pw <setup.sql'
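
The contents of setup.sql depend on the service’s schema and are not shown here; a hypothetical minimal version might just create the test database and a table, for example:

-- hypothetical setup.sql: database and table names are placeholders
CREATE DATABASE IF NOT EXISTS misauth;
USE misauth;
CREATE TABLE IF NOT EXISTS users (
    id INT NOT NULL PRIMARY KEY,
    email VARCHAR(255) NOT NULL UNIQUE
);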

Now we can run the unit tests, also with a docker exec command.

docker exec -it misauth_web_1 bash -c "go test bitbucket.org/makeitsocial/misauth/tests -v"