Slicer: Auto-sharding for datacenter applications, Adya et al. Slicer is a general-purpose sharding service. I normally think of sharding as something that happens within a (typically data) service, not as a general-purpose infrastructure service.

What exactly is Slicer then? It has two key components: In this way, the decisions regarding how to balance keys across application instances can be outsourced to the Slicer service rather than building this logic over and over again for each individual back-end service.

Experience taught us that sharding is hard to get right: rebuilding a sharder for every application wastes engineering effort and often produces brittle results. Slicer is used by over 20 different services at Google, where it balances M requests per second from overconnected application client processes.

Slicer has two modes of key assignment, offering both eventually consistent and strongly consistent models. In eventually consistent mode, Slicer may allow overlapping key assignments while it adapts to load shifts.

In strongly consistent mode, no task can ever believe a key is assigned to it if Slicer does not agree. All production use to date relies on the eventually consistent model. The unit of sharding in Slicer is a key, chosen by the application.

Slicelet enables a task to learn when a slice is assigned to it, or when a slice is removed from it. The Slicer Service itself monitors load and task availability to generate new key-task assignments and thus manage availability of all keys.

Slicer hashes each application key into a bit slice key; each slice in an assignment is a range in this hashed keyspace. As a result, there is no limit on the number of keys nor must they be enumerated.
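To make the range idea concrete, here is a minimal sketch of hashing an application key into a fixed-width keyspace and finding the owning task by range lookup. Everything named here is an assumption for illustration: the 63-bit width, SHA-256 as the hash function, and the `slice_key` and `Assignment` names are mine, not Slicer's actual API or implementation.

```python
import bisect
import hashlib

KEY_BITS = 63  # assumed width of the hashed keyspace, chosen for illustration


def slice_key(app_key: str) -> int:
    """Hash an application key to an integer in [0, 2**KEY_BITS)."""
    digest = hashlib.sha256(app_key.encode("utf-8")).digest()
    # Take the first 8 bytes (64 bits) and drop the surplus low bits.
    return int.from_bytes(digest[:8], "big") >> (64 - KEY_BITS)


class Assignment:
    """Maps contiguous ranges of the hashed keyspace (slices) to tasks."""

    def __init__(self, starts, tasks):
        # starts[i] is the inclusive lower bound of slice i. Slice i ends
        # where slice i + 1 begins; the last slice runs to 2**KEY_BITS.
        assert starts and starts[0] == 0 and list(starts) == sorted(starts)
        assert len(starts) == len(tasks)
        self.starts = list(starts)
        self.tasks = list(tasks)

    def task_for(self, app_key: str) -> str:
        # Find the last slice whose lower bound is <= the hashed key.
        index = bisect.bisect_right(self.starts, slice_key(app_key)) - 1
        return self.tasks[index]


# Two slices, each covering half of the hashed keyspace.
half = 1 << (KEY_BITS - 1)
assignment = Assignment([0, half], ["task-a", "task-b"])
owner = assignment.task_for("user:42")
```

Because only the range boundaries are stored, the table size depends on the number of slices rather than the number of keys, which is why keys never need to be enumerated.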

Slicer will honour a minimum level of redundancy per-key to protect availability, and automatically increases replication for hot slices.

The weighted-move sharding algorithm

We balance load because we do not know the future: maintaining the system in a balanced state maximizes the buffer between current load and capacity for each task, buying the system time to observe and react.

The overall objective is to minimize load imbalance, the ratio of the maximum task load to the mean task load. When making key assignments Slicer must also consider the minimum and maximum number of tasks per key specified in configuration options, and should attempt to limit key churn — the fraction of keys impacted by reassignment.
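The two quantities in play can be written down directly. This is a small sketch rather than anything from the paper: the function names are mine, and I measure churn as the fraction of the hashed keyspace covered by the moved ranges, which is an assumption about how "fraction of keys" would be computed.

```python
def load_imbalance(task_loads):
    """Ratio of the maximum task load to the mean task load.

    A value of 1.0 means the tasks are perfectly balanced.
    """
    loads = list(task_loads.values())
    mean = sum(loads) / len(loads)
    if mean == 0:
        return 1.0  # no load anywhere: treat as balanced
    return max(loads) / mean


def key_churn(moved_ranges, keyspace_size):
    """Fraction of the keyspace touched by a reassignment.

    moved_ranges holds half-open (start, end) ranges, assumed not to overlap.
    """
    return sum(end - start for start, end in moved_ranges) / keyspace_size


imbalance = load_imbalance({"a": 30.0, "b": 10.0, "c": 20.0})  # max 30, mean 20
churn = key_churn([(0, 10), (50, 65)], keyspace_size=100)      # 25 of 100 keys
```

With these numbers the imbalance is 1.5 and the churn is 0.25. The algorithm is trying to push the first number toward 1.0 while spending as little of the second as it can.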

Key churn itself is a source of additional load and overhead. In order to scale to billions of keys, Slicer represents assignments using key ranges, aka slices.

Thus sometimes it is necessary to split a key range to cope with a hot slice, and sometimes existing slices are merged. The sharding algorithm proceeds in five phases: Reassign keys away from tasks that are no longer part of the job. Repeat this step so long as: Pick a sequence of moves with the highest weight (described below) and apply them.

Split hot slices without changing their task assignments. This will open new move options in the next round. Repeat splitting so long as (i) the split slice is at least twice as hot as the mean slice, and (ii) there are fewer than slices per task in aggregate.
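The splitting phase can be sketched as a simple loop. This is my own illustration under stated assumptions: the `Slice` type and `split_hot_slices` function are hypothetical, the per-task slice limit is passed in as `max_slices_per_task` because no value is given above, and I assume a slice's load divides evenly when it is cut in half, which a real system would have to measure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slice:
    start: int   # inclusive lower bound in the hashed keyspace
    end: int     # exclusive upper bound
    load: float  # observed load on this range
    task: str    # task currently serving the range


def split_hot_slices(slices, num_tasks, max_slices_per_task):
    """Halve the hottest slice repeatedly, keeping task assignments unchanged."""
    slices = list(slices)
    # Condition (ii): stay under the aggregate slices-per-task limit.
    while len(slices) < max_slices_per_task * num_tasks:
        mean = sum(s.load for s in slices) / len(slices)
        hottest = max(slices, key=lambda s: s.load)
        # Condition (i): only split a slice at least twice as hot as the mean.
        if hottest.load <= 0 or hottest.load < 2 * mean:
            break
        if hottest.end - hottest.start < 2:
            break  # a single-key range cannot be split further
        mid = (hottest.start + hottest.end) // 2
        i = slices.index(hottest)
        # Assumption: load splits evenly between the two halves.
        slices[i:i + 1] = [
            Slice(hottest.start, mid, hottest.load / 2, hottest.task),
            Slice(mid, hottest.end, hottest.load / 2, hottest.task),
        ]
    return slices


before = [Slice(0, 100, 80.0, "a"), Slice(100, 200, 10.0, "b"), Slice(200, 300, 10.0, "b")]
after = split_hot_slices(before, num_tasks=2, max_slices_per_task=3)
```

In this example the hot slice on task "a" is cut once into two halves of load 40 each. Both halves stay on "a", so the split moves no keys, but either half can now be moved on its own in the next round.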

During step 4, the weight of a move under consideration is defined as the reduction in load imbalance for the tasks affected by the move, divided by the key churn cost. Experience suggests the system is not very sensitive to these values, but we have not measured sensitivity rigorously.
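Here is one way the move weight could be computed. It is a sketch under my own assumptions: I take "load imbalance for the tasks affected" to mean the larger of the two affected task loads divided by the overall mean load, and I use the fraction of the keyspace moved as the churn cost. The `Move` type and function names are hypothetical, and this picks a single best move rather than a sequence of moves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Move:
    slice_load: float  # load carried by the slice being moved
    src_load: float    # current load of the task giving up the slice
    dst_load: float    # current load of the task receiving the slice
    churn_cost: float  # assumed: fraction of the keyspace moved, must be > 0


def move_weight(move, mean_load):
    """Reduction in imbalance for the two affected tasks, per unit of churn."""
    before = max(move.src_load, move.dst_load) / mean_load
    after = max(move.src_load - move.slice_load,
                move.dst_load + move.slice_load) / mean_load
    return (before - after) / move.churn_cost


def best_move(moves, mean_load):
    """Return the candidate move with the highest weight."""
    return max(moves, key=lambda m: move_weight(m, mean_load))


# Two candidate moves between the same pair of tasks (mean task load is 20).
big = Move(slice_load=10.0, src_load=30.0, dst_load=10.0, churn_cost=0.05)
small = Move(slice_load=5.0, src_load=30.0, dst_load=10.0, churn_cost=0.01)
chosen = best_move([big, small], mean_load=20.0)
```

The larger move fully balances the pair but touches five times as much of the keyspace, so under this definition the smaller move wins: it buys more imbalance reduction per unit of churn.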

A variant of consistent hashing with load balancing support yielded both unsatisfactory load balancing and large, fragmented assignments.

Slicer is conceptually a centralized service, but its implementation is highly distributed. Assigner components run in several Google datacenters around the world and generate assignments using the weighted-move sharding algorithm we just looked at.

Slicer: Auto-sharding for datacenter applications, Adya et al. (Google), OSDI. Another piece of Google’s back-end infrastructure is revealed in this paper, ready to spawn some new open source implementations of the same ideas, no doubt.

