Service Fabric Handbook (UPD): Stateful Service Life Cycle

Obsolete
This post no longer contains the latest information. Please refer to the Service Fabric Handbook.

Hello,

Today I am happy to say – here is the first update to Service Fabric Handbook!

This update is about the Stateful Service life-cycle.

Stateful Service Life Cycle


A Stateful Service implementation isn’t just an instance of a class derived from StatefulServiceBase. When a replica is built, its implementation object is created and then has to be initialized and correctly registered in the SF runtime.

There are three major life-cycle routines: Startup, Promotion and Demotion, and Shutdown.

Note that the life-cycles described here slightly contradict the documentation. Please see the issue on GitHub for details (why) and the gist for a code sample.

Author’s note

Startup


This routine is invoked when a new replica is built.

Here is the sequence of events when the implementation object is initialized as a primary replica:

  1. Service’s implementation object is created.
  2. Service’s OpenAsync method is called and awaited.
  3. Service’s CreateServiceReplicaListeners method is called.
  4. All ServiceReplicaListeners returned from CreateServiceReplicaListeners are used to create instances of ICommunicationListener, and their OpenAsync methods are called and awaited in sequence.
  5. Service’s ChangeRoleAsync method is called (with newRole = Primary) and awaited.
  6. Service’s RunAsync method is called.

When the implementation object is initialized as a secondary replica, the sequence of events is slightly different:

  1. Service’s implementation object is created.
  2. Service’s OpenAsync method is called and awaited.
  3. Service’s ChangeRoleAsync method is called (with newRole = IdleSecondary) and awaited.
  4. Service’s CreateServiceReplicaListeners method is called.
  5. All ServiceReplicaListeners returned from CreateServiceReplicaListeners where ServiceReplicaListener.ListenOnSecondary == true are used to create instances of ICommunicationListener, and their OpenAsync methods are called and awaited in sequence.
  6. Service’s ChangeRoleAsync method is called (with newRole = ActiveSecondary) and awaited.
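The two startup sequences above can be replayed with a small toy model. This is only a sketch in Python, not Service Fabric code: the `ToyReplica` and `ToyListener` classes, the listener names, and the `events` log are hypothetical stand-ins that mirror the order of calls described in the lists and nothing more.

```python
from dataclasses import dataclass, field


@dataclass
class ToyListener:
    """Hypothetical stand-in for a ServiceReplicaListener descriptor."""
    name: str
    listen_on_secondary: bool = False


@dataclass
class ToyReplica:
    """Toy model that records life-cycle calls in the order described above."""
    events: list = field(default_factory=list)
    listeners: list = field(default_factory=list)

    def create_listeners(self):
        # Mirrors CreateServiceReplicaListeners: returns listener descriptors.
        self.events.append("CreateServiceReplicaListeners")
        self.listeners = [
            ToyListener("public"),
            ToyListener("read-only", listen_on_secondary=True),
        ]

    def open_listeners(self, secondary_only):
        # Listeners are opened one after another, in sequence.
        for listener in self.listeners:
            if secondary_only and not listener.listen_on_secondary:
                continue
            self.events.append(f"Listener.OpenAsync({listener.name})")

    def start_as_primary(self):
        self.events.append("OpenAsync")
        self.create_listeners()
        self.open_listeners(secondary_only=False)
        self.events.append("ChangeRoleAsync(Primary)")
        self.events.append("RunAsync")
        return self.events

    def start_as_secondary(self):
        self.events.append("OpenAsync")
        self.events.append("ChangeRoleAsync(IdleSecondary)")
        self.create_listeners()
        self.open_listeners(secondary_only=True)
        self.events.append("ChangeRoleAsync(ActiveSecondary)")
        return self.events


if __name__ == "__main__":
    for line in ToyReplica().start_as_primary():
        print("primary:  ", line)
    for line in ToyReplica().start_as_secondary():
        print("secondary:", line)
```

Comparing the two logs side by side makes the differences easy to spot: the secondary replica changes role twice, opens only the listeners marked for secondaries, and never reaches RunAsync.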

The differences between the startup sequences can be explained by the replica roles and the purpose of each method.

RunAsync isn’t invoked on a secondary replica
The RunAsync method is designed to allow a replica to perform some sort of background job. In a Stateful Service only the primary replica has write access to the reliable state. Primary and secondary replicas share the same implementation, so RunAsync code running on a secondary replica would expect write access that it doesn’t have; therefore it doesn’t make sense to invoke this method on secondary replicas.
ChangeRoleAsync is invoked twice on a secondary replica
A new secondary replica is always created in the IdleSecondary role because it has to receive a copy of the reliable state from the primary replica before it can be of any use. Once the copy of the reliable state is received, the replica continues with the same startup sequence as a primary replica, except that the final replica role is set to ActiveSecondary and the RunAsync method isn’t invoked.

It is important to understand when the startup routines are executed: when a primary or secondary replica is initialized from scratch. A secondary replica can be initialized from scratch at any time CM decides to. In contrast, a primary replica is initialized from scratch only when the partition is being initialized. In all other cases, when CM needs to move the primary replica, either an existing secondary replica is promoted to primary or a new secondary replica is initialized from scratch and then promoted to primary.

The partition initialization sequence is illustrated in the picture below:

Illustration of Stateful Service partition initialization sequence.

It is important to note that the primary replica doesn’t call the RunAsync method until the secondary replicas are built (i.e. have a copy of the state). The number of secondary replicas to wait for before RunAsync is called on the primary replica is determined by MinReplicaSetSize.

Promotion and Demotion


Promotion and demotion are a natural part of the Stateful Service partition life-cycle. They can happen for various reasons: primary replica movement (manual or automatic during resource balancing), primary replica failure, etc.

When a secondary replica is promoted, the following sequence of events is performed:

  1. All previously created ICommunicationListener instances have their CloseAsync methods called and awaited in sequence.
  2. All ServiceReplicaListeners returned from CreateServiceReplicaListeners (during secondary replica initialization) are used to create instances of ICommunicationListener, and their OpenAsync methods are called and awaited in sequence.
  3. Service’s ChangeRoleAsync method is called (with newRole = Primary) and awaited.
  4. Service’s RunAsync method is called.

When a primary replica is demoted, the following sequence of events is performed:

  1. All previously created ICommunicationListener instances have their CloseAsync methods called and awaited in sequence.
  2. The CancellationToken passed to the RunAsync method is canceled.
  3. The RunAsync method is awaited.
  4. All ServiceReplicaListeners returned from CreateServiceReplicaListeners (during primary replica initialization) where ServiceReplicaListener.ListenOnSecondary == true are used to create instances of ICommunicationListener, and their OpenAsync methods are called and awaited in sequence.
  5. Service’s ChangeRoleAsync method is called (with newRole = ActiveSecondary) and awaited.

Note that the CreateServiceReplicaListeners method isn’t called during the promotion and demotion routines. The code reuses the same ServiceReplicaListeners that were returned during the first initialization.

Author’s note
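To make the listener reuse concrete, here is a minimal sketch of the promotion and demotion sequences. It is a toy model in Python rather than Service Fabric code, and everything in it (the `ToyReplica` class, the `(name, listen_on_secondary)` pairs, the log strings) is a hypothetical illustration. The listener descriptors are handed over once, in the constructor, and are only filtered and reopened afterwards.

```python
class ToyReplica:
    """Toy model of a replica that starts life as an active secondary."""

    def __init__(self, listener_specs):
        # (name, listen_on_secondary) pairs, produced once at startup; this
        # mirrors CreateServiceReplicaListeners not being called again later.
        self.listener_specs = list(listener_specs)
        self.open_listeners = []
        self.log = []
        self._open(secondary_only=True)
        self.log.clear()  # keep only promotion/demotion events in the log
        self.role = "ActiveSecondary"

    def _close_all(self):
        # Close every currently open listener, one after another.
        for name in self.open_listeners:
            self.log.append(f"CloseAsync({name})")
        self.open_listeners = []

    def _open(self, secondary_only):
        # Re-create listeners from the stored descriptors, in sequence.
        for name, on_secondary in self.listener_specs:
            if on_secondary or not secondary_only:
                self.log.append(f"OpenAsync({name})")
                self.open_listeners.append(name)

    def promote(self):
        self._close_all()
        self._open(secondary_only=False)
        self.role = "Primary"
        self.log.append("ChangeRoleAsync(Primary)")
        self.log.append("RunAsync called")

    def demote(self):
        self._close_all()
        self.log.append("RunAsync token canceled")
        self.log.append("RunAsync awaited")
        self._open(secondary_only=True)
        self.role = "ActiveSecondary"
        self.log.append("ChangeRoleAsync(ActiveSecondary)")


if __name__ == "__main__":
    replica = ToyReplica([("public", False), ("read-only", True)])
    replica.promote()
    replica.demote()
    print("\n".join(replica.log))
```

One practical consequence, under the assumption that real listener factories behave like these descriptors: any state captured when the descriptors are created lives across role changes, so per-role setup belongs in the listener's own open logic.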

If the promotion-demotion cycle is graceful (i.e. it was triggered by CM), the primary replica is demoted first and then a secondary replica is promoted. In case of a primary replica failure, SF tries (if possible) to perform a graceful shutdown of the primary replica and in parallel executes the promotion sequence on a secondary replica.

Shutdown


The shutdown routine can be triggered for various reasons, including replica restart or replica failure. It describes the so-called graceful shutdown (i.e. the process hasn’t crashed and SF controls the process).

When a primary replica is shut down, the following sequence of events is performed:

  1. All previously created ICommunicationListener instances have their CloseAsync methods called and awaited in sequence.
  2. The CancellationToken passed to Service’s RunAsync method is canceled.
  3. Service’s RunAsync method is awaited.
  4. Service’s OnCloseAsync method is called and awaited.
  5. Service’s implementation object is destroyed.

When a secondary replica is shut down, the following sequence of events is performed:

  1. All previously created ICommunicationListener instances have their CloseAsync methods called and awaited in sequence.
  2. Service’s OnCloseAsync method is called and awaited.
  3. Service’s implementation object is destroyed.
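Because RunAsync is awaited after its token is canceled, the background job has to notice the cancellation and return, otherwise the primary shutdown sequence cannot move on to OnCloseAsync. The sketch below illustrates that ordering with Python's asyncio. It is an analogy, not Service Fabric code: an `asyncio.Event` stands in for the CancellationToken, and `run_async`, `shutdown_primary`, and the listener names are hypothetical.

```python
import asyncio


async def run_async(stop, log):
    """Toy background job: works in small steps and checks the stop signal."""
    while not stop.is_set():
        await asyncio.sleep(0.01)  # one unit of background work
    log.append("RunAsync observed cancellation and returned")


async def shutdown_primary(listeners, stop, run_task, log):
    """Replays the primary shutdown order from the list above."""
    for name in listeners:
        log.append(f"CloseAsync({name})")  # listeners are closed in sequence
    stop.set()  # stands in for canceling the CancellationToken
    log.append("cancellation requested")
    await run_task  # shutdown waits here until the job returns
    log.append("OnCloseAsync")


async def main():
    log = []
    stop = asyncio.Event()
    run_task = asyncio.create_task(run_async(stop, log))
    await asyncio.sleep(0.03)  # let the background job run for a moment
    await shutdown_primary(["public", "read-only"], stop, run_task, log)
    return log


if __name__ == "__main__":
    for entry in asyncio.run(main()):
        print(entry)
```

If `run_async` ignored the stop signal, `await run_task` would never complete in this model, which is the toy equivalent of a replica that hangs during shutdown.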

Conclusion


This information is included in the main Service Fabric Handbook blog post. Hope you enjoy reading 🙂

See you next time!
