Policing Software Service Processing – Part 2

Part 2 in a series of articles on effective ways of controlling concurrent code execution using techniques such as quality of service (QoS), adaptive control valves, circuit breakers and back-pressure as promoted by proponents of reactive programming.

Part 1 in the series looked at the points of influence where some degree of control can be inserted into, and exerted over, the execution of code by a thread. In this post, I look at how to introduce a control policy into a system in a manageable way. It would not be practical to define resource management controls for each and every actor and activity within a system such as an enterprise application, which could easily have more than 10,000 possible activities if instrumentation were applied dynamically at runtime to all loaded class bytecode. We need some way of grouping activities into policy profiles, which is where classification comes in. A classification is a mapping of multiple named activities to a control mechanism that is guided in its intervention by a policy.

To effectively model the management of a system, we identify the actors and the activities they can perform, which, when actioned, consume one or more shared resources. In mapping this proposed system model to an application runtime such as the JVM, an actor becomes a thread, an activity becomes a method, an action becomes a call, and a resource becomes some underlying accounting counter.

[Figure: pssp-classification-table]

A classification of an activity is used to map the execution of that activity by an actor to a resource management control. Typically this mapping uses partial activity names (packages) rather than listing each and every possible activity (method), though the granularity of the targeting is in no way limited in either direction. With classification, the application of control is selective: if an activity is not matched against any of the possible classifications, it becomes a non-managed activity when actioned. This might seem strange at first, but remember that one rarely manages without also monitoring. To know exactly where management (or control) should be employed, there needs to be observation followed by analysis and understanding of the system's execution behavior, including call dependencies (caller-to-callee chains).
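As a rough illustration of mapping partial activity names to classifications, here is a minimal sketch in Java. The `PrefixClassifier` class, its methods, and the package and classification names are all hypothetical and chosen for illustration only; this is not the Sentris API. It assumes that an activity is identified by its fully qualified method name and that the first matching prefix, in declaration order, wins.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical classifier: maps activity names to classification names by name prefix. */
final class PrefixClassifier {
    // LinkedHashMap keeps declaration order, so earlier mappings take precedence.
    private final Map<String, String> prefixToClassification = new LinkedHashMap<>();

    /** Registers a mapping from a partial activity name (for example a package) to a classification. */
    PrefixClassifier map(String prefix, String classification) {
        prefixToClassification.put(prefix, classification);
        return this;
    }

    /** Returns the first matching classification, or empty when the activity is non-managed. */
    Optional<String> classify(String activity) {
        for (Map.Entry<String, String> entry : prefixToClassification.entrySet()) {
            if (activity.startsWith(entry.getKey())) {
                return Optional.of(entry.getValue());
            }
        }
        return Optional.empty(); // no match: the activity runs without any control applied
    }
}
```

With mappings for `com.acme.orders.` and `com.acme.` registered in that order, a method under `com.acme.orders` would fall into the first classification, any other `com.acme` method into the second, and everything else would remain non-managed.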

[Figure: pssp-part-2-basic-control-flow]

When an activity is actioned (that is, a method is called), a lookup of the classification is performed, if not previously done. When a classification is matched, execution control passes over to the associated control policy or control mechanism. It is possible for an activity to match multiple classifications, though it is expected that most managed runtimes, such as Sentris, will select one classification based on some ordering or on the degree of matching of the activity name(space). Sentris also supports multiple classifications, but only across different control extensions. An example of a control extension would be Quality of Service (QoS), with the extension classification being a service. Another example would be a supervisory extension based on metering quotas or adaptive control valves.
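The flow described above (look up once, then hand control to the matched mechanism) could be sketched roughly as follows. The `Control` interface and `ControlDispatcher` class are hypothetical names introduced for this example and do not reflect how Sentris is implemented. The sketch assumes the classification lookup is supplied as a function and that a real runtime would inject the equivalent of `call` through instrumentation rather than have application code invoke it.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical hook invoked around a managed call; begin() may delay the caller. */
interface Control {
    void begin();
    void end();
}

/** Hypothetical dispatcher: resolves an activity's control once and wraps each call with it. */
final class ControlDispatcher {
    private final Function<String, Optional<Control>> lookup;
    // Caches the outcome of the lookup, including "no match", per activity name.
    private final Map<String, Optional<Control>> resolved = new ConcurrentHashMap<>();

    ControlDispatcher(Function<String, Optional<Control>> lookup) {
        this.lookup = lookup;
    }

    <T> T call(String activity, Supplier<T> body) {
        // The classification lookup runs only if it has not been done for this activity.
        Optional<Control> control = resolved.computeIfAbsent(activity, lookup);
        if (!control.isPresent()) {
            return body.get(); // non-managed activity: execute directly
        }
        Control matched = control.get();
        matched.begin(); // control passes to the mechanism before the call proceeds
        try {
            return body.get();
        } finally {
            matched.end(); // always signal completion, even if the call throws
        }
    }
}
```

Caching the empty result as well as a match keeps the cost for non-managed activities to a single map read after the first call.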

[Figure: pssp-part-2-basic-control-passing]

Most control extensions would utilize one or more lower-level forms of concurrency control, such as monitors, semaphores, atomics, and latches. These could very well be introduced directly into the code by a developer, but doing so would clutter the code with multiple (crosscutting) concerns: shared state control, performance monitoring, and resource management control. Whether a developer would have the knowledge, skill, and experience to apply resource management at the appropriate points in the execution in a uniform and safe manner is debatable, to say the least. Late binding of resource management at runtime, using dynamic instrumentation and externally defined and managed policy configuration, is a far better approach, if not the only viable one.
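To make the point about lower-level primitives concrete, the following is a minimal sketch of a control mechanism built on `java.util.concurrent.Semaphore`. The `ConcurrencyLimit` class is a hypothetical name for this example, and the idea that one instance is shared by all activities in a classification is an assumption made for illustration. The business logic is passed in as a `Supplier` and contains no concurrency control of its own.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Hypothetical control mechanism: caps the number of concurrent calls sharing one classification. */
final class ConcurrencyLimit {
    private final Semaphore permits;

    ConcurrencyLimit(int maxConcurrent) {
        // A fair semaphore hands out permits to waiting threads in arrival order.
        this.permits = new Semaphore(maxConcurrent, true);
    }

    /** Runs the activity, delaying the caller only while the limit is already reached. */
    <T> T run(Supplier<T> activity) {
        permits.acquireUninterruptibly();
        try {
            return activity.get();
        } finally {
            permits.release(); // return the permit even if the activity throws
        }
    }

    /** Number of calls that could start right now without being delayed. */
    int available() {
        return permits.availablePermits();
    }
}
```

Because the limit lives outside the activity, the value passed to the constructor could come from externally managed policy configuration instead of being hard-coded next to the business logic.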

The next article in the series will look at how to create and define versatile control mechanisms using virtual resource pools assigned to QoS classifications.

A delay is not always introduced when control passes to a control mechanism. It occurs only when some form of intervention is required by the managed runtime, such as when multiple activities, in progress or beginning, match a shared classification. Whilst a delay might negatively impact performance locally, it can potentially improve the global performance of the system in terms of response time and throughput, as well as increase its resilience.