Erlang C queue model

The Erlang C distribution
Erlang C queue model calculator
Erlang C queue model input form
Interpreting the results
Optimisation
FAQ

The Erlang C distribution

The Erlang C distribution is used for dimensioning server pools where requests for service wait on a first in, first out (FIFO) queue until an idle server is available. It is based on the following assumptions:

There are an infinite number of sources;
Calls arrive at random;
Calls are served in order of arrival;
Blocked calls are delayed; and
Holding times are exponentially distributed.

The Erlang C formula predicts the probability that a call will be delayed, and can be extended to predict the probability that a call will be delayed by more than a certain time. From these, other key queue performance metrics can be calculated. The Erlang C formula is:

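\[
P(>0) = \frac{\dfrac{A^N}{N!}\cdot\dfrac{N}{N-A}}{\displaystyle\sum_{i=0}^{N-1}\dfrac{A^i}{i!} + \dfrac{A^N}{N!}\cdot\dfrac{N}{N-A}}
\]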
where:
     P(>0)=Probability of delay greater than zero
     N=Number of servers in full availability group
     A=Traffic offered to group in Erlangs
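
As a rough illustration, the formula can be evaluated in a few lines of Python. This is only a sketch and not the code behind the calculator on this page: the function names erlang_c and prob_delay_exceeds are made up for the example, and the Erlang B recursion is used instead of factorials to avoid very large intermediate numbers. The second function assumes the standard result for this model that the wait of a delayed call is exponentially distributed, with holding_time being the mean holding time in the same units as t.

```python
import math


def erlang_c(servers: int, traffic: float) -> float:
    """Return P(>0), the probability that an arriving call is delayed.

    servers is N and traffic is A (in Erlangs) from the formula above.
    """
    if servers <= 0 or traffic < 0:
        raise ValueError("servers must be positive and traffic non-negative")
    if traffic >= servers:
        return 1.0  # the queue never clears, so every call is delayed
    # Build the Erlang B blocking probability iteratively.
    blocking = 1.0
    for n in range(1, servers + 1):
        blocking = traffic * blocking / (n + traffic * blocking)
    # Convert Erlang B to Erlang C.
    return servers * blocking / (servers - traffic * (1.0 - blocking))


def prob_delay_exceeds(servers: int, traffic: float,
                       holding_time: float, t: float) -> float:
    """Return the probability that a call waits longer than t."""
    if traffic >= servers:
        return 1.0
    decay = math.exp(-(servers - traffic) * t / holding_time)
    return erlang_c(servers, traffic) * decay


if __name__ == "__main__":
    # 9 servers offered about 5.42 Erlangs, 195 second mean holding time.
    print(erlang_c(9, 5.4167))
    print(prob_delay_exceeds(9, 5.4167, 195.0, 10.0))
```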

Erlang C queue model calculator

Tables of Erlang C have been commonly published, but are unwieldy to use. This calculator finds the number of servers needed to deliver a specified service level given the transaction times and call rate. It then tabulates the expected performance for numbers of servers (agents) around that optimal value, to show how sensitive some performance metrics are to slight changes in resourcing in most scenarios.

Erlang C queue model input form

Interpreting the results

Overview

All of the results in the table are averages for all calls.

Probability of delay

This is the (per unit) probability, on average, that an incoming call will be delayed, ie that it won't go straight to a server (agent).

Speed of answer

Speed of answer is the average of the delay over all calls.

Delayed delay

Delayed delay is the average of the delay over only delayed calls.

Queue length

Queue length is the average length of the queue considering all calls.

Delayed queue length

Delayed queue length is the average length of the queue considering only delayed calls.

Service level

Service level is the per unit proportion of calls that were answered within the wait objective.

Agent utilisation

Agent utilisation is the per unit utilisation of the agents having regard for both the talk time and after call (wrap) work.

Trunk intensity

Trunk intensity is the traffic level (in Erlangs) on the trunks, taking account of the wait time. If you pay time charges on incoming calls, trunk intensity is important because it can be used to calculate the cost of incoming calls.
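
The metrics above can be sketched in Python from the Erlang C probability of delay. This is an illustration under stated assumptions, not the calculator's actual code: the function and key names are invented, the service time is taken as talk plus wrap, the two queue lengths are derived from the two delays using Little's law, and trunk intensity assumes a trunk is held during the wait and the talk time but not during wrap work.

```python
import math


def erlang_c(servers: int, traffic: float) -> float:
    """Probability that a call is delayed (Erlang C), via the Erlang B recursion."""
    if traffic >= servers:
        return 1.0
    blocking = 1.0
    for n in range(1, servers + 1):
        blocking = traffic * blocking / (n + traffic * blocking)
    return servers * blocking / (servers - traffic * (1.0 - blocking))


def queue_metrics(call_rate: float, talk: float, wrap: float,
                  servers: int, wait_objective: float) -> dict:
    """Average metrics for calls per second and times in seconds."""
    holding = talk + wrap                 # assumed agent service time
    traffic = call_rate * holding         # offered traffic in Erlangs
    if traffic >= servers:
        raise ValueError("too few servers to carry the offered traffic")
    p_delay = erlang_c(servers, traffic)
    delayed_delay = holding / (servers - traffic)
    speed_of_answer = p_delay * delayed_delay
    exceeds = p_delay * math.exp(-(servers - traffic) * wait_objective / holding)
    return {
        "probability_of_delay": p_delay,
        "speed_of_answer": speed_of_answer,
        "delayed_delay": delayed_delay,
        "queue_length": call_rate * speed_of_answer,        # Little's law
        "delayed_queue_length": call_rate * delayed_delay,  # assumed definition
        "service_level": 1.0 - exceeds,
        "agent_utilisation": traffic / servers,
        "trunk_intensity": call_rate * (talk + speed_of_answer),
    }


if __name__ == "__main__":
    # 100 calls per hour, talk=180 s, wrap=15 s, 9 agents, 10 s wait objective.
    results = queue_metrics(100 / 3600, 180, 15, 9, 10)
    for name, value in results.items():
        print(f"{name:22s} {value:8.3f}")
```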

Optimisation

Assumptions

Calls don't usually arrive at evenly spaced times, and don't usually require a constant service time. Indeed, for most traffic situations calls arrive randomly and require variable service times, with most calls requiring short service and fewer requiring long service. The assumption of exponentially distributed service times is a good estimator of many real situations and, if anything, is a little conservative where service times approach a constant. We also assume a very large pool of potential callers, and that when they call they will wait if there are no free servers. These are the assumptions that need to be valid to apply the Erlang C model to a particular scenario.

Model behaviour

The call rate and transaction times determine the raw agent workload, and set the minimum number of agents that can perform the work on average at the rate at which it arrives. If we configured a server pool on this basis, the servers would be 100% utilised (very efficient), but wait times for service would be unacceptably large.

As we add servers, wait times reduce dramatically at first, but then with diminishing returns. Hand in hand with this, agent utilisation falls off.

This model allows you to calculate the number of servers required to deliver a specified service level objective at a given traffic level.
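
One way to sketch that calculation in Python is a simple upward search from the smallest stable number of servers, followed by a short table around the result to show the diminishing returns. The names servers_required and tabulate, and the spread parameter, are hypothetical; the service time is assumed to be the total transaction time (talk plus wrap), and the service level is assumed to be 1 minus the Erlang C probability of waiting longer than the objective.

```python
import math


def erlang_c(servers: int, traffic: float) -> float:
    """Probability that a call is delayed (Erlang C), via the Erlang B recursion."""
    if traffic >= servers:
        return 1.0
    blocking = 1.0
    for n in range(1, servers + 1):
        blocking = traffic * blocking / (n + traffic * blocking)
    return servers * blocking / (servers - traffic * (1.0 - blocking))


def service_level(servers: int, traffic: float,
                  holding: float, wait_objective: float) -> float:
    """Proportion of calls answered within wait_objective."""
    if traffic >= servers:
        return 0.0
    decay = math.exp(-(servers - traffic) * wait_objective / holding)
    return 1.0 - erlang_c(servers, traffic) * decay


def servers_required(call_rate: float, holding: float,
                     target: float, wait_objective: float) -> int:
    """Smallest number of servers whose service level meets the target."""
    traffic = call_rate * holding
    servers = math.floor(traffic) + 1     # fewest servers that can keep up
    while service_level(servers, traffic, holding, wait_objective) < target:
        servers += 1
    return servers


def tabulate(call_rate: float, holding: float, target: float,
             wait_objective: float, spread: int = 2) -> list:
    """Rows of (servers, service level, speed of answer, utilisation)."""
    traffic = call_rate * holding
    best = servers_required(call_rate, holding, target, wait_objective)
    first = max(best - spread, math.floor(traffic) + 1)
    rows = []
    for n in range(first, best + spread + 1):
        answer = erlang_c(n, traffic) * holding / (n - traffic)
        rows.append((n, service_level(n, traffic, holding, wait_objective),
                     answer, traffic / n))
    return rows


if __name__ == "__main__":
    # 100 calls per hour, 195 s transactions, 85% answered within 10 s.
    for n, level, answer, utilisation in tabulate(100 / 3600, 195, 0.85, 10):
        print(f"{n:3d}  {level:6.3f}  {answer:8.1f}  {utilisation:6.3f}")
```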

Fragmentation of server pools

For a number of technology and business reasons, many organisations have installed multiple call centres handling the same type of work, but not working as a single server pool. Telecommunications cost structures and improved call centre technology now allow resources to be pooled so that they act as a single large call centre, independent of the physical topology.

Try this workload profile twice, once with the total call rate allocated to a single 'virtual call centre' and once with it allocated evenly across 10 call centres, and compare the total numbers of agents: talk=180 seconds, wrap=15 seconds, SLO=85% answered within 10 seconds, total call rate=0.2777 calls per second (1000 calls per hour). Just press Example 2 on the form above for the 1000 calls per hour scenario, and Example 3 for the 100 calls per hour scenario. (Answer: 62 agents in the virtual call centre (one queue model), and 45% more, at 90, in the 10 discrete queues model.)
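
The comparison can also be sketched in Python. This is a hedged illustration rather than the calculator's own code: servers_required is a hypothetical helper, the service time is assumed to be talk plus wrap (195 seconds), and the service level is assumed to be 1 minus the Erlang C probability of waiting longer than 10 seconds. Under those assumptions the sketch should give the same 62 and 90 agents as the answer quoted above.

```python
import math


def erlang_c(servers: int, traffic: float) -> float:
    """Probability that a call is delayed (Erlang C), via the Erlang B recursion."""
    if traffic >= servers:
        return 1.0
    blocking = 1.0
    for n in range(1, servers + 1):
        blocking = traffic * blocking / (n + traffic * blocking)
    return servers * blocking / (servers - traffic * (1.0 - blocking))


def servers_required(call_rate: float, holding: float,
                     target: float, wait_objective: float) -> int:
    """Smallest number of servers that meets the service level objective."""
    traffic = call_rate * holding
    servers = math.floor(traffic) + 1     # fewest servers that can keep up
    while True:
        decay = math.exp(-(servers - traffic) * wait_objective / holding)
        if 1.0 - erlang_c(servers, traffic) * decay >= target:
            return servers
        servers += 1


HOLDING = 180 + 15        # talk plus wrap, in seconds
TARGET = 0.85             # 85% answered...
WAIT_OBJECTIVE = 10       # ...within 10 seconds

# One virtual call centre taking all 1000 calls per hour.
pooled = servers_required(1000 / 3600, HOLDING, TARGET, WAIT_OBJECTIVE)

# Ten separate call centres taking 100 calls per hour each.
fragmented = 10 * servers_required(100 / 3600, HOLDING, TARGET, WAIT_OBJECTIVE)

print(f"single pool: {pooled} agents")
print(f"ten pools:   {fragmented} agents")
print(f"extra:       {100 * (fragmented - pooled) / pooled:.0f}%")
```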

There are no rules of thumb here. Pooling of adequately skilled resources always reduces the number of agents needed, often by 10% to 15% for large call centre operations (though savings are not limited to that range) and by more for small operations. Whether that and other benefits offset the additional costs of the technology platform and call charges needs to be examined in the context of the particular scenario, and it is often worthwhile.

Least cost

Although a service level objective may be struck from a business service perspective (say 80 percent of callers will talk to a rep within 10 seconds) and you can calculate the minimum number of servers required to achieve that objective, this might not be the point of least total cost. Depending on the cost structure of your incoming calls and labour, the savings in reduced call wait time might offset the cost of additional agents.

It might be cheaper to give better service!
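
A minimal sketch of that trade-off in Python follows. Everything about the cost structure here is an assumption made for illustration: agent_cost is a hypothetical labour cost per agent per hour, trunk_cost is a hypothetical charge per Erlang of trunk traffic per hour, and trunks are assumed to be held during the wait and the talk time only. The figures in the usage lines are invented and are not taken from any real tariff.

```python
import math


def erlang_c(servers: int, traffic: float) -> float:
    """Probability that a call is delayed (Erlang C), via the Erlang B recursion."""
    if traffic >= servers:
        return 1.0
    blocking = 1.0
    for n in range(1, servers + 1):
        blocking = traffic * blocking / (n + traffic * blocking)
    return servers * blocking / (servers - traffic * (1.0 - blocking))


def hourly_cost(servers: int, call_rate: float, talk: float, wrap: float,
                agent_cost: float, trunk_cost: float) -> float:
    """Total hourly cost of labour plus incoming call time charges."""
    holding = talk + wrap
    traffic = call_rate * holding
    # Average wait over all calls, in seconds.
    speed_of_answer = erlang_c(servers, traffic) * holding / (servers - traffic)
    # Trunk traffic in Erlangs: calls hold a trunk while waiting and talking.
    trunk_intensity = call_rate * (talk + speed_of_answer)
    return servers * agent_cost + trunk_intensity * trunk_cost


def cheapest_servers(call_rate: float, talk: float, wrap: float,
                     agent_cost: float, trunk_cost: float,
                     search: int = 50) -> int:
    """Number of servers with the lowest total hourly cost."""
    first = math.floor(call_rate * (talk + wrap)) + 1   # fewest stable servers
    return min(range(first, first + search),
               key=lambda n: hourly_cost(n, call_rate, talk, wrap,
                                         agent_cost, trunk_cost))


if __name__ == "__main__":
    # 100 calls per hour, talk=180 s, wrap=15 s, invented hourly costs.
    print(cheapest_servers(100 / 3600, 180, 15, agent_cost=30, trunk_cost=0))
    print(cheapest_servers(100 / 3600, 180, 15, agent_cost=30, trunk_cost=60))
```

With no call charges the cheapest pool is simply the smallest one that can keep up with the work; once waiting time costs money, the cheapest pool can be larger than that.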

FAQ

See my Erlang C FAQ page. If your question is not covered, ask me and if it is of general interest I will try to answer it and post it to the FAQ.
