Queuing Theory

Lesson 1 - Modeling the problem

System definition

The system of a queue is characterized by:

  • the customer arrival process (e.g. poisson process)
  • the service duration (e.g. exponential law)
  • the number of servers
  • the maximum number of customers in the system (buffer)
  • service discipline (e.g. FIFO), possible service classes (priorities), etc.

Arrival times

\(T_i\) is the arrival time of customer number \(i\).

\(U_i = T_{i+1} - T_1\) is the interarrival-time between customers number i and (i+1).

In simple models, we assume that successive inter-arrival times \(U_1, U_2, U_3, ...\):

  • are independent
  • follow the same probabibility distribution

The customer arrival process is then fully characterized by the law of inter-arrival times.

Poisson arrivals

A particular case of arrival processes is the Poisson process in which the successive inter-arrival times \(U_1, U_2, U_3, ...\) are independent, and distributed according to the exponential law with parameter \(\lambda\). Equivalently , the arrival process is a Poisson process with parameter \(\lambda\).

\[ P(U_i \geq u) = e^{-\lambda u} \]

The arrival rate \(\lambda\) is the the average number of arrivals per time unit (e.g. patients per hour). The average inter-arrvial time u equals \(1/\lambda\) (time units, e.g. hours)

import math
import numpy as np
import plotly.express as px

def tail_dist_u(l, u): # Tail distribution of interarrival times for patients
  return math.exp(-l*u)

l = 6 # l patients per hour
mean_u = 1/l # mean interarrival times in fraction of an hour
test = [0.25, 0.75, 1, 1.25]
u_list = [t*mean_u for t in test]

for u in u_list:
  print(f'The probablity that U_i is larger than {u} = {tail_dist_u(l, u)}')
The probablity that U_i is larger than 0.041666666666666664 = 0.7788007830714049
The probablity that U_i is larger than 0.125 = 0.4723665527410147
The probablity that U_i is larger than 0.16666666666666666 = 0.36787944117144233
The probablity that U_i is larger than 0.20833333333333331 = 0.2865047968601901
u_list = np.linspace(0.01, 1, num=1000)
probs = [tail_dist_u(l, u) for u in u_list]
px.line(x=u_list, y=probs, title='Exponential inter-arrival times', labels={'x': 'u', 'y': 'P'})

Lesson 2 - Service durations

Service times

The service time \(S_i\) of customer \(i\) is the time during which a server is busy servicing customer \(i\).

Service times

we assume that service times \(S_1, S_2, S_3, ...\):

  • are independent and **identically distributer* (idd).
  • are characterized by their probability distribution.

The exponential distribution is an important particular case of service duration.

\[ P(S_i \geq s) = e^{-\mu s} \]

mu = 10 # mu patients per hour
mean_s = 1/mu # mean service time in fraction of hour
test = [0.25, 0.75, 1, 1.25, 3]
s_list = [t*mean_s for t in test]

for s in s_list:
  print(f'The probablity that S_i is larger than {s} = {tail_dist_u(mu, s)}')
The probablity that S_i is larger than 0.025 = 0.7788007830714049
The probablity that S_i is larger than 0.07500000000000001 = 0.47236655274101463
The probablity that S_i is larger than 0.1 = 0.36787944117144233
The probablity that S_i is larger than 0.125 = 0.2865047968601901
The probablity that S_i is larger than 0.30000000000000004 = 0.049787068367863924
s_list = np.linspace(0.01, 1, num=1000)
probs = [tail_dist_u(mu, s) for s in s_list]
px.line(x=s_list, y=probs, title='Exponential service times', labels={'x': 's', 'y': 'P'})

Offered load

  • The arrival rate \(\lambda\) is the average number of arrivals per time unit
  • The service rate \(\mu\) is the average number of customers a server is able to handle if it is always busy. Equivalently, \(1/{\mu}\) is the average service duration.

The average performance of the system depends on the offered load \(\rho\), defined as

\[ \rho = \frac{\lambda}{\mu} \ (unit: Erlang) \]

Eg. If on average 10 clients arrive during one hour and the average service time per client is 5 minutes, then the offered load equals 50 minutes.

Lesson 3 - Servers and waiting buffer

Number of servers

In a system with multiple servers, there cannot be any customers waiting if at least one server is idle. As soon as one server is freed, a customer among those waiting uses it (if the waiting buffer is not empty).

It is generally assumed that servers are identical, meaning that the distribution of service durations is the same for all servers.

Waiting buffers

The maximum number of customers in the system is the sum of the number of positions in the waiting buffer and the number of servers.

If a customer arrives to a finite buffer that is full, this customer is lost (blocked). This is called a blocking queue.

Finite buffer

If a customer arrives to a infinite buffer and all the servers are busy, this customer lines up in the waiting buffer. This is called a pure waiting queue.

Infinite buffer

Lesson 4 - A simple example of a queue

The evolution over time of the number of customers in the system (waiting or being served) is modeled by a random process.

The stationary state of this random process is studied in order to estimate average performance (blocking probability, mean delay, mean utilization rate, etc.)

Lesson 5 - Exponential distribution

Tail distribution

\[ P(X \geq x) = e^{-\lambda x} \]

s_list = np.linspace(0.01, 1, num=1000)
probs = [tail_dist_u(mu, s) for s in s_list]
px.line(x=s_list, y=probs, title='Exponential service times', labels={'x': 'x', 'y': 'P'})
Mean \(E(X) = 1/\lambda\)
Variance \(var(X) = E(X^2) - (E(X))^2\)
\(= E((X-E(X))^2)=1/\lambda^2\)
Squared Coefficient of Variation \(Cv^2=\frac{var(x)}{E(x)^2}=1, \forall \ \lambda\)

Memoryless property

Information of the time already spent by customer 1 in the system does not influence the distribution of the inter-arrival time of customer 2.

\[ P(U_i\geq u+u_0\ |\ U_i \geq u_0) = P(U_i\geq u) = e^{-\lambda u} \]

where \(P(A\ |\ B) = \frac{P(A \cap B)}{P(B)}\) when A and B are two events such that \(P(B) \neq 0\)

Minimum of two exponential random variables

Let \(U \sim e^ \lambda\) and \(S \sim e^ \mu\). Assume that \(U\) and \(S\) are independent.

\[Z = min(U,S) \text{, so that}\] \[Z = U\ \text{ if } \ U \leq S \] \[Z = S \ \text{ if } \ U > S \] \[Z = min(U,S) \sim e^{\lambda + \mu}\]

Departure after arrival \(P(Z=U) = P(U \leq S) = \frac{\lambda}{\lambda + \mu}\)
Departure before arrival \(P(Z=S) = P(U > S) = \frac{\mu}{\lambda + \mu}\)

Lesson 6 - Poisson Process

Characterizing Arrivals with the Counting Process

\[ A(t) = \sum_i^\infty \unicode{x1D7D9}_{T_i \leq t} \]

def A(t, T):
    """
    Counting function A(t).

    Parameters:
    - t : The value up to which to count.
    - T : A list or array of T_i values.

    Returns:
    - The count of T_i values less than or equal to t.
    """
    count = sum(1 for Ti in T if Ti <= t)
    return count

# Example usage:
# Suppose we have a list of T_i values
T_values = [1, 3, 5, 7, 9, 10]  # Replace this with the actual T_i values you have

# Now, we can compute A(t) for a specific t value
t = 6  # Replace this with the t value you want to use for counting
count = A(t, T_values)

print(f'A({t}) = {count}')
A(6) = 3

\[ \lim_{{\Delta t \to 0}} \frac{o(\Delta t)}{\Delta t} = 0 \]

This expression is used in calculus to describe the behavior of a function as the change in the variable \(\Delta t\) approaches zero. The function \(o(\Delta t)\) is a “little-o notation” which is used to describe a function that grows much slower than \(\Delta t\) as \(\Delta t\) approaches zero. In other words, \(o(\Delta t)\) becomes insignificant compared to \(\Delta t\).

The little-o notation \(o(\Delta t)\) is often used to describe an error term or a remainder term in Taylor series expansions, and it means that the ratio \(\frac{o(\Delta t)}{\Delta t}\) goes to zero as \(\Delta t\) goes to zero. This is a way of saying that \(o(\Delta t)\) is negligible compared to \(\Delta t\) in the limit where \(\Delta t\) is very small.

This is an important concept in differential calculus and is used to formalize the definition of the derivative, among other things. It essentially says that whatever the function \(o(\Delta t)\) is, it becomes irrelevant as \(\Delta t\) becomes very small, because it grows much slower than \(\Delta t\) itself.

\[P(A(t + \Delta t) = n + 1 \mid A(t) = n) = \lambda \Delta t + o(\Delta t)\]

\[P(A(t + \Delta t) = n \mid A(t) = n) = 1 - \lambda \Delta t + o(\Delta t)\]

The expressions described above ensure that it is not possible for two or more customers to arrive at exactly the same time. Ergo: there will always be a time difference, however small, between two consecutive arrivals.