Advanced Upstreams Configuration

EdgePeak can serve content it obtains from upstream servers (i.e., origins) and cache it as needed to reduce the load on the upstream networks and servers.

EdgePeak lets you declare multiple groups of upstream servers, each with its own load-balancing policy, retry policy, and dedicated connection pool. If you need distinct settings for two contexts on the same set of servers, declare multiple origin groups: a server may be declared in several groups.

Each server in a group has its own connection pool on each worker. The number of connections kept alive is configurable, and the maximum number of concurrently open connections can be capped by limiting the number of concurrent in-flight requests. These settings are essential for getting the best out of origins, whose performance may collapse when they are overloaded.
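The interaction between the keep-alive limit and the in-flight limit can be illustrated with a minimal sketch. The class name `connection_pool` and its parameters `max_idle` and `max_inflight` are hypothetical and are not EdgePeak settings; the sketch only counts connections and assumes one pool per worker, so it does no locking.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-server, per-worker pool (illustrative names, not EdgePeak's API).
// It only counts connections; a real pool would hold sockets.
class connection_pool {
public:
    connection_pool(std::size_t max_idle, std::size_t max_inflight)
        : max_idle_(max_idle), max_inflight_(max_inflight) {
        assert(max_inflight > 0);
    }

    // Try to start a request. Returns false when the in-flight cap is reached,
    // in which case the caller should queue the request or pick another server.
    bool try_acquire() {
        if (inflight_ >= max_inflight_) return false;
        if (idle_ > 0) --idle_;   // reuse a kept-alive connection
        else ++opened_;           // otherwise open a new connection
        ++inflight_;
        return true;
    }

    // Finish a request: keep the connection alive if there is room,
    // otherwise it is considered closed.
    void release() {
        assert(inflight_ > 0);
        --inflight_;
        if (idle_ < max_idle_) ++idle_;
    }

    std::size_t inflight() const { return inflight_; }
    std::size_t idle() const { return idle_; }
    std::size_t opened() const { return opened_; }  // total connections ever opened

private:
    std::size_t max_idle_;
    std::size_t max_inflight_;
    std::size_t inflight_ = 0;
    std::size_t idle_ = 0;
    std::size_t opened_ = 0;
};
```

Because each request occupies one connection, capping in-flight requests also caps open connections, which is the mechanism described above.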

EdgePeak can automatically exclude (denylist) a server from its group when specific errors occur or when the server returns a configurable status code.

📘

IMPORTANT

Do not enable denylisting on an upstream that is actually a load balancer: several servers sit behind the load balancer's single IP address, so excluding that address would exclude all of them. In this case, also allow retries on the same IP address (rather than switching to the next server), because one server behind the load balancer may be down while the others are still working.
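As an illustration of the exclusion logic, here is a minimal sketch of a denylist keyed by server. Everything in it is an assumption made for the example: the class name `denylist`, the fixed cooldown after which a server becomes eligible again, and the `enabled` flag used to switch exclusion off for a load-balancer upstream. The source does not describe how long EdgePeak keeps a server excluded.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical denylist (illustrative only, not EdgePeak's implementation).
class denylist {
public:
    denylist(std::set<int> deny_statuses, std::uint64_t cooldown_ms, bool enabled)
        : deny_statuses_(std::move(deny_statuses)),
          cooldown_ms_(cooldown_ms),
          enabled_(enabled) {
        assert(cooldown_ms > 0);
    }

    // Record a response status; exclude the server if the status is configured
    // as a denylisting status and the denylist is enabled.
    void report(const std::string& server, int status, std::uint64_t now_ms) {
        if (enabled_ && deny_statuses_.count(status) != 0) {
            denied_until_[server] = now_ms + cooldown_ms_;
        }
    }

    // A server is excluded until its (assumed) cooldown has elapsed.
    bool is_denied(const std::string& server, std::uint64_t now_ms) const {
        const auto it = denied_until_.find(server);
        return it != denied_until_.end() && now_ms < it->second;
    }

private:
    std::set<int> deny_statuses_;
    std::uint64_t cooldown_ms_;
    bool enabled_;
    std::unordered_map<std::string, std::uint64_t> denied_until_;
};
```

With `enabled` set to false, a failing response from a load balancer's single address never excludes it, so a retry can reach another server behind the same address.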


Naming an Upstream Endpoint

Requests can be proxied to any upstream group using the set_proxy directive. The request and its headers can be altered before the request is processed. The headers of the upstream response can be altered before the response is stored in the cache (which is shared across all clients). Finally, the response served from the cache can be altered, both headers and body, before it is sent to the client.

Declare each upstream group under a name. This name is used in the set_proxy directive and appears in statistics reporting.

config.upstreams["myname"] = {
      .load_balancing = balancing::rendez_vous,
      .endpoints = {"http://192.168.1.1:8100", "http://192.168.1.2:8100", "http://192.168.1.3:8100"}
};
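The name balancing::rendez_vous suggests rendezvous (highest random weight) hashing. The following is a generic sketch of that technique, not EdgePeak's implementation: the function names `fnv1a` and `rendezvous_pick` are hypothetical, the hash function is an arbitrary choice, and endpoint weights are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// 64-bit FNV-1a, used here only because it is short and deterministic.
inline std::uint64_t fnv1a(const std::string& data) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char c : data) {
        h ^= c;
        h *= 1099511628211ULL;                  // FNV prime
    }
    return h;
}

// Score every endpoint against the request key and return the index of the
// endpoint with the highest score. Ties keep the earliest endpoint.
inline std::size_t rendezvous_pick(const std::string& key,
                                   const std::vector<std::string>& endpoints) {
    assert(!endpoints.empty());
    std::size_t best = 0;
    std::uint64_t best_score = 0;
    for (std::size_t i = 0; i < endpoints.size(); ++i) {
        const std::uint64_t score = fnv1a(endpoints[i] + '\n' + key);
        if (i == 0 || score > best_score) {
            best = i;
            best_score = score;
        }
    }
    return best;
}
```

The useful property for a cache is stability: removing an endpoint only remaps the keys that were assigned to it, so the other endpoints keep serving the same keys.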

Advanced Upstream Endpoint

The endpoints property of upstream_configuration objects contains a list of URLs (scheme, host or IP addresses, and port), together with metadata such as the weight and distance that influence load-balancing algorithm behavior.

🚧

IMPORTANT

Implement and describe something about "Host" header.

The endpoint is thus specified as scheme://address:port or scheme://address. The address may be any of:

  • IPv4 address as in https://192.168.1.1
  • IPv6 address as in https://[fd00::33]
  • FQDN (Fully Qualified Domain Name) as in https://shield1.my.domain.com. In this case, whenever a new connection is established, the FQDN will be resolved. Resolutions are cached internally according to their TTLs.
  • Domain name (not fully qualified). It is handled like an FQDN, except that search domains from /etc/resolv.conf are honored
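The accepted forms can be made concrete with a small parser sketch. The names `parsed_endpoint` and `parse_endpoint` are hypothetical and EdgePeak's own parsing rules may differ; the sketch assumes that IPv6 addresses are always bracketed and that the port, when present, is a decimal number no larger than 65535.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical result type for this sketch.
struct parsed_endpoint {
    std::string scheme;
    std::string address;             // IPv4, IPv6 (without brackets), or a name
    std::optional<int> port;         // empty when the URL has no explicit port
    bool is_ipv6_literal = false;
};

// Parse scheme://address or scheme://address:port. Returns nullopt on bad input.
inline std::optional<parsed_endpoint> parse_endpoint(const std::string& url) {
    const std::size_t sep = url.find("://");
    if (sep == std::string::npos || sep == 0) return std::nullopt;

    parsed_endpoint out;
    out.scheme = url.substr(0, sep);
    const std::string rest = url.substr(sep + 3);
    if (rest.empty()) return std::nullopt;

    std::string port_part;
    bool has_port = false;
    if (rest.front() == '[') {
        // Bracketed IPv6 literal, optionally followed by :port.
        const std::size_t close = rest.find(']');
        if (close == std::string::npos) return std::nullopt;
        out.address = rest.substr(1, close - 1);
        out.is_ipv6_literal = true;
        const std::string tail = rest.substr(close + 1);
        if (!tail.empty()) {
            if (tail.front() != ':') return std::nullopt;
            port_part = tail.substr(1);
            has_port = true;
        }
    } else {
        // IPv4 address or name: the last colon, if any, starts the port.
        const std::size_t colon = rest.rfind(':');
        if (colon == std::string::npos) {
            out.address = rest;
        } else {
            out.address = rest.substr(0, colon);
            port_part = rest.substr(colon + 1);
            has_port = true;
        }
    }
    if (out.address.empty()) return std::nullopt;

    if (has_port) {
        if (port_part.empty()) return std::nullopt;
        int port = 0;
        for (char c : port_part) {
            if (c < '0' || c > '9') return std::nullopt;
            port = port * 10 + (c - '0');
            if (port > 65535) return std::nullopt;
        }
        assert(port >= 0 && port <= 65535);
        out.port = port;
    }
    return out;
}
```

The sketch does not distinguish an FQDN from a plain domain name; that distinction only matters at resolution time, when search domains may be applied.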

The optional weight and distance can be specified using one of the following syntaxes.

config.upstreams["origins"] = {
      .endpoints = {"http://127.0.0.1:8080", "http://127.0.0.1:8080"}
};

config.upstreams["origins2"] = {
      .endpoints = {{.host = "http://127.0.0.1:8080", .weight = 0},
                    {.host = "http://127.0.0.1:8080", .weight = 16}}
};

auto & origins3 = config.upstreams["origins3"] = {
   .endpoints = {{.host = "http://127.0.0.1:8080", .weight = 0},
                 {.host = "http://127.0.0.1:8080", .weight = 12}}
};

origins3.endpoints.emplace_back({.host = "http://127.0.0.1:8080"});
origins3.endpoints.emplace_back("http://127.0.0.1:8080");

🚧

IMPORTANT

If an FQDN resolves to multiple IP addresses and an endpoint is declared directly with that FQDN, the endpoint is treated as a single server: traffic won't be properly balanced, and the retry and denylisting mechanisms won't work (the same issue arises when a VIP or IP anycast load balancer is used).
In this case, the configuration must declare that the list of endpoints is to be extracted from DNS. The list of endpoints is then updated dynamically, at each expiration of the DNS record, to match its A / AAAA records.


DNS resolution

To keep the list of endpoints in sync with DNS records, use the dns_resolve_ipv4, dns_resolve_ipv6, or dns_resolve_any functions. Each function takes a list of endpoints (with scheme, port, weight, and distance, using the syntax above).

config.upstreams["origins"] = {
      .endpoints = dns_resolve_ipv4({"http://shields:8080"})
};
config.upstreams["origins2"] = {
      .endpoints = dns_resolve_any(
                 {{.host = "http://shields_set1:8080", .weight = 8},
                  {.host = "http://shield_set2:8080", .weight = 16}})
};
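Conceptually, each declared endpoint is expanded into one endpoint per resolved address, each inheriting the scheme, port, and weight of the declaration. The sketch below illustrates that expansion only. The types `endpoint_template` and `endpoint`, the function `expand_from_dns`, and the injected `resolver` callback are all hypothetical; the sketch performs no real DNS lookup and does not model the refresh that happens when a record expires.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical types for this sketch (not EdgePeak's API).
struct endpoint_template {
    std::string scheme;
    std::string host;   // name to resolve
    int port;
    int weight;
};

struct endpoint {
    std::string url;
    int weight;
};

// The resolver is injected so the sketch stays independent of any DNS library.
using resolver = std::function<std::vector<std::string>(const std::string& host)>;

// Expand every template into one endpoint per address returned by the resolver.
inline std::vector<endpoint> expand_from_dns(
        const std::vector<endpoint_template>& templates,
        const resolver& resolve) {
    assert(static_cast<bool>(resolve));
    std::vector<endpoint> out;
    for (const auto& t : templates) {
        for (const auto& ip : resolve(t.host)) {
            // IPv6 addresses contain ':' and must be bracketed in a URL.
            const bool v6 = ip.find(':') != std::string::npos;
            const std::string address = v6 ? "[" + ip + "]" : ip;
            out.push_back({t.scheme + "://" + address + ":" + std::to_string(t.port),
                           t.weight});
        }
    }
    return out;
}
```

Each resulting endpoint is a distinct server, so balancing, retries, and denylisting can act on individual addresses rather than on the name as a whole.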

📘

NOTE

Kubernetes exposes the IP addresses of pods through its DNS service. Using the dns_resolve functions is thus the idiomatic way to integrate with Kubernetes service discovery.