[Infra] HTTP load balancing using HAProxy
Anyone who has spent many hours working with servers and networks is probably familiar with the term load balancing. Putting load balancing into practice, however, is never an easy task, because it requires solid knowledge of systems and network traffic. This post introduces HAProxy, a tool for load balancing HTTP traffic, and walks through a small lab built on macOS and VirtualBox.
A glance at HAProxy
HAProxy, which stands for High Availability Proxy, is open source software built for load balancing at Layer 4 (TCP) and Layer 7 (HTTP). It runs mainly on Linux, Solaris and FreeBSD. Its main role is to improve the performance of a distributed system by spreading the workload across multiple servers. Many sites use HAProxy as a load balancer today, such as GitHub, Imgur, Instagram and Twitter.
HAProxy has many useful features built in, but I want to concentrate on three key concepts that form its foundation:
Access Control List (ACL)
An Access Control List defines rules for choosing a server based on characteristics of the incoming traffic. ACLs let us split a main application that bundles many functions into different kinds of services, each served by separate servers. An ACL has two crucial parts:
* Define a criterion together with a set of values to match against
* Perform an action when the criterion matches one of those values
To write an ACL, we follow this syntax:
acl <aclname> <criterion> [flags] [operator] <value> ...
- acl: the keyword for an access control list
- <aclname>: the name of the ACL. Names are case-sensitive, so my_acl and My_Acl are two different ACLs
- <criterion>: the part of the request/response to match against
- [flags]: options that change how matching is done. (I) -i ignores case during matching, (II) -f loads patterns from a file and (III) -- forces the end of flags, useful when a string looks like one of the flags
- [operator]: a comparison operator, depending on the type of match (integer, string, regex, network or IP address). E.g. eq is true if the tested value equals at least one value, ge is true if the tested value is greater than or equal to at least one value
- <value>: the value or pattern to compare against, as the name suggests
For more detail about Access Control Lists, visit the HAProxy documentation page.
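As a rough illustration of the syntax above, here are a few ACL declarations. The ACL names (is_api, is_static, from_lan, is_post, busy) and all patterns are made up for this sketch. The criteria used (path_beg, path_end, src, method, fe_conn) are standard HAProxy sample fetches, but check the documentation of your HAProxy version before relying on them:

```
# Hypothetical names and patterns, for illustration only.
# These lines belong inside a frontend, listen or backend section.
acl is_api     path_beg -i /api             # path starts with /api, case ignored
acl is_static  path_end -i .jpg .png .css   # true if the path ends with any listed value
acl from_lan   src 192.168.56.0/24          # client address inside this network
acl is_post    method POST                  # HTTP method is POST
acl busy       fe_conn ge 1000              # integer match using the ge operator
```

An ACL only names a condition. It has no effect until another rule, such as use_backend, refers to it.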
Frontend
A frontend is the point that receives incoming requests and forwards them to suitable backends. In the HAProxy configuration, a frontend is typically defined by three components:
- a set of IP addresses and ports (e.g. 10.1.2.3:80, *:22, etc.)
- ACLs
- use_backend rules, which pick a backend when an ACL condition matches, and a default_backend rule for every remaining case
We can configure a frontend to suit various kinds of network traffic, from Layer 4 to Layer 7 of the OSI model.
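To make the three components concrete, here is a minimal frontend sketch. The frontend name http-in, the ACL names and the backends blog-backend and shop-backend are assumptions for illustration. Each backend named here would have to be defined elsewhere in the same configuration file:

```
# Hypothetical frontend: names and paths are illustrative only
frontend http-in
    # 1. Address and port to listen on
    bind *:80
    mode http

    # 2. ACLs that classify requests by URL path
    acl url_blog path_beg /blog
    acl url_shop path_beg /shop

    # 3. Routing rules: the first matching use_backend wins,
    #    default_backend handles everything else
    use_backend blog-backend if url_blog
    use_backend shop-backend if url_shop
    default_backend web-backend
```

With this layout, a request for /blog/post-1 would go to blog-backend, while a request for / would fall through to web-backend.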
Backend
A backend contains the list of servers that receive forwarded requests. Fundamentally, a backend is defined by:
- A load balancing algorithm
- A list of servers and ports
For example, here is a sample backend configuration:
backend web-backend
    balance roundrobin
    server web-1 web1domain.com:80 check
    server web-2 web2domain.com:80 check
As for algorithms, these are some of the commonly used ones:
- Round Robin: the most common one. Servers take turns handling forwarded requests, in proportion to their weights. Well suited to HTTP
- Least Connections: the server with the lowest number of connections is chosen. Recommended for long sessions such as LDAP, SQL, TSE...
- Source: the server is chosen from a hash of the client's source IP address, so the same client keeps reaching the same server as long as no server goes up or down
- URI/URL: similar to source, but the hash input is the request URI or a URL parameter
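The algorithms above map onto the balance keyword of a backend. The sketch below shows one backend per algorithm. The backend names and weights are assumptions for illustration, and the server addresses simply reuse the lab addresses from later in this post:

```
# Hypothetical backends, one per algorithm

backend rr-backend
    balance roundrobin
    # With weights 2 and 1, web-1 receives roughly twice as many requests as web-2
    server web-1 192.168.56.102:80 weight 2 check
    server web-2 192.168.56.103:80 weight 1 check

backend lc-backend
    # Pick the server with the fewest active connections
    balance leastconn
    server web-1 192.168.56.102:80 check
    server web-2 192.168.56.103:80 check

backend src-backend
    # Hash the client IP so a client keeps reaching the same server
    balance source
    server web-1 192.168.56.102:80 check
    server web-2 192.168.56.103:80 check

backend uri-backend
    # Hash the request URI so the same URI keeps reaching the same server
    balance uri
    server web-1 192.168.56.102:80 check
    server web-2 192.168.56.103:80 check
```

Only one balance algorithm applies per backend, so the choice depends on whether even distribution (roundrobin, leastconn) or stickiness (source, uri) matters more for the service.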
Laboratory
Experimenting with infrastructure is never easy, especially with high-performance machines like servers. In this part I will show how to build a small HAProxy lab using VirtualBox on OS X El Capitan.
Setup environment
To set up HAProxy, I need at least 3 servers: 2 HTTP servers (192.168.56.102 and 192.168.56.103) and 1 HAProxy server (192.168.56.101), connected like this:
All servers here run Ubuntu 14.04 Server, and their IP addresses should be configured statically as in the diagram above. For anyone not familiar with VirtualBox on OS X, I recommend giving each server at least 2 network interfaces:
- An interface running NAT: to reach the Internet, for installing packages
- An interface attached to an internal or host-only network: connects to the 192.168.56.0/24 network, used privately by the HAProxy lab. For more detail about setting up a VirtualBox internal network, refer to this link
Web servers
Assign IP address
auto eth0
iface eth0 inet dhcp

# The host only network interface
auto eth2
iface eth2 inet static
    # For server 2: 192.168.56.103
    address 192.168.56.102
    netmask 255.255.255.0
Install Apache2
sudo apt-get install apache2
To tell the two web servers apart, let's modify /var/www/html/index.html:
<body>
  <div class="main_page">
    <div class="page_header floating_element">
      <img src="/icons/ubuntu-logo.png" alt="Ubuntu Logo" class="floating_element"/>
      <span class="floating_element">
        Web server 1(2)
      </span>
    </div>
    <div class="content_section floating_element">
    </div>
  </div>
</body>
This HTML is the content of the page served by each server's HTTP service. We mark it with the server number so that we can tell which server a page came from.
HAProxy server
Installing HAProxy is simple:
sudo apt-get install haproxy
To enable the service, edit /etc/default/haproxy and set ENABLED to 1.
Configuration
The crucial part of HAProxy lives in /etc/haproxy/haproxy.cfg, which defines how HAProxy behaves. This is the main configuration:
global
    log 127.0.0.1 local2
    chroot /var/lib/haproxy
    stats socket /run/haproxy/admin.sock mode 660 level admin
    stats timeout 30s
    user haproxy
    group haproxy
    daemon

    # Default SSL material locations
    ca-base /etc/ssl/certs
    crt-base /etc/ssl/private

    # Default ciphers to use on SSL-enabled listening sockets
    # For more information, see ciphers(1SSL)
    ssl-default-bind-ciphers kEECDH+aRSA+AES:kRSA+AES:+AES256:RC4-SHA:!kEDH:!LOW:!EXP:!MD5:!aNULL:!eNULL

defaults
    mode http
    log global
    option httplog
    option dontlognull
    option http-server-close
    option forwardfor except 127.0.0.0/8
    option redispatch
    retries 3
    # Timeout values without a unit suffix are read as milliseconds
    timeout http-request 20
    timeout queue 86400
    timeout connect 86400
    timeout client 86400
    timeout server 86400
    timeout http-keep-alive 30
    timeout check 20
    maxconn 50000

frontend LB
    bind 192.168.56.101:80
    # Spaces inside a header value must be escaped with a backslash
    reqadd X-Forwarded-Proto:\ http
    default_backend LB

backend LB
    mode http
    stats enable
    stats hide-version
    stats uri /stats
    stats realm Haproxy\ Statistics
    stats auth haproxy:admin    # Credentials for HAProxy Statistic report page.
    balance roundrobin          # Load balancing will work in round-robin process.
    option httpchk
    option httpclose
    option forwardfor
    cookie LB insert
    server web1-srv 192.168.56.102:80 cookie web1-srv check    # backend server.
    server web2-srv 192.168.56.103:80 cookie web2-srv check    # backend server.
The global and defaults sections contain basic settings for incoming traffic and requests. We can easily spot the frontend and backend rules mentioned above. HAProxy also ships another tool for monitoring and tracking the system: statistics. Stats are enabled here, and the report is available as a web page at /stats with the credentials haproxy/admin. There we can see detailed information about requests coming in and being forwarded to the backend.
Now everything is done. Let's try it by accessing the frontend IP address 192.168.56.101 from different browsers. Because the backend sets a cookie (cookie LB insert), each browser sticks to the server it first reached, so different browsers show the pages coming from different servers.
Safari
Firefox
The bottom line
It is very hard to compare HAProxy with other methods, because the outcome depends on the skills of the administrators as well as the strength of the system. However, HAProxy softens the challenges of scaling a large application, and we can now forget the nightmares of server configuration when adding or removing servers.