Combining service registration and discovery with automatic scaling of the nginx upstream pool

Summary

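The usual way to update an nginx upstream pool is to edit the nginx configuration file and then reload nginx. When nginx is busy, a reload can leave the old worker processes in the shutting-down state for a long time, and frequent reloads may even bring nginx down.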

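The approach used in this article is consul + consul-template + Tengine dyups + a Python script. When a service changes, consul-template re-renders the template and persists it to disk, then runs the Python script. The script first calls the Consul API to fetch the current information for the service, then writes that information into the nginx upstream pool through the dyups management interface.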

This approach has two benefits: first, the nginx upstream pool is persisted to disk; second, nginx no longer needs to be reloaded when a service changes.

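In brief, the flow is: a service changes -> consul-template renders and persists the template -> consul-template runs the Python script -> the script queries the Consul API -> the script writes the new server list to the dyups management interface.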

Compiling and installing Tengine

Tengine can load dynamic modules (*.so), but ngx_http_dyups_module cannot be built as a dynamic module. Tengine has to be recompiled with "--with-http_dyups_module" enabled.
Tengine configure options
--prefix=/usr/local/tengine \
--with-http_ssl_module --with-http_stub_status_module --with-http_gzip_static_module --with-http_gunzip_module \
--with-http_realip_module --add-module=modules/nginx-module-vts --add-module=modules/ngx_cache_purge-2.3 \
--with-http_dyups_module
nginx.conf
daemon on;
error_log logs/error.log debug;
events {
}
dso {
    load ngx_http_echo_module.so;
}
http {

    vhost_traffic_status_zone;

    dyups_upstream_conf upstream_conf/*.conf;
    include upstream_conf/*.conf;

    server {
        listen 8081;

        location / {
            dyups_interface;
        }

        location /ng_status {
            access_log off;
            vhost_traffic_status_display;
            vhost_traffic_status_display_format html;
            allow 192.168.1.0/24;
            deny all;
        }
    }
}
Other
mkdir -p /usr/local/tengine/conf/upstream_conf/
Do not place any conf files in this directory by hand; every conf file in it is generated by consul-template.
As an aside: building a dynamic module and loading it, using the "echo-nginx-module" module as an example.
Module:
https://github.com/openresty/echo-nginx-module
http://tengine.taobao.org/document/dso.html

Build:
/usr/local/tengine/sbin/dso_tool --add-module=/tmp/nginx/modules/echo-nginx-module-0.61/

nginx.conf:
dso {
    load ngx_http_echo_module.so;
}

Service registration

The test environment is the same as the one in the earlier consul article. Note that a registered service name must not contain "."; consul-template cannot resolve a service name such as prod.cloudiya.com, and the Consul documentation mentions this as well.

IP             ROLE                                    Registered service
192.168.1.4    consul server                           prod-cloudiya-com
192.168.1.5    consul server                           prod-cloudiya-com
192.168.1.6    consul server                           prod-cloudiya-com
192.168.1.100  consul client                           test-cloudiya-com
192.168.1.101  consul client, consul-template, nginx   test-cloudiya-com
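
The naming rule above can be captured in a couple of small helpers. This is only a sketch: the function names are hypothetical and not part of the article, and the mapping simply mirrors the "." to "-" replacement that the article's script performs on the domain name.

```python
def domain_to_service_name(domain):
    """Map a domain such as 'prod.cloudiya.com' to a dot-free Consul service name."""
    return domain.strip().replace(".", "-")


def is_valid_service_name(name):
    """Return True when the name is non-empty and contains no '.'."""
    return bool(name) and "." not in name


for domain in ("prod.cloudiya.com", "test.cloudiya.com"):
    name = domain_to_service_name(domain)
    print("%s -> %s (valid: %s)" % (domain, name, is_valid_service_name(name)))
```

Keeping the conversion in one place means the upstream name (with dots) and the Consul service name (with dashes) can always be derived from each other in the same way.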
Service definition files

prod-cloudiya-com.json

{
    "service":
    {
        "name": "prod-cloudiya-com",
        "tags": ["web","prod"],
        "port": 8090,
        "check":
        {
            "DeregisterCriticalServiceAfter": "90m",
            "http": "http://127.0.0.1:8090",
            "interval": "10s",
            "timeout": "1s"
        }
    }
}

test-cloudiya-com.json

{
    "service":
    {
        "name": "test-cloudiya-com",
        "tags": ["web","test"],
        "port": 8090,
        "check":
        {
            "DeregisterCriticalServiceAfter": "90m",
            "http": "http://127.0.0.1:8090",
            "interval": "3s",
            "timeout": "1s"
        }
    }
}
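On each of the Consul hosts listed above, register the corresponding service with Consul, and start a simple HTTP server on each host to act as the backend: "/usr/bin/nohup python -m SimpleHTTPServer 8090 > /dev/null 2>&1 &".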
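
The article registers the services from definition files. As an alternative sketch, the same definition could be sent to the local agent over Consul's HTTP API (PUT /v1/agent/service/register). The helper names below are hypothetical, and the agent address in the usage comment is an assumption; note that the HTTP API takes capitalized field names, unlike the definition file.

```python
import json

try:
    import httplib                  # Python 2
except ImportError:
    import http.client as httplib   # Python 3


def build_registration(name, port, tags, interval="10s"):
    """Build the request body for the agent's service registration endpoint."""
    return {
        "Name": name,
        "Tags": list(tags),
        "Port": port,
        "Check": {
            "DeregisterCriticalServiceAfter": "90m",
            "HTTP": "http://127.0.0.1:%d" % port,
            "Interval": interval,
            "Timeout": "1s",
        },
    }


def register(agent_host, agent_port, payload):
    """PUT the payload to a Consul agent and return the HTTP status code."""
    conn = httplib.HTTPConnection(agent_host, agent_port, timeout=2)
    try:
        conn.request("PUT", "/v1/agent/service/register",
                     json.dumps(payload),
                     {"Content-Type": "application/json"})
        return conn.getresponse().status
    finally:
        conn.close()


# Usage (assumes an agent listening on 127.0.0.1:8500):
#   payload = build_registration("test-cloudiya-com", 8090, ["web", "test"], "3s")
#   register("127.0.0.1", 8500, payload)
```

The payload builder is kept separate from the network call so that the definition can be inspected or logged before anything is sent to the agent.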

consul-template

/opt/consul-template/template/prod.cloudiya.com.conf.ctmpl
# ----------prod.cloudiya.com------------------
server {
    listen 80;
    server_name prod.cloudiya.com;
    location / {
        proxy_pass http://$host;
    }
}

upstream prod.cloudiya.com {
    {{range service "prod-cloudiya-com"}}
    server {{.Address}}:{{.Port}};
    {{else}}
    server 127.0.0.1:65535;
    {{end}}
}
/opt/consul-template/template/test.cloudiya.com.conf.ctmpl
# ---------test.cloudiya.com------------------
server {
    listen 80;
    server_name test.cloudiya.com;
    location / {
        proxy_pass http://$host;
    }
}

upstream test.cloudiya.com {
    {{range service "test-cloudiya-com"}}
    server {{.Address}}:{{.Port}};
    {{else}}
    server 127.0.0.1:65535;
    {{end}}
}
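
The range/else construct in the two templates above emits one server line per instance and falls back to the placeholder 127.0.0.1:65535 when no instance is returned, so the rendered upstream block is never empty (nginx rejects an upstream with no servers). A minimal Python sketch of that same logic, with a hypothetical function name and instances given as (address, port) pairs, looks like this:

```python
def render_upstream(name, instances, fallback="127.0.0.1:65535"):
    """Render an nginx upstream block from a list of (address, port) pairs.

    When the list is empty, a single placeholder server is emitted so the
    block stays syntactically valid.
    """
    lines = ["upstream %s {" % name]
    if instances:
        for address, port in instances:
            lines.append("    server %s:%s;" % (address, port))
    else:
        lines.append("    server %s;" % fallback)
    lines.append("}")
    return "\n".join(lines)


print(render_upstream("test.cloudiya.com",
                      [("192.168.1.100", 8090), ("192.168.1.101", 8090)]))
print(render_upstream("test.cloudiya.com", []))
```

This only illustrates what the template produces; in the article's setup the rendering itself is done by consul-template, not by Python.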
/opt/consul-template/conf/nginx.hcl
consul {
    address = "192.168.1.101:8500"
}
template {
    source = "/opt/consul-template/template/prod.cloudiya.com.conf.ctmpl"
    destination = "/usr/local/tengine/conf/upstream_conf/prod.cloudiya.com.conf"
    command = "python /opt/consul-template/script/UpdateNgUpstreamPool.py prod.cloudiya.com"
}
template {
    source = "/opt/consul-template/template/test.cloudiya.com.conf.ctmpl"
    destination = "/usr/local/tengine/conf/upstream_conf/test.cloudiya.com.conf"
    command = "python /opt/consul-template/script/UpdateNgUpstreamPool.py test.cloudiya.com"
}
Run consul-template
consul-template -config "/opt/consul-template/conf/nginx.hcl"

UpdateNgUpstreamPool.py

#!/usr/bin/python
# Push the healthy instances of a Consul service into a Tengine dyups upstream.
# Usage: UpdateNgUpstreamPool.py <service domain name>, e.g. test.cloudiya.com

import sys
import json

try:
    import httplib                  # Python 2
except ImportError:
    import http.client as httplib   # Python 3

consul_addr = "192.168.1.101"
consul_port = 8500
nginx_manage_addr = "192.168.1.101"
nginx_manage_port = 8081


def CallApiGetData(Host, Port, ReqUrlSuffix):
    """
    Send a GET request to the Consul HTTP API and return the decoded JSON body.
    Exits with status 1 on a non-200 response or on a connection error.
    """
    httpClient = None
    try:
        httpClient = httplib.HTTPConnection(Host, Port, timeout=2)
        httpClient.request('GET', ReqUrlSuffix)
        response = httpClient.getresponse()
        if response.status == 200:
            resp_string = response.read().decode("utf-8")
            return json.loads(resp_string)
        else:
            print("return code %s" % response.status)
            sys.exit(1)
    except Exception as e:
        print("call api error!,%s" % e)
        sys.exit(1)
    finally:
        if httpClient:
            httpClient.close()


def SendDataToApi(Host, Port, ReqUrlSuffix, method, Data):
    """Send the upstream definition (or a DELETE) to the dyups interface."""
    httpClient = None
    try:
        headers = {"Content-type": "application/x-www-form-urlencoded",
                   "Accept": "text/plain"}
        httpClient = httplib.HTTPConnection(Host, Port, timeout=2)
        httpClient.request(method, ReqUrlSuffix, Data, headers)
        response = httpClient.getresponse()
        print("%s %s" % (response.status, response.read().decode("utf-8")))
    except Exception as e:
        print("Error Info : %s" % e)
    finally:
        if httpClient:
            httpClient.close()


def ServiceDataFormat(Data):
    """Turn the health API result into "server ip:port;server ip:port;"."""
    upstream_hosts = []
    for service in Data:
        node = service["Node"]["Address"]
        port = service["Service"]["Port"]
        upstream_hosts.append("server %s:%s" % (node, port))
    return ";".join(upstream_hosts) + ";"


if __name__ == "__main__":

    if len(sys.argv) != 2:
        print("usage: %s <service domain name>" % sys.argv[0])
        sys.exit(1)

    # The upstream keeps the dotted domain name; the Consul service name
    # uses "-" instead of ".".
    service_domain_name = sys.argv[1]
    service_name = service_domain_name.replace(".", "-")

    # Get the healthy instances of the service, e.g.
    # http://192.168.1.101:8500/v1/health/service/test-cloudiya-com?passing
    ret_data = CallApiGetData(consul_addr, consul_port,
                              "/v1/health/service/%s?passing" % service_name)
    if ret_data:
        upstream_hosts_str = ServiceDataFormat(ret_data)
        method = "POST"
    else:
        # No healthy instance left: remove the upstream from nginx.
        upstream_hosts_str = ""
        method = "DELETE"

    # Send the upstream data to nginx, equivalent to:
    # curl -d "server 192.168.1.100:8090;server 192.168.1.101:8090;" 127.0.0.1:8081/upstream/test.cloudiya.com
    # or: curl -i -X DELETE 127.0.0.1:8081/upstream/test.cloudiya.com
    SendDataToApi(nginx_manage_addr, nginx_manage_port,
                  "/upstream/%s" % service_domain_name, method, upstream_hosts_str)
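
To make the data flow of the script easier to follow, here is a small offline sketch of the transformation it performs on the health API result. The sample payload is trimmed to the fields the script reads, the node names are invented for illustration, and the function names are hypothetical. One deliberate difference from the script: this version prefers the service's own address when it is set and falls back to the node address otherwise.

```python
import json

# Trimmed example of a /v1/health/service/<name>?passing response.
# Node names are made up; only the fields used below are included.
SAMPLE = json.loads("""
[
  {"Node": {"Node": "node-100", "Address": "192.168.1.100"},
   "Service": {"Service": "test-cloudiya-com", "Address": "", "Port": 8090}},
  {"Node": {"Node": "node-101", "Address": "192.168.1.101"},
   "Service": {"Service": "test-cloudiya-com", "Address": "", "Port": 8090}}
]
""")


def to_dyups_body(entries):
    """Build the request body for the dyups interface from health entries."""
    servers = []
    for entry in entries:
        # Prefer the service address when set, otherwise use the node address.
        address = entry["Service"].get("Address") or entry["Node"]["Address"]
        servers.append("server %s:%s;" % (address, entry["Service"]["Port"]))
    return "".join(servers)


def choose_request(entries):
    """Return (method, body): POST the servers, or DELETE when none are healthy."""
    if entries:
        return "POST", to_dyups_body(entries)
    return "DELETE", ""


print(choose_request(SAMPLE))
print(choose_request([]))
```

With the sample above the first call yields a POST whose body lists both servers, and the second yields a DELETE with an empty body, which matches the two branches in the script.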

nginx monitoring page

http://192.168.1.101:8081/ng_status

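Examples of calling the Tengine dyups management interface

When upstream information is written to the dyups interface, parameters such as weight, max_fails and down are supported and take effect, just as if the upstream entries had been added to the nginx configuration file.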

curl -d "server 192.168.1.4:8090;server 192.168.1.5:8090;server 192.168.1.6:8090 weight=2 max_fails=2 fail_timeout=10s;" 127.0.0.1:8081/upstream/prod.cloudiya.com

curl -d "server 192.168.1.100:8090;server 192.168.1.101:8090;" 127.0.0.1:8081/upstream/test.cloudiya.com

curl -d "server 192.168.1.100:8090;server 192.168.1.101:8090 down;" 127.0.0.1:8081/upstream/test.cloudiya.com

# Delete the test.cloudiya.com upstream
curl -i -X DELETE 127.0.0.1:8081/upstream/test.cloudiya.com
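
When the server list is generated by a program rather than typed by hand, the request bodies used in the curl examples above can be assembled with a small helper. This is a sketch with a hypothetical function name; it only builds the string and leaves sending it to the dyups interface to whatever HTTP client is in use.

```python
def server_line(address, port, weight=None, max_fails=None,
                fail_timeout=None, down=False):
    """Build one "server ...;" entry, adding only the parameters that are set."""
    parts = ["server %s:%s" % (address, port)]
    if weight is not None:
        parts.append("weight=%d" % weight)
    if max_fails is not None:
        parts.append("max_fails=%d" % max_fails)
    if fail_timeout is not None:
        parts.append("fail_timeout=%s" % fail_timeout)
    if down:
        parts.append("down")
    return " ".join(parts) + ";"


# Rebuild the body of the first curl example for prod.cloudiya.com.
body = "".join([
    server_line("192.168.1.4", 8090),
    server_line("192.168.1.5", 8090),
    server_line("192.168.1.6", 8090, weight=2, max_fails=2, fail_timeout="10s"),
])
print(body)
```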

References

http://tengine.taobao.org/document/http_dyups.html
https://github.com/openresty/echo-nginx-module
http://dockone.io/article/667 (Service discovery: Zookeeper vs etcd vs Consul)
http://www.jianshu.com/p/59d7173c78a9