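Our data center recently brought a batch of about 100 machines online, and all of them have to be monitored with Nagios. Management asked for monitoring of every host to be in place within two days. Past experience says that cannot be done by hand in two days: the usual approach is to add the monitoring entry for each client to the Nagios configuration files one at a time, which is a huge amount of work. That raised an idea: could a script add the hosts to monitoring in bulk? In other words, operations automation.
When writing a script, the design matters most. After some thought, I settled on the approach described here. First, the network topology diagram: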
[Figure: network topology diagram. Image: http://wiki.jikexueyuan.com/project/python-actual-combat/images/3.jpg]
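The whole process can be divided into three parts.
Of the three, the CMDB side is the most important. The next steps install Django and write the API endpoints that make the CMDB work. The CMDB side is completed in three steps: installing Django, creating the project and app, and configuring Django.
First, install Django:
Python must be installed before Django (version 2.7 is recommended).
1. Download the Django package
The Django package can be downloaded from the official Django website (https://www.djangoproject.com); this walkthrough uses Django 1.5.1.
2. Extract and install the package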
tar -zxvf Django-1.5.1.tar.gz
cd Django-1.5.1
python setup.py install
Create the project and the app:
1. Create a project
django-admin.py startproject simplecmdb
2. Create an app
cd simplecmdb
python manage.py startapp hostinfo
Configure Django:
1. Edit settings.py
DATABASES = {'default': {'ENGINE': 'django.db.backends.sqlite3', 'NAME': path.join('CMDB.db')}}  # database engine and file name (needs "from os import path")
INSTALLED_APPS = (
    # keep the default entries generated by startproject
    'django.contrib.admin',  # admin site, used below
    'hostinfo',  # name of the app
)
2. Edit urls.py
from django.conf.urls import patterns, include, url
from django.contrib import admin
admin.autodiscover()

urlpatterns = patterns('',
    url(r'^api/gethosts\.json$', 'hostinfo.views.gethosts'),  # API endpoint queried by the Nagios side
    url(r'^api/collect$', 'hostinfo.views.collect'),  # API endpoint the clients POST their data to
    url(r'^admin/', include(admin.site.urls)),  # Django admin login URL
)
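As a quick sanity check on the URL patterns, the regular expressions can be tried outside Django. The sketch below assumes that Django matches each pattern against the request path with the leading slash removed; the helper name `matches` and the sample paths are illustrative and not part of the project.

```python
import re

# Patterns in the style used in urls.py (illustrative copies)
PATTERNS = {
    'gethosts': r'^api/gethosts\.json$',
    'collect': r'^api/collect$',
}

def matches(name, path):
    """Return True if the named pattern accepts the request path."""
    # The leading slash is not part of what the pattern sees, so strip it here
    return re.search(PATTERNS[name], path.lstrip('/')) is not None

if __name__ == '__main__':
    print(matches('gethosts', '/api/gethosts.json'))  # exact path
    print(matches('gethosts', '/api/gethost.json'))   # one letter short
    print(matches('collect', '/api/collect'))
```

A one-character difference between a pattern and the URL a client requests is enough to produce a 404, so the paths used by the collection script and the Nagios script need to agree with these patterns exactly.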
3. Edit views.py in the hostinfo app
The code is as follows:
# Create your views here.
from django.shortcuts import render_to_response
from django.http import HttpResponse
from django.views.decorators.csrf import csrf_exempt
from models import Host, HostGroup
# Use the json module, falling back to simplejson on older Pythons
try:
    import json
except ImportError:
    import simplejson as json

# Receives the data sent by the client servers.
# csrf_exempt is needed because the collection script posts without a CSRF
# token (assuming CsrfViewMiddleware is enabled, as in the default settings).
@csrf_exempt
def collect(request):
    req = request
    if req.POST:
        vendor = req.POST.get('Manufacturer')
        sn = req.POST.get('Serial_Number')
        product = req.POST.get('Product_Name')
        cpu_model = req.POST.get('Model_Name')
        cpu_num = req.POST.get('Cpu_Cores')
        cpu_vendor = req.POST.get('Vendor_Id')
        memory_part_number = req.POST.get('Part_Number')
        memory_manufacturer = req.POST.get('Manufacturer')
        memory_size = req.POST.get('Size')
        device_model = req.POST.get('Device_Model')
        device_version = req.POST.get('Firmware_Version')
        device_sn = req.POST.get('Serial_Number')
        device_size = req.POST.get('User_Capacity')
        osver = req.POST.get('os_version')
        hostname = req.POST.get('os_name')
        os_release = req.POST.get('os_release')
        ipaddrs = req.POST.get('Ipaddr')
        # Read but not stored: the Host model has no fields for these
        mac = req.POST.get('Mac')
        link = req.POST.get('Link')
        mask = req.POST.get('Mask')
        device = req.POST.get('Device')
        host = Host()
        host.hostname = hostname
        host.product = product
        host.cpu_num = cpu_num
        host.cpu_model = cpu_model
        host.cpu_vendor = cpu_vendor
        host.memory_part_number = memory_part_number
        host.memory_manufacturer = memory_manufacturer
        host.memory_size = memory_size
        host.device_model = device_model
        host.device_version = device_version
        host.device_sn = device_sn
        host.device_size = device_size
        host.osver = osver
        host.os_release = os_release
        host.vendor = vendor
        host.sn = sn
        host.ipaddr = ipaddrs
        host.save()    # store the data received via POST in the database
        return HttpResponse('OK')    # returned when the insert succeeds
    else:
        return HttpResponse('no post data')

# API provided for Nagios
def gethosts(req):
    d = []
    hostgroups = HostGroup.objects.all()
    for hg in hostgroups:
        ret_hg = {'hostgroup':hg.name,'members':[]}
        members = hg.members.all()
        for h in members:
            ret_h = {'hostname':h.hostname,    # data returned by the API
                     'ipaddr':h.ipaddr
                    }
            ret_hg['members'].append(ret_h)
        d.append(ret_hg)
    ret = {'status':0,'data':d,'message':'ok'}
    return HttpResponse(json.dumps(ret))
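To make the shape of the `gethosts` response concrete, here is a Django-free sketch that builds the same structure from plain dictionaries. The function name `build_payload` and the sample group and host are hypothetical; only the key names (`status`, `data`, `message`, `hostgroup`, `members`, `hostname`, `ipaddr`) come from the view.

```python
import json

def build_payload(groups):
    """groups maps a group name to a list of (hostname, ipaddr) pairs."""
    data = []
    for name in sorted(groups):
        members = [{'hostname': h, 'ipaddr': ip} for h, ip in groups[name]]
        data.append({'hostgroup': name, 'members': members})
    return {'status': 0, 'data': data, 'message': 'ok'}

if __name__ == '__main__':
    # Hypothetical sample data
    sample = {'web': [('web01', '192.0.2.10')]}
    print(json.dumps(build_payload(sample), indent=2, sort_keys=True))
```

The Nagios-side script only needs `status` to decide whether to proceed and the `members` lists to generate host definitions, so any consumer can be tested against a payload built this way.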
4. Edit models.py
The code is as follows:
from django.db import models
# Create your models here.
# The Host table stores the information collected from each client host
class Host(models.Model):
    """store host information"""
    vendor = models.CharField(max_length=30,null=True)
    sn = models.CharField(max_length=30,null=True)
    product = models.CharField(max_length=30,null=True)
    cpu_model = models.CharField(max_length=50,null=True)
    cpu_num = models.CharField(max_length=2,null=True)
    cpu_vendor = models.CharField(max_length=30,null=True)
    memory_part_number = models.CharField(max_length=30,null=True)
    memory_manufacturer = models.CharField(max_length=30,null=True)
    memory_size = models.CharField(max_length=20,null=True)
    device_model = models.CharField(max_length=30,null=True)
    device_version = models.CharField(max_length=30,null=True)
    device_sn = models.CharField(max_length=30,null=True)
    device_size = models.CharField(max_length=30,null=True)
    osver = models.CharField(max_length=30,null=True)
    hostname = models.CharField(max_length=30,null=True)
    os_release = models.CharField(max_length=30,null=True)
    ipaddr = models.IPAddressField(max_length=15)
    def __unicode__(self):
        return self.hostname

# The host group table, used to group hosts
class HostGroup(models.Model):
    name = models.CharField(max_length=30)
    members = models.ManyToManyField(Host)
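The `ManyToManyField` means a host can belong to several groups and a group can hold several hosts, which Django stores in a separate join table. The sketch below reproduces that layout with the standard `sqlite3` module so the relationship can be explored without Django. The table and column names are simplified stand-ins chosen for illustration, not necessarily the names `syncdb` generates.

```python
import sqlite3

def make_db():
    """Create an in-memory database with host, group, and join tables."""
    conn = sqlite3.connect(':memory:')
    conn.executescript("""
        CREATE TABLE host (id INTEGER PRIMARY KEY, hostname TEXT, ipaddr TEXT);
        CREATE TABLE hostgroup (id INTEGER PRIMARY KEY, name TEXT);
        -- join table standing in for the ManyToManyField
        CREATE TABLE hostgroup_members (hostgroup_id INTEGER, host_id INTEGER);
    """)
    return conn

def members_of(conn, group_name):
    """Return (hostname, ipaddr) rows for every host in the named group."""
    sql = """SELECT h.hostname, h.ipaddr
             FROM host h
             JOIN hostgroup_members m ON m.host_id = h.id
             JOIN hostgroup g ON g.id = m.hostgroup_id
             WHERE g.name = ? ORDER BY h.hostname"""
    return conn.execute(sql, (group_name,)).fetchall()

if __name__ == '__main__':
    conn = make_db()
    # Hypothetical sample rows
    conn.execute("INSERT INTO host VALUES (1, 'web01', '192.0.2.10')")
    conn.execute("INSERT INTO hostgroup VALUES (1, 'web')")
    conn.execute("INSERT INTO hostgroup_members VALUES (1, 1)")
    print(members_of(conn, 'web'))
```

The query in `members_of` corresponds roughly to what `hg.members.all()` asks the database for in the `gethosts` view.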
5. Edit admin.py
from models import Host, HostGroup
from django.contrib import admin

# Columns shown for hosts in the Django admin
class HostAdmin(admin.ModelAdmin):
    list_display = ['vendor',
                    'sn',
                    'product',
                    'cpu_model',
                    'cpu_num',
                    'cpu_vendor',
                    'memory_part_number',
                    'memory_manufacturer',
                    'memory_size',
                    'device_model',
                    'device_version',
                    'device_sn',
                    'device_size',
                    'osver',
                    'hostname',
                    'os_release'
                   ]

# Columns shown for host groups in the Django admin
class HostGroupAdmin(admin.ModelAdmin):
    list_display = ['name',]

# Register both classes so their data appears in the Django admin
admin.site.register(HostGroup,HostGroupAdmin)
admin.site.register(Host, HostAdmin)
6. Create the database
python manage.py syncdb    # create the database
7. Start the app
python manage.py runserver 0.0.0.0:8000
8. Test
http://132.96.77.12:8000/admin
[Figure: screenshot of the Django admin. Image: http://wiki.jikexueyuan.com/project/python-actual-combat/images/4.jpg]
[Figure: screenshot of the Django admin. Image: http://wiki.jikexueyuan.com/project/python-actual-combat/images/5.jpg]
As the screenshots show, Django has been configured successfully.
Next, write the scripts that collect host information on the client. They gather the CPU, memory, disk, server model, server serial number, IP address, hostname, and operating system version. There are 7 scripts in total:
1. CPU collection script (cpuinfo.py):
#!/usr/local/src/python/bin/python
#-*- coding:utf-8 -*-
from subprocess import PIPE,Popen
import re

def getCpuInfo():
    # Read the raw contents of /proc/cpuinfo
    p = Popen(['cat','/proc/cpuinfo'],shell=False,stdout=PIPE)
    stdout, stderr = p.communicate()
    return stdout.strip()

def parserCpuInfo(cpudata):
    pd = {}
    model_name = re.compile(r'.*model name\s+:\s(.*)')
    vendor_id = re.compile(r'vendor_id\s+:(.*)')
    cpu_cores = re.compile(r'cpu cores\s+:\s([\d]+)')
    for line in cpudata.split('\n'):
        model = re.match(model_name,line)
        vendor = re.match(vendor_id,line)
        cores = re.match(cpu_cores,line)
        if model:
            pd['Model_Name'] = model.groups()[0].strip()
        if vendor:
            pd['Vendor_Id'] = vendor.groups()[0].strip()
        if cores:
            pd['Cpu_Cores'] = cores.groups()[0]
    # Fall back to 1 only if no "cpu cores" line was found. Setting the
    # default inside the loop would overwrite a real value on later lines.
    pd.setdefault('Cpu_Cores', 1)
    return pd

if __name__ == '__main__':
    cpudata = getCpuInfo()
    print parserCpuInfo(cpudata)
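The parsing step can be tried without a Linux machine by feeding it a captured sample. The sketch below is a standalone variant of the parser rather than the script itself: it applies the same three kinds of regular expression to a made-up `/proc/cpuinfo` excerpt and falls back to one core only when no `cpu cores` line is present. The sample text and the name `parse_cpuinfo` are illustrative.

```python
import re

# Made-up excerpt in the /proc/cpuinfo layout
SAMPLE = """processor\t: 0
vendor_id\t: GenuineIntel
model name\t: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz
cpu cores\t: 6
apicid\t\t: 0
"""

def parse_cpuinfo(text):
    fields = {
        'Model_Name': re.compile(r'model name\s+:\s(.*)'),
        'Vendor_Id': re.compile(r'vendor_id\s+:(.*)'),
        'Cpu_Cores': re.compile(r'cpu cores\s+:\s(\d+)'),
    }
    result = {}
    for line in text.split('\n'):
        for key, pattern in fields.items():
            m = pattern.match(line)
            if m:
                result[key] = m.group(1).strip()
    # Apply the default once, after every line has been examined
    result.setdefault('Cpu_Cores', '1')
    return result

if __name__ == '__main__':
    print(parse_cpuinfo(SAMPLE))
```

Keeping the sample as a string makes it easy to add excerpts from other machine types and confirm the patterns still match before rolling the script out.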
2. Disk collection script (diskinfo.py):
#!/usr/local/src/python/bin/python
#-*- coding:utf-8 -*-
from subprocess import PIPE,Popen
import re

def getDiskInfo():
    disk_dev = re.compile(r'Disk\s/dev/[a-z]{3}')
    disk_name = re.compile(r'/dev/[a-z]{3}')
    p = Popen(['fdisk','-l'],shell=False,stdout=PIPE)
    fdisk_out, stderr = p.communicate()
    smart_out = ''
    for i in fdisk_out.split('\n'):
        disk = re.match(disk_dev,i)
        if disk:
            dk = re.search(disk_name,disk.group()).group()
            n = Popen(['smartctl','-i',dk],shell=False,stdout=PIPE)
            smart_out, stderr = n.communicate()
    # Only the smartctl output for the last disk found is returned
    return smart_out.strip()

def parserDiskInfo(diskdata):
    pd = {}
    device_model = re.compile(r'(Device Model):(\s+.*)')
    serial_number = re.compile(r'(Serial Number):(\s+[\d\w]{1,30})')
    firmware_version = re.compile(r'(Firmware Version):(\s+[\w]{1,20})')
    user_capacity = re.compile(r'(User Capacity):(\s+[\d\w, ]{1,50})')
    for line in diskdata.split('\n'):
        serial = re.search(serial_number,line)
        device = re.search(device_model,line)
        firmware = re.search(firmware_version,line)
        user = re.search(user_capacity,line)
        if device:
            pd['Device_Model'] = device.groups()[1].strip()
        if serial:
            pd['Serial_Number'] = serial.groups()[1].strip()
        if firmware:
            pd['Firmware_Version'] = firmware.groups()[1].strip()
        if user:
            pd['User_Capacity'] = user.groups()[1].strip()
    return pd

if __name__ == '__main__':
    diskdata = getDiskInfo()
    print parserDiskInfo(diskdata)
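`getDiskInfo` overwrites its result on each loop iteration, so on a machine with several disks only the last one is reported. If every disk is needed, a first step is to collect all device names from the `fdisk -l` output and then query each in turn. The sketch below covers only that first step, on a made-up sample; the name `list_disks` is hypothetical, and the pattern deliberately skips device-mapper entries such as `/dev/mapper/...`.

```python
import re

# Made-up excerpt in the style of `fdisk -l` output
SAMPLE = (
    "Disk /dev/sda: 500.1 GB, 500107862016 bytes\n"
    "   Device Boot      Start         End      Blocks   Id  System\n"
    "/dev/sda1   *           1          64      512000   83  Linux\n"
    "Disk /dev/sdb: 1000.2 GB, 1000204886016 bytes\n"
    "Disk /dev/mapper/vg-root: 53.7 GB, 53687091200 bytes\n"
)

# "Disk /dev/<letters>:" at the start of a line
DISK_LINE = re.compile(r'^Disk\s(/dev/[a-z]+):')

def list_disks(fdisk_output):
    """Return the whole-disk device names found in fdisk output, in order."""
    disks = []
    for line in fdisk_output.split('\n'):
        m = DISK_LINE.match(line)
        if m:
            disks.append(m.group(1))
    return disks

if __name__ == '__main__':
    print(list_disks(SAMPLE))
```

Each returned name could then be passed to `smartctl -i` and parsed separately, yielding one dictionary per disk instead of one per host.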
3. Memory collection script (meminfo.py):
#!/usr/local/src/python/bin/python
#-*- coding:utf-8 -*-
from subprocess import PIPE,Popen

def getMemInfo():
    p = Popen(['dmidecode'],shell=False,stdout=PIPE)
    stdout, stderr = p.communicate()
    return stdout.strip()

def parserMemInfo(memdata):
    line_in = False
    mem_str = ''
    for line in memdata.split('\n'):
        # A line consisting only of "Memory Device" starts a new block
        if line.startswith('Memory Device') and line.endswith('Memory Device'):
            line_in = True
            mem_str+='\n'
            continue
        if line.startswith('\t') and line_in:
            mem_str+=line
        else:
            line_in = False
    # Each line of mem_str is now one block, with its fields separated by tabs
    for i in mem_str.split('\n')[1:]:
        pd = {}
        for ln in i.strip().split('\t'):
            if ':' not in ln:
                continue
            k, v = ln.split(':',1)
            pd[k.strip()] = v.strip()
        # Skip empty slots and yield a fresh dict for each installed module
        if pd.get('Size') and pd['Size'] != 'No Module Installed':
            yield {'Size':pd['Size'],
                   'Part_Number':pd.get('Part Number'),
                   'Manufacturer':pd.get('Manufacturer')}

if __name__ == '__main__':
    memdata = getMemInfo()
    for i in parserMemInfo(memdata):
        print i
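The `dmidecode` output is a sequence of blocks: an unindented title line followed by tab-indented `Key: value` lines. A small state machine is one way to walk it. The sketch below is an independent illustration of that technique on a made-up sample, not the script above; the function name and sample values are hypothetical.

```python
# Made-up excerpt in the dmidecode layout (tab-indented fields)
SAMPLE = (
    "Memory Device\n"
    "\tSize: 8192 MB\n"
    "\tManufacturer: Samsung\n"
    "\tPart Number: M393B1K70CH0-CH9\n"
    "\n"
    "Memory Device\n"
    "\tSize: No Module Installed\n"
    "\tManufacturer: Not Specified\n"
    "\tPart Number: Not Specified\n"
)

def iter_memory_devices(text):
    """Yield one dict per populated 'Memory Device' block."""
    block = None                              # None means "not inside a block"
    for line in text.split('\n') + ['']:      # trailing '' flushes the last block
        if line.strip() == 'Memory Device':
            block = {}
        elif block is not None and line.startswith('\t'):
            key, sep, value = line.partition(':')
            if sep:                           # ignore lines without a colon
                block[key.strip()] = value.strip()
        else:
            # Any other line ends the current block
            if block and block.get('Size') != 'No Module Installed':
                yield {'Size': block.get('Size'),
                       'Part_Number': block.get('Part Number'),
                       'Manufacturer': block.get('Manufacturer')}
            block = None

if __name__ == '__main__':
    for device in iter_memory_devices(SAMPLE):
        print(device)
```

Using `partition(':')` rather than an unbounded `split(':')` keeps values that themselves contain a colon intact.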
4. Server information collection script (product.py):
#!/usr/local/src/python/bin/python
# -*- coding:utf-8 -*-
from subprocess import PIPE,Popen

def getDMI():
    p = Popen('dmidecode',shell=True,stdout=PIPE)
    stdout, stderr = p.communicate()
    return stdout

def parserDMI(dmidata):
    pd = {}
    line_in = False
    for line in dmidata.split('\n'):
        # Only the "System Information" block is of interest
        if line.startswith('System Information'):
            line_in = True
            continue
        if line.startswith('\t') and line_in:
            if ':' in line:
                k, v = [i.strip() for i in line.split(':',1)]
                pd[k] = v
        else:
            line_in = False
    # Rename the keys to the form expected by the CMDB
    return {'Manufacturer':pd['Manufacturer'],
            'Serial_Number':pd['Serial Number'],
            'Product_Name':pd['Product Name']}

if __name__ == '__main__':
    dmidata = getDMI()
    postdata = parserDMI(dmidata)
    print postdata
5. Host information collection script (hostinfo.py):
#!/usr/local/src/python/bin/python
#-*- coding:utf-8 -*-
import platform

def getHostInfo():
    pd ={}
    version = platform.dist()
    os_name = platform.node()
    os_release = platform.release()
    os_version = '%s %s' % (version[0],version[1])
    pd['os_name'] = os_name
    pd['os_release'] = os_release
    pd['os_version'] = os_version
    return pd

if __name__ == '__main__':
    print getHostInfo()
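`platform.dist()` works on the Python 2.7 interpreter this article targets, but it was removed in Python 3.8. Should the script ever need to run on a newer interpreter, one alternative is to read `/etc/os-release`. The sketch below parses that file's `KEY=value` format from a string; the function names and sample content are illustrative, and only simple quoting is handled.

```python
# Made-up content in the /etc/os-release format
SAMPLE = 'NAME="CentOS Linux"\nVERSION_ID="7"\nID=centos\n# comment\n'

def parse_os_release(text):
    """Parse KEY=value lines, ignoring blanks and comments."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#') or '=' not in line:
            continue
        key, _, value = line.partition('=')
        # Drop surrounding single or double quotes, if any
        info[key] = value.strip().strip('"\'')
    return info

def os_version(text):
    """Build a 'distribution version' string like the one the script reports."""
    info = parse_os_release(text)
    return '%s %s' % (info.get('ID', ''), info.get('VERSION_ID', ''))

if __name__ == '__main__':
    print(os_version(SAMPLE))
```

On a real host the text would come from reading `/etc/os-release`, which is an assumption about the target distributions, since older releases may not ship that file.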
6. IP address collection script (ipaddress.py):
#!/usr/local/src/python/bin/python
#-*- coding:utf-8 -*-
from subprocess import PIPE,Popen
import re

def getIpaddr():
    p = Popen(['ifconfig'],shell=False,stdout=PIPE)
    stdout, stderr = p.communicate()
    return stdout.strip()

def parserIpaddr(ipdata):
    device = re.compile(r'(eth\d)')
    ipaddr = re.compile(r'(inet addr:[\d.]{7,15})')
    mac = re.compile(r'(HWaddr\s[0-9A-Fa-f:]{17})')
    link = re.compile(r'(Link encap:[\w]{3,14})')
    mask = re.compile(r'(Mask:[\d.]{9,15})')
    # ifconfig separates interfaces with a blank line
    for lines in ipdata.split('\n\n'):
        pd = {}
        eth_device = re.search(device,lines)
        inet_ip = re.search(ipaddr,lines)
        hw = re.search(mac,lines)
        link_encap = re.search(link,lines)
        _mask = re.search(mask,lines)
        # Only eth* interfaces are reported; fields that are missing stay None
        if eth_device:
            pd['Device'] = eth_device.groups()[0]
            pd['Ipaddr'] = inet_ip.groups()[0].split(':')[1] if inet_ip else None
            pd['Mac'] = hw.groups()[0].split()[1] if hw else None
            pd['Link'] = link_encap.groups()[0].split(':')[1] if link_encap else None
            pd['Mask'] = _mask.groups()[0].split(':')[1] if _mask else None
            yield pd

if __name__ == '__main__':
    ipdata = getIpaddr()
    for i in parserIpaddr(ipdata):
        print i
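The patterns in this script target the older net-tools `ifconfig` layout (`inet addr:`, `HWaddr`, `Mask:`); newer `ifconfig` builds print `inet ... netmask ...` instead and would not match. The sketch below runs equivalent patterns against a made-up sample in the older layout, so the parsing can be checked before deployment. The names `FIELDS` and `parse_ifconfig` and the sample text are illustrative.

```python
import re

# Made-up sample in the older net-tools ifconfig layout
SAMPLE = (
    "eth0      Link encap:Ethernet  HWaddr 00:16:3E:12:34:56\n"
    "          inet addr:192.0.2.10  Bcast:192.0.2.255  Mask:255.255.255.0\n"
    "\n"
    "lo        Link encap:Local Loopback\n"
    "          inet addr:127.0.0.1  Mask:255.0.0.0\n"
)

FIELDS = {
    'Device': re.compile(r'^(eth\d+)', re.M),
    'Ipaddr': re.compile(r'inet addr:([\d.]{7,15})'),
    'Mac': re.compile(r'HWaddr\s([0-9A-Fa-f:]{17})'),
    'Link': re.compile(r'Link encap:(\w{3,14})'),
    'Mask': re.compile(r'Mask:([\d.]{7,15})'),
}

def parse_ifconfig(text):
    """Yield one dict per eth* interface; missing fields are None."""
    for block in text.split('\n\n'):
        found = {}
        for key, pattern in FIELDS.items():
            m = pattern.search(block)
            found[key] = m.group(1) if m else None
        if found['Device']:          # skip lo and other non-eth interfaces
            yield found

if __name__ == '__main__':
    for iface in parse_ifconfig(SAMPLE):
        print(iface)
```

Capturing only the value inside each group avoids the extra `split(':')` step on the matched text.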
7. Merge all of this information and send it to the CMDB through the API
#!/usr/local/src/python/bin/python
import urllib, urllib2
from cpuinfo import *
from diskinfo import *
from meminfo import *
from product import *
from hostinfo import *
from ipaddress import *

def getHostTotal():
    ld = []
    cpuinfo = parserCpuInfo(getCpuInfo())
    diskinfo = parserDiskInfo(getDiskInfo())
    meminfo = {}
    for i in parserMemInfo(getMemInfo()):
        meminfo = i    # keeps the last memory module reported
    productinfo = parserDMI(getDMI())
    hostinfo = getHostInfo()
    ip = {}
    for i in parserIpaddr(getIpaddr()):
        ip = i    # keeps the last interface reported
    # Collect every (key, value) pair from all sources, in this order
    for info in (cpuinfo, diskinfo, meminfo, productinfo, hostinfo, ip):
        ld.extend(info.iteritems())
    return ld

def parserHostTotal(hostdata):
    # Turn the list of pairs into one dict; a repeated key keeps its last value
    pg = {}
    for i in hostdata:
        pg[i[0]] = i[1]
    return pg

def urlPost(postdata):
    data = urllib.urlencode(postdata)
    req = urllib2.Request('http://132.96.77.12:8000/api/collect',data)
    response = urllib2.urlopen(req)
    return response.read()

if __name__ == '__main__':
    hostdata = getHostTotal()
    postdata = parserHostTotal(hostdata)
    print urlPost(postdata)
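One thing worth checking in the merge step is what happens when two scripts report the same key. Both the disk script and the server script emit `Serial_Number`, and both the memory script and the server script emit `Manufacturer`, so the later source overwrites the earlier one. The sketch below shows the overwrite and one possible way around it, prefixing each key with its source. The helper names and sample values are hypothetical, and adopting prefixed keys would also require matching changes in the `collect` view.

```python
try:
    from urllib.parse import urlencode, parse_qs   # Python 3
except ImportError:
    from urllib import urlencode                    # Python 2
    from urlparse import parse_qs

def merge(*sources):
    """Merge dicts left to right; later sources win on duplicate keys."""
    merged = {}
    for src in sources:
        merged.update(src)
    return merged

def merge_prefixed(**sources):
    """Merge dicts, prefixing each key with its source name to avoid clashes."""
    merged = {}
    for prefix, src in sources.items():
        for key, value in src.items():
            merged['%s_%s' % (prefix, key)] = value
    return merged

if __name__ == '__main__':
    disk = {'Serial_Number': 'DISK123'}       # hypothetical values
    product = {'Serial_Number': 'SRV456'}
    print(merge(disk, product))               # the disk serial is lost
    prefixed = merge_prefixed(disk=disk, product=product)
    print(urlencode(sorted(prefixed.items())))
```

With the plain merge, the CMDB would record the server serial number in both the server and disk fields, which is easy to miss until the data is inspected in the admin.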
At this point the CMDB can write the host information from every client into the database, and the data can be fetched directly through the API endpoint provided for the Nagios side:
http://132.96.77.12:8000/api/gethosts.json
[Figure: screenshot of the JSON returned by the API. Image: http://wiki.jikexueyuan.com/project/python-actual-combat/images/6.jpg]
As the screenshot shows, the API returns the data successfully.
Next, the Nagios side calls the API, formats the data, and writes it to a file.
1. Nagios script
#!/opt/data/py/bin/python
# -*- coding:utf-8 -*-
import urllib, urllib2
import json
import os
import shutil

CURR_DIR = os.path.abspath(os.path.dirname(__file__))
HOST_CONF_DIR = os.path.join(CURR_DIR,'hosts')
# Template for one Nagios host definition
HOST_TMP = """define host {
    use             linux-server
    host_name       %(hostname)s
    check_command   check-host-alive
    alias           %(hostname)s
    address         %(ipaddr)s
    contact_groups  admins
}
"""

def getHosts():
    # Fetch the host list from the CMDB API
    url = 'http://132.96.77.12:8000/api/gethosts.json'
    return json.loads(urllib2.urlopen(url).read())

def initDir():
    if not os.path.exists(HOST_CONF_DIR):
        os.mkdir(HOST_CONF_DIR)

def writeFile(f,s):
    with open(f,'w') as fd:
        fd.write(s)

def genNagiosHost(hostdata):
    # Render one host definition per group member and write hosts.cfg
    initDir()
    conf = os.path.join(HOST_CONF_DIR,'hosts.cfg')
    hostconf = ""
    for hg in hostdata:
        for h in hg['members']:
            hostconf+=HOST_TMP %h
    writeFile(conf,hostconf)
    return "ok"

def main():
    result = getHosts()
    if result['status'] == 0:
        print genNagiosHost(result['data'])
    else:
        print 'Error: %s' % result['message']
    # Copy the generated file into the Nagios objects directory
    if os.path.exists(os.path.join(HOST_CONF_DIR,'hosts.cfg')):
        os.chdir(HOST_CONF_DIR)
        shutil.copyfile('hosts.cfg','/etc/nagios/objects/hosts.cfg')

if __name__ == "__main__":
    main()
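Because a host may belong to more than one host group, the same host can appear several times in the API data, and the generator would then write a `define host` block for each appearance, which the Nagios configuration check is likely to flag as a duplicate. The sketch below renders the same template but emits each hostname once. The function name `render_hosts` and the sample data are hypothetical.

```python
# Same host template as in the Nagios script
HOST_TMP = """define host {
    use             linux-server
    host_name       %(hostname)s
    check_command   check-host-alive
    alias           %(hostname)s
    address         %(ipaddr)s
    contact_groups  admins
}
"""

def render_hosts(hostdata):
    """Render one 'define host' block per unique hostname."""
    seen = set()
    blocks = []
    for group in hostdata:
        for host in group['members']:
            if host['hostname'] in seen:
                continue   # already emitted via another group
            seen.add(host['hostname'])
            blocks.append(HOST_TMP % host)
    return ''.join(blocks)

if __name__ == '__main__':
    # Hypothetical API data: web01 is a member of two groups
    data = [
        {'hostgroup': 'web', 'members': [{'hostname': 'web01', 'ipaddr': '192.0.2.10'}]},
        {'hostgroup': 'all', 'members': [{'hostname': 'web01', 'ipaddr': '192.0.2.10'}]},
    ]
    print(render_hosts(data))
```

Separating rendering from file writing also makes the output easy to inspect or test before anything is copied into the Nagios objects directory.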
The Nagios host configuration file has now been generated and copied to hosts.cfg under the nagios/objects directory. Next, check the Nagios configuration for problems; if there are none, the Nagios service can be started.
[root@yetcomm-v2 bin]# ./nagios -v /etc/nagios/nagios.cfg
The check reported no errors or warnings, so the Nagios service can now be restarted:
[root@yetcomm-v2 bin]# service nagios restart
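Since the goal is automation, the check-then-restart sequence can be scripted as well. The sketch below restarts Nagios only when the configuration check succeeds. It assumes the check command exits with a non-zero status on errors; the binary path in `VERIFY_CMD` and the function name are assumptions, and the `run` parameter exists so the logic can be exercised without Nagios installed.

```python
import subprocess

# The binary path is an assumption; adjust it to the local installation
VERIFY_CMD = ['/usr/bin/nagios', '-v', '/etc/nagios/nagios.cfg']
RESTART_CMD = ['service', 'nagios', 'restart']

def verify_then_restart(verify_cmd=VERIFY_CMD, restart_cmd=RESTART_CMD,
                        run=subprocess.call):
    """Restart Nagios only if the configuration check exits with status 0."""
    if run(verify_cmd) != 0:
        return False   # leave the running service untouched
    return run(restart_cmd) == 0

if __name__ == '__main__':
    # Dry run with a stand-in runner that just prints each command
    def dry_run(cmd):
        print(' '.join(cmd))
        return 0
    print(verify_then_restart(run=dry_run))
```

Calling this at the end of the generation script would keep a bad hosts.cfg from taking down monitoring that is already running.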
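Finally, the Nagios monitoring page can be viewed in a browser:
[Figure: screenshot of the Nagios monitoring page. Image: http://wiki.jikexueyuan.com/project/python-actual-combat/images/7.jpg]
As the screenshot shows, one host has been added to monitoring. Because the real target is a production environment, only a test server could be used for this demonstration; the code is exactly the same in the test and production environments.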