To replace graphite-web (Django) with something lighter.
That's very good news!
TIL:
#
#
#
export CFLAGS="-I$HOME/include -I$HOME/include/python2.7"
export CPPFLAGS="-I$HOME/include -I$HOME/include/python2.7"
export PKG_CONFIG_PATH="$HOME/lib/pkgconfig"
export LDFLAGS="-L$HOME/lib -L$HOME/lib/python2.7"
export LD_LIBRARY_PATH="$HOME/lib:$HOME/lib/python2.7:$HOME/lib/python2.7/site-packages/cairo"
export PYTHONPATH="$HOME/lib/python:$HOME/lib/python2.7/site-packages:$HOME/.local/bin/usr/local/lib/python2.7/site-packages"
export PATH="$HOME/.local/bin:$PATH"
cd $HOME/graphite-install
wget http://zlib.net/zlib-1.2.8.tar.gz
wget ftp://ftp.simplesystems.org/pub/libpng/png/src/libpng-1.6.2.tar.gz
wget http://www.sqlite.org/2013/sqlite-autoconf-3071700.tar.gz
tar zxf zlib-1.2.8.tar.gz
tar zxf libpng-1.6.2.tar.gz
tar zxf sqlite-autoconf-3071700.tar.gz
cd zlib-1.2.8
./configure --prefix=$HOME
make
make install
cd ../libpng-1.6.2
./configure --prefix=$HOME
make
make install
cd ../sqlite-autoconf-3071700
./configure --prefix=$HOME
make
make install
cd $HOME/graphite-install
wget http://www.python.org/ftp/python/2.7.5/Python-2.7.5.tgz
tar zxf Python-2.7.5.tgz
cd Python-2.7.5
./configure --enable-shared --prefix=$HOME
make
make install
cd $HOME/graphite-install
wget http://cairographics.org/releases/pixman-0.26.2.tar.gz
wget ftp://sourceware.org/pub/libffi/libffi-3.0.11.tar.gz
wget http://ftp.gnome.org/pub/GNOME/sources/glib/2.31/glib-2.31.22.tar.xz
wget http://cairographics.org/releases/cairo-1.12.2.tar.xz
wget http://cairographics.org/releases/py2cairo-1.10.0.tar.bz2
tar xzf pixman-0.26.2.tar.gz
tar xzf libffi-3.0.11.tar.gz
unxz glib-2.31.22.tar.xz
unxz cairo-1.12.2.tar.xz
tar xjf py2cairo-1.10.0.tar.bz2
cd libffi-3.0.11
./configure --prefix=$HOME
make
make install
cd ../glib-2.31.22
./configure --prefix=$HOME
make
make install
cd ../pixman-0.26.2
./configure --prefix=$HOME
make
make install
cd ../cairo-1.12.2
./configure --prefix=$HOME
make
make install
cd ../py2cairo-1.10.0
~/bin/python ./waf configure --prefix=$HOME
~/bin/python ./waf build
~/bin/python ./waf install
~/bin/python -c 'import cairo; print cairo.version'
cd $HOME/graphite-install
wget https://django-tagging.googlecode.com/files/django-tagging-0.3.1.tar.gz
wget https://www.djangoproject.com/m/releases/1.5/Django-1.5.1.tar.gz
wget https://pypi.python.org/packages/source/z/zope.interface/zope.interface-4.0.5.zip#md5=caf26025ae1b02da124a58340e423dfe
wget http://twistedmatrix.com/Releases/Twisted/11.1/Twisted-11.1.0.tar.bz2
unzip zope.interface-4.0.5.zip
tar zxf django-tagging-0.3.1.tar.gz
tar zxf Django-1.5.1.tar.gz
tar jxf Twisted-11.1.0.tar.bz2
cd zope.interface-4.0.5
~/bin/python setup.py install --user
cd ../Django-1.5.1
~/bin/python setup.py install --user
cd ../django-tagging-0.3.1
~/bin/python setup.py install --user
cd ../Twisted-11.1.0
~/bin/python setup.py install --user
cd $HOME/graphite-install
wget https://launchpad.net/graphite/0.9/0.9.10/+download/graphite-web-0.9.10.tar.gz
wget https://launchpad.net/graphite/0.9/0.9.10/+download/carbon-0.9.10.tar.gz
wget https://launchpad.net/graphite/0.9/0.9.10/+download/whisper-0.9.10.tar.gz
tar -zxvf graphite-web-0.9.10.tar.gz
tar -zxvf carbon-0.9.10.tar.gz
tar -zxvf whisper-0.9.10.tar.gz
cd whisper-0.9.10
~/bin/python setup.py install --home=$HOME/graphite
cd ../carbon-0.9.10
~/bin/python setup.py install --home=$HOME/graphite
cd ../graphite-web-0.9.10
~/bin/python check-dependencies.py
~/bin/python setup.py install --home=$HOME/graphite
cd $HOME/graphite/conf
cp carbon.conf.example carbon.conf
cp storage-schemas.conf.example storage-schemas.conf
vim storage-schemas.conf
cd $HOME/graphite/lib/python/graphite
cp local_settings.py.example local_settings.py
vim local_settings.py
~/bin/python manage.py syncdb
cd $HOME/graphite
~/bin/python carbon-cache.py status
~/bin/python carbon-cache.py start
screen
~/bin/python manage.py runserver
curl localhost:8000
cd $HOME/graphite/examples
~/bin/python example-client.py
curl -X POST "http://localhost/events/" -d '{"what": "Web Service", "tags": "production deploy", "data":"version 1.1.7"}'
We send a random value every 10 seconds and watch the result on a Grafana dashboard that auto-refreshes every 10 seconds ;-)
for i in {1..60}; do line=$(echo "toto $(( ( RANDOM % 10 ) + 1 )) $(date +%s)"); echo $line; echo $line | nc -q0 localhost 2003; sleep 10; done
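The loop above speaks Carbon's plaintext protocol: one `metric-path value unix-timestamp` line per datapoint, over TCP port 2003. A minimal Python sketch of the same idea (host/port assumed to be a local carbon-cache):

```python
import socket
import time

def format_metric(name, value, timestamp=None):
    # Carbon plaintext protocol: one "path value unix_timestamp\n" line per point
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (name, value, timestamp)

def send_metric(name, value, host="localhost", port=2003):
    # Equivalent of `echo "$line" | nc -q0 localhost 2003`
    with socket.create_connection((host, port)) as sock:
        sock.sendall(format_metric(name, value).encode("ascii"))
```

`format_metric("toto", 5)` builds exactly the line the shell loop assembles with `date +%s`.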
The basics of metrology, with the different metric types: gauge, rate, histogram... and the role of statsd (buffering...).
cd /opt/graphite/storage/whisper
find . -name "*.wsp" | xargs -I {} sh -c "echo -n '{} ' ; whisper-info.py {}|grep aggre|cut -d' ' -f2" > aggreg.dump
82024 average
2784 max
1378 min
8732 sum
59682 average
So yes, all the collectd whispers use average.
Golang implementation of Graphite/Carbon server with classic architecture: Agent -> Cache -> Persister
used by Doo 1M metric/sec (12 CPU)
Scaling graphite
via arnaudb
[07:27] < torkelo> matejz: I have managed to get about 140~ bytes per measurement (ES)
[07:27] < matejz> and was thinking of using it for metrics as well
[07:27] < torkelo> which is 12x the size requirement of Graphite (12 bytes per measurement)
[10:16] < torkelo> agree, if you store more than 100 000 metrics/s I think ES is not a good option. But for short term performance logging the new metric features for percentile and moving average, etc are looking very good
A GET on this: /api/datasources/proxy/1/metrics/find/?query=collectd.*
There are several alternatives to carbon, probably the best known being InfluxDB. This one is based on Cassandra!
https://github.com/kairosdb/kairos-carbon
via arnaudb
Another interesting discussion on graphite performance. The takeaway: the bottleneck can sit at the CPU or at the disk (RAM is generally not a problem, even if, of course, it should be watched).
To know carbon-cache's CPU usage, the daemon sends a metric to carbon.agents.graphite-x.cpuUsage.
To know disk usage, use iostat -dmx 1 2 (thanks arnaud).
If disk utilization is too high (between 50 and 75%), relieve it by lowering the max updates per second in the carbon config.
This will grow the cache and thus make the CPU work harder.
Conversely, if the CPU is loaded but the disk is idle, raise the max updates per second.
By finding the right balance you can squeeze the most out of the available hardware.
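The knob in question is MAX_UPDATES_PER_SECOND in the [cache] section of carbon.conf; the value below is illustrative, to be tuned against iostat and the cpuUsage metric, not a recommendation:

```ini
[cache]
# Lower this when iostat shows the disk saturated: carbon buffers more
# points per whisper file and coalesces them into fewer, larger writes
# (the cache grows, the CPU works harder).
# Raise it when the CPU is loaded but the disk is idle.
MAX_UPDATES_PER_SECOND = 500

# Safety valve so a slow disk cannot make the cache eat all the RAM
MAX_CACHE_SIZE = inf
```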
A screenshot of a dashboard done right.
Worth reading for application monitoring.
A graphite front end with a Cassandra backend (no less).
via arnaudb
#
#
#source /lib/lsb/init-functions
. /lib/lsb/init-functions
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
GRAPHITE_HOME=/opt/graphite
NAME=carbon-cache
DESC=carbon-cache
#Carbon has its own logging facility, by default in /opt/graphite/storage/log/carbon-cache-*
DAEMON=$GRAPHITE_HOME/bin/carbon-cache.py
PIDFILE=/opt/graphite/storage/carbon-cache-a.pid
SCRIPTNAME=/etc/init.d/$NAME
if [ ! -x "$DAEMON" ]; then
echo "Couldn't find $DAEMON or not executable"
exit 99
fi
[ -f /etc/default/rcS ] && . /etc/default/rcS
#
#
do_start()
{
# 0 if daemon has been started
# 1 if daemon was already running
# 2 if daemon could not be started
# Test to see if the daemon is already running - return 1 if it is.
start-stop-daemon --start --pidfile $PIDFILE \
--exec $DAEMON --test -- start > /dev/null || return 1
# Start the daemon for real, return 2 if failed
start-stop-daemon --start --pidfile $PIDFILE \
--exec $DAEMON -- start > /dev/null || return 2
}
#
#
do_stop() {
# 0 if daemon has been stopped
# 1 if daemon was already stopped
# 2 if daemon could not be stopped
# other if a failure occurred
log_daemon_msg "Stopping $DESC" "$NAME"
start-stop-daemon --stop --signal 2 --retry 5 --quiet --pidfile $PIDFILE
RETVAL="$?"
[ "$RETVAL" = 2 ] && return 2
# Delete the existing PID file
if [ -e "$PIDFILE" ]; then
rm $PIDFILE
fi
return "$RETVAL"
}
case "$1" in
start)
[ "$VERBOSE" != no ] && log_daemon_msg "Starting $DESC" "$NAME"
do_start
case "$?" in
0|1) [ "$VERBOSE" != no ] && log_end_msg 0 ;;
2) [ "$VERBOSE" != no ] && log_end_msg 1 ;;
esac
;;
stop)
[ "$VERBOSE" != no ] && log_daemon_msg "Stopping $DESC" "$NAME"
do_stop
case "$?" in
0|1) [ "$VERBOSE" != no ] && log_end_msg 0 ;;
2) [ "$VERBOSE" != no ] && log_end_msg 1 ;;
esac
;;
restart)
log_daemon_msg "Restarting $DESC" "$NAME"
do_stop
case "$?" in
0|1)
do_start
case "$?" in
0) log_end_msg 0 ;;
1) log_end_msg 1 ;; # Old process is still running
*) log_end_msg 1 ;; # Failed to start
esac
;;
*)
log_end_msg 1
;;
esac
;;
status)
if [ -s $PIDFILE ]; then
pid=$(cat $PIDFILE)
kill -0 $pid >/dev/null 2>&1
if [ "$?" = "0" ]; then
echo "$NAME is running: pid $pid."
RETVAL=0
else
echo "Couldn't find pid $pid for $NAME."
RETVAL=1
fi
else
echo "$NAME is stopped (no pid file)."
RETVAL=1
fi
;;
*)
echo "Usage: $SCRIPTNAME {start | stop | restart | status}" >&2
exit 3
;;
esac
To look at when back from vacation.
via arnaudb
apply function > Special > draw non zero as infinite
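In the render API this menu entry corresponds to the drawAsInfinite() function: every non-zero datapoint is drawn as a vertical line spanning the whole graph, which is handy for overlaying events such as deploys. A sketch of a render URL (metric names hypothetical):

```
/render?target=drawAsInfinite(deploys.web.count)&target=servers.web1.load
```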
Helps in understanding quite a few things about graphite/statsd.
A handy suite of tools for testing/checking/debugging your whisper files, in particular whisper-dump and whisper-fetch.
def archive_to_bytes(archive):
    def to_seconds(s):
        SECONDS_IN_A = {
            's': 1,
            'm': 1 * 60,
            'h': 1 * 60 * 60,
            'd': 1 * 60 * 60 * 24,
            'y': 1 * 60 * 60 * 24 * 365,
        }
        return int(s[:-1]) * SECONDS_IN_A[s[-1]]

    archive = [map(to_seconds, point.split(':'))
               for point in archive.split(',')]
    SIZE_METADATA = 2 * 4 + 4 + 4  # 16 [!2LfL]
    SIZE_ARCHIVE_INFO = 3 * 4  # 12 [!3L] per archive
    SIZE_POINT = 4 + 8  # 12 [!Ld] per point
    size = 0
    for resolution, retention in archive:
        size += SIZE_ARCHIVE_INFO + SIZE_POINT * retention / resolution
    if size:
        size += SIZE_METADATA
    return size

if __name__ == '__main__':
    import argparse
    parser = argparse.ArgumentParser(
        description="Calculates the size of the whisper storage for the given \
archive (in resolution:retention format, e.g. 1m:24h,5m:3m)"
    )
    parser.add_argument(
        'archive',
        help="Archive in storage-schemas.conf format (resolution:retention)"
    )
    args = parser.parse_args()
    print "{} >> {} bytes".format(args.archive, archive_to_bytes(args.archive))
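As a sanity check, the same size formula in a self-contained (Python 3) function: a single 1m:24h archive holds 86400/60 = 1440 points, hence 16 + 12 + 1440 * 12 = 17308 bytes.

```python
def whisper_size(archives):
    """Size in bytes of a whisper file for [(resolution_s, retention_s), ...],
    using the same constants as the calculator above."""
    SIZE_METADATA = 16      # !2LfL header
    SIZE_ARCHIVE_INFO = 12  # !3L per archive
    SIZE_POINT = 12         # !Ld per datapoint
    size = sum(SIZE_ARCHIVE_INFO + SIZE_POINT * (retention // resolution)
               for resolution, retention in archives)
    return size + SIZE_METADATA if size else 0

print(whisper_size([(60, 86400)]))  # 1m:24h -> 17308
```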
import os
import mmap
import struct
import signal
import optparse

try:
  import whisper
except ImportError:
  raise SystemExit('[ERROR] Please make sure whisper is installed properly')

# Ignore SIGPIPE so the output can be piped through head/less
signal.signal(signal.SIGPIPE, signal.SIG_DFL)

option_parser = optparse.OptionParser(usage='''%prog path''')
(options, args) = option_parser.parse_args()

if len(args) != 1:
  option_parser.error("require one input file name")
else:
  path = args[0]

def mmap_file(filename):
  fd = os.open(filename, os.O_RDONLY)
  map = mmap.mmap(fd, os.fstat(fd).st_size, prot=mmap.PROT_READ)
  os.close(fd)
  return map

def read_header(map):
  try:
    (aggregationType, maxRetention, xFilesFactor, archiveCount) = struct.unpack(whisper.metadataFormat, map[:whisper.metadataSize])
  except:
    raise whisper.CorruptWhisperFile("Unable to unpack header")

  archives = []
  archiveOffset = whisper.metadataSize

  for i in xrange(archiveCount):
    try:
      (offset, secondsPerPoint, points) = struct.unpack(whisper.archiveInfoFormat, map[archiveOffset:archiveOffset+whisper.archiveInfoSize])
    except:
      raise whisper.CorruptWhisperFile("Unable to read archive %d metadata" % i)

    archiveInfo = {
      'offset' : offset,
      'secondsPerPoint' : secondsPerPoint,
      'points' : points,
      'retention' : secondsPerPoint * points,
      'size' : points * whisper.pointSize,
    }
    archives.append(archiveInfo)
    archiveOffset += whisper.archiveInfoSize

  header = {
    'aggregationMethod' : whisper.aggregationTypeToMethod.get(aggregationType, 'average'),
    'maxRetention' : maxRetention,
    'xFilesFactor' : xFilesFactor,
    'archives' : archives,
  }
  return header

def dump_header(header):
  print 'Meta data:'
  print '  aggregation method: %s' % header['aggregationMethod']
  print '  max retention: %d' % header['maxRetention']
  print '  xFilesFactor: %g' % header['xFilesFactor']
  print
  dump_archive_headers(header['archives'])

def dump_archive_headers(archives):
  for i, archive in enumerate(archives):
    print 'Archive %d info:' % i
    print '  offset: %d' % archive['offset']
    print '  seconds per point: %d' % archive['secondsPerPoint']
    print '  points: %d' % archive['points']
    print '  retention: %d' % archive['retention']
    print '  size: %d' % archive['size']
    print

def dump_archives(archives):
  for i, archive in enumerate(archives):
    print 'Archive %d data:' % i
    offset = archive['offset']
    for point in xrange(archive['points']):
      (timestamp, value) = struct.unpack(whisper.pointFormat, map[offset:offset+whisper.pointSize])
      print '%d: %d, %10.35g' % (point, timestamp, value)
      offset += whisper.pointSize
    print

if not os.path.exists(path):
  raise SystemExit('[ERROR] File "%s" does not exist!' % path)

map = mmap_file(path)
header = read_header(map)
dump_header(header)
dump_archives(header['archives'])
In order to save graphs under 'My Graphs' and 'User Graphs' you simply need to log into Graphite, and the Save/Delete buttons will then appear at the top of the Composer window. To log in you need to either create an account for yourself in the Django database or set up LDAP authentication. LDAP authentication is set up through local_settings.py; check out the example file's comments for details on doing that (or just ask if you run into any issues). However, if you're not using LDAP, it's pretty easy to create a local user in Django's database using the Django admin interface.
When you first installed Graphite there was a step where you had to initialize the database by running 'manage.py syncdb'. That prompted you to create an admin user; if you did that already, you're good to go. Otherwise you need to create an admin user by running 'manage.py createsuperuser'.
Once your admin user is set up you can go to the Django admin interface by visiting http://my-graphite-server/admin/ (note: the trailing slash is required!) and logging in as your admin user. From there you need to create a new user account by clicking 'Add' under the 'Users' section. After that, go back to the main Graphite page. You may need to log out if it thinks you're still the admin user. Then log back in as your new user and you should be able to save graphs! Also, once logged in you have the ability to set profile options. Currently there is only one option, the "advanced UI features" option. Enabling that puts a '*' element in every folder that has more than one element, making it easier to use wildcards when you build graphs.
Hope that helps.
dash auto via collectd dashgen
via arnaudb
Noting this down for later: munin's default RRD retention, expressed in carbon format:
[munin_schema]
pattern = ^munin.
retentions = 5m:2d,30m:10d,2h:45d,1d:1y
Over the last 2 days, one measurement every 5 minutes is kept.
Over the last 10 days, one measurement every 30 minutes.
etc.
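Plugged into whisper's size formula (16 bytes of metadata, 12 bytes per archive header, 12 bytes per datapoint), that schema costs about 23 KB per metric:

```python
# (resolution, retention) in seconds for 5m:2d,30m:10d,2h:45d,1d:1y
archives = [(300, 2 * 86400), (1800, 10 * 86400),
            (7200, 45 * 86400), (86400, 365 * 86400)]
points = sum(ret // res for res, ret in archives)  # 576 + 480 + 540 + 365 = 1961
size = 16 + 12 * len(archives) + 12 * points
print(points, size)  # 1961 23596
```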
/opt/graphite/conf/storage-schemas.conf
http://grey-boundary.com/the-architecture-of-clustering-graphite/
http://adminberlin.de/graphite-scale-out/
http://graphite.readthedocs.org/en/latest/carbon-daemons.html
http://blog.xebia.fr/2013/05/29/graphite-les-bases/
http://fr.slideshare.net/AnatolijDobrosynets/graphite-cluster-setup-blueprint
via arnaudb
If you have lots of metric names that change (new servers etc) in a defined pattern it is irritating to constantly have to create new dashboards.
With scripted dashboards you can dynamically create your dashboards using JavaScript. In the grafana install folder, under app/dashboards/, there is a file named scripted.js. This file contains an example of a scripted dashboard. You can access it by using the URL:
http://grafana_url/#/dashboard/script/scripted.js?rows=3&name=myName
If you open scripted.js you can see how it reads URL parameters from the ARGS variable and then adds rows and panels.
With this it's possible to make graph templates.
Why use statsd on top of graphite?
To scale? Because statsd receives over UDP, waits a while, then sends to graphite.
But Carbon seems to be able to receive directly over UDP too...
Simply to spread the load? To be investigated...
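Part of the answer: statsd absorbs a flood of cheap, fire-and-forget UDP datagrams and flushes one aggregated value per metric per interval to carbon, so carbon sees a single write where the application sent thousands of increments. The wire format is trivial; a sketch, assuming the standard statsd port 8125:

```python
import socket

def statsd_send(metric, host="localhost", port=8125):
    """Fire-and-forget statsd datagram, e.g. 'hits:1|c' (counter),
    'load:0.7|g' (gauge), 'req:320|ms' (timer)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(metric.encode("ascii"), (host, port))
    sock.close()

# statsd sums these and flushes one value per flush interval to carbon
statsd_send("web.hits:1|c")
```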
An article from the Flickr tech blog on metrology (which inspired the writing of statsd).
The statsd/graphite combo looks really powerful.
The statsd GitHub repo.
A presentation from M6Tech about monitoring (graphite inside).
A Vagrant VM to quickly try out Graphite.
Yet another graphite front end.
via arnaudb
Another front end, graphiti.
via arnaudb
A front end for graphite, nice!
via Skunnyk
A python script that runs the munin plugins then sends the data to a carbon. Example of a metric produced:
servers.localhost.system.users.tty
[prefix].[hostname].[group].[service].[curve]
To test quickly on a munin node:
git clone https://github.com/jforman/munin-graphite
vim +151 m2g-poller.py: replace logging.debug with logging.info
Then run: ./m2g-poller.py --carbon localhost:6969
Oh, and if you don't have a carbon server, you can "fake" one: while true; do nc -l -p 6969; done;
On top of that it uses this: https://graphite.readthedocs.org/en/latest/feeding-carbon.html#the-pickle-protocol
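The pickle protocol batches many datapoints per connection: a 4-byte big-endian length header followed by a pickled list of (path, (timestamp, value)) tuples, sent to carbon's pickle port (2004 by default). A sketch of building such a payload (metric name hypothetical):

```python
import pickle
import struct
import time

def pickle_payload(metrics):
    """metrics: list of (path, (timestamp, value)) tuples, as expected
    by carbon's pickle receiver (default port 2004)."""
    body = pickle.dumps(metrics, protocol=2)
    header = struct.pack("!L", len(body))  # 4-byte big-endian length prefix
    return header + body

now = int(time.time())
payload = pickle_payload([("servers.localhost.load", (now, 0.42))])
# send with: socket.create_connection(("localhost", 2004)).sendall(payload)
```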
A collectd plugin to send data to graphite.
via arnaudb
statsd + graphite
But statsd is written in nodejs...
Feedback from experience with graphite.
Better than munin: dynamic rendering, more options.