12/08/2018, 15:31

Tracing your apps with OpenTracing API

One of the most important things to monitor on any application is the timeline. Without it it is impossible to detect which parts might cause delays and which services or libraries require performance improvements. And most troubles start to appear with micro-service architectures — when your ...

One of the most important things to monitor on any application is the timeline. Without it it is impossible to detect which parts might cause delays and which services or libraries require performance improvements. And most troubles start to appear with micro-service architectures — when your application grows to numerous separated services, it might be very hard to collect all performance data in one place.

In late 2015 OpenTracing API specification was developed to help with such issues.

The concept of OpenTracing is to create a common vendor-neutral standard for tracing application execution and connections.

Basics

In order to implement your tracing system, you need to choose one of available tracers and set up instrumentation on your services.

I am going to describe the process on the example of Zipkin as a tracer, Node.js frontend application and PHP backend with connections to Redis and abstract database.

Setup Zipkin

The best way is to use Zipkin docker image. It is very simple to run

docker run -d -p 9411:9411 openzipkin/zipkin

The application itself is now available at http://localhost:9411 and API endpoint is http://localhost:9411/api. It includes basic UI to view your traces.

Zipkin User Interface

There is also advanced experimental UI available at Github repository

By default, Zipkin uses temporary storage, which will be emptied when you turn off the server. In order to save the data permanently you need to set up one of available storages (MySQL, Cassandra and ElasticSearch are supported).

Check out Docker Zipkin repository for advanced configurations.

Tracing concept

Tracing includes 2 basic terms: a trace and a span.

Trace is the begin-to-end operation, which involves all the application workflow. Span is a single operation which you need to measure.

Spans may have parent-child or following relationships.

Causal relationships between Spans in a single Trace


        [Span A]  ←←←(the root span)
            |
     +------+------+
     |             |
 [Span B]      [Span C] ←←←(Span C is a `ChildOf` Span A)
     |             |
 [Span D]      +---+-------+
               |           |
           [Span E]    [Span F] >>> [Span G] >>> [Span H]
                                       ↑
                                       ↑
                                       ↑
                         (Span G `FollowsFrom` Span F)
Temporal relationships between Spans in a single Trace


––|–––––––|–––––––|–––––––|–––––––|–––––––|–––––––|–––––––|–> time

 [Span A···················································]
   [Span B··············································]
      [Span D··········································]
    [Span C········································]
         [Span E·······]        [Span F··] [Span G··] [Span H··]

Every trace includes Endpoint information, which, in general, consists of service name, IP address and port.

Zipkin accepts trace and span information at any time — you don't need to pass related data in one request.

Every trace has an unique 16-char or 32-char ID. All spans within the trace MUST include the same trace ID. Your services may be trace ID providing (clients), consuming (servers) or both. Providing service generates the ID and sends it with HTTP header to all connected services. Consuming services parse HTTP headers and include existing trace ID into requests to Zipkin.

Every span MUST include it's name (e. g. GET /login), endpoint information and start/end timestamps. Zipkin uses microseconds, so great precision can be achieved.

Usage in your Node.js code

There are 2 popular instrumentations available for Node applications:

  • Zipkin JS
  • OpenTracing JavaScript

Both support Zipkin as a tracer and we are going to use Zipkin JS.

Add the library to your dependencies

$ npm install --save zipkin-js

Instantiate tracer and recorder instances

const { BatchRecorder } = require('zipkin')
const { HttpLogger } = require('zipkin-transport-http')

// Send spans to Zipkin asynchronously over HTTP
const zipkinBaseUrl = 'http://localhost:9411'
const recorder = new BatchRecorder({
  logger: new HttpLogger({
    endpoint: `${zipkinBaseUrl}/api/v1/spans`
  })
});

module.exports = recorder;
// tracer.js
const CLSContext = require('zipkin-context-cls')
const { Tracer, Annotation, InetAddress, Request } = require('zipkin')

const recorder = require('./recorder')

const ctxImpl = new CLSContext('zipkin')
const tracer = new Tracer({ ctxImpl, recorder })

module.exports = tracer

Making client requests to other servers

Now the tracer is ready for use. The tracer has quite simple API, which allows to wrap any executed code into a span.

// app.js
const axios = require('axios')
const tracer = require('./tracer')
const { Annotation, InetAddress, Request } = require('zipkin')

const serviceName = 'server1'
const host = new InetAddress('127.0.0.1')
const port = 4001

// ... Further in your code

tracer.letChildId((traceId) => {
    // Now this code will be executed inside child-span scope.
    // Every span requires special annotations to store

    // First write client information (our current service)
    tracer.recordServiceName(serviceName)
    tracer.recordAnnotation(new Annotation.LocalAddr({
        host,
        port,
    }))

    // Annotation.ServerAddr stores information about the service we are requesting
    // This might be HTTP, Redis, SQL or any other server you are connecting to
    tracer.recordAnnotation(new Annotation.ServerAddr({
        serviceName: 'server2',
        host: new InetAddress('127.0.0.1'),
        port: 4002,
    }))

    // Rpc data stores operation name
    // For HTTP requests might be just method or readable operation name (e. g. get_posts)
    tracer.recordRpc('get')

    // Binary data may include any additional request information
    tracer.recordBinary('http.url', 'http://localhost:4002')

    // Inform tracer that current app is a client
    tracer.recordAnnotation(new Annotation.ClientSend());

    // Add trace header for server-side app to consume trace data
    const zipkinOpts = Request.addZipkinHeaders({}, traceId);

    axios.get('http://localhost:4002', zipkinOpts).then((response) => {
        tracer.scoped(() => {
            // After promise has been resolved, write response data to tracer 
            // with the ID we received before
            tracer.setId(traceId);
            tracer.recordBinary('http.status_code', response.status.toString());

            // Inform tracer that we have received request as a client
            tracer.recordAnnotation(new Annotation.ClientRecv());
        });
        resolve(response.data)
    })
});

Of course, you can simplify the code by making wrapper functions to any code executed.

Storing server-side spans

If your current service acts as a server (or both server and client), you may need to store server-side span info. This will help you to trace connection delays. If the span has been sent from both client and server, you will see it in a tree view.

[-] serverA                     [100.000 ms: get_posts========================]
    [-] serverB                       [90.000 ms: get_posts===============]
                                ^- request started                            ^- response received
                                      ^- request received                 ^- response sent

In previous example we provided special Zipkin headers into request. On our server app we can parse them:

X-B3-TraceId: Main trace ID
X-B3-SpanId: Current operation span ID
X-B3-ParentSpanId: Parent span ID if present

Now you only need to enable middleware, which will pass data to recorder.

// app.js
const app = require('express')()
const tracer = require('./tracer')
const expressMiddleware = require('zipkin-instrumentation-express').expressMiddleware

const serviceName = 'server1'
const host = new InetAddress('127.0.0.1')
const port = 4001

app.use(expressMiddleware({
    tracer,
    serviceName,
    port,
}))

app.listen(port, host.addr)

That's all you need. Of course, you can take a look at middleware source code and other instrumentations for more details.

Usage in PHP code

There is not much difference with PHP. The most popular library is whitemerry/phpkin.

$ composer require whitemerry/phpkin

As PHP code is not async, you need to instantiate tracer instance as early as possible.

// app/tracer.php
$endpoint = new whitemerryphpkinEndpoint(
    'server2', // Application name
    '127.0.0.1', // Current application IP address
    '4002' // Current application port (default 80)
);

$logger = new whitemerryphpkinLoggerSimpleHttpLogger([
     'host' => 'http://localhost:9411' // Zipkin's API host with schema (http://) and without trailing slash
]);

$traceSpanId = null;
if (!empty($_SERVER['HTTP_X_B3_SPANID'])) {
    $traceSpanId = new whitemerryphpkinIdentifierSpanIdentifier($_SERVER['HTTP_X_B3_SPANID']);
}

$isSampled = null;
if (!empty($_SERVER['HTTP_X_B3_SAMPLED'])) {
    $isSampled = (bool) $_SERVER['HTTP_X_B3_SAMPLED'];
}

$tracer = new whitemerryphpkinTracer(
    'GET',
    $endpoint,
    $logger,
    $isSampled,
    $traceId,
    $traceSpanId
);

$tracer->setProfile(whitemerryphpkinTracer::BACKEND);

return $tracer;

In this code I use Silex as an example app.

// public/index.php
require __DIR__.'/../vendor/autoload.php';
$tracer = require(__DIR__.'../app/tracer.php';

$app = require __DIR__.'/../app/app.php';
$app['tracer'] = $tracer; // Put the tracer into DI container for later use.

$app->finish(function () use ($tracer) {
    // After request has been sent, we can store the data to Zipkin.
    $tracer->trace();
});

$app->run();

This is all you need for your service to work as a server app.

SimpleHttpLogger is using file_get_contents function, which might cause issues in your code. Here is a logger implementation using Guzzle:

In case you want to trace further requests to client services, use the tracer instance previously created.

// app/app.php

use SymfonyComponentHttpFoundationJsonResponse;

$app = new SilexApplication();
$app['debug'] = true;

$app->get('/', function () use ($app) {
    $start = zipkin_timestamp();

    // DO SOME QUERY HERE

    $meta = new whitemerryphpkinMetadata();

    $meta->set('some_key', 'some_value');

    $span = new whitemerryphpkinSpan(
        new whitemerryphpkinIdentifierSpanIdentifier(),
        'connect',
        new whitemerryphpkinAnnotationBlock(
            new whitemerryphpkinEndpoint(
                'database',
                '127.0.0.1',
                5000
            ),
            $start,
            zipkin_timestamp(),
            whitemerryphpkinAnnotationBlock::CLIENT
        ),
        null,
        whitemerryphpkinTracerInfo::getTraceId()
    );

    $app['tracer']->addSpan($span);

    return new JsonResponse([
        'status' => 'ok',
    ]);
});

return $app;

Other instrumentations

Currently there is Zipkin or Opentracing instrumentation available for most popular languages.

Official Zipkin list includes Go, Java, JavaScript, Ruby and Scala, and numerous third-party libraries.

0