Motivation
- current diagnostic tools are limited to one particular protocol, e.g. traceroute
- need for comprehensive view of the system's behavior
- complex systems: e.g. Wikipedia has different sites, web caches, DNS round-robin, load balancers, web servers, database servers (and memcached)
- tracing across different administrative domains needed
- integrated tracing framework
- network protocols modified to propagate X-Trace metadata
- works inter-layer
- works inter-Administrative Domains
- decouples the client of the application from the recipient of tracing data (Design principle 3); the report destination is part of the X-Trace metadata
- trace initiated by inserting X-Trace metadata by user application or network operator
- trace identified by task identifier
- X-Trace data is sent to a report server (can be the client application or a delegated server)
- X-Trace constructs the task tree offline along two axes: one across "layers" (an event causes another event in a lower layer) and one across "time" (an event causes another in the same layer); each node in the task tree has an ID, and children link to their parents
- Design principle 1: trace requests are sent in-band
- Design principle 2: trace data are sent out-of-band
- ASCII report format
- report library, report collection through e.g. Postgres
- visualization of task tree
- API for applications has pushNext() and pushDown() to propagate X-Trace metadata across the two axes; a device reports information accessible at its own layer and can include additional information like load
- gradual deployment: for legacy clients, devices in the network can add X-Trace metadata
- retrofitting X-Trace into existing applications faces difficulties: changes to various protocols (IP options, TCP, HTTP headers, SQL), partial deployment impairs the ability to trace parts of the network, lost trace reports can be interpreted as false positives
- certain request topologies cannot be captured, e.g. requests that spread through the network and rendezvous at a node
- unique() function returning identifier for task tree not specified in paper
- low performance overhead
- Web request and recursive DNS queries
- Web hosting site (LAMP), user could initiate traces through a JavaScript/PHP library
- overlay network
- Tunnels, ISP connectivity
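
To make the two propagation axes and the offline tree construction from the notes above concrete, here is a minimal sketch. It is not the X-Trace implementation: only the names pushNext() and pushDown() come from the paper, while the Metadata fields, the Tracer class, the in-memory report list (standing in for the out-of-band report server) and the sequential operation IDs are assumptions made for illustration.

```python
import itertools
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Metadata:
    """Hypothetical in-band metadata: task ID plus a link to the parent operation."""
    task_id: str
    op_id: int
    parent_id: Optional[int] = None
    edge: Optional[str] = None  # "next" (same layer) or "down" (lower layer)


class Tracer:
    """Illustrative stand-in for instrumented layers plus a report server."""

    def __init__(self):
        self._ids = itertools.count(1)  # assumed ID scheme, not from the paper
        self.reports = []               # reports collected out-of-band

    def start(self, task_id, layer):
        """Begin a trace by creating the root metadata for a task."""
        md = Metadata(task_id, next(self._ids))
        self.reports.append((md, layer))
        return md

    def push_next(self, md, layer):
        """Propagate along the "time" axis: next operation in the same layer."""
        return self._child(md, "next", layer)

    def push_down(self, md, layer):
        """Propagate along the "layer" axis: operation caused in a lower layer."""
        return self._child(md, "down", layer)

    def _child(self, md, edge, layer):
        # The child keeps the task ID and records its parent's operation ID.
        child = Metadata(md.task_id, next(self._ids), md.op_id, edge)
        self.reports.append((child, layer))
        return child

    def build_tree(self, task_id):
        """Offline reconstruction: group reported operations by parent ID."""
        children = {}
        for md, layer in self.reports:
            if md.task_id == task_id:
                children.setdefault(md.parent_id, []).append(
                    (md.op_id, md.edge, layer)
                )
        return children


tracer = Tracer()
http = tracer.start("task-42", "HTTP client")
tcp = tracer.push_down(http, "TCP")           # HTTP request causes a TCP operation
proxy = tracer.push_next(http, "HTTP proxy")  # same-layer hop to the next HTTP node
tree = tracer.build_tree("task-42")
for parent, kids in tree.items():
    print(parent, kids)
```

Because every report carries the task ID and its parent's ID, the tree can be rebuilt from reports arriving in any order; a lost report shows up as a missing node rather than breaking the rest of the reconstruction.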
The only thing I would add to this framework is the ability to send encrypted X-Trace data.