!!! Introduction

The idea is to monitor running computer systems seamlessly, collect historical information about them, and make that information easy to analyze and graph.

!!! Design
!! Java Unix Agent
An API that makes Unix (and other) commands accessible in a more homogeneous way.
The source code is available [here|http://spacepirates.com/ro/jua/].


The internal API is used to build a connected agent that collects information.
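As an illustration of what "homogeneous" access could mean, here is a minimal sketch in which every command returns the same result shape (exit code plus output lines). The names {{CommandRunner}} and {{Result}} are hypothetical and are not taken from the JUA source.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical uniform wrapper around an external command; not the actual JUA API. */
final class CommandRunner {

    /** Same result shape for every command: exit code plus output lines. */
    static final class Result {
        final int exitCode;
        final List<String> lines;

        Result(int exitCode, List<String> lines) {
            this.exitCode = exitCode;
            this.lines = Collections.unmodifiableList(lines);
        }
    }

    /** Runs argv directly (no shell), merging stderr into stdout. */
    static Result run(String... argv) {
        ProcessBuilder builder = new ProcessBuilder(Arrays.asList(argv));
        builder.redirectErrorStream(true);
        try {
            Process process = builder.start();
            List<String> lines = new ArrayList<>();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
            }
            return new Result(process.waitFor(), lines);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for command", e);
        }
    }
}
```

A caller could then write {{CommandRunner.run("df", "-k")}} or {{CommandRunner.run("ps")}} and handle both results the same way, leaving command-specific parsing to a later stage.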

!! Recording/Logging
JUA produces JSON data that can be stored for future retrieval.
This could be implemented in a NoSQL database, either document-oriented or BigTable-style.
It should be easy to extract a subset of the data and export it, so that it can be loaded into another instance and analyzed there.
Compression is important if we want to be able to keep old logs.
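To give a feel for the compression point, the sketch below gzips a JSON string with the standard {{java.util.zip}} classes and reads it back. The class name {{LogCompression}} is hypothetical, and gzip is only one possible choice; many databases also compress data themselves.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper: gzip round trip for recorded JSON text. */
final class LogCompression {

    /** Compresses UTF-8 encoded JSON text into a gzip byte array. */
    static byte[] compress(String json) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the gzip stream flushes the trailer into the buffer.
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(json.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Restores the original JSON text from a gzip byte array. */
    static String decompress(byte[] compressed) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Monitoring records tend to repeat the same keys and host names, so batches of records usually compress much better than single small records.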

!! Reports/Viewserver
The view server runs reports, produces summaries from the recorded information, and caches them so that they can be retrieved quickly.
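The caching behaviour could be sketched as follows, assuming summaries are plain strings identified by a report key. {{ReportCache}} and the report-running function are hypothetical; a real view server would also need an eviction or refresh policy as new data is recorded.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical summary cache: each report runs once, then is served from memory. */
final class ReportCache {
    private final Map<String, String> summaries = new ConcurrentHashMap<>();
    private final Function<String, String> reportRunner;

    /** reportRunner is whatever computes a summary for a given report key. */
    ReportCache(Function<String, String> reportRunner) {
        this.reportRunner = reportRunner;
    }

    /** Returns the cached summary, running the report only on the first request. */
    String summary(String reportKey) {
        return summaries.computeIfAbsent(reportKey, reportRunner);
    }

    /** Drops a cached summary, for example after new data has been recorded. */
    void invalidate(String reportKey) {
        summaries.remove(reportKey);
    }
}
```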

!! Client/Visualization
The client needs to display the information, show the views, and allow drilling down into low-level information in the database (individual OS-specific data, log file contents, etc.).

! Navigation
[Eksplane|http://spacepirates.com/ro/experiments/eksplane/] is a simplistic Django prototype that excluded reports and recording and focused on real-time navigation and on the links between different aspects of a live Unix system (processes, files, sockets, ports, etc.). The idea was to render the output of Unix commands as HTML enriched with hyperlinks specific to the type of information in that output (e.g. a PID in the output is clickable and offers a choice of pstack, lsof, arguments, etc.).
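The hyperlink enrichment can be illustrated with a small sketch that turns the leading PID of a {{ps}} output line into a link. This is not Eksplane's code (which was a Django prototype); the class {{PidLinker}} and the {{/process/<pid>}} URL scheme are assumptions made for the example, as is the PID being the first column.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical renderer: links the PID in the first column of a ps line. */
final class PidLinker {
    // Leading whitespace, digits, then either nothing or whitespace plus the rest.
    private static final Pattern LEADING_PID =
            Pattern.compile("^(\\s*)(\\d+)(|\\s.*)$");

    /** Minimal escaping so command arguments cannot inject markup. */
    static String escapeHtml(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Returns the line as HTML, with the PID (if any) wrapped in a link. */
    static String renderLine(String line) {
        Matcher m = LEADING_PID.matcher(line);
        if (!m.matches()) {
            return escapeHtml(line); // header or non-process line: no link
        }
        String pid = m.group(2);
        return m.group(1)
                + "<a href=\"/process/" + pid + "\">" + pid + "</a>"
                + escapeHtml(m.group(3));
    }
}
```

The page behind such a link could then offer the per-process actions mentioned above (pstack, lsof, arguments, etc.).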

!!! Background information
!! Parsing text

Parsing text is key, since most of the important commands output text and most log files are text-based.
Some tools output HTML, so parsing HTML (probably malformed HTML) is important too. [TagSoup|http://home.ccil.org/~cowan/XML/tagsoup/] and [Jericho|http://jericho.htmlparser.net/docs/index.html] may be useful.
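For the plain-text case, much command output is a header line followed by whitespace-separated columns. The sketch below turns such output into one map per row, keyed by the header names; {{ColumnParser}} is a hypothetical name. It assumes that only the last column may contain spaces, which does not hold for every command or option set.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical parser for header-plus-columns command output. */
final class ColumnParser {

    /** First line is the header; every other non-blank line becomes a row map. */
    static List<Map<String, String>> parse(List<String> lines) {
        List<Map<String, String>> rows = new ArrayList<>();
        if (lines.isEmpty()) {
            return rows;
        }
        String[] header = lines.get(0).trim().split("\\s+");
        for (String line : lines.subList(1, lines.size())) {
            if (line.trim().isEmpty()) {
                continue;
            }
            // The limit keeps everything after the last split in the final
            // column, so a command with arguments stays in one cell.
            String[] cells = line.trim().split("\\s+", header.length);
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 0; i < cells.length; i++) {
                row.put(header[i], cells[i]);
            }
            rows.add(row);
        }
        return rows;
    }
}
```

Rows in this form map naturally onto JSON objects, which fits the recording format described earlier.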

!! Unix commands

There are many Unix commands and other tools that are good sources of information for this project:
* ps: list processes, arguments, resource usage, etc.
* pstack, gstack, procstack, dbx/gdb + where: show the call stack(s) of a given process
* pfiles, procfiles, lsof: show files used by one or more processes
* pmap, procmap, jmap: memory usage of a given process
* netstat, ifconfig
* df: show disk utilisation of partitions, etc.
* iostat, iotop, etc.
* sp_sysmon: show Sybase activity summary
* uname
* prtdiag
* vmstat, free: system memory usage
* lpr
* dmesg, cat /var/log/messages