Tens of thousands of nodes means millions of lines of syslog data per day. Making sense of data at this scale is a distributed computing problem on its own.

  • Is the system healthy?
  • Is a failure imminent?

Baler imports text data using a set of configurable input parsers. The output of these parsers is a set of tokens and token types (e.g. IP Address). These tokens are then indexed and stored in the OGC Structured Object Store. Strings of tokens are recognized as patterns based on their token types, and these patterns are stored along with the messages. The Baler software is very similar in function to Splunk, but it is designed for higher performance and scalability.
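The idea of typed tokens and patterns can be illustrated with a minimal Python sketch. This is not Baler's API or parser configuration. The token type names, the regular expressions, and the functions tokenize, pattern_of, and group_by_pattern are all hypothetical, and keeping literal words in the pattern while replacing variable tokens with their type is an assumption made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical token types; a real parser set would be configurable.
TOKEN_TYPES = [
    ("IP_ADDRESS", re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")),
    ("NUMBER", re.compile(r"^\d+$")),
]


def tokenize(line):
    """Split a line on whitespace and tag each token with a type."""
    tokens = []
    for text in line.split():
        for name, regex in TOKEN_TYPES:
            if regex.match(text):
                tokens.append((text, name))
                break
        else:
            # No typed parser matched, so treat the token as a plain word.
            tokens.append((text, "WORD"))
    return tokens


def pattern_of(tokens):
    """Build a pattern string: words stay literal, typed tokens become <TYPE>."""
    return " ".join(
        text if ttype == "WORD" else f"<{ttype}>" for text, ttype in tokens
    )


def group_by_pattern(lines):
    """Store each message under the pattern it matches."""
    groups = defaultdict(list)
    for line in lines:
        groups[pattern_of(tokenize(line))].append(line)
    return dict(groups)


if __name__ == "__main__":
    messages = [
        "connection from 10.0.0.1 port 22",
        "connection from 10.0.0.7 port 2222",
        "link down on port 3",
    ]
    for pattern, members in group_by_pattern(messages).items():
        print(len(members), pattern)
```

In this sketch the first two messages collapse into a single pattern, which is what lets millions of log lines be summarized as a much smaller set of patterns.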

HPC Machine Data Mining