Stephen E. Arnold: CyberTap (Engineering Centric) Kicks the Crap Out of “Advanced” Search (Sales Centric)

Stephen E. Arnold


As I was working through updates to the search vendor profiles on, I ran across a reference to Cybertap LLC. The company’s name rang a bell. I recalled an interview one of the goslings conducted with its founder in 2012. The point that caught my attention last week was a reference to a US patent document (8,406,141 B1) that seemed to explain some of the capabilities of “enhanced search.” The title of the patent is “Network Search Methods and Systems.”

If you are not familiar with patent documents, these are available without charge from the US Patent and Trademark Office at The syntax required for the antiquated system is tricky. Please check the USPTO site for the explanation of how the system processes queries.

The abstract for the invention filed a number of years ago states:

Methods, systems, devices and computer program code products for enabling searches of digital communications network traffic to identify information transmitted by, received by, or exchanged with a given human or non-human entity, include, or include elements for, translating Pcap [network traffic packet capture] files or streams of IP [Internet protocol] network packets obtained from the network into a scalable form suitable for query by search engine functionality, thereby to enable scalable, text-based search of network information contained in the Pcap files, and providing scalable search engine functionality to enable a user to execute text-based searches on textual or human relationship-identifying information derived from the Pcap files or streams of IP network packets, thereby to identify information transmitted by, received by, or exchanged with the given human or non-human entity, wherein the scalable search engine functionality is capable of scaling to search massive quantities of Pcap file or IP network packet data.
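For readers who have not looked inside a Pcap file, here is a minimal sketch of the raw material the abstract refers to. It assumes the classic little-endian pcap layout (a 24-byte file header followed by 16-byte record headers) and reads an in-memory sample rather than real traffic. The function names read_pcap and build_sample_pcap are my own illustrations, not anything taken from the patent or from Cybertap's software.

```python
import io
import struct

# Classic pcap layout, little-endian variant (an assumption for this sketch).
PCAP_MAGIC = 0xA1B2C3D4
GLOBAL_HEADER = struct.Struct("<IHHiIII")  # magic, version major/minor, zone, sigfigs, snaplen, linktype
RECORD_HEADER = struct.Struct("<IIII")     # ts_sec, ts_usec, captured length, original length


def read_pcap(stream):
    """Yield (timestamp_seconds, raw_bytes) for each record in the stream."""
    header = stream.read(GLOBAL_HEADER.size)
    if len(header) < GLOBAL_HEADER.size:
        raise ValueError("file too short to be a pcap capture")
    if GLOBAL_HEADER.unpack(header)[0] != PCAP_MAGIC:
        raise ValueError("not a little-endian classic pcap file")
    while True:
        raw = stream.read(RECORD_HEADER.size)
        if len(raw) < RECORD_HEADER.size:
            break  # end of capture
        ts_sec, ts_usec, incl_len, _orig_len = RECORD_HEADER.unpack(raw)
        yield ts_sec + ts_usec / 1_000_000, stream.read(incl_len)


def build_sample_pcap(payloads):
    """Hypothetical helper: write a tiny capture to memory so the reader has input."""
    out = io.BytesIO()
    out.write(GLOBAL_HEADER.pack(PCAP_MAGIC, 2, 4, 0, 0, 65535, 1))
    for i, payload in enumerate(payloads):
        # Arbitrary timestamps, one second apart.
        out.write(RECORD_HEADER.pack(1_393_286_400 + i, 0, len(payload), len(payload)))
        out.write(payload)
    out.seek(0)
    return out


records = list(read_pcap(build_sample_pcap([b"hello alice", b"hello bob"])))
```

Each record is just a timestamp and a run of bytes. The work the abstract describes lies in turning those bytes into text and relationship data that a search engine can index.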

The invention uses Lucene/Solr, an open source information retrieval system, to let the user find needed information. The content processing system, as I understand it, does not need humans to manipulate the content acquired or pushed into the system. The invention provides software that takes packet data and converts it to XML. Once the data are in a structured form, the system supports a query by user name. The built-in metadata tagging system adds rich indexing drawn from both the packet stream and the contents of the information passed on the network. A query, therefore, can return email and email attachments. Because the system uses XML, other processes can be run across the indexes; for example, relationship maps can be generated.
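The flow in the paragraph above (structured XML, a query by user name, then a relationship map) can be sketched in a few lines. This is a toy under stated assumptions: the records are invented, the field names user, peer, body, and attachment are hypothetical, and the lookup runs in memory instead of against a Solr index. The XML uses the add/doc/field shape of Solr's XML update messages; the schema in the patent may differ.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample records standing in for decoded network messages.
messages = [
    {"id": "1", "user": "alice", "peer": "bob",
     "body": "quarterly report attached", "attachment": "report.pdf"},
    {"id": "2", "user": "bob", "peer": "carol", "body": "lunch on friday"},
    {"id": "3", "user": "alice", "peer": "carol", "body": "see the report"},
]


def to_add_xml(records):
    """Serialize records as <add><doc><field name="...">...</field></doc></add>."""
    add = ET.Element("add")
    for record in records:
        doc = ET.SubElement(add, "doc")
        for name, value in record.items():
            field = ET.SubElement(doc, "field", {"name": name})
            field.text = value
    return ET.tostring(add, encoding="unicode")


def doc_fields(doc):
    """Collapse one <doc> element into a plain dict of field name to text."""
    return {f.get("name"): f.text for f in doc.iter("field")}


def find_by_user(xml_text, user):
    """Return every document whose hypothetical 'user' field matches exactly."""
    root = ET.fromstring(xml_text)
    return [d for d in map(doc_fields, root.iter("doc")) if d.get("user") == user]


def relationship_counts(xml_text):
    """Count messages per (user, peer) pair: the edge list of a relationship map."""
    root = ET.fromstring(xml_text)
    edges = Counter()
    for fields in map(doc_fields, root.iter("doc")):
        edges[(fields["user"], fields["peer"])] += 1
    return edges


xml_text = to_add_xml(messages)
alice_hits = find_by_user(xml_text, "alice")
edges = relationship_counts(xml_text)
```

In this sketch the query for alice returns both of her messages, including the one carrying an attachment field, and the edge counts are the kind of input a relationship map would draw from. A production system would hand the XML to the search engine and let it do the matching.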


The illustration (Figure 5 in US 8,406,141 B1) below identifies some of the features that the search system can make available to a user:

The low resolution of the image in the patent document may reflect a need to obscure details of the sample record.

The invention is the work of Dr. Russell L. Courturier, Patrick V. Johnstone, and John H. Ricketson. Dr. Courturier told Search Wizards Speak:
Recon has exceptional comprehensiveness and power from its indexing method. Therefore, the ease and speed at which it can extract exactly what the analyst is looking for is excellent. Other products may be able to find strings of characters embedded in small amounts of captured network files. Recon indexes everything contained within the captured network traffic. Recon processes content, embedded files, attachments, attributes, network protocol data, metadata, and entities. And our system indexes every bit. Using a standard search engine interface, analysts can quickly search all of the captured network traffic to find precisely what they are looking for. Once found, Recon presents the information as it was originally seen so analysts can follow conversations and threads in context.

Three observations:

First, Lucene/Solr delivers useful functionality. Involving a third-party version of Lucene/Solr may not be necessary. Second, the Cybertap system offers functionality that goes beyond that of the enterprise search vendors offering search within a content management system or a customer relationship management system. Third, the analytics on which some vendors build their differentiation argument appear to be part of the Cybertap framework.

My point is that the search vendors who market the heck out of advanced capabilities may not be able to match the functionality of lower-profile, truly engineering-centric organizations. The Cybertap Web site offers minimal information, but you can visit the public face of the company at

Stephen E Arnold, February 25, 2014