IndexTables for Spark

A new open table format for fast retrieval and full-text search across large datasets


Key Features

Open Source

IndexTables is in active development and is licensed under the Apache License, Version 2.0.

No Infrastructure

IndexTables runs directly within Spark executors, providing distributed execution with horizontal scaling and first-class support for the DataFrame and SQL APIs.

Fast Execution

IndexTables combines a hybrid row and columnar storage format with advanced indexing to deliver fast keyword searches, aggregations, and filtered retrieval across massive datasets.

Cost Effective

IndexTables is fully compatible with Amazon S3 and Azure Blob Storage, so indexed data can be kept in low-cost object storage.


Use Cases

IndexTables is well suited to searching application logs, security logs, and audit trails. As log volumes grow, fast search and retrieval becomes critical to letting your company respond quickly to what its data shows.

Log Analysis

With its fast full-text search, IndexTables lets users find every event that contains a specific keyword or phrase, such as ‘ERROR’.

IndexTables also supports SQL aggregations, which can be used to visualize trends over time.
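
As a rough illustration of the two operations described above (a keyword lookup followed by a time-bucketed aggregate), the Python sketch below builds a tiny in-memory inverted index over a few sample log lines. It is conceptual only: the sample records and the names `tokenize`, `build_index`, `search`, and `matches_per_hour` are hypothetical and do not reflect the IndexTables API or its storage format.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample log events: (ISO-8601 timestamp, message).
LOGS = [
    ("2024-05-01T10:02:11", "ERROR payment service timeout"),
    ("2024-05-01T10:15:40", "INFO user login succeeded"),
    ("2024-05-01T11:03:05", "ERROR database connection refused"),
    ("2024-05-01T11:47:59", "WARN retrying request after error"),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(logs):
    """Map each term to the set of row ids whose message contains it."""
    index = defaultdict(set)
    for row_id, (_, message) in enumerate(logs):
        for term in tokenize(message):
            index[term].add(row_id)
    return index


def search(index, keyword):
    """Return the sorted row ids that contain the keyword (case-insensitive)."""
    return sorted(index.get(keyword.lower(), set()))


def matches_per_hour(logs, row_ids):
    """Count the matching events in each hour bucket (YYYY-MM-DDTHH)."""
    return Counter(logs[i][0][:13] for i in row_ids)


INDEX = build_index(LOGS)
ERROR_ROWS = search(INDEX, "ERROR")
ERRORS_BY_HOUR = matches_per_hour(LOGS, ERROR_ROWS)
```

Because the lookup is case-insensitive in this sketch, the WARN line that mentions "error" is also returned. Whether a real index behaves that way depends on how the field is tokenized, which is not specified here.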

Security Investigation and SIEM

To protect sensitive data, companies need to respond quickly to potential cyber threats. IndexTables lets security analysts search all of their security logs for anomalous activity and identify bad actors sooner.

Customer Support Management

With full-text search, you can quickly find support tickets using natural language. Many organizations rely on tags, but applying the correct tag depends on the judgment of each individual. By searching through all of the notes in every ticket, you can find all open tickets where a ‘refund’ was mentioned, whether or not they were tagged that way.
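
The difference between tag-based and text-based lookup can be sketched in a few lines of Python. This is a plain linear scan over made-up data, meant only to show why searching the notes finds tickets that tagging misses. The `Ticket` class, the sample tickets, and both helper functions are hypothetical and are not part of IndexTables.

```python
import re
from dataclasses import dataclass


@dataclass
class Ticket:
    ticket_id: int
    status: str
    tags: list
    notes: list


# Hypothetical sample tickets for illustration only.
TICKETS = [
    Ticket(1, "open", ["billing"], ["Customer asked for a refund on order 1182."]),
    Ticket(2, "open", ["refund"], ["Duplicate charge reported.", "Refund issued pending review."]),
    Ticket(3, "closed", ["shipping"], ["Refund completed after lost parcel."]),
    Ticket(4, "open", ["login"], ["Password reset link not arriving."]),
]


def mentions(ticket, term):
    """True if any note contains the term as a whole word, ignoring case."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    return any(pattern.search(note) for note in ticket.notes)


def open_tickets_mentioning(tickets, term):
    """Ids of open tickets whose notes mention the term."""
    return [t.ticket_id for t in tickets if t.status == "open" and mentions(t, term)]


def open_tickets_tagged(tickets, tag):
    """Ids of open tickets that carry the given tag."""
    return [t.ticket_id for t in tickets if t.status == "open" and tag in t.tags]


BY_TEXT = open_tickets_mentioning(TICKETS, "refund")
BY_TAG = open_tickets_tagged(TICKETS, "refund")
```

In this sample data, ticket 1 mentions a refund in its notes but was tagged "billing", so only the text search surfaces it.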