Search Indexing with Entity Extraction

Improve search relevance with entity extraction. Extract entities from your documents and index them to power faceted search and more relevant results.

The Problem

Traditional keyword search matches literal strings, so it misses semantic connections: it cannot tell that "Tim Cook" and "Apple CEO" refer to the same entity. Without entity awareness, results are less precise and there is no structured data to build facets on.

The Solution

Extract entities during indexing to enable semantic search, entity-based facets, and improved relevance ranking. Power "People," "Companies," and "Places" filters with structured entity data.
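To make the idea of entity-based facets concrete, here is a minimal sketch of how facet counts and a facet filter could be computed once entities are stored as structured fields. The sample documents and the helper names `facetCounts` and `filterByFacet` are hypothetical, and a real search engine would normally compute facets for you.

```javascript
// Hypothetical in-memory documents with structured entity fields.
const docs = [
  { id: 'a1', persons: ['Tim Cook'], organizations: ['Apple'], locations: ['Cupertino'] },
  { id: 'a2', persons: ['Satya Nadella'], organizations: ['Microsoft'], locations: ['Redmond'] },
  { id: 'a3', persons: ['Tim Cook'], organizations: ['Apple', 'Microsoft'], locations: [] }
];

// Count how many documents mention each value of one entity field.
function facetCounts(documents, field) {
  const counts = new Map();
  for (const doc of documents) {
    // A Set keeps a repeated entity from being counted twice for one document.
    for (const value of new Set(doc[field] || [])) {
      counts.set(value, (counts.get(value) || 0) + 1);
    }
  }
  return counts;
}

// Keep only the documents whose entity field contains the selected value.
function filterByFacet(documents, field, value) {
  return documents.filter((doc) => (doc[field] || []).includes(value));
}

const companyCounts = facetCounts(docs, 'organizations');
const appleDocs = filterByFacet(docs, 'organizations', 'Apple');
```

Here `companyCounts` would back the numbers shown next to each entry in a "Companies" filter, and `appleDocs` is the result set after a user selects "Apple".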

Key Benefits

  • Enable entity-based faceted search
  • Improve search relevance with semantic understanding
  • Power "People mentioned" and "Companies" filters
  • Support entity-aware autocomplete suggestions
  • Build knowledge panels for key entities
  • Connect related documents through shared entities
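As an illustration of the last benefit, the sketch below ranks documents by how many entities they share with a given document. It assumes documents carry the `persons`, `organizations`, and `locations` arrays used elsewhere on this page; the function names `buildEntityMap` and `relatedDocuments` are hypothetical, and entities are matched by exact name only.

```javascript
// Build an inverted map from entity name to the ids of documents that mention it.
function buildEntityMap(documents) {
  const map = new Map();
  for (const doc of documents) {
    const names = [
      ...(doc.persons || []),
      ...(doc.organizations || []),
      ...(doc.locations || [])
    ];
    for (const name of new Set(names)) {
      if (!map.has(name)) map.set(name, new Set());
      map.get(name).add(doc.id);
    }
  }
  return map;
}

// Rank the other documents by the number of entities they share with docId.
function relatedDocuments(documents, docId) {
  const shared = new Map();
  for (const ids of buildEntityMap(documents).values()) {
    if (!ids.has(docId)) continue; // this entity is not mentioned in the source document
    for (const other of ids) {
      if (other !== docId) shared.set(other, (shared.get(other) || 0) + 1);
    }
  }
  return [...shared.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id, sharedEntities]) => ({ id, sharedEntities }));
}
```

Exact-name matching means "Google" and "Google LLC" count as different entities here; normalizing names before indexing would be a separate step.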

Code Example

javascript
// Index documents with extracted entities.
// Assumes `searchIndex` is an already-initialized search client with an async add() method.
async function indexDocumentWithEntities(doc) {
  // Extract entities from the document text
  const response = await fetch('https://api.entity-detector.com/v1/analyze', {
    method: 'POST',
    headers: {
      'Authorization': 'Bearer YOUR_API_KEY',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ text: doc.content })
  });

  // fetch() also resolves on HTTP error statuses, so check before parsing the body
  if (!response.ok) {
    throw new Error(`Entity extraction failed with status ${response.status}`);
  }

  // Default missing fields so every index document has the same shape
  const { entities = {}, relations = [] } = await response.json();
  const persons = entities.persons || [];
  const organizations = entities.organizations || [];
  const locations = entities.locations || [];

  // Create search index document
  const indexDoc = {
    id: doc.id,
    title: doc.title,
    content: doc.content,
    // Structured entity fields for faceting
    persons,
    organizations,
    locations,
    // Flattened for full-text search
    all_entities: [...persons, ...organizations, ...locations].join(' '),
    relations
  };

  await searchIndex.add(indexDoc);
}
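The function above calls `searchIndex.add` without defining `searchIndex`. For local experiments, a minimal in-memory stand-in might look like the sketch below. It is a hypothetical placeholder, not the client of any particular search engine, and the `filter` and `searchEntities` helpers are illustrative names only.

```javascript
// Hypothetical in-memory stand-in for `searchIndex`. Not a real search engine client.
const searchIndex = {
  docs: new Map(),

  // Declared async so `await searchIndex.add(indexDoc)` works as in the function above.
  async add(indexDoc) {
    // Re-adding a document with the same id overwrites the earlier copy.
    this.docs.set(indexDoc.id, indexDoc);
  },

  // Return documents whose entity field (persons, organizations, locations) contains value.
  filter(field, value) {
    return [...this.docs.values()].filter((d) => (d[field] || []).includes(value));
  },

  // Case-insensitive substring match against the flattened all_entities string.
  searchEntities(query) {
    const needle = query.toLowerCase();
    return [...this.docs.values()].filter((d) =>
      (d.all_entities || '').toLowerCase().includes(needle)
    );
  }
};
```

Swapping this object for a real client should only require that the client expose an `add` method accepting the same document shape.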

Example Output

json
{
  "id": "article_5678",
  "title": "Tech Giants Report Q4 Earnings",
  "persons": ["Sundar Pichai", "Satya Nadella"],
  "organizations": ["Google", "Alphabet", "Microsoft"],
  "locations": ["Mountain View", "Redmond"],
  "facets": {
    "companies": ["Google", "Alphabet", "Microsoft"],
    "people": ["Sundar Pichai", "Satya Nadella"],
    "regions": ["California", "Washington"]
  }
}
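The `facets` object in this output is not produced by the indexing function shown earlier. One way it could be derived from the entity fields is sketched below. The `buildFacets` helper and the city-to-region lookup are assumptions for illustration; a production system would more likely rely on a gazetteer or geocoding service for regions.

```javascript
// Hypothetical city-to-region lookup covering only the cities in the example output.
const CITY_TO_REGION = new Map([
  ['Mountain View', 'California'],
  ['Redmond', 'Washington']
]);

// Derive the facets object from the structured entity fields of an index document.
function buildFacets(indexDoc, cityToRegion = CITY_TO_REGION) {
  const regions = (indexDoc.locations || [])
    .map((city) => cityToRegion.get(city))
    .filter((region) => region !== undefined); // skip cities missing from the lookup
  return {
    companies: [...(indexDoc.organizations || [])],
    people: [...(indexDoc.persons || [])],
    regions: [...new Set(regions)] // two cities in one region yield a single entry
  };
}
```

Cities absent from the lookup are dropped from `regions` rather than guessed, which keeps the facet values trustworthy at the cost of coverage.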

