Announcing our HTTP/S Elasticsearch Ingestion Plugin for Integrating with LLMs
Adding Content Enrichment to Elasticsearch Pipelines
MC+A is pleased to announce our new HTTP Ingest Plugin for Elasticsearch. Developers working with Elasticsearch know that one long-missing feature is the ability to make remote HTTP/S calls to external services. In content ingestion workflows, these external calls perform enrichment tasks such as Named Entity Recognition. With our new plugin, you can now make these calls directly from an Elasticsearch Ingest Pipeline.
By closing this feature gap, our plugin makes it simple to combine out-of-the-box features, like the web crawler, with Large Language Models, custom prompts, or other external services in an ingestion workflow running within an Elasticsearch cluster.
MC+A's innovation in Search (and AI) continues
Earlier this year, MC+A was recognized as one of KMWorld's 100 Companies That Matter in Knowledge Management. For the past eight years, we have integrated AI and machine learning into search applications and workflows, building up reference libraries that we are bringing to market for our Trusted Advisor clients so they can focus on their use case rather than the platform. MC+A continues to develop solutions that bridge gaps in technology, whether a custom connector to a knowledge repository or a missing piece in a technology stack or pipeline.
Leveraging Intelligent Functions like those brought by Bookend AI
Along the way we’ve worked with companies like Bookend AI, whose Safe AI Platform helps secure enterprise RAG applications against data leakage, jailbreaks, and prompt injection. With a service like Bookend, you can weave together a pipeline that does something like:
- Web Crawl Your Internal Knowledge Base
- Call a Bookend Intelligent Function of choice to enrich your content
- Add the Bookend Response into your mappings
- Leverage all the out-of-the-box features of Elasticsearch
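For the last step, one way to wire an enrichment pipeline into a crawler-populated index is Elasticsearch's standard `index.default_pipeline` setting. This is a minimal sketch: the index name is a placeholder, and Elastic-managed crawler indices may already invoke the matching `@custom` pipeline automatically, in which case no extra wiring is needed.

```
PUT search-knowledge/_settings
{
  "index.default_pipeline": "search-knowledge@custom"
}
```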
"This is a fantastic development for customers working on integrating advanced AI functionalities. By enabling seamless integration of intelligent functions like those offered by Bookend, this plugin unlocks a powerful new layer of automation and efficiency within ingestion workflows. We're excited to see how this collaboration empowers developers to leverage the full potential of MC+A’s offerings."
- Vivek Sriram, Chief Commercial Officer @ Bookend
Build Your Prompt, Call Your Endpoint
Elasticsearch has a mechanism to transform documents before they are indexed, called the Ingest Pipeline, and it ships with many processors out of the box. Using a processor takes only a couple of API calls. Our plugin requires that you set up your prompt in a prior processor stage. Once this is done, you can call the endpoint in the following manner:
PUT _ingest/pipeline/search-knowledge@custom
{
  "description": "Adds call to Bookend for Inference",
  "processors": [
    {
      "set": {
        "field": "bookend_input",
        "value": """{ "text": "{{body_content}}", "question": "", "context": "", "instruction": "Extract entities from the model's output"}"""
      }
    },
    {
      "ingest_rest": {
        "field": "bookend_input",
        "target_field": "ner",
        "endpoint": "https://api.bookend.ai/mcplusa/models/predict?model_id=c9318fab-1526-1974-86c7-cat627cdd7d3&task=named_entity_recognition",
        "content_type": "application/json",
        "method": "POST"
      }
    }
  ]
}
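Before attaching the pipeline to an index, you can exercise it with Elasticsearch's simulate pipeline API. This sketch assumes, as in the example above, that the crawler populates a `body_content` field on each document:

```
POST _ingest/pipeline/search-knowledge@custom/_simulate
{
  "docs": [
    {
      "_id": "http://www.mcplusa.local/example/ner",
      "_source": {
        "body_content": "Japanese pitcher and freshly minted Dodgers team member Shohei Ohtani’s decade-long $700 million contract with the team."
      }
    }
  ]
}
```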
In this example, we call a Named Entity Recognition task on the Bookend AI platform. The LLM's response lands in the target field and can be modified downstream as needed.
{
  "docs": [
    {
      "doc": {
        "_index": "_index",
        "_version": "-3",
        "_id": "http://www.mcplusa.local/example/ner",
        "_source": {
          "bookend_input": """{"text": "Japanese pitcher and freshly minted Dodgers team member Shohei Ohtani’s decade-long $700 million contract with the team — .", "question": "", "context": "", "instruction": "Extract entities from the model's output"}""",
          "ner": """["Named entities in the text:\n\n1. Shohei Ohtani (Japanese pitcher and freshly minted Dodgers team member)\n2. Dodgers\n3. Contract\n4. $700 million"]"""
        },
        "_ingest": {
          "timestamp": "2024-03-07T17:27:03.636898717Z"
        }
      }
    }
  ]
}
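Because the model answers in free text, the `ner` field usually needs a little post-processing before it is useful in mappings or facets. Here is a minimal Python sketch, assuming the response shape in the example above (a JSON-encoded list of strings with numbered entity lines); `extract_entities` is an illustrative helper, not part of the plugin.

```python
import json

def extract_entities(ner_field: str) -> list[str]:
    """Parse a free-text NER answer (a JSON-encoded list of strings,
    as in the example response) into a flat list of entities."""
    entities = []
    for chunk in json.loads(ner_field):
        for line in chunk.splitlines():
            line = line.strip()
            # keep only numbered entity lines such as "2. Dodgers"
            if line[:1].isdigit() and ". " in line:
                entities.append(line.split(". ", 1)[1])
    return entities

ner = '["Named entities in the text:\\n\\n1. Shohei Ohtani\\n2. Dodgers\\n3. Contract\\n4. $700 million"]'
print(extract_entities(ner))  # ['Shohei Ohtani', 'Dodgers', 'Contract', '$700 million']
```

Because LLM output formats drift, production code should treat this parsing defensively and fall back to storing the raw text when no entity lines are found.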
While this snippet demonstrates calling an LLM for inference, the plugin can be combined with any external service reachable over HTTP/HTTPS. Now the real challenge is getting the “AI Magic” to work. Happy Searching, and big thanks again to Steph van Schalkwyk for their collaboration on this project!
Engage with MC+A
Getting Started
Need Assistance Deploying LLM Pipelines?
Launch your technology project with confidence. Our experts allow you to focus on your project’s business value by accelerating the technical implementation with a best practice approach. We provide the expert guidance needed to enhance your users’ search experience, push past technology roadblocks, and leverage the full business potential of search technology.
