Table of documentation contents

How to perform a semantic search?

How to perform a semantic search in Weaviate?

Introduction

  • Weaviate has RESTful API endpoints to query data, but Weaviate’s query language is GraphQL.
  • You can query a Weaviate after you’ve created a schema and populated it with data.
  • You can perform simple Get{} queries to easily retrieve data, learn how here.
  • To narrow down search results from a Get{} query based on semantics, use the explore filter in the Get{} query. Read how in this tutorial and on the GraphQL filter reference page.
  • To search and find for data objects in a fuzzy manner, you can use the GraphQL Explore{} function, read how in this tutorial, and on the reference page.

Prerequisites

  1. Connect to a Weaviate instance.
    If you haven’t set up a Weaviate instance yet, check the Getting started guide. In this guide we assume your instance is running at http://localhost:8080.
  2. Upload a schema.
    Learn how to create and upload a schema here. In this guide we assume to have a similar schema uploaded with the classes Publication, Article and Author.
  3. Add data.
    Make sure there is data available in your Weaviate instance, you can read how to do this in the previous guide. In this tutorial we assume there are data objects of Publications, Articles and Authors present.

Explore filter

If you want to perform semantic search on data objects in a known class, you can use the explore filter in a Get{} query. Let’s say we want to find Publications that are related to “fashion”, we can do the following query:

  {
  Get{
    Things{
      Publication(
        explore: {
          concepts: ["fashion"],
          certainty: 0.7,
          moveAwayFrom: {
            concepts: ["finance"],
            force: 0.45
          },
          moveTo: {
            concepts: ["haute couture"],
            force: 0.85
          }
        }
      ){
        name
      }
    }
  }
}

🟢 Click here to try out this graphql example in the Weaviate Console.

This will result in something like the following. Vogue was selected as only result, because there were no other publications related enough to “fashion” and “haute couture”.

{
  "data": {
    "Get": {
      "Things": {
        "Publication": [
          {
            "name": "Vogue"
          }
        ]
      }
    }
  },
  "errors": null
}

Explore GraphQL function

If you are not sure what classes you want to query, or want to perform a fuzzy search through your whole dataset, then you can use the Explore{} function instead of the Get{} function. In the Explore{} function you don’t specify which classes you perform the semantic search on, so the semantic search will be performed on all the data objects. Since this search is fuzzy, the only fields you can return are the beacon, certainty, className; you cannot request property values of the data objects, since the property value names depend on the data object, defined in the schema.

Let’s search for data object about fashion again, but now we are not only interested in Publication data objects, but in all data objects that have something to do with “fashion”.

  {
  Explore (
    concepts: ["fashion"],
    certainty: 0.7,
    moveAwayFrom: {
      concepts: ["finance"],
      force: 0.45
    },
    moveTo: {
      concepts: ["haute couture"],
      force: 0.85
    }
  ) {
    beacon
    certainty
    className
  }
}

🟢 Click here to try out this graphql example in the Weaviate Console.

As you can see, the same arguments are applied in the “explore” filter and the Explore{} function. There is however no class specified. Instead, the className is returned as one of the GraphQL fields. The result of this query contains both data objects from the class Publication and Article:

{
  "data": {
    "Explore": [
      {
        "beacon": "weaviate://localhost/things/65010df4-da64-333d-b1ce-55c3fc9174ab",
        "certainty": 0.8257073,
        "className": "Article"
      },
      {
        "beacon": "weaviate://localhost/things/ac884d35-ccb4-3937-81f8-8474a4d7a549",
        "certainty": 0.79948425,
        "className": "Publication"
      },
      {
        "beacon": "weaviate://localhost/things/f2b7c189-9183-3095-a5bb-b619d7fe9703",
        "certainty": 0.7862817,
        "className": "Article"
      },
      {
        "beacon": "weaviate://localhost/things/21239ca9-8f09-3747-b369-ff41e0dfebdd",
        "certainty": 0.7857753,
        "className": "Article"
      },
      {
        "beacon": "weaviate://localhost/things/8f2cb04e-c3bb-344f-8fbd-f742bf36e653",
        "certainty": 0.77738225,
        "className": "Article"
      }
    ]
  },
  "errors": null
}

If you are interested in property values of the returned objects, you will need to do a second query to to retrieve data of the beacon:

$ curl -s http://localhost:8080/v1/things/{id}

So querying all property values of the first result can be done as follows:

$ curl -s http://localhost:8080/v1/things/65010df4-da64-333d-b1ce-55c3fc9174ab

Next steps

  • Look for more ways to query your dataset in Weaviate with GraphQL queries, semantic search and other filters in the GraphQL references guide.
  • Stay tuned for new tutorials, for example on interpretation of the semantic search results or how to set up a classification model!

More resources

If you can’t find the answer to your question here, please look at the:

  1. Frequently Asked Questions. Or,
  2. Knowledge base of old issues. Or,
  3. For questions: Stackoverflow. Or,
  4. For issues: Github. Or,
  5. Ask your question in the Slack channel: Slack.
Tags
  • how to
  • perform a semantic search