How to import data in Weaviate?

Introduction

Data is added through the RESTful API. Python and JavaScript clients are available. The syntax of a data object is as follows:

{
  "class": "<class name>",  // as defined during schema creation
  "id": "<UUID>",           // optional; must be in UUID format
  "schema": {
    "<property name>": "<property value>"  // value must match the dataType defined during schema creation
  }
}
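
As a rough illustration of this structure, the sketch below assembles a data object as a Python dictionary and checks the optional id with the standard-library `uuid` module. The helper name `build_data_object` is hypothetical and not part of any Weaviate client; it only mirrors the syntax shown above.

```python
import uuid
from typing import Any, Dict, Optional


def build_data_object(class_name: str,
                      properties: Dict[str, Any],
                      object_id: Optional[str] = None) -> Dict[str, Any]:
    """Assemble a data object dict in the shape shown above (hypothetical helper)."""
    data_object: Dict[str, Any] = {"class": class_name, "schema": dict(properties)}
    if object_id is not None:
        # uuid.UUID raises ValueError if the string is not a valid UUID.
        data_object["id"] = str(uuid.UUID(object_id))
    return data_object


publication = build_data_object(
    "Publication",
    {"name": "New York Times"},
    "f81bfe5e-16ba-4615-a516-46c2ae2e5a80",
)
print(publication)
```

Leaving out `object_id` produces a dictionary without an "id" key, which matches the optional nature of that field.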

Prerequisites

  1. Connect to a Weaviate instance.
    If you haven’t set up a Weaviate instance yet, check the Getting started guide. In this guide we assume your instance is running at http://localhost:8080.
  2. Upload a schema.
    See the guide on how to create and upload a schema. This guide assumes you have uploaded a similar schema with the classes Publication, Article and Author.

Add a data object

Let’s add a Publication with the name New York Times to your Weaviate instance. Not all properties have to be filled in when adding a data object, so we will skip the hasArticles property for now, since we don’t have any Article objects yet. Note that the UUID is passed in the id parameter here; this is optional.

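import weaviate

# Connect to the local Weaviate instance
client = weaviate.Client("http://localhost:8080")

# Property values for the new Publication; hasArticles is left out for now
data_schema = {
    "name": "New York Times"
}

# Arguments: property values, class name, optional UUID
client.data_object.create(data_schema, "Publication", "f81bfe5e-16ba-4615-a516-46c2ae2e5a80")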

Add a data object with reference

If you want to add a data object with a reference in a property, you need to use the UUID of the referenced data object. Let’s add the Author named Jodi Kantor, who writes for the New York Times:

import weaviate

client = weaviate.Client("http://localhost:8080")

data_schema = {
    "name": "Jodi Kantor",
    # The beacon ends with the UUID of the New York Times Publication
    "writesFor": [{
        "beacon": "weaviate://localhost/things/f81bfe5e-16ba-4615-a516-46c2ae2e5a80"
    }]
}

client.data_object.create(data_schema, "Author", "36ddd591-2dee-4e7e-a3cc-eb86d30a4303")
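
The beacon in the example above follows the pattern weaviate://<host>/things/<UUID>. A small helper can build it from a UUID so the string does not have to be typed by hand. This is only a sketch: `make_beacon` and its `host` and `kind` parameters are hypothetical names, with defaults taken from the example above.

```python
import uuid
from typing import Dict


def make_beacon(object_id: str, host: str = "localhost", kind: str = "things") -> Dict[str, str]:
    """Return a reference dict of the form used in the example above (hypothetical helper)."""
    # uuid.UUID raises ValueError if object_id is not a valid UUID.
    canonical_id = str(uuid.UUID(object_id))
    return {"beacon": f"weaviate://{host}/{kind}/{canonical_id}"}


# Property values for the Author, with a reference to the Publication.
author = {
    "name": "Jodi Kantor",
    "writesFor": [make_beacon("f81bfe5e-16ba-4615-a516-46c2ae2e5a80")],
}
print(author["writesFor"][0]["beacon"])
# prints: weaviate://localhost/things/f81bfe5e-16ba-4615-a516-46c2ae2e5a80
```

The resulting `author` dictionary can be passed to `client.data_object.create` in the same way as `data_schema` above.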

You can also add references later, after the data object has been created. The following example first creates the Author with only a name, and then adds the reference to a Publication. This comes in handy when you need to create data objects first before you can add references between them.

import weaviate
import time

client = weaviate.Client("http://localhost:8080")

data_schema = {
    "name": "Jodi Kantor"
}

# Create the Author without any references
client.data_object.create(data_schema, "Author", "36ddd591-2dee-4e7e-a3cc-eb86d30a4303")
# Short pause before adding the reference
time.sleep(1)
# Arguments: UUID of the Author, reference property, UUID of the Publication
client.data_object.reference.add("36ddd591-2dee-4e7e-a3cc-eb86d30a4303", "writesFor", "f81bfe5e-16ba-4615-a516-46c2ae2e5a80")
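
For a larger dataset, the same create-first, reference-later pattern could be organised as two passes, so that the order of the input records does not matter. The sketch below is an assumption about how one might structure this: the record format, `plan_import` and `run_import` are hypothetical, and only `data_object.create` and `data_object.reference.add` are taken from the examples above. The pause between the two calls is left out here.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical input format: "refs" maps a reference property to target UUIDs.
records: List[Dict[str, Any]] = [
    {
        "class": "Author",
        "id": "36ddd591-2dee-4e7e-a3cc-eb86d30a4303",
        "properties": {"name": "Jodi Kantor"},
        "refs": {"writesFor": ["f81bfe5e-16ba-4615-a516-46c2ae2e5a80"]},
    },
    {
        "class": "Publication",
        "id": "f81bfe5e-16ba-4615-a516-46c2ae2e5a80",
        "properties": {"name": "New York Times"},
    },
]

Create = Tuple[Dict[str, Any], str, str]  # (properties, class name, UUID)
Reference = Tuple[str, str, str]          # (from UUID, property, to UUID)


def plan_import(items: List[Dict[str, Any]]) -> Tuple[List[Create], List[Reference]]:
    """Split records into object creations and reference additions."""
    creates: List[Create] = []
    references: List[Reference] = []
    for item in items:
        creates.append((item["properties"], item["class"], item["id"]))
        for prop, targets in item.get("refs", {}).items():
            for target in targets:
                references.append((item["id"], prop, target))
    return creates, references


def run_import(client: Any, items: List[Dict[str, Any]]) -> None:
    """Create every object first, then add every reference."""
    creates, references = plan_import(items)
    for properties, class_name, object_id in creates:
        client.data_object.create(properties, class_name, object_id)
    for from_id, prop, to_id in references:
        client.data_object.reference.add(from_id, prop, to_id)


creates, references = plan_import(records)
print(len(creates), "objects,", len(references), "references")
```

With a connected client, `run_import(client, records)` would create both objects before adding the writesFor reference, even though the Author appears first in the list.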

Next steps

  • Take a look at How to query data to learn how to interact with the data you just added.
  • See the RESTful semantic kinds and Batching API reference pages for all API operations to add, modify and delete data.

More Resources

If you can’t find the answer to your question here, try one of the following:

  1. The Frequently Asked Questions.
  2. The knowledge base of old issues.
  3. Stack Overflow, for questions.
  4. GitHub, for issues.
  5. The Slack channel, where you can ask your question directly.