Example datasets

There is currently one example dataset available, but more are coming! If you have a dataset you want to add, please contact us. We're happy to help you set it up.

News publications

This dataset contains approximately 1,000 random news articles from: Financial Times, New York Times, Guardian, Wall Street Journal, CNN, Fox News, The Economist, New Yorker, Wired, Vogue, Game Informer.

It includes a schema with classes for Article, Publication, Category and Author.

Run with Docker Compose

If you want to run this dataset locally, you can run it in one go with Docker Compose.

The Docker Compose files below contain both Weaviate and the dataset.

Download the Docker Compose file

$ curl -o docker-compose.yml

Run Docker (optional: run with -d to run Docker in the background)

$ docker-compose up

Weaviate will be available and preloaded with the News Articles demo dataset on:

Run manually

If you have your own version of Weaviate running on an external host or on localhost without Docker Compose:

# WEAVIATE ORIGIN: note the paragraph "basics" for setting the local IP
# Make sure to replace WEAVIATE_ORIGIN with the Weaviate origin as mentioned in the basics above
$ docker run -i -e weaviate_host=$WEAVIATE_ORIGIN semitechnologies/weaviate-demo-newspublications:latest
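Because the command above passes WEAVIATE_ORIGIN straight into the container, a typo in the variable only shows up once the import fails. The sketch below checks the value first. It assumes the origin should have the scheme://host:port shape of the http://localhost:8080 example on this page. Whether the demo image accepts other forms is not stated here, so treat the check as a conservative guess. `is_valid_origin` is a hypothetical helper, not part of Weaviate or the demo image.

```shell
# Hypothetical helper (not part of Weaviate or the demo image).
# Succeeds only if the argument looks like http(s)://host[:port].
is_valid_origin() {
  # Require an http:// or https:// scheme followed by at least one character.
  case "$1" in
    http://?*|https://?*) ;;
    *) return 1 ;;
  esac
  # Strip the scheme, then reject a path, a query string, or spaces.
  rest=${1#*://}
  case "$rest" in
    */*|*\?*|*" "*) return 1 ;;
  esac
  return 0
}

# Example use before starting the import:
#   is_valid_origin "$WEAVIATE_ORIGIN" || echo "unexpected WEAVIATE_ORIGIN: $WEAVIATE_ORIGIN" >&2
```

The helper only inspects the string. It does not check that Weaviate is actually reachable at that address.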

Usage with Docker locally, with Docker Compose:

Note: run this from the same directory where the Weaviate Docker Compose files are located.

# WEAVIATE ORIGIN (e.g., http://localhost:8080), note the paragraph "basics" for setting the local IP
$ export WEAVIATE_ORIGIN="http://$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' ${PWD##*/}_weaviate_1):8080"
# WEAVIATE NETWORK (see paragraph: Running on the localhost)
$ export WEAVIATE_NETWORK=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.NetworkID}}{{end}}' ${PWD##*/}_weaviate_1)
# Run docker
$ docker run -i --network=$WEAVIATE_NETWORK -e weaviate_host=$WEAVIATE_ORIGIN semitechnologies/weaviate-demo-newspublications:latest
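The commands above rely on `${PWD##*/}_weaviate_1` resolving to the name of the running Weaviate container. `${PWD##*/}` is a shell parameter expansion that strips everything up to the last slash, leaving only the current directory name. The sketch below shows the same derivation as a small function, so you can see which name the commands will look up. `compose_container_name` is a hypothetical helper for illustration. It assumes Docker Compose names the container `<directory>_weaviate_1`. Other Compose versions may normalize the directory name or use a different separator, in which case `docker ps --format '{{.Names}}'` shows the actual name to use.

```shell
# Hypothetical helper: prints the container name that the commands above
# look up, given the path of the directory holding the Compose file.
compose_container_name() {
  dir=${1%/}                              # drop one trailing slash, if any
  printf '%s_weaviate_1\n' "${dir##*/}"   # keep only the last path component
}

# Example use:
#   compose_container_name "$PWD"
```

If the printed name does not appear in the output of `docker ps`, substitute the real container name in the two `docker inspect` commands.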

More Resources

If you can’t find the answer to your question here, please look at:

  1. The Frequently Asked Questions.
  2. The knowledge base of old issues.
  3. Stack Overflow, for questions.
  4. GitHub, for issues.
  5. The Slack channel, to ask your question directly.