The storage of objects and the inverted index within Weaviate is being migrated from a B+tree-based approach to an LSM-tree-based approach. This can speed up imports by up to 50% and also addresses import times degrading over time.
status: done, to be released with next milestone
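To illustrate why an LSM-tree approach favors write-heavy imports: writes land in an in-memory memtable (no on-disk tree rebalancing per write) and are periodically flushed to immutable sorted segments, while reads check the memtable first and then segments from newest to oldest. The sketch below is a minimal, generic LSM-tree; the class name and sizing are illustrative and not Weaviate's actual implementation.

```python
from bisect import bisect_left

class TinyLSM:
    """Minimal LSM-tree sketch (illustrative, not Weaviate's code):
    writes go to an in-memory memtable, which is flushed to an
    immutable sorted segment when it reaches a size limit."""

    def __init__(self, memtable_limit=4):
        self.memtable = {}
        self.segments = []  # sorted (key, value) lists, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # Writes are cheap: no tree rebalancing, just a dict insert.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            # Flush: sort once and append an immutable segment.
            self.segments.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Newest data wins: memtable first, then newest segment backwards.
        if key in self.memtable:
            return self.memtable[key]
        for segment in reversed(self.segments):
            i = bisect_left(segment, (key,))
            if i < len(segment) and segment[i][0] == key:
                return segment[i][1]
        return None
```

A real LSM store would add compaction (merging segments) and tombstones for deletes; both are omitted here for brevity.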
A monolithic index (one index per class) can be broken up into smaller, independent shards. This makes better use of the resources on large (single) machines and allows storage settings to be tuned for specific large-scale cases.
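Splitting an index into shards requires a deterministic way to route each object to a shard. A common approach is to hash the object's ID and take it modulo the shard count; the helper below is a hypothetical sketch of that idea, not Weaviate's actual routing function.

```python
import hashlib

def shard_for(object_id: str, shard_count: int) -> int:
    """Hypothetical shard-routing sketch: deterministically map an
    object ID to one of `shard_count` shards by hashing the ID."""
    digest = hashlib.sha256(object_id.encode()).digest()
    # Use the first 8 bytes of the digest as an integer, then take
    # the remainder to pick a shard.
    return int.from_bytes(digest[:8], "big") % shard_count
```

Because the mapping depends only on the ID and the shard count, any node can compute the owning shard for an object without coordination.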
An index comprised of many shards can be distributed among multiple nodes. A search touches multiple shards on multiple nodes and combines the results. The major benefit: if a use case does not fit on a single node, you can use *n* nodes to support *n* times the use case size. At this point, every node in the cluster is still a potential single point of failure.
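The "touch multiple shards and combine the results" step is a scatter/gather pattern: each shard returns its local top-k hits, and a coordinator merges them into the global top-k. A minimal sketch of the gather step, assuming each shard's results are `(score, object_id)` pairs:

```python
import heapq

def merge_shard_results(per_shard_results, k):
    """Gather step of a scatter/gather search (illustrative sketch):
    each shard contributes its local top-k (score, object_id) pairs;
    the coordinator keeps the global top-k by score."""
    all_hits = [hit for shard_hits in per_shard_results for hit in shard_hits]
    # nlargest keeps only k items, sorted by descending score.
    return heapq.nlargest(k, all_hits, key=lambda hit: hit[0])
```

Note that each shard must return its *local* top-k for the merged result to be correct: any globally top-k hit is necessarily in the local top-k of its own shard.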
Replicated shards distributed across nodes
A node can contain shards that are also present on other nodes. If a node goes down, another node can take over the load without loss of availability or data. Note that the design calls for leaderless replication, so there is no distinction between primary and secondary shards. This removes all single points of failure.
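In a leaderless design, any replica can accept reads and writes; consistency typically comes from quorums, where a write succeeds on a majority of replicas and a read consults a majority, so the two sets always overlap. The sketch below illustrates that quorum-overlap idea with versioned values; it is a simplified model under stated assumptions (no concurrent conflicting writes, no hinted handoff), not Weaviate's replication protocol.

```python
def quorum(n: int) -> int:
    # Majority quorum: any read quorum overlaps any write quorum.
    return n // 2 + 1

class LeaderlessRegister:
    """Leaderless-replication sketch: all replicas are peers; writes
    and reads each require a majority, so at least one replica in any
    read quorum holds the latest successfully written version."""

    def __init__(self, replica_count=3):
        self.replicas = [dict() for _ in range(replica_count)]
        self.version = 0  # simplified global version counter

    def write(self, key, value, reachable):
        if len(reachable) < quorum(len(self.replicas)):
            raise RuntimeError("not enough replicas for a write quorum")
        self.version += 1
        for i in reachable:
            self.replicas[i][key] = (self.version, value)

    def read(self, key, reachable):
        if len(reachable) < quorum(len(self.replicas)):
            raise RuntimeError("not enough replicas for a read quorum")
        hits = [self.replicas[i][key] for i in reachable if key in self.replicas[i]]
        return max(hits)[1] if hits else None  # newest version wins
```

For example, a write that reaches replicas {0, 1} is still visible to a read that reaches {1, 2}, because replica 1 is in both quorums.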
Instead of starting out with a cluster of *n* nodes, the cluster size can be grown or shrunk at runtime. Weaviate automatically redistributes the existing shards accordingly.
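Resizing a cluster at runtime is only practical if adding or removing a node moves few shards. Consistent hashing is the classic technique for this: nodes and keys are hashed onto a ring, each key belongs to the next node clockwise, and a joining node only takes over the keys that fall just before its ring positions. The sketch below is a generic consistent-hash ring, offered as an illustration of the rebalancing idea rather than a description of Weaviate's internal mechanism.

```python
import hashlib
from bisect import bisect

class HashRing:
    """Consistent-hashing sketch: when a node joins, only the keys
    whose ring successor becomes the new node move; every other key
    keeps its owner, so resizing touches a small fraction of shards."""

    def __init__(self, nodes, vnodes=64):
        # Place several virtual points per node for smoother balance.
        self.ring = sorted(
            (self._hash(f"{node}#{v}"), node)
            for node in nodes
            for v in range(vnodes)
        )

    @staticmethod
    def _hash(s: str) -> int:
        return int.from_bytes(hashlib.sha256(s.encode()).digest()[:8], "big")

    def node_for(self, key: str) -> str:
        # Find the first ring position at or after the key's hash,
        # wrapping around to the start of the ring if necessary.
        i = bisect(self.ring, (self._hash(key),)) % len(self.ring)
        return self.ring[i][1]
```

The key property: going from two nodes to three reassigns only the keys that land on the new node; no key ever moves between the two pre-existing nodes.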