Getting Started with REM

Store, search, and manage vector embeddings at scale with the REM decentralized network.

1. Install the SDK

bash
pip install rem-vectordb

2. Initialize the Client

python
from rem import REM

client = REM(api_key="rem_your_api_key_here")

3. Create a Collection

python
collection = client.create_collection(
    name="my-collection",
    dimension=1536,        # Match your embedding model
    metric="cosine"        # cosine | euclidean | dot_product
)

4. Upsert Vectors

python
collection.upsert([
    {"id": "doc-1", "values": [0.1, 0.2, ...], "metadata": {"source": "wiki"}},
    {"id": "doc-2", "values": [0.3, 0.4, ...], "metadata": {"source": "blog"}},
])

5. Query

python
results = collection.query(
    vector=[0.1, 0.2, ...],
    top_k=10,
    filter={"source": "wiki"}
)

for match in results.matches:
    print(f"{match.id}: {match.score:.4f}")

6. Fetch & Delete

python
# Fetch vectors by ID (for RAG source retrieval)
fetched = collection.fetch(ids=["doc-1", "doc-2"])

# Delete vectors by ID (for GDPR compliance)
deleted = collection.delete(ids=["doc-1"])

Authentication

All API requests require an API key. You can generate keys from the dashboard.

Using the SDK

python
from rem import REM

client = REM(api_key="rem_your_api_key_here")

Using the REST API

bash
curl -X GET https://api.getrem.online/v1/collections \
  -H "X-API-Key: rem_your_api_key_here"

Keep your API keys secure. Never expose them in client-side code or public repositories. Use environment variables in production.
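For example, a minimal way to follow that advice in Python. The REM_API_KEY variable name is just a convention for this sketch, not something the SDK requires:

```python
# Read the API key from an environment variable instead of hard-coding it.
# REM_API_KEY is an illustrative name, not one the SDK mandates.
import os

def load_api_key(env_var="REM_API_KEY"):
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key

# client = REM(api_key=load_api_key())
```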

Collections

A collection is a named group of vectors with a fixed dimension and distance metric. Collections are automatically distributed across miners for redundancy.

Create a Collection

python
collection = client.create_collection(
    name="products",
    dimension=384,
    metric="cosine",                     # "cosine" | "euclidean" | "dot_product"
    encrypted_fields=["email", "pii"],   # Optional: AES-256-GCM encryption
)

Supported Dimensions

| Dimension | Common Models |
|---|---|
| 384 | all-MiniLM-L6-v2, BGE-small |
| 768 | all-mpnet-base-v2, BGE-base, E5-base |
| 1024 | Cohere embed-v3, BGE-large |
| 1536 | OpenAI text-embedding-3-small, text-embedding-ada-002 |
| 3072 | OpenAI text-embedding-3-large |

Any integer dimension from 1 to 4096 is supported. The table above shows common embedding model dimensions.
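Vectors whose length does not match the collection dimension are rejected, so a small client-side check can catch mistakes before the network round-trip. This helper is hypothetical, not part of the SDK:

```python
# Hypothetical pre-flight check (not part of the SDK): validate vector
# length against the collection dimension before upserting.
def check_dimension(vectors, dimension):
    for v in vectors:
        if len(v["values"]) != dimension:
            raise ValueError(
                f"vector {v['id']!r} has {len(v['values'])} values, "
                f"expected {dimension}"
            )

check_dimension([{"id": "a", "values": [0.0] * 384}], 384)  # passes silently
```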

Distance Metrics

| Metric | Best For | Range |
|---|---|---|
| cosine | Text similarity, semantic search, RAG | 0 to 1 (higher = more similar) |
| euclidean | Spatial data, image features | 0 to infinity (lower = more similar) |
| dot_product | Pre-normalized vectors, recommendations | -infinity to infinity (higher = more similar) |
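A quick sanity check of how the three metrics relate, in plain Python with no SDK involved. For unit-norm vectors, dot product and cosine similarity coincide, which is why pre-normalized embeddings can use dot_product:

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a, b = [1.0, 0.0], [0.6, 0.8]    # both unit-norm
print(cosine(a, b))               # 0.6
print(dot(a, b))                  # 0.6, same as cosine for unit-norm inputs
print(round(euclidean(a, b), 4))  # 0.8944
```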

List Collections

python
collections = client.list_collections()
for col in collections:
    print(f"{col.name} — {col.dimension}d, {col.metric}")

Delete a Collection

python
client.delete_collection("collection_id")

Vectors

Vectors are the core data unit. Each vector has an ID, a list of float values, and optional metadata for filtering.

Upsert Vectors

Insert or update vectors. If a vector with the same ID exists, it will be overwritten.

python
collection.upsert([
    {
        "id": "prod-001",
        "values": [0.12, -0.34, 0.56, ...],  # Must match collection dimension
        "metadata": {
            "category": "electronics",
            "price": 299.99,
            "in_stock": True
        }
    },
    {
        "id": "prod-002",
        "values": [0.78, 0.91, -0.23, ...],
        "metadata": {
            "category": "clothing",
            "price": 49.99,
            "in_stock": False
        }
    }
])

Fetch Vectors by ID

Retrieve vectors by their IDs. Useful for RAG source document retrieval and citations.

python
result = collection.fetch(ids=["prod-001", "prod-002"])
for vec in result.vectors:
    print(f"{vec.id}: {len(vec.values)} dimensions")

Delete Vectors

Delete vectors by their IDs. Essential for GDPR compliance and data lifecycle management.

python
result = collection.delete(ids=["prod-001"])
print(f"Deleted {result.deleted_count} vectors")

Querying

Find the most similar vectors to a given query vector. Queries are automatically routed to the lowest-latency miners.

Basic Query

python
results = collection.query(
    vector=[0.12, -0.34, 0.56, ...],
    top_k=10
)

Query with Metadata Filter

python
results = collection.query(
    vector=[0.12, -0.34, 0.56, ...],
    top_k=5,
    filter={
        "category": "electronics",
        "price": {"$lte": 500},
        "in_stock": True
    }
)

Query with include_values

Return the original vector values alongside scores. Useful for debugging and re-ranking.

python
results = collection.query(
    vector=[0.12, -0.34, 0.56, ...],
    top_k=5,
    include_values=True  # Returns original (deobfuscated) vectors
)
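A sketch of the re-ranking use case. The match shapes here are illustrative plain dicts rather than SDK objects: with the original values in hand, you can re-score matches locally, for example with an exact cosine against the query.

```python
import math

def exact_cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    return num / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def rerank(query, matches):
    # matches: [{"id": ..., "values": [...]}, ...],
    # as returned when include_values=True
    return sorted(matches, key=lambda m: exact_cosine(query, m["values"]),
                  reverse=True)

ranked = rerank([1.0, 0.0], [
    {"id": "a", "values": [0.0, 1.0]},
    {"id": "b", "values": [1.0, 0.1]},
])
print([m["id"] for m in ranked])  # ['b', 'a']
```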

Filter Operators

| Operator | Description | Example |
|---|---|---|
| $eq | Equal to (default) | {"field": "value"} |
| $ne | Not equal to | {"field": {"$ne": "x"}} |
| $gt | Greater than | {"price": {"$gt": 100}} |
| $gte | Greater than or equal | {"price": {"$gte": 100}} |
| $lt | Less than | {"price": {"$lt": 50}} |
| $lte | Less than or equal | {"price": {"$lte": 50}} |
| $in | In array | {"cat": {"$in": ["a","b"]}} |
| $nin | Not in array | {"cat": {"$nin": ["x"]}} |
| $and | Logical AND | {"$and": [{...}, {...}]} |
| $or | Logical OR | {"$or": [{...}, {...}]} |
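Operators compose. For example, a filter for in-stock electronics or sports items priced between 20 and 200. Whether several operators may share one field dict is not specified here, so each range bound gets its own clause:

```python
# Compound filter built from the operators above.
compound_filter = {
    "$and": [
        {"category": {"$in": ["electronics", "sports"]}},
        {"price": {"$gte": 20}},
        {"price": {"$lte": 200}},
        {"in_stock": True},   # $eq is the default operator
    ]
}

# results = collection.query(vector=query_vec, top_k=10, filter=compound_filter)
```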

Response Format

json
{
  "matches": [
    {
      "id": "prod-001",
      "score": 0.9542,
      "metadata": {"category": "electronics", "price": 299.99}
    },
    {
      "id": "prod-003",
      "score": 0.8891,
      "metadata": {"category": "electronics", "price": 149.99}
    }
  ]
}

Batch Query

Execute multiple queries in a single API call. Queries run in parallel on the server for maximum throughput. Up to 10 queries per batch.

python
results = collection.query_batch([
    {
        "vector": embed("wireless headphones"),
        "top_k": 5,
        "filter": {"category": "electronics"}
    },
    {
        "vector": embed("running shoes"),
        "top_k": 5,
        "filter": {"category": "sports"}
    },
    {
        "query_text": "bluetooth speaker",
        "top_k": 3,
        "hybrid_alpha": 1.0
    }
])

# results is a list of QueryResult, one per query
for i, result in enumerate(results):
    print(f"Query {i}: {len(result.matches)} matches")

Use cases: Recommendation engines (multiple feeds in one call), AI agents (search across memory types), and any pipeline that needs parallel retrieval.

Encryption

REM provides two layers of data protection: per-field metadata encryption with AES-256-GCM and vector obfuscation using distance-preserving transformations.

Encrypted Metadata Fields

Specify which metadata fields should be encrypted. Encrypted fields are invisible to miners but are automatically decrypted when you query.

python
# Create collection with encrypted fields
collection = client.create_collection(
    name="user-data",
    dimension=384,
    encrypted_fields=["email", "phone", "ssn"]  # AES-256-GCM
)

# Upsert — encrypted fields are handled automatically
collection.upsert([{
    "id": "user-1",
    "values": embed("John Doe profile"),
    "metadata": {
        "name": "John Doe",       # Stored as plaintext (filterable)
        "email": "john@example.com",  # Encrypted on miners
        "phone": "+1-555-0100",       # Encrypted on miners
        "ssn": "123-45-6789",         # Encrypted on miners
        "role": "admin"           # Stored as plaintext (filterable)
    }
}])

# Query — encrypted fields are auto-decrypted in results
results = collection.query(
    vector=embed("admin users"),
    filter={"role": "admin"},  # Only plaintext fields can be filtered
    top_k=10
)
# results include decrypted email, phone, ssn

Vector Obfuscation

All vectors are automatically obfuscated before being sent to miners using distance-preserving transformations (permutation + sign flip). This prevents miners from reconstructing your original embeddings while preserving similarity search accuracy.

How it works: Per-namespace keys generate deterministic permutation and sign-flip seeds. Cosine, Euclidean, and dot-product distances are perfectly preserved — search quality is identical to unobfuscated vectors.
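The distance-preservation claim is easy to verify in plain Python. This is a toy version of the transform, illustrative only and not REM's actual key derivation: applying the same seeded permutation and sign flips to two vectors leaves their dot product, and therefore their cosine and Euclidean distances, unchanged.

```python
import random

def obfuscate(vec, seed):
    rng = random.Random(seed)           # deterministic, like a namespace key
    perm = list(range(len(vec)))
    rng.shuffle(perm)                   # permutation
    signs = [rng.choice((-1.0, 1.0)) for _ in vec]  # sign flips
    return [signs[i] * vec[perm[i]] for i in range(len(vec))]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

a = [0.1, -0.4, 0.7, 0.2]
b = [0.3, 0.5, -0.1, 0.9]
oa, ob = obfuscate(a, seed=7), obfuscate(b, seed=7)
assert abs(dot(a, b) - dot(oa, ob)) < 1e-12  # dot product preserved
```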

Framework Integrations

Native integrations with popular AI frameworks. Drop-in vector stores that work with your existing pipelines.

LangChain

Drop-in VectorStore for LangChain RAG pipelines.

bash
pip install rem-vectordb[langchain]

python
from rem.integrations.langchain import REMVectorStore
from langchain_openai import OpenAIEmbeddings

# Create vector store
store = REMVectorStore(
    api_key="rem_xxx",
    collection_name="docs",
    embedding=OpenAIEmbeddings()
)

# Add documents
store.add_texts(
    texts=["REM is a decentralized vector database", "Powered by 2000+ miners"],
    metadatas=[{"source": "docs"}, {"source": "marketing"}]
)

# Similarity search
results = store.similarity_search("What is REM?", k=5)
for doc in results:
    print(doc.page_content)

# Search with scores
results_with_scores = store.similarity_search_with_score("What is REM?", k=5)
for doc, score in results_with_scores:
    print(f"{score:.4f}: {doc.page_content}")

# Search with metadata filter
results = store.similarity_search(
    "vector database",
    k=5,
    filter={"source": "docs"}
)

# Delete documents
store.delete(ids=["doc-id-1", "doc-id-2"])

# Create from texts (one-liner)
store = REMVectorStore.from_texts(
    texts=["Hello", "World"],
    embedding=OpenAIEmbeddings(),
    api_key="rem_xxx",
    collection_name="quickstart"
)

LlamaIndex

Native BasePydanticVectorStore for LlamaIndex index pipelines.

bash
pip install rem-vectordb[llamaindex]

python
from llama_index.core import StorageContext, VectorStoreIndex, SimpleDirectoryReader
from rem.integrations.llamaindex import REMVectorStore

# Create vector store
vector_store = REMVectorStore(
    api_key="rem_xxx",
    collection_name="docs",
    dimension=1536
)

# Build index from documents (the store is attached via a StorageContext)
documents = SimpleDirectoryReader("./data").load_data()
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context
)

# Query
query_engine = index.as_query_engine()
response = query_engine.query("What is REM Network?")
print(response)

# Use existing index
index = VectorStoreIndex.from_vector_store(vector_store)
retriever = index.as_retriever(similarity_top_k=5)
nodes = retriever.retrieve("vector database")

Python SDK Reference

Full reference for the rem-vectordb Python package.

Installation

bash
pip install rem-vectordb              # Core SDK
pip install rem-vectordb[langchain]   # + LangChain integration
pip install rem-vectordb[llamaindex]  # + LlamaIndex integration

Client

python
from rem import REM

client = REM(
    api_key="rem_xxx",                          # Required
    base_url="https://api.getrem.online",       # Optional (default)
    timeout=30,                                  # Request timeout in seconds
)

Async Client

python
from rem import AsyncREM

async with AsyncREM(api_key="rem_xxx") as client:
    collection = await client.create_collection("docs", dimension=384)
    await collection.upsert([...])
    results = await collection.query(vector=[...], top_k=10)

Client Methods

| Method | Description |
|---|---|
| create_collection(name, dimension, metric?, encrypted_fields?) | Create a new collection. Returns a Collection object. |
| get_collection(id) | Get an existing collection by ID. |
| list_collections() | List all collections in the namespace. |
| delete_collection(id) | Delete a collection and all its vectors. |

Collection Methods

| Method | Description |
|---|---|
| upsert(vectors) | Insert or update vectors. Accepts a list of dicts or Vector objects. |
| query(vector?, top_k, filter?, query_text?, hybrid_alpha?, include_values?) | Search for similar vectors. Supports hybrid search and metadata filtering. |
| query_batch(queries) | Execute multiple queries in parallel (max 10 per batch). |
| fetch(ids) | Fetch vectors by ID. Returns FetchResult with vector values. |
| delete(ids) | Delete vectors by ID. Returns DeleteResult with count. |
| stats() | Get collection statistics (vector count, storage, etc.). |

REST API Reference

Base URL: https://api.getrem.online/v1

All requests must include the X-API-Key header.

| Method | Path | Description | Example Body |
|---|---|---|---|
| POST | /collections | Create a collection | {"name": "products", "dimension": 384, "metric": "cosine", "encrypted_fields": ["email"]} |
| GET | /collections | List collections | |
| GET | /collections/{id} | Get collection details | |
| DELETE | /collections/{id} | Delete a collection | |
| POST | /collections/{id}/vectors/upsert | Upsert vectors | {"vectors": [{"id": "v1", "values": [...], "metadata": {...}}]} |
| POST | /collections/{id}/vectors/query | Query vectors | {"vector": [...], "top_k": 10, "filter": {...}, "query_text": "keywords", "hybrid_alpha": 0.5, "include_values": false} |
| POST | /collections/{id}/vectors/query/batch | Batch query (max 10) | {"queries": [{"vector": [...], "top_k": 5}, {"query_text": "...", "top_k": 3}]} |
| POST | /collections/{id}/vectors/fetch | Fetch vectors by ID | {"ids": ["v1", "v2"]} |
| POST | /collections/{id}/vectors/delete | Delete vectors | {"ids": ["v1"]} |
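A minimal stdlib-only sketch of calling the query endpoint. The endpoint path and X-API-Key header come from the reference above; the helper name and arguments are illustrative:

```python
import json
import urllib.request

BASE_URL = "https://api.getrem.online/v1"

def build_query_request(api_key, collection_id, vector, top_k=10):
    # POST /collections/{id}/vectors/query with the X-API-Key header
    payload = json.dumps({"vector": vector, "top_k": top_k}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/collections/{collection_id}/vectors/query",
        data=payload,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )

req = build_query_request("rem_xxx", "col_123", [0.1, 0.2], top_k=5)
# with urllib.request.urlopen(req) as resp:   # sends the request
#     matches = json.load(resp)["matches"]
```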

Rate Limits

| Plan | Requests/min | Max Vectors/Upsert | Batch Queries |
|---|---|---|---|
| Free | 60 | 100 | 10 per call |
| Pro | 600 | 1,000 | 10 per call |
| Business | 6,000 | 10,000 | 10 per call |
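When a client exceeds its plan's request budget, retrying with exponential backoff is the usual remedy. This is a sketch only; the exact rate-limit error the SDK raises is not documented here, so a caller-supplied predicate stands in:

```python
import time

def backoff_delays(retries=5, base=1.0, cap=30.0):
    # 1, 2, 4, 8, 16 seconds, capped at `cap`
    return [min(cap, base * 2 ** i) for i in range(retries)]

def with_retries(call, is_rate_limited=lambda e: True, retries=5):
    for delay in backoff_delays(retries):
        try:
            return call()
        except Exception as exc:   # stand-in for the SDK's rate-limit error
            if not is_rate_limited(exc):
                raise
            time.sleep(delay)
    return call()                  # final attempt; errors propagate

print(backoff_delays())  # [1.0, 2.0, 4.0, 8.0, 16.0]
```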

Need help? Join our Discord or email support@getrem.online