Getting Started with REM
Store, search, and manage vector embeddings at scale with the REM decentralized network.
1. Install the SDK
pip install rem-vectordb
2. Initialize the Client
from rem import REM
client = REM(api_key="rem_your_api_key_here")
3. Create a Collection
collection = client.create_collection(
name="my-collection",
dimension=1536, # Match your embedding model
metric="cosine" # cosine | euclidean | dot_product
)
4. Upsert Vectors
collection.upsert([
{"id": "doc-1", "values": [0.1, 0.2, ...], "metadata": {"source": "wiki"}},
{"id": "doc-2", "values": [0.3, 0.4, ...], "metadata": {"source": "blog"}},
])
5. Query
results = collection.query(
vector=[0.1, 0.2, ...],
top_k=10,
filter={"source": "wiki"}
)
for match in results.matches:
    print(f"{match.id}: {match.score:.4f}")
6. Fetch & Delete
# Fetch vectors by ID (for RAG source retrieval)
fetched = collection.fetch(ids=["doc-1", "doc-2"])
# Delete vectors by ID (for GDPR compliance)
deleted = collection.delete(ids=["doc-1"])
Authentication
All API requests require an API key. You can generate keys from the dashboard.
Using the SDK
from rem import REM
client = REM(api_key="rem_your_api_key_here")
Using the REST API
curl -X GET https://api.getrem.online/v1/collections \
  -H "X-API-Key: rem_your_api_key_here"
Keep your API keys secure. Never expose them in client-side code or public repositories. Use environment variables in production.
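The environment-variable advice above can be sketched in a few lines. The `REM_API_KEY` variable name and the `load_api_key` helper are illustrative conventions for this example, not part of the SDK:

```python
import os

def load_api_key() -> str:
    """Read the REM API key from the environment instead of hardcoding it.

    REM_API_KEY is a naming convention chosen for this sketch.
    """
    key = os.environ.get("REM_API_KEY")
    if not key:
        raise RuntimeError("Set REM_API_KEY before creating the client")
    return key

# client = REM(api_key=load_api_key())
```

This keeps the key out of source control; in CI or production, inject the variable through your secrets manager.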
Collections
A collection is a named group of vectors with a fixed dimension and distance metric. Collections are automatically distributed across miners for redundancy.
Create a Collection
collection = client.create_collection(
name="products",
dimension=384,
metric="cosine", # "cosine" | "euclidean" | "dot_product"
encrypted_fields=["email", "pii"], # Optional: AES-256-GCM encryption
)
Supported Dimensions
| Dimension | Common Models |
|---|---|
| 384 | all-MiniLM-L6-v2, BGE-small |
| 768 | all-mpnet-base-v2, BGE-base, E5-base |
| 1024 | Cohere embed-v3, BGE-large |
| 1536 | OpenAI text-embedding-3-small, text-embedding-ada-002 |
| 3072 | OpenAI text-embedding-3-large |
Any integer dimension from 1 to 4096 is supported. The table above shows common embedding model dimensions.
Distance Metrics
| Metric | Best For | Range |
|---|---|---|
| cosine | Text similarity, semantic search, RAG | 0 to 1 (higher = more similar) |
| euclidean | Spatial data, image features | 0 to infinity (lower = more similar) |
| dot_product | Pre-normalized vectors, recommendations | -infinity to infinity (higher = more similar) |
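For intuition, all three metrics can be computed in a few lines of plain Python (an illustrative sketch, not SDK code):

```python
import math

def dot(a, b):
    """Dot product: higher = more similar (for comparable magnitudes)."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Angle-based similarity, independent of vector magnitude."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean_distance(a, b):
    """Straight-line distance: lower = more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a, b = [1.0, 0.0], [1.0, 1.0]
print(round(cosine_similarity(a, b), 4))   # → 0.7071
print(round(euclidean_distance(a, b), 4))  # → 1.0
print(round(dot(a, b), 4))                 # → 1.0
```

Note that for unit-length vectors, cosine and dot product rank results identically, which is why dot_product suits pre-normalized embeddings.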
List Collections
collections = client.list_collections()
for col in collections:
    print(f"{col.name} — {col.dimension}d, {col.metric}")
Delete a Collection
client.delete_collection("collection_id")
Vectors
Vectors are the core data unit. Each vector has an ID, a list of float values, and optional metadata for filtering.
Upsert Vectors
Insert or update vectors. If a vector with the same ID exists, it will be overwritten.
collection.upsert([
{
"id": "prod-001",
"values": [0.12, -0.34, 0.56, ...], # Must match collection dimension
"metadata": {
"category": "electronics",
"price": 299.99,
"in_stock": True
}
},
{
"id": "prod-002",
"values": [0.78, 0.91, -0.23, ...],
"metadata": {
"category": "clothing",
"price": 49.99,
"in_stock": False
}
}
])
Fetch Vectors by ID
Retrieve vectors by their IDs. Useful for RAG source document retrieval and citations.
result = collection.fetch(ids=["prod-001", "prod-002"])
for vec in result.vectors:
    print(f"{vec.id}: {len(vec.values)} dimensions")
Delete Vectors
Delete vectors by their IDs. Essential for GDPR compliance and data lifecycle management.
result = collection.delete(ids=["prod-001"])
print(f"Deleted {result.deleted_count} vectors")
Querying
Find the most similar vectors to a given query vector. Queries are automatically routed to miners for lowest latency.
Basic Query
results = collection.query(
vector=[0.12, -0.34, 0.56, ...],
top_k=10
)
Query with Metadata Filter
results = collection.query(
vector=[0.12, -0.34, 0.56, ...],
top_k=5,
filter={
"category": "electronics",
"price": {"$lte": 500},
"in_stock": True
}
)
Query with include_values
Return the original vector values alongside scores. Useful for debugging and re-ranking.
results = collection.query(
vector=[0.12, -0.34, 0.56, ...],
top_k=5,
include_values=True # Returns original (deobfuscated) vectors
)
Filter Operators
| Operator | Description | Example |
|---|---|---|
| $eq | Equal to (default) | {"field": "value"} |
| $ne | Not equal to | {"field": {"$ne": "x"}} |
| $gt | Greater than | {"price": {"$gt": 100}} |
| $gte | Greater than or equal | {"price": {"$gte": 100}} |
| $lt | Less than | {"price": {"$lt": 50}} |
| $lte | Less than or equal | {"price": {"$lte": 50}} |
| $in | In array | {"cat": {"$in": ["a","b"]}} |
| $nin | Not in array | {"cat": {"$nin": ["x"]}} |
| $and | Logical AND | {"$and": [{...}, {...}]} |
| $or | Logical OR | {"$or": [{...}, {...}]} |
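To make the operator semantics concrete, here is a minimal client-side evaluator. It is illustrative only: REM applies filters server-side, and this sketch is not the production implementation:

```python
import operator

# Maps filter operators to comparison functions (sketch of the semantics).
OPS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda v, t: v in t,
    "$nin": lambda v, t: v not in t,
}

def matches(metadata: dict, filt: dict) -> bool:
    """Evaluate a REM-style metadata filter against one record."""
    for key, cond in filt.items():
        if key == "$and":
            if not all(matches(metadata, c) for c in cond):
                return False
        elif key == "$or":
            if not any(matches(metadata, c) for c in cond):
                return False
        elif isinstance(cond, dict):  # explicit operators
            if not all(OPS[op](metadata.get(key), t) for op, t in cond.items()):
                return False
        elif metadata.get(key) != cond:  # bare value = implicit $eq
            return False
    return True

meta = {"category": "electronics", "price": 299.99, "in_stock": True}
print(matches(meta, {"price": {"$lte": 500}, "in_stock": True}))   # → True
print(matches(meta, {"category": {"$in": ["clothing", "toys"]}}))  # → False
```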
Response Format
{
"matches": [
{
"id": "prod-001",
"score": 0.9542,
"metadata": {"category": "electronics", "price": 299.99}
},
{
"id": "prod-003",
"score": 0.8891,
"metadata": {"category": "electronics", "price": 149.99}
}
]
}
Hybrid Search
Combine vector similarity with BM25 keyword matching in a single query. Results from both rankings are merged using Reciprocal Rank Fusion (RRF).
Vector + Keyword Search
results = collection.query(
vector=embed("wireless headphones"),
query_text="noise cancelling", # BM25 keyword search
top_k=10
)
Controlling the Blend
Use hybrid_alpha to control the balance between vector and keyword results.
# Pure vector search (default)
results = collection.query(vector=embed("query"), hybrid_alpha=0.0)
# Balanced hybrid (50/50)
results = collection.query(
vector=embed("query"),
query_text="keywords",
hybrid_alpha=0.5
)
# Pure keyword search (no vector needed)
results = collection.query(
query_text="exact phrase match",
hybrid_alpha=1.0
)
| hybrid_alpha | Behavior | Best For |
|---|---|---|
| 0.0 | Pure vector similarity | Semantic search, similar items |
| 0.3 | Mostly vector, slight keyword boost | RAG with some keyword precision |
| 0.5 | Equal blend | General-purpose search |
| 0.7 | Mostly keyword, some semantic | Technical docs, code search |
| 1.0 | Pure BM25 keyword matching | Exact phrase matching |
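As a sketch of how such a blend can work: weighted Reciprocal Rank Fusion scores each document by its rank in both lists, with alpha shifting weight between them. The exact server-side formula is not documented here, so treat the `rrf_blend` helper and the conventional `k=60` smoothing constant as assumptions:

```python
def rrf_blend(vector_hits, keyword_hits, alpha=0.5, k=60):
    """Blend two ranked ID lists with weighted Reciprocal Rank Fusion.

    alpha=0.0 -> only the vector ranking counts; alpha=1.0 -> only BM25.
    Each hit contributes weight / (k + rank) to its document's score.
    """
    scores = {}
    for rank, doc_id in enumerate(vector_hits):
        scores[doc_id] = scores.get(doc_id, 0.0) + (1 - alpha) / (k + rank + 1)
    for rank, doc_id in enumerate(keyword_hits):
        scores[doc_id] = scores.get(doc_id, 0.0) + alpha / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# "a" ranks well in both lists, so it wins the fused ranking.
print(rrf_blend(["a", "b", "c"], ["c", "a", "d"], alpha=0.5))
# → ['a', 'c', 'b', 'd']
```

Because RRF uses ranks rather than raw scores, it needs no score normalization between the vector and BM25 result sets.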
Batch Query
Execute multiple queries in a single API call. Queries run in parallel on the server for maximum throughput. Up to 10 queries per batch.
results = collection.query_batch([
{
"vector": embed("wireless headphones"),
"top_k": 5,
"filter": {"category": "electronics"}
},
{
"vector": embed("running shoes"),
"top_k": 5,
"filter": {"category": "sports"}
},
{
"query_text": "bluetooth speaker",
"top_k": 3,
"hybrid_alpha": 1.0
}
])
# results is a list of QueryResult, one per query
for i, result in enumerate(results):
    print(f"Query {i}: {len(result.matches)} matches")
Use cases: Recommendation engines (multiple feeds in one call), AI agents (search across memory types), and any pipeline that needs parallel retrieval.
Encryption
REM provides two layers of data protection: per-field metadata encryption with AES-256-GCM and vector obfuscation using distance-preserving transformations.
Encrypted Metadata Fields
Specify which metadata fields should be encrypted. Encrypted fields are invisible to miners but are automatically decrypted when you query.
# Create collection with encrypted fields
collection = client.create_collection(
name="user-data",
dimension=384,
encrypted_fields=["email", "phone", "ssn"] # AES-256-GCM
)
# Upsert — encrypted fields are handled automatically
collection.upsert([{
"id": "user-1",
"values": embed("John Doe profile"),
"metadata": {
"name": "John Doe", # Stored as plaintext (filterable)
"email": "john@example.com", # Encrypted on miners
"phone": "+1-555-0100", # Encrypted on miners
"ssn": "123-45-6789", # Encrypted on miners
"role": "admin" # Stored as plaintext (filterable)
}
}])
# Query — encrypted fields are auto-decrypted in results
results = collection.query(
vector=embed("admin users"),
filter={"role": "admin"}, # Only plaintext fields can be filtered
top_k=10
)
# results include decrypted email, phone, ssn
Vector Obfuscation
All vectors are automatically obfuscated before being sent to miners using distance-preserving transformations (permutation + sign flip). This prevents miners from reconstructing your original embeddings while preserving similarity search accuracy.
How it works: Per-namespace keys generate deterministic permutation and sign-flip seeds. Cosine, Euclidean, and dot-product distances are perfectly preserved — search quality is identical to unobfuscated vectors.
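A toy version of this scheme shows why search quality is unaffected: a seeded permutation plus sign flip reorders and negates coordinates, which leaves dot products (and therefore cosine similarity) unchanged. The `obfuscate` helper below is illustrative only, not REM's actual key derivation:

```python
import random

def obfuscate(vec, seed):
    """Apply a seeded permutation and sign flip (distance-preserving)."""
    rng = random.Random(seed)  # deterministic: same seed -> same transform
    perm = list(range(len(vec)))
    rng.shuffle(perm)
    signs = [rng.choice((-1.0, 1.0)) for _ in vec]
    return [vec[p] * s for p, s in zip(perm, signs)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

a = [0.1, -0.4, 0.3, 0.8]
b = [0.5, 0.2, -0.7, 0.1]
# Both vectors in a namespace get the same seed, so the transform cancels
# out in every pairwise dot product: each term becomes a[j]*b[j]*s^2.
oa, ob = obfuscate(a, seed=42), obfuscate(b, seed=42)
print(abs(dot(a, b) - dot(oa, ob)) < 1e-9)  # → True
```

A miner holding only `oa` sees the original magnitudes in a scrambled, sign-flipped order, but relative similarities between vectors in the namespace are exactly preserved.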
Framework Integrations
Native integrations with popular AI frameworks. Drop-in vector stores that work with your existing pipelines.
LangChain
Drop-in VectorStore for LangChain RAG pipelines.
pip install rem-vectordb[langchain]
from rem.integrations.langchain import REMVectorStore
from langchain_openai import OpenAIEmbeddings
# Create vector store
store = REMVectorStore(
api_key="rem_xxx",
collection_name="docs",
embedding=OpenAIEmbeddings()
)
# Add documents
store.add_texts(
texts=["REM is a decentralized vector database", "Powered by 2000+ miners"],
metadatas=[{"source": "docs"}, {"source": "marketing"}]
)
# Similarity search
results = store.similarity_search("What is REM?", k=5)
for doc in results:
    print(doc.page_content)
# Search with scores
results_with_scores = store.similarity_search_with_score("What is REM?", k=5)
for doc, score in results_with_scores:
    print(f"{score:.4f}: {doc.page_content}")
# Search with metadata filter
results = store.similarity_search(
"vector database",
k=5,
filter={"source": "docs"}
)
# Delete documents
store.delete(ids=["doc-id-1", "doc-id-2"])
# Create from texts (one-liner)
store = REMVectorStore.from_texts(
texts=["Hello", "World"],
embedding=OpenAIEmbeddings(),
api_key="rem_xxx",
collection_name="quickstart"
)
LlamaIndex
Native BasePydanticVectorStore for LlamaIndex index pipelines.
pip install rem-vectordb[llamaindex]
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from rem.integrations.llamaindex import REMVectorStore
# Create vector store
vector_store = REMVectorStore(
api_key="rem_xxx",
collection_name="docs",
dimension=1536
)
# Build index from documents
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(
documents,
vector_store=vector_store
)
# Query
query_engine = index.as_query_engine()
response = query_engine.query("What is REM Network?")
print(response)
# Use existing index
index = VectorStoreIndex.from_vector_store(vector_store)
retriever = index.as_retriever(similarity_top_k=5)
nodes = retriever.retrieve("vector database")
Python SDK Reference
Full reference for the rem-vectordb Python package.
Installation
pip install rem-vectordb # Core SDK
pip install rem-vectordb[langchain] # + LangChain integration
pip install rem-vectordb[llamaindex] # + LlamaIndex integration
Client
from rem import REM
client = REM(
api_key="rem_xxx", # Required
base_url="https://api.getrem.online", # Optional (default)
timeout=30, # Request timeout in seconds
)
Async Client
from rem import AsyncREM
async with AsyncREM(api_key="rem_xxx") as client:
    collection = await client.create_collection("docs", dimension=384)
    await collection.upsert([...])
    results = await collection.query(vector=[...], top_k=10)
Client Methods
| Method | Description |
|---|---|
| create_collection(name, dimension, metric?, encrypted_fields?) | Create a new collection. Returns a Collection object. |
| get_collection(id) | Get an existing collection by ID. |
| list_collections() | List all collections in the namespace. |
| delete_collection(id) | Delete a collection and all its vectors. |
Collection Methods
| Method | Description |
|---|---|
| upsert(vectors) | Insert or update vectors. Accepts a list of dicts or Vector objects. |
| query(vector?, top_k, filter?, query_text?, hybrid_alpha?, include_values?) | Search for similar vectors. Supports hybrid search and metadata filtering. |
| query_batch(queries) | Execute multiple queries in parallel (max 10 per batch). |
| fetch(ids) | Fetch vectors by ID. Returns a FetchResult with vector values. |
| delete(ids) | Delete vectors by ID. Returns a DeleteResult with the deleted count. |
| stats() | Get collection statistics (vector count, storage, etc.). |
REST API Reference
Base URL: https://api.getrem.online/v1
All requests must include the X-API-Key header.
| Method | Endpoint | Description | Example Body |
|---|---|---|---|
| POST | /collections | Create a collection | {"name": "products", "dimension": 384, "metric": "cosine", "encrypted_fields": ["email"]} |
| GET | /collections | List collections | |
| GET | /collections/{id} | Get collection details | |
| DELETE | /collections/{id} | Delete a collection | |
| POST | /collections/{id}/vectors/upsert | Upsert vectors | {"vectors": [{"id": "v1", "values": [...], "metadata": {...}}]} |
| POST | /collections/{id}/vectors/query | Query vectors | {"vector": [...], "top_k": 10, "filter": {...}, "query_text": "keywords", "hybrid_alpha": 0.5, "include_values": false} |
| POST | /collections/{id}/vectors/query/batch | Batch query (max 10) | {"queries": [{"vector": [...], "top_k": 5}, {"query_text": "...", "top_k": 3}]} |
| POST | /collections/{id}/vectors/fetch | Fetch vectors by ID | {"ids": ["v1", "v2"]} |
| POST | /collections/{id}/vectors/delete | Delete vectors | {"ids": ["v1"]} |
Rate Limits
| Plan | Requests/min | Max Vectors/Upsert | Batch Queries |
|---|---|---|---|
| Free | 60 | 100 | 10 per call |
| Pro | 600 | 1,000 | 10 per call |
| Business | 6,000 | 10,000 | 10 per call |
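When a burst of requests exceeds your plan's per-minute limit, a client-side retry with exponential backoff is a common pattern. The sketch below catches a generic exception; in practice, substitute whatever rate-limit error the SDK actually raises (the `with_backoff` helper is illustrative, not part of the SDK):

```python
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry a callable with exponential backoff, e.g. after HTTP 429."""
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:  # in practice, catch the SDK's rate-limit error only
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...

# Example with a stub that fails twice, then succeeds.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return "ok"

print(with_backoff(flaky, base_delay=0.01))  # → ok
```

Batching upserts up to your plan's Max Vectors/Upsert limit also reduces request count and makes the rate limit easier to stay under.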
Need help? Join our Discord or email support@getrem.online