Before you begin
This test requires a Pinecone account on the Standard or Enterprise plan.
- New users can sign up for the Standard trial, which lasts 21 days and includes $300 in credits, more than enough to cover the costs of this test.
- Existing users on the Starter plan can upgrade.
1. Understand the test
This test simulates a production-scale dataset and workload, measuring import time, query throughput, query latency, and associated costs.
Dataset
- Records: 10 million records from the Amazon Reviews 2023 dataset
- Embedding model: llama-text-embed-v2 (1024 dimensions)
- Similarity metric: cosine
- Total size: 48.8 GB
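As a rough sanity check on the total size, the sketch below estimates how much of the 48.8 GB is raw vector data. It assumes 4-byte (float32) vector components and decimal gigabytes (10^9 bytes). Neither is stated in this guide, so treat the split as approximate.

```python
# Rough size breakdown for the test dataset.
# Assumptions (not stated in this guide): float32 components, 1 GB = 10**9 bytes.
RECORDS = 10_000_000
DIMENSIONS = 1024
BYTES_PER_COMPONENT = 4          # assumed float32
TOTAL_GB = 48.8

vector_gb = RECORDS * DIMENSIONS * BYTES_PER_COMPONENT / 10**9
other_gb = TOTAL_GB - vector_gb  # IDs, metadata, and any other overhead
bytes_per_record = TOTAL_GB * 10**9 / RECORDS

print(f"Vector data:     {vector_gb:.2f} GB")
print(f"Everything else: {other_gb:.2f} GB")
print(f"Average record:  {bytes_per_record:.0f} bytes")
```

Under these assumptions, most of the dataset is vector data, and the remainder is record IDs and metadata.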
Workload
- Query load: 10 queries per second (QPS)
- Concurrent users: 10 users querying simultaneously
- Test duration: 1000 queries
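The workload numbers imply a short run. The sketch below works out the run length and the per-user load, assuming the load is split evenly across users (an assumption, since the load generator may distribute it differently). It also applies Little's law to estimate how many requests would be in flight if every query took a full 100 ms.

```python
# Workload arithmetic for the benchmark run.
TOTAL_QUERIES = 1000
TARGET_QPS = 10
USERS = 10
ASSUMED_LATENCY_SECONDS = 0.100  # worst case: every query takes 100 ms

run_seconds = TOTAL_QUERIES / TARGET_QPS       # time needed to issue all queries
per_user_qps = TARGET_QPS / USERS              # assumes an even split across users
per_user_queries = TOTAL_QUERIES // USERS      # assumes an even split across users

# Little's law: average in-flight requests = arrival rate * time in system.
in_flight = TARGET_QPS * ASSUMED_LATENCY_SECONDS

print(f"Run length:       {run_seconds:.0f} s")
print(f"Per-user load:    {per_user_qps:.1f} QPS, {per_user_queries} queries")
print(f"In-flight (avg):  {in_flight:.1f} requests")
```

With about one request in flight on average at the target rate, 10 concurrent users leave ample headroom to sustain 10 QPS.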
Success criteria
This test aims to verify the following success criteria:
- Import time: less than 30 minutes
- Query latency: p90 latency of less than 100 ms
2. Get an API key
Create a new API key in the Pinecone console.
3. Install an SDK
Install the Python SDK:
4. Create an index
Create an on-demand index that matches the dimensions and similarity metric of the dataset. Choose a cloud provider that you have access to, because you’ll need to provision a VM in the same region as your index to run the benchmark.
Python
5. Import the dataset
Pinecone’s import feature enables you to load millions of vectors from object storage in parallel. Use the import feature to load 10 million product records into a single namespace within your index.
1
Start bulk import
Python
2
Monitor import progress
To track progress, check the status bar in the Pinecone console or use the describe import operation with the import ID:
Python
The amount of time required for an import depends on various factors, including dimensionality and metadata complexity. For this dataset, the import should take less than 30 minutes.
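If you prefer to script the wait instead of watching the console, a generic polling loop is one option. The sketch below is not Pinecone SDK code: `get_status` is a hypothetical callable that you would implement around the describe import operation, and the status names in `TERMINAL_STATES` are assumptions to check against the actual API response.

```python
import time

# Assumed status names; verify them against the describe import response.
TERMINAL_STATES = {"Completed", "Failed", "Cancelled"}


def wait_for_import(get_status, timeout_seconds=30 * 60, poll_seconds=30,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns a terminal state or the timeout passes.

    get_status is a hypothetical callable returning the import status string.
    Returns (status, elapsed_seconds). "TimedOut" is a local sentinel used by
    this sketch, not a status reported by the service.
    """
    start = clock()
    while True:
        status = get_status()
        elapsed = clock() - start
        if status in TERMINAL_STATES:
            return status, elapsed
        if elapsed >= timeout_seconds:
            return "TimedOut", elapsed
        sleep(poll_seconds)
```

The default timeout matches the 30-minute import target, so a "TimedOut" result means the import did not meet the success criterion. The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.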
6. Run the benchmark
You’ll use the Vector Search Bench (VSB) tool to simulate realistic query patterns and measure latency and throughput.
1
Provision a VM
The VSB tool reports latency as the time from when the tool issues a query to when Pinecone returns the result. To minimize the client-side latency between the tool and Pinecone, run the test on a dedicated VM on the same cloud provider and in the same region as your Pinecone index. This reduces the client-side latency to the sub-millisecond range.
See the cloud provider’s documentation for instructions on how to provision a VM instance.
Be sure to create the VM instance in the same region as your Pinecone index. If you don’t, the client-side latency will be higher and you won’t get an accurate sense of Pinecone’s performance.
2
Connect to the VM
Connect to the VM using the cloud provider’s console.
3
Install Vector Search Bench (VSB)
Clone the VSB repository and use Poetry to install the dependencies:
Terminal
4
Install dependencies
Use Poetry to install the dependencies:
Terminal
5
Benchmark Pinecone
Use VSB to simulate 10 concurrent users issuing a total of 1000 queries at 10 queries per second (QPS):
Terminal
By default, VSB populates the target index with a dataset. In this case, you’ve already done that, so --skip-populate makes sure VSB skips the population phase.
7. Analyze performance
At the end of the run, VSB prints an operation summary including the requests per second achieved and latencies at different percentiles. Here’s an example output:
Terminal
For more detail, see the stats.json file identified in the output.
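To make the percentile figures concrete, the sketch below computes nearest-rank percentiles from a list of latencies and checks the p90 target of 100 ms. The latency values are invented for illustration and are not VSB output. VSB may also use a different percentile method, so small differences from its reported numbers are possible.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Illustrative latencies in milliseconds (made up, not measured results).
latencies_ms = [12, 15, 18, 20, 22, 25, 30, 45, 80, 140]

p50 = percentile(latencies_ms, 50)
p90 = percentile(latencies_ms, 90)
meets_target = p90 < 100

print(f"p50={p50} ms, p90={p90} ms, meets p90 target: {meets_target}")
```

Note that a single slow outlier (140 ms here) does not fail the p90 criterion, because p90 ignores the slowest 10% of queries.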
8. Check costs
You can check the costs for the import, queries, and storage in the Pinecone console at Settings > Usage. Cost data can be delayed by up to three days, but once it’s available, compare the actual costs to the estimated costs below.
For the latest pricing details, see Pricing.
1
Import costs
The current price for import is $1/GB. The dataset size for this test is 48.8 GB, so the import cost should be $48.80.
2
Query costs
A query uses 1 read unit (RU) for every 1 GB of namespace size, and the current price for queries in the us-east-1 region of AWS is $16 per 1 million read units.
This test ran 1000 queries against a namespace size of 48.8 GB, so each query uses 48.8 RUs and the test uses 48,800 RUs in total. The query cost should be 48,800 RUs * $16 / 1,000,000 RUs = $0.7808.
3
Storage costs
The current price for storage is $0.33 per GB per month. The dataset size for this test is 48.8 GB. To estimate the storage cost for one hour: $0.33/GB/month * 48.8 GB / 730 hours = $0.022/hour.
4
Total costs
The total cost for the test, assuming one hour of storage, is the sum of the import cost, query cost, and storage cost: $48.80 + $0.7808 + $0.022 = $49.6028.
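The cost arithmetic can be reproduced with a short script, which is handy if you change the dataset size, query count, or region. This sketch uses the prices quoted in this guide and assumes one hour of storage. It multiplies the per-query read units (1 RU per GB of namespace size) by the 1000 queries in the run.

```python
# Cost estimate using the prices quoted in this guide (us-east-1 on AWS).
DATASET_GB = 48.8
QUERIES = 1000
IMPORT_PRICE_PER_GB = 1.00
PRICE_PER_MILLION_RU = 16.00
STORAGE_PRICE_PER_GB_MONTH = 0.33
HOURS_PER_MONTH = 730
STORAGE_HOURS = 1                # assumed: the index exists for one hour

import_cost = DATASET_GB * IMPORT_PRICE_PER_GB
read_units = QUERIES * DATASET_GB                      # 1 RU per GB, per query
query_cost = read_units * PRICE_PER_MILLION_RU / 1_000_000
storage_cost = (STORAGE_PRICE_PER_GB_MONTH * DATASET_GB
                / HOURS_PER_MONTH * STORAGE_HOURS)
total_cost = import_cost + query_cost + storage_cost

print(f"Import:  ${import_cost:.2f}")
print(f"Queries: ${query_cost:.4f} ({read_units:,.0f} RUs)")
print(f"Storage: ${storage_cost:.3f}")
print(f"Total:   ${total_cost:.2f}")
```

The import dominates the bill. Queries and an hour of storage together add less than a dollar.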