Cost Reduction for Ask Astro #295
davidgxue added a commit that referenced this issue on Feb 28, 2024
### Description
- [Original issue discussed here](#286)
- [Indirectly related issue about cost reduction (new model 5x cheaper but better accuracy)](#295)

### Technical Changes
- Only `schema.json` changes. This changes the vectorizer used during both ingestion and retrieval.
- On ingestion, a new index is created if it doesn't already exist. During retrieval, this index's vectorizer is used to vectorize the user query.

### Tests & Evaluation
- No significant difference. Small quality improvements for some questions. No quality degradation.
- The model is much cheaper than the original V2 ada model.
- See details here: #297 (comment)

### Related Issues
closes #286
partially completes #295
Decided to implement bullet points 1 and 2 and the embedding model migration. After evaluation, the GPT-3.5 costs are so small that reducing them wouldn't make an immediate impact.
It may also be important to explore this: #300
Main Issues
Other Potential Areas
Migrate to OpenAI's newer GPT-3.5 models, which are 5X cheaper.
OpenAI
The `0125` model is the same as the 16k model in Azure.
Azure OpenAI
Migrate to OpenAI's new embedding model (5X cheaper embedding cost)
From OpenAI: "Reduced price. `text-embedding-3-small` is also substantially more efficient than our previous generation `text-embedding-ada-002` model. Pricing for `text-embedding-3-small` has therefore been reduced by 5X compared to `text-embedding-ada-002`, from a price per 1k tokens of $0.0001 to $0.00002."
Already tracked by #286
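
To make the 5X figure concrete, here is a minimal sketch of the arithmetic using the two per-1k-token prices quoted from OpenAI. The token volume is a made-up illustration, not a measured Ask Astro number, and `embedding_cost` is a hypothetical helper, not part of the project.

```python
# Prices per 1k tokens, taken from the OpenAI quote above.
ADA_002_PRICE_PER_1K = 0.0001
EMBEDDING_3_SMALL_PRICE_PER_1K = 0.00002


def embedding_cost(num_tokens: int, price_per_1k: float) -> float:
    """Return the embedding cost in USD for a given number of tokens."""
    return num_tokens / 1000 * price_per_1k


# Hypothetical ingestion volume; not a measured Ask Astro figure.
example_tokens = 10_000_000

old_cost = embedding_cost(example_tokens, ADA_002_PRICE_PER_1K)
new_cost = embedding_cost(example_tokens, EMBEDDING_3_SMALL_PRICE_PER_1K)

print(f"text-embedding-ada-002: ${old_cost:.2f}")
print(f"text-embedding-3-small: ${new_cost:.2f}")
print(f"cost ratio: {old_cost / new_cost:.1f}x")
```

Swapping in the real number of tokens embedded per ingestion run would give an estimate of the actual savings.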