
Cost Reduction for Ask Astro #295

Closed
davidgxue opened this issue Feb 12, 2024 · 2 comments · Fixed by #307
davidgxue commented Feb 12, 2024

Main Issues

  1. Reduce the reranker's output from 10 documents to 8
  2. Clean up the data processing steps during ingestion to enforce a smaller document length at request time
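The first change amounts to a one-line configuration tweak. A minimal sketch, assuming a simple score-based reranker; the names (`rerank`, `RERANK_TOP_N`) are illustrative, not Ask Astro's actual code:

```python
# Hypothetical sketch: cap the reranker's output at 8 documents instead of 10.
RERANK_TOP_N = 8  # previously 10

def rerank(scored_docs, top_n=RERANK_TOP_N):
    """Return the top_n documents, sorted by descending relevance score."""
    ranked = sorted(scored_docs, key=lambda d: d["score"], reverse=True)
    return ranked[:top_n]

# Example: 12 candidate documents in, 8 out.
docs = [{"id": i, "score": i / 12} for i in range(12)]
top = rerank(docs)
print(len(top))  # 8
```

Fewer documents passed to the LLM means fewer prompt tokens per request, which is where the cost saving comes from.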

Other Potential Areas

Migrate to OpenAI's newer GPT-3.5 models, which are roughly 5x cheaper.

OpenAI (the 0125 model is the same as the 16k model in Azure):
[screenshot: OpenAI GPT-3.5 pricing]

Azure OpenAI:
[screenshot: Azure OpenAI pricing]

Migrate to OpenAI's new embedding model (5x cheaper embedding cost)

From OpenAI: "

Reduced price

text-embedding-3-small is also substantially more efficient than our previous generation text-embedding-ada-002 model.
Pricing for text-embedding-3-small has therefore been reduced by 5X compared to text-embedding-ada-002, from a price per 1k tokens of $0.0001 to $0.00002.
"
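The quoted pricing can be sanity-checked with a quick back-of-envelope calculation. The 100M-token corpus size below is an arbitrary example, not Ask Astro's actual volume:

```python
# Per-1k-token prices quoted by OpenAI above.
ADA_002_PER_1K = 0.0001    # text-embedding-ada-002
SMALL_3_PER_1K = 0.00002   # text-embedding-3-small

def embedding_cost(tokens, price_per_1k):
    """Total embedding cost in dollars for a given token count."""
    return tokens / 1000 * price_per_1k

tokens = 100_000_000  # e.g. re-embedding a 100M-token corpus (illustrative)
old = embedding_cost(tokens, ADA_002_PER_1K)   # $10.00
new = embedding_cost(tokens, SMALL_3_PER_1K)   # $2.00
print(old / new)  # 5.0
```

This confirms the claimed 5x reduction in embedding cost.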

Already tracked by #286

@davidgxue davidgxue self-assigned this Feb 13, 2024
@davidgxue davidgxue added this to the 0.3.0 milestone Feb 13, 2024
@davidgxue davidgxue modified the milestones: 0.3.0, 0.4.0 Feb 23, 2024
@davidgxue davidgxue modified the milestones: 0.4.0, 0.3.0 Feb 28, 2024
davidgxue added a commit that referenced this issue Feb 28, 2024
### Description
- [Original issue discussed here](#286)
- [Indirectly related issue about cost reduction (new model 5x cheaper but better accuracy)](#295)


### Technical Changes
- Only `schema.json` changes; this updates the vectorizer used during both ingestion and retrieval
- On ingestion, a new index is created if one doesn't already exist. During retrieval, that index's vectorizer is used to vectorize the user query
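For illustration, a vectorizer switch of this kind in a Weaviate-style `schema.json` might look roughly like the fragment below. The class name and exact module keys are assumptions, not the actual Ask Astro schema:

```json
{
  "classes": [
    {
      "class": "Docs",
      "vectorizer": "text2vec-openai",
      "moduleConfig": {
        "text2vec-openai": {
          "model": "text-embedding-3-small"
        }
      }
    }
  ]
}
```

Because the vectorizer is attached to the index, both ingestion and query-time vectorization pick up the new model from the same place.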

### Tests & Evaluation
- No significant difference; small quality improvements for some questions, and no quality degradation
- The model is far cheaper than the original ada-002 (V2) model
- See details here: #297 (comment)


### Related Issues
closes #286 
partially completes #295
@davidgxue (Contributor, Author) commented:

Decided to implement bullet points 1 and 2 plus the embedding model migration. After evaluation, the GPT-3.5 portion of the cost is small enough that migrating wouldn't make an immediate impact.

@davidgxue (Contributor, Author) commented:

It may be important to also explore this: #300
