Corey's Bank
Testing Inferencing Platforms
This report details the comprehensive testing of various inference platforms, evaluating their usability, completeness, and overall performance for our banking operations. The goal is to identify the most robust and efficient solutions for our AI-driven analytical needs.
For detailed performance metrics, please refer to the following assessments:
- Performance Report Long Prompt -- A long prompt and long token response
- Performance Report Medium Prompt -- A medium prompt and small token response
- Performance Report Small Prompt -- A small prompt and short response
Upload Your Spending Data
(xlsx file)
Fireworks Text Models
The below options will execute a prompt against the submitted excel (or default data) offering financial analysis.
The first is Illama 405b. The second is a Fine-Tuned 70b using a set of example financial plans. The third uses Reserved capacity on Fireworks (slow cold-start but faster once deployed on dedicated GPUs)
Weave Inference Text Models
The below options will execute a prompt against the submitted excel (or default data) offering financial analysis.
The first is Illama. The second is DeepSeek.
Chatting using RAG
(Contextual and AWS Bedrock)
The default contextual agent has been designed using RAG of sample Q1 2023 financial results from Microsoft, Google, Apple, Intel, and Meta. The user specific agent uses RAG on the uploaded excel. The AWS chat uses RAG on the uploaded excel (but has a small delay once uploaded). Contextual agents are multiturn so you can ask follow-up questions.