Corey's Bank

Testing Inferencing Platforms

This report details the comprehensive testing of various inference platforms, evaluating their usability, completeness, and overall performance for our banking operations. The goal is to identify the most robust and efficient solutions for our AI-driven analytical needs.

For detailed performance metrics, please refer to the following assessments:

Performance Report Long Prompt -- A long prompt and a long response
Performance Report Medium Prompt -- A medium prompt and a short response
Performance Report Small Prompt -- A small prompt and a short response

Fireworks Text Models

The options below run a prompt against the uploaded Excel file (or the default data) and return a financial analysis.

The first is Llama 405B. The second is a fine-tuned 70B model trained on a set of example financial plans. The third uses reserved capacity on Fireworks (slow cold start, but faster once deployed on dedicated GPUs).
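The cold-start trade-off of the reserved-capacity option can be illustrated by timing the first call separately from later ones. The following is only a sketch: `measure_cold_and_warm` and `FakeDeployment` are hypothetical names, and the fake deployment merely sleeps on its first call to mimic a cold start. It does not call Fireworks or any real platform.

```python
import time


def measure_cold_and_warm(generate, prompt, warm_runs=3):
    """Time the first call (cold) and the average of later calls (warm), in seconds."""
    start = time.perf_counter()
    generate(prompt)
    cold = time.perf_counter() - start

    warm = []
    for _ in range(warm_runs):
        start = time.perf_counter()
        generate(prompt)
        warm.append(time.perf_counter() - start)
    return cold, sum(warm) / len(warm)


class FakeDeployment:
    """Hypothetical stand-in for a model endpoint; not a real client."""

    def __init__(self):
        self.loaded = False

    def __call__(self, prompt):
        if not self.loaded:
            time.sleep(0.05)  # assumed delay standing in for loading onto GPUs
            self.loaded = True
        return f"analysis of {len(prompt)} characters"


cold, warm = measure_cold_and_warm(FakeDeployment(), "Summarize my spending.")
print(f"cold: {cold:.4f}s, warm average: {warm:.4f}s")
```

To time a real endpoint, `FakeDeployment()` would be replaced with any callable that sends the prompt to that endpoint and returns its response.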

Chatting using RAG

(Contextual and AWS Bedrock)

The default Contextual agent uses RAG over sample Q1 2023 financial results from Microsoft, Google, Apple, Intel, and Meta. The user-specific agent uses RAG over the uploaded Excel file. The AWS chat also uses RAG over the uploaded Excel file (with a short delay after upload). Contextual agents are multi-turn, so you can ask follow-up questions.
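As a rough illustration of what RAG over spreadsheet rows with multi-turn history involves, here is a minimal sketch. It is not how Contextual or AWS Bedrock implement retrieval: it assumes each spreadsheet row has already been flattened to a line of text, substitutes simple word-overlap cosine similarity for a real embedding model, and uses hypothetical names (`retrieve`, `build_prompt`) and made-up sample rows.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = Counter(tokenize(question))
    return sorted(
        chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True
    )[:k]


def build_prompt(question, chunks, history=()):
    """Combine retrieved context, earlier turns, and the new question."""
    lines = ["Context:"]
    lines += [f"- {c}" for c in retrieve(question, chunks)]
    lines += [f"{role}: {text}" for role, text in history]
    lines.append(f"user: {question}")
    return "\n".join(lines)


# Made-up sample rows standing in for an uploaded spending sheet.
rows = [
    "2024-01-03 groceries 82.15",
    "2024-01-05 rent 1500.00",
    "2024-01-09 restaurants 46.80",
]
history = [("user", "What is in my sheet?"), ("assistant", "Three expenses.")]
print(build_prompt("How much did I spend on rent?", rows, history))
```

Passing earlier turns in `history` is what lets a follow-up question be answered in context; the resulting prompt would then be sent to whichever model is being tested.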

Upload Your Spending Data

(xlsx file)

Fireworks Text Models

Weave Inference Text Models

Chatting using RAG

(Contextual and AWS Bedrock)