Scoring Leads
After a conversation with a user finishes, the system generates a report on the lead’s engagement and quality. Engagement is rated high, medium, or low, while lead quality is rated hot, warm, or cold. The NineTwoThree team also created a place for the Protect Line team to rate each conversation from their own perspective, so that their feedback can help improve the chatbot.
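To make the shape of such a report concrete, here is a minimal sketch of how it could be represented. The class and field names, and the 1 to 5 reviewer scale, are illustrative assumptions and not the actual NineTwoThree or Protect Line schema; only the two rating scales come from the description above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Engagement(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


class Quality(Enum):
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"


@dataclass
class LeadReport:
    """Hypothetical post-conversation report; all field names are assumptions."""
    conversation_id: str
    engagement: Engagement
    quality: Quality
    reviewer_rating: Optional[int] = None  # assumed 1-5 scale for human feedback
    reviewer_comment: str = ""

    def add_review(self, rating: int, comment: str = "") -> None:
        """Attach a human reviewer's rating of the conversation."""
        if not 1 <= rating <= 5:
            raise ValueError("rating must be between 1 and 5")
        self.reviewer_rating = rating
        self.reviewer_comment = comment


report = LeadReport("conv-001", Engagement.HIGH, Quality.WARM)
report.add_review(4, "Good qualification questions, slightly long replies.")
```

Keeping the automated scores and the human review in one record would make it straightforward to compare the two when tuning the chatbot.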
To score leads properly, the NineTwoThree team developed an evaluation suite for in-depth analysis. We created custom prompts for each of the following aspects of the LLM's output:
Conciseness: Evaluating if the responses are succinct yet informative.
Emotional intelligence: Assessing the chatbot's ability to recognize and respond appropriately to user emotions.
Coherence: Verifying the truthfulness and accuracy of the information provided in responses by checking it against known ground truth.
Latency: Ensuring the chatbot responds within a reasonable timeframe.
Response price and token usage: Making sure the chatbot’s cost and token usage stay within budget.
Fluency: Verifying the chatbot speaks as naturally as possible for a positive user experience.
The NineTwoThree team worked iteratively on these prompts and ran multiple experiments to achieve the best results.
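An evaluation suite along these lines could be sketched as follows. This is a minimal illustration under stated assumptions, not the team's implementation: the prompt wording, the 1 to 5 scale, the `judge` callable (a stand-in for whatever LLM call scores a prompt), and the latency and token thresholds are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical prompt templates, one per qualitative aspect listed above.
ASPECT_PROMPTS: Dict[str, str] = {
    "conciseness": "Rate 1-5 how succinct yet informative this reply is:\n{reply}",
    "emotional_intelligence": "Rate 1-5 how well this reply recognizes and responds to the user's emotions:\n{reply}",
    "coherence": "Rate 1-5 how accurate this reply is given the reference:\nReference: {reference}\nReply: {reply}",
    "fluency": "Rate 1-5 how natural this reply sounds:\n{reply}",
}


def score_reply(reply: str, reference: str, judge: Callable[[str], int]) -> Dict[str, int]:
    """Fill in each aspect prompt and ask the judge for a score.

    `judge` is an assumed callable that sends a prompt to an LLM and
    returns an integer score; no specific provider API is implied.
    """
    return {
        aspect: judge(template.format(reply=reply, reference=reference))
        for aspect, template in ASPECT_PROMPTS.items()
    }


def within_budget(latency_s: float, total_tokens: int,
                  max_latency_s: float = 5.0, max_tokens: int = 1000) -> Dict[str, bool]:
    """Check the measurable aspects directly; thresholds are placeholders."""
    return {
        "latency": latency_s <= max_latency_s,
        "token_usage": total_tokens <= max_tokens,
    }


# Usage with a stub judge that always returns 3, for demonstration only.
scores = score_reply("We can call you tomorrow at 10am.", "Callbacks are offered on weekdays.", lambda prompt: 3)
budget = within_budget(latency_s=1.8, total_tokens=420)
```

Passing the judge in as a function keeps the sketch independent of any model provider and makes it easy to rerun the same prompts across experiments while the wording is iterated on.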