Step 6. Iteratively implement & evaluate quality fixes

POC workflow diagram, iteration step

Requirements

  1. Based on your root cause analysis, you have identified a potential fix to either retrieval or generation to implement and evaluate.
  2. Your POC application (or another baseline chain) is logged to an MLflow run with an Agent Evaluation evaluation stored in the same run.

See the GitHub repository for the sample code in this section.

Expected outcome

GIF to demonstrate an MLflow evaluation agent

Instructions

For every type of fix, use the B_quality_iteration/02_evaluate_fixes notebook to evaluate the resulting chain against your baseline configuration (your POC) and pick a “winner”. The notebook helps you select the winning experiment and deploy it to the review app or to a production-ready, scalable REST API.
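To make the idea of picking a “winner” concrete, here is a minimal sketch of one possible selection rule. It is not the notebook's actual logic: the `RunMetrics` class, the `pick_winner` function, the metric names, and all of the numbers are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class RunMetrics:
    """Hypothetical summary of one evaluated chain (all fields are assumptions)."""
    name: str
    quality: float    # e.g. fraction of answers judged correct; higher is better
    cost: float       # e.g. average tokens per request; lower is better
    latency_s: float  # average seconds per request; lower is better


def pick_winner(baseline, candidates, min_quality_gain=0.0):
    """Return the best run, or the baseline if no candidate improves quality."""
    better = [
        c for c in candidates
        if c.quality - baseline.quality > min_quality_gain
    ]
    if not better:
        return baseline
    # Highest quality first; ties are broken by lower cost, then lower latency.
    return min(better, key=lambda r: (-r.quality, r.cost, r.latency_s))


# Made-up numbers, for illustration only.
baseline = RunMetrics("poc_baseline", quality=0.62, cost=1800, latency_s=4.1)
candidates = [
    RunMetrics("config_fix_top_k_8", quality=0.70, cost=2400, latency_s=4.9),
    RunMetrics("code_fix_reranker", quality=0.70, cost=2100, latency_s=5.3),
    RunMetrics("config_fix_shorter_prompt", quality=0.60, cost=1500, latency_s=3.6),
]

winner = pick_winner(baseline, candidates)
print(winner.name)  # code_fix_reranker
```

In this sketch quality is the primary criterion, and cost and latency only break ties. A team with a strict latency budget might instead filter out candidates above a latency threshold before ranking.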

  1. Open the B_quality_iteration/02_evaluate_fixes notebook.
  2. Based on the type of fix you are implementing:
    • For data pipeline fixes:
      • Continue with Step 6 (pipelines). Implement data pipeline fixes.
    • For chain configuration fixes:
      • Follow the instructions in the Chain configuration section of the 02_evaluate_fixes notebook to add chain configuration fixes to the CHAIN_CONFIG_FIXES variable.
    • For chain code fixes:
      • Create a modified chain code file and save it to the B_quality_iteration/chain_code_fixes folder. Alternatively, select one of the provided chain code fixes from that folder.
      • Follow the instructions in the Chain code section of the 02_evaluate_fixes notebook to add the chain code file and any additional chain configuration that is required to the CHAIN_CODE_FIXES variable.
  3. Run the notebook from the Run evaluation cell. The notebook then:
    • Evaluates each fix.
    • Determines the fix with the best quality/cost/latency metrics.
    • Deploys the best one to the review app and a production-ready REST API to get stakeholder feedback.
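One way to think about a chain configuration fix is as a small set of overrides layered on top of the baseline configuration. The sketch below illustrates that idea under stated assumptions: the structure of `config_fixes`, the keys in `baseline_config`, and the `apply_overrides` helper are all hypothetical, and the actual layout of the CHAIN_CONFIG_FIXES variable in the notebook may differ.

```python
import copy


def apply_overrides(base, overrides):
    """Return a new dict: a deep copy of `base` with `overrides` merged in recursively."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections key by key instead of replacing them wholesale.
            merged[key] = apply_overrides(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical baseline configuration (keys and values are assumptions).
baseline_config = {
    "retriever": {"k": 5, "chunk_template": "{chunk_text}"},
    "llm": {"temperature": 0.01, "max_tokens": 1500},
}

# Hypothetical list of fixes; each one changes only what it needs to.
config_fixes = [
    {"name": "retrieve_more_chunks", "overrides": {"retriever": {"k": 10}}},
    {"name": "shorter_answers", "overrides": {"llm": {"max_tokens": 500}}},
]

# One full candidate configuration per fix, ready to be evaluated separately.
candidate_configs = {
    fix["name"]: apply_overrides(baseline_config, fix["overrides"])
    for fix in config_fixes
}

for name, config in candidate_configs.items():
    print(name, config)
```

Keeping each fix as a small override makes it easy to see exactly what differs from the baseline, and because the merge works on copies, evaluating one candidate never mutates the baseline or another candidate.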

Next step

Continue with Step 6 (pipelines). Implement data pipeline fixes.
