Research Dashboard

Dermoscopy LLM Evaluation

Interactive exploration of model performance across 17 multimodal LLMs, 6 prompting strategies, and 8 diagnoses.

This dashboard summarizes aggregate results from a dermoscopy classification evaluation study. It is intended for research exploration only and is not medical advice.

Dermoscopy LLM Evaluation Dashboard

Filter models, compare prompting arms, and explore cost vs accuracy tradeoffs.

Dataset: Dermoscopy LLM evaluation summary. Total trials: