Publications


  <source
    class="responsive-img-srcset"
    srcset="/assets/img/publication_preview/arena-hard-img_resized-480.webp 480w, /assets/img/publication_preview/arena-hard-img_resized-800.webp 800w, /assets/img/publication_preview/arena-hard-img_resized-1400.webp 1400w"
    sizes="200px"
    type="image/webp"
  >

<img
  src="/assets/img/publication_preview/arena-hard-img_resized.png"
  class="preview z-depth-1 rounded"
  width="100%"
  height="auto"
  alt="arena-hard-img_resized.png"
  data-zoomable
  loading="eager"
  onerror="this.onerror=null; $('.responsive-img-srcset').remove();"
>

</picture>

</figure>

</div>

From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline
Tianle Li, Wei-Lin Chiang, Evan Frick, Lisa Dunlap, and 4 more authors
Under Review, Jun 2024

</div> </li>

  • nexusflow_resized.png
    Athene-70B: Redefining the Boundaries of Post-Training for Open Models
    Evan Frick*, Peter Jin*, Tianle Li*, Karthik Ganesan, and 3 more authors
    Jul 2024
  • PPE.png
    How to Evaluate Reward Models for RLHF
    Evan Frick, Tianle Li, Connor Chen, Wei-Lin Chiang, and 5 more authors
    Under Review, Nov 2024
  • bear.png
    Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
    Wei-Lin Chiang, Lianmin Zheng, Ying Sheng, Anastasios Angelopoulos, Tianle Li, Dacheng Li, Hao Zhang, Banghua Zhu, Michael Jordan, Joseph E. Gonzalez, Ion Stoica
    ICML, Mar 2024
  </ol>

    2023

    1. vicuna.jpeg
      LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset
      Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Tianle Li, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zhuohan Li, Zi Lin, Eric Xing, Joseph E. Gonzalez, Ion Stoica, Hao Zhang
      ICLR Spotlight, Sep 2023

    </div> -->