H2O GenAI Day Atlanta - Training Certification


Email Address *

Full Name *

What is a main conclusion of TrustLLM? *

10 points

LLMs cannot be trusted.

Trustworthiness is more important than utility.

Utility is more important than trustworthiness.

Trustworthiness and utility were positively correlated.

How can counterfactual analysis be applied in the context of LLMs? *

10 points

Measuring consistency across prompts and RAG applications.

Ex post analysis using a difference-in-differences estimator.

Measuring accuracy across prompts and RAG applications.

Either a matched control group must be used or an RCT.

What are the four key components of the proposed GenAI Evaluation Framework?

10 points

Models, Leaderboards, Documents, Tests.

Encoders, Decoders, Transformers, Accuracy Metrics.

Models, Documents, Evaluators, Tests.

RAGAS, Hallucination Index, Answer correctness, Context similarity.

What are influence functions? *

10 points

A function to measure the influence that the context has on an LLM's generated response.

A function to understand how strongly social media influences the training data.

A function to measure the influence of including a data point in the training set on the model's response.

A function to influence how embeddings from a decoder are used in the encoder part of a Transformer-based architecture.

What are possible options for addressing problematic behavior in LLMs? *

10 points

Prompt Engineering

Model Monitoring

Choosing a different foundational model

All of the above

Which of the following are common benchmarks for open source leaderboards?

10 points

MMLU, HellaSwag, AI2 Reasoning Challenge, TruthfulQA

BLEU, Elo, HellaSwag, MMLU

BLEU, ROUGE, Elo, HellaSwag

MMLU, ROUGE, Elo, TruthfulQA

Which of the following are not key steps in chain-of-verification?

10 points

Generate Embeddings

Initial Baseline Response

Execute Verification

Verification Question Generation

For Large Language Models, what might constitute "Conceptual Soundness"?

10 points

Model Architecture

Training Data

Explanations for choices of Training Data and Model Architecture

Explanations for why choices of Training Data and Model Architecture are reasonable for the use case that the model will be applied to

All of the above

Which of the following are examples of guardrails?

10 points

Content Filter Guardrails

Privacy Guardrails

Explainability Guardrails

Bias Mitigation Guardrails

All of the above

Which of the following are not types of attacks against an LLM?

10 points

Riddle Response

Hijack Response

Trick Response

Life Threat

Simulation

Why might publicly available leaderboards not be entirely trustworthy? *

10 points

Benchmarks are not task-specific

Some model entries may be fraudulent

Results aren't reproducible

All of the above


This form was created inside 0xdata.
