Question 1

What is the cost of Meta-Llama-3.1-8B-Instruct (Fast) by Nebius Token Factory?

Accepted Answer

Meta-Llama-3.1-8B-Instruct (Fast) by Nebius Token Factory costs $0.03 per 1M input tokens and $0.09 per 1M output tokens. Cached reads cost $0.0030 per 1M tokens. Cache writes cost $0.03 per 1M tokens.

Question 2

What is the context window of Meta-Llama-3.1-8B-Instruct (Fast) by Nebius Token Factory?

Accepted Answer

Meta-Llama-3.1-8B-Instruct (Fast) by Nebius Token Factory has a context window of 128K tokens. It supports up to 120K input tokens and can generate up to 4K output tokens.

Question 3

What are the capabilities of Meta-Llama-3.1-8B-Instruct (Fast) by Nebius Token Factory?

Accepted Answer

Meta-Llama-3.1-8B-Instruct (Fast) by Nebius Token Factory supports tool calling/function calling, structured output, adjustable temperature. It has open weights.

Question 4

What input and output types does Meta-Llama-3.1-8B-Instruct (Fast) support?

Accepted Answer

Meta-Llama-3.1-8B-Instruct (Fast) by Nebius Token Factory accepts text as input and can generate text as output.

Question 5

Is Meta-Llama-3.1-8B-Instruct (Fast) by Nebius Token Factory open source?

Accepted Answer

Yes, Meta-Llama-3.1-8B-Instruct (Fast) by Nebius Token Factory has open weights, meaning the model weights are publicly available for download and self-hosting.

Question 6

Does Meta-Llama-3.1-8B-Instruct (Fast) support function calling?

Accepted Answer

Yes, Meta-Llama-3.1-8B-Instruct (Fast) by Nebius Token Factory supports tool calling (also known as function calling), allowing it to interact with external tools and APIs during conversations.

Question 7

What is the knowledge cutoff date for Meta-Llama-3.1-8B-Instruct (Fast)?

Accepted Answer

Meta-Llama-3.1-8B-Instruct (Fast) by Nebius Token Factory has a knowledge cutoff date of 2024-12. This means the model was trained on data available up until that date.

Meta-Llama-3.1-8B-Instruct (Fast) API

Input Modalities

Output Modalities

Standard (per 1M tokens)

Caching (per 1M tokens)

Meta-Llama-3.1-8B-Instruct (Fast) API

Input Modalities

Output Modalities

Standard (per 1M tokens)

Caching (per 1M tokens)