Fast-LLM
ServiceNow/Fast-LLM
Layers
Sections
Attention
Block
Common
Decoder
Language Model
Vision