FalconH1

This model was released on 2025-05-21 and added to Hugging Face Transformers on 2025-05-21.

The FalconH1 model was developed by the TII Pretraining team. A comprehensive research paper covering the architecture, pretraining dynamics, experimental results, and conclusions is forthcoming. You can read more about this series on this website.

This model was contributed by DhiyaEddine, ybelkada, JingweiZuo, IlyasChahed, and MaksimVelikanov. The original code can be found here.

| Model     | Depth | Dim  | Attn Heads | KV | Mamba Heads | d_head    | d_state | Ctx Len     |
|-----------|-------|------|------------|----|-------------|-----------|---------|-------------|
| H1 0.5B   | 36    | 1024 | 8          | 2  | 24          | 64 / 64   | 128     | 4K, 16K-SFT |
| H1 1.5B   | 24    | 2048 | 8          | 2  | 48          | 128 / 64  | 256     | 128K        |
| H1 1.5B-d | 66    | 1280 | 6          | 2  | 24          | 128 / 64  | 256     | 128K        |
| H1 3B     | 32    | 2560 | 10         | 2  | 32          | 128 / 128 | 256     | 128K        |
| H1 7B     | 44    | 3072 | 12         | 2  | 24          | 128 / 128 | 256     | 256K        |
| H1 34B    | 72    | 5120 | 20         | 4  | 32          | 128 / 128 | 256     | 256K        |
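
A quick way to cross-check these hyperparameters against a released checkpoint is to load its configuration. The sketch below uses the checkpoint from the generation example further down; the printed attributes are standard Transformers config fields, while the Mamba-specific entries (state size, number of Mamba heads, head dimension) live in `FalconH1Config` under model-specific names not shown here.

```python
from transformers import AutoConfig

# Load the configuration of a released Falcon-H1 checkpoint.
config = AutoConfig.from_pretrained("tiiuae/Falcon-H1-7B-Instruct")

print(config.num_hidden_layers)    # depth (44 for H1 7B)
print(config.hidden_size)          # model dimension (3072 for H1 7B)
print(config.num_attention_heads)  # attention heads (12 for H1 7B)
print(config.num_key_value_heads)  # KV heads (2 for H1 7B)
```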

[[autodoc]] FalconH1Config

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the instruction-tuned 7B checkpoint and its tokenizer.
model = AutoModelForCausalLM.from_pretrained("tiiuae/Falcon-H1-7B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("tiiuae/Falcon-H1-7B-Instruct")

# Tokenize a simple prompt and generate a continuation.
message = ["Mamba is a snake with following properties "]
inputs = tokenizer(message, return_tensors="pt", return_token_type_ids=False)
response = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```
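
Since the checkpoint above is instruction tuned, you may get better results by formatting the prompt with the tokenizer's chat template. This is a minimal sketch, assuming the checkpoint ships a chat template; the example question is purely illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("tiiuae/Falcon-H1-7B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("tiiuae/Falcon-H1-7B-Instruct")

# Format a single-turn conversation with the checkpoint's chat template.
messages = [{"role": "user", "content": "List three facts about mamba snakes."}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
)

response = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```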

[[autodoc]] FalconH1ForCausalLM
    - forward

This HF implementation was contributed by younesbelkada and DhiaEddineRhaiem.