Phi
This model was released on 2023-06-20 and added to Hugging Face Transformers on 2023-11-10.
Phi is a 1.3B parameter transformer model optimized for Python code generation. Rather than scaling up model size or compute, it focuses on “textbook-quality” training data: code examples, exercises, and synthetic Python problems.
You can find all the original Phi checkpoints under the Phi-1 collection.
The examples below demonstrate how to generate text with Pipeline, AutoModel, and from the command line.
```py
import torch
from transformers import pipeline

pipeline = pipeline(task="text-generation", model="microsoft/phi-1.5", device=0, dtype=torch.bfloat16)
pipeline('''def print_prime(n):
   """
   Print all primes between 1 and n
   """''')
```

```py
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-1")
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-1", dtype=torch.float16, device_map="auto", attn_implementation="sdpa")

input_ids = tokenizer('''def print_prime(n):
   """
   Print all primes between 1 and n
   """''', return_tensors="pt").to(model.device)

output = model.generate(**input_ids, cache_implementation="static")
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

```bash
echo -e "'''def print_prime(n): \"\"\" Print all primes between 1 and n \"\"\"'''" | transformers run --task text-generation --model microsoft/phi-1.5 --device 0
```

Quantization reduces the memory burden of large models by representing the weights in a lower precision. Refer to the Quantization overview for more available quantization backends.
The example below uses bitsandbytes to quantize only the weights to 4-bit precision.
```py
import torch
from transformers import BitsAndBytesConfig, AutoTokenizer, AutoModelForCausalLM

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-1")
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-1",
    dtype=torch.float16,
    device_map="auto",
    attn_implementation="sdpa",
    quantization_config=bnb_config,
)

input_ids = tokenizer('''def print_prime(n):
   """
   Print all primes between 1 and n
   """''', return_tensors="pt").to(model.device)

output = model.generate(**input_ids, cache_implementation="static")
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
- If you’re using Transformers < 4.37.0.dev, set `trust_remote_code=True` in `from_pretrained()`. Otherwise, make sure you update Transformers to the latest stable version.

  ```py
  import torch
  from transformers import AutoTokenizer, AutoModelForCausalLM

  tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-1")
  model = AutoModelForCausalLM.from_pretrained(
      "microsoft/phi-1",
      dtype=torch.float16,
      device_map="auto",
      trust_remote_code=True,
      attn_implementation="sdpa",
  )

  input_ids = tokenizer('''def print_prime(n):
     """
     Print all primes between 1 and n
     """''', return_tensors="pt").to(model.device)

  output = model.generate(**input_ids, cache_implementation="static")
  print(tokenizer.decode(output[0], skip_special_tokens=True))
  ```
PhiConfig
[[autodoc]] PhiConfig
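For orientation, here is a minimal sketch (not part of the generated API reference) of the usual config-to-model pattern: instantiate a PhiConfig, whose defaults are intended to resemble microsoft/phi-1, and build a randomly initialized PhiModel from it.

```py
from transformers import PhiConfig, PhiModel

# Default configuration; the defaults roughly match microsoft/phi-1
configuration = PhiConfig()

# Build a model with randomly initialized weights from that configuration
model = PhiModel(configuration)

# The configuration can be read back from the model
configuration = model.config
```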
PhiModel
[[autodoc]] PhiModel
    - forward
PhiForCausalLM
[[autodoc]] PhiForCausalLM
    - forward
    - generate
PhiForSequenceClassification
[[autodoc]] PhiForSequenceClassification
    - forward
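For orientation, below is a minimal sketch of the sequence classification head. Note that microsoft/phi-1 does not ship a trained classification head, so the head here is randomly initialized and `num_labels=2` is an illustrative assumption; in practice you would load a checkpoint fine-tuned for classification.

```py
import torch
from transformers import AutoTokenizer, PhiForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-1")
# num_labels=2 is an arbitrary illustrative choice; the classification head is randomly initialized
model = PhiForSequenceClassification.from_pretrained("microsoft/phi-1", num_labels=2)

inputs = tokenizer("def add(a, b): return a + b", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch_size, num_labels)

predicted_class_id = logits.argmax(dim=-1).item()
```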
PhiForTokenClassification
[[autodoc]] PhiForTokenClassification
    - forward
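Likewise, a minimal sketch for the token classification head; `num_labels=5` is an arbitrary illustrative value, and the head is randomly initialized when loading microsoft/phi-1 directly rather than a fine-tuned checkpoint.

```py
import torch
from transformers import AutoTokenizer, PhiForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-1")
# num_labels=5 is an arbitrary illustrative choice; the token classification head is randomly initialized
model = PhiForTokenClassification.from_pretrained("microsoft/phi-1", num_labels=5)

inputs = tokenizer("def add(a, b): return a + b", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch_size, sequence_length, num_labels)

predicted_token_class_ids = logits.argmax(dim=-1)
```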