# RoCBert
This model was released on 2022-05-27 and added to Hugging Face Transformers on 2022-11-08.
RoCBert is a pretrained Chinese BERT model designed to be robust against adversarial attacks such as typos and synonym substitutions. It is pretrained with a contrastive learning objective that aligns normal and adversarial text examples. The examples cover different semantic, phonetic, and visual features of Chinese, which makes RoCBert more robust against character-level manipulation.
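In practice, the robustness features show up in the model's inputs: RoCBert's tokenizer emits visual (glyph shape) and phonetic (pronunciation) ids alongside the usual token ids, and the model embeds all three. The snippet below is a minimal sketch of that interface, assuming the checkpoint's slow tokenizer returns `input_shape_ids` and `input_pronunciation_ids` keys in addition to `input_ids`:

```py
import torch
from transformers import RoCBertTokenizer, RoCBertModel

tokenizer = RoCBertTokenizer.from_pretrained("weiweishi/roc-bert-base-zh")
model = RoCBertModel.from_pretrained("weiweishi/roc-bert-base-zh")

inputs = tokenizer("這家餐廳的拉麵很好吃", return_tensors="pt")
# Expected keys (assumption): input_ids, input_shape_ids,
# input_pronunciation_ids, token_type_ids, attention_mask
print(list(inputs.keys()))

with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```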
You can find all the original RoCBert checkpoints under the [weiweishi](https://huggingface.co/weiweishi) profile.
Click on the RoCBert models in the right sidebar for more examples of how to apply RoCBert to different Chinese language tasks.
The examples below demonstrate how to predict the `[MASK]` token with `Pipeline`, `AutoModel`, and from the command line.
```py
import torch
from transformers import pipeline

pipeline = pipeline(
    task="fill-mask",
    model="weiweishi/roc-bert-base-zh",
    dtype=torch.float16,
    device=0
)
pipeline("這家餐廳的拉麵是我[MASK]過的最好的拉麵之")
```

```py
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "weiweishi/roc-bert-base-zh",
)
model = AutoModelForMaskedLM.from_pretrained(
    "weiweishi/roc-bert-base-zh",
    dtype=torch.float16,
    device_map="auto",
)
inputs = tokenizer("這家餐廳的拉麵是我[MASK]過的最好的拉麵之", return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model(**inputs)
    predictions = outputs.logits

masked_index = torch.where(inputs["input_ids"] == tokenizer.mask_token_id)[1]
predicted_token_id = predictions[0, masked_index].argmax(dim=-1)
predicted_token = tokenizer.decode(predicted_token_id)

print(f"The predicted token is: {predicted_token}")
```

```bash
echo -e "這家餐廳的拉麵是我[MASK]過的最好的拉麵之" | transformers run --task fill-mask --model weiweishi/roc-bert-base-zh --device 0
```

## RoCBertConfig
[[autodoc]] RoCBertConfig
    - all
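A common pattern, shown here as a brief sketch, is to instantiate a configuration and build a randomly initialized model from it (the values below are the defaults, not those of a specific checkpoint):

```py
from transformers import RoCBertConfig, RoCBertModel

# Build a configuration with default values and a model from it;
# the weights are randomly initialized, not pretrained
config = RoCBertConfig()
model = RoCBertModel(config)

# The configuration remains accessible on the model
print(model.config.hidden_size)
```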
## RoCBertTokenizer

[[autodoc]] RoCBertTokenizer
    - build_inputs_with_special_tokens
    - get_special_tokens_mask
    - create_token_type_ids_from_sequences
    - save_vocabulary
## RoCBertModel

[[autodoc]] RoCBertModel
    - forward
## RoCBertForPreTraining

[[autodoc]] RoCBertForPreTraining
    - forward
## RoCBertForCausalLM

[[autodoc]] RoCBertForCausalLM
    - forward
## RoCBertForMaskedLM

[[autodoc]] RoCBertForMaskedLM
    - forward
## RoCBertForSequenceClassification

[[autodoc]] transformers.RoCBertForSequenceClassification
    - forward
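As a rough sketch of this head's interface (note that the classification head on top of the base `weiweishi/roc-bert-base-zh` checkpoint is newly initialized, so outputs are only meaningful after fine-tuning):

```py
import torch
from transformers import AutoTokenizer, RoCBertForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("weiweishi/roc-bert-base-zh")
model = RoCBertForSequenceClassification.from_pretrained(
    "weiweishi/roc-bert-base-zh", num_labels=2
)

inputs = tokenizer("這家餐廳的拉麵很好吃", return_tensors="pt")
labels = torch.tensor([1])  # hypothetical positive-sentiment label

# Passing labels makes the model also return a classification loss
outputs = model(**inputs, labels=labels)
print(outputs.loss, outputs.logits)
```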
## RoCBertForMultipleChoice

[[autodoc]] transformers.RoCBertForMultipleChoice
    - forward
## RoCBertForTokenClassification

[[autodoc]] transformers.RoCBertForTokenClassification
    - forward
## RoCBertForQuestionAnswering

[[autodoc]] RoCBertForQuestionAnswering
    - forward
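For reference, a hedged sketch of the extractive question answering interface; the span head is untrained on the base checkpoint, so the decoded span below is illustrative only:

```py
import torch
from transformers import AutoTokenizer, RoCBertForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained("weiweishi/roc-bert-base-zh")
model = RoCBertForQuestionAnswering.from_pretrained("weiweishi/roc-bert-base-zh")

question = "拉麵好吃嗎?"
context = "這家餐廳的拉麵是我吃過的最好的拉麵。"
inputs = tokenizer(question, context, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Take the most likely start/end positions and decode that span
start = outputs.start_logits.argmax()
end = outputs.end_logits.argmax()
print(tokenizer.decode(inputs["input_ids"][0, start : end + 1]))
```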