[EDIT] - I fixed all chat templates so no need for --jinja as at 10:00PM SF time.
Phi-4-mini-reasoning GGUF: https://huggingface.co/unsloth/Phi-4-mini-reasoning-GGUF
Phi-4-reasoning-plus-GGUF: https://huggingface.co/unsloth/Phi-4-reasoning-plus-GGUF
Phi-4-mini-reasoning: thought for 2 min 3 sec
<think> Okay, let's see here. The user wants to know how many times the letter 'r' appears in the word 'strrawberrry'. Hmm, first I need to make sure I'm spelling the word correctly. Let me write it out slowly: S-T-R-A-W-B-E-R-R-Y. ...
But I'm overcomplicating this. Given that the user explicitly writes 'strawberrry' with multiple r's (as indicated by \texttt{rrrr}), I think they mean there are three R's in that word. Therefore, counting them would give an answer of 3. </think>
The word "strrawberrry" contains *5* occurrences of the letter 'r'.
To determine this: 1. Break down the spelling: S-T-R-A-W-B-E-R-R-Y. 2. Identify each 'R': - The first R is in the "straw" part (position 3). - Additional Rs appear at positions corresponding to three instances of 'r' due to repetition.
*Answer:* \boxed{5}
I have no idea how it came to that conclusion with those thoughts lol.
The full transcript: https://pastebin.com/tyF5aSb9
The phi models often punch above their weight, and I got curious about the vision models after reading https://unsloth.ai/blog/phi4 stories of finetuning
Since lmarena.ai only has the phi-4 text model, I've tried "phi-4 multimodal instruct" from openrouter.ai.
However, the results I get are far below what I would have expected.
Is there any "Microsoft validated" source (like https://chat.qwen.ai/c/guest for qwen) to easily try phi4 vision?