Today's AI Summary

AI Developments: Scriptwriting LLMs, Mobile Agents, and Efficiency Optimizations

Today's AI landscape features advancements in language models tailored for creative tasks, mobile automation, and efficiency improvements for on-device deployment. Research focuses on enhancing retrieval-augmented generation and improving reasoning capabilities.

Noteworthy Papers

  • Enhancing Retrieval-Augmented Generation with Entity Linking for Educational Platforms: This paper introduces an enhanced RAG architecture that integrates Entity Linking to improve the accuracy of educational question-answering systems. The system uses a Wikidata-based Entity Linking module and implements three re-ranking strategies that combine semantic and entity-based information. The results show that, in domain-specific contexts, the hybrid scheme based on reciprocal rank fusion significantly outperforms both the baseline and the cross-encoder approach (a minimal reciprocal-rank-fusion sketch follows this list).
  • Training-Time Action Conditioning for Efficient Real-Time Chunking: This research proposes a method for training vision-language-action models (VLAs) by simulating inference delay at training time and conditioning on action prefixes directly, which reduces computational overhead and maintains task performance.
  • M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG: This paper introduces a large-scale benchmark for evaluating retrieval-augmented VQA across languages and modalities. The benchmark covers 42 languages and 56 regional dialects and registers, comprising over 80,000 culturally diverse image-question pairs.
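For readers unfamiliar with reciprocal rank fusion, here is a minimal, generic sketch of the technique (the rankings, document ids, and k=60 are illustrative assumptions, not values from the paper):

# Reciprocal rank fusion: each retriever contributes 1 / (k + rank) per document.
def reciprocal_rank_fusion(rankings, k=60):
    """rankings: list of ranked lists of document ids, best first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Fuse a semantic (dense-retrieval) ranking with an entity-based ranking.
semantic_ranking = ["doc_a", "doc_b", "doc_c"]
entity_ranking = ["doc_b", "doc_c", "doc_a"]
print(reciprocal_rank_fusion([semantic_ranking, entity_ranking]))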

Model Highlights

  • FutureMa/Qwen3-8B-Drama-Thinking: This model is a fine-tuned version of Qwen3-8B, specializing in screenwriting with explicit creative reasoning chains. It uses <think>...</think> tags to show internal reasoning, analyzes character psychology, and plans structure. The model was trained on a custom drama thinking dataset and shows significant improvements in output length, thinking depth, and creative reasoning compared to the base model.
  • zai-org/AutoGLM-Phone-9B: This model is a mobile intelligent assistant framework built on AutoGLM, capable of understanding smartphone screens through multimodal perception and executing automated operations to complete tasks. It controls devices via ADB, uses a vision-language model for screen understanding, and leverages intelligent planning to generate and execute action sequences.
  • embedl/Llama-3.2-3B-Instruct-FlashHead: This model is an optimized version of Llama-3.2-3B-Instruct using FlashHead, Embedl’s efficient replacement for the language model head, reducing size while preserving accuracy. It is designed for low-latency inference on NVIDIA RTX GPUs and matches the Llama-3.2-3B-Instruct baseline within rounding error on common benchmarks.

Key Takeaways

  • Specialized LLMs: Models are becoming increasingly specialized for creative tasks like scriptwriting, incorporating reasoning and planning capabilities.
  • Mobile Automation: AI agents are being developed to automate tasks on mobile devices, leveraging multimodal perception and intelligent planning.
  • Efficiency Optimizations: Techniques like FlashHead and quantization are being used to improve the efficiency of language models for on-device deployment, maintaining accuracy while reducing latency.
  • RAG Enhancements: Research is focused on improving the accuracy and effectiveness of retrieval-augmented generation systems through entity linking and multilingual capabilities.

AI Papers for 2026-02-10

No papers available yet. Check back soon!

AI Models

inclusionAI/LLaDA2.1-mini


License: apache-2.0 · Library: transformers · Tags: dllm, diffusion, llm, text_generation

LLaDA2.1-mini

LLaDA2.1-mini is a diffusion language model of the LLaDA series featuring the editing enhancement. It significantly improves inference speed while delivering strong task performance.

<div align="center"> <img src="https://mdn.alipayobjects.com/huamei_qa8qxu/afts/img/A*uOo8QKQMiBwAAAAAgNAAAAgAemJ7AQ/original" width="800" /> </div> <div align="center"> <img src="https://mdn.alipayobjects.com/huamei_qa8qxu/afts/img/A*biwvQpCmKjEAAAAAULAAAAgAemJ7AQ/original" width="800" /> </div>

Model Performance

<table> <thead> <tr> <th align="left"><b>Benchmark</b></th> <th align="center"><b>Qwen3-8B<br>(no_think)</b><br><sub>(Score)</sub></th> <th align="center"><b>Ling-mini-2.0</b><br><br><sub>(Score)</sub></th> <th align="center"><b>LLaDA2.0-mini</b><br><br><sub>(Score | TPF)</sub></th> <th align="center"><b>LLaDA2.1-mini<br>(S Mode)</b><br><sub>(Score | TPF)</sub></th> <th align="center"><b>LLaDA2.1-mini<br>(Q Mode)</b><br><sub>(Score | TPF)</sub></th> </tr> </thead> <tbody> <tr> <td align="left"><b>Average</b></td> <td align="center">61.59</td> <td align="center">64.72</td> <td align="center">63.39 | 2.60</td> <td align="center">62.24 | 5.34</td> <td align="center">63.90 | 3.12</td> </tr> <tr><td colspan="6" align="center"><b>Knowledge</b></td></tr> <tr> <td align="left">GPQA</td> <td align="center">48.01</td> <td align="center">59.41</td> <td align="center">47.76 | 2.73</td> <td align="center">48.36 | 3.62</td> <td align="center">53.28 | 2.12</td> </tr> <tr> <td align="left">MMLU-Pro</td> <td align="center">65.83</td> <td align="center">67.18</td> <td align="center">64.27 | 2.15</td> <td align="center">63.42 | 4.22</td> <td align="center">64.84 | 2.41</td> </tr> <tr> <td align="left">C-EVAL</td> <td align="center">80.6</td> <td align="center">82.17</td> <td align="center">81.80 | 1.78</td> <td align="center">78.40 | 3.39</td> <td align="center">78.59 | 1.91</td> </tr> <tr> <td align="left">PHYBench</td> <td align="center">9.76</td> <td align="center">14.59</td> <td align="center">11.70 | 2.48</td> <td align="center">12.75 | 4.41</td> <td align="center">13.05 | 2.52</td> </tr> <tr> <td align="left">TriviaQA</td> <td align="center">52.51</td> <td align="center">55.63</td> <td align="center">51.33 | 1.54</td> <td align="center">53.33 | 3.21</td> <td align="center">54.24 | 2.02</td> </tr> <tr><td colspan="6" align="center"><b>Reasoning</b></td></tr> <tr> <td align="left">BIG-Bench Hard</td> <td align="center">79.48</td> <td align="center">83.70</td> <td align="center">78.21 | 2.36</td> <td align="center">78.42 | 5.02</td> <td align="center">80.58 | 2.86</td> </tr> <tr> <td align="left">BIG-Bench Extra Hard</td> <td align="center">18.27</td> <td align="center">14.81</td> <td align="center">16.47 | 2.03</td> <td align="center">15.30 | 3.19</td> <td align="center">15.78 | 1.66</td> </tr> <tr> <td align="left">bbh-zh</td> <td align="center">80.09</td> <td align="center">66.11</td> <td align="center">75.75 | 2.77</td> <td align="center">67.65 | 3.89</td> <td align="center">70.40 | 2.35</td> </tr> <tr> <td align="left">MuSR</td> <td align="center">70.02</td> <td align="center">71.36</td> <td align="center">71.48 | 1.45</td> <td align="center">70.43 | 2.48</td> <td align="center">71.89 | 1.56</td> </tr> <tr> <td align="left">ZebraLogic</td> <td align="center">37.48</td> <td align="center">79.85</td> <td align="center">64.20 | 2.30</td> <td align="center">68.50 | 5.38</td> <td align="center">77.10 | 2.93</td> </tr> <tr> <td align="left">PrOntoQA</td> <td align="center">93.12</td> <td align="center">96.06</td> <td align="center">86.00 | 2.36</td> <td align="center">87.50 | 4.86</td> <td align="center">84.50 | 2.73</td> </tr> <tr> <td align="left">PIQA</td> <td align="center">88.30</td> <td align="center">87.54</td> <td align="center">86.51 | 1.45</td> <td align="center">84.87 | 2.59</td> <td align="center">86.89 | 1.45</td> </tr> <tr> <td align="left">OCNLI</td> <td align="center">61.49</td> <td align="center">60.17</td> <td align="center">64.51 | 4.06</td> <td align="center">61.02 | 1.78</td> <td 
align="center">61.59 | 1.23</td> </tr> <tr> <td align="left">HellaSwag</td> <td align="center">79.56</td> <td align="center">69.02</td> <td align="center">79.01 | 1.50</td> <td align="center">75.71 | 2.39</td> <td align="center">76.19 | 1.49</td> </tr> <tr> <td align="left">KOR-Bench</td> <td align="center">54.96</td> <td align="center">63.2</td> <td align="center">49.92 | 2.45</td> <td align="center">46.64 | 4.28</td> <td align="center">48.00 | 2.35</td> </tr> <tr> <td align="left">DROP</td> <td align="center">84.56</td> <td align="center">78.80</td> <td align="center">81.89 | 2.02</td> <td align="center">81.55 | 5.84</td> <td align="center">82.37 | 2.87</td> </tr> <tr> <td align="left">SQuAD 2.0</td> <td align="center">85.21</td> <td align="center">75.56</td> <td align="center">86.50 | 2.47</td> <td align="center">84.51 | 4.33</td> <td align="center">85.13 | 3.09</td> </tr> <tr><td colspan="6" align="center"><b>Coding</b></td></tr> <tr> <td align="left">LiveCodeBench</td> <td align="center">26.76</td> <td align="center">42.29</td> <td align="center">31.83 | 3.34</td> <td align="center">28.85 | 6.42</td> <td align="center">30.40 | 3.63</td> </tr> <tr> <td align="left">CRUXEval-O</td> <td align="center">74.06</td> <td align="center">76.12</td> <td align="center">71.62 | 2.78</td> <td align="center">70.62 | 5.85</td> <td align="center">73.75 | 3.35</td> </tr> <tr> <td align="left">MBPP+</td> <td align="center">72.69</td> <td align="center">77.25</td> <td align="center">78.24 | 3.43</td> <td align="center">78.84 | 10.59</td> <td align="center">74.07 | 6.30</td> </tr> <tr> <td align="left">HumanEval+</td> <td align="center">79.5</td> <td align="center">80.03</td> <td align="center">81.71 | 5.16</td> <td align="center">80.49 | 12.32</td> <td align="center">82.93 | 7.77</td> </tr> <tr> <td align="left">MultiPL-E</td> <td align="center">61.70</td> <td align="center">67.09</td> <td align="center">67.46 | 2.78</td> <td align="center">64.16 | 7.23</td> <td align="center">67.17 | 4.01</td> </tr> <tr> <td align="left">BigCodeBench-Full</td> <td align="center">36.05</td> <td align="center">35.00</td> <td align="center">32.89 | 2.87</td> <td align="center">30.18 | 7.33</td> <td align="center">34.39 | 4.09</td> </tr> <tr> <td align="left">Aider</td> <td align="center">55.64</td> <td align="center">49.62</td> <td align="center">39.85 | 3.57</td> <td align="center">43.61 | 8.11</td> <td align="center">45.11 | 4.85</td> </tr> <tr> <td align="left">BIRD-SQL</td> <td align="center">36.11</td> <td align="center">39.67</td> <td align="center">39.34 | 1.96</td> <td align="center">37.32 | 4.48</td> <td align="center">38.40 | 2.42</td> </tr> <tr> <td align="left">Spider</td> <td align="center">72.80</td> <td align="center">76.43</td> <td align="center">76.76 | 3.93</td> <td align="center">75.78 | 7.98</td> <td align="center">77.55 | 5.48</td> </tr> <tr><td colspan="6" align="center"><b>Math</b></td></tr> <tr> <td align="left">AIME 2025</td> <td align="center">22.08</td> <td align="center">47.66</td> <td align="center">36.67 | 2.41</td> <td align="center">36.67 | 6.34</td> <td align="center">43.33 | 3.29</td> </tr> <tr> <td align="left">OlympiadBench</td> <td align="center">55.33</td> <td align="center">72.30</td> <td align="center">67.70 | 2.63</td> <td align="center">64.30 | 7.08</td> <td align="center">66.67 | 3.99</td> </tr> <tr> <td align="left">GSM-Plus</td> <td align="center">85.56</td> <td align="center">87.18</td> <td align="center">86.50 | 2.41</td> <td align="center">85.88 | 6.82</td> <td 
align="center">86.55 | 3.69</td> </tr> <tr> <td align="left">CMATH</td> <td align="center">95.42</td> <td align="center">96.40</td> <td align="center">95.72 | 1.98</td> <td align="center">95.63 | 4.94</td> <td align="center">94.99 | 2.56</td> </tr> <tr> <td align="left">Omni-MATH</td> <td align="center">33.20</td> <td align="center">48.80</td> <td align="center">41.70 | 2.57</td> <td align="center">41.70 | 6.41</td> <td align="center">43.60 | 3.56</td> </tr> <tr><td colspan="6" align="center"><b>Agent & Alignment</b></td></tr> <tr> <td align="left">IFEval-strict-prompt</td> <td align="center">84.29</td> <td align="center">76.16</td> <td align="center">80.78 | 1.24</td> <td align="center">81.33 | 1.83</td> <td align="center">83.18 | 1.25</td> </tr> <tr> <td align="left">BFCL v3</td> <td align="center">70.12</td> <td align="center">53.75</td> <td align="center">70.72 | 4.26</td> <td align="center">72.06 | 7.39</td> <td align="center">73.61 | 5.14</td> </tr> <tr> <td align="left">CodeIF-Bench</td> <td align="center">50.00</td> <td align="center">46.00</td> <td align="center">46.00 | 2.62</td> <td align="center">42.00 | 6.68</td> <td align="center">48.00 | 3.62</td> </tr> <tr> <td align="left">Nexus FC</td> <td align="center">37.71</td> <td align="center">34.38</td> <td align="center">35.18 | 4.06</td> <td align="center">31.59 | 8.27</td> <td align="center">33.69 | 4.91</td> </tr> </tbody> </table>

🚀 Highlights

  • Error-Correcting Editing: a structural innovation that makes dLLM generation editable, enabling error correction during decoding.
  • Speed vs. Quality Mode: the 16B mini model achieves ultra-fast inference in Speed Mode while remaining competitive across a wide range of tasks in Quality Mode.
  • Reinforcement Learning on a 100B-scale dLLM: a tailored algorithm and framework enable reinforcement learning for large dLLMs.

🗺️ What's Next

  • Powerful Agentic/Tool-Use Capability with LLaDA: the next update will ship with powerful agentic and long-horizon tool-use capabilities.
  • Extreme Editing: the next update will feature stronger and more extensive editing, aimed at correcting more errors in parallel during reasoning.
  • Explore More Training Paradigms: we want to explore training paradigms beyond SFT and RL for dLLMs.

📦 Model Variants

| Model ID | Description | Hugging Face Link |
| --- | --- | --- |
| inclusionAI/LLaDA2.1-mini | Instruction-tuned model, ready for downstream applications. | 🤗 Model Card |
| inclusionAI/LLaDA2.1-flash | Instruction-tuned model, ready for downstream applications. | 🤗 Model Card |


🔍 Model Overview

LLaDA2.1-mini has the following specifications:

  • Type: Mixture-of-Experts (MoE) Diffusion Language Model
  • Total Parameters (Non-Embedding): 16B
  • Number of Layers: 20
  • Attention Heads: 16
  • Context Length: 32,768 tokens
  • Position Embedding: Rotary (RoPE)
  • Vocabulary Size: 157,184

🤗 Hugging Face Transformers

Make sure you have transformers and its dependencies installed:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "/path/to/LLaDA2.1-mini"
device = "auto"

# Load the diffusion LM with its custom generation code (trust_remote_code)
# and shard it automatically across the available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_path, trust_remote_code=True, device_map=device,
)
model = model.to(torch.bfloat16)
model.eval()
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

prompt = """Calculate 1+5-28*0.5-200=?"""
input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    tokenize=True,
    return_tensors="pt",
).to(model.device)  # move the prompt tokens onto the model's device
generated_tokens = model.generate(
    inputs=input_ids,
    eos_early_stop=True,
    gen_length=512,
    block_length=32,
    threshold=0.5,
    editing_threshold=0,
    temperature=0.0,
)
generated_answer = tokenizer.decode(
    generated_tokens[0],
    skip_special_tokens=True,
)
print(generated_answer)

Best Practices

To achieve optimal performance, we recommend the following settings:

  1. Sampling Parameters: We recommend the following general sampling parameters: block_length=32, temperature=0.0, top_p=None and top_k=None. We are currently exploring more diverse sampling configurations.

  2. Denoising Thresholds: There are three denoising params: threshold, editing_threshold and max_post_steps. We recommend threshold=0.7, editing_threshold=0.5 for Quality Mode and threshold=0.5, editing_threshold=0.0 for Speed Mode. For both modes, we suggest setting max_post_steps to a value greater than 5. We recommend 16 as a balanced default, which was used for most of our internal testing.

Note: A low threshold may cause stuttering, as a trade-off for faster inference.

  3. Adequate Output Length: We recommend using an output length of 16384 tokens for most scenarios.
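For concreteness, here is a minimal sketch of the two recommended configurations, reusing the generate() call from the example above (passing max_post_steps as a keyword alongside the other denoising parameters is an assumption based on the parameter list above):

# Quality Mode: higher thresholds plus editing (recommended defaults above).
quality_tokens = model.generate(
    inputs=input_ids,
    eos_early_stop=True,
    gen_length=16384,        # "adequate output length" recommendation
    block_length=32,
    threshold=0.7,
    editing_threshold=0.5,
    max_post_steps=16,       # assumed keyword; balanced default from the notes above
    temperature=0.0,
)

# Speed Mode: lower thresholds for faster inference (may cause stuttering).
speed_tokens = model.generate(
    inputs=input_ids,
    eos_early_stop=True,
    gen_length=16384,
    block_length=32,
    threshold=0.5,
    editing_threshold=0.0,
    max_post_steps=16,
    temperature=0.0,
)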

🤖ModelScope

If you are in mainland China, we strongly recommend downloading the model from 🤖 ModelScope.


Deployment

SGLang

SGLang enables dLLM inference either through offline batching or by launching an HTTP server for online requests. You can start the SGLang server with the following command:

python3 -m sglang.launch_server \
	  --model-path inclusionAI/LLaDA2.1-mini \
	  --dllm-algorithm JointThreshold \
	  --tp-size 1 \
	  --trust-remote-code \
	  --mem-fraction-static 0.8 \
	  --max-running-requests 1 \
	  --attention-backend flashinfer	

Environment Preparation

The pull request adding this support has been submitted and merged into the SGLang repository; please prepare your environment with the latest SGLang version.
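Once the server is running, you can send requests over HTTP. A minimal sketch using the OpenAI-compatible chat endpoint (the default port 30000 and the /v1/chat/completions route are assumptions based on typical SGLang defaults; adjust to your launch flags and SGLang version):

import requests

# Query the LLaDA2.1-mini server started above.
resp = requests.post(
    "http://localhost:30000/v1/chat/completions",
    json={
        "model": "inclusionAI/LLaDA2.1-mini",
        "messages": [{"role": "user", "content": "Calculate 1+5-28*0.5-200=?"}],
        "max_tokens": 512,
        "temperature": 0.0,
    },
    timeout=600,
)
print(resp.json()["choices"][0]["message"]["content"])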


🌐 License

This project is licensed under the terms of the Apache License 2.0.


🤝 Contact & Collaboration

For questions, collaborations, or feedback, please reach out via Hugging Face or open an issue in the repository.

👉 Join us in advancing open, efficient, and intelligent language models!

Author: inclusionAI

Likes: 11

Downloads: 0

Tags: transformers, safetensors, llada2_moe, text-generation, dllm, diffusion, llm, text_generation, conversational, custom_code, license:apache-2.0, region:us

iitolstykh/VIBE-Image-Edit-DistilledCFG


Language: en · Pipeline: image-to-image · Library: diffusers · Base model: iitolstykh/VIBE-Image-Edit · Tags: image-editing, text-guided-editing, diffusion, sana, qwen-vl, multimodal, distilled, cfg-distillation

VIBE: Visual Instruction Based Editor

<div align="center"> <img src="VIBE.png" width="800" alt="VIBE"/> </div> <p style="text-align: center;"> <div align="center"> </div> <p align="center"> <a href="https://riko0.github.io/VIBE"> 🌐 Project Page </a> | <a href="https://arxiv.org/abs/2601.02242"> 📜 Paper on arXiv </a> | <a href="https://github.com/ai-forever/vibe"> Github </a> | <a href="https://huggingface.co/spaces/iitolstykh/VIBE-Image-Edit-DEMO">🤗 Space | </a> <a href="https://huggingface.co/iitolstykh/VIBE-Image-Edit">🤗 VIBE-Image-Edit | </a> </p>

VIBE-DistilledCFG is a specialized version of the original VIBE-Image-Edit model.

This model can be run without classifier-free guidance, substantially reducing image generation time while maintaining high quality outputs.

Performance Comparison

Below is a comparison of total inference time between the original VIBE model (with CFG) and this DistilledCFG model (without CFG). Distillation yields an approximately 1.8x-2x speedup.

| Resolution | Original Model (with CFG) | DistilledCFG Model (No CFG) |
| :--- | :--- | :--- |
| 1024x1024 | 1.1453s | 0.6389s |
| 2048x2048 | 4.0837s | 1.9687s |
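The speedup comes from removing the second (unconditional) forward pass that classifier-free guidance normally requires at every denoising step. A conceptual sketch of the difference (illustrative only, not the VIBE implementation):

# With classifier-free guidance: two backbone passes per denoising step.
def step_with_cfg(backbone, x, t, cond, uncond, guidance_scale):
    eps_cond = backbone(x, t, cond)      # conditional prediction
    eps_uncond = backbone(x, t, uncond)  # unconditional prediction
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# CFG-distilled: the guidance behaviour is baked into the weights, so a single
# conditional pass per step suffices, roughly halving per-step compute.
def step_distilled(backbone, x, t, cond):
    return backbone(x, t, cond)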

Model Details

  • Name: VIBE-DistilledCFG
  • Parent Model: iitolstykh/VIBE-Image-Edit
  • Task: Text-Guided Image Editing
  • Architecture:
    • Diffusion Backbone: Sana1.5 (1.6B parameters) with Linear Attention.
    • Condition Encoder: Qwen3-VL (2B parameters).
  • Technique: Classifier-Free Guidance (CFG) Distillation.
  • Model precision: torch.bfloat16 (BF16)
  • Model resolution: Optimized for up to 2048px images.

Features

  • Blazing Fast Inference: Runs approximately 2x faster than the original model by skipping the guidance pass.
  • Text-Guided Editing: Edit images using natural language instructions.
  • Compact & Efficient: Retains the lightweight footprint of the original 1.6B/2B architecture.
  • Multimodal Understanding: Powered by Qwen3-VL for precise instruction following.
  • Text-to-Image support.

Inference Requirements

  • vibe library:
    pip install git+https://github.com/ai-forever/VIBE
  • requirements for the vibe library:
    pip install transformers==4.57.1 torchvision==0.21.0 torch==2.6.0 diffusers==0.33.1 loguru==0.7.3

Quick start

Note: When using this distilled model, please set image_guidance_scale and guidance_scale to 0.0 to disable CFG.

from PIL import Image
import requests
from io import BytesIO
from huggingface_hub import snapshot_download

from vibe.editor import ImageEditor

# Download model
model_path = snapshot_download(
    repo_id="iitolstykh/VIBE-Image-Edit-DistilledCFG",
    repo_type="model",
)

# Load model
# Note: Guidance scales are removed for the distilled version
editor = ImageEditor(
    checkpoint_path=model_path,
    num_inference_steps=20,
    image_guidance_scale=0.0,
    guidance_scale=0.0,
    device="cuda:0",
)

# Download test image
resp = requests.get('https://image.civitai.com/xG1nkqKTMzGDvpLrqFT7WA/3f58a82a-b4b4-40c3-a318-43f9350fcd02/original=true,quality=90/115610275.jpeg')
image = Image.open(BytesIO(resp.content))

# Generate edited image
edited_image = editor.generate_edited_image(
    instruction="let this case swim in the river",
    conditioning_image=image,
    num_images_per_prompt=1,
)[0]

edited_image.save("edited_image.jpg", quality=100)

License

This project is built upon SANA. Please refer to the original SANA license for usage terms: SANA License

Citation

If you use this model in your research or applications, please acknowledge the original projects:

@misc{vibe2026,
  Author = {Grigorii Alekseenko and Aleksandr Gordeev and Irina Tolstykh and Bulat Suleimanov and Vladimir Dokholyan and Georgii Fedorov and Sergey Yakubson and Aleksandra Tsybina and Mikhail Chernyshov and Maksim Kuprashevich},
  Title = {VIBE: Visual Instruction Based Editor},
  Year = {2026},
  Eprint = {arXiv:2601.02242},
}

Author: iitolstykh

Likes: 7

Downloads: 0

Tags: diffusers, safetensors, image-editing, text-guided-editing, diffusion, sana, qwen-vl, multimodal, distilled, cfg-distillation, image-to-image, en, arxiv:2601.02242, base_model:iitolstykh/VIBE-Image-Edit, base_model:finetune:iitolstykh/VIBE-Image-Edit, diffusers:VIBESanaEditingPipeline, region:us

worstplayer/acestep15_drumnbass_lora


Base model: ACE-Step/Ace-Step1.5 · Pipeline: text-to-audio · Tags: music

This LoRA was trained on 17 drum'n'bass tracks composed by myself.

It is an attempt to fix the model's inability to produce fast breakbeats. While it doesn't work 100% of the time, the result usually at least vaguely resembles drum'n'bass.

Example output: <audio controls src="https://cdn-uploads.huggingface.co/production/uploads/66edc6ec77590e90ba7c109c/BrjdCdS3mIE2ucOtIMLw6.mpga"></audio>

Author: worstplayer

Likes: 7

Downloads: 0

Tags: music, text-to-audio, base_model:ACE-Step/Ace-Step1.5, base_model:finetune:ACE-Step/Ace-Step1.5, region:us

inclusionAI/LLaDA2.1-flash


License: apache-2.0 · Library: transformers · Tags: dllm, diffusion, llm, text_generation

LLaDA2.1-flash

LLaDA2.1-flash is a diffusion language model of the LLaDA series featuring the editing enhancement. It significantly improves inference speed while delivering strong task performance.

<div align="center"> <img src="https://mdn.alipayobjects.com/huamei_qa8qxu/afts/img/A*uOo8QKQMiBwAAAAAgNAAAAgAemJ7AQ/original" width="800" /> </div> <div align="center"> <img src="https://mdn.alipayobjects.com/huamei_qa8qxu/afts/img/A*biwvQpCmKjEAAAAAULAAAAgAemJ7AQ/original" width="800" /> </div> --- <table> <thead> <tr> <th align="left"><b>Benchmark</b></th> <th align="center"><b>Qwen3-30B-<br>A3B-Inst-2507</b><br><sub>(Score)</sub></th> <th align="center"><b>Ling-flash-2.0</b><br><br><sub>(Score)</sub></th> <th align="center"><b>LLaDA2.0-flash</b><br><br><sub>(Score | TPF)</sub></th> <th align="center"><b>LLaDA2.1-flash<br>(S Mode)</b><br><sub>(Score | TPF)</sub></th> <th align="center"><b>LLaDA2.1-flash<br>(Q Mode)</b><br><sub>(Score | TPF)</sub></th> </tr> </thead> <tbody> <tr> <td align="left"><b>Average</b></td> <td align="center">73.09</td> <td align="center">71.52</td> <td align="center">72.43 | 3.08</td> <td align="center">72.34 | 5.93</td> <td align="center">73.54 | 3.64</td> </tr> <tr> <td colspan="6" align="center"><b>Knowledge</b></td> </tr> <tr> <td align="left">GPQA</td> <td align="center">54.14</td> <td align="center">69.16</td> <td align="center">62.31 | 3.29</td> <td align="center">66.67 | 3.95</td> <td align="center">67.30 | 2.37</td> </tr> <tr> <td align="left">MMLU-Pro</td> <td align="center">74.21</td> <td align="center">77.55</td> <td align="center">74.79 | 2.36</td> <td align="center">75.31 | 4.43</td> <td align="center">76.59 | 2.62</td> </tr> <tr> <td align="left">C-EVAL</td> <td align="center">88.12</td> <td align="center">87.54</td> <td align="center">85.21 | 1.90</td> <td align="center">86.93 | 2.71</td> <td align="center">86.71 | 1.75</td> </tr> <tr> <td align="left">PHYBench</td> <td align="center">29.84</td> <td align="center">27.67</td> <td align="center">30.06 | 2.70</td> <td align="center">26.04 | 4.10</td> <td align="center">28.23 | 2.66</td> </tr> <tr> <td align="left">TriviaQA</td> <td align="center">65.61</td> <td align="center">69.76</td> <td align="center">66.88 | 1.94</td> <td align="center">72.55 | 4.30</td> <td align="center">72.93 | 2.92</td> </tr> <tr> <td colspan="6" align="center"><b>Reasoning</b></td> </tr> <tr> <td align="left">BIG-Bench Hard</td> <td align="center">85.54</td> <td align="center">89.36</td> <td align="center">86.75 | 2.66</td> <td align="center">87.82 | 5.61</td> <td align="center">88.69 | 3.28</td> </tr> <tr> <td align="left">BIG-Bench Extra Hard</td> <td align="center">37.80</td> <td align="center">23.24</td> <td align="center">27.86 | 4.60</td> <td align="center">33.51 | 5.04</td> <td align="center">35.77 | 3.17</td> </tr> <tr> <td align="left">bbh-zh</td> <td align="center">86.18</td> <td align="center">75.09</td> <td align="center">87.52 | 3.21</td> <td align="center">82.55 | 5.78</td> <td align="center">86.23 | 3.77</td> </tr> <tr> <td align="left">MuSR</td> <td align="center">79.15</td> <td align="center">82.72</td> <td align="center">82.72 | 1.70</td> <td align="center">80.10 | 2.90</td> <td align="center">79.84 | 1.85</td> </tr> <tr> <td align="left">ZebraLogic</td> <td align="center">90.97</td> <td align="center">87.60</td> <td align="center">82.30 | 2.74</td> <td align="center">84.20 | 5.80</td> <td align="center">88.90 | 3.26</td> </tr> <tr> <td align="left">PrOntoQA</td> <td align="center">97.12</td> <td align="center">97.88</td> <td align="center">96.50 | 2.64</td> <td align="center">95.00 | 9.23</td> <td align="center">97.00 | 5.73</td> </tr> <tr> <td align="left">PIQA</td> <td align="center">91.57</td> <td 
align="center">91.95</td> <td align="center">96.50 | 1.43</td> <td align="center">92.44 | 2.38</td> <td align="center">92.17 | 1.44</td> </tr> <tr> <td align="left">OCNLI</td> <td align="center">71.59</td> <td align="center">65.36</td> <td align="center">71.63 | 1.09</td> <td align="center">72.17 | 1.83</td> <td align="center">72.75 | 1.32</td> </tr> <tr> <td align="left">HellaSwag</td> <td align="center">86.31</td> <td align="center">81.59</td> <td align="center">84.97 | 1.26</td> <td align="center">85.60 | 2.31</td> <td align="center">85.31 | 1.51</td> </tr> <tr> <td align="left">KOR-Bench</td> <td align="center">69.2</td> <td align="center">69.44</td> <td align="center">63.04 | 3.44</td> <td align="center">62.80 | 4.97</td> <td align="center">65.12 | 2.77</td> </tr> <tr> <td align="left">DROP</td> <td align="center">87.57</td> <td align="center">88.32</td> <td align="center">87.90 | 2.26</td> <td align="center">87.55 | 5.40</td> <td align="center">87.86 | 2.53</td> </tr> <tr> <td align="left">SQuAD 2.0</td> <td align="center">89.51</td> <td align="center">81.32</td> <td align="center">90.00 | 3.10</td> <td align="center">90.65 | 5.01</td> <td align="center">90.80 | 3.90</td> </tr> <tr> <td colspan="6" align="center"><b>Coding</b></td> </tr> <tr> <td align="left">LiveCodeBench</td> <td align="center">46.42</td> <td align="center">52.48</td> <td align="center">42.51 | 4.23</td> <td align="center">44.05 | 6.48</td> <td align="center">45.37 | 3.80</td> </tr> <tr> <td align="left">CRUXEval-O</td> <td align="center">86.75</td> <td align="center">82.75</td> <td align="center">85.12 | 3.21</td> <td align="center">85.25 | 6.54</td> <td align="center">87.50 | 3.80</td> </tr> <tr> <td align="left">MBPP+</td> <td align="center">78.21</td> <td align="center">80.89</td> <td align="center">79.37 | 4.02</td> <td align="center">76.72 | 10.43</td> <td align="center">77.25 | 5.96</td> </tr> <tr> <td align="left">HumanEval+</td> <td align="center">87.88</td> <td align="center">87.58</td> <td align="center">88.41 | 6.45</td> <td align="center">89.63 | 13.81</td> <td align="center">89.63 | 9.18</td> </tr> <tr> <td align="left">MultiPL-E</td> <td align="center">70.67</td> <td align="center">65.76</td> <td align="center">74.87 | 3.14</td> <td align="center">70.89 | 7.77</td> <td align="center">73.34 | 4.33</td> </tr> <tr> <td align="left">BigCodeBench-Full</td> <td align="center">41.49</td> <td align="center">40.70</td> <td align="center">41.58 | 3.33</td> <td align="center">37.11 | 8.51</td> <td align="center">39.21 | 4.70</td> </tr> <tr> <td align="left">BIRD-SQL</td> <td align="center">47.75</td> <td align="center">47.49</td> <td align="center">45.76 | 2.16</td> <td align="center">42.18 | 5.09</td> <td align="center">44.04 | 2.95</td> </tr> <tr> <td align="left">Spider</td> <td align="center">81.79</td> <td align="center">80.58</td> <td align="center">82.49 | 4.42</td> <td align="center">79.18 | 8.74</td> <td align="center">81.04 | 5.70</td> </tr> <tr> <td colspan="6" align="center"><b>Math</b></td> </tr> <tr> <td align="left">AIME 2025</td> <td align="center">61.88</td> <td align="center">55.89</td> <td align="center">60.00 | 4.57</td> <td align="center">63.33 | 5.36</td> <td align="center">63.33 | 3.46</td> </tr> <tr> <td align="left">OlympiadBench</td> <td align="center">77.59</td> <td align="center">76.19</td> <td align="center">74.07 | 3.70</td> <td align="center">75.85 | 6.46</td> <td align="center">76.59 | 3.81</td> </tr> <tr> <td align="left">GSM-Plus</td> <td align="center">89.41</td> <td 
align="center">89.71</td> <td align="center">89.74 | 2.68</td> <td align="center">89.23 | 7.14</td> <td align="center">89.69 | 3.83</td> </tr> <tr> <td align="left">CMATH</td> <td align="center">96.58</td> <td align="center">96.52</td> <td align="center">96.90 | 2.17</td> <td align="center">96.54 | 4.84</td> <td align="center">96.63 | 2.65</td> </tr> <tr> <td align="left">Omni-MATH</td> <td align="center">54.00</td> <td align="center">53.00</td> <td align="center">50.30 | 3.39</td> <td align="center">52.30 | 6.01</td> <td align="center">54.10 | 3.50</td> </tr> <tr> <td colspan="6" align="center"><b>Agent & Alignment</b></td> </tr> <tr> <td align="left">IFEval-strict-prompt</td> <td align="center">83.73</td> <td align="center">81.15</td> <td align="center">82.62 | 1.47</td> <td align="center">83.36 | 2.24</td> <td align="center">83.55 | 1.41</td> </tr> <tr> <td align="left">BFCL v3</td> <td align="center">73.41</td> <td align="center">67.69</td> <td align="center">74.94 | 4.87</td> <td align="center">74.86 | 9.24</td> <td align="center">75.61 | 6.76</td> </tr> <tr> <td align="left">Nexus FC</td> <td align="center">49.93</td> <td align="center">36.25</td> <td align="center">50.45 | 5.53</td> <td align="center">44.83 | 11.29</td> <td align="center">47.65 | 7.38</td> </tr> </tbody> </table>

🚀 Highlights

  • Error-Correcting Editing: a structural innovation that makes dLLM generation editable, enabling error correction during decoding.
  • Speed vs. Quality Mode: the 100B flash model achieves ultra-fast inference in Speed Mode while remaining competitive across a wide range of tasks in Quality Mode.
  • Reinforcement Learning on a 100B-scale dLLM: a tailored algorithm and framework enable reinforcement learning for large dLLMs.

🗺️ What's Next

  • Powerful Agentic/Tool-Use Capability with LLaDA: the next update will ship with powerful agentic and long-horizon tool-use capabilities.
  • Extreme Editing: the next update will feature stronger and more extensive editing, aimed at correcting more errors in parallel during reasoning.
  • Explore More Training Paradigms: we want to explore training paradigms beyond SFT and RL for dLLMs.

📦 Model Variants

| Model ID | Description | Hugging Face Link |
| --- | --- | --- |
| inclusionAI/LLaDA2.1-mini | Instruction-tuned model, ready for downstream applications. | 🤗 Model Card |
| inclusionAI/LLaDA2.1-flash | Instruction-tuned model, ready for downstream applications. | 🤗 Model Card |


🔍 Model Overview

LLaDA2.1-flash has the following specifications:

  • Type: Mixture-of-Experts (MoE) Diffusion Language Model
  • Total Parameters (Non-Embedding): 100B
  • Number of Layers: 32
  • Attention Heads: 32
  • Context Length: 32,768 tokens
  • Position Embedding: Rotary (RoPE)
  • Vocabulary Size: 157,184

🤗 Hugging Face Transformers

Make sure you have transformers and its dependencies installed:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "/path/to/LLaDA2.1-flash"
device = "auto"

# Load the diffusion LM with its custom generation code (trust_remote_code)
# and shard it automatically across the available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_path, trust_remote_code=True, device_map=device,
)
model = model.to(torch.bfloat16)
model.eval()
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

prompt = """Calculate 1+5-28*0.5-200=?"""
input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    tokenize=True,
    return_tensors="pt",
).to(model.device)  # move the prompt tokens onto the model's device
generated_tokens = model.generate(
    inputs=input_ids,
    eos_early_stop=True,
    gen_length=512,
    block_length=32,
    threshold=0.5,
    editing_threshold=0,
    temperature=0.0,
)
generated_answer = tokenizer.decode(
    generated_tokens[0],
    skip_special_tokens=True,
)
print(generated_answer)

Multi-block editing inference coming soon.

Best Practices

To achieve optimal performance, we recommend the following settings:

  1. Sampling Parameters: We recommend the following general sampling parameters: block_length=32, temperature=0.0, top_p=None and top_k=None. We are currently exploring more diverse sampling configurations.

  2. Denoising Thresholds: There are three denoising params: threshold, editing_threshold and max_post_steps. We recommend threshold=0.7, editing_threshold=0.5 for Quality Mode and threshold=0.5, editing_threshold=0.0 for Speed Mode. For both modes, we suggest setting max_post_steps to a value greater than 5. We recommend 16 as a balanced default, which was used for most of our internal testing.

Note: A low threshold may cause stuttering, as a trade-off for faster inference.

  3. Adequate Output Length: We recommend using an output length of 16384 tokens for most scenarios.

🤖ModelScope

If you are in mainland China, we strongly recommend downloading the model from 🤖 ModelScope.


Deployment

SGLang

SGLang enables dLLM inference either through offline batching or by launching an HTTP server for online requests. You can start the SGLang server with the following command:

python3 -m sglang.launch_server \
	  --model-path inclusionAI/LLaDA2.1-flash \
	  --dllm-algorithm JointThreshold \
	  --tp-size 4 \
	  --trust-remote-code \
	  --mem-fraction-static 0.8 \
	  --max-running-requests 1 \
	  --attention-backend flashinfer	

Environment Preparation

The pull request adding this support has been submitted and merged into the SGLang repository; please prepare your environment with the latest SGLang version.


🌐 License

This project is licensed under the terms of the Apache License 2.0.


🤝 Contact & Collaboration

For questions, collaborations, or feedback, please reach out via Hugging Face or open an issue in the repository.

👉 Join us in advancing open, efficient, and intelligent language models!

Author: inclusionAI

Likes: 4

Downloads: 0

Tags: transformers, safetensors, llada2_moe, text-generation, dllm, diffusion, llm, text_generation, conversational, custom_code, license:apache-2.0, region:us

tsukemono/neuTTS-JP-150m


Language: ja · Library: transformers · Pipeline: text-to-speech · License: cc-by-nc-4.0 · Dataset: amphion/Emilia-Dataset · Base model: llm-jp/llm-jp-3-150m · Tags: tts, audio, japanese

neuTTS-JP-150m

This is a Japanese-only TTS model.

Because the tokenizer has been heavily modified, the model cannot speak languages other than Japanese.

The goal is voice cloning with only 150M parameters.

The model is designed with streaming playback in mind and introduces padding tokens to represent silent segments.

At inference time, please handle padding tokens appropriately, for example by ignoring them or treating them as segment breaks.

Installation

pip install torch torchaudio transformers neucodec pyopenjtalk

Inference

from pathlib import Path

import torch
import torchaudio
from torchaudio import transforms as T
from transformers import AutoModelForCausalLM, AutoTokenizer
from neucodec import NeuCodec

# Load the models
tokenizer = AutoTokenizer.from_pretrained("tsukemono/neuTTS-JP-150m", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("tsukemono/neuTTS-JP-150m")
model.eval()
codec = NeuCodec.from_pretrained("neuphonic/neucodec")  # codec checkpoint name assumed; see the neucodec docs
codec = codec.eval()

# Encode the reference audio (the filename is a placeholder for your own sample)
waveform, sr = torchaudio.load("reference_audio.mp3")
if waveform.shape[0] > 1:
    waveform = waveform.mean(dim=0, keepdim=True)
if sr != 16_000:
    waveform = T.Resample(sr, 16_000)(waveform)
waveform = waveform.unsqueeze(0)  # (B, 1, T_16k)
with torch.inference_mode():
    ref_codes = codec.encode_code(waveform).flatten().tolist()

# Tokenize the text to synthesize (Japanese only) and build the prompt
text_ids = tokenizer(
    "ここに作成したいテキストを書いてください",  # i.e. "write the text you want synthesized here"
    add_special_tokens=False,
    return_attention_mask=False,
    return_token_type_ids=False,
)["input_ids"]

eos_id = int(tokenizer.eos_token_id)
input_ids = ref_codes + [eos_id] + text_ids + [eos_id]
input_ids = torch.tensor([input_ids], dtype=torch.long)

# Generate
with torch.inference_mode():
    generated = model.generate(
        input_ids=input_ids,
        repetition_penalty=1.1,
        max_new_tokens=1500,
    )

# Extract only the audio tokens from the generated sequence
gen_ids = generated[0, input_ids.shape[1] :]
gen_ids = gen_ids[gen_ids < 65536]

# Decode the audio tokens with the codec (vocoder)
with torch.inference_mode():
    audio_data = codec.decode_code(gen_ids.unsqueeze(0).unsqueeze(0)).cpu()
torchaudio.save("output.mp3", audio_data[0], 24_000, format="mp3")

Sample Audio

<audio controls src="https://huggingface.co/tsukemono/neuTTS-JP-150m/resolve/main/inference_sample/sample_output_1.mp3?download=true"></audio> <audio controls src="https://huggingface.co/tsukemono/neuTTS-JP-150m/resolve/main/inference_sample/sample_output_2.mp3?download=true"></audio> <audio controls src="https://huggingface.co/tsukemono/neuTTS-JP-150m/resolve/main/inference_sample/sample_output_3.mp3?download=true"></audio> <audio controls src="https://huggingface.co/tsukemono/neuTTS-JP-150m/resolve/main/inference_sample/sample_output_4.mp3?download=true"></audio> <audio controls src="https://huggingface.co/tsukemono/neuTTS-JP-150m/resolve/main/inference_sample/sample_output_5.mp3?download=true"></audio>

Author: tsukemono

Likes: 4

Downloads: 0

Tags: transformers, safetensors, llama, text-generation, tts, audio, japanese, text-to-speech, ja, dataset:amphion/Emilia-Dataset, base_model:llm-jp/llm-jp-3-150m, base_model:finetune:llm-jp/llm-jp-3-150m, license:cc-by-nc-4.0, text-generation-inference, endpoints_compatible, region:us

artificialguybr/LOGO-REDMOND-ZIMAGETURBO


Tags: text-to-image, lora, diffusers, template:diffusion-lora · Base model: Tongyi-MAI/Z-Image-Turbo · Instance prompt: Logo, logoredmaf · License: apache-2.0

LOGO REDMOND Z IMAGE

<Gallery />

Model description

Z Image Turbo - Logo

These are LoRA adapter weights for Z Image Turbo.

Acknowledgment

I'm grateful for the GPU time from Redmond.AI that allowed me to make this model!

Description

This LoRA adapter has been trained specifically for generating logo style artwork. It provides clean, professional logo designs with excellent quality and consistent style.

The model works exceptionally well with Z Image Turbo and can be used to create various types of logo illustrations, from character art to more detailed designs.

Trigger words

Use `LOGO, logoredmaf` to activate the LoRA style and generate logo images.

How to Use

I recommend using this LoRA with ComfyUI for the best results.


Support & Links

Support My Work

If you find this model useful, please consider supporting my work:

  • Patreon: https://www.patreon.com/artificialguybr
  • Ko-fi: https://ko-fi.com/artificialguybr
  • Buy Me a Coffee: https://buymeacoffee.com/jvkape

Acknowledgment

GPU time provided by Redmond.AI

Trigger words

You should use Logo to trigger the image generation.

You should use logoredmaf to trigger the image generation.

Download model

Download them in the Files & versions tab.

Author: artificialguybr

Likes: 3

Downloads: 0

Tags: diffusers, text-to-image, lora, template:diffusion-lora, base_model:Tongyi-MAI/Z-Image-Turbo, base_model:adapter:Tongyi-MAI/Z-Image-Turbo, license:apache-2.0, region:us

KlingTeam/MultiShotMaster

MultiShotMaster: A Controllable Multi-Shot Video Generation Framework

<!-- ### <div align="center"> SIGGRAPH Asia 2025 </div> --> <div align="center"> <p> <a href="https://qinghew.github.io/">Qinghe Wang</a><sup>1</sup> <a href="https://xiaoyushi97.github.io/">Xiaoyu Shi</a><sup>2✉</sup> <a href="https://libaolu312.github.io/">Baolu Li</a><sup>1</sup> <a href="https://wkbian.github.io/">Weikang Bian</a><sup>3</sup> <a href="https://liuquande.github.io/">Quande Liu</a><sup>2</sup> <a href="https://scholar.google.com/citations?user=D3nE0agAAAAJ&hl=zh-CN&oi=ao">Huchuan Lu</a><sup>1</sup> <br> <a href="https://xinntao.github.io/">Xintao Wang</a><sup>2</sup> <a href="https://magicwpf.github.io/">Pengfei Wan</a><sup>2</sup> <a href="https://scholar.google.com/citations?user=PXO4ygEAAAAJ&hl=zh-CN">Kun Gai</a><sup>2</sup> <a href="https://stephenjia.github.io/">Xu Jia</a><sup>1✉</sup> </p> <p> <sup>1</sup>Dalian University of Technology &nbsp;&nbsp; <sup>2</sup>Kling Team, Kuaishou Technology<br> <sup>3</sup>The Chinese University of Hong Kong &nbsp;&nbsp; <!-- <sup>3</sup>HKUST(GZ) &nbsp;&nbsp; --> <sup>✉</sup>Corresponding author </p> </div> <p align="center"> <a href='https://qinghew.github.io/MultiShotMaster/'><img src='https://img.shields.io/badge/Project-Page-Green'></a> &nbsp; <a href="https://arxiv.org/abs/2512.03041"><img src="https://img.shields.io/static/v1?label=Arxiv&message=MultiShotMaster&color=red&logo=arxiv"></a> &nbsp; <a href='https://huggingface.co/KlingTeam/MultiShotMaster'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-orange'></a> </p>

Please refer to the Github README for usage.

📌 TL;DR

MultiShotMaster is a controllable multi-shot narrative video generation framework that supports 1) text-driven inter-shot consistency, 2) variable shot counts and shot durations, 3) customized subject with motion control, and 4) background-driven customized scene.

🌟 Citation

Please leave us a star 🌟 and cite our paper if you find our work helpful.

@article{wang2025multishotmaster,
  title={MultiShotMaster: A Controllable Multi-Shot Video Generation Framework},
  author={Wang, Qinghe and Shi, Xiaoyu and Li, Baolu and Bian, Weikang and Liu, Quande and Lu, Huchuan and Wang, Xintao and Wan, Pengfei and Gai, Kun and Jia, Xu},
  journal={arXiv preprint arXiv:2512.03041},
  year={2025}
}

Author: KlingTeam

Likes: 3

Downloads: 0

Tags: arxiv:2512.03041, region:us

Andycurrent/Gemma-3-4B-VL-it-Gemini-Pro-Heretic-Uncensored-Thinking_GGUF


License: apache-2.0 · Language: en · Base model: Gemma-3-4B · Tags: vision-language, multimodal, uncensored, gguf, text-generation, image-understanding

Gemma-3-4B-VL-it-Gemini-Pro-Heretic-Uncensored-Thinking-GGUF

This repository contains Gemma-3-4B-VL-it-Gemini-Pro-Heretic-Uncensored-Thinking-GGUF, a 4B-parameter vision-language instruction-tuned model provided in GGUF format for efficient local inference. The model is designed for open-ended reasoning, multimodal understanding, and minimal alignment constraints, making it suitable for experimentation, research, and advanced local deployments.


Model Summary

  • Model ID: Gemma-3-4B-VL-it-Gemini-Pro-Heretic-Uncensored-Thinking-GGUF
  • Architecture: Gemma 3 (4B parameters)
  • Type: Vision-Language (Text + Image)
  • Format: GGUF
  • Publisher: mradermacher
  • License: Apache 2.0 (inherits from base model)

Key Characteristics

  • Multimodal input support (text + images)
  • Instruction-tuned for conversational and reasoning tasks
  • Reduced content filtering and alignment constraints
  • Optimized for local inference runtimes
  • Suitable for research, exploration, and advanced user workflows

⚠️ This model is uncensored. Outputs may include sensitive or unfiltered content. Use responsibly.


Supported Use Cases

Text-Based

  • Conversational assistants
  • Creative writing and storytelling
  • Summarization and rewriting
  • General reasoning and analysis

Vision + Text

  • Image captioning
  • Visual question answering
  • Scene and object understanding
  • Multimodal reasoning tasks

GGUF Compatibility

This model can be used with GGUF-compatible runtimes such as:

  • llama.cpp
  • Ollama (GGUF-based builds)
  • Other local inference engines supporting GGUF

Performance and supported features may vary depending on runtime and hardware.


Basic Usage Example

Command Line (llama.cpp-style)

./main \
  -m Andycurrent/Gemma-3-4B-VL-it-Gemini-Pro-Heretic-Uncensored-Thinking_GGUF_F16.gguf \
  -p "Describe the key idea behind multimodal AI models."

Usage Notes

  • Provide clear, explicit prompts for best results
  • When using images, ensure proper formatting and resolution
  • Add moderation or filtering layers if deploying in public-facing applications

Ethical Considerations

Due to its uncensored nature:

  • Not recommended for unrestricted public deployment
  • Should not be used in safety-critical environments
  • Users are responsible for compliance with applicable laws and policies

Acknowledgements

  • Gemma base model contributors
  • Open-source inference and quantization communities
  • Tools and runtimes enabling efficient local LLM deployment

Author: Andycurrent

Likes: 3

Downloads: 0

Tags: gguf, vision-language, multimodal, uncensored, text-generation, image-understanding, en, license:apache-2.0, endpoints_compatible, region:us, conversational

artificialguybr/Logo-Redmond-QwenImage2512


Tags: text-to-image, lora, diffusers, template:diffusion-lora · Base model: Qwen/Qwen-Image-2512 · Instance prompt: LOGO, logoredmaf · License: apache-2.0

Logo Redmond

<Gallery />

Model description

Qwen Image 2512 - Logo

These are LoRA adapter weights for Qwen Image.

Acknowledgment

I'm grateful for the GPU time from Redmond.AI that allowed me to make this model!

Description

This LoRA adapter has been trained specifically for generating logo style artwork. It provides clean, professional logo designs with excellent quality and consistent style.

The model works exceptionally well with Qwen Image and can be used to create various types of logo illustrations, from character art to more detailed designs.

Trigger words

Use `LOGO, logoredmaf` to activate the LoRA style and generate logo images.

How to Use

I recommend using this LoRA with ComfyUI for the best results.
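If you would rather script generation than use ComfyUI, a minimal, untested sketch with diffusers might look like the following (it assumes the base checkpoint loads via DiffusionPipeline and supports the standard load_lora_weights API; the prompt is only an example):

import torch
from diffusers import DiffusionPipeline

# Load the base model and attach this LoRA adapter.
pipe = DiffusionPipeline.from_pretrained(
    "Qwen/Qwen-Image-2512", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("artificialguybr/Logo-Redmond-QwenImage2512")

# Include the trigger words from this card: "LOGO, logoredmaf".
image = pipe(
    prompt="LOGO, logoredmaf, minimalist fox head logo, flat vector style",
    num_inference_steps=30,
).images[0]
image.save("logo.png")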


Support & Links

Support My Work

If you find this model useful, please consider supporting my work:

  • Patreon: https://www.patreon.com/artificialguybr
  • Ko-fi: https://ko-fi.com/artificialguybr
  • Buy Me a Coffee: https://buymeacoffee.com/jvkape

Acknowledgment

GPU time provided by Redmond.AI

Trigger words

You should use LOGO to trigger the image generation.

You should use logoredmaf to trigger the image generation.

Download model

Download them in the Files & versions tab.

Author: artificialguybr

Likes: 2

Downloads: 0

Tags: diffusers, text-to-image, lora, template:diffusion-lora, base_model:Qwen/Qwen-Image-2512, base_model:adapter:Qwen/Qwen-Image-2512, license:apache-2.0, region:us

artificialguybr/Logo-Redmond-FLUXKLEIN9B


Tags: text-to-image, lora, diffusers, template:diffusion-lora · Base model: black-forest-labs/FLUX.2-klein-9B · Instance prompt: LOGO, logoredmaf · License: apache-2.0

LOGO REDMOND

<Gallery />

Model description

FLUX.2 Klein 9B - Logo Redmond

These are LoRA adapter weights for FLUX.2 Klein 9B.

Acknowledgment

I'm grateful for the GPU time from Redmond.AI that allowed me to make this model!

Description

This LoRA adapter has been trained specifically for generating logo style artwork. It provides clean, professional logo designs with excellent quality and consistent style.

The model works exceptionally well with FLUX.2 Klein 9B and can be used to create various types of logo illustrations, from character art to more detailed designs.

Trigger words

Use `LOGO, logoredmaf` to activate the LoRA style and generate logo images.

How to Use

I recommend using this LoRA with ComfyUI for the best results.

Support & Links

Support My Work

If you find this model useful, please consider supporting my work:

  • Patreon: https://www.patreon.com/artificialguybr
  • Ko-fi: https://ko-fi.com/artificialguybr
  • Buy Me a Coffee: https://buymeacoffee.com/jvkape

Acknowledgment

GPU time provided by Redmond.AI

Trigger words

You should use LOGO to trigger the image generation.

You should use logoredmaf to trigger the image generation.

Download model

Download them in the Files & versions tab.

Author: artificialguybr

Likes: 2

Downloads: 0

Tags: diffusers, text-to-image, lora, template:diffusion-lora, base_model:black-forest-labs/FLUX.2-klein-9B, base_model:adapter:black-forest-labs/FLUX.2-klein-9B, license:apache-2.0, region:us