---
language:
- en
license: creativeml-openrail-m
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
- diffusers
- fine-tuned
- landscape
- photography
pipeline_tag: text-to-image
base_model: runwayml/stable-diffusion-v1-5
datasets:
- zh-plus/tiny-imagenet
- laion/laion-coco
inference: true
---
# Poralus-Image-1357
We are pleased to introduce Poralus-Image-1357, a fine-tuned text-to-image generation model built on top of Stable Diffusion v1.5. The model was developed and trained by Poralus with a focus on producing high-quality, atmospheric imagery with particular strength in natural environments, cinematic lighting, and compositional depth.
Training was conducted incrementally across multiple rounds, with each session building directly on the previous checkpoint rather than restarting from the base model. This approach preserves previously learned visual knowledge while progressively expanding the model's capabilities.
## Key Characteristics
- Atmospheric Natural Landscapes — The model demonstrates strong capability in rendering outdoor environments including mountains, forests, coastlines, and open terrain with realistic lighting and mood.
- Cinematic Color Grading — Outputs consistently exhibit a distinctive color treatment, favoring warm golden tones, desaturated moody palettes, and dramatic pink-to-purple sky gradients.
- Compositional Framing — The model has developed a tendency toward natural frame-within-frame compositions, using rock arches, foliage, and architectural openings to direct depth and focus.
- Seasonal and Atmospheric Conditions — Fog, mist, golden hour, and overcast lighting are rendered with high fidelity and consistency.
## Quick Start

Install the required libraries:

```bash
pip install diffusers transformers accelerate torch
```
Basic usage:

```python
from diffusers import StableDiffusionPipeline
import torch

pipe = StableDiffusionPipeline.from_pretrained(
    "Poralus/Poralus-Image-1357",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")

prompt = "a misty forest path in autumn with golden leaves, cinematic lighting, atmospheric depth"
image = pipe(
    prompt=prompt,
    num_inference_steps=50,
    guidance_scale=7.5,
    height=512,
    width=512,
).images[0]
image.save("output.png")
```
## Recommended Settings

| Parameter | Recommended Value | Notes |
|---|---|---|
| `num_inference_steps` | 30 to 50 | Higher values produce sharper detail |
| `guidance_scale` | 7.0 to 9.0 | Higher values adhere more strictly to the prompt |
| Resolution | 512 x 512 | Native training resolution |
| Negative prompt | `low quality, blurry, oversaturated, flat lighting` | Improves output consistency |
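For convenience, the values above can be gathered into a single keyword dict and unpacked into the pipeline call. This is only a sketch of one way to organize the settings; the `RECOMMENDED_SETTINGS` name and the choice of midpoint values are illustrative, not part of the model's API.

```python
# Recommended generation settings from the table above, collected so
# they can be unpacked into a StableDiffusionPipeline call.
RECOMMENDED_SETTINGS = {
    "num_inference_steps": 40,  # within the recommended 30-50 range
    "guidance_scale": 7.5,      # within the recommended 7.0-9.0 range
    "height": 512,              # native training resolution
    "width": 512,
    "negative_prompt": "low quality, blurry, oversaturated, flat lighting",
}

# Usage, with `pipe` constructed as in the Quick Start section:
# image = pipe(prompt=prompt, **RECOMMENDED_SETTINGS).images[0]
```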
## Prompt Guidelines
The model responds well to descriptive prompts that include environmental context, lighting conditions, and atmospheric cues. Generic prompts tend to produce competent but uncharacteristic results; prompts that lean into the model's learned aesthetic produce the strongest outputs.
Recommended prompt structure:

```
[subject or scene], [setting and environment], [lighting condition], [mood or atmosphere], [quality descriptor]
```
Effective examples:

- a calm mountain lake at dusk, soft reflections on still water, overcast sky, moody and cinematic
- a misty forest path in autumn, golden and amber leaves on the ground, fog through the trees, natural diffused light
- a dramatic coastal landscape at sunset, crashing waves on dark rocks, warm pink and orange sky, wide angle
- a vast open plain at twilight, distant treeline, dramatic gradient sky from deep blue to warm pink
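The prompt structure above can be sketched as a small helper that joins the bracketed components in order. The `build_prompt` name and signature are illustrative conveniences, not part of the model or the diffusers API.

```python
def build_prompt(subject, setting, lighting, mood, quality=""):
    """Assemble a prompt following the recommended structure:
    [subject or scene], [setting and environment], [lighting condition],
    [mood or atmosphere], [quality descriptor]."""
    parts = [subject, setting, lighting, mood, quality]
    # Skip any component left empty so no stray commas appear.
    return ", ".join(p for p in parts if p)

prompt = build_prompt(
    "a calm mountain lake at dusk",
    "soft reflections on still water",
    "overcast sky",
    "moody and cinematic",
)
# → "a calm mountain lake at dusk, soft reflections on still water, overcast sky, moody and cinematic"
```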
## Sample Outputs
**Prompt:** "a snowy mountain peak at dawn with pink clouds"

The model interpreted this as a view through a natural rock arch framing a deep purple and pink sky, with distant mountains on the horizon. The composition demonstrates the model's tendency to introduce natural framing elements.

**Prompt:** "a tropical beach with turquoise water and palm trees"

Output rendered as a view through a low palm canopy arch toward a turquoise ocean with a white sand shoreline. Color accuracy for water was high. The arch framing motif appeared again, consistent with the model's learned compositional style.

**Prompt:** "a misty forest path in autumn with golden leaves"

Strong prompt adherence. Output featured a central vanishing-point path through tall trees with full autumn foliage, ground covered in fallen leaves, and soft fog filling the mid-ground. Color grading was warm and accurate. This category represents the model's strongest performance domain.

**Prompt:** "a dramatic thunderstorm over an open field"

The model rendered this as a desaturated black-and-white river landscape under a heavy overcast sky, capturing the atmospheric mood rather than the literal subject. This demonstrates the model's tendency to interpret dramatic atmospheric prompts through a landscape lens.

**Prompt:** "a calm lake reflecting the milky way at night"

Output produced a moody mountain lake scene with a teal-grey color palette, dramatic peak reflections, and heavy cloud cover. The literal prompt element (the milky way) was not rendered; the model defaulted to its learned atmospheric treatment of night and water scenes.
## Training Details

| Parameter | Value |
|---|---|
| Base Model | runwayml/stable-diffusion-v1-5 |
| Training Method | Full UNet fine-tune, multi-round continued training |
| Optimizer | AdamW 8-bit (bitsandbytes) |
| Learning Rate | 1e-5 to 5e-5, cosine annealing decay |
| Batch Size | 1 (effective batch size 4 via gradient accumulation) |
| Gradient Accumulation Steps | 4 |
| Mixed Precision | fp16 |
| Gradient Checkpointing | Enabled |
| Resolution | 512 x 512 |
| Hardware | NVIDIA Tesla T4 (16 GB VRAM) |
| Training Duration | Multiple 1-hour sessions, each continuing from the prior checkpoint |
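The learning-rate schedule in the table can be sketched as standard cosine annealing between the two stated endpoints. This is a minimal illustration of the math, not the actual training script; the assumption that 5e-5 is the starting rate and 1e-5 the floor follows from the table's range but is not confirmed by the source.

```python
import math

def cosine_annealed_lr(step, total_steps, lr_max=5e-5, lr_min=1e-5):
    """Cosine annealing from lr_max down to lr_min over total_steps,
    matching the 1e-5 to 5e-5 range stated in the training table."""
    progress = step / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))

# Effective batch size from the table: per-device batch of 1,
# accumulated over 4 steps before each optimizer update.
effective_batch = 1 * 4

# The schedule starts at lr_max and decays smoothly to lr_min:
# cosine_annealed_lr(0, 1000)    → 5e-05
# cosine_annealed_lr(1000, 1000) → 1e-05
```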
## Training Data
Images were drawn from the following sources and stored locally prior to training:
- zh-plus/tiny-imagenet — 200-class image dataset covering diverse object and scene categories
- laion/laion-coco — Aesthetically filtered web images paired with descriptive captions
- Supplementary curated pool — 600 images across 30 categories including natural landscapes, architecture, urban environments, portraits, food, and abstract subjects
Training round 2 shifted dataset emphasis toward human figures, indoor architecture, and urban scenes to address weaknesses identified during evaluation of round 1 outputs.
## Known Limitations
- Human figures — Faces and body proportions lack the fine detail present in models specifically trained for portrait generation. Human subjects in complex poses may render with anatomical inconsistencies.
- Indoor and architectural interiors — Rooms, furniture, and constructed environments are rendered with less precision than outdoor scenes. Prompt adherence for interior subjects is lower than for landscapes.
- Literal prompt fidelity — The model frequently substitutes learned aesthetic patterns for literal prompt elements (e.g., interpreting "thunderstorm" as a moody greyscale landscape rather than active storm imagery).
- Text rendering — In-image text is not supported and will render as visual noise if requested.
- Resolution — The model was trained at 512 x 512. Generating at higher resolutions without tiling will produce degraded results.
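The resolution limitation above can be guarded against with a small validation helper. The `check_resolution` function is a hypothetical convenience, not part of the model or the diffusers API; it encodes the fact that Stable Diffusion's VAE downsamples by a factor of 8, so dimensions must be divisible by 8, and that this model's native resolution is 512 x 512.

```python
def check_resolution(height, width, native=512, vae_factor=8):
    """Validate requested dimensions for this model (illustrative helper).

    Stable Diffusion's VAE downsamples by a factor of 8, so height and
    width must each be divisible by 8. Sizes above the 512 x 512 training
    resolution tend to produce degraded results without tiling."""
    if height % vae_factor or width % vae_factor:
        raise ValueError(f"height and width must be multiples of {vae_factor}")
    oversized = height > native or width > native
    return not oversized  # False signals the caller should expect artifacts

# check_resolution(512, 512)  → True  (native resolution, safe)
# check_resolution(768, 512)  → False (expect degradation without tiling)
# check_resolution(500, 500)  raises ValueError
```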
## License
This model is released under the CreativeML Open RAIL-M License. Use of this model is subject to the terms of that license, including restrictions on harmful, deceptive, and non-consensual content generation.
## Citation

This model is built on Stable Diffusion v1.5. If you use it in published work, please cite the original Latent Diffusion Models paper:

Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-Resolution Image Synthesis with Latent Diffusion Models. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 10684-10695.

