Nano Banana image generation

Choose the right Gemini image model, optimize costs with batch workflows, and write prompts that use visual grounding for real-world accuracy.

Model selection

Pick the right Gemini image model for your use case. Each has distinct strengths and cost profiles.

| Model | Best for | Resolution | Tradeoffs |
|---|---|---|---|
| Nano Banana 1 (NB1) | Fast drafts, thumbnails, batch generation, social media assets | 512px default | Fastest and cheapest. Lower detail on faces and text. |
| Nano Banana 2 (NB2) | Editorial illustrations, headshots, text-in-image, brand assets | Up to 1024px | Better detail and text rendering. 2-3x slower than NB1. |
| Gemini Pro (image mode) | Hero images, print-quality output, complex multi-subject scenes | Up to 2048px | Highest quality. 5-10x cost of NB1. Rate-limited. |
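
The table can be read as a lookup by required output size. The following is a minimal Python sketch under stated assumptions: the `MODELS` list and `pick_model` helper are hypothetical names for illustration (not part of any Gemini SDK), and NB1's 512px default is treated as its ceiling, which the table does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageModel:
    name: str
    max_px: int  # largest output size listed in the table (assumed ceiling)


# Hypothetical registry, ordered cheapest first as the table describes.
MODELS = [
    ImageModel("NB1", 512),
    ImageModel("NB2", 1024),
    ImageModel("Gemini Pro", 2048),
]


def pick_model(required_px: int) -> ImageModel:
    """Return the cheapest listed model whose ceiling covers required_px."""
    for model in MODELS:
        if required_px <= model.max_px:
            return model
    raise ValueError(f"No listed model supports {required_px}px output")


print(pick_model(512).name)   # NB1
print(pick_model(800).name)   # NB2
print(pick_model(2048).name)  # Gemini Pro
```

Size is only one axis. For text-heavy or face-heavy work the table suggests moving up a tier even when a smaller size would do.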

Visual grounding with Google Search

Ground your prompts in real-world references so the model produces images with accurate details instead of hallucinated ones.

How it works

When you enable Google Search grounding, the model cross-references your prompt against real images and descriptions. This means asking for "the Brooklyn Bridge at sunset" produces the actual bridge, not a generic suspension bridge.

When to use it

Use grounding for real places, public figures, branded objects, or anything where accuracy matters. Skip it for abstract art, fictional scenes, or stylized illustrations where creative freedom is the point.
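
The rule of thumb above can be sketched as a small decision helper. This is an illustrative sketch only: the subject labels, the `should_ground` function, and the request dictionary are hypothetical and do not represent the actual Gemini API payload.

```python
# Subject categories taken from the guidance above; the labels are hypothetical.
GROUNDED_SUBJECTS = {"real_place", "public_figure", "branded_object"}
UNGROUNDED_SUBJECTS = {"abstract_art", "fictional_scene", "stylized_illustration"}


def should_ground(subject_type: str, accuracy_matters: bool = False) -> bool:
    """Decide whether to enable Google Search grounding for a prompt."""
    if subject_type in GROUNDED_SUBJECTS:
        return True
    if subject_type in UNGROUNDED_SUBJECTS:
        return False
    # Anything else: ground only when the caller says accuracy matters.
    return accuracy_matters


def build_request(prompt: str, subject_type: str, accuracy_matters: bool = False) -> dict:
    """Build a hypothetical request description (not a real API payload)."""
    return {
        "prompt": prompt,
        "grounding": should_ground(subject_type, accuracy_matters),
    }


print(build_request("the Brooklyn Bridge at sunset", "real_place"))
print(build_request("a dragon made of stained glass", "fictional_scene"))
```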

Cost optimization

The 512px batch-to-upscale workflow cuts costs by 60-80% compared to generating every candidate at full resolution.

1 Generate at 512px with NB1

Run your batch at the cheapest resolution. Review thumbnails and discard the ones that miss the mark.

2 Pick your winners

Select the 2-3 best compositions from your batch. You've spent pennies so far.

3 Upscale with NB2 or Pro

Regenerate only the winners at full resolution. You pay the higher cost only for images you'll use.
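
The arithmetic behind the three steps can be checked with a short sketch. It assumes costs are expressed relative to one NB1 draft at 512px (cost 1.0) and uses the 5-10x Pro multiplier from the model table. The `workflow_savings` helper is a hypothetical name, and actual pricing may differ.

```python
def workflow_savings(batch_size: int, winners: int, full_res_cost: float,
                     draft_cost: float = 1.0) -> float:
    """Fraction saved by drafting every candidate cheaply and regenerating
    only the winners, versus generating every candidate at full resolution."""
    if batch_size <= 0 or not 0 <= winners <= batch_size:
        raise ValueError("need batch_size > 0 and 0 <= winners <= batch_size")
    all_full = batch_size * full_res_cost
    staged = batch_size * draft_cost + winners * full_res_cost
    return 1 - staged / all_full


# Relative costs only: 1.0 = one NB1 draft, 5-10 = one Pro image (assumed).
print(f"{workflow_savings(10, 2, 5):.0%}")   # 60%
print(f"{workflow_savings(10, 2, 10):.0%}")  # 70%
print(f"{workflow_savings(20, 2, 10):.0%}")  # 80%
```

Under these assumptions the savings grow with batch size and with the cost gap between draft and final model, and shrink as more winners are kept.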

Parameters reference

Key parameters that control output quality, dimensions, and generation behavior.

| Parameter | Options | Notes |
|---|---|---|
| Resolution | 512, 768, 1024, 2048 | Higher = more detail, more cost, slower. 512 is fine for drafts. |
| Aspect ratio | 1:1, 16:9, 9:16, 4:3, 3:4 | Match your output format. 16:9 for headers, 9:16 for stories, 1:1 for avatars. |
| Thinking mode | On / Off | When on, the model reasons about composition before generating. Better results for complex scenes. |
| Number of images | 1-4 per request | Batch within a single request to reduce overhead. 4 images at 512px cost less than 1 at 2048px. |
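
One way to keep requests inside these limits is to validate settings before sending anything. The sketch below is a hypothetical container that mirrors the table. The class name, field names, and `PRESETS` mapping are assumptions for illustration and are not the actual API's parameter names.

```python
from dataclasses import dataclass

ALLOWED_RESOLUTIONS = (512, 768, 1024, 2048)
ALLOWED_ASPECT_RATIOS = ("1:1", "16:9", "9:16", "4:3", "3:4")


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical settings object; values are checked against the table."""
    resolution: int = 512
    aspect_ratio: str = "1:1"
    thinking: bool = False
    num_images: int = 1

    def __post_init__(self) -> None:
        if self.resolution not in ALLOWED_RESOLUTIONS:
            raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")
        if self.aspect_ratio not in ALLOWED_ASPECT_RATIOS:
            raise ValueError(f"aspect_ratio must be one of {ALLOWED_ASPECT_RATIOS}")
        if not 1 <= self.num_images <= 4:
            raise ValueError("num_images must be between 1 and 4")


# Format presets taken from the aspect ratio notes in the table.
PRESETS = {
    "header": GenerationSettings(aspect_ratio="16:9"),
    "story": GenerationSettings(aspect_ratio="9:16"),
    "avatar": GenerationSettings(aspect_ratio="1:1"),
}

draft_batch = GenerationSettings(resolution=512, num_images=4)
print(draft_batch)
```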

Prompt recipes

Tested prompt patterns for common creative tasks. Copy, adapt, and combine.

3D selfie

Turn a flat photo into a stylized 3D character render. Works best with clear face shots and simple backgrounds.

NB2 • 1024px

Anime-to-photo

Convert illustrated or anime-style characters into photorealistic portraits. Grounding helps with clothing and hair accuracy.

NB2 • Grounding

Historical street view

Recreate a specific street or location as it looked in a past decade. Combine grounding with era-specific details in the prompt.

Pro • Grounding

Crayon filter

Apply a hand-drawn crayon aesthetic to any scene. Good for editorial illustrations and kid-friendly content.

NB1 • 512px

Comic strip

Generate multi-panel comic layouts with consistent characters. Specify panel count and describe each scene separately in the prompt.

NB2 • 16:9
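
For the comic strip recipe, stating the panel count and describing each scene separately is easy to automate. The sketch below shows one possible way to assemble such a prompt. The `comic_prompt` helper and its exact wording are hypothetical and are not a tested prompt from the original guide.

```python
def comic_prompt(character: str, panels: list[str], style: str = "comic strip") -> str:
    """Assemble a multi-panel prompt: panel count first, then one line per scene."""
    if not panels:
        raise ValueError("at least one panel description is required")
    lines = [
        f"A {len(panels)}-panel {style} featuring {character}.",
        "Keep the character's appearance identical in every panel.",
    ]
    for number, scene in enumerate(panels, start=1):
        lines.append(f"Panel {number}: {scene.strip()}")
    return "\n".join(lines)


prompt = comic_prompt(
    "a reporter with a red scarf and round glasses",
    [
        "She receives a mysterious envelope at her desk.",
        "She studies a city map under a desk lamp.",
        "She knocks on a weathered door at night.",
    ],
)
print(prompt)
```

Repeating the same character description in every request is one way to reduce drift, though the character consistency limits described in this guide still apply.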

Known limitations

Current constraints to keep in mind when planning image generation workflows.

Text rendering

NB1 struggles with legible text in images. NB2 handles short words and labels. For anything longer than 3-4 words, composite text in post-production.
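
The text rendering guidance can be turned into a quick pre-flight check. This sketch is an assumption-laden illustration: `text_strategy` is a hypothetical helper, the threshold uses the upper end of the "3-4 words" guidance, and models other than NB1 are treated like NB2 because the guidance does not cover them separately.

```python
MAX_IN_IMAGE_WORDS = 4  # upper end of the "3-4 words" guidance (assumed cutoff)


def text_strategy(model: str, text: str) -> str:
    """Return 'in_image', 'composite', or 'none' for a piece of overlay text."""
    word_count = len(text.split())
    if word_count == 0:
        return "none"
    # NB1 struggles with legible text, so always add text in post-production.
    if model == "NB1" or word_count > MAX_IN_IMAGE_WORDS:
        return "composite"
    return "in_image"


print(text_strategy("NB2", "Grand Opening"))
print(text_strategy("NB1", "Sale"))
print(text_strategy("NB2", "Join us this Saturday at noon"))
```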

Hands and fingers

All models can produce anatomically incorrect hands. Pro is the most reliable. For important shots, generate multiple variants and pick the best.

Character consistency

Maintaining the same character across multiple images is unreliable. Use reference images and detailed descriptions, but expect variation.

Safety filters

Prompts mentioning public figures, violence, or sensitive topics may be blocked. Rephrase with descriptive language instead of proper nouns when this happens.

Install the skill

1 Clone the repository

git clone https://github.com/jamditis/claude-skills-journalism.git

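2 Copy skill to Claude skills directory

Run these from the directory that contains the nano-banana-image-gen folder. The mkdir step creates the skills directory if it does not exist yet.

mkdir -p ~/.claude/skills
cp -r nano-banana-image-gen ~/.claude/skills/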

3 Start generating

Ask Claude to generate images with Nano Banana, compare models, or optimize your prompt for visual grounding.

Based on @NanoBanana's guide to Gemini image generation (Mar 2026)

Adapted by Joe Amditis at the Center for Cooperative Media

Part of Claude Skills for Journalism • MIT License