Model selection
Pick the right Gemini image model for your use case. Each has distinct strengths and cost profiles.
| Model | Best for | Resolution | Tradeoffs |
|---|---|---|---|
| Nano Banana 1 (NB1) | Fast drafts, thumbnails, batch generation, social media assets | 512px default | Fastest and cheapest. Lower detail on faces and text. |
| Nano Banana 2 (NB2) | Editorial illustrations, headshots, text-in-image, brand assets | Up to 1024px | Better detail and text rendering. 2-3x slower than NB1. |
| Gemini Pro (image mode) | Hero images, print-quality output, complex multi-subject scenes | Up to 2048px | Highest quality. 5-10x the cost of NB1. Rate-limited. |
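One way to read the table is as a lookup: pick the cheapest model that lists your use case and reaches the resolution you need. The sketch below encodes that reading. The `ModelProfile` class, the `pick_model` helper, and the short use-case keywords are hypothetical names introduced for illustration; they are not part of any Gemini API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    name: str
    max_px: int
    best_for: tuple[str, ...]


# Values copied from the table above, ordered cheapest first.
# The keyword labels are illustrative shorthand, not official categories.
MODELS = (
    ModelProfile("NB1", 512, ("draft", "thumbnail", "batch", "social")),
    ModelProfile("NB2", 1024, ("editorial", "headshot", "text-in-image", "brand")),
    ModelProfile("Gemini Pro", 2048, ("hero", "print", "multi-subject")),
)


def pick_model(use_case: str, min_px: int = 512) -> str:
    """Return the cheapest model that fits the use case and resolution."""
    # First pass: a model that lists the use case and is large enough.
    for model in MODELS:
        if use_case in model.best_for and model.max_px >= min_px:
            return model.name
    # Second pass: resolution wins over use case, so take the cheapest
    # model that can reach the requested size.
    for model in MODELS:
        if model.max_px >= min_px:
            return model.name
    raise ValueError(f"No model in the table supports {min_px}px")
```

For example, `pick_model("headshot")` returns `"NB2"`, while `pick_model("thumbnail", min_px=2048)` falls through to `"Gemini Pro"` because only that model reaches 2048px.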
Visual grounding with Google Search
Ground your prompts in real-world references so the model produces images with accurate details instead of hallucinated ones.
How it works
When you enable Google Search grounding, the model cross-references your prompt against real images and descriptions. This means asking for "the Brooklyn Bridge at sunset" produces the actual bridge, not a generic suspension bridge.
When to use it
Use grounding for real places, public figures, branded objects, or anything where accuracy matters. Skip it for abstract art, fictional scenes, or stylized illustrations where creative freedom is the point.
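The rule of thumb above can be written as a small decision helper. This is a sketch only: the category labels and the `use_grounding` function are hypothetical, and the helper only decides whether to turn grounding on. It does not call any API.

```python
# Category labels are illustrative; they mirror the guidance above.
GROUND = {"real_place", "public_figure", "branded_object"}
SKIP = {"abstract_art", "fictional_scene", "stylized_illustration"}


def use_grounding(subject_kind: str, accuracy_matters: bool = False) -> bool:
    """Decide whether a prompt should enable Google Search grounding."""
    if subject_kind in GROUND:
        return True
    if subject_kind in SKIP:
        return False
    # Anything not listed: ground only when accuracy is the priority.
    return accuracy_matters
```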
Cost optimization
The 512px batch-to-upscale workflow cuts costs by 60-80% compared to generating every candidate at full resolution, because you pay full-resolution prices only for the images you keep.
Generate at 512px with NB1
Run your batch at the cheapest resolution. Review thumbnails and discard the ones that miss the mark.
Pick your winners
Select the 2-3 best compositions from your batch. You've spent pennies so far.
Upscale with NB2 or Pro
Regenerate only the winners at full resolution. You pay the higher cost only for images you'll use.
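The arithmetic behind the three steps can be sketched as follows. The unit costs are placeholders chosen for illustration (a draft costs 1 unit, a full-resolution image costs some multiple of that), not published pricing, and `workflow_savings` is a hypothetical helper.

```python
def workflow_savings(
    batch_size: int,
    winners: int,
    draft_cost: float,
    final_cost: float,
) -> float:
    """Fraction saved by drafting cheaply and regenerating only the winners.

    Costs are per image, in any unit you like, as long as both use the same one.
    """
    if batch_size <= 0 or not 0 <= winners <= batch_size:
        raise ValueError("winners must be between 0 and batch_size")
    # Baseline: every candidate generated at full resolution.
    baseline = batch_size * final_cost
    # Staged: every candidate drafted, then only the winners regenerated.
    staged = batch_size * draft_cost + winners * final_cost
    return 1 - staged / baseline
```

With 12 drafts at 1 unit each and 3 winners regenerated at 8 units each, the staged run costs 36 units against 96 for the baseline, a saving of 62.5%. The saving shrinks as you keep more winners or as the price gap between draft and final narrows.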
Parameters reference
Key parameters that control output quality, dimensions, and generation behavior.
| Parameter | Options | Notes |
|---|---|---|
| Resolution | 512, 768, 1024, 2048 | Higher = more detail, more cost, slower. 512 is fine for drafts. |
| Aspect ratio | 1:1, 16:9, 9:16, 4:3, 3:4 | Match your output format. 16:9 for headers, 9:16 for stories, 1:1 for avatars. |
| Thinking mode | On / Off | When on, the model reasons about composition before generating. Better results for complex scenes. |
| Number of images | 1-4 per request | Batch within a single request to reduce overhead. 4 images at 512px cost less than 1 at 2048px. |
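To catch invalid combinations before sending a request, the options in the table can be validated up front. The sketch below is a plain container with checks; the `ImageParams` class and its field names are hypothetical and do not correspond to a real SDK type.

```python
from dataclasses import dataclass

# Allowed values copied from the parameters table above.
RESOLUTIONS = (512, 768, 1024, 2048)
ASPECT_RATIOS = ("1:1", "16:9", "9:16", "4:3", "3:4")


@dataclass(frozen=True)
class ImageParams:
    resolution: int = 512
    aspect_ratio: str = "1:1"
    thinking: bool = False
    num_images: int = 1

    def __post_init__(self) -> None:
        # Reject anything the table does not list.
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"resolution must be one of {RESOLUTIONS}")
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")
        if not 1 <= self.num_images <= 4:
            raise ValueError("num_images must be between 1 and 4")
```

A draft batch would then be `ImageParams(num_images=4)`, and a header image might be `ImageParams(resolution=2048, aspect_ratio="16:9", thinking=True)`.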
Prompt recipes
Tested prompt patterns for common creative tasks. Copy, adapt, and combine.
3D selfie
Turn a flat photo into a stylized 3D character render. Works best with clear face shots and simple backgrounds.
Anime-to-photo
Convert illustrated or anime-style characters into photorealistic portraits. Grounding helps with clothing and hair accuracy.
Historical street view
Recreate a specific street or location as it looked in a past decade. Grounding plus era-specific details in the prompt are key.
Crayon filter
Apply a hand-drawn crayon aesthetic to any scene. Good for editorial illustrations and kid-friendly content.
Comic strip
Generate multi-panel comic layouts with consistent characters. Specify panel count and describe each scene separately in the prompt.
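For the comic strip recipe, a small template helps keep the panel count and the per-panel descriptions in sync. This is a sketch of one possible prompt layout. The wording of the template and the `comic_prompt` helper are assumptions, not a tested pattern from the original guide.

```python
def comic_prompt(character: str, panels: list[str], style: str = "comic strip") -> str:
    """Build a multi-panel prompt with one line per scene."""
    if not panels:
        raise ValueError("at least one panel description is required")
    lines = [
        # State the panel count and the consistency requirement up front.
        f"A {len(panels)}-panel {style} featuring {character}. "
        "Keep the character's appearance identical in every panel."
    ]
    # Describe each scene separately, numbered in reading order.
    for number, scene in enumerate(panels, start=1):
        lines.append(f"Panel {number}: {scene}")
    return "\n".join(lines)
```

Calling `comic_prompt("a reporter with a red scarf", ["She spots a lead.", "She files the story."])` yields a three-line prompt: the header sentence followed by `Panel 1: ...` and `Panel 2: ...`.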
Known limitations
Current constraints to keep in mind when planning image generation workflows.
Text rendering
NB1 struggles with legible text in images. NB2 handles short words and labels. For anything longer than 3-4 words, composite text in post-production.
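The guidance above can be turned into a quick pre-flight check. The sketch assumes a cutoff of 4 words (the upper end of the "3-4 words" range) and treats NB1 as always needing composited text. Both are simplifying assumptions, and `text_strategy` is a hypothetical helper.

```python
def text_strategy(model: str, text: str, max_words: int = 4) -> str:
    """Return "render", "composite", or "none" for text that should appear in an image."""
    word_count = len(text.split())
    if word_count == 0:
        return "none"
    # Assumption: NB1 text is unreliable at any length, so always composite.
    if model == "NB1" or word_count > max_words:
        return "composite"
    # Short labels on NB2 or Gemini Pro can be rendered in the image itself.
    return "render"
```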
Hands and fingers
All models can produce anatomically incorrect hands. Gemini Pro is the most reliable of the three. For important shots, generate multiple variants and pick the best.
Character consistency
Maintaining the same character across multiple images is unreliable. Use reference images and detailed descriptions, but expect variation.
Safety filters
Prompts mentioning public figures, violence, or sensitive topics may be blocked. Rephrase with descriptive language instead of proper nouns when this happens.
Install the skill
1. Clone the repository
git clone https://github.com/jamditis/claude-skills-journalism.git
2. Copy skill to Claude skills directory
cp -r nano-banana-image-gen ~/.claude/skills/
3. Start generating
Ask Claude to generate images with Nano Banana, compare models, or optimize your prompt for visual grounding.
Based on @NanoBanana's guide to Gemini image generation (Mar 2026)
Adapted by Joe Amditis at the Center for Cooperative Media
Part of Claude Skills for Journalism • MIT License