A step-by-step implementation plan for product managers who want to prototype fast using Ollama + Google Gemma 4 locally, with Claude Code as the agentic coding engine. Tailored for Windows WSL2 and Linux machines.
| Model | Parameters | Context | Min VRAM/RAM | Best for PMs | Fit |
|---|---|---|---|---|---|
| gemma4:e2b | 2.3B eff. | 128K | 5 GB | Quick idea drafts, ultra-low-spec laptops | Limited |
| gemma4:e4b | 4.5B eff. | 128K | 8 GB | Most dev laptops; ideal starting point | Recommended |
| gemma4:26b | 26B MoE | 256K | 18 GB | Complex multi-file prototypes, richer reasoning | Best quality |
| gemma4:31b | 31B dense | 256K | 24 GB | Frontier intelligence locally (workstation/DGX) | High-end only |

Start with gemma4:e4b. It fits a standard dev laptop (16 GB RAM), supports 128K context, handles function calling natively, and runs inference under 2 seconds per token on modern hardware. Upgrade to gemma4:26b when you need richer multi-step reasoning.
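The memory minimums in the table can be turned into a quick lookup. The sketch below is a hypothetical helper (`pick_model` is not part of Ollama) that returns the largest listed tag whose minimum fits the memory you have, using only the figures from the table above.

```python
from typing import Optional

# Minimum memory (GB) per model tag, copied from the table above,
# ordered from largest to smallest.
MODEL_MIN_GB = [
    ("gemma4:31b", 24),
    ("gemma4:26b", 18),
    ("gemma4:e4b", 8),
    ("gemma4:e2b", 5),
]


def pick_model(available_gb: float) -> Optional[str]:
    """Return the largest listed model whose minimum fits, else None."""
    for tag, min_gb in MODEL_MIN_GB:
        if available_gb >= min_gb:
            return tag
    return None


print(pick_model(16))  # gemma4:e4b
```

Treat the result as an upper bound: the table lists minimums, so a machine that barely meets a threshold may still feel slow with other apps open.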
Check your Windows version with winver, then open PowerShell as Administrator:

    # Step 1: Install WSL2 + Ubuntu
    wsl --install
    # Reboot when prompted

    # Step 2: Set Ubuntu as default
    wsl --set-default Ubuntu

    # Step 3: Open Ubuntu terminal
    # All remaining steps run INSIDE WSL

Now inside your Ubuntu (WSL) terminal:

    # Install Ollama inside WSL
    curl -fsSL https://ollama.com/install.sh | sh

    # Start the Ollama server
    ollama serve &

    # Verify it's running
    curl http://localhost:11434
    # Should return: Ollama is running

Keep your projects in the WSL Linux filesystem (for example ~/projects/), not under /mnt/c/. This gives up to 20× faster I/O for Claude Code's file operations.
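To check whether a project folder sits on a Windows drive mount, a small path test is enough. This is a minimal sketch: `on_windows_mount` is a hypothetical helper, and it assumes WSL's default layout where Windows drives appear as single-letter folders under /mnt.

```python
from pathlib import PurePosixPath


def on_windows_mount(path: str) -> bool:
    """True if an absolute path sits under a WSL drive mount such as /mnt/c."""
    parts = PurePosixPath(path).parts
    # A Windows drive path looks like ('/', 'mnt', 'c', ...).
    return (
        len(parts) >= 3
        and parts[0] == "/"
        and parts[1] == "mnt"
        and len(parts[2]) == 1
        and parts[2].isalpha()
    )


print(on_windows_mount("/mnt/c/Users/me/pm-prototypes"))  # True
print(on_windows_mount("/home/me/projects"))              # False
```

Pass an absolute path (for example the output of `pwd`); the helper does not expand `~`.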

    # Install Ollama
    curl -fsSL https://ollama.com/install.sh | sh

    # Enable as a system service (auto-start)
    sudo systemctl enable ollama
    sudo systemctl start ollama

    # Check status
    systemctl status ollama

    # Verify API endpoint
    curl http://localhost:11434
    # Should return: Ollama is running

Optional: expose to LAN for team access:

    # Override service environment
    sudo mkdir -p /etc/systemd/system/ollama.service.d
    sudo tee /etc/systemd/system/ollama.service.d/override.conf <<EOF
    [Service]
    Environment="OLLAMA_HOST=0.0.0.0"
    EOF
    sudo systemctl daemon-reload && sudo systemctl restart ollama

    # For 8-16 GB RAM machines (recommended for PMs)
    ollama pull gemma4:e4b

    # For 18+ GB RAM / dedicated GPU machines
    ollama pull gemma4:26b

    # Check download & list installed models
    ollama list

Downloaded models are stored under ~/.ollama/models/.

    ollama run gemma4:e4b
    # You'll get an interactive chat prompt
    # Type: "Write a Python Flask hello-world app"
    # Type: /bye to exit


    # REST API test (OpenAI-compatible format)
    curl http://localhost:11434/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
        "model": "gemma4:e4b",
        "messages": [
          {"role": "user", "content": "Write a 3-step user story for a food delivery app"}
        ]
      }'

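The same REST test can be run from Python with only the standard library. This is a sketch, not an official client: the helper names `build_chat_request` and `ask` are made up for illustration, and the response parsing assumes the usual OpenAI-compatible shape (`choices[0].message.content`).

```python
import json
import urllib.request

# Endpoint and model tag taken from the curl test above.
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"


def build_chat_request(prompt: str, model: str = "gemma4:e4b") -> urllib.request.Request:
    """Build (but do not send) a chat completion request."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, model: str = "gemma4:e4b") -> str:
    """Send the prompt to the local server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, model), timeout=120) as resp:
        body = json.load(resp)
    # Assumed OpenAI-compatible response layout.
    return body["choices"][0]["message"]["content"]
```

With the Ollama server running, `print(ask("Write a 3-step user story for a food delivery app"))` should print the model's answer. Splitting request building from sending keeps the payload easy to inspect without a live server.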

    # Set keep-alive to prevent model unloading
    # (must be set in the environment of the `ollama serve` process,
    #  so export it before starting the server)
    export OLLAMA_KEEP_ALIVE="-1"   # keeps model loaded forever

    # Or add to ~/.bashrc / ~/.zshrc for persistence
    echo 'export OLLAMA_KEEP_ALIVE="-1"' >> ~/.bashrc
    source ~/.bashrc


    # Step 1: Install NVM (Node Version Manager)
    curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.5/install.sh | bash
    source ~/.bashrc

    # Step 2: Install Node.js 20 LTS
    nvm install 20
    nvm use 20
    node --version   # Must show v20.x.x
    which node       # Must show Linux path, NOT /mnt/c/

    # Step 3: No npm prefix change needed
    # nvm installs Node under ~/.nvm, so global npm installs work without sudo.
    # (A custom npm prefix conflicts with nvm.)

    # Step 4: Install Claude Code
    npm install -g @anthropic-ai/claude-code

    # Step 5: Verify install
    claude --version

    # Step 1: Install NVM
    curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.5/install.sh | bash
    source ~/.bashrc

    # Step 2: Install Node.js 20 LTS
    nvm install 20 && nvm use 20
    node --version

    # Step 3: No npm prefix change needed
    # nvm installs Node under ~/.nvm, so global npm installs work without sudo.
    # (A custom npm prefix conflicts with nvm.)

    # Step 4: Install ripgrep (improves code search)
    sudo apt install ripgrep -y

    # Step 5: Install Claude Code
    npm install -g @anthropic-ai/claude-code

    # Step 6: Authenticate
    claude
    # Browser opens; sign in with your Anthropic account

You can also authenticate by setting the ANTHROPIC_API_KEY environment variable. For teams behind corporate proxies, API key auth is more reliable.
Ollama v0.14+ natively supports the Anthropic Messages API format, so no proxy or translation layer is needed. You redirect Claude Code with a few environment variables.

    # Add to ~/.bashrc for permanence
    cat >> ~/.bashrc <<'EOF'
    # Ollama + Claude Code local backend
    export ANTHROPIC_AUTH_TOKEN=ollama
    export ANTHROPIC_API_KEY=""
    export ANTHROPIC_BASE_URL=http://localhost:11434
    export ANTHROPIC_MODEL=gemma4:e4b
    # Alias: switch between local and cloud
    alias claude-local='ANTHROPIC_AUTH_TOKEN=ollama ANTHROPIC_BASE_URL=http://localhost:11434 ANTHROPIC_MODEL=gemma4:e4b claude'
    alias claude-cloud='unset ANTHROPIC_AUTH_TOKEN ANTHROPIC_BASE_URL ANTHROPIC_MODEL && claude'
    EOF
    source ~/.bashrc

    # Same config: add to ~/.bashrc or ~/.zshrc
    cat >> ~/.bashrc <<'EOF'
    export ANTHROPIC_AUTH_TOKEN=ollama
    export ANTHROPIC_API_KEY=""
    export ANTHROPIC_BASE_URL=http://localhost:11434
    export ANTHROPIC_MODEL=gemma4:e4b
    alias claude-local='ANTHROPIC_AUTH_TOKEN=ollama ANTHROPIC_BASE_URL=http://localhost:11434 ANTHROPIC_MODEL=gemma4:e4b claude'
    alias claude-cloud='unset ANTHROPIC_AUTH_TOKEN ANTHROPIC_BASE_URL ANTHROPIC_MODEL && claude'
    EOF
    source ~/.bashrc

    # Navigate to a project folder
    mkdir ~/pm-prototypes && cd ~/pm-prototypes

    # Launch Claude Code; it will use Gemma 4 via Ollama
    claude

    # Inside Claude Code, type your first PM prompt:
    # "Create a landing page HTML for a SaaS invoicing tool.
    #  Include a hero, 3 pricing tiers, and a CTA form."

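
If you launch Claude Code from a script rather than a shell alias, the same local/cloud switch can be expressed as an environment dictionary. This is a sketch that mirrors the aliases above; `claude_env` is a hypothetical helper, and the variable names and values are the ones from the shell config.

```python
import os
from typing import Dict, Optional

# Same variables and values as the shell exports above.
LOCAL_OVERRIDES = {
    "ANTHROPIC_AUTH_TOKEN": "ollama",
    "ANTHROPIC_API_KEY": "",
    "ANTHROPIC_BASE_URL": "http://localhost:11434",
    "ANTHROPIC_MODEL": "gemma4:e4b",
}


def claude_env(mode: str, base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment set up for 'local' or 'cloud' mode."""
    env = dict(os.environ if base is None else base)
    if mode == "local":
        env.update(LOCAL_OVERRIDES)
    elif mode == "cloud":
        # Mirrors the claude-cloud alias: drop the three redirect variables.
        for key in ("ANTHROPIC_AUTH_TOKEN", "ANTHROPIC_BASE_URL", "ANTHROPIC_MODEL"):
            env.pop(key, None)
    else:
        raise ValueError("mode must be 'local' or 'cloud'")
    return env


# Usage (not run here): subprocess.run(["claude"], env=claude_env("local"))
```

Building a fresh dictionary each time avoids the shell pitfall where `unset` in one alias silently changes the rest of the session.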
Ask Gemma 4 to help you structure your idea into a testable hypothesis. Open Claude Code and run:

    claude
    > "I want to validate this idea: [YOUR IDEA].
       Help me write a lean hypothesis: Problem, proposed solution,
       target user, success metric, and the riskiest assumption.
       Format it as a YAML file."

Claude Code will create a hypothesis.yaml file in your project folder. This becomes your north star for the prototype's scope.
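Before building on the file, it helps to confirm that all five parts of the hypothesis are filled in. The sketch below is illustrative only: the field names are assumptions (the model may choose different keys), and `missing_fields` and `to_yaml` are hypothetical helpers. It relies on the fact that JSON-quoted strings are valid YAML scalars, so no YAML library is needed.

```python
import json

# Assumed key names for the five parts listed in the prompt.
HYPOTHESIS_FIELDS = (
    "problem",
    "proposed_solution",
    "target_user",
    "success_metric",
    "riskiest_assumption",
)


def missing_fields(hypothesis: dict) -> list:
    """Return the required fields that are absent or blank."""
    return [f for f in HYPOTHESIS_FIELDS if not str(hypothesis.get(f) or "").strip()]


def to_yaml(hypothesis: dict) -> str:
    """Render the five fields as flat YAML with JSON-quoted values."""
    lines = [f"{f}: {json.dumps(hypothesis[f])}" for f in HYPOTHESIS_FIELDS]
    return "\n".join(lines) + "\n"
```

A hypothesis with an empty `riskiest_assumption` is the one to send back for another pass, since that field decides what the prototype has to test.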
Tell Claude Code exactly what to build. Be specific about the tech stack that's easiest to demo quickly:

    # Example: no-database, HTML/JS prototype
    claude
    > "Based on hypothesis.yaml, build a single-page HTML prototype.
       No backend needed; use localStorage for data.
       Include: [key flows from hypothesis].
       Make it look polished enough to test with real users."

Claude Code reads your hypothesis file, writes all the HTML/CSS/JS, and confirms each file creation step.
Claude Code keeps the full project in context. You can iterate naturally:

    # In the same Claude Code session
    > "The pricing page feels cluttered. Move the feature list
       to a toggle/accordion. Keep the CTA above the fold."

    > "Add a fake 'loading' spinner when the user submits the
       sign-up form, then show a success state after 1.5s."

    > "Generate 5 realistic dummy user profiles and pre-populate
       the dashboard with their data."

> "Based on this prototype, write:
1. A 5-question usability test script
2. A discussion guide for a 30-min discovery interview
3. A RICE prioritization table for the next 3 features
Save each as a separate markdown file."
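
For reference, a RICE score is Reach × Impact × Confidence ÷ Effort, so you can sanity-check the table the model produces. The sketch below uses made-up features and numbers purely for illustration; the `Feature` class and `rank` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Feature:
    name: str
    reach: float       # users affected per period
    impact: float      # e.g. 0.25 (minimal) up to 3 (massive)
    confidence: float  # 0.0 to 1.0
    effort: float      # person-months

    @property
    def rice(self) -> float:
        return self.reach * self.impact * self.confidence / self.effort


def rank(features: List[Feature]) -> List[Feature]:
    """Sort features by RICE score, highest first."""
    return sorted(features, key=lambda f: f.rice, reverse=True)


# Illustrative numbers only, not real data.
backlog = [
    Feature("Team billing", reach=200, impact=3, confidence=0.5, effort=3),
    Feature("Saved invoices", reach=500, impact=2, confidence=0.8, effort=2),
    Feature("Dark mode", reach=300, impact=0.5, confidence=1.0, effort=1),
]
for f in rank(backlog):
    print(f"{f.name}: {f.rice:.0f}")
```

Here "Saved invoices" scores 400, "Dark mode" 150, and "Team billing" 100, which shows how a low-effort feature can outrank a high-impact one.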
When you need maximum quality for external deliverables, flip to the Anthropic cloud model:

    # Switch to full Claude (cloud)
    claude-cloud
    > "Based on hypothesis.yaml and this prototype, write a
       one-pager executive summary and a 5-slide pitch deck
       outline for a Series A investor meeting."

    # When done, switch back to local (free, private)
    claude-local

    # Inside WSL terminal
    cd ~/pm-prototypes
    python3 -m venv venv && source venv/bin/activate
    pip install flask requests

    # Ask Claude Code to scaffold the Flask app
    claude
    > "Create a Flask API with SQLite for [feature].
       Include endpoints for [user actions].
       Add a simple HTML frontend that calls these endpoints."

    # Same commands, Linux terminal
    cd ~/pm-prototypes
    python3 -m venv venv && source venv/bin/activate
    pip install flask requests

    # Ask Claude Code to scaffold the Flask app
    claude
    > "Create a minimal Flask REST API with SQLite.
       Implement CRUD for [resource].
       Add CORS headers for the frontend to call it."

    # Create a feedback analysis script
    claude
    > "I have a CSV of 47 user interview responses in feedback.csv.
       Write a Python script that:
       1. Reads the CSV
       2. Sends each row to Ollama (gemma4:e4b) for sentiment + theme tagging
       3. Aggregates the top 5 themes and outputs a summary markdown report"

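
For orientation, here is roughly what such a script might look like. It is a sketch under several assumptions: the CSV has a column named `response`, the model replies with bare JSON, and the server exposes the OpenAI-compatible endpoint used earlier. All function names are hypothetical, and the script Claude Code generates will likely differ.

```python
import csv
import json
import urllib.request
from collections import Counter

OLLAMA_URL = "http://localhost:11434/v1/chat/completions"
MODEL = "gemma4:e4b"


def read_responses(path, column="response"):
    """Step 1: read one text column from the CSV (column name is an assumption)."""
    with open(path, newline="", encoding="utf-8") as f:
        return [row[column] for row in csv.DictReader(f)]


def tag_with_ollama(text):
    """Step 2: ask the local model for a sentiment and a list of themes."""
    prompt = (
        'Return only JSON like {"sentiment": "positive", "themes": ["pricing"]} '
        "for this user feedback:\n" + text
    )
    payload = {"model": MODEL, "messages": [{"role": "user", "content": prompt}]}
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        reply = json.load(resp)["choices"][0]["message"]["content"]
    return json.loads(reply)  # assumes the model returned bare JSON


def summarize(rows, tagger=tag_with_ollama, top_n=5):
    """Step 3: aggregate the tags into a markdown report."""
    themes, sentiments = Counter(), Counter()
    for text in rows:
        tags = tagger(text)
        sentiments[tags["sentiment"]] += 1
        themes.update(t.strip().lower() for t in tags["themes"])
    lines = ["# Feedback summary", "", f"Responses analyzed: {len(rows)}", "", "## Top themes"]
    lines += [f"- {theme}: {count}" for theme, count in themes.most_common(top_n)]
    lines += ["", "## Sentiment"]
    lines += [f"- {label}: {count}" for label, count in sentiments.most_common()]
    return "\n".join(lines)
```

To run it end to end, call `summarize(read_responses("feedback.csv"))` and write the result to a markdown file. The `tagger` parameter lets you swap in a stub to check the aggregation logic without calling the model; in practice you would also want to handle replies that are not valid JSON.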
Use Claude Code + Gemma 4 to build a static HTML prototype (no backend). Share via GitHub Pages or a local share tool like npx serve .. Get 5 users to click through it. Your goal: does the value proposition land?
In the same Claude Code session, describe the feedback you got and ask it to modify the prototype. Add realistic data, smooth rough edges, test the highest-risk assumption from your hypothesis.
If you've validated enough signal, use Claude Code to build a minimal Python/Node backend for the one feature that makes or breaks the idea. Keep everything else as prototype UI.

    claude-cloud   # Switch to full Claude for best quality
    > "Given hypothesis.yaml, the prototype code, and these
       user interview notes [paste notes], write a one-page
       go/no-go recommendation document.
       Include: evidence summary, key risks, proposed next steps
       if GO, pivot options if NO-GO."

| Task | Use local Gemma 4 | Use cloud Claude |
|---|---|---|
| Generating prototype code | Local ✓ | |
| Iterating on UI/UX | Local ✓ | |
| Analyzing user feedback CSVs | Local ✓ | |
| Writing user stories & PRDs | Local ✓ | |
| Investor / exec pitch decks | | Cloud ✓ |
| Complex architectural decisions | | Cloud ✓ |
| Processing sensitive user PII | Local only | |
| Multi-file codebase refactors | | Cloud preferred |
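
The table above amounts to a simple routing rule, with PII overriding everything else. The sketch below encodes it as a lookup; the short task labels and the `choose_backend` function are hypothetical, and defaulting unknown tasks to local is an assumption rather than something the table states.

```python
# Routing rules copied from the table above; labels are shortened for lookup.
ROUTING = {
    "prototype code": "local",
    "ui iteration": "local",
    "feedback analysis": "local",
    "user stories": "local",
    "pitch deck": "cloud",
    "architecture": "cloud",
    "multi-file refactor": "cloud",
}


def choose_backend(task: str, has_pii: bool = False) -> str:
    """Return 'local' or 'cloud' for a task; PII always stays local."""
    if has_pii:
        return "local"
    # Unknown tasks default to local (an assumption, chosen as the private option).
    return ROUTING.get(task, "local")


print(choose_backend("pitch deck"))                # cloud
print(choose_backend("pitch deck", has_pii=True))  # local
```

The result maps directly onto the aliases defined earlier: `local` means `claude-local`, `cloud` means `claude-cloud`.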