What you need before starting
On your computer
- Flutter SDK 3.24+ — install from flutter.dev
- An IDE — VS Code (with Flutter extension) or Android Studio
- Git — for cloning and pushing later
- ~10GB of free disk space — Flutter + Android SDK + model = not small
- Stable WiFi — you'll download a 3 GB AI model
For Android
- An Android phone with 6 GB RAM minimum (8 GB recommended)
- Android 8.0+ (API level 26 or higher)
- USB debugging enabled (Settings → About → tap Build Number 7 times → Developer Options → USB Debugging)
Gemma 3n E2B takes about 3 GB of RAM. On a 6 GB phone, you can't use the live camera while the model is loaded — Android's memory manager will kill the app. We work around this with the gallery picker. On phones with 8 GB or more, the live camera works fine.
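You can check how much RAM a connected phone actually has before committing to the download. A minimal sketch, assuming adb is on your PATH and the phone is connected with USB debugging on; Linux reports MemTotal in kB in /proc/meminfo:

```shell
# Read total RAM from the connected Android device
# (MemTotal is in kB; 1048576 kB = 1 GiB).
adb shell cat /proc/meminfo | awk '/MemTotal/ { printf "Total RAM: %.1f GiB\n", $2 / 1048576 }'
```

A reading around 5.5–5.8 GiB is normal for a phone marketed as 6 GB — part of the RAM is reserved by the system.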
For iOS (optional but recommended)
- A Mac with Xcode 15+
- An iPhone running iOS 16+
- An Apple ID (free) — no paid developer account needed
Background knowledge (nice-to-have)
- Basic Dart / Flutter — write a StatefulWidget, understand async/await
- Comfort with running Terminal commands
Check your Flutter install
Open a terminal and run:
flutter --version
You should see something like this (your version numbers may differ):
Flutter 3.38.10 • channel stable
Framework • revision c6f67dede3
Engine • hash 3c25ef829c74f0f39fbb8df093d9a6b9f941ea6b
Tools • Dart 3.10.9 • DevTools 2.51.1
What version do you need?
Flutter 3.24 or newer works with flutter_gemma 0.13.2. If you're on 3.22 or older, upgrade:
flutter upgrade
If something goes wrong
"flutter: command not found"
Flutter isn't on your PATH. Follow the install guide.
"channel [user-branch]" or "unknown source"
Not a problem — it just means Flutter was installed via Homebrew or a direct download instead of a git clone.
Can't upgrade / company-managed Flutter
Use FVM for a per-project Flutter install: brew tap leoafarias/fvm && brew install fvm, then fvm install stable in your project.
Create the Flutter project
Pick a folder for your work, then create the project:
cd ~/Desktop
flutter create gemma_vision_demo
cd gemma_vision_demo
code .
The last line opens the project in VS Code. Use studio . for Android Studio.
Verify the default app runs
Connect your Android phone via USB. Accept "trust this computer" prompts. Then:
flutter devices
flutter run
The default counter app launches. Tap "+" to confirm the app responds. Press q in the terminal to stop.
If something goes wrong
"No devices found"
Verify: data cable (not just charging), USB debugging on, "trust this computer" accepted. Run adb devices to check.
Android license errors
Run flutter doctor --android-licenses and accept all.
Build takes forever first time
Normal. Gradle is downloading dependencies. Subsequent builds are fast.
Add flutter_gemma and supporting packages
Open pubspec.yaml. Replace the dependencies: section:
dependencies:
flutter:
sdk: flutter
cupertino_icons: ^1.0.8
flutter_gemma: ^0.13.2
image_picker: ^1.1.2
path_provider: ^2.1.5
flutter_dotenv: ^5.2.1
At the bottom in the flutter: section, register .env as an asset:
flutter:
uses-material-design: true
assets:
- .env
Install:
flutter pub get
flutter_gemma — Dart wrapper around MediaPipe's native inference engine.
image_picker — gallery selection for multimodal input (Step 12).
path_provider — access device file paths where the model gets stored.
flutter_dotenv — load HuggingFace token from .env at runtime.
If something goes wrong
"Version solving failed"
Run flutter pub outdated. Usually fixed by upgrading Flutter to latest stable.
Get your HuggingFace token
4a. Create a HuggingFace account
huggingface.co/join — free tier is enough.
4b. Request model access
Visit google/gemma-3n-E2B-it-litert-preview. Click Agree and access repository. Usually approved instantly.
4c. Generate an access token
Go to huggingface.co/settings/tokens. Click Create new token:
- Type: Read
- Name:
flutter-gemma-demo
Copy the token — it starts with hf_.... You won't be able to see it again after leaving the page.
4d. Create .env in project root
Same level as pubspec.yaml:
HUGGINGFACE_TOKEN=hf_paste_your_actual_token_here
No quotes, no spaces around =.
4e. Add .env to .gitignore
echo ".env" >> .gitignore
Gemma is Apache 2.0 licensed, but Google ships weights through a gated HuggingFace repo to enforce license acceptance. It's a compliance checkbox — once you accept, you're free to use commercially.
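Before building, you can sanity-check that the token actually works. A minimal sketch, assuming the standard HuggingFace whoami-v2 endpoint (which returns 200 for a valid token) and the .env format from step 4d:

```shell
# Extract the token from .env and probe the HuggingFace API.
TOKEN=$(grep '^HUGGINGFACE_TOKEN=' .env | cut -d= -f2)
curl -s -o /dev/null -w "%{http_code}\n" \
  -H "Authorization: Bearer $TOKEN" \
  https://huggingface.co/api/whoami-v2
# 200 = token valid; 401 = wrong or revoked token
```

Note that a 200 here confirms the token, not the license — the gated-repo acceptance from step 4b is checked separately when you download the model.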
If something goes wrong
"Access denied" on the model page
Hit refresh. HuggingFace sometimes takes a minute. Check email for a confirmation link.
Lost the token
Generate a new one. Revoke old one while you're there.
Configure Android
5a. minSdk + Gradle settings
Open android/app/build.gradle.kts, find defaultConfig:
defaultConfig {
applicationId = "com.example.gemma_vision_demo"
minSdk = 26 // Required by flutter_gemma
targetSdk = flutter.targetSdkVersion
versionCode = flutter.versionCode
versionName = flutter.versionName
}
5b. Gradle memory
Replace android/gradle.properties:
org.gradle.jvmargs=-Xmx4G -XX:MaxMetaspaceSize=2G -XX:+HeapDumpOnOutOfMemoryError -Dfile.encoding=UTF-8
org.gradle.parallel=true
org.gradle.caching=true
android.useAndroidX=true
android.enableJetifier=true
5c. Permissions
In android/app/src/main/AndroidManifest.xml, inside <manifest>, above <application>:
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.CAMERA" />
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
<uses-permission android:name="android.permission.READ_MEDIA_IMAGES" />
<uses-permission android:name="android.permission.FOREGROUND_SERVICE" />
<uses-permission android:name="android.permission.FOREGROUND_SERVICE_DATA_SYNC" />
<uses-feature android:name="android.hardware.camera" android:required="false" />
<uses-feature android:name="android.hardware.opengles.aep" android:required="false" />
5d. ProGuard rules (for release builds)
Create android/app/proguard-rules.pro:
# MediaPipe (used by flutter_gemma)
-keep class com.google.mediapipe.** { *; }
-dontwarn com.google.mediapipe.**
# Protocol Buffers
-keep class com.google.protobuf.** { *; }
-dontwarn com.google.protobuf.**
# localagents (Gemma-specific)
-keep class com.google.ai.edge.localagents.** { *; }
-dontwarn com.google.ai.edge.localagents.**
5e. Wire ProGuard into release
Back in android/app/build.gradle.kts, replace the buildTypes block:
buildTypes {
release {
signingConfig = signingConfigs.getByName("debug")
isMinifyEnabled = true
isShrinkResources = true
proguardFiles(
getDefaultProguardFile("proguard-android-optimize.txt"),
"proguard-rules.pro"
)
}
}
R8 (Android's code shrinker) removes unused code in release builds. MediaPipe uses Java reflection — R8 can't see those references and removes them. Result: ClassNotFoundException at runtime. The rules tell R8 "keep these, no matter what."
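Because keep-rule problems only surface at runtime, it's worth smoke-testing a release build once the rules are in place. A sketch assuming the default Flutter APK output path:

```shell
# Build the shrunken release APK and install it on the connected phone.
flutter build apk --release
APK=build/app/outputs/flutter-apk/app-release.apk
ls -lh "$APK"          # confirm the APK was produced
adb install -r "$APK"  # -r replaces any previously installed build
```

Launch the app and load the model once — a missing keep rule shows up as a ClassNotFoundException on first inference, not at install time.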
If something goes wrong
"Missing classes" during release build
R8 found another class needs keeping. Open the referenced missing_rules.txt, copy new keep rules into proguard-rules.pro, rebuild.
Gradle daemon crashes / OOM
Close other apps. Try reducing -Xmx4G to -Xmx3G. Counter-intuitive but sometimes less heap is more stable.
Configure iOS
Skip if no Mac/iPhone. Android works fine alone.
6a. Update Podfile
Top of ios/Podfile:
platform :ios, '16.0' # (keep everything else as-is)
6b. Permissions in Info.plist
In ios/Runner/Info.plist, inside the top-level <dict>:
<key>NSCameraUsageDescription</key>
<string>We use the camera for multimodal AI input.</string>
<key>NSPhotoLibraryUsageDescription</key>
<string>We use your photos for multimodal AI input.</string>
<key>NSAppTransportSecurity</key>
<dict>
<key>NSAllowsArbitraryLoads</key>
<true/>
</dict>
6c. Install CocoaPods
cd ios
pod install --repo-update
cd ..
The first run takes 2–5 minutes — it downloads the MediaPipe native libraries.
6d. Xcode signing
open ios/Runner.xcworkspace
In Xcode:
- Click Runner project (top of left sidebar)
- Select Runner target
- Go to Signing & Capabilities
- Check "Automatically manage signing"
- Pick your Personal Team from Team dropdown
- Change Bundle Identifier from com.example.gemmaVisionDemo to something unique like com.yourname.gemmaVisionDemo
6e. Register your device
If you see "Your team has no devices":
- Plug iPhone into Mac via USB
- Tap Trust on iPhone
- Enable Developer Mode: Settings → Privacy & Security → Developer Mode → On (restart)
- Click Try Again in Xcode
When signing is configured correctly, no warnings appear in Signing & Capabilities.
If something goes wrong
"Unable to log in with account [email]"
Stale account from a previous user. Xcode → Settings (⌘,) → Accounts → select wrong account → click "–" → re-add yours with "+".
"No profiles for 'com.example.*' were found"
Apple won't issue free profiles for com.example. Use any other prefix.
"Personal teams do not support Push Notifications"
Ignore.
Verify the setup
Connect Android, then:
flutter devices
flutter run
If prompted for device, pick Android. Default counter app should launch. Press q to stop.
Disconnect Android, connect iPhone, repeat flutter run. Pick iPhone.
- If iOS shows "Untrusted Developer" alert, go to Settings → General → VPN & Device Management → [your Apple ID] → Trust. Then launch the app.
Default counter app runs on both devices. Environment is ready for the actual build.
Wire up main.dart
Replace the entire contents of lib/main.dart:
import 'package:flutter/material.dart';
import 'package:flutter_dotenv/flutter_dotenv.dart';
import 'package:flutter_gemma/core/api/flutter_gemma.dart';
import 'package:gemma_vision_demo/screens/home_screen.dart';
Future<void> main() async {
WidgetsFlutterBinding.ensureInitialized();
// Load env vars from .env
await dotenv.load(fileName: ".env");
// Initialize flutter_gemma with HF token
FlutterGemma.initialize(
huggingFaceToken: dotenv.env['HUGGINGFACE_TOKEN'],
maxDownloadRetries: 20,
);
runApp(const GemmaDemoApp());
}
class GemmaDemoApp extends StatelessWidget {
const GemmaDemoApp({super.key});
@override
Widget build(BuildContext context) {
return MaterialApp(
title: 'Gemma Vision Demo',
debugShowCheckedModeBanner: false,
theme: ThemeData(
colorScheme: ColorScheme.fromSeed(
seedColor: const Color(0xFFFF5722),
brightness: Brightness.dark,
),
useMaterial3: true,
),
home: const HomeScreen(),
);
}
}
The import path on line 4 uses your project's package name. If you named your project differently, update it (it's whatever's in pubspec.yaml's name: field).
WidgetsFlutterBinding.ensureInitialized() — required because we're doing async work before runApp.
dotenv.load() — reads .env at startup so dotenv.env['KEY'] works anywhere.
FlutterGemma.initialize() — sets up the global Gemma instance with your HF token. maxDownloadRetries: 20 is generous because the model is 3GB and WiFi can wobble.
debugShowCheckedModeBanner: false — hides the red "DEBUG" banner. Cleaner for screenshots and recordings.
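A common startup crash is dotenv.load() throwing because .env is missing or not registered as an asset. A quick pre-flight check from the project root (the grep pattern assumes the assets entry from step 3):

```shell
# Verify the .env file exists and is registered under assets in pubspec.yaml.
test -f .env && echo ".env exists"
grep -n -- '- \.env' pubspec.yaml && echo "asset registered"
```

If the second command prints nothing, re-check the assets: block in pubspec.yaml and run flutter pub get again.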
Build the home screen
Create folder lib/screens/. Inside it, create home_screen.dart:
import 'package:flutter/material.dart';
import 'package:gemma_vision_demo/screens/model_download_screen.dart';
class HomeScreen extends StatelessWidget {
const HomeScreen({super.key});
@override
Widget build(BuildContext context) {
return Scaffold(
backgroundColor: const Color(0xFF0A0A0F),
body: SafeArea(
child: Padding(
padding: const EdgeInsets.all(32),
child: Column(
crossAxisAlignment: CrossAxisAlignment.start,
children: [
const Spacer(),
const Text('No Server.',
style: TextStyle(
fontSize: 48, fontWeight: FontWeight.w900,
color: Colors.white, height: 1.1,
)),
const Text('No Bills.',
style: TextStyle(
fontSize: 48, fontWeight: FontWeight.w900,
color: Colors.white, height: 1.1,
)),
const Text('No Internet.',
style: TextStyle(
fontSize: 48, fontWeight: FontWeight.w900,
color: Colors.white, height: 1.1,
)),
const Text('No Problem.',
style: TextStyle(
fontSize: 48, fontWeight: FontWeight.w900,
color: Color(0xFFFF5722),
fontStyle: FontStyle.italic, height: 1.1,
)),
const SizedBox(height: 32),
const Text('On-device AI with Flutter & Gemma',
style: TextStyle(
fontSize: 18, color: Colors.white70, height: 1.4,
)),
const Spacer(),
SizedBox(
width: double.infinity,
child: FilledButton(
onPressed: () {
Navigator.push(
context,
MaterialPageRoute(
builder: (_) => const ModelDownloadScreen(),
),
);
},
style: FilledButton.styleFrom(
backgroundColor: const Color(0xFFFF5722),
padding: const EdgeInsets.symmetric(vertical: 20),
shape: RoundedRectangleBorder(
borderRadius: BorderRadius.circular(4),
),
),
child: const Text('Get Started →',
style: TextStyle(
fontSize: 18, fontWeight: FontWeight.w600,
)),
),
),
],
),
),
),
);
}
}
This won't compile yet — it imports model_download_screen.dart which we haven't created. That's next.
Model download screen
Create lib/screens/model_download_screen.dart:
import 'package:flutter/material.dart';
import 'package:flutter_gemma/core/api/flutter_gemma.dart';
import 'package:flutter_gemma/core/model.dart';
import 'package:gemma_vision_demo/screens/chat_screen.dart';
class ModelDownloadScreen extends StatefulWidget {
const ModelDownloadScreen({super.key});
@override
State<ModelDownloadScreen> createState() => _ModelDownloadScreenState();
}
class _ModelDownloadScreenState extends State<ModelDownloadScreen> {
// Gemma 3n E2B — multimodal, ~3.1 GB, gated repo
static const String _modelUrl =
'https://huggingface.co/google/gemma-3n-E2B-it-litert-preview/resolve/main/gemma-3n-E2B-it-int4.task';
static const String _modelName = 'gemma-3n-E2B-it-int4.task';
bool _isChecking = true;
bool _isInstalled = false;
bool _isDownloading = false;
int _downloadProgress = 0;
String? _errorMessage;
@override
void initState() {
super.initState();
_checkIfModelExists();
}
Future<void> _checkIfModelExists() async {
try {
final installed = await FlutterGemma.isModelInstalled(_modelName);
setState(() {
_isInstalled = installed;
_isChecking = false;
});
} catch (e) {
setState(() {
_isChecking = false;
_errorMessage = 'Error checking model: $e';
});
}
}
Future<void> _downloadModel() async {
setState(() {
_isDownloading = true;
_downloadProgress = 0;
_errorMessage = null;
});
try {
await FlutterGemma.installModel(modelType: ModelType.gemmaIt)
.fromNetwork(_modelUrl, foreground: true)
.withProgress((progress) {
if (mounted) setState(() => _downloadProgress = progress);
}).install();
setState(() {
_isDownloading = false;
_isInstalled = true;
});
} catch (e) {
setState(() {
_isDownloading = false;
_errorMessage = 'Download failed: $e\n\nTip: Move closer to your Wi-Fi router.';
});
}
}
void _goToChat() {
Navigator.push(
context,
MaterialPageRoute(builder: (_) => const ChatScreen()),
);
}
@override
Widget build(BuildContext context) {
return Scaffold(
backgroundColor: const Color(0xFF0A0A0F),
appBar: AppBar(
backgroundColor: Colors.transparent,
elevation: 0,
iconTheme: const IconThemeData(color: Colors.white),
),
body: SafeArea(
child: Padding(
padding: const EdgeInsets.all(32),
child: Column(
crossAxisAlignment: CrossAxisAlignment.start,
children: [
const Text('Gemma 3 Nano E2B',
style: TextStyle(
fontSize: 32, fontWeight: FontWeight.w900,
color: Colors.white, height: 1.1,
)),
const SizedBox(height: 8),
const Text('Multimodal · 2B params · 3.1 GB',
style: TextStyle(
color: Color(0xFFFF5722), fontSize: 14,
letterSpacing: 2,
)),
const SizedBox(height: 40),
if (_isChecking)
const Center(child: CircularProgressIndicator(color: Color(0xFFFF5722)))
else if (_errorMessage != null)
_buildErrorView()
else if (_isInstalled)
_buildReadyView()
else if (_isDownloading)
_buildDownloadingView()
else
_buildPromptView(),
const Spacer(),
_buildInfoCard(),
],
),
),
),
);
}
Widget _buildPromptView() {
return Column(
crossAxisAlignment: CrossAxisAlignment.start,
children: [
const Text('Download the model once. Run it forever.',
style: TextStyle(color: Colors.white, fontSize: 16, height: 1.5)),
const SizedBox(height: 24),
SizedBox(
width: double.infinity,
child: FilledButton(
onPressed: _downloadModel,
style: FilledButton.styleFrom(
backgroundColor: const Color(0xFFFF5722),
padding: const EdgeInsets.symmetric(vertical: 20),
shape: RoundedRectangleBorder(
borderRadius: BorderRadius.circular(4),
),
),
child: const Text('Download Model',
style: TextStyle(fontSize: 16, fontWeight: FontWeight.w600)),
),
),
],
);
}
Widget _buildDownloadingView() {
return Column(
crossAxisAlignment: CrossAxisAlignment.start,
children: [
const Text('Downloading model...',
style: TextStyle(color: Colors.white, fontSize: 16)),
const SizedBox(height: 24),
ClipRRect(
borderRadius: BorderRadius.circular(4),
child: LinearProgressIndicator(
value: _downloadProgress / 100,
backgroundColor: Colors.white.withOpacity(0.1),
valueColor: const AlwaysStoppedAnimation<Color>(Color(0xFFFF5722)),
minHeight: 8,
),
),
const SizedBox(height: 12),
Text('$_downloadProgress%',
style: const TextStyle(
color: Color(0xFFFF5722),
fontSize: 24, fontWeight: FontWeight.w900,
)),
const SizedBox(height: 8),
const Text('One-time download. After this, inference is free.',
style: TextStyle(color: Colors.white54, fontSize: 12)),
],
);
}
Widget _buildReadyView() {
return Column(
crossAxisAlignment: CrossAxisAlignment.start,
children: [
Row(
children: const [
Icon(Icons.check_circle, color: Color(0xFFFF5722), size: 32),
SizedBox(width: 12),
Text('Model ready on-device',
style: TextStyle(
color: Colors.white, fontSize: 18,
fontWeight: FontWeight.w600,
)),
],
),
const SizedBox(height: 24),
SizedBox(
width: double.infinity,
child: FilledButton(
onPressed: _goToChat,
style: FilledButton.styleFrom(
backgroundColor: const Color(0xFFFF5722),
padding: const EdgeInsets.symmetric(vertical: 20),
shape: RoundedRectangleBorder(
borderRadius: BorderRadius.circular(4),
),
),
child: const Text('Start Chat →',
style: TextStyle(fontSize: 16, fontWeight: FontWeight.w600)),
),
),
],
);
}
Widget _buildErrorView() {
return Container(
padding: const EdgeInsets.all(16),
decoration: BoxDecoration(
border: Border.all(color: Colors.red.withOpacity(0.5)),
borderRadius: BorderRadius.circular(4),
),
child: Column(
crossAxisAlignment: CrossAxisAlignment.start,
children: [
const Text('Error',
style: TextStyle(
color: Colors.red, fontWeight: FontWeight.bold, fontSize: 14,
)),
const SizedBox(height: 8),
Text(_errorMessage ?? 'Unknown error',
style: const TextStyle(color: Colors.white70, fontSize: 12)),
const SizedBox(height: 16),
TextButton(
onPressed: () {
setState(() => _errorMessage = null);
_downloadModel();
},
child: const Text('Retry',
style: TextStyle(color: Color(0xFFFF5722))),
),
],
),
);
}
Widget _buildInfoCard() {
return Container(
padding: const EdgeInsets.all(16),
decoration: BoxDecoration(
color: Colors.white.withOpacity(0.04),
borderRadius: BorderRadius.circular(4),
),
child: const Column(
crossAxisAlignment: CrossAxisAlignment.start,
children: [
Text('ABOUT THIS MODEL',
style: TextStyle(
color: Colors.white38, fontSize: 10,
letterSpacing: 2, fontWeight: FontWeight.w600,
)),
SizedBox(height: 8),
Text(
"Gemma 3 Nano E2B — Google's on-device multimodal AI. "
'Understands text and images. Runs entirely on your phone. '
'No data leaves this device.',
style: TextStyle(color: Colors.white70, fontSize: 12, height: 1.5),
),
],
),
);
}
}
Why foreground: true? It's the magic flag that makes a 3 GB download actually complete on a phone. Without it, Android can suspend or kill your app mid-download to free memory. With it, your app runs as a foreground service with a persistent notification — Android won't kill it. You'll see "Downloading…" in your notification tray during the download.
If something goes wrong
Download fails immediately with 401 / 403
Your HF token is wrong, expired, or you didn't accept the Gemma license. Re-check Step 4.
Download stuck at 0%
Network/firewall issue. Try a different WiFi or mobile hotspot.
"TaskConnectionException: Task timed out"
Your WiFi dropped out. The library auto-retries. To help it along, keep the phone close to the router with a strong signal — not behind walls.
Download fails partway through (e.g. 60%)
HuggingFace doesn't support resumable downloads, so it restarts from 0% on retry. Frustrating but expected. Try once more on stable WiFi.
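A download that fails partway can also mean the phone ran out of space — the model needs roughly 3.1 GB free, plus headroom while the file is finalized. A sketch assuming adb is available:

```shell
# Check free space on the data partition of the connected phone.
adb shell df -h /data | tail -1
# The "Avail" column should comfortably exceed 4G before retrying.
```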
Chat screen with streaming
Create lib/screens/chat_screen.dart with this initial text-only version (we add image support in Step 12):
import 'package:flutter/material.dart';
import 'package:flutter_gemma/flutter_gemma.dart';
import 'package:flutter_gemma/core/api/flutter_gemma.dart';
class ChatScreen extends StatefulWidget {
const ChatScreen({super.key});
@override
State<ChatScreen> createState() => _ChatScreenState();
}
class _ChatScreenState extends State<ChatScreen> {
final TextEditingController _controller = TextEditingController();
final ScrollController _scrollController = ScrollController();
InferenceModel? _model;
InferenceChat? _chat;
final List<_ChatMessage> _messages = [];
bool _isInitializing = true;
bool _isGenerating = false;
String? _currentResponse;
String? _errorMessage;
@override
void initState() {
super.initState();
_initializeModel();
}
Future<void> _initializeModel() async {
try {
// Re-call install() — if the model file already exists,
// this just registers it as active. Cheap and safe.
await FlutterGemma.installModel(modelType: ModelType.gemmaIt)
.fromNetwork(
'https://huggingface.co/google/gemma-3n-E2B-it-litert-preview/resolve/main/gemma-3n-E2B-it-int4.task',
)
.install();
final model = await FlutterGemma.getActiveModel(
maxTokens: 2048,
preferredBackend: PreferredBackend.gpu,
supportImage: true,
maxNumImages: 1,
);
final chat = await model.createChat(
temperature: 0.8,
topK: 40,
supportImage: true,
);
setState(() {
_model = model;
_chat = chat;
_isInitializing = false;
});
} catch (e) {
setState(() {
_isInitializing = false;
_errorMessage = 'Failed to load model: $e';
});
}
}
Future<void> _sendMessage() async {
final text = _controller.text.trim();
if (text.isEmpty || _isGenerating || _chat == null) return;
_controller.clear();
setState(() {
_messages.add(_ChatMessage(text: text, isUser: true));
_isGenerating = true;
_currentResponse = '';
});
_scrollToBottom();
try {
await _chat!.addQueryChunk(Message.text(text: text, isUser: true));
final buffer = StringBuffer();
await for (final response in _chat!.generateChatResponseAsync()) {
if (response is TextResponse) {
buffer.write(response.token);
if (mounted) {
setState(() => _currentResponse = buffer.toString());
_scrollToBottom();
}
}
}
if (mounted) {
setState(() {
_messages.add(_ChatMessage(text: buffer.toString(), isUser: false));
_currentResponse = null;
_isGenerating = false;
});
_scrollToBottom();
}
} catch (e) {
if (mounted) {
setState(() {
_messages.add(_ChatMessage(text: 'Error: $e', isUser: false));
_currentResponse = null;
_isGenerating = false;
});
}
}
}
void _scrollToBottom() {
WidgetsBinding.instance.addPostFrameCallback((_) {
if (_scrollController.hasClients) {
_scrollController.animateTo(
_scrollController.position.maxScrollExtent,
duration: const Duration(milliseconds: 200),
curve: Curves.easeOut,
);
}
});
}
@override
void dispose() {
_chat?.close();
_model?.close();
_controller.dispose();
_scrollController.dispose();
super.dispose();
}
@override
Widget build(BuildContext context) {
return Scaffold(
backgroundColor: const Color(0xFF0A0A0F),
appBar: AppBar(
backgroundColor: const Color(0xFF0A0A0F),
elevation: 0,
iconTheme: const IconThemeData(color: Colors.white),
title: const Text('On-Device Chat',
style: TextStyle(
color: Colors.white, fontSize: 18,
fontWeight: FontWeight.w700,
)),
),
body: _isInitializing
? _buildLoading()
: _errorMessage != null
? _buildError()
: _buildChat(),
);
}
Widget _buildLoading() {
return const Center(
child: Column(
mainAxisAlignment: MainAxisAlignment.center,
children: [
CircularProgressIndicator(color: Color(0xFFFF5722)),
SizedBox(height: 20),
Text('Warming up the model...',
style: TextStyle(color: Colors.white54, fontSize: 14)),
],
),
);
}
Widget _buildError() {
return Center(
child: Padding(
padding: const EdgeInsets.all(32),
child: Text(_errorMessage!,
style: const TextStyle(color: Colors.white70),
textAlign: TextAlign.center),
),
);
}
Widget _buildChat() {
return Column(
children: [
Expanded(
child: ListView.builder(
controller: _scrollController,
padding: const EdgeInsets.all(16),
itemCount: _messages.length + (_currentResponse != null ? 1 : 0),
itemBuilder: (context, index) {
if (index < _messages.length) {
return _buildBubble(_messages[index]);
}
return _buildBubble(
_ChatMessage(text: _currentResponse ?? '', isUser: false),
);
},
),
),
_buildInputBar(),
],
);
}
Widget _buildBubble(_ChatMessage m) {
return Container(
margin: const EdgeInsets.only(bottom: 16),
child: Column(
crossAxisAlignment:
m.isUser ? CrossAxisAlignment.end : CrossAxisAlignment.start,
children: [
Text(m.isUser ? 'You' : 'Gemma',
style: TextStyle(
color: m.isUser ? Colors.white54 : const Color(0xFFFF5722),
fontSize: 10, letterSpacing: 2,
fontWeight: FontWeight.w600,
)),
const SizedBox(height: 6),
Container(
constraints: BoxConstraints(
maxWidth: MediaQuery.of(context).size.width * 0.8,
),
padding: const EdgeInsets.symmetric(horizontal: 14, vertical: 10),
decoration: BoxDecoration(
color: m.isUser
? const Color(0xFFFF5722)
: Colors.white.withOpacity(0.06),
borderRadius: BorderRadius.circular(4),
),
child: Text(m.text,
style: const TextStyle(
color: Colors.white, fontSize: 15, height: 1.4,
)),
),
],
),
);
}
Widget _buildInputBar() {
return Container(
decoration: const BoxDecoration(
border: Border(top: BorderSide(color: Colors.white10)),
),
padding: EdgeInsets.only(
left: 16, right: 16, top: 12,
bottom: MediaQuery.of(context).padding.bottom + 12,
),
child: Row(
children: [
Expanded(
child: TextField(
controller: _controller,
enabled: !_isGenerating,
style: const TextStyle(color: Colors.white),
decoration: InputDecoration(
hintText: _isGenerating ? 'Thinking...' : 'Ask anything...',
hintStyle: const TextStyle(color: Colors.white38),
filled: true,
fillColor: Colors.white.withOpacity(0.04),
border: OutlineInputBorder(
borderRadius: BorderRadius.circular(4),
borderSide: BorderSide.none,
),
contentPadding: const EdgeInsets.symmetric(
horizontal: 16, vertical: 14,
),
),
onSubmitted: (_) => _sendMessage(),
),
),
const SizedBox(width: 8),
Material(
color: _isGenerating
? Colors.white.withOpacity(0.1)
: const Color(0xFFFF5722),
borderRadius: BorderRadius.circular(4),
child: InkWell(
borderRadius: BorderRadius.circular(4),
onTap: _isGenerating ? null : _sendMessage,
child: const Padding(
padding: EdgeInsets.all(14),
child: Icon(Icons.arrow_upward,
color: Colors.white, size: 20),
),
),
),
],
),
);
}
}
class _ChatMessage {
final String text;
final bool isUser;
_ChatMessage({required this.text, required this.isUser});
}
Run it!
Save everything. From terminal:
flutter run
Pick Android. Tap Get Started → Download Model. Wait 5–15 minutes for the 3 GB model to download (be patient — you can monitor the foreground notification). Once done, tap Start Chat.
Try these prompts:
- "Hi! What are you?" — baseline
- "Explain quantum computing in 3 sentences." — reasoning
- "Write a haiku about offline AI." — creative
"Warming up the model..." for 10–30 seconds (model loads into RAM). Then the chat UI appears. Your prompt streams a response token-by-token. First response is slower than later ones.
The airplane mode test ✈️
Now turn on airplane mode. Type another prompt. Watch it still work. This is the magic. No internet. No server. No data leaving the phone.
What does generateChatResponseAsync() return? A Dart Stream. Each TextResponse token is emitted as Gemma generates it. We accumulate tokens into a buffer and update the UI on every token — that's what creates the typewriter effect.
Compare to typical cloud APIs where you await the full response. Streaming makes a huge UX difference: users see something happening immediately, even if total time is the same.
If something goes wrong
"No active inference model set"
The download didn't fully finalize. The installModel().install() call at the top of _initializeModel() handles this — it detects existing files and just sets them as active. If the error persists, uninstall the app from your phone and start fresh.
"Out of memory" / app crashes during loading
Your phone has <6GB RAM. Try changing preferredBackend: PreferredBackend.gpu to PreferredBackend.cpu — slower but uses less memory.
Streaming is super slow
Normal on first inference (model warmup). Subsequent prompts are 2-3x faster.
Response just appears all at once instead of streaming
Cosmetic only — your phone's GPU is generating tokens faster than the UI can repaint. Still works correctly.
Add multimodal image input
Replace your chat_screen.dart with this fuller version that adds image picking, attachment preview, and image-aware messages:
import 'dart:typed_data';
import 'package:flutter/material.dart';
import 'package:flutter_gemma/flutter_gemma.dart';
import 'package:flutter_gemma/core/api/flutter_gemma.dart';
import 'package:image_picker/image_picker.dart';
class ChatScreen extends StatefulWidget {
const ChatScreen({super.key});
@override
State<ChatScreen> createState() => _ChatScreenState();
}
class _ChatScreenState extends State<ChatScreen> {
final TextEditingController _controller = TextEditingController();
final ScrollController _scrollController = ScrollController();
final ImagePicker _imagePicker = ImagePicker();
InferenceModel? _model;
InferenceChat? _chat;
final List<_ChatMessage> _messages = [];
bool _isInitializing = true;
bool _isGenerating = false;
String? _currentResponse;
String? _errorMessage;
// Image attached but not yet sent
Uint8List? _pendingImage;
@override
void initState() {
super.initState();
_initializeModel();
}
Future<void> _initializeModel() async {
try {
await FlutterGemma.installModel(modelType: ModelType.gemmaIt)
.fromNetwork(
'https://huggingface.co/google/gemma-3n-E2B-it-litert-preview/resolve/main/gemma-3n-E2B-it-int4.task',
)
.install();
final model = await FlutterGemma.getActiveModel(
maxTokens: 2048,
preferredBackend: PreferredBackend.gpu,
supportImage: true,
maxNumImages: 1,
);
final chat = await model.createChat(
temperature: 0.8,
topK: 40,
supportImage: true,
);
setState(() {
_model = model;
_chat = chat;
_isInitializing = false;
});
} catch (e) {
setState(() {
_isInitializing = false;
_errorMessage = 'Failed to load model: $e';
});
}
}
Future<void> _pickImage() async {
try {
final picked = await _imagePicker.pickImage(
source: ImageSource.gallery,
maxWidth: 1024,
maxHeight: 1024,
imageQuality: 85,
);
if (picked == null) return;
final bytes = await picked.readAsBytes();
setState(() => _pendingImage = bytes);
} catch (e) {
if (mounted) {
ScaffoldMessenger.of(context).showSnackBar(
SnackBar(content: Text('Failed to pick image: $e')),
);
}
}
}
Future<void> _sendMessage() async {
final text = _controller.text.trim();
final hasImage = _pendingImage != null;
if ((text.isEmpty && !hasImage) || _isGenerating || _chat == null) return;
_controller.clear();
final imageToSend = _pendingImage;
setState(() {
_messages.add(_ChatMessage(
text: text, isUser: true, image: imageToSend,
));
_pendingImage = null;
_isGenerating = true;
_currentResponse = '';
});
_scrollToBottom();
try {
// Build text-only OR image+text message
final Message message = hasImage
? Message.withImage(
text: text.isEmpty ? 'Describe this image in detail.' : text,
imageBytes: imageToSend!,
isUser: true,
)
: Message.text(text: text, isUser: true);
await _chat!.addQueryChunk(message);
final buffer = StringBuffer();
await for (final response in _chat!.generateChatResponseAsync()) {
if (response is TextResponse) {
buffer.write(response.token);
if (mounted) {
setState(() => _currentResponse = buffer.toString());
_scrollToBottom();
}
}
}
if (mounted) {
setState(() {
_messages.add(_ChatMessage(text: buffer.toString(), isUser: false));
_currentResponse = null;
_isGenerating = false;
});
_scrollToBottom();
}
} catch (e) {
if (mounted) {
setState(() {
_messages.add(_ChatMessage(text: 'Error: $e', isUser: false));
_currentResponse = null;
_isGenerating = false;
});
}
}
}
  void _scrollToBottom() {
    WidgetsBinding.instance.addPostFrameCallback((_) {
      if (_scrollController.hasClients) {
        _scrollController.animateTo(
          _scrollController.position.maxScrollExtent,
          duration: const Duration(milliseconds: 200),
          curve: Curves.easeOut,
        );
      }
    });
  }

  @override
  void dispose() {
    _chat?.close();
    _model?.close();
    _controller.dispose();
    _scrollController.dispose();
    super.dispose();
  }
  @override
  Widget build(BuildContext context) {
    return Scaffold(
      backgroundColor: const Color(0xFF0A0A0F),
      appBar: AppBar(
        backgroundColor: const Color(0xFF0A0A0F),
        elevation: 0,
        iconTheme: const IconThemeData(color: Colors.white),
        title: const Text('On-Device Chat',
            style: TextStyle(
              color: Colors.white, fontSize: 18,
              fontWeight: FontWeight.w700,
            )),
      ),
      body: _isInitializing
          ? const Center(child: CircularProgressIndicator(color: Color(0xFFFF5722)))
          : _errorMessage != null
              ? Center(child: Text(_errorMessage!,
                  style: const TextStyle(color: Colors.white70)))
              : _buildChat(),
    );
  }
  Widget _buildChat() {
    return Column(
      children: [
        Expanded(
          child: ListView.builder(
            controller: _scrollController,
            padding: const EdgeInsets.all(16),
            itemCount: _messages.length + (_currentResponse != null ? 1 : 0),
            itemBuilder: (context, index) {
              if (index < _messages.length) {
                return _buildBubble(_messages[index]);
              }
              return _buildBubble(
                _ChatMessage(text: _currentResponse ?? '', isUser: false),
              );
            },
          ),
        ),
        if (_pendingImage != null) _buildPendingImage(),
        _buildInputBar(),
      ],
    );
  }
  Widget _buildBubble(_ChatMessage m) {
    return Container(
      margin: const EdgeInsets.only(bottom: 16),
      child: Column(
        crossAxisAlignment:
            m.isUser ? CrossAxisAlignment.end : CrossAxisAlignment.start,
        children: [
          Text(m.isUser ? 'You' : 'Gemma',
              style: TextStyle(
                color: m.isUser ? Colors.white54 : const Color(0xFFFF5722),
                fontSize: 10, letterSpacing: 2,
                fontWeight: FontWeight.w600,
              )),
          const SizedBox(height: 6),
          if (m.image != null)
            Container(
              constraints: BoxConstraints(
                maxWidth: MediaQuery.of(context).size.width * 0.65,
              ),
              margin: const EdgeInsets.only(bottom: 6),
              child: ClipRRect(
                borderRadius: BorderRadius.circular(4),
                child: Image.memory(m.image!, fit: BoxFit.cover),
              ),
            ),
          if (m.text.isNotEmpty)
            Container(
              constraints: BoxConstraints(
                maxWidth: MediaQuery.of(context).size.width * 0.8,
              ),
              padding: const EdgeInsets.symmetric(horizontal: 14, vertical: 10),
              decoration: BoxDecoration(
                color: m.isUser
                    ? const Color(0xFFFF5722)
                    : Colors.white.withOpacity(0.06),
                borderRadius: BorderRadius.circular(4),
              ),
              child: Text(m.text,
                  style: const TextStyle(
                    color: Colors.white, fontSize: 15, height: 1.4,
                  )),
            ),
        ],
      ),
    );
  }
  Widget _buildPendingImage() {
    return Container(
      padding: const EdgeInsets.symmetric(horizontal: 16, vertical: 12),
      decoration: BoxDecoration(
        color: Colors.white.withOpacity(0.04),
        border: const Border(
          top: BorderSide(color: Colors.white10),
          bottom: BorderSide(color: Colors.white10),
        ),
      ),
      child: Row(
        children: [
          ClipRRect(
            borderRadius: BorderRadius.circular(4),
            child: Image.memory(_pendingImage!,
                width: 50, height: 50, fit: BoxFit.cover),
          ),
          const SizedBox(width: 12),
          const Expanded(
            child: Text('Image attached',
                style: TextStyle(color: Colors.white70, fontSize: 13)),
          ),
          IconButton(
            icon: const Icon(Icons.close, color: Colors.white54),
            onPressed: () => setState(() => _pendingImage = null),
          ),
        ],
      ),
    );
  }
  Widget _buildInputBar() {
    return Container(
      decoration: const BoxDecoration(
        border: Border(top: BorderSide(color: Colors.white10)),
      ),
      padding: EdgeInsets.only(
        left: 8, right: 16, top: 8,
        bottom: MediaQuery.of(context).padding.bottom + 8,
      ),
      child: Row(
        children: [
          IconButton(
            icon: Icon(Icons.add_photo_alternate_outlined,
              color: _isGenerating
                  ? Colors.white24
                  : const Color(0xFFFF5722),
              size: 28,
            ),
            onPressed: _isGenerating ? null : _pickImage,
          ),
          Expanded(
            child: TextField(
              controller: _controller,
              enabled: !_isGenerating,
              style: const TextStyle(color: Colors.white),
              decoration: InputDecoration(
                hintText: _isGenerating
                    ? 'Thinking...'
                    : (_pendingImage != null
                        ? 'Ask about the image...'
                        : 'Ask anything...'),
                hintStyle: const TextStyle(color: Colors.white38),
                filled: true,
                fillColor: Colors.white.withOpacity(0.04),
                border: OutlineInputBorder(
                  borderRadius: BorderRadius.circular(4),
                  borderSide: BorderSide.none,
                ),
                contentPadding: const EdgeInsets.symmetric(
                  horizontal: 16, vertical: 14,
                ),
              ),
              onSubmitted: (_) => _sendMessage(),
            ),
          ),
          const SizedBox(width: 8),
          Material(
            color: _isGenerating
                ? Colors.white.withOpacity(0.1)
                : const Color(0xFFFF5722),
            borderRadius: BorderRadius.circular(4),
            child: InkWell(
              borderRadius: BorderRadius.circular(4),
              onTap: _isGenerating ? null : _sendMessage,
              child: const Padding(
                padding: EdgeInsets.all(14),
                child: Icon(Icons.arrow_upward,
                    color: Colors.white, size: 20),
              ),
            ),
          ),
        ],
      ),
    );
  }
}
class _ChatMessage {
  final String text;
  final bool isUser;
  final Uint8List? image;

  _ChatMessage({required this.text, required this.isUser, this.image});
}
Test it
Hot restart (press R in the running flutter terminal). On the chat screen, tap the image icon, pick a photo. Type "What is this?" and send.
The image preview shows above the input bar. After sending, the image appears in your chat bubble, and Gemma's response describes it. The first image inference takes 10-20 seconds (the model has to process the image tokens); subsequent ones are faster.
On 6 GB RAM phones, opening the system camera while Gemma is loaded triggers an out-of-memory kill — Android terminates your app. We learned this the hard way during development. The gallery picker is memory-safe.
If you have an 8 GB+ phone and want the camera too, change ImageSource.gallery to ImageSource.camera in _pickImage. On most modern flagships it works fine.
If something goes wrong
App crashes when picking image
Permission missing. Re-check Step 5c (Android) and 6b (iOS).
Image attaches but response is empty/weird
Try with a text prompt: "Describe this in detail." Empty prompts sometimes confuse the model.
Camera path crashes the app
Memory limit on your device. Stick with ImageSource.gallery.
Release build + GitHub
13a. Release build (Android)
Connect your Android phone:
flutter run --release
The first release build takes 3-5 minutes (R8 code shrinking and obfuscation). The result: a much smaller, faster APK with no debug overlays. Inference runs 2-3× faster than in debug mode.
Once it's running, you can press q to stop. The release APK is now installed on your phone permanently — you can launch it from the home screen any time.
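If you'd rather have an APK file you can hand to someone (or sideload on another phone), flutter build apk produces one. A sketch using Flutter's standard output path — adjust if your project layout differs:

```shell
# Build a standalone, shareable release APK
flutter build apk --release

# Flutter writes it here by default:
ls -lh build/app/outputs/flutter-apk/app-release.apk

# Sideload onto any connected device with USB debugging enabled:
adb install -r build/app/outputs/flutter-apk/app-release.apk
```

The APK itself stays small because the 3 GB model is downloaded on first launch, not bundled.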
13b. Verify .gitignore is safe
Run this — neither your .env file nor your token should appear anywhere in git's tracked files:
git ls-files | grep -E "\.env|hf_"
# Should output NOTHING. If it lists files, stop and fix your .gitignore.
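If you want git to enforce this check on every commit, a small pre-commit hook can refuse any staged change that looks like it contains a HuggingFace token. This is an optional sketch — the hook body and the hf_ pattern are my assumptions, not part of the project setup:

```shell
# Optional guard: save as .git/hooks/pre-commit and make it executable.
# Aborts the commit if any staged addition contains an hf_... token.
if git diff --cached --unified=0 | grep -E '^\+.*hf_[A-Za-z0-9]{10,}' >/dev/null; then
  echo "Refusing to commit: staged changes contain what looks like a HuggingFace token." >&2
  exit 1
fi
exit 0
```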
Add a few extra ignore patterns just in case:
cat >> .gitignore <<'EOF'

# Secrets & tokens
.env
config.json

# Model files (too large for git)
*.task
*.tflite
*.bin
*.litertlm
models/

# OS
.DS_Store
Thumbs.db
EOF
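You can sanity-check ignore rules without touching your real repo: git check-ignore reports which paths a .gitignore would exclude. A throwaway demo (file names are made up):

```shell
# Scratch repo to verify the ignore patterns behave as expected
tmp=$(mktemp -d)
cd "$tmp"
git init -q
printf '.env\n*.task\nmodels/\n' > .gitignore
mkdir models
touch .env gemma.task models/big.bin main.dart

# Prints each ignored path: .env, gemma.task, models/big.bin
git check-ignore .env gemma.task models/big.bin

# Exits non-zero for paths that would be committed
git check-ignore main.dart || echo "main.dart would be tracked"
```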
13c. Create a great README
Replace your README.md:
# Gemma Vision Demo

On-device AI Flutter app using Google's Gemma 3n E2B multimodal model.

## What it does

- Text chat with streaming responses
- Image understanding (multimodal vision)
- Works completely offline (airplane mode)
- Data never leaves the device

Powered by [Gemma 3n E2B](https://huggingface.co/google/gemma-3n-E2B-it-litert-preview) running locally via [flutter_gemma](https://pub.dev/packages/flutter_gemma).

## Setup

1. Get a HuggingFace token at [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens)
2. Request access to the [Gemma model](https://huggingface.co/google/gemma-3n-E2B-it-litert-preview)
3. Create `.env` in the project root:
   ```
   HUGGINGFACE_TOKEN=hf_your_token_here
   ```
4. Install: `flutter pub get` then `cd ios && pod install && cd ..`
5. Run: `flutter run`

First launch downloads the ~3 GB model (one-time, over WiFi).

## Requirements

- Flutter 3.24+
- Android: minSdk 26, 6 GB+ RAM (8 GB+ for live camera)
- iOS: 16.0+

## Memory tradeoff

On 6 GB RAM phones, opening the camera while Gemma is loaded triggers OOM — Android kills the app. This demo uses the gallery picker as a workaround. On 8 GB+ phones, the camera works fine.

## Credits

- [flutter_gemma](https://github.com/DenisovAV/flutter_gemma) by Sasha Denisov
- [Gemma](https://ai.google.dev/gemma) by Google DeepMind
- [MediaPipe](https://developers.google.com/mediapipe) for on-device inference

## License

MIT
13d. Initialize and push
git init
git branch -M main
git add .
git status
# Verify .env and config.json are NOT in the list
git commit -m "Initial commit: on-device AI demo with Flutter and Gemma"
Now create a public repo at github.com/new. Name it gemma-vision-demo. Don't initialize with a README (we have one). Then:
git remote add origin https://github.com/YOUR_USERNAME/gemma-vision-demo.git
git push -u origin main
Replace YOUR_USERNAME with your actual GitHub handle.
The result: a public GitHub repo with your full demo code, a clean README, and no secrets exposed — ready to be cloned and run by anyone.
If something goes wrong
Release build fails with "Missing classes"
Re-verify Step 5d-5e. R8 needs explicit keep rules.
"warning: LF will be replaced by CRLF"
Harmless on Windows. Ignore.
Pushed too fast and now the token is on GitHub
Stop. Revoke the token immediately on the HuggingFace tokens page and generate a new one. Then use git filter-repo or BFG Repo-Cleaner to scrub the old token from history — deleting the file in a new commit is not enough, because the token survives in earlier commits.
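Why isn't git rm enough? Because the token lives in history, not just the working tree. A self-contained demo (the token is fabricated) shows the leak surviving deletion; the actual scrub is shown as comments, since git-filter-repo rewrites history destructively:

```shell
# Throwaway repo demonstrating that deleting a file does NOT remove a
# leaked token from git history (token below is fake).
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.email demo@example.com && git config user.name demo
echo 'HUGGINGFACE_TOKEN=hf_leaked_example' > .env
git add .env && git commit -qm "oops: committed .env"
git rm -q .env && git commit -qm "remove .env"

# Working tree is clean, yet history still contains the token:
git log --all -S 'hf_leaked_example' --oneline

# To truly scrub it (requires git-filter-repo; rewrites history):
#   printf 'hf_leaked_example==>***REMOVED***\n' > expressions.txt
#   git filter-repo --replace-text expressions.txt
#   git push --force origin main
```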
You shipped it. Now what?
You now have a working on-device AI app. A few directions to take it:
Resources
- flutter_gemma on pub.dev — official package
- Gemma docs — model family + use cases
- Gemma models on HuggingFace — every variant
- MediaPipe GenAI docs — what's under the hood
Now go tell people about it.