Sunday, April 19, 2026

Gemma 4 Function Calling Explained: A Step-by-Step Guide


Imagine asking your AI model, "What's the weather in Tokyo right now?" and instead of hallucinating an answer, it calls your actual Python function, fetches live data, and responds correctly. That's how empowering the tool-calling capabilities in Google's Gemma 4 are. It is a genuinely exciting addition to open-weight AI: function calling that is structured, reliable, and built directly into the model!

Coupled with Ollama for local inference, this lets you build AI agents that don't depend on the cloud. The best part: these agents can reach real-world APIs and services locally, without any subscription. In this guide, we'll cover the concept and implementation architecture, along with three tasks you can experiment with immediately.

Also read: Running Claude Code for Free with Gemma 4 and Ollama

Conversational language models have knowledge limited by when they were trained. Ask about current market prices or current weather conditions, and they can offer only an approximate answer. Tool calling addresses this gap: the model is given functions that wrap external APIs, so these kinds of questions can be answered with live data.

With tool calling enabled, the model can:

  • Recognize when it needs to retrieve external information
  • Identify the correct function based on the provided API
  • Compose correctly formatted function calls (with arguments)

It then waits for the execution of that call to return its output, and composes a grounded answer based on that output.

To clarify: the model never executes the function calls it produces. It only decides which functions to call and how to structure each call's argument list. Your own code executes the functions requested through the API. In this arrangement, the model is the brain, while the functions being called are the hands.

Before you begin writing code, it helps to understand how everything fits together. Here is the loop that every tool call in Gemma 4 follows:

  1. Define Python functions that perform real tasks (e.g., retrieve weather data from an external source, query a database, convert money from one currency to another).
  2. Create a JSON schema for each function you have written. The schema should contain the function's name and its parameters (including their types).
  3. When the user sends a message, you send both your tool schemas and the user's message to the Ollama API.
  4. Ollama returns its answer as a tool_calls block rather than plain text.
  5. You execute the function with the arguments Ollama sent back.
  6. You return the result to Ollama as a "role": "tool" message.
  7. Ollama incorporates the result and replies to you in natural language.

This two-pass pattern is the foundation of every function-calling AI agent, including the examples shown below.
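The seven steps above can be sketched end to end with the model stubbed out. Everything below is illustrative only (fake_model stands in for the real Ollama call, which we build in the next section):

```python
# Minimal sketch of the two-pass tool-calling loop.
# fake_model is a stand-in: a real agent POSTs to the Ollama API instead.

def fake_model(messages, tools):
    # Pass 1: no tool result yet, so "ask" for a weather lookup.
    # Pass 2: a tool result is present, so answer in plain text.
    if any(m["role"] == "tool" for m in messages):
        return {"role": "assistant", "content": "It is 28 degrees in Tokyo."}
    return {"role": "assistant",
            "tool_calls": [{"function": {"name": "get_weather",
                                         "arguments": {"city": "Tokyo"}}}]}

def get_weather(city):
    return f"{city}: 28 degrees"          # stub for a real API call

TOOLS = {"get_weather": get_weather}

def run(user_query):
    messages = [{"role": "user", "content": user_query}]
    msg = fake_model(messages, TOOLS)                 # pass 1: model decides
    messages.append(msg)
    for tc in msg.get("tool_calls", []):
        result = TOOLS[tc["function"]["name"]](**tc["function"]["arguments"])
        messages.append({"role": "tool", "content": result})
    return fake_model(messages, TOOLS)["content"]     # pass 2: final answer

print(run("What's the weather in Tokyo right now?"))
# → It is 28 degrees in Tokyo.
```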

To run these tasks, you need two components: Ollama installed locally on your machine, and the Gemma 4 Edge 2B model downloaded. There are no dependencies beyond what ships with a standard Python installation, so you don't need to worry about installing pip packages at all.

1. Install Ollama (macOS/Linux):

# Install Ollama (macOS/Linux)
curl -fsSL https://ollama.com/install.sh | sh

2. Download the model (roughly 2.5 GB):

# Download the Gemma 4 Edge model (E2B)
ollama pull gemma4:e2b

After the download finishes, run `ollama list` to confirm the model appears in your list of models. You can now connect to the running API at http://localhost:11434 and issue requests against it using the helper function we'll create:

import json, urllib.request, urllib.parse

def call_ollama(payload: dict) -> dict:
    data = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        "http://localhost:11434/api/chat",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))

No third-party libraries are needed, so the agent runs independently and offers full transparency.
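For reference, the payload passed to this helper follows the shape Ollama's /api/chat endpoint expects: a model name, a message list, an optional tools array, and stream disabled so the response arrives as a single JSON object. A minimal example:

```python
import json

# Minimal /api/chat payload, matching what call_ollama sends.
payload = {
    "model": "gemma4:e2b",
    "messages": [{"role": "user", "content": "Hello"}],
    "tools": [],        # tool schemas from the tasks below go here
    "stream": False,    # one complete JSON response instead of a stream
}
print(json.dumps(payload, indent=2))
```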

Also read: How to Run Gemma 4 on Your Phone: A Hands-On Guide

Hands-on Task 01: Live Weather Lookup

Our first tool uses Open-Meteo, a free weather API that requires no key, to pull live data for any location based on latitude/longitude coordinates. To use this API, you'll follow a sequence of steps:

1. Write your function in Python

def get_current_weather(city: str, unit: str = "celsius") -> str:
    geo_url = f"https://geocoding-api.open-meteo.com/v1/search?name={urllib.parse.quote(city)}&count=1"
    with urllib.request.urlopen(geo_url) as r:
        geo = json.loads(r.read())
    loc = geo["results"][0]
    lat, lon = loc["latitude"], loc["longitude"]
    url = (f"https://api.open-meteo.com/v1/forecast"
           f"?latitude={lat}&longitude={lon}"
           f"&current=temperature_2m,wind_speed_10m"
           f"&temperature_unit={unit}")
    with urllib.request.urlopen(url) as r:
        data = json.loads(r.read())
    c = data["current"]
    return f"{city}: {c['temperature_2m']}°, wind {c['wind_speed_10m']} km/h"

2. Define your JSON schema

This tells the model what the function does and what it expects, so Gemma 4 knows exactly how to call it.

weather_tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get live temperature and wind speed for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. Mumbai"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["city"]
        }
    }
}
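Because the model only proposes arguments and your code executes them, it can be worth validating the proposed arguments against the schema before calling the function. Here is a minimal, hypothetical pre-flight check (not part of the Ollama API), shown against an abbreviated copy of the weather schema:

```python
# Hypothetical pre-flight check: validate model-proposed arguments
# against a JSON-schema-style tool definition before executing anything.
TYPE_MAP = {"string": str, "number": (int, float)}

def validate_args(tool_schema: dict, args: dict):
    """Return None if args satisfy the schema, else an error string."""
    params = tool_schema["function"]["parameters"]
    for name in params.get("required", []):
        if name not in args:
            return f"missing required argument: {name}"
    for name, value in args.items():
        spec = params["properties"].get(name)
        if spec is None:
            return f"unexpected argument: {name}"
        expected = TYPE_MAP.get(spec["type"])
        if expected and not isinstance(value, expected):
            return f"{name} should be of type {spec['type']}"
        if "enum" in spec and value not in spec["enum"]:
            return f"{name} must be one of {spec['enum']}"
    return None

# Abbreviated copy of the weather schema above (description fields omitted):
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}

print(validate_args(weather_tool, {"city": "Mumbai"}))                    # → None
print(validate_args(weather_tool, {"city": "Mumbai", "unit": "kelvin"}))
```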
3. Create a query for your tool call (and handle and process the response)

messages = [{"role": "user", "content": "What's the weather in Mumbai right now?"}]
response = call_ollama({"model": "gemma4:e2b", "messages": messages,
                        "tools": [weather_tool], "stream": False})
msg = response["message"]

if "tool_calls" in msg:
    tc = msg["tool_calls"][0]
    fn = tc["function"]["name"]
    args = tc["function"]["arguments"]
    result = get_current_weather(**args)  # executed locally

    messages.append(msg)
    messages.append({"role": "tool", "content": result, "name": fn})

    final = call_ollama({"model": "gemma4:e2b", "messages": messages,
                         "tools": [weather_tool], "stream": False})
    print(final["message"]["content"])

Output

Hands-on Task 02: Live Currency Converter

A plain LLM fails here by hallucinating currency values; it can't provide accurate, up-to-date conversions. With the help of ExchangeRate-API, the converter fetches the latest foreign-exchange rates and converts accurately between two currencies.

Once you complete Steps 1-3 below, you will have a fully functioning converter in Gemma 4:

1. Write your Python function

def convert_currency(amount: float, from_curr: str, to_curr: str) -> str:
    url = f"https://open.er-api.com/v6/latest/{from_curr.upper()}"
    with urllib.request.urlopen(url) as r:
        data = json.loads(r.read())
    rate = data["rates"].get(to_curr.upper())
    if not rate:
        return f"Currency {to_curr} not found."
    converted = round(amount * rate, 2)
    return f"{amount} {from_curr.upper()} = {converted} {to_curr.upper()} (rate: {rate})"

2. Define your JSON schema

currency_tool = {
    "type": "function",
    "function": {
        "name": "convert_currency",
        "description": "Convert an amount between two currencies at live rates.",
        "parameters": {
            "type": "object",
            "properties": {
                "amount":    {"type": "number", "description": "Amount to convert"},
                "from_curr": {"type": "string", "description": "Source currency, e.g. USD"},
                "to_curr":   {"type": "string", "description": "Target currency, e.g. EUR"}
            },
            "required": ["amount", "from_curr", "to_curr"]
        }
    }
}

3. Test your solution with a natural-language query

response = call_ollama({
    "model": "gemma4:e2b",
    "messages": [{"role": "user", "content": "How much is 5000 INR in USD today?"}],
    "tools": [currency_tool],
    "stream": False
})

Gemma 4 parses the natural-language query and formats a proper call with amount = 5000, from_curr = 'INR', to_curr = 'USD'. The resulting tool call is then handled by the same feedback loop described in Task 01.

Output

Hands-on Task 03: Multi-Tool Agent

Gemma 4 excels at this task. You can offer the model several tools at once and submit a compound query. The model coordinates all the required calls in a single pass; manual chaining is unnecessary.

1. Add the timezone tool

def get_current_time(city: str) -> str:
    # Note: this endpoint assumes the city is in the "Asia/" timezone region.
    url = f"https://timeapi.io/api/Time/current/zone?timeZone=Asia/{city}"
    with urllib.request.urlopen(url) as r:
        data = json.loads(r.read())
    return f"Current time in {city}: {data['time']}, {data['dayOfWeek']} {data['date']}"

time_tool = {
    "type": "function",
    "function": {
        "name": "get_current_time",
        "description": "Get the current local time in a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name for timezone, e.g. Tokyo"}
            },
            "required": ["city"]
        }
    }
}

2. Build the multi-tool agent loop

TOOL_FUNCTIONS = {
    "get_current_weather": get_current_weather,
    "convert_currency": convert_currency,
    "get_current_time": get_current_time,
}

def run_agent(user_query: str):
    all_tools = [weather_tool, currency_tool, time_tool]
    messages = [{"role": "user", "content": user_query}]

    response = call_ollama({"model": "gemma4:e2b", "messages": messages,
                            "tools": all_tools, "stream": False})
    msg = response["message"]
    messages.append(msg)

    if "tool_calls" in msg:
        for tc in msg["tool_calls"]:
            fn = tc["function"]["name"]
            args = tc["function"]["arguments"]
            result = TOOL_FUNCTIONS[fn](**args)
            messages.append({"role": "tool", "content": result, "name": fn})

        final = call_ollama({"model": "gemma4:e2b", "messages": messages,
                             "tools": all_tools, "stream": False})
        return final["message"]["content"]
    return msg.get("content", "")
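The TOOL_FUNCTIONS dispatch table can be exercised without the model or the network by feeding it a hand-written tool_calls block with stub tools; this is also a handy way to unit-test the loop:

```python
# Exercising the name -> function dispatch with stub tools and a
# hand-written tool_calls block (no model or network involved).
def get_current_time(city):                        # stub for the real tool
    return f"Time in {city}: 09:00"

def convert_currency(amount, from_curr, to_curr):  # stub for the real tool
    return f"{amount} {from_curr} -> {to_curr}"

TOOL_FUNCTIONS = {"get_current_time": get_current_time,
                  "convert_currency": convert_currency}

# Shaped like the tool_calls block Ollama returns.
tool_calls = [
    {"function": {"name": "get_current_time",
                  "arguments": {"city": "Tokyo"}}},
    {"function": {"name": "convert_currency",
                  "arguments": {"amount": 10000,
                                "from_curr": "INR", "to_curr": "JPY"}}},
]

results = [TOOL_FUNCTIONS[tc["function"]["name"]](**tc["function"]["arguments"])
           for tc in tool_calls]
print(results)   # ['Time in Tokyo: 09:00', '10000 INR -> JPY']
```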

3. Execute a compound, multi-intent query

print(run_agent(
    "I'm flying to Tokyo tomorrow. What's the current time there, "
    "the weather, and how much is 10000 INR in JPY?"
))

Output

Here, we wired three distinct functions to three separate real-time APIs, all driven through natural language with one common pattern. Everything runs locally against the Gemma 4 instance; none of these components uses any remote or cloud resource.

What Makes Gemma 4 Different for Agentic AI?

Other open-weight models can call tools, but they don't do so reliably, and that is what sets Gemma 4 apart. The model consistently produces valid JSON arguments, handles optional parameters correctly, and knows when to answer directly rather than call a tool. As you keep using it, keep the following in mind:

  • Schema quality is critically important. If your description field is vague, the model will struggle to fill in arguments for your tool. Be specific with units, formats, and examples.
  • Gemma 4 validates the required array and respects the required/optional distinction.
  • Once a tool returns a result, that result becomes context via the "role": "tool" messages you send during the final pass. The richer the tool's result, the richer the response will be.
  • A common mistake is to return the tool result as "role": "user" instead of "role": "tool"; the model won't attribute it correctly and will try to re-request the call.
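Putting the last two points together, a well-formed second-pass message list has this role ordering (contents abbreviated):

```python
# Correct role ordering for the second pass: the assistant's tool_calls
# message is followed by a "tool" message (not "user") carrying the result.
messages = [
    {"role": "user", "content": "What's the weather in Mumbai right now?"},
    {"role": "assistant", "tool_calls": [
        {"function": {"name": "get_current_weather",
                      "arguments": {"city": "Mumbai"}}}]},
    {"role": "tool", "name": "get_current_weather",
     "content": "Mumbai: 31.0°, wind 12.5 km/h"},
]

roles = [m["role"] for m in messages]
print(roles)  # ['user', 'assistant', 'tool']
```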

Also read: Top 10 Gemma 4 Projects That Will Blow Your Mind

Conclusion

You have built a real AI agent using Gemma 4's function-calling feature, and it runs entirely locally. The agent uses every component of the production architecture. Possible next steps include:

  • adding a file-system tool that reads and writes local files on demand;
  • wiring up a SQL database to serve natural-language data queries;
  • creating a memory tool that writes session summaries to disk, giving the agent the ability to recall past conversations.
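As a sketch of the first suggestion, a read-only file tool follows exactly the same function-plus-schema pattern used above. The names here are hypothetical, and in a real agent you should restrict which paths the model may request before exposing anything like this:

```python
import pathlib

# Hypothetical file-reading tool, same function-plus-schema pattern as
# the weather and currency tools. Restrict allowable paths before
# letting a model drive this.
def read_file(path: str) -> str:
    p = pathlib.Path(path)
    if not p.is_file():
        return f"File not found: {path}"
    return p.read_text(encoding="utf-8")[:2000]  # cap the context size

file_tool = {
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Read the contents of a local text file.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string",
                         "description": "Path to the file, e.g. notes.txt"}
            },
            "required": ["path"]
        }
    }
}
```

Registering it is one entry in TOOL_FUNCTIONS plus appending file_tool to the tools list.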

The open-weight AI agent ecosystem is evolving quickly. Gemma 4's native support for structured function calling gives you substantial autonomous capability without any reliance on the cloud. Start small, get a working system running, and the building blocks for your next projects will be ready to chain together.

 

Technical content strategist and communicator with a decade of experience in content creation and distribution across national media, the Government of India, and private platforms.

