coding-agents OpenAI API Typescript Python

How to Build a Coding Agent

1 April 2026

Introduction


Coding agents have become a fundamental part of every developer’s toolbox. Whether you use them inside your IDE (Cursor Agent, Copilot in VSCode), in the CLI (Claude Code, Codex CLI, Copilot CLI), in a standalone desktop app (Codex App, Claude Desktop) or on the web as sandboxed agents with computers of their own (Perplexity Computer, Claude AI), what makes a coding agent different from, and so much more powerful than, the previous generation of AI coding assistants (autocomplete and tab-completion models) is the full autonomy it has over the actions it takes on your codebase. Given a prompt, a coding agent must be able to gather context independently and decide what action to take.

Coding agents are very powerful but quite simple in concept. You provide an LLM with tools that let it interact with your computer’s file system and terminal, so it can perform CRUD operations on files and run commands independently. During inference, the agent can call these tools, which are mapped to local functions in your code that handle the tool calls and return status information back to the LLM.
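To make that concrete, here is a minimal sketch of the loop we will spend the rest of this tutorial building out for real. sendToModel and executeTool are stubs standing in for the OpenAI API call and the tool handlers we write below; the message shapes are simplified.

// Conceptual sketch of the agent loop described above. sendToModel and
// executeTool are stubs standing in for the real OpenAI call and tool
// handlers that we build step by step in this tutorial.
type Message = { type: "message"; text: string };
type FunctionCall = { type: "function_call"; call_id: string; name: string; arguments: string };
type OutputItem = Message | FunctionCall;

async function sendToModel(history: object[]): Promise<OutputItem[]> {
	return [{ type: "message", text: "stub response" }]; // real version: client.responses.create()
}

async function executeTool(call: FunctionCall): Promise<string> {
	return JSON.stringify({ ok: true }); // real version: run the matching local function
}

async function agentLoop(history: object[]): Promise<string> {
	while (true) {
		const output = await sendToModel(history);
		history.push(...output);
		const toolCalls = output.filter((item): item is FunctionCall => item.type === "function_call");
		if (toolCalls.length === 0) {
			// no more actions to take: return the model's text response
			return output.map((item) => (item.type === "message" ? item.text : "")).join("");
		}
		for (const call of toolCalls) {
			const result = await executeTool(call); // CRUD on files, run commands, etc.
			history.push({ type: "function_call_output", call_id: call.call_id, output: result });
		}
	}
}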

In this tutorial, we will build tiny-code, a standalone CLI agent that you can open in any folder and ask to perform tasks for you. I will provide the code in both TypeScript and Python so both NodeJS gurus and Pythonistas can follow along. Let’s get right into it!

Project Setup

We will be using gpt-5.4 for this project, so you need an OpenAI API key to follow along.

You need to have NodeJS installed. Follow the installation guide here.

Set up the project and install dependencies with these commands:

mkdir tiny-code
cd tiny-code
npm init -y
npm install openai dotenv
npm install -D typescript @types/node
npx tsc --init
touch .env

Replace the content of tsconfig.json with this:

{
	"compilerOptions": {
		"target": "ES2022",
		"module": "NodeNext",
		"moduleResolution": "NodeNext",
		"outDir": "dist",
		"rootDir": "src",
		"strict": true
	}
}

Replace the content of package.json with this:

{
	"name": "tiny-code",
	"version": "0.1.0",
	"type": "module",
	"bin": {
		"tiny-code": "./dist/agent.js"
	},
	"scripts": {
		"build": "tsc",
		"postbuild": "chmod +x dist/agent.js",
		"prepare": "npm run build",
		"start": "node dist/agent.js",
		"dev": "node --env-file=.env --watch dist/agent.js"
	},
	"dependencies": {
		"dotenv": "^17.3.1",
		"openai": "^6.27.0"
	},
	"devDependencies": {
		"@types/node": "^25.3.2",
		"typescript": "^5.9.0"
	}
}

Project structure:

|__tiny-code/
	|__src/
		|__functions.ts
		|__tools.ts
		|__agent.ts

Install uv

We will be using uv as the package manager for our project. You can find guidance for installing uv here.

Set up the project and install dependencies with these commands:

mkdir tiny-code
cd tiny-code
uv init
uv venv .venv
source .venv/bin/activate
touch .env
uv add openai python-dotenv

Project structure:

|__tiny-code/
	|__src/
		|____init__.py
		|__functions.py
		|__tools.py
		|__agent.py

Creating the functions

We know that at the heart of any LLM-based agent is a tool-calling loop. So, what makes a coding agent different from, say, a customer service agent or a deep research agent? I think it is the functions that you provide to it as tools it can call. A coding agent requires two classes of functions: file system functions and command execution functions. These functions are meant for your coding agent, not for you. What do I mean by that? The LLM in your agent loop is the one that provides the arguments to these functions. So for instance, if the LLM calls the read_file tool, it will provide the path to the file it wants to read as an argument, which you will use to execute the readFile() function that acts as the handler for that tool.

Think about these functions like superpowers that your coding agent can invoke when it needs to do something beyond sending you a text response.

For this project, we will be providing tiny-code with 5 functions.

  1. read_file
  2. create_file
  3. edit_file
  4. delete_file
  5. run_command

Let’s break down each of them:

  1. read_file: this function takes a path to a file and returns its content as utf-8 text. It uses the built-in file system module of your programming language of choice. It is essential for helping your coding agent gather context when performing tasks. For example, if you have an agent.md file where you have written instructions for how to perform a task, instead of putting those instructions into the system prompt directly, you can pass the path in the system prompt and the agent will use the read_file tool to read the content itself. In a multi-agent system, the main agent can spin up worker agents that read the same file and therefore maintain context on your task.

Let’s look at the implementation of this function in code.

import fs from "node:fs/promises"

export interface ToolResult {
	ok: boolean;
	message: string;
}

export async function readFile({path}: {path: string}): Promise<ToolResult> {
	const content = await fs.readFile(path, "utf-8");
	return { ok: true, message: content }
}

First, we create a ToolResult interface to represent the function’s return value: an object containing a status flag and the tool result. This implementation wraps the readFile() method from Node’s file system promises API, fs, and we return the content, which is the result of readFile() decoded as utf-8.

from pathlib import Path

def read_file(path: str) -> dict:
	file_path = Path(path)
	content = file_path.read_text(encoding="utf-8")
	return { "ok": True, "content": content }

This implementation uses the read_text() method on the Path class from Python’s pathlib module. We take the path string, convert it to a Path object and assign it to a new variable, file_path. Then we call read_text() on file_path to get the file content. We return the content in a dict alongside a status flag, which lets the model know whether the function handled the tool call successfully.

  2. create_file: This function gives our agent the ability to create files. The code and logic that make up every piece of software, from a small project like the one we are currently writing to large libraries, applications and web services like Netflix, OpenClaw and Amazon, are stored in files. A create_file function turns our coding agent into an autonomous entity that can write programs that a computer can compile or interpret and run. Here’s the implementation of this function:
import { access } from "node:fs/promises";

export async function createFile({path, content}: {path: string, content: string}): Promise<ToolResult> {
	try {
		await access(path);
		return { ok: false, message: `${path} already exists.` };
	} catch {
		await fs.writeFile(path, content);
		return { ok: true, message: `created new file at ${path}` };
	}
}

Similar to the read_file() function above, this function uses a method from Node’s fs module, writeFile(). It accepts two arguments: path and content, representing the location of the file we want to create and the content to add to it. We don’t want to overwrite existing files accidentally, so we only call writeFile() if the file doesn’t already exist. To overwrite an existing file, it is better to use the edit_file() function.

from typing import Optional

def create_file(path: str, content: Optional[str] = None):
	file_path = Path(path)
	file_path.parent.mkdir(parents=True, exist_ok=True)
	if file_path.exists():
		return {"ok": False, "message": f"{path} already exists."}
	file_path.write_text(content or "")
	return {"ok": True, "message": f"created {file_path} successfully."}

Here again, we are using methods from Python’s Path class. First, we create the file’s parent directory if it doesn’t exist, then we check whether the file itself already exists so we don’t overwrite it by mistake. Finally, we use the write_text() method to write the content to the file. If we are creating an empty file, content can be omitted and we write an empty string.

  3. edit_file: This function is how the agent creates diffs, that is, edits existing code. The edit_file function finds all instances of the content the model wants to change and replaces them with the updated content. This is what it looks like in code:
export async function editFile({path, oldContent, newContent}: {path: string, oldContent: string, newContent: string}): Promise<ToolResult> {
	let content = await fs.readFile(path, "utf-8");
	content = content.replaceAll(oldContent, newContent);
	await fs.writeFile(path, content);
	return { ok: true, message: `edited ${path}, replaced all occurrences of ${oldContent} with ${newContent}` };
}
def edit_file(path: str, old_content: str, new_content: str):
	file_path = Path(path)
	existing_content = file_path.read_text()
	new_file_content = existing_content.replace(old_content, new_content)
	file_path.write_text(new_file_content)
	return {"ok": True, "message": f"edited {path} successfully."}
  4. delete_file: This function lets our agent perform clean-up operations on a codebase by deleting files.
export async function deleteFile({path}: {path: string}): Promise<ToolResult>{
	await fs.rm(path);
	return {ok: true, message: `successfully deleted ${path}`};
} 
import os

def delete_file(path: str):
	file_path = Path(path)
	os.remove(file_path)
	return {"ok": True, "message": f"deleted {path} successfully."}

Since this function performs destructive actions, you might want to require approval whenever your agent calls it (a minimal sketch of such a gate follows below). Or you could go dangerously-skip-permissions or yolo_mode and give your agent full autonomy; your choice.
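Here is a minimal sketch of that approval gate, assuming the Node version of the project. The requireApproval() wrapper is hypothetical, not part of tiny-code; it just asks you to confirm in the terminal before the real handler runs.

// Hypothetical approval gate for destructive tools (a sketch, not part of tiny-code).
import * as readline from "node:readline/promises";
import { stdin as input, stdout as output } from "node:process";

type Handler = (args: any) => Promise<{ ok: boolean; message: string }>;

export function requireApproval(name: string, handler: Handler): Handler {
	return async (args) => {
		const rl = readline.createInterface({ input, output });
		const answer = await rl.question(`Allow ${name}(${JSON.stringify(args)})? [y/N] `);
		rl.close();
		if (answer.trim().toLowerCase() !== "y") {
			// tell the model the call was denied so it can adjust its plan
			return { ok: false, message: `user denied ${name}` };
		}
		return handler(args);
	};
}

// Usage in the tool handler map: delete_file: requireApproval("delete_file", deleteFile)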

  5. run_command: This is the most powerful function you can give to your agent. In fact, every other function we have looked at so far could be replaced by this single one. It gives your agent the ability to interact directly with your computer’s operating system by creating new processes that run commands through your computer’s command line interface.

When an agent runs a command, it is creating a new process on your computer. Processes are running programs, with memory and resources isolated from each other. Applications on your computer are often made up of multiple concurrently running processes, and processes are what make it possible for your computer to run multiple applications at the same time. When you run a command in a terminal, like mkdir some-dir or pwd, you are running binaries: programs that your operating system can execute directly.

So, for example, mkdir and pwd are small programs that ship with your operating system (pwd is also commonly a shell builtin), and when you run mkdir some-dir your operating system creates a new process that runs the mkdir binary to actually create some-dir. The command line interface is a visual interface for you to input these commands to the OS.

Python’s subprocess module and Node’s child_process module provide methods that you can use to create these processes programmatically, without needing a command line. You have the option of running these commands through a shell, like Bash or Zsh on Mac and Linux, or Command Prompt and PowerShell on Windows. The shell acts like a middleman between your commands and the OS kernel: it takes your commands, figures out what programs you want to run, and asks your operating system’s kernel to execute them.

For implementing this function, we will use the shell in the TypeScript version and create the process directly without going through the shell in the Python version.

Node’s child_process module has multiple methods for creating new processes programmatically. For example, spawn() lets you create a new process without creating a shell. The output from your command, stdout and stderr, arrives in chunks as the process produces it. exec() creates a shell, collects the process output into memory and returns the complete result in a callback. exec() is recommended when you need to run a command and get the entire output at once, but it has a default buffer size limit of 1MB, so if a command produces very large output, it will throw an error. spawn() is recommended when you have a long-running process or expect very large output, because it streams output directly and does not need to store everything in memory. For this tutorial, we will be using exec().
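For contrast, here is roughly what a spawn()-based runner could look like. This is just a sketch to show the streaming style; we won’t use it in tiny-code.

// Sketch of a spawn()-based runner that streams output instead of buffering it.
import { spawn } from "node:child_process";

export function runCommandWithSpawn(command: string, args: string[]): Promise<number> {
	return new Promise((resolve, reject) => {
		const child = spawn(command, args); // no shell; args are passed literally
		child.stdout.on("data", (chunk) => process.stdout.write(chunk)); // stream chunks as they arrive
		child.stderr.on("data", (chunk) => process.stderr.write(chunk));
		child.on("error", reject); // e.g. the binary doesn't exist
		child.on("close", (code) => resolve(code ?? 0));
	});
}

With that contrast in mind, back to exec().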

So how does this function work?

import { exec } from "node:child_process";

interface CommandResult {
	stdout: string;
	stderr: string;
	exitCode: number;
}

export function runCommandWithExec({ command}: { command: string }): Promise<CommandResult> {
	return new Promise((resolve, reject) => {
		exec(command, (error, stdout, stderr) => {
			if (error && typeof error.code !== "number") {
				reject(error);
				return;
			}
		
			resolve({
				stdout: stdout.trim(),
				stderr: stderr.trim(),
				exitCode: error ? Number(error.code) : 0
			});
		});
	});
}

We pass our command into the exec() function alongside a callback with error, stdout and stderr as arguments. exec() itself is wrapped inside a Promise. If the error is not a normal non-zero exit (for example, the process could not be spawned at all), we reject() with it. Otherwise we resolve() with an object literal containing the trimmed stdout and stderr plus the exit code, which is 0 on success and the numeric error code when the command fails.
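Before wiring it into the agent, you can sanity-check the function on its own. A quick usage example, assuming the export above (top-level await works here because the project is an ES module):

// Manual test of runCommandWithExec, assuming the export above.
const result = await runCommandWithExec({ command: "ls -lh" });
console.log(result.exitCode); // 0 when the command succeeded
console.log(result.stdout);   // the directory listing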

import shlex
import subprocess

def run_command(cmd: str) -> dict:
	# shlex.split respects quoting, so 'echo "hello world"' stays a single argument
	args = shlex.split(cmd)
	result = subprocess.run(
		args,
		shell=False,
		capture_output=True,
		text=True,
		check=True
	)
	return {"ok": True, "result": result.stdout}

In Python, the subprocess module’s high-level run() method handles creating child processes. Shell configuration and output capture are handled as parameters that we pass to run(). Let’s go over the code. We pass our command as a string, and we want this function to return a dict with a status flag and the output of the process. First, we turn the command string into a list of strings using shlex.split(), which understands shell-style quoting. This means each part of our command is passed as a separate literal argument, avoiding the ambiguity of a shell guessing where each argument starts and ends. In the subprocess.run() call, we pass the args; shell=False means we are not passing our command through a shell to the OS but creating the process directly. We set capture_output=True so that we receive the output from the process. We set text=True so the output, which would otherwise be raw bytes, is decoded into readable text. check=True raises an error automatically if the command exits with a failure status, which lets us know when something went wrong in the process.

Using the run_command() function, tiny-code can use OS-native programs and command line utilities like cat, mv, curl, rm and more to perform almost every action it needs on a computer, including creating, removing, editing and running code files, and even making requests to a web server. So, what is the point of the other four functions then? Remember, with great power comes great responsibility.

A command execution function is powerful but also risky. In production settings you would need block lists that explicitly prevent your coding agent from running destructive commands or accessing sensitive files like .env, or allow lists where you define a narrow set of commands the agent may run, as sketched below. Secondly, it’s good to have separation of concerns. With different functions for different categories of operations, we can add logging and monitoring to each function, and we avoid a single point of failure where, if the command execution function breaks for any reason, the entire agent stops working.
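As an illustration, here is a minimal allow-list guard you could put in front of the command handler. The set of allowed commands is purely illustrative, and a real guard would also need to handle shell operators like && and pipes:

// Sketch of an allow-list check for run_command (illustrative, not exhaustive:
// it only inspects the first token and ignores shell operators like && or |).
const ALLOWED_COMMANDS = new Set(["ls", "pwd", "cat", "node", "npm"]);

export function isCommandAllowed(command: string): boolean {
	const binary = command.trim().split(/\s+/)[0]; // first token is the program name
	return ALLOWED_COMMANDS.has(binary);
}

// Inside runCommandWithExec, before calling exec():
// if (!isCommandAllowed(command)) {
// 	return Promise.reject(new Error(`blocked command: ${command}`));
// }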

Function/Tool Schemas

By themselves, the functions we’ve defined are not useful to an LLM. A tool schema is a JSON object that provides structured information about available functions to an LLM. The LLM is trained to recognize this structure and uses it to decide when to use a function and which arguments to call it with. For the OpenAI API, the shape of the tool schema looks like this:

[
	{
		type: "function",
		name: "some_function",
		description: "a description of the function",
		parameters: {
			type: "object",
			properties: {
				an_argument: {
					type: "the argument's JSON Schema type, e.g. string, number, array",
					description: "a description of the argument"
				}
			},
			required: ["an_argument"],
			additionalProperties: false
		},
		strict: true
	}
]

We provide the type of the object (it is always a function under the OpenAI API), define its name and give it a description, and if the function takes in parameters, we provide them as properties with their own types and descriptions. We also specify if any parameter is required by adding it to a required list. When we set additionalProperties to false it prevents our model from introducing unknown arguments when it’s calling a function. Setting strict to true forces our model to strictly adhere to the tool schema. Those two flags give us more reliability in our tool calls.

Below are the tool definitions for our 5 functions:

import { Tool } from "openai/resources/responses/responses";

export const tools: Tool[] = [
	
	{
		type: "function",
		name: "read_file",
		description: "Returns the contents of a file given the path. Use it to read file contents.",
		parameters: {
			type: "object",
			properties: {
				path: {
				type: "string",
				description: "The path to the file"
				}
			},
			required: ["path"],
			additionalProperties: false			
		},
			strict: true
	},
	{
		type: "function",
		name: "create_file",
		description: "Create a file and add some content to it. Eg create a file hello.txt with content 'hello' inside it.",
		
		parameters: {
			type: "object",
			properties: {
				path: {
					type: "string",
					description: "The path to the file to create."
				},
				content: {
					type: "string",
					description: "The content or data to write to the created file."				
					}
				},
			required: ["path", "content"],
			additionalProperties: false,
		},
		strict: true
	},

	{
		type: "function",
		name: "edit_file",
		description: "Edit a file by writing new content to it. eg, edit a file hell.txt with content 'Hello world' to 'Hello planet' by replacing 'world' with 'planet'",
		parameters: {
			type: "object",
			properties: {
				path: {
					type: "string",
					description: "Path to the file to edit"
				},
				oldContent: {
					type: "string",
					description: "The content to replace"
				},
				newContent: {
					type: "string",
					description: "The new content to replace the old content with."
				},
			},
			required: ["path", "oldContent", "newContent"],
			additionalProperties: false
		},
		strict: true
	},
	{
		type: "function",
		name: "delete_file",
		description: "Delete a file",
		parameters: {
		type: "object",
		properties: {
			path: {
				type: "string",
				description: "path to the file to delete"
			}
		},
		required: ["path"],
		additionalProperties: false
		},
		strict: true,
			
	},
	{
		type: "function",
		name: "run_command_with_exec",
		description: "Run commands using Node's exec() child process method. Creates a new shell to run the commands in. Only use for short commands that don't return large outputs. Eg pwd, ls -lh",
		parameters: {
			type: "object",
			properties: {
				command: {
					type: "string",
					description: "The command to run. eg 'ls -lh', 'pwd' "
				}
			},
			required: ["command"],
			additionalProperties: false
		},
		strict: true
	},
]
from openai.types.responses import ToolParam

tools: list[ToolParam] = [
	{
		"type": "function",
		"name": "run_command",
		"description": "Run a command by creating a new process (no shell). Eg 'pwd', 'ls -lh'",
		"parameters": {
			"type": "object",
			"properties": {
				"cmd": {
					"type": "string",
					"description": "The command to run"
				}
			},
			"required": ["cmd"],
			"additionalProperties": False
		},
		"strict": True
	},
	{
		"type": "function",
		"name": "read_file",
		"description": "Read the contents of a file.",
		"parameters": {
			"type": "object",
			"properties": {
				"path": {
					"type": "string",
					"description": "Path to the file to read"
				}
			},
			"required": ["path"],
			"additionalProperties": False,
		},
		"strict": True
	},

	{
		"type": "function",
		"name": "delete_file",
		"description": "Delete a file.",
		"parameters": {
			"type": "object",
			"properties": {
				"path": {
					"type": "string",
					"description": "The path to the file to delete"
				}
			},
			"required": ["path"],
			"additionalProperties": False,
		},
		"strict": True,
	},
	{
		"type": "function",
		"name": "create_file",
		"description": "Create a new file.",
		"parameters": {
			"type": "object",
			"properties": {
				"path": {
					"type": "string",
					"description": "The path to the file to create"
				},
				"content": {
					"type": "string",
					"description": "Content you want to write to the file"
				}
			},
			"required": ["path", "content"],
			"additionalProperties": False,
		},
		"strict": True,
	},
	{
		"type": "function",
		"name": "edit_file",
		"description": "Edit a file by writing new content to it",
		"parameters": {
			"type": "object",
			"properties": {
				"path": {
					"type": "string",
					"description": "The path to the file to edit"
				},
				"old_content": {
					"type": "string",
					"description": "The text to be replaced"
				},
				"new_content": {
					"type": "string",
					"description": "The text to replace old_content with."
				}
			},
			"required": ["path", "old_content", "new_content"],
			"additionalProperties": False,
		},
		"strict": True,
	},
]

Tool handlers

To connect the tool schemas and the functions, we create tool handlers, which are key-value pairs with the tool schema name as the key and the matching function as the value.

export const toolHandlers: Record<string, (args: any) => Promise<any>> = {
	read_file: readFile,
	create_file: createFile,
	edit_file: editFile,
	delete_file: deleteFile,
	run_command_with_exec: runCommandWithExec
}
from functions import read_file, edit_file, create_file, delete_file, run_command

TOOLS = {
	"run_command": run_command,
	"read_file": read_file,
	"create_file": create_file,
	"edit_file": edit_file,
	"delete_file": delete_file
}

Inference: Where the LLM transforms into an agent

After we have written the functions and defined the tool handlers, the next part is to actually get a response from the model. This is done through a process called inference. We send prompts to the model and based on those prompts the model predicts what is likely to come next. For instance, if we prompt the model with, “How are you?”, the model is trained to predict that the next likely sequence of words to follow is, “I am fine.”

Tool calling takes an LLM from just being able to predict the next word to being able to take actions by invoking tools when it is given prompts that require action. So, given the prompt, “create a snake game in Python.”, instead of just outputting the code to you as a text response, it can invoke the create_file tool to write the code into a file and use the run_command tool to run the game. If you follow up with a second prompt, “Port the game to TypeScript.”, the model can use the edit_file tool to update existing files, or even use the delete_file tool to delete existing ones and create new .ts files with the create_file tool.

A good analogy for understanding this is to think of the LLM as a cook. Without tools, it is a cook without a kitchen or utensils: it knows all the recipes and can recite them to you, but for the cook to make you a meal you can eat, you need to give them a stove, utensils, plates, etc. Then they will use these tools to turn the recipes they know into meals and serve them to you.

Here is the code for the inference process using the OpenAI Responses API.

#!/usr/bin/env node

import OpenAI from "openai";
import { tools, toolHandlers } from "./tools.js";
import { ResponseInput } from "openai/resources/responses/responses";
import * as rl from "node:readline";
import { stdin as input, stdout as output } from "node:process";
import "dotenv/config";

const client = new OpenAI({
	apiKey: process.env.OPENAI_API_KEY,
});

async function runAgentTurn(conversationHistory: ResponseInput, userInput: string) {

	conversationHistory.push({ role: "user", content: userInput });

	while (true) {
		const response = await client.responses.create({
			model: "gpt-5.4",
			input: conversationHistory,
			tools,
			tool_choice: "auto"
		});

		conversationHistory.push(...(response.output as ResponseInput));

		const toolCalls = response.output.filter((output) => output.type === "function_call");

		if (toolCalls.length === 0) {
			console.log(`\nAssistant: ${response.output_text}\n`);
			return;
		}

		for (const call of toolCalls) {
			const toolName = call.name;
			const handler = toolHandlers[toolName];
			if (!handler) {
				conversationHistory.push({
					type: "function_call_output",
					call_id: call.call_id,
					output: JSON.stringify({ error: `Unknown tool: ${toolName}` })
				});
				continue;
			}

			let result: any;

			try {
				const args = JSON.parse(call.arguments || "{}");
				result = await handler(args);
			} catch (err: any) {
				result = {
					error: {
						type: "tool_runtime_error",
						message: err instanceof Error ? err.message : String(err)
					}
				};
			}

			conversationHistory.push({
				type: "function_call_output",
				call_id: call.call_id,
				output: JSON.stringify(result)
			});
		}
	}
}


async function main() {
	const terminalChat = rl.createInterface({ input, output });
	console.log(`\nHello! I am your tiny-agent ready to help you with your coding.`);
	const conversationHistory: any[] = [
		{
			role: "developer",
			content: "You are a coding agent. Use the provided tools to answer the user's questions."
	
		}
	];

	const terminalInput = (userInput: string) => new Promise<string>((resolve) => terminalChat.question(userInput, resolve));

	while (true) {
		const newInputFromUser = (await terminalInput("You: ")).trim();
		if (!newInputFromUser) continue;
		if (newInputFromUser === "exit()") break;
		await runAgentTurn(conversationHistory, newInputFromUser);
	}
	terminalChat.close();
}

main().catch((err) => {
	console.error(err);
	process.exit(1);
});

Let’s break down the code and see what each line is doing:

import OpenAI from "openai";
import { tools, toolHandlers } from "./tools.js";
import { ResponseInput } from "openai/resources/responses/responses";
import * as rl from "node:readline";
import { stdin as input, stdout as output } from "node:process";
import "dotenv/config";

We import the OpenAI client class, which we will use to send requests to the API. We import the tools and tool handlers; these will go into the create() method that we use to receive a response from the model. We import the ResponseInput type, which our conversation history will take. We also import node:readline and the stdin/stdout streams from node:process so we can communicate with the agent through a terminal. Lastly, we import dotenv/config to load our OPENAI_API_KEY as an environment variable.

const client = new OpenAI({
	apiKey: process.env.OPENAI_API_KEY,
});

We create the OpenAI client, passing in our API key from the environment.

async function runAgentTurn(conversationHistory: ResponseInput, userInput: string) {

	conversationHistory.push({ role: "user", content: userInput });

	while (true) {
		const response = await client.responses.create({
			model: "gpt-5.4",
			input: conversationHistory,
			tools,
			tool_choice: "auto"
		});

		conversationHistory.push(...(response.output as ResponseInput));

		const toolCalls = response.output.filter((output) => output.type === "function_call");

		if (toolCalls.length === 0) {
			console.log(`\nAssistant: ${response.output_text}\n`);
			return;
		}

		for (const call of toolCalls) {
			const toolName = call.name;
			const handler = toolHandlers[toolName];
			if (!handler) {
				conversationHistory.push({
					type: "function_call_output",
					call_id: call.call_id,
					output: JSON.stringify({ error: `Unknown tool: ${toolName}` })
				});
				continue;
			}

			let result: any;

			try {
				const args = JSON.parse(call.arguments || "{}");
				result = await handler(args);
			} catch (err: any) {
				result = {
					error: {
						type: "tool_runtime_error",
						message: err instanceof Error ? err.message : String(err)
					}
				};
			}

			conversationHistory.push({
				type: "function_call_output",
				call_id: call.call_id,
				output: JSON.stringify(result)
			});
		}
	}
}

We define a runAgentTurn() function which handles the core of the inference work. It takes conversationHistory and userInput as arguments. The conversation history is a running list of all our back-and-forth messages with the model. userInput is the message we send from the terminal. When runAgentTurn() is called, the first thing it does is push userInput into conversationHistory. The OpenAI chat format uses roles to differentiate between kinds of messages during inference. Messages sent by an end user take the role user, messages sent by the model take the role assistant, and messages defined by a developer to control how the model behaves take the role developer.

Next, we set up the inference loop. We use while (true) so that the loop continues to run until we explicitly end the turn. Inside the loop, we call the client’s create() method and assign its result to a response variable. The parameters we pass to this method are the kernel of the agent: we set the model we are using for inference to gpt-5.4, we set the input to our conversationHistory list, we pass our tool schemas, and finally we set tool_choice to auto, meaning the model can choose to call any tool from the list we have provided.

The create() method on the OpenAI Responses API returns an object with output as one of its properties. We push that output to the end of our conversationHistory list. This means that when we send the next message, the model will have full context of all our previous messages and its previous responses.
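To make that concrete, after a single tool-calling turn the history looks roughly like this (simplified; the real output items carry extra fields like id and status, and the call_id here is made up):

// A simplified picture of conversationHistory after one read_file turn.
const exampleHistory = [
	{ role: "developer", content: "You are a coding agent..." },
	{ role: "user", content: "What is in hello.txt?" },
	{ type: "function_call", call_id: "call_abc123", name: "read_file", arguments: "{\"path\":\"hello.txt\"}" },
	{ type: "function_call_output", call_id: "call_abc123", output: "{\"ok\":true,\"message\":\"hello\"}" },
	{ role: "assistant", content: "hello.txt contains the word \"hello\"." }
];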

Next, we filter out all tool calls from the model’s output, and if there is no tool call, we print response.output_text, the model’s final text response, to the terminal. This handles cases where you send a prompt that does not require a tool call, like “How are you?”, or where, after a series of tool calls, the model is sending a summary of the work it has done. For instance, you ask, “How many .md files are there in this project?” The model uses its tools to do the count and sends you a final response with the actual number. The return statement ends the turn; the loop in main() will call runAgentTurn() again when there is new user input. Otherwise, we get to handling tool calls.

We loop through each tool call in the output. First, we take the tool name defined in the function schema and use it as a key into our tool handler map to retrieve the corresponding function. We also handle the failure case where the model calls a tool that we don’t have. This is a hallucination: a situation where the model is literally making things up. We add this information to the conversationHistory so the model can see the error and correct it in its next response.

Next, we create a result variable to hold the result of executing whatever tool the agent calls. We set up a try-catch block and handle the tool calls inside it. We parse the arguments from the function call returned in the model’s output; args falls back to an empty object if there are no arguments. The result = await handler(args) line is the actual function execution. So if the handler is readFile() and args is { path: "/hello.txt" }, await handler(args) is effectively await readFile({ path: "/hello.txt" }).

We catch any errors from the tool execution, for instance read_file() being called with a path that doesn’t exist. These errors are added to the conversation history too and passed to the model so it is aware of them and can fix them in the next turn. Finally, we add the output of the function execution to the conversation history.

async function main() {
	const terminalChat = rl.createInterface({ input, output });
	console.log(`\nHello! I am your tiny-agent ready to help you with your coding.`);
	const conversationHistory: any[] = [
		{
			role: "developer",
			content: "You are a coding agent. Use the provided tools to answer the user's questions."
	
		}
	];

	const terminalInput = (userInput: string) => new Promise<string>((resolve) => terminalChat.question(userInput, resolve));

	while (true) {
		const newInputFromUser = (await terminalInput("You: ")).trim();
		if (!newInputFromUser) continue;
		if (newInputFromUser === "exit()") break;
		await runAgentTurn(conversationHistory, newInputFromUser);
	}
	terminalChat.close();
}

main().catch((err) => {
	console.error(err);
	process.exit(1);
});

Finally, we set up our main() function, which is the entry point to the agent. Here, we initialize the conversationHistory with a system prompt directing the model on how to behave. This is a very minimal prompt; in a production setting your prompt would be more elaborate, with examples and more detailed guidance, along the lines of the sketch below. We set up a loop to retrieve user input from the terminal and pass it to runAgentTurn(). Finally, we call the main() function.
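For illustration, a more production-flavored prompt could look something like this; the exact wording is just an example:

// An illustrative, more detailed system prompt (example wording only).
const systemPrompt = `You are tiny-code, a careful coding agent.
- Gather context with read_file before editing anything.
- Prefer edit_file for small changes; never overwrite files blindly.
- Use run_command_with_exec only for short, safe commands.
- After finishing a task, summarize what you changed and why.`;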

And that’s our coding agent fully set up!

You can find the Python version of the inference loop below. The code explanations I provided for the TypeScript version also apply here; the only difference is syntax.

from openai import OpenAI
from dotenv import load_dotenv
from tools.tool_handlers import tools, TOOLS
import json
import os

load_dotenv()

client = OpenAI(
	api_key=os.environ["OPENAI_API_KEY"]
)

def run_agent_turn(conversation_history, user_input):
	conversation_history.append({"role": "user", "content": user_input})

	while True:
		response = client.responses.create(
			model="gpt-5.4",
			input=conversation_history,
			tools=tools
		)

		# response.output is a list of output items, not a single item, so we extend
		# conversation_history with +=. If we appended the list itself, the next API
		# call would receive a nested list and raise a validation error like
		# "expected an input item/object, got array/list".
		conversation_history += response.output

		tool_calls = [o for o in response.output if o.type == "function_call"]

		if len(tool_calls) == 0:
			print(f"\nAssistant: {response.output_text}")
			return

		for call in tool_calls:
			function_name = call.name
			tool_handler = TOOLS.get(function_name)
			args = call.arguments
			print(f"\ncalling {function_name} with {args}")

			if tool_handler is None:
				# the model hallucinated a tool we don't have
				tool_result = {"ok": False, "message": f"Unknown tool: {function_name}"}
			else:
				# the model gives function arguments as a JSON string; json.loads()
				# turns it into real values (dicts, lists, strings, numbers, booleans)
				# that we can unpack into the handler.
				try:
					tool_result = tool_handler(**json.loads(args))
				except Exception as e:
					tool_result = {"ok": False, "message": f"{type(e).__name__}: {e}"}

			conversation_history.append({
				"type": "function_call_output",
				"call_id": call.call_id,
				"output": json.dumps({
					"tool_result": tool_result
				})
			})

def main():
	print("Hi! I'm tiny, your minimal coding agent. Give me a task to do!")
	conversation_history = []
	system_prompt = '''
		You are an advanced coding agent.
		Use the provided tools to answer the user's questions accurately.
	'''

	conversation_history.append({
		"role": "developer",
		"content": system_prompt
	})

	while True:
		user_input = input("\nUser: ").strip()
		if not user_input:
			continue

		if user_input == "exit()":
			print("Goodbye! See you another time!")
			break
		run_agent_turn(conversation_history, user_input)

if __name__ == "__main__":
	main()

Using the agent

To use our agent, we will turn it into a CLI app, like Claude Code or Codex, that you can run in any folder from the command line. First, we need to add a shebang at the top of our agent.ts file, the main entrypoint to the app. I will only be implementing this part in TypeScript.

#!/usr/bin/env node

The shebang tells your computer’s operating system to run the code in this file using Node. It also makes the agent portable by using whatever Node version is available on your PATH, rather than looking for a specific version.

Next, we expose the project as a bin (binary) command and add scripts for building the CLI app.

"bin": {
	"tiny-code": "./dist/agent.ts"
},
"scripts": {
	"build": "tsc",
	"postbuild": "chmod +x dist/agent.js",
	"prepare": "npm run build",
	"start": "node dist/agent.js",
	"dev": "node --env-file=.env dist/agent.js"
}

Then from tiny-code’s root, run:

npm install
npm run build
npm link

Then, you can open a terminal, cd into any directory and do:

	tiny-code

Task: build a weather website

Time to test our agent! I gave it this prompt: “Build a weather website for London with a minimalistic black and white theme.” Here is the result below:

Pretty cool if you ask me!


You can find all the code used in this project in this repo: https://github.com/Chinenyay/tiny-code

Conclusion

In this tutorial, we built our own coding agent from scratch. We wrote file operation and command execution functions, defined their tool schemas and provided them to gpt-5.4 as tools to call. Creating the coding agent is just the beginning. There are a myriad of other considerations: security hardening (building your agent so you don’t get pwned), metrics and evaluation (measuring how well your agent is performing), and cloud sandboxing (running your agent in the cloud so you can access it from anywhere). I will be covering these topics in future tutorials, so if you liked this one, stay tuned!

If you are looking for a more beginner-friendly tutorial on how agents work generally, you can check out my previous tutorial: https://jcumoke.com/blog/how-to-build-an-ai-agent-typescript-openai-api/

Till another time,

Jennifer


References

https://developers.openai.com/api/docs/guides/function-calling

https://www.w3schools.com/nodeJs/nodejs_child_process.asp

https://sylviamoss.me/post/pseudo-terminals/

https://tldp.org/LDP/Bash-Beginners-Guide/html/

https://docs.python.org/3/library/subprocess.html

Get in touch

Have a question or want to connect? Reach out via email or find me on socials.