
MCP vs API: What's Actually Different and When It Matters

Julia Mase · 12 min read

The question I get most often about MCP is some version of: "Why would I use this instead of just calling the API directly?" It is a good question, and the honest answer is not "because MCP is newer and shinier." The honest answer is that MCP and a raw API solve different problems, and most confusion about which to use comes from treating them as interchangeable.

This post is the short version of a mental model I have been refining for six months. If you are a developer deciding whether to build an MCP server or just wire your AI application to an API, this should save you a few hours of second-guessing.

# The one-sentence difference

A raw API is for when your code calls the service. MCP is for when an AI agent needs to decide whether to call the service.

That is the whole thing. Every other difference is a consequence of that one distinction. If you write code that deterministically fetches data from a service every morning at 8 AM, you want a direct API call. If you write an AI agent that might need to look up a customer record mid-conversation depending on what the user asks, you want MCP.

# How an API works

A raw API is a set of endpoints exposed by a service. You write code that knows ahead of time which endpoint to hit, what parameters to send, and what to do with the response. The contract is fixed at the time you write the code.

direct-api.py
import os

import requests

API_KEY = os.environ["LINEAR_API_KEY"]

# GraphQL requests are sent via POST; the query is fixed at the time you write the code.
response = requests.post(
    "https://api.linear.app/graphql",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"query": "{ issues { nodes { id title } } }"},
)
response.raise_for_status()
for issue in response.json()["data"]["issues"]["nodes"]:
    print(issue["title"])

The developer decided when to call, what to call, and how to handle the result. The AI is not involved in any of those decisions. This is the right shape for cron jobs, webhook handlers, scheduled syncs, and basically anything where the workflow is predictable.

# How MCP works

An MCP server exposes the same kind of functionality, but it does it through a protocol that an AI client can discover at session start and then decide when to use.

mcp-server.py
from mcp.server import Server
from mcp.types import Tool, TextContent
 
app = Server("linear-mcp")
 
@app.list_tools()
async def list_tools() -> list[Tool]:
    return [
        Tool(
            name="list_issues",
            description="List open Linear issues. Use when the user asks about tickets or bugs.",
            inputSchema={
                "type": "object",
                "properties": {
                    "status": {"type": "string", "enum": ["open", "closed", "all"]}
                }
            }
        )
    ]
 
@app.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    if name == "list_issues":
        # ... actually call Linear API here ...
        return [TextContent(type="text", text="Fetched 47 open issues")]
    raise ValueError(f"Unknown tool: {name}")

The AI client connects to this server, asks "what tools do you have?", and gets back a list with descriptions. Later, when a user says "what bugs are open in Linear?", the AI model decides to call list_issues because the tool description matches the request. The server then actually makes the underlying API call to Linear.

Notice the crucial difference: the AI chose when and how to call Linear, based on the tool description. The same underlying Linear API is involved in both cases, but the division of labor is different.
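On the wire, this exchange is ordinary JSON-RPC 2.0. A sketch of the two message shapes involved — method names and the `name`/`arguments` params follow the MCP specification; the `id` values are arbitrary:

```python
import json

# At session start, the client discovers what the server offers.
list_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list",
}

# Later, when the model decides the user's question matches list_issues,
# the client sends a tools/call request with the arguments the model chose.
call_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "list_issues",
        "arguments": {"status": "open"},
    },
}

print(json.dumps(call_request, indent=2))
```

The important detail is that nothing in `call_request` was hard-coded by the application developer: the tool name and arguments come from the model's decision at runtime.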

# Why this distinction matters

Every time I see a team struggle with MCP versus API, it is because they did not think clearly about who is deciding what.

If you are building a chatbot that needs to answer arbitrary user questions and sometimes needs to look things up in Linear, GitHub, or Notion, you want MCP. The whole point is that you do not know ahead of time which service will be relevant. The AI decides based on the conversation, and MCP gives the AI a menu to choose from.

If you are building a nightly report that emails engineering metrics to your team lead, you want a direct API. The workflow is fixed. The AI is not in the loop. Adding MCP to this is pure overhead.

Most real systems end up with both. A nightly cron job that fetches data from several APIs directly. An AI assistant sitting next to it that uses MCP to let users ask questions about the fetched data in natural language. The two patterns do not compete; they complement each other.

# The decision flowchart

MCP vs API decision guide
| Feature | Use MCP | Use direct API |
| --- | --- | --- |
| Who decides when to call? | AI at runtime | Your code, predetermined |
| Is the workflow deterministic? | No, depends on user input | Yes, same every time |
| Are you building an agent? | Yes | No, a pipeline |
| Do you need to react to natural language? | Yes | No |
| Does the set of possible actions vary by session? | Yes | No, fixed |
| Is latency critical? | Moderate | Important |
| Is discovery at session start acceptable? | Yes | Not applicable |
| Would a junior dev understand the shape? | Maybe | Yes |
If most rows in a column match your situation, that is your answer.
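The table really collapses to a single question, which you could encode as a toy helper (purely illustrative, not a real library):

```python
def mcp_or_api(ai_decides_at_runtime: bool) -> str:
    """Toy encoding of the decision guide: the only question that really
    matters is who decides when the service gets called."""
    return "MCP" if ai_decides_at_runtime else "direct API"

print(mcp_or_api(True))   # chatbot answering arbitrary user questions
print(mcp_or_api(False))  # nightly report on a fixed schedule
```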

# Three cases where most teams get it wrong

I have seen the same three mistakes enough times to generalize them.

Mistake one: building MCP for non-agentic workflows. I once watched a team build an MCP server so that their deployment script could "talk to Slack." But the deployment script was not an agent. It was a shell script that ran on a fixed trigger and always did the same thing. The MCP server added four layers of indirection to a workflow that could have been a single curl call. The right answer was a webhook, not an MCP server.

The tell: you are building an MCP server but there is no AI model in the loop making decisions. If a human or a script always decides when to call the server, you do not need MCP, you need a function.
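The fix in that story is a few lines, not a server. A sketch, assuming a standard Slack incoming webhook — the URL, function names, and message format here are made up for illustration:

```python
import requests

# Hypothetical webhook URL; a real one comes from Slack's app configuration.
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/T000/B000/XXXX"

def deploy_message(service: str, version: str) -> dict:
    # Slack incoming webhooks accept a JSON body with a "text" field.
    return {"text": f"Deployed {service} {version}"}

def notify(payload: dict) -> None:
    # Deterministic trigger, deterministic payload: no AI in the loop.
    requests.post(SLACK_WEBHOOK_URL, json=payload, timeout=10)
```

Called from the end of the deploy script, this does everything the MCP server did, with zero extra processes to run.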

Mistake two: treating MCP as a drop-in replacement for internal APIs. Another team I know tried to migrate their internal microservices to MCP servers because "everything should be MCP now." Their microservices were called by other microservices on a predictable schedule. The migration added latency, complicated deployment, and did not give them any new capabilities. They rolled it back after a month.

The tell: the services being called are not called by AI. They are called by other services in a fixed architecture. MCP is the wrong layer to sit between two pieces of deterministic code.

Mistake three: skipping MCP when you do actually have an AI deciding what to call. The opposite mistake is also common. A team builds an AI assistant that has custom logic for calling three different APIs, hard-codes when each one gets called, and ends up with a brittle if-else tree the AI cannot reason about. Then they find out that every new API they add means more custom code, and the AI cannot use new tools without a code deploy.

The tell: you have an AI in the loop but you are making decisions on its behalf instead of letting it choose. Every hard-coded "if user asks about X, call Y" in your codebase is a place where an MCP server description would have let the AI decide dynamically.
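The brittle shape looks something like this (service names illustrative). Every new integration means another branch and another deploy, and the model never gets to reason about any of it:

```python
def route(user_message: str) -> str:
    # Anti-pattern: the developer guesses the user's intent with string
    # checks, pre-empting the model's own decision-making. Note how fragile
    # the matching is; substring checks like this misfire constantly.
    msg = user_message.lower()
    if "bug" in msg or "ticket" in msg:
        return "call_linear"
    if "pull request" in msg:
        return "call_github"
    if "doc" in msg:
        return "call_notion"
    return "no_tool"
```

With MCP, this entire function disappears: each server publishes tool descriptions, and the model itself matches the user's request against them.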

# What about latency and overhead?

MCP adds a small amount of latency: the client-server handshake, tool discovery at session start, and JSON-RPC message framing. In practice this is negligible for any operation that takes more than a few milliseconds. If your workflow cares about microsecond-level latency, MCP is the wrong layer, and honestly an AI workflow is probably also the wrong layer.

The more real overhead is operational. Running an MCP server means you have one more process to deploy, monitor, and keep updated. For a small team, this is a real cost. The tradeoff is worth it if you have multiple AI clients (Claude Code, Cursor, ChatGPT, internal tools) that all need to integrate with the same services. Build the MCP server once, plug it into every client, done.

# When to build your own MCP server

Short version: only when there is no existing server that does what you need.

As of April 2026, the official MCP server directory and PulseMCP list more than 10,000 servers. The common ones (filesystem, GitHub, Slack, Notion, Linear, Stripe, SQLite, Postgres) already exist and are maintained by either Anthropic's reference team or the vendors themselves. Building your own for something that already has an official server is usually wasted effort.

When you should build your own:

  • You have an internal API that does not exist in the public MCP ecosystem
  • You need tighter security or scoping than the public server offers
  • The public server is slow, broken, or unmaintained
  • You need custom resources or prompts that the public server does not expose

If none of those apply, install the existing server and move on.
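For reference, plugging an existing server into a client is usually a few lines of config. A sketch in the shape Claude Desktop uses (`claude_desktop_config.json`) — the exact command, package name, and environment variable are assumptions; check the README of the server you install:

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_..." }
    }
  }
}
```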

# Does MCP replace APIs?

No, and this question comes up enough that I want to end on it specifically. MCP does not replace APIs. MCP sits on top of APIs. Every MCP server I have mentioned in this post calls an underlying API under the hood. The MCP layer is an adapter that makes the API callable by an AI model, not a replacement for the API itself.

The right mental model: MCP is how AI agents talk to APIs. APIs are how services talk to each other. These are different layers, they do not compete, and every serious AI system uses both.


# Frequently asked questions

What is the difference between MCP and an API?
A raw API is called directly by your code on a predetermined schedule or in response to fixed events. MCP is a protocol that lets an AI model discover available tools at session start and decide at runtime which ones to call based on user input. MCP is for agentic workflows; APIs are for deterministic ones. MCP usually sits on top of an API under the hood.
Does MCP replace REST APIs?
No. MCP does not replace APIs. MCP is an adapter layer that lets AI models call APIs dynamically. Every MCP server calls an underlying API or service under the hood. If you build an MCP server for a REST API, you still need the REST API to exist. MCP is how AI agents reach APIs, not a substitute for them.
When should I use MCP instead of calling an API directly?
Use MCP when an AI model needs to decide whether to call a service based on dynamic user input. Use a direct API when your code deterministically decides when to call. If you can write the full decision logic as a fixed flowchart without any 'the AI decides' nodes, a direct API is the right answer. If the flowchart has decision points driven by conversation, MCP is the right answer.
Can I use both MCP and direct APIs in the same system?
Yes, and most real systems do. A typical pattern is a nightly cron job that syncs data from several APIs directly, plus an AI assistant sitting on top that uses MCP to let users ask questions about the synced data in natural language. The two patterns complement each other; they do not compete.
Is MCP slower than calling an API directly?
Slightly. MCP adds the client-server handshake, tool discovery at session start, and JSON-RPC message overhead. In practice this is negligible for anything longer than millisecond-level work. The more meaningful overhead is operational, because you have one more process to run and maintain. For AI workflows where the AI decides at runtime, the extra latency is almost always worth the benefit.
Do I need to build my own MCP server?
Usually not. As of April 2026, there are more than 10,000 MCP servers in the public ecosystem including official ones for GitHub, Slack, Notion, Linear, Stripe, SQLite, and Postgres. Build your own only when the public ecosystem does not cover your internal API, or when you need tighter scoping than a public server offers.
