Understanding the MCP Host, a Little

What Is MCP?

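MCP is a protocol developed by Anthropic, originally for Claude. The name is short for Model Context Protocol, and it lets an LLM actively request actions or resources from the outside world. MCP is quite literally nothing more than a protocol for exchanging requests and responses, so the surrounding process and the actual execution are left to the developer.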

How It Works Internally

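Before explaining the internals, let's touch on Gemini Function Calling. Like MCP, Gemini Function Calling lets the LLM take the initiative in calling external actions. You may wonder why Function Calling is brought in at all. The reason is that Function Calling came out before MCP, and both describe tool parameters with a JSON Schema-style definition (Gemini uses a subset of the OpenAPI schema), so they are largely compatible and I guessed that they behave similarly. Since the Gemini Function Calling documentation is also comparatively more detailed, I thought it would be helpful here.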

Function Calling

The overall flow looks like this.

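1. Define the functions.
2. Send the function definitions to Gemini along with the prompt.
  1. "Send user prompt along with the function declaration(s) to the model. It analyzes the request and determines if a function call would be helpful. If so, it responds with a structured JSON object."
3. Gemini requests a function call if it needs one.
  1. When Gemini needs a function, the caller receives the function name and parameters for the call.
  2. The caller can decide whether or not to execute it:
    1. call the function and return the real value,
    2. skip the call and return data as if it had been called, or
    3. simply ignore the request.
4. During this process Gemini may request several function calls at once, or look at the result of one call and then request another.
5. Once a settled answer comes out, the flow ends.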
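The three choices the caller has in step 3 can be sketched in Go as follows. This is only an illustration under stated assumptions: `FunctionCall`, `Policy`, `Handle`, and the `get_weather` function are hypothetical names invented for this example and are not part of the Gemini SDK.

```go
package main

import "fmt"

// FunctionCall is a hypothetical stand-in for the structured request
// (function name plus arguments) that the model hands back to the caller.
type FunctionCall struct {
	Name string
	Args map[string]any
}

// Policy describes what the caller chooses to do with a requested call.
type Policy int

const (
	Execute Policy = iota // really run the function and return its value
	Stub                  // skip the function and return canned data
	Ignore                // drop the request entirely
)

// Handle applies the policy. The returned bool is false when nothing
// should be sent back to the model.
func Handle(call FunctionCall, p Policy, registry map[string]func(map[string]any) string) (string, bool) {
	switch p {
	case Execute:
		fn, ok := registry[call.Name]
		if !ok {
			return "unknown function: " + call.Name, true
		}
		return fn(call.Args), true
	case Stub:
		return "stubbed result for " + call.Name, true
	default:
		return "", false
	}
}

func main() {
	registry := map[string]func(map[string]any) string{
		"get_weather": func(args map[string]any) string {
			return fmt.Sprintf("sunny in %v", args["city"])
		},
	}
	call := FunctionCall{Name: "get_weather", Args: map[string]any{"city": "Seoul"}}
	for _, p := range []Policy{Execute, Stub, Ignore} {
		out, send := Handle(call, p, registry)
		fmt.Printf("policy=%d send=%v out=%q\n", p, send, out)
	}
}
```

The point of the sketch is that the model only ever produces a request; whether anything actually runs is entirely the caller's decision.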

This flow is broadly in line with how MCP works, and the MCP tutorial describes it in much the same way. ollama tools behave similarly as well.

Fortunately, these three tools (ollama tools, MCP, and Gemini Function Calling) share almost the same schema structure, so implementing MCP alone can be enough to serve all three.
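To make the shared structure concrete, here is a minimal Go sketch that reshapes an MCP tool definition into the name/description/parameters form used by function declarations, and then into the wrapper Ollama's chat API uses for tools. The struct and function names are assumptions made for this example, and the `read_query` tool is made up. Gemini accepts only a subset of schema keywords, so a real converter may need to strip unsupported fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MCPTool mirrors the fields an MCP server reports for each tool.
type MCPTool struct {
	Name        string          `json:"name"`
	Description string          `json:"description"`
	InputSchema json.RawMessage `json:"inputSchema"`
}

// FunctionDecl is the name/description/parameters triple used by
// function-calling style APIs.
type FunctionDecl struct {
	Name        string          `json:"name"`
	Description string          `json:"description"`
	Parameters  json.RawMessage `json:"parameters"`
}

// OllamaTool wraps a declaration in the {"type":"function", ...} envelope.
type OllamaTool struct {
	Type     string       `json:"type"`
	Function FunctionDecl `json:"function"`
}

// ToFunctionDecl copies the schema over unchanged; only the field name differs.
func ToFunctionDecl(t MCPTool) FunctionDecl {
	return FunctionDecl{Name: t.Name, Description: t.Description, Parameters: t.InputSchema}
}

// ToOllamaTool adds the envelope on top of the declaration.
func ToOllamaTool(t MCPTool) OllamaTool {
	return OllamaTool{Type: "function", Function: ToFunctionDecl(t)}
}

func main() {
	raw := `{
		"name": "read_query",
		"description": "Run a SELECT query",
		"inputSchema": {"type": "object", "properties": {"query": {"type": "string"}}, "required": ["query"]}
	}`
	var tool MCPTool
	if err := json.Unmarshal([]byte(raw), &tool); err != nil {
		panic(err)
	}
	out, err := json.MarshalIndent(ToOllamaTool(tool), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```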

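Oh, and there is one drawback they all share. In the end it is the model that triggers execution, so if the model you are using is in poor shape it can misbehave: it may not call a function at all, call it in odd ways, or effectively launch a DoS against the MCP server.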
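One way a host could defend against these failure modes is to check every requested call before running it. The sketch below is a hypothetical guard, not something taken from mcphost: `Guard`, `ToolSpec`, and `ErrBudget` are names invented here. It rejects unknown tools and missing required arguments, and caps the total number of calls.

```go
package main

import (
	"errors"
	"fmt"
)

// ToolSpec lists the argument names a tool requires (hypothetical shape).
type ToolSpec struct {
	Required []string
}

// ErrBudget is returned once the call budget is used up.
var ErrBudget = errors.New("tool call budget exhausted")

// Guard validates tool calls and limits how many may run.
type Guard struct {
	specs     map[string]ToolSpec
	remaining int
}

func NewGuard(specs map[string]ToolSpec, maxCalls int) *Guard {
	return &Guard{specs: specs, remaining: maxCalls}
}

// Check returns nil when the call may proceed and consumes one unit of budget.
func (g *Guard) Check(name string, args map[string]any) error {
	spec, ok := g.specs[name]
	if !ok {
		return fmt.Errorf("unknown tool %q", name)
	}
	for _, key := range spec.Required {
		if _, present := args[key]; !present {
			return fmt.Errorf("tool %q: missing required argument %q", name, key)
		}
	}
	if g.remaining <= 0 {
		return ErrBudget
	}
	g.remaining--
	return nil
}

func main() {
	g := NewGuard(map[string]ToolSpec{"read_query": {Required: []string{"query"}}}, 2)
	calls := []struct {
		name string
		args map[string]any
	}{
		{"read_query", map[string]any{"query": "SELECT 1"}},
		{"read_query", map[string]any{}},
		{"drop_everything", nil},
		{"read_query", map[string]any{"query": "SELECT 2"}},
		{"read_query", map[string]any{"query": "SELECT 3"}},
	}
	for _, c := range calls {
		fmt.Printf("%s -> %v\n", c.name, g.Check(c.name, c.args))
	}
}
```

Returning the error text to the model as a tool result, rather than silently dropping the call, gives the model a chance to correct itself.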

An MCP Host Written in Go

mark3labs' mcphost

In the Go ecosystem there is mcphost, which is being developed by an organization called mark3labs.

Using it is very simple.

go install github.com/mark3labs/mcphost@latest

After installing, create a $HOME/.mcp.json file with the following contents.

{
  "mcpServers": {
    "sqlite": {
      "command": "uvx",
      "args": [
        "mcp-server-sqlite",
        "--db-path",
        "/tmp/foo.db"
      ]
    },
    "filesystem": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "/tmp"
      ]
    }
  }
}
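For intuition about what a host does with this file, the following Go sketch parses the same structure and prints the command line it would spawn for each server. It is an assumption-laden illustration, not mcphost's actual loader: `Config`, `ServerConfig`, `ParseConfig`, and `CommandLine` are names chosen for this example, and only the `command` and `args` fields shown above are handled.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// ServerConfig describes how to launch one MCP server as a subprocess.
type ServerConfig struct {
	Command string   `json:"command"`
	Args    []string `json:"args"`
}

// Config maps a server name (e.g. "sqlite") to its launch settings.
type Config struct {
	MCPServers map[string]ServerConfig `json:"mcpServers"`
}

// ParseConfig decodes the JSON document into a Config.
func ParseConfig(data []byte) (Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return Config{}, fmt.Errorf("parse config: %w", err)
	}
	return cfg, nil
}

// CommandLine renders the command and its arguments as a single string.
func (s ServerConfig) CommandLine() string {
	return strings.Join(append([]string{s.Command}, s.Args...), " ")
}

func main() {
	data := []byte(`{
	  "mcpServers": {
	    "sqlite": {"command": "uvx", "args": ["mcp-server-sqlite", "--db-path", "/tmp/foo.db"]},
	    "filesystem": {"command": "npx", "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]}
	  }
	}`)
	cfg, err := ParseConfig(data)
	if err != nil {
		panic(err)
	}
	for _, name := range []string{"sqlite", "filesystem"} {
		fmt.Printf("%s: %s\n", name, cfg.MCPServers[name].CommandLine())
	}
}
```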

Then run it with an ollama model as shown below.
If needed, first fetch the model with ollama pull mistral-small.

claude or qwen2.5 are the usual recommendations, but for now I recommend mistral-small.

mcphost -m ollama:mistral-small

Run this way, however, it can only be used as a question-and-answer session in the CLI.
So let's modify the mcphost code so that it can be driven more programmatically.

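Forking mcphost

As we have already seen, mcphost includes the functionality to extract metadata through MCP and to call functions. What we need from it, then, are the part that calls the LLM, the part that handles MCP servers, and the part that manages the message history.

The Runner in the following package is what I pulled out for those parts.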

package runner

import (
	"context"
	"encoding/json"
	"fmt"
	"log"
	"strings"
	"time"

	mcpclient "github.com/mark3labs/mcp-go/client"
	"github.com/mark3labs/mcp-go/mcp"

	"github.com/mark3labs/mcphost/pkg/history"
	"github.com/mark3labs/mcphost/pkg/llm"
)

type Runner struct {
	provider   llm.Provider
	mcpClients map[string]*mcpclient.StdioMCPClient
	tools      []llm.Tool

	messages []history.HistoryMessage
}

I won't go through the internal declarations of these types here. They do pretty much what their names say.

func NewRunner(systemPrompt string, provider llm.Provider, mcpClients map[string]*mcpclient.StdioMCPClient, tools []llm.Tool) *Runner {
	return &Runner{
		provider:   provider,
		mcpClients: mcpClients,
		tools:      tools,
		messages: []history.HistoryMessage{
			{
				Role: "system",
				Content: []history.ContentBlock{{
					Type: "text",
					Text: systemPrompt,
				}},
			},
		},
	}
}

For the mcpClients and tools used here, please see the relevant file.
The provider will be the ollama one, so please see the relevant file for that as well.

The main course is the Run method.

func (r *Runner) Run(ctx context.Context, prompt string) (string, error) {
	// Record the user's prompt. Recursive calls pass "" and skip this step.
	if len(prompt) != 0 {
		r.messages = append(r.messages, history.HistoryMessage{
			Role: "user",
			Content: []history.ContentBlock{{
				Type: "text",
				Text: prompt,
			}},
		})
	}

	// The provider takes llm.Message values, so wrap each history entry.
	llmMessages := make([]llm.Message, len(r.messages))
	for i := range r.messages {
		llmMessages[i] = &r.messages[i]
	}

	const initialBackoff = 1 * time.Second
	const maxRetries int = 5
	const maxBackoff = 30 * time.Second

	var message llm.Message
	var err error
	backoff := initialBackoff
	retries := 0
	for {
		// Use the caller's ctx (not context.Background) so cancellation propagates.
		message, err = r.provider.CreateMessage(
			ctx,
			prompt,
			llmMessages,
			r.tools,
		)
		if err != nil {
			if strings.Contains(err.Error(), "overloaded_error") {
				if retries >= maxRetries {
					return "", fmt.Errorf(
						"claude is currently overloaded. please wait a few minutes and try again",
					)
				}

				// Exponential backoff, but stop waiting if ctx is cancelled.
				select {
				case <-time.After(backoff):
				case <-ctx.Done():
					return "", ctx.Err()
				}
				backoff *= 2
				if backoff > maxBackoff {
					backoff = maxBackoff
				}
				retries++
				continue
			}

			return "", err
		}

		break
	}

	messageContent := []history.ContentBlock{}
	var toolResults []history.ContentBlock

	// Keep any plain-text part of the model's reply.
	if message.GetContent() != "" {
		messageContent = append(messageContent, history.ContentBlock{
			Type: "text",
			Text: message.GetContent(),
		})
	}

	// Execute every tool call the model requested.
	for _, toolCall := range message.GetToolCalls() {
		input, err := json.Marshal(toolCall.GetArguments())
		if err != nil {
			log.Printf("Error encoding arguments for %s: %v", toolCall.GetName(), err)
			continue
		}
		messageContent = append(messageContent, history.ContentBlock{
			Type:  "tool_use",
			ID:    toolCall.GetID(),
			Name:  toolCall.GetName(),
			Input: input,
		})

		// Tool names are namespaced as "<server>__<tool>". Cutting at the first
		// separator avoids an index panic on names without "__".
		// NOTE: a call skipped below leaves its tool_use without a tool_result.
		serverName, toolName, found := strings.Cut(toolCall.GetName(), "__")
		if !found {
			log.Printf("Skipping tool call with unexpected name %q", toolCall.GetName())
			continue
		}
		mcpClient, ok := r.mcpClients[serverName]
		if !ok {
			continue
		}

		var toolArgs map[string]interface{}
		if err := json.Unmarshal(input, &toolArgs); err != nil {
			continue
		}

		var toolResultPtr *mcp.CallToolResult
		req := mcp.CallToolRequest{}
		req.Params.Name = toolName
		req.Params.Arguments = toolArgs
		toolResultPtr, err = mcpClient.CallTool(
			ctx,
			req,
		)

		if err != nil {
			// Report the failure back to the model as the tool's result.
			errMsg := fmt.Sprintf(
				"Error calling tool %s: %v",
				toolName,
				err,
			)
			log.Print(errMsg)

			toolResults = append(toolResults, history.ContentBlock{
				Type:      "tool_result",
				ToolUseID: toolCall.GetID(),
				Content: []history.ContentBlock{{
					Type: "text",
					Text: errMsg,
				}},
			})

			continue
		}

		toolResult := *toolResultPtr

		if toolResult.Content != nil {
			resultBlock := history.ContentBlock{
				Type:      "tool_result",
				ToolUseID: toolCall.GetID(),
				Content:   toolResult.Content,
			}

			// Flatten the text parts of the result into a single string.
			var resultText string
			for _, item := range toolResult.Content {
				if contentMap, ok := item.(map[string]interface{}); ok {
					if text, ok := contentMap["text"]; ok {
						resultText += fmt.Sprintf("%v ", text)
					}
				}
			}

			resultBlock.Text = strings.TrimSpace(resultText)

			toolResults = append(toolResults, resultBlock)
		}
	}

	// Store the assistant's turn (text plus tool_use blocks) in the history.
	r.messages = append(r.messages, history.HistoryMessage{
		Role:    message.GetRole(),
		Content: messageContent,
	})

	// If any tools ran, feed their results back and ask the model again.
	if len(toolResults) > 0 {
		r.messages = append(r.messages, history.HistoryMessage{
			Role:    "user",
			Content: toolResults,
		})

		return r.Run(ctx, "")
	}

	return message.GetContent(), nil
}

The code itself was pieced together from parts of the relevant file.

Roughly, it does the following.

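1. Send the tool list along with the prompt, asking the LLM either to request a tool execution or to generate a response.
2. If a response is generated, stop recursing and return it.
3. If the LLM leaves a tool execution request, the host calls the MCP server.
4. Append the result to the history and go back to step 1.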
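The same four steps can also be written as a plain loop instead of recursion. The sketch below uses a scripted stand-in for the LLM so it runs on its own. `Message`, `Reply`, `Model`, `Tool`, and `RunLoop` are hypothetical names for this illustration and are not mcphost types. The `maxSteps` cap is an addition that keeps a misbehaving model from looping forever.

```go
package main

import "fmt"

// Message is one entry in the conversation history.
type Message struct {
	Role string
	Text string
}

// Reply is what the model returns: either final text or a tool request.
type Reply struct {
	Text     string
	ToolName string // empty when the model answered in text
}

// Model is a stand-in for the LLM call; Tool is a stand-in for an MCP tool.
type Model func(history []Message) Reply
type Tool func() string

// RunLoop repeats: ask the model, run the requested tool, append the result.
func RunLoop(model Model, tools map[string]Tool, prompt string, maxSteps int) (string, error) {
	history := []Message{{Role: "user", Text: prompt}}
	for step := 0; step < maxSteps; step++ {
		reply := model(history) // step 1: send the history (and tool list)
		if reply.ToolName == "" {
			return reply.Text, nil // step 2: final answer, stop
		}
		history = append(history, Message{Role: "assistant", Text: "tool_use: " + reply.ToolName})
		result := "unknown tool: " + reply.ToolName
		if tool, ok := tools[reply.ToolName]; ok {
			result = tool() // step 3: the host runs the tool
		}
		// step 4: feed the result back and loop
		history = append(history, Message{Role: "user", Text: "tool_result: " + result})
	}
	return "", fmt.Errorf("no final answer after %d steps", maxSteps)
}

func main() {
	// Scripted model: request the clock tool once, then answer with its result.
	model := func(h []Message) Reply {
		if len(h) == 1 {
			return Reply{ToolName: "clock"}
		}
		return Reply{Text: "Got " + h[len(h)-1].Text}
	}
	tools := map[string]Tool{"clock": func() string { return "12:00" }}
	answer, err := RunLoop(model, tools, "What time is it?", 5)
	fmt.Println(answer, err)
}
```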

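In Closing

Done already?

To be honest, there isn't that much more to say. This post was written to give you a rough idea of how an MCP host drives MCP servers. I hope it helped you understand the behavior of an MCP host, even if only a little.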