You can build a ServiceNow agent in an afternoon. Connect an LLM to the Table API, give it a tool for “query records” and a tool for “update records,” tell it the table names it cares about. Run a workflow. It works.
Run it twice. It still works.
Run it on a real instance with custom scopes, scoped application boundaries, ACL evaluation chains, business rule order tags, and dictionary inheritance — and the wheels come off in five different ways before lunch. The agent confidently calls PUT /api/now/table/sys_user_group to add a member to a group, and the call succeeds against the API but silently fails against the ACL evaluator because the scope of the operation isn’t the same as the scope of the authenticated user. The change appears to land. It hasn’t.
Generic LLM-with-tools is a great prototype. It’s not a ServiceNow agent. The gap between the two is what “platform-aware” means, and it’s most of the engineering work we’ve done.
What ServiceNow actually is
ServiceNow has a reputation as “a CRM with extra tables.” That reputation is the problem. It’s a platform with:
- A scope model where the same table name (incident) refers to different objects depending on the calling application
- An ACL evaluator that runs read/write/create/delete checks per row, per field, in a defined chain that includes role checks, conditions, and scripts, and that chain is order-sensitive
- A sys_dictionary metadata layer where every field has type, default, choice list, calculated value, and inheritance from a parent table, and the inheritance is multi-level
- A business rule system with order fields that determine execution sequence, async vs sync flags, and “current” / “previous” object semantics that change based on whether the BR is on insert, update, or query
- Update sets that capture some changes but not others (data is not captured; certain config tables are not captured; cross-scope changes aren’t captured by default)
- Choice lists where the same field on the same table has different values in different scopes (a state choice on task looks different in the ITSM scope vs a custom scope that extends task)
- A schema where extending a table inherits ACLs, dictionary entries, and business rules, but only some of them, and only if the parent isn’t in a different scope
You cannot reason about a ServiceNow operation without modeling this stuff. A REST call that “should work” against the Table API can fail at the ACL chain, succeed at the ACL chain but bypass a business rule order, succeed at the BR chain but trigger an async BR that fires after your transaction commits, or land cleanly in production while silently failing in sys_audit. We’ve seen all four.
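To make the ordering concern concrete, here is a minimal sketch of how a planner might lay out the business rules that a single write would trigger. The BusinessRule record, its field names, and the sample rules are hypothetical simplifications of what sys_script holds. The sketch ignores rules inherited from parent tables and is not the platform's actual evaluation engine.

```python
from dataclasses import dataclass

# Hypothetical, simplified view of a business rule row (assumption, not the real schema).
@dataclass(frozen=True)
class BusinessRule:
    name: str
    table: str
    when: str              # "before", "after", or "async"
    order: int             # lower values run first within the same phase
    operations: frozenset  # e.g. frozenset({"insert", "update"})

PHASES = ("before", "after", "async")

def execution_plan(rules, table, operation):
    """Group the rules that apply to this write by phase, sorted by order."""
    plan = {phase: [] for phase in PHASES}
    for rule in rules:
        applies = rule.table == table and operation in rule.operations
        if applies and rule.when in plan:
            plan[rule.when].append(rule)
    for phase in PHASES:
        plan[phase].sort(key=lambda r: r.order)
    return plan

# Illustrative rules only; the names and order values are made up.
rules = [
    BusinessRule("notify owner", "incident", "async", 100, frozenset({"update"})),
    BusinessRule("set defaults", "incident", "before", 200, frozenset({"insert", "update"})),
    BusinessRule("validate state", "incident", "before", 50, frozenset({"update"})),
    BusinessRule("audit insert", "incident", "after", 10, frozenset({"insert"})),
]

plan = execution_plan(rules, "incident", "update")
before_names = [r.name for r in plan["before"]]
async_names = [r.name for r in plan["async"]]
```

Anything in the async bucket runs after the transaction commits, which is why a planner would want to see it before the write rather than discover it afterwards.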
What “platform-aware” means
A platform-aware agent doesn’t think in terms of “I’ll call PUT and see what happens.” It thinks in terms of:
- What is the target’s scope, and is my caller in that scope? If not, what cross-scope access rules apply? Is this field marked accessible from other scopes in sys_dictionary?
- What’s the ACL chain for this table+field+operation? If a role check fails, will a script ACL still grant access? If both fail, will a write actually be silently dropped or will the call return an error?
- What business rules will fire on this write? Sync or async? In what order? Will the order tag cause a downstream BR to see stale current data?
- Will this be captured by an update set? If not, can I capture it differently (data load, scripted XML, scoped artifact)?
- What’s the dictionary chain for this field? If I’m updating task.state, am I getting the choice list for task or for the child table the record actually belongs to?
These aren’t questions you can answer at runtime by reading the response. The response is fine. The state is wrong. By the time you find out, you’ve shipped the change to thirty users.
Serac answers these questions ahead of time by maintaining a pre-indexed map of the instance — a snapshot of sys_dictionary, sys_security_acl, sys_script (business rules), sys_choice, sys_dictionary_override, sys_scope, and the relationships between them. We call this the Instance Graph. It’s part of the platform-aware tier, not the core CLI, but every tool that does writes consults it.
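The dictionary-chain question can be sketched as a walk up the table hierarchy against an in-memory snapshot. PARENTS and CHOICES below are hypothetical stand-ins for snapshot data, with made-up values. This is an illustration of the lookup, not the Instance Graph's actual structure.

```python
# Hypothetical snapshot fragments (assumptions for illustration only).
# PARENTS maps each table to the table it extends; CHOICES maps
# (table, field) to the choice values defined at that level.
PARENTS = {"incident_extended": "incident", "incident": "task", "task": None}
CHOICES = {
    ("task", "state"): ["Open", "Work in Progress", "Closed"],
    ("incident", "state"): ["New", "In Progress", "On Hold", "Resolved", "Closed"],
}

def resolve_choices(table, field):
    """Return (defining_table, choices) from the nearest table in the chain."""
    current = table
    while current is not None:
        if (current, field) in CHOICES:
            return current, CHOICES[(current, field)]
        current = PARENTS.get(current)
    return None, []

# A record on incident_extended picks up the incident-level list, not task's.
defining_table, choices = resolve_choices("incident_extended", "state")
```

The design point is that the answer depends on the record's actual class, so the lookup has to start from the child table and stop at the first level that defines the field's choices.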
A concrete example
A workflow we benchmarked in late 2025: “Resolve all P5 incidents older than 90 days, set resolution code to ‘Auto-Closed - Stale,’ notify assignment groups.”
A naive agent (a generic LLM with a Table API tool and a Notification tool) does this in roughly 800 API calls:
- Query incidents (1 call)
- For each result (say 200 records): GET to load full record (200 calls) — to figure out what fields exist on this specific record’s class, because it might be a child of incident
- For each: PUT to update (200 calls)
- For each: query assignment group (200 calls)
- For each: insert notification record (200 calls)
That’s 801 API calls. It will work on a test instance with 5 records. On 200, it’ll hit rate limits, time out, and leave the system half-updated.
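The total follows directly from the per-record steps listed above. As a quick check of the arithmetic:

```python
records = 200
initial_queries = 1    # the single incident query
per_record_steps = 4   # GET, PUT, assignment group lookup, notification insert

naive_calls = initial_queries + per_record_steps * records  # 801
```

The count grows linearly with the result set, so the same workflow against 2,000 stale incidents would need 8,001 calls.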
A platform-aware agent does this in four steps, only three of which touch the API:
- Single GlideRecord-style query against incident with the filter, returning only the fields needed (1 call against /api/now/table/incident with sysparm_query and sysparm_fields)
- Read the Instance Graph for incident to know that resolution code is on task, not incident, and the choice list Auto-Closed - Stale exists at the task level, so the update payload is valid against the parent class
- Single batched update via /api/now/v1/table/incident with the operation set: one transaction, ACL chain validated against the caller’s scope ahead of time
- Single notification dispatch via the Notification API, with assignment groups joined from the prior query result
Four steps, three of them HTTP requests. The Instance Graph made the naive agent’s per-record lookups unnecessary because the agent already knew the dictionary inheritance, the choice list scope, and the field-level ACLs before making any HTTP request.
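The first and last steps can be sketched without touching the network. The code below builds the single Table API URL and then groups the returned rows by assignment group for one notification dispatch. The encoded query string, the chosen field list, the example instance URL, and the flattened record shape are all assumptions for illustration.

```python
from collections import defaultdict
from urllib.parse import urlencode

def build_incident_query(instance_url):
    """Build one Table API URL that filters and trims fields server-side."""
    params = {
        # Assumed encoding of "P5 incidents older than 90 days".
        "sysparm_query": "active=true^priority=5^opened_at<javascript:gs.daysAgoStart(90)",
        # Only the fields the later steps need.
        "sysparm_fields": "sys_id,number,assignment_group",
    }
    return f"{instance_url}/api/now/table/incident?{urlencode(params)}"

def group_by_assignment_group(records):
    """Join assignment groups from the query result; no extra lookups."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["assignment_group"]].append(rec["number"])
    return dict(groups)

url = build_incident_query("https://example.service-now.com")  # hypothetical instance

# Simplified rows: assignment_group is shown as a plain identifier.
sample = [
    {"sys_id": "a1", "number": "INC0001", "assignment_group": "grp_network"},
    {"sys_id": "a2", "number": "INC0002", "assignment_group": "grp_desktop"},
    {"sys_id": "a3", "number": "INC0003", "assignment_group": "grp_network"},
]
by_group = group_by_assignment_group(sample)
```

Because the assignment group comes back in the same response as the incident, the 200 per-record group queries from the naive plan collapse into a dictionary build.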
That’s the value of pre-indexing. It’s not faster because we wrote a faster client. It’s faster because we don’t need to ask ServiceNow the same metadata questions thirty times.
What the Instance Graph contains
Approximate sizes for a mid-complexity customer instance:
- sys_dictionary snapshot: ~85,000 rows (every field on every table, with inheritance flags and override pointers)
- ACL chain map: ~12,000 entries, with role IDs, conditions, and script ACL references resolved
- Business rule index: ~3,000 BRs, sorted by table + operation + order tag
- Choice list index: ~7,500 distinct choice sets, with per-scope overrides
- Scope graph: ~150 scopes for a customer with a custom application suite, each with cross-scope access flags
- Dictionary override map: ~2,800 overrides (field-level customizations per child table)
This snapshot is cached locally to the Serac instance, refreshed on a schedule (default: 6 hours), and invalidated immediately when the agent writes to any of the source tables.
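The refresh policy described here (a time-based refresh plus immediate invalidation on writes to source tables) could look roughly like the following. SnapshotState and its method names are hypothetical. The table set is taken from the list of snapshotted tables earlier in this post.

```python
import time

# Tables the snapshot is built from, per the description above.
SOURCE_TABLES = {
    "sys_dictionary", "sys_security_acl", "sys_script",
    "sys_choice", "sys_dictionary_override", "sys_scope",
}
REFRESH_SECONDS = 6 * 60 * 60  # default schedule: 6 hours

class SnapshotState:
    """Tracks whether a cached metadata snapshot can still be trusted."""

    def __init__(self, now=None):
        self.loaded_at = time.time() if now is None else now
        self.invalidated = False

    def record_write(self, table):
        # A write to any source table makes the snapshot stale immediately.
        if table in SOURCE_TABLES:
            self.invalidated = True

    def needs_refresh(self, now=None):
        now = time.time() if now is None else now
        return self.invalidated or (now - self.loaded_at) >= REFRESH_SECONDS

state = SnapshotState(now=0)
state.record_write("incident")          # ordinary data write: snapshot still valid
fresh_after_data_write = not state.needs_refresh(now=60)
state.record_write("sys_choice")        # metadata write: invalidate at once
stale_after_metadata_write = state.needs_refresh(now=60)
```

Passing the clock in as a parameter keeps the policy easy to test without waiting six hours.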
The point of caching it: the agent’s planning step needs to know “can I write to u_my_field on incident_extended in scope x_acme_custom?” and that’s not a single ServiceNow query. It’s a join across five system tables. Asking the instance every time the planner thinks about a write is unworkable; the planning loop slows to a crawl. Pre-indexing makes it a memory lookup.
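As a rough illustration of turning that join into a memory lookup, the sketch below precomputes a write-permission index keyed by scope, table, and field. The input shapes, the two-condition rule (role check plus scope check), and the role name x_acme_custom.editor are simplifying assumptions. A real ACL chain also involves conditions and scripts.

```python
def build_write_index(fields, write_acl_roles, scopes, caller_roles):
    """Precompute, per (scope, table, field), whether the caller may write."""
    index = {}
    for (table, field), meta in fields.items():
        required = write_acl_roles.get((table, field), set())
        role_ok = required <= caller_roles  # caller holds every required role
        for scope in scopes:
            scope_ok = scope == meta["scope"] or meta["cross_scope_write"]
            index[(scope, table, field)] = role_ok and scope_ok
    return index

# Hypothetical flattened metadata, as if already joined from the snapshot.
fields = {
    ("incident_extended", "u_my_field"): {"scope": "x_acme_custom", "cross_scope_write": False},
    ("incident", "short_description"): {"scope": "global", "cross_scope_write": True},
}
write_acl_roles = {("incident_extended", "u_my_field"): {"x_acme_custom.editor"}}

index = build_write_index(
    fields,
    write_acl_roles,
    scopes=["global", "x_acme_custom"],
    caller_roles={"itil", "x_acme_custom.editor"},
)

# The planner's question becomes a single dictionary lookup.
can_write = index[("x_acme_custom", "incident_extended", "u_my_field")]
```

The expensive work happens once at snapshot time, so the planner can ask the question as often as it likes.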
What you can’t shortcut
A few things we tried and gave up on:
- “Let the LLM read sys_dictionary at runtime.” Doesn’t work. The LLM doesn’t know what to ask for. It picks the first row that matches a partial field name and confidently builds a plan against the wrong choice list.
- “Embed the dictionary as a vector store.” Works for “find related fields.” Doesn’t work for “is this write valid in this scope.” Vector similarity is the wrong tool for a relational graph traversal.
- “Just have the agent test the write on a staging row first.” Doesn’t catch async BR side effects, and “staging rows” don’t exist for many tables; sys_user_group doesn’t have a non-production version of itself.
- “Use a vendor-provided ServiceNow MCP server.” The off-the-shelf MCP servers we evaluated treat ServiceNow as a generic REST surface. None of them model the ACL chain or BR order. Most don’t even know about scopes. They’re fine for read-only chat and unsafe for anything that writes.
The thing that works is the boring thing: snapshot the metadata, keep it fresh, consult it in the planner, and design tools that fail fast when the snapshot says no. Not novel research. Just a lot of careful integration code.
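A fail-fast write tool in this spirit might look like the sketch below. The function guarded_update, the WriteRefused exception, and the index shape are hypothetical, and the send callable stands in for whatever actually issues the HTTP request. A field missing from the snapshot is treated the same as a denied field.

```python
class WriteRefused(Exception):
    """Raised before any request is sent when the snapshot says no."""

def guarded_update(index, scope, table, payload, send):
    """Check every field against the snapshot, then delegate to send()."""
    blocked = sorted(
        field for field in payload
        if not index.get((scope, table, field), False)  # unknown means refuse
    )
    if blocked:
        raise WriteRefused(f"snapshot denies or cannot model {table}: {blocked}")
    return send(table, payload)

# Usage with a fake sender that records what would have gone over the wire.
sent = []

def fake_send(table, payload):
    sent.append((table, payload))
    return "ok"

index = {("x_acme_custom", "incident_extended", "u_my_field"): True}

result = guarded_update(
    index, "x_acme_custom", "incident_extended", {"u_my_field": "a"}, fake_send
)

try:
    guarded_update(
        index, "x_acme_custom", "incident_extended", {"u_other": "b"}, fake_send
    )
    refused = False
except WriteRefused:
    refused = True
```

Refusing on unknown fields is the conservative choice: a stale or incomplete snapshot produces a visible error instead of a write that appears to land.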
What’s still hard
Legacy scripted scopes are the limit of the Instance Graph. If your platform has a scope built in 2014 by a developer who used naming conventions only they remember, and the scope’s behavior is half configuration and half scripted ACL — no agent fixes that. You read the code. Serac will surface the relevant ACL scripts in the planning step and refuse to write through a path it can’t model, but it can’t replace the developer who knows why the script exists.
The other limit is custom apps with bespoke API endpoints (/api/x_acme_custom/...). Those aren’t in the Instance Graph because they’re not declarative — they’re scripted REST APIs. Serac treats them as opaque write surfaces, requires explicit user approval per call, and won’t auto-batch them. If you want richer support, you ship a custom tool definition that describes the API contract. That’s hand-written, not generated.
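One way to picture a hand-written tool definition with per-call approval is the sketch below. CustomToolDefinition, call_opaque_endpoint, and the approve and send callables are hypothetical names for illustration, not Serac's actual tool format. The path is left elided exactly as in the text above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CustomToolDefinition:
    """Hand-written contract for one scripted REST endpoint (illustrative)."""
    name: str
    method: str
    path: str
    required_fields: tuple

def call_opaque_endpoint(tool, body, approve, send):
    """Validate the body against the contract, then require explicit approval."""
    missing = [f for f in tool.required_fields if f not in body]
    if missing:
        raise ValueError(f"{tool.name}: missing required fields {missing}")
    if not approve(tool, body):  # asked on every call, never cached
        raise PermissionError(f"{tool.name}: user declined this call")
    return send(tool.method, tool.path, body)

tool = CustomToolDefinition(
    name="acme_custom_write",       # hypothetical tool name
    method="POST",
    path="/api/x_acme_custom/...",  # elided in the source; left as-is
    required_fields=("record_id", "action"),
)

calls = []

def fake_send(method, path, body):
    calls.append((method, path, body))
    return {"status": "sent"}

approved = call_opaque_endpoint(
    tool, {"record_id": "r1", "action": "close"}, lambda t, b: True, fake_send
)

try:
    call_opaque_endpoint(
        tool, {"record_id": "r1", "action": "close"}, lambda t, b: False, fake_send
    )
    declined = False
except PermissionError:
    declined = True
```

Keeping the approval check inside the call path, instead of at planning time, is what prevents these endpoints from being batched automatically.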
We’re working on auto-discovery for scripted REST APIs through OpenAPI spec generation. It’s promising but not in v1.0.
The pattern
ServiceNow is not a REST API. It’s a platform whose REST API hides the platform behavior. Generic LLM-with-tools sees the API and concludes the work is straightforward. The work is straightforward until the moment it isn’t, and at that moment, the agent that knew about scopes, ACLs, and BR order is already three steps ahead of the agent that didn’t.
Platform-aware isn’t a feature. It’s a debt you pay up front so your agent doesn’t accumulate it in production.