18 public routes · Powered by SlimeTree-RLM · MIT source
SlimeTree-RLM × Platform Integrations
Cut LLM cost 60-80%.
A 272 KB safety device answers what it can instantly and blocks what it shouldn't answer.
Only R verdicts go to Gemini / Claude / OpenAI; the cut rate is ~73% with any provider.
Gemini = open to everyone (free key via Google AI Studio) / Claude · OpenAI = corporate plans. You pay each provider directly.
Supported platforms
SlimeTree-RLM (D/µ/R deterministic classification + audit chain) is platform-neutral. We are rolling out the same pattern, an RLM pre-filter wrapped around each platform's API, one platform at a time.
* Each combination of a platform's API and the RLM pre-filter is released as a unit: public demo + MIT source + commercial package. Requests to prioritize a platform are welcome via Contact.
Concrete example: 10,000 DMs / month ― per provider
One month for a typical B2C brand, averaging 200 input / 1,000 output tokens per message. Inbound volume is unchanged; RLM filters out 73% (D 6,200 + µ 1,100 = 7,300 messages with zero LLM calls), so only the 2,700 R verdicts go to the LLM.
The result ranges from freeing up $13,000+/year (Claude Opus) to running at $0 on the Gemini Flash free tier, with the same code and the same config.
+ Every query is recorded in a SHA-256 audit chain, so compliance logging adds no extra cost.
audit_chain_head: a3f1d2e8b4c97f5e... [✓ verified, 10,000 records]
* Sample ratios (D 62% / µ 11% / R 27%) are averages from our 4-LLM cross-validation benchmark; adjustable by industry. Prices are public unit prices as of 2026-05; verify with each provider directly.
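The split above can be reproduced with a few lines of arithmetic. This is a minimal sketch, not the product's code: the function names are illustrative, and the per-million-token prices passed to `monthlyLlmCost` are hypothetical placeholders that should be replaced with each provider's current public prices.

```javascript
// Sample ratios quoted in the footnote above (D 62% / mu 11% / R 27%).
const RATIOS = { D: 0.62, mu: 0.11, R: 0.27 };

// Split a monthly message count into D / mu / R buckets.
function splitVerdicts(total, ratios = RATIOS) {
  const D = Math.round(total * ratios.D);
  const mu = Math.round(total * ratios.mu);
  const R = total - D - mu; // whatever is left goes to the LLM
  return { D, mu, R, zeroLlm: D + mu };
}

// Monthly LLM cost in USD for `calls` requests.
// Prices are per million tokens and are ASSUMED inputs, not real quotes.
function monthlyLlmCost(calls, pricePerMTokIn, pricePerMTokOut,
                        tokensIn = 200, tokensOut = 1000) {
  const perCall = (tokensIn * pricePerMTokIn + tokensOut * pricePerMTokOut) / 1e6;
  return calls * perCall;
}

const split = splitVerdicts(10000);
console.log(split); // { D: 6200, mu: 1100, R: 2700, zeroLlm: 7300 }

// Hypothetical prices: $1 per 1M input tokens, $10 per 1M output tokens.
const withoutRlm = monthlyLlmCost(10000, 1, 10);
const withRlm = monthlyLlmCost(split.R, 1, 10);
console.log((1 - withRlm / withoutRlm).toFixed(2)); // 0.73
```

Because every call is assumed to have the same token profile, the saving equals the share of messages that never reach the LLM, whichever provider's prices are plugged in.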
NEXT THEME · 2026-05-30 Pattern B · Cost-tier escalation ★ All 18 demos shipped (estimate → next: measured)
★ Multi-agent layered cut ― on top of the existing 73% cut
The table above assumes "all 2,700 R-verdicts go to the same premium LLM (e.g. GPT-5 / Claude Opus)".
The next theme, LLM multi-agent support, adds another layer on top of those 2,700 calls: "try a cheap LLM → RLM re-judges quality → escalate only insufficient items to premium".
On top of the 73% cut, the R portion is cut by another 50-70%. Combined: 85-92%.
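One way to read the combined figure is that the second cut applies only to the 27% that survives the first. A small sketch of that arithmetic, assuming the 50-70% refers to cost on the R portion (the function name is illustrative):

```javascript
// Combine two successive cuts: the second applies only to what the first leaves.
function combinedCut(firstCut, secondCutOnRemainder) {
  const remaining = (1 - firstCut) * (1 - secondCutOnRemainder);
  return 1 - remaining;
}

console.log((combinedCut(0.73, 0.5) * 100).toFixed(1)); // 86.5
console.log((combinedCut(0.73, 0.7) * 100).toFixed(1)); // 91.9
```

Both ends fall inside the 85-92% range quoted above.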
[Current] 10,000 items
↓ SlimeTree-RLM (D/µ/R)
↓ D=6,200 / µ=1,100 handled deterministically (0 LLM calls)
↓ R=2,700 ───────────────────────→ premium LLM 2,700 calls
[After Pattern B layered] 10,000 items
↓ SlimeTree-RLM (D/µ/R)
↓ D=6,200 / µ=1,100 handled deterministically (0 LLM calls)
↓ R=2,700
↓ ① Try all 2,700 on cheap LLM (e.g. Gemini Flash) → cheap 2,700 calls
↓ ② RLM re-judges output quality (R-meta verdict)
↓ ③ Only the ~15% judged insufficient escalate to premium → premium 405 calls
↑ 6.7× fewer
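The three stages in the diagram can be sketched as a small routing loop. This is an illustration under stated assumptions, not the shipped module: `patternB`, `callCheap`, `callPremium` and `isSufficient` are hypothetical names, and the provider calls are replaced with stubs so the sketch runs without any API key.

```javascript
// Pattern B routing: try cheap, re-judge, escalate only insufficient items.
// All three dependencies are injected, so any provider pair can be plugged in.
async function patternB(items, { callCheap, callPremium, isSufficient }) {
  const stats = { cheapCalls: 0, premiumCalls: 0 };
  const answers = [];
  for (const item of items) {
    // Stage 1: every R item goes to the cheap model first.
    const draft = await callCheap(item);
    stats.cheapCalls += 1;
    // Stage 2: quality re-judgment of the cheap output.
    if (isSufficient(item, draft)) {
      answers.push({ id: item.id, text: draft, tier: "cheap" });
      continue;
    }
    // Stage 3: only insufficient items reach the premium model.
    const final = await callPremium(item);
    stats.premiumCalls += 1;
    answers.push({ id: item.id, text: final, tier: "premium" });
  }
  return { answers, stats };
}

// Simulation: 2,700 R items, with 3 out of every 20 (15%) judged insufficient.
const items = Array.from({ length: 2700 }, (_, i) => ({ id: i }));
const stubs = {
  callCheap: async (item) => `cheap answer ${item.id}`,
  callPremium: async (item) => `premium answer ${item.id}`,
  isSufficient: (item) => item.id % 20 >= 3,
};

patternB(items, stubs).then(({ stats }) => {
  console.log(stats); // { cheapCalls: 2700, premiumCalls: 405 }
  console.log((stats.cheapCalls / stats.premiumCalls).toFixed(1)); // 6.7
});
```

The 15% escalation rate is baked into the stub judge here; in practice it is whatever the re-judging step actually produces.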
Concrete example: 10,000 DMs / month · cheap = Gemini Flash · premium = GPT-5
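- Without RLM: baseline.
- RLM only (2,700 premium calls): save $226/month (73%).
- RLM + Pattern B (2,700 cheap + 405 premium calls): save $295/month vs. baseline (95.3%); the R portion falls from $84 to $14.6, i.e. −$69/month.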
* The 15% escalation rate is an estimate (conservative re-judging setting). 10% → $10.5/month; 25% → $23/month.
* Gemini Flash 2,700 calls ≈ $2.03 (existing table); escalated 405 calls × GPT-5 unit price ≈ $12.6 (total $14.6).
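The sensitivity figures in the two footnotes follow from the numbers already given. A minimal sketch, assuming the premium cost scales linearly with the number of escalated calls (the per-call price is derived from the quoted $12.6 for 405 calls, not taken from a price list):

```javascript
const R_CALLS = 2700;
const CHEAP_TOTAL_USD = 2.03;             // Gemini Flash, 2,700 calls (quoted above)
const PREMIUM_PER_CALL_USD = 12.6 / 405;  // implied GPT-5 cost per escalated call

// Monthly Pattern B cost for a given escalation rate, in percent.
function patternBCost(escalationPct) {
  const premiumCalls = Math.round((R_CALLS * escalationPct) / 100);
  return CHEAP_TOTAL_USD + premiumCalls * PREMIUM_PER_CALL_USD;
}

for (const pct of [10, 15, 25]) {
  console.log(`${pct}% -> $${patternBCost(pct).toFixed(2)}`);
}
// 10% -> $10.43
// 15% -> $14.63
// 25% -> $23.03
```

These match the rounded values in the footnotes ($10.5, $14.6 and $23).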
Per premium: applying the same Pattern B to 6 configurations
* cheap = Gemini Flash, fixed. Each premium uses the same 15% R-meta escalation rate (raise the rate to push more onto premium when quality sensitivity is higher).
* In every combination, cheap = the same Gemini Flash, so SMBs can run the cheap layer at $0 on the free tier (15 RPM).
Why RLM works as an orchestrator
- Drops in without modification ― replace the R path of the existing D/µ/R verdict with a 3-stage "cheap try → quality re-judge → premium escalate." Same patch applies to all 18 routes.
- Audit doesn't break ― the existing SHA-256 WAL chain simply appends two records (cheap / premium). "Which query was handled by cheap, which escalated to premium" is fully traceable from the audit log.
- Preserves the platform-native LLM principle ― on X, Grok cheap → Grok premium; on Google, Gemini Flash → Gemini Pro; on Meta, any cheap/premium mix you choose. Gemini is never forced on a platform.
- Fallback is natural ― if the premium API fails, the cheap response can be returned (with a quality flag). Frameworks like LangGraph require explicit wiring for this.
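The audit and fallback points in the list above can be illustrated together. The sketch below is a generic SHA-256 hash chain plus a try/catch fallback, assuming Node.js with CommonJS; the record fields, the genesis value and every function name are hypothetical and do not describe the actual WAL format.

```javascript
const { createHash } = require("node:crypto");

const GENESIS = "0".repeat(64); // assumed starting hash for an empty chain

// Append one record; each hash covers the previous hash plus the record body.
function appendRecord(chain, payload) {
  const prevHash = chain.length ? chain[chain.length - 1].hash : GENESIS;
  const body = JSON.stringify(payload);
  const hash = createHash("sha256").update(prevHash + body).digest("hex");
  chain.push({ prevHash, body, hash });
  return hash;
}

// Recompute every hash; any edited or reordered record breaks the chain.
function verifyChain(chain) {
  let prevHash = GENESIS;
  for (const rec of chain) {
    const expected = createHash("sha256").update(prevHash + rec.body).digest("hex");
    if (rec.prevHash !== prevHash || rec.hash !== expected) return false;
    prevHash = rec.hash;
  }
  return true;
}

// Escalate one item: two records are appended (cheap, premium), and the
// cheap text is returned with a quality flag if the premium call fails.
async function escalateWithFallback(chain, query, cheapText, callPremium) {
  appendRecord(chain, { query, tier: "cheap", ok: true });
  try {
    const text = await callPremium(query);
    appendRecord(chain, { query, tier: "premium", ok: true });
    return { text, tier: "premium", lowQuality: false };
  } catch (err) {
    appendRecord(chain, { query, tier: "premium", ok: false, error: String(err) });
    return { text: cheapText, tier: "cheap", lowQuality: true };
  }
}

const chain = [];
escalateWithFallback(chain, "q-001", "cheap draft", async () => {
  throw new Error("premium API unavailable"); // simulated outage
}).then((result) => {
  console.log(result.tier, result.lowQuality); // cheap true
  console.log(chain.length, verifyChain(chain)); // 2 true
});
```

Because the failed premium attempt is also written to the chain, the log still shows which queries were meant to escalate even when the fallback answer was served.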
Step 1 done: B mode implemented across all 18 demos (Meta 6 + X 6 + Google 6), JS syntax verified (local + live). 3 Gateways + 15 bots all expose a "single LLM / cost-tier (B)" radio switch.
Step 2 done: dedicated multi-agent demo page (6-pattern showcase) is public; all of A/B/C/D/E/F work.
Step 3 done: shared module (MIT, ES module, 4 providers) public; AI agents (Claude Code / Codex etc.) can import it directly.
★ Step 4 done (2026-06-01): using the measurement dashboard, the R rate was measured across 2 independent runs (see table below). The escalation rate could not be fully measured because of the Free-tier API quota, so we kept the estimated value and labeled it as an estimate. Confirmation is deferred to a follow-up once billing is attached.
★ Estimated vs measured (confirmed 2026-06-01)
Results from the measurement dashboard (`/integrations/measurement/`) using real Gemini Flash calls. The R rate reproduced at 28% across 2 independent runs. The escalation rate could not be fully measured because of Free-tier 429 limits, so we kept the estimated value and labeled it as an estimate.
★ Footnote: the R rate reproduced at 28% across 2 independent runs (2026-06-01T00:23:51Z + 00:26:55Z). The escalation rate was held at the estimated value because the Free-tier 429 limit prevents full measurement. Confirmation deferred to a follow-up on a paid tier or on the WASM version of RLM.
Method: the RLM mock module (`slimetree-rlm-mock.js`, with 6 KNOWN_FACTS and 4 MUTE_TRIGGERS) produces D/µ/R verdicts; only R items invoke Gemini 2.5 Flash; `judgeResponseQuality` then re-judges each response, and only insufficient items would escalate to Pro. Pro was not reached during the measurement runs, so the observed escalation rate depends on cheap-response length.
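To make the method concrete, here is a rough sketch of what a mock classifier of this shape could look like. Only the names `KNOWN_FACTS`, `MUTE_TRIGGERS` and `judgeResponseQuality` come from the text; the entries, the matching rule (substring match, mute triggers checked first) and the length threshold are all assumptions, and the real `slimetree-rlm-mock.js` may differ.

```javascript
// ASSUMED sample entries; the real module lists 6 facts and 4 triggers.
const KNOWN_FACTS = new Map([
  ["business hours", "We are open 9:00-18:00 on weekdays."],
  ["shipping status", "Orders ship within 2 business days."],
]);
const MUTE_TRIGGERS = ["guaranteed returns", "wire the money"];

// Return a verdict: "mu" (blocked), "D" (answered deterministically),
// or "R" (needs an LLM). The verdict is spelled "mu" in ASCII here.
function classify(message) {
  const text = message.toLowerCase();
  if (MUTE_TRIGGERS.some((trigger) => text.includes(trigger))) {
    return { verdict: "mu", reply: null };
  }
  for (const [key, answer] of KNOWN_FACTS) {
    if (text.includes(key)) return { verdict: "D", reply: answer };
  }
  return { verdict: "R", reply: null };
}

// Length-based re-judgment (assumed rule): true means the cheap answer is
// sufficient, false means it would escalate to the premium model.
function judgeResponseQuality(response, minLength = 80) {
  return response.trim().length >= minLength;
}

console.log(classify("What are your business hours?").verdict); // D
console.log(classify("Guaranteed returns if you join today").verdict); // mu
console.log(classify("Which plan fits a team of five?").verdict); // R
console.log(judgeResponseQuality("Sure.")); // false
```

A length-only judge like this one explains why the observed escalation rate would track cheap-response length, as noted above.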
6 routes ― who gets relief from what
Each route's typical pain in one line. See the demos for detail.
SlimeTree-RLM Prompt Gateway
"What are your business hours?" / "Tell me the specs of XX" is answered instantly without ever hitting Claude. Claude is used only for conversations that truly need an LLM.
Threads automated posting (µ-prefilter)
A µ warning on the draft helps avoid account suspension from Meta moderation. Routine announcements pass deterministically.
Messenger safe bot
"Hours," "shipping status," and "pricing" are answered at zero tokens, 24 hours a day; complex consultations go to Claude. Your operators can sleep.
WhatsApp Business safe bot
Replies sent within 24 hours of the customer's last message go out instantly, without waiting for template approval. The same setup also meets audit-chain requirements for medical / finance use.
Instagram DM safe bot
Investment solicitation, affiliate, and phishing DMs are screened deterministically inside the browser. Zero Claude calls, and operators keep their peace of mind.
Graph API generic client
Hit any Facebook / Instagram / Pages / Threads / WhatsApp endpoint from one UI. A post-stage RLM audit of the response text fits right here.
Delivery options ― RLM license and LLM usage are fully separated
| Tier | Scope (RLM features) | RLM fee | LLM fee |
|---|---|---|---|
| ① 30-day free trial (open now) | Register email + password at the Gateway → email approval → 30-day trial begins automatically. RLM features usable across all Meta 6 routes + X 6 routes. D / µ verdicts run at zero cost. | $0 | Your choice (BYO key): Gemini = free tier OK; Claude / OpenAI / Grok = each provider's billing |
| ② SaaS monthly (in preparation) | Subscription for continued RLM use. Release upon catalog finalization. Per-route / bundle selectable. | To be announced | Same as above (BYO key) |
| ③ Custom integration / OEM (on demand) | Custom RLM integration into existing systems / Meta App / X App environments; actual WASM license + engineering. LLM connection designed to fit requirements. | Inquire | Customer choice (optional) |
Supported LLM providers (your choice, 4 and counting)
The same RLM implementation routes R verdicts to whichever LLM you pick. You do not have to commit to one; combine providers for different use cases as needed.
* Recommended starting config: Gemini 2.5 Flash (free tier). Verify connectivity for each provider individually; switching to another LLM in production requires only swapping the key in localStorage, with no RLM configuration changes.
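A key swap of this kind might look like the sketch below. The storage key names (`llm_provider`, `<provider>_api_key`) and both functions are hypothetical, since the page does not document the actual keys; the storage argument is any object with `getItem` / `setItem`, which in a browser would be `window.localStorage`.

```javascript
// In-memory stand-in so the sketch also runs outside a browser.
function createMemoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => { data.set(key, String(value)); },
  };
}

// Read the selected provider and its BYO key from storage.
// Falls back to "gemini", the recommended starting config.
function resolveProvider(storage) {
  const provider = storage.getItem("llm_provider") || "gemini";
  const apiKey = storage.getItem(`${provider}_api_key`);
  if (!apiKey) throw new Error(`No API key stored for provider "${provider}"`);
  return { provider, apiKey };
}

const storage = createMemoryStorage(); // in a browser: window.localStorage
storage.setItem("gemini_api_key", "GEMINI_KEY_PLACEHOLDER");
console.log(resolveProvider(storage).provider); // gemini

// Switching provider touches only stored values, not the routing code.
storage.setItem("llm_provider", "claude");
storage.setItem("claude_api_key", "CLAUDE_KEY_PLACEHOLDER");
console.log(resolveProvider(storage).provider); // claude
```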
Technical detail, benchmarks, patents, source
"What is it doing inside?", "Validation of the −20.4 pt cut over 6,870 trials", and "Why is a Rust standalone binary just 272 KB?" are all covered in Resource.
- Meta integration hub (technical detail of the 6 routes)
- SlimeTree-RLM primary materials ― measurement procedure / 4-LLM cross-validation / paper v10 / patent claims 1-44
- RLM tutorial (JA: intro → intermediate → advanced)
- Gateway source code (MIT)
- Product page: SlimeTree-RLM (DEVICE primary + AI applied)
