I’ve been maintaining perplexity-go for a while now — a small Go client for the Perplexity AI API. It started as scratching my own itch (I needed a CLI to query Perplexity from scripts) and grew into a library I now reach for whenever a Go project needs real-time, cited answers from the web.
This post is a tour of what it does, why it exists, and how to use it.
Why Perplexity, and why a Go client
Perplexity differs from a typical LLM in one important way: it actually searches the web in real time and returns answers with citations. That makes it a much better fit than a model that answers only from its training data when your application needs current information: news, documentation, prices, anything that changes.
I wanted to use that from Go, and at the time there wasn’t a maintained client. So I wrote one.
The design goals were modest and haven’t changed:
- Idiomatic Go — functional options, no magic
- Zero dependencies beyond the standard library
- Thread-safe, context-aware
- Stable surface area, even as Perplexity’s models churn
That last point matters. Perplexity rotates and renames models often. Since v2.5.0, the library tracks the default model only and lets you override it with WithModel(...) if you need something specific. This way the library doesn’t break every time a model is retired.
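For readers new to the pattern, here is a minimal sketch of how functional options let a default model be overridden without breaking callers. The request type, the option helpers, and the default values are hypothetical stand-ins written for this post, not the library's actual definitions.

```go
package main

import "fmt"

// request is a hypothetical stand-in for a completion request.
// It is not the library's type; the field names are illustrative.
type request struct {
	model     string
	maxTokens int
}

// option mutates a request while it is being built.
type option func(*request)

// withModel overrides the default model.
func withModel(name string) option {
	return func(r *request) { r.model = name }
}

// withMaxTokens caps the length of the generated answer.
func withMaxTokens(n int) option {
	return func(r *request) { r.maxTokens = n }
}

// newRequest starts from defaults and applies the options in order,
// so callers only mention the settings they want to change.
func newRequest(opts ...option) *request {
	r := &request{model: "sonar", maxTokens: 4000} // assumed defaults
	for _, apply := range opts {
		apply(r)
	}
	return r
}

func main() {
	fmt.Printf("%+v\n", *newRequest())
	fmt.Printf("%+v\n", *newRequest(withModel("sonar-pro"), withMaxTokens(300)))
}
```

Because the default lives in one place, retiring a model means changing a single line, and code that never called the override keeps compiling.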
Installation
go get github.com/sgaunet/perplexity-go/v2
You’ll need an API key from the Perplexity API portal. I keep mine in PPLX_API_KEY.
Chat completions
The most common use case — ask a question, get an answer with citations:
package main
import (
    "fmt"
    "os"
    "github.com/sgaunet/perplexity-go/v2"
)
func main() {
    client := perplexity.NewClient(os.Getenv("PPLX_API_KEY"))
    msg := []perplexity.Message{
        {Role: "user", Content: "What's the capital of France?"},
    }
    req := perplexity.NewCompletionRequest(
        perplexity.WithMessages(msg),
    )
    if err := req.Validate(); err != nil {
        fmt.Printf("validation: %v\n", err)
        os.Exit(1)
    }
    res, err := client.SendCompletionRequest(req)
    if err != nil {
        fmt.Printf("request: %v\n", err)
        os.Exit(1)
    }
    fmt.Println(res.GetLastContent())
}
Three things worth pointing out:
- NewCompletionRequest takes functional options. All configuration flows through them: WithModel, WithMessages, WithReturnImages, and so on. No struct soup.
- req.Validate() is explicit. The library catches malformed requests before the network call, which makes failures cheap to debug.
- res.GetLastContent() extracts the assistant’s reply. No JSON spelunking required.
The Search API
In 2025, Perplexity exposed a separate Search API that returns raw ranked results (URLs, titles, snippets) without the generative LLM layer. It’s faster, cheaper, and ideal when you want web search results to feed into your own pipeline (RAG, scoring, deduplication, whatever).
perplexity-go supports it natively:
client := perplexity.NewClient(os.Getenv("PPLX_API_KEY"))
req := perplexity.NewSearchRequest("golang best practices")
resp, err := client.SendSearchRequest(req)
if err != nil {
    log.Fatal(err)
}
for i, r := range resp.Results {
    fmt.Printf("%d. %s\n %s\n", i+1, r.Title, r.URL)
}
You can also pass multiple queries at once and tune the search with options:
req := perplexity.NewSearchRequest(
    "machine learning papers",
    perplexity.WithSearchMaxResults(10),
    perplexity.WithSearchReturnSnippets(true),
    perplexity.WithSearchCountry("US"),
    perplexity.WithSearchDomains([]string{"arxiv.org", "github.com"}),
)
Pricing is decoupled from chat completions — at the time of writing, the Search API runs ~$5 per 1,000 requests. For workloads that just need ranked URLs, it’s a much better fit than paying for token generation you’ll throw away.
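When several queries feed one pipeline, the same page tends to show up more than once. A rough sketch of URL-based deduplication might look like this. The result struct is a hypothetical stand-in that only mirrors the Title and URL fields used earlier, and the normalization rule (ignore scheme, fragment, and trailing slash) is my assumption; adjust it to your data.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// result is a hypothetical stand-in for one search hit.
type result struct {
	Title string
	URL   string
}

// normalize builds a comparison key from a URL: lowercased host, path
// without a trailing slash, plus the query string. Scheme and fragment
// are ignored. Unparseable input falls back to the raw string.
func normalize(raw string) string {
	u, err := url.Parse(raw)
	if err != nil || u.Host == "" {
		return raw
	}
	key := strings.ToLower(u.Host) + strings.TrimSuffix(u.Path, "/")
	if u.RawQuery != "" {
		key += "?" + u.RawQuery
	}
	return key
}

// dedupe keeps the first occurrence of each normalized URL and
// preserves the original ranking order.
func dedupe(in []result) []result {
	seen := make(map[string]struct{}, len(in))
	out := make([]result, 0, len(in))
	for _, r := range in {
		key := normalize(r.URL)
		if _, dup := seen[key]; dup {
			continue
		}
		seen[key] = struct{}{}
		out = append(out, r)
	}
	return out
}

func main() {
	hits := []result{
		{Title: "Effective Go", URL: "https://go.dev/doc/effective_go"},
		{Title: "Effective Go (anchor)", URL: "http://GO.dev/doc/effective_go/#names"},
		{Title: "Go Code Review Comments", URL: "https://go.dev/wiki/CodeReviewComments"},
	}
	for i, r := range dedupe(hits) {
		fmt.Printf("%d. %s\n   %s\n", i+1, r.Title, r.URL)
	}
}
```

Keeping the first occurrence means the highest-ranked copy of a page wins, which is usually what you want when results arrive in rank order.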
Context, cancellation, timeouts
Anything that talks to a remote API in Go should respect context.Context. The library does:
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
resp, err := client.SendSearchRequestWithContext(ctx, req)
Same pattern for completions. This composes cleanly with HTTP servers, worker pools, and errgroup.
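To make the composition concrete, here is a small sketch that fans several queries out under one shared deadline using only the standard library. The searchFunc type and slowSearch are hypothetical stand-ins that simulate a remote call; in real code the goroutine body would call a context-aware client method such as SendSearchRequestWithContext instead.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// searchFunc is a hypothetical stand-in for a context-aware client call.
type searchFunc func(ctx context.Context, query string) (string, error)

// slowSearch simulates a remote call that answers after latency,
// or gives up early when ctx is cancelled or its deadline passes.
func slowSearch(latency time.Duration) searchFunc {
	return func(ctx context.Context, query string) (string, error) {
		select {
		case <-time.After(latency):
			return "results for " + query, nil
		case <-ctx.Done():
			return "", ctx.Err()
		}
	}
}

// fanOut runs one search per query concurrently under a shared deadline.
// Each goroutine writes only to its own slice index, so no mutex is needed.
func fanOut(parent context.Context, budget time.Duration, queries []string, search searchFunc) ([]string, []error) {
	ctx, cancel := context.WithTimeout(parent, budget)
	defer cancel()

	results := make([]string, len(queries))
	errs := make([]error, len(queries))
	var wg sync.WaitGroup
	for i, q := range queries {
		wg.Add(1)
		go func(i int, q string) {
			defer wg.Done()
			results[i], errs[i] = search(ctx, q)
		}(i, q)
	}
	wg.Wait()
	return results, errs
}

// timedOut counts how many queries hit the 100 ms deadline
// for a given simulated backend latency.
func timedOut(latency time.Duration) int {
	queries := []string{"go context", "go errgroup"}
	_, errs := fanOut(context.Background(), 100*time.Millisecond, queries, slowSearch(latency))
	n := 0
	for _, err := range errs {
		if errors.Is(err, context.DeadlineExceeded) {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println("fast backend, timed out:", timedOut(5*time.Millisecond))
	fmt.Println("slow backend, timed out:", timedOut(500*time.Millisecond))
}
```

The deadline is set once in fanOut and every call inherits it, so a slow upstream costs at most the budget rather than the sum of all the individual waits.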
A few practical tips
A few things I’ve learned shipping this in real applications:
- Pick the right model for the job. sonar is fast and cheap; sonar-pro is slower but handles longer outputs (up to ~8,000 tokens). For most queries, the default is fine.
- Lower max_tokens aggressively. The default is 4,000. If you’re summarizing or doing classification, a few hundred is usually enough, and it dramatically cuts latency.
- Cache aggressively. Search results are stable enough over short windows that even a 60-second cache eliminates a lot of duplicate spend.
- Don’t forget the citations. The whole point of Perplexity over a vanilla LLM is the source attribution. Surface those URLs to your users.
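The caching tip is easy to try with a few lines of standard-library Go. This is a minimal sketch, not something the library ships: ttlCache, getOrFetch, and the fetch callback are names invented for this post, and the callback stands in for whatever code actually calls the Search API.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// defaultTTL mirrors the 60-second window suggested in the tips.
const defaultTTL = 60 * time.Second

// entry pairs cached URLs with the moment they stop being fresh.
type entry struct {
	urls    []string
	expires time.Time
}

// ttlCache is a small in-memory cache keyed by query string.
type ttlCache struct {
	mu    sync.Mutex
	ttl   time.Duration
	items map[string]entry
}

func newTTLCache(ttl time.Duration) *ttlCache {
	return &ttlCache{ttl: ttl, items: make(map[string]entry)}
}

// getOrFetch returns the cached URLs for query while they are fresh,
// and otherwise calls fetch and stores its result for ttl.
func (c *ttlCache) getOrFetch(query string, fetch func(string) ([]string, error)) ([]string, error) {
	c.mu.Lock()
	e, ok := c.items[query]
	c.mu.Unlock()
	if ok && time.Now().Before(e.expires) {
		return e.urls, nil
	}

	urls, err := fetch(query) // the lock is not held during the slow call
	if err != nil {
		return nil, err // errors are not cached
	}

	c.mu.Lock()
	c.items[query] = entry{urls: urls, expires: time.Now().Add(c.ttl)}
	c.mu.Unlock()
	return urls, nil
}

func main() {
	calls := 0
	// fetch is a placeholder for a real search request.
	fetch := func(q string) ([]string, error) {
		calls++
		return []string{"https://example.com/?q=" + q}, nil
	}

	cache := newTTLCache(defaultTTL)
	for i := 0; i < 3; i++ {
		if _, err := cache.getOrFetch("golang best practices", fetch); err != nil {
			fmt.Println("fetch:", err)
			return
		}
	}
	fmt.Println("upstream calls:", calls)
}
```

Two concurrent misses for the same query can both reach the upstream in this version, and expired entries are never evicted. That trade-off keeps the code short; a busier service would want request coalescing and a cleanup pass.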
CLI counterpart: pplx
If you want to query Perplexity from the shell — for testing prompts, automation, or just curiosity — I also maintain a CLI tool built on top of this library: pplx. Same API key, same models, no Go code required.
Where to go next
- Source: github.com/sgaunet/perplexity-go
- GoDoc: pkg.go.dev/github.com/sgaunet/perplexity-go/v2
- Issues / feature requests: very welcome — especially around the Search API and any beta endpoints I haven’t been able to test against yet.
It’s MIT-licensed, lightweight, and used in production by at least one person (me). If it saves you the afternoon I spent figuring out the API quirks, it’s done its job.