Metrics
GoServe provides production-grade monitoring via Prometheus. All metrics are exposed in the standard Prometheus text format at the /metrics endpoint.
Configuration
The metrics endpoint is enabled by default and listens on the main server address (default :8080).
To scrape GoServe with Prometheus, add the following to your prometheus.yml:
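A minimal scrape configuration might look like the following. The job name, scrape interval, and target address are illustrative; point the target at wherever your GoServe instance is listening (default :8080):

```yaml
scrape_configs:
  - job_name: "goserve"
    scrape_interval: 15s
    static_configs:
      - targets: ["localhost:8080"]
```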
Available Metrics
HTTP Metrics
These metrics track the health and performance of the GoServe REST API.
| Metric Name | Type | Labels | Description |
|---|---|---|---|
| goserve_http_requests_total | Counter | method, status, path | Total number of HTTP requests processed. |
| goserve_http_request_duration_seconds | Histogram | method, path | Latency distribution of HTTP requests. |
Inference Metrics
These metrics provide visibility into the performance of your machine learning models.
| Metric Name | Type | Labels | Description |
|---|---|---|---|
| goserve_inference_duration_seconds | Histogram | model_name | Time spent executing the ONNX model (excluding HTTP overhead). |
| goserve_inference_errors_total | Counter | model_name, error_type | Count of failed inference attempts. |
Querying Examples (PromQL)
Average Inference Latency (Last 5 min)
```promql
rate(goserve_inference_duration_seconds_sum[5m])
  /
rate(goserve_inference_duration_seconds_count[5m])
```
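95th Percentile HTTP Latency

Since goserve_http_request_duration_seconds is a histogram, its _bucket series can be fed to histogram_quantile. An illustrative per-path p95 query (the 5m window and the choice of grouping labels are examples, not requirements):

```promql
histogram_quantile(
  0.95,
  sum by (le, path) (rate(goserve_http_request_duration_seconds_bucket[5m]))
)
```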
Request Volume by Status Code
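One way to break request throughput down by response code, using the counter documented above (the 5m window is illustrative):

```promql
sum by (status) (rate(goserve_http_requests_total[5m]))
```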
Error Rate per Model
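An illustrative per-model error-rate query based on the counter documented above:

```promql
sum by (model_name) (rate(goserve_inference_errors_total[5m]))
```

To express this as a fraction of all inference attempts rather than errors per second, divide by `rate(goserve_inference_duration_seconds_count[5m])`; note this assumes the duration histogram observes every attempt, including failed ones.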
Runtime Metrics
GoServe also exposes standard Go runtime metrics, including:
- go_goroutines: Number of active goroutines.
- go_memstats_alloc_bytes: Current heap memory usage.
- process_cpu_seconds_total: CPU usage of the GoServe process.