Metrics
GoServe provides production-grade monitoring via Prometheus. All metrics are exposed in the standard Prometheus text format at the /metrics endpoint.
Configuration
The metrics endpoint is enabled by default and listens on the main server address (default :8080).
To scrape GoServe with Prometheus, add the following to your prometheus.yml:
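A minimal scrape configuration might look like the following. The job name, scrape interval, and target address are illustrative; point the target at wherever your GoServe instance is listening (default :8080):

```yaml
scrape_configs:
  - job_name: "goserve"
    scrape_interval: 15s
    static_configs:
      - targets: ["localhost:8080"]
```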
Available Metrics
HTTP Metrics
These metrics track the health and performance of the GoServe REST API.
| Metric Name | Type | Labels | Description |
|---|---|---|---|
| goserve_http_requests_total | Counter | method, status, path | Total number of HTTP requests processed. |
| goserve_http_request_duration_seconds | Histogram | method, path | Latency distribution of HTTP requests. |
Inference Metrics
These metrics provide visibility into the performance of your machine learning models.
| Metric Name | Type | Labels | Description |
|---|---|---|---|
| goserve_inference_duration_seconds | Histogram | model_name | Time spent executing the ONNX model (excluding HTTP overhead). |
| goserve_inference_errors_total | Counter | model_name, error_type | Count of failed inference attempts. |
Querying Examples (PromQL)
Average Inference Latency (Last 5 min)
```promql
rate(goserve_inference_duration_seconds_sum[5m])
  /
rate(goserve_inference_duration_seconds_count[5m])
```
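95th Percentile HTTP Latency

Since goserve_http_request_duration_seconds is a histogram, its _bucket series can be fed to histogram_quantile. An illustrative per-path p95 query (the 5m window and the choice of grouping labels are examples, not requirements):

```promql
histogram_quantile(
  0.95,
  sum by (le, path) (rate(goserve_http_request_duration_seconds_bucket[5m]))
)
```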
Request Volume by Status Code
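One way to break request throughput down by response code, using the counter documented above (the 5m window is illustrative):

```promql
sum by (status) (rate(goserve_http_requests_total[5m]))
```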
Error Rate per Model
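An illustrative per-model error-rate query based on the counter documented above:

```promql
sum by (model_name) (rate(goserve_inference_errors_total[5m]))
```

To express this as a fraction of all inference attempts rather than errors per second, divide by `rate(goserve_inference_duration_seconds_count[5m])`; note this assumes the duration histogram observes every attempt, including failed ones.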
Runtime Metrics
GoServe also exposes standard Go runtime metrics, including:
- go_goroutines: Number of active goroutines.
- go_memstats_alloc_bytes: Current heap memory usage.
- process_cpu_seconds_total: CPU usage of the GoServe process.