@borouhin

borouhin@alien.top · 8 months ago

Heh, it’s a valuable OSINT source indeed :) Even if a sysadmin only once issued a single certificate covering multiple domains that were not meant to look connected to each other, CT logs show that these domains’ owners are actually affiliated.

borouhin@alien.top · 8 months ago

I own a small business, 20-30 devices only. But they’re a mix of all possible platforms (Windows, macOS, Android, iOS). I would like to enforce disk encryption and a strong password policy, automatically install/update/configure corporate VPN/mail/etc., prevent the use of blacklisted programs, and remotely wipe lost/stolen/otherwise compromised devices. I know it’s not feasible with any self-hosted solution, sadly.

borouhin@alien.top · 8 months ago

Any MDM solution. All self-hosted options that were available (onemdm, flyve) are dead. I’m my own employer, so we definitely agree everything should be self-hosted :)

borouhin@alien.top · 9 months ago

InfluxDB is just storage. If you have a service that saves metrics to InfluxDB (IIRC, Proxmox can do that), Grafana can read them from there. Grafana can aggregate data from many sources: Prometheus, Loki, InfluxDB, even queries to arbitrary JSON APIs, etc.

borouhin@alien.top · 9 months ago

No, they serve different purposes. Loki is for logs, Prometheus is for metrics. Grafana helps to visualize data from both.

borouhin@alien.top · 9 months ago

Good luck, if you get into it, you’ll be unable to stop. Perfecting your monitoring system is a kind of mania :)

One more piece of advice, for another kind of monitoring. When you are installing / configuring something on your server - it’s handy if you can monitor its resource usage in real time. And that’s why I use MobaXterm as my terminal program. It has many drawbacks, and competitors such as XShell, RoyalTS or Tabby look better in many ways… but it has one killer feature. It shows a status bar with current server load (CPU, RAM, disk usage, traffic) right below your SSH session, so that you don’t have to switch to another window to see the effect of your actions. Saved me a lot of potential headaches.

borouhin@alien.top · 9 months ago

When you have several Prometheus instances (HA or in different datacenters), setting up a separate AlertManager for each of them is a good idea. But as OP is only beginning his journey into monitoring, I guess he will be setting up a single server with both Prometheus and Grafana on it. In this scenario a separate AlertManager doesn’t add reliability, but adds complexity.

As for source control, you can write a simple script using Grafana API to export alert rules (and dashboards as well) and push them to git. Not ideal, sure, but it will work.
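A minimal sketch of such an export script, assuming a Grafana version that exposes the alerting provisioning API (`/api/v1/provisioning/alert-rules`) alongside the usual `/api/search` and `/api/dashboards/uid/<uid>` endpoints. The URL, the token and the file layout are placeholders I made up, not anything specific to OP's setup:

```python
import json
import re
import urllib.request
from pathlib import Path

# Placeholder, adjust to your own instance.
GRAFANA_URL = "http://localhost:3000"


def make_fetch(base_url, token):
    """Return a function that GETs a Grafana API path and decodes the JSON reply."""
    def fetch(path):
        req = urllib.request.Request(
            base_url.rstrip("/") + path,
            headers={"Authorization": f"Bearer {token}"},
        )
        with urllib.request.urlopen(req, timeout=10) as resp:
            return json.load(resp)
    return fetch


def safe_name(text):
    """Turn a dashboard title into a filesystem-friendly file name stem."""
    return re.sub(r"[^A-Za-z0-9_.-]+", "_", text).strip("_") or "untitled"


def export_all(fetch, out_dir):
    """Write every dashboard and all alert rules as JSON files under out_dir."""
    out = Path(out_dir)
    (out / "dashboards").mkdir(parents=True, exist_ok=True)
    written = []

    # One file per dashboard, named after its title and uid.
    for item in fetch("/api/search?type=dash-db"):
        full = fetch(f"/api/dashboards/uid/{item['uid']}")
        target = out / "dashboards" / f"{safe_name(item['title'])}_{item['uid']}.json"
        # sort_keys keeps the output stable, so git diffs stay readable.
        target.write_text(json.dumps(full["dashboard"], indent=2, sort_keys=True) + "\n")
        written.append(target)

    # All Grafana-managed alert rules go into a single file.
    rules = fetch("/api/v1/provisioning/alert-rules")
    target = out / "alert-rules.json"
    target.write_text(json.dumps(rules, indent=2, sort_keys=True) + "\n")
    written.append(target)
    return written
```

Calling `export_all(make_fetch(GRAFANA_URL, "<service account token>"), "grafana-backup")` from cron inside a git working copy, followed by the usual `git add`, `git commit` and `git push`, should be enough for a basic change history.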

Anyway, it’s never too late to go further and add AlertManager, Loki, Mimir and whatever else. But to flatten the learning curve I’d recommend starting with Grafana alerts that are much more user-friendly.

borouhin@alien.top · 9 months ago

Alerts are much more important than fancy dashboards. You won’t be staring at your dashboard 24/7 and you probably won’t be staring at it when bad things happen.

Creating your alert set is not easy. Ideally, every problem you encounter should be preceded by a corresponding alert, and no alert should be a false positive (one that requires no action). So if you either have a problem without being alerted by your monitoring, or get an alert which requires no action - you should sit down and think carefully about what should be changed in your alerts.

As for tools - I recommend Prometheus+Grafana. No need for a separate AlertManager, as many guides recommend: recent versions of Grafana have excellent built-in alerting. Don’t use those ready-to-use dashboards. Start from scratch, because you need to understand PromQL to set everything up efficiently. Start with a simple dashboard (and alerts!) just for generic server health (node exporter), then add exporters for your specific services, network devices (snmp), remote hosts (blackbox), SSL certs etc. etc. Then write your own exporters for what you haven’t found :)
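To give an idea of how little a home-grown exporter needs: it is just an HTTP endpoint that returns metrics in the Prometheus text exposition format. Below is a rough sketch using only the Python standard library (in practice the official `prometheus_client` package is the more usual choice). The metric name `backup_age_seconds`, the port and the "last backup" timestamp are all made up for illustration:

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder: a real exporter would read this from a file, a database, an API...
LAST_BACKUP_TS = time.time()


def render_metric(name, help_text, value, labels=None):
    """Render one gauge in the Prometheus text exposition format.

    Label values are not escaped here, so keep them free of quotes and backslashes.
    """
    label_str = ""
    if labels:
        pairs = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        label_str = "{" + pairs + "}"
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} gauge\n"
        f"{name}{label_str} {value}\n"
    )


def collect(last_backup_ts, now):
    """Build the full /metrics payload (a single hypothetical metric)."""
    return render_metric(
        "backup_age_seconds",
        "Seconds since the last backup.",
        now - last_backup_ts,
        {"job": "nightly"},
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = collect(LAST_BACKUP_TS, time.time()).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=9101):
    """Block forever, serving /metrics on the given (arbitrary) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Run `serve()`, add the host and port as a scrape target in Prometheus, and an alert on something like `backup_age_seconds > 86400` would then fire once the backup is more than a day old.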