IT Systems Administrator / DevOps Engineer
We are seeking a senior IT Systems Administrator / DevOps Engineer to design, operate, and harden our on-prem and hybrid infrastructure. This role owns the end-to-end lifecycle of networking, secure remote access, Linux server operations, storage, backups/DR, observability, and artifact/package distribution. You will be hands-on: building reliable systems, automating repetitive work with Python/Bash, enforcing security baselines, operating firewalls, and collaborating closely with engineering teams to keep services performant and resilient.
Responsibilities
Network, Edge, and Firewalls (Routing/Switching/VPN)
• Deploy, configure, and maintain MikroTik/Cisco routers, L3 switches, and VPN solutions (including WireGuard).
• Own firewall operations across edge and internal segments: policy design, rule lifecycle, NAT, segmentation, logging, and controlled change management.
• Design VLANs, routing, ACLs, and segmentation to support production, lab, and office networks.
• Troubleshoot network performance/connectivity using packet captures, flow/log analysis, and structured diagnostics.
Internal / Core / External Network Architecture & Secure Access
• Plan and implement internal/DMZ/external network architecture with explicit trust boundaries and clear routing/firewall policy.
• Design and operate bastion/jump server patterns (least privilege, auditable access, session controls).
• Own SSH access patterns: ssh_config, key policies, ProxyJump/ProxyCommand, and secure tunneling.
• Implement secure RDP access patterns (gateways, restricted exposure, policy controls).
Protocols, Reverse Proxy, and TLS
• Apply strong fundamentals in TCP/UDP, HTTP(S), WebSockets, NAT, MTU, and timeouts/retries.
• Own IPv4 address management (subnetting, DHCP/DNS integration, static allocations).
• Deploy and operate Nginx reverse proxy/load balancing; health checks, routing, headers, rate limiting.
• Implement TLS/SSL best practices: certificate issuance/renewal, cipher policy, mTLS where appropriate.
Sysadmin & Infrastructure Operations
• Administer Linux systems: provisioning, patching, package management, service supervision, tuning, log management.
• Manage server hardware and secure out-of-band access (IPMI).
• Manage storage: RAID selection, monitoring, rebuild workflows, performance trade-offs.
• Lead capacity planning, lifecycle management, inventory, and standard operating procedures.
Data, DR, and Reliability (SRE)
• Operate databases with an HA/DR mindset: hot/cold standby, replication, backups/restores, and failover playbooks.
• Own snapshots and DR planning: RPO/RTO definition, periodic restore tests, DR drills, and runbooks.
• Support reliability practices: SLIs/SLOs, alerting hygiene, incident response, postmortems.
• Deploy/operate object storage and integrate with backup/retention and artifact workflows where applicable.
Artifactory, Registries, and Mirrors (Supply Chain & Availability)
• Deploy, operate, and maintain a central artifact repository (Artifactory).
• Build and manage repository mirrors/proxies to improve availability, performance, and supply-chain control, including common ecosystems such as:
    • Docker registry proxying/caching (Docker Hub, GHCR, etc.)
    • APT/YUM package mirrors and internal repositories
    • PyPI/NPM/Maven/Go module proxying (as applicable to the org)
• Define retention, cleanup, and storage tiering policies for artifacts; manage quotas and growth forecasting.
• Implement access control (RBAC), audit logging, and signing/verification practices where applicable.
• Integrate Artifactory with CI/CD pipelines for publish/pull workflows, immutability, promotion, and traceability.
• Design mirror strategy for constrained/air-gapped or unreliable internet environments (fallbacks, sync windows, verification).
Containers & CI/CD
• Operate Docker/Docker Compose services with clean patterns for config, secrets, logging, and upgrades.
• Manage Portainer (or equivalent) with RBAC, approvals, and auditability.
• Support CI/CD fundamentals: runners/agents, artifact pipelines, environment promotion, safe deployment practices.
Security, Hardening, and Controlled Pentest Activities
• Implement security baselines: OS hardening, firewalling, secure remote access, secrets handling, patch/vulnerability management.
• Manage Nitrokey/YubiKey workflows (enrollment, rotation, revocation, recovery).
• Support or integrate HSM solutions for key custody and sensitive cryptographic operations.
• Perform authorized internal security assessments: exposure review, config/access-control review, vulnerability scanning in controlled environments, TLS posture checks; write findings with remediation steps and validate fixes.
Observability
• Implement metrics/dashboards via Prometheus and Grafana; define actionable alerts and SLO-oriented views.
• Operate ClickHouse where required for high-volume telemetry/analytics and retention.
Automation & Documentation
• Automate operations with Python and Bash (provisioning, audits, backups, checks, incident tooling).
• Maintain diagrams/runbooks: networks, firewall policies, access procedures, DR plans, and change logs.
• Collaborate effectively with software teams, security stakeholders, and technicians.
Minimum Qualifications
• Bachelor’s degree in Computer Engineering (or equivalent).
• 5+ years in sysadmin, network operations, and/or DevOps/SRE roles.
• Strong Linux administration skills; proficiency in Python and Bash scripting.
• Networking fundamentals (TCP/UDP/HTTP(S)/WebSockets), IPv4 subnetting/routing, and troubleshooting tooling.
• Experience with MikroTik/Cisco routers, L3 switching, and VPNs (WireGuard preferred).
• Firewall operations experience (policy, NAT, segmentation, logging, safe change practices).
• Experience with Nginx reverse proxying and TLS/SSL certificate management.
• Experience with IPMI, RAID/storage operations, hardware lifecycle management, and capacity planning.
• Working knowledge of Docker/Docker Compose and CI/CD fundamentals.
• Strong documentation and incident response discipline.
Preferred Qualifications
• Experience designing segmented network architectures, DMZ patterns, and bastion/jump host workflows.
• Hands-on experience operating Artifactory and implementing mirrors/proxy repositories across multiple ecosystems (Docker/APT/YUM/PyPI/NPM/Maven/Go).
• Security mindset: hardening, secrets management, MFA/hardware keys, and audit readiness; experience with internal security assessments and remediation tracking.
• DB HA/DR depth: standby/replication, snapshot workflows, restore testing, and DR drills.
• Object storage experience and integration into backup/artifact pipelines.
• SRE practice: SLIs/SLOs, alert design, on-call workflows, postmortems.
• Observability depth: Prometheus/Grafana and ClickHouse operations/tuning.
• HSM integration and key management process ownership.
