Our cloud runs almost all major user-facing services, as well as many internal Yandex services. The system development team is responsible for the core layer of our cloud software, including the OS kernel that is deployed whenever new Yandex servers are put into operation.
We develop and maintain the following projects:
- Porto – an application containerization system designed specifically for large-scale use within Yandex. All cloud containers are based on Porto. The system also includes plugins, such as a metrics collector that exports container data to the internal monitoring system and a shim that enables CRI integration for Kubernetes.
- Skybone/Copier – a data transfer system that uses a BitTorrent-like protocol to efficiently distribute data to multiple servers. It consists of a client-side host daemon, a server-side peer tracker, and a service for HTTP-to-P2P conversion.
- Netmon – a service that actively monitors the status of the entire Yandex network. The system consists of agent and server components; the server side hosts several databases, with ClickHouse as the primary one.
- eBPF-agent – an agent that manages BPF programs on the host, allowing fine-grained tuning of the network stack for better efficiency.
- HBF-agent – a host-level manager for firewall rules.
- Perf-manager – a system for cluster-wide application performance analysis, similar to Google Cloud Profiler.
- Other projects related to system development, networking, and containers.
We’re looking for an experienced System Developer who is open to new technologies and eager to help support and develop them. You will be responsible for improving the performance, fault tolerance, and user-friendliness of the cloud.