Learning C++ by Implementing SOCKS5

I started my career writing a bit of C and C++ for embedded systems. Then I didn’t touch either language seriously for more than ten years.

In the meantime, I built things in C#, TypeScript, Go, Python, and whatever else the job needed. I’ve always believed in using the right tool for the job, which means I’ve often been pushed into a new language or framework I didn’t know yet. Each time, the language was new. The thinking wasn’t.

A while back I built a small library called simditoa. It was the first time I leaned heavily on AI for the syntax side of things, while I stayed focused on the actual problem. That was the moment something clicked: programming stays programming. Logical thinking is the work. The syntax is the surface.

So when I decided to come back to C++ properly, I needed a problem worth solving. Something I already understood well enough that the language would be the only friction.

I picked SOCKS5.

Why SOCKS5, Again

Every time I want to learn a new language properly, I pick a problem I already know how to solve and reimplement it.

SOCKS5 is one of my favourites. I’ve written it in a handful of languages over the years. It’s small enough to finish in a weekend, but it has just enough teeth to force you into the parts of a language you’d otherwise avoid: raw sockets, binary parsing, byte order, threads, blocking I/O, and a state machine that doesn’t forgive sloppy reads.

If you can read RFC 1928 and produce a working proxy, you’ve touched most of what a language gives you at the systems level. That’s the test.

The Setup

I kept the project deliberately small. A single main.cpp file, compiled with clang++ and -std=c++23. No CMake, no headers, no build system worth talking about. Just a script.sh that compiles and runs.

I also wrote a small Node.js client that uses socks-proxy-agent to make an HTTPS request through the proxy. That client became my forcing function. Until the C++ server spoke real SOCKS5, the client would fail. There was no faking it.

Step 1: A TCP Server, By Hand

The first real step was a Berkeley sockets TCP server. No SOCKS5 yet. Just a process that listens on port 8080, accepts one client, sends a greeting, and exits.

int server_fd = socket(AF_INET, SOCK_STREAM, 0);

sockaddr_in address{};
address.sin_family = AF_INET;
address.sin_addr.s_addr = INADDR_ANY;
address.sin_port = htons(8080);

bind(server_fd, reinterpret_cast<sockaddr*>(&address), sizeof(address));
listen(server_fd, 1);

int client_fd = accept(server_fd, nullptr, nullptr);

Five lines of real work. But every one of them carries something you forget if you’ve spent a decade in higher-level languages.

sockaddr_in address{} uses empty-brace initialisation to zero every member, the sin_zero padding field included. Forget the braces and you ship uninitialised bytes into a syscall. htons reminds you that network byte order is big-endian, whatever your CPU happens to use. reinterpret_cast<sockaddr*> is the price you pay for a C API that predates type-safe polymorphism: every address family pretends to be the same struct, and the cast is how you keep the compiler quiet.
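As a minimal sketch of those details, the address setup can be pulled into a small helper. The names make_listen_address and first_port_byte are hypothetical and not part of the project; the second one exists only to show what htons does to the bytes in memory.

```cpp
#include <arpa/inet.h>   // htons, htonl
#include <netinet/in.h>  // sockaddr_in, INADDR_ANY
#include <cstdint>

// Hypothetical helper: build a zeroed IPv4 wildcard address for `port`.
sockaddr_in make_listen_address(std::uint16_t port) {
    sockaddr_in address{};                        // empty braces zero every member
    address.sin_family = AF_INET;
    address.sin_addr.s_addr = htonl(INADDR_ANY);  // listen on all interfaces
    address.sin_port = htons(port);               // host order -> network order
    return address;
}

// Hypothetical helper: the first byte of sin_port as it sits in memory.
// After htons this is the high byte of the port on any host.
unsigned char first_port_byte(const sockaddr_in& address) {
    return reinterpret_cast<const unsigned char*>(&address.sin_port)[0];
}
```

For port 8080 (0x1F90), first_port_byte returns 0x1F on both little-endian and big-endian machines, which is exactly the point of the conversion.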

There’s no framework here. No middleware. No “app.listen()”. Just the kernel and a file descriptor.

It felt good. It also felt slow. I’d forgotten how much of the work the language won’t do for you.

Step 2: Reading the First Packet

A SOCKS5 conversation starts with the client sending a greeting: which authentication methods it supports. The format is dead simple. One byte for the version (always 0x05), one byte for the number of methods, then the methods themselves.

char buffer[25];
ssize_t n = recv(client_fd, buffer, sizeof(buffer), 0);

char ver = buffer[0];
char nauth = buffer[1];

for (int i = 0; i < nauth; i++) {
    char auth = buffer[2 + i];
    std::cout << "auth: " << static_cast<int>(auth) << std::endl;
}

The first time you print a byte in C++, you learn something annoying: std::cout << buffer[0] will try to print a character, not a number. You need static_cast<int> to see the value.

The second thing you learn, a few lines later, is that plain char is signed on many platforms, x86 among them (the standard leaves the choice to the implementation). If a byte has the top bit set, your “number” becomes negative. For values like 0x05 and 0x02 it doesn’t matter. For a length field that reaches 0x80, it does. I switched to unsigned char for byte buffers in the next step and stopped fighting the type system.

This is the kind of nuance you only re-learn by writing the code. You can read about it ten times. It doesn’t stick until a buffer reads -1 and you have to figure out why.
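A small sketch makes the difference concrete. The two function names are hypothetical, and signed char is spelled out explicitly so the result doesn’t depend on how the platform treats plain char.

```cpp
#include <cstdint>

// What you see when the byte is stored in a signed char.
int read_as_signed(std::uint8_t raw) {
    return static_cast<signed char>(raw);
}

// What you see when the byte is stored in an unsigned char.
int read_as_unsigned(std::uint8_t raw) {
    return static_cast<unsigned char>(raw);
}
```

Both agree on 0x05. They part ways at 0x80 and above: 0xFF reads as -1 through the signed path and as 255 through the unsigned one.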

Step 3: Talking Back

Parsing a greeting is half the job. The server has to respond with which method it picked. Two bytes: the version, and the chosen method. Or 0xFF if no method is acceptable.

This is also where I started extracting functions:

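// Returns the chosen method, or -1 if none is acceptable.
// The return type is int so that -1 survives where char is unsigned.
int handle_greeting_request(int client_fd) {
    unsigned char buffer[1024];
    ssize_t n = recv(client_fd, buffer, sizeof(buffer), 0);
    if (n < 2) return -1;  // need at least the version and method count

    unsigned char ver = buffer[0];
    unsigned char nauth = buffer[1];

    // Look for username/password (0x02) among the methods actually received.
    for (int i = 0; i < nauth && 2 + i < n; i++) {
        if (buffer[2 + i] == 0x02) {
            unsigned char response[2] = {ver, 0x02};
            send(client_fd, response, sizeof(response), 0);
            return 0x02;
        }
    }

    // No acceptable method.
    unsigned char response[2] = {ver, 0xff};
    send(client_fd, response, sizeof(response), 0);
    return -1;
}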

This is a small refactor that mattered more than it looks. SOCKS5 is a small state machine: greeting, then auth, then connect, then relay. Each phase wants its own function. Each function has one job and one return value: a status, a socket, or an error.

I’d built the same protocol in TypeScript and Go with classes or interfaces. In C++ I just used free functions and a few named integers. It was refreshing, in the way old code can sometimes be.
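For illustration only, the phase order can be written down as an enum and a transition function. This is a sketch with hypothetical names (Phase, next_phase), not the project’s code, which stays with free functions and plain integers.

```cpp
// Hypothetical names for the four protocol phases plus a terminal state.
enum class Phase { Greeting, Auth, Connect, Relay, Closed };

// Advance to the next phase when the current one succeeded; otherwise close.
constexpr Phase next_phase(Phase current, bool ok) {
    if (!ok) return Phase::Closed;
    switch (current) {
        case Phase::Greeting: return Phase::Auth;
        case Phase::Auth:     return Phase::Connect;
        case Phase::Connect:  return Phase::Relay;
        case Phase::Relay:    return Phase::Closed;
        case Phase::Closed:   return Phase::Closed;
    }
    return Phase::Closed;  // unreachable for valid enum values
}
```

Each handler function maps to one phase, and its success or failure is the ok flag that drives the next step.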

Step 4: Variable-Length Fields

After greeting comes username/password authentication, defined in RFC 1929. The packet format is variable length: a version byte, a length, the username, another length, the password.

This is where binary parsing gets fun.

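// Worst case per RFC 1929: 1 + 1 + 255 + 1 + 255 = 513 bytes.
char buffer[513];
recv(client_fd, buffer, sizeof(buffer), 0);

char ver = buffer[0];
// Lengths are unsigned bytes; cast so 0x80 and above don't go negative.
unsigned char id_len = static_cast<unsigned char>(buffer[1]);
std::string id(&buffer[2], id_len);

unsigned char pw_len = static_cast<unsigned char>(buffer[2 + id_len]);
std::string pw(&buffer[3 + id_len], pw_len);

// Accept any credentials: status 0x00 means success.
unsigned char response[2] = {static_cast<unsigned char>(ver), 0x00};
send(client_fd, response, sizeof(response), 0);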

Two things are worth pointing out.

First, std::string(const char* ptr, size_t len). This constructor is a small revelation. It takes a pointer and a length and produces a real C++ string with no null-terminator gymnastics. For a binary protocol where every field carries its own length, it’s exactly the right tool.

Second, the offset arithmetic. &buffer[3 + id_len] is the kind of expression that looks innocent and isn’t. If the client lies about lengths, or if the packet was short, you read past the end of the buffer. In a production proxy, you’d validate every length against the bytes you actually received. In a learning project, you trust the client and move on.

For this exercise, the server accepts any credentials and always replies 0x00 (success). Authentication wasn’t the point. Parsing was.
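For comparison, here is a rough sketch of what the validated version might look like: every length is checked against the number of bytes actually received before anything is read. The Credentials struct and parse_auth function are hypothetical names, and the sketch assumes the whole request arrived in a single recv.

```cpp
#include <cstddef>
#include <optional>
#include <string>

struct Credentials {
    std::string id;
    std::string pw;
};

// Hypothetical helper: parse an RFC 1929 request from `len` received bytes.
// Returns std::nullopt if a length field points past the received data.
std::optional<Credentials> parse_auth(const unsigned char* data, std::size_t len) {
    if (len < 2) return std::nullopt;                     // version + id length
    std::size_t id_len = data[1];
    if (len < 2 + id_len + 1) return std::nullopt;        // id + password length
    std::size_t pw_len = data[2 + id_len];
    if (len < 3 + id_len + pw_len) return std::nullopt;   // password

    const char* base = reinterpret_cast<const char*>(data);
    return Credentials{
        std::string(base + 2, id_len),
        std::string(base + 3 + id_len, pw_len),
    };
}
```

A client that claims a 255-byte username but sends three bytes gets std::nullopt back instead of an out-of-bounds read.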

The Final Commit: CONNECT and Relay

This is the step where the project stops being a parser and starts being a proxy.

The client sends a CONNECT request: “open a TCP connection to this host on this port, and relay my bytes through it.” The address can be an IPv4 address or a domain name (RFC 1928 also defines an IPv6 type, 0x04, which this code doesn’t handle). The port is two bytes, big-endian.

if (atyp == 0x01) {
    char addr[INET_ADDRSTRLEN];
    inet_ntop(AF_INET, &buffer[4], addr, sizeof(addr));
    host = std::string(addr);

    unsigned char p1 = static_cast<unsigned char>(buffer[8]);
    unsigned char p2 = static_cast<unsigned char>(buffer[9]);
    port = (p1 << 8) | p2;
}
else if (atyp == 0x03) {
    unsigned char domain_len = static_cast<unsigned char>(buffer[4]);
    host = std::string(&buffer[5], domain_len);

    unsigned char p1 = static_cast<unsigned char>(buffer[5 + domain_len]);
    unsigned char p2 = static_cast<unsigned char>(buffer[6 + domain_len]);
    port = (p1 << 8) | p2;
}

A few things stand out.

inet_ntop is the modern, IPv4/IPv6-friendly replacement for inet_ntoa. It takes a binary address and writes a string representation into a buffer you own. No static buffers, no thread-safety footguns.

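The port decode is unglamorous and important. Two bytes, high byte first. Cast each to unsigned char so integer promotion doesn’t sign-extend it, then (p1 << 8) | p2. Forget the cast and any byte with the top bit set turns negative: the low byte 0xBB of port 443 (0x01BB) smears ones over the high byte, and the port becomes nonsense.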
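The effect is easy to reproduce in isolation. In this sketch both function names are hypothetical; the second one deliberately takes signed bytes to show the failure mode, and uses multiplication in place of the shift so that a negative high byte stays well-defined on older standards.

```cpp
#include <cstdint>

// Decode a big-endian port from its two wire bytes.
std::uint16_t decode_port(unsigned char hi, unsigned char lo) {
    return static_cast<std::uint16_t>((hi << 8) | lo);
}

// The same arithmetic with signed bytes, for comparison only.
// A negative `lo` is sign-extended to int and overwrites the high byte.
std::uint16_t decode_port_signed(signed char hi, signed char lo) {
    return static_cast<std::uint16_t>((hi * 256) | lo);
}
```

With the bytes 0x01 and 0xBB, decode_port gives 443. The signed variant gives 65467 (0xFFBB), because the low byte is promoted to 0xFFFFFFBB before the OR.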

Once I had the host and port, DNS resolution was straightforward:

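addrinfo hints{};
hints.ai_family = AF_INET;
hints.ai_socktype = SOCK_STREAM;

// port_str is the port in decimal, e.g. std::to_string(port).
addrinfo *result = nullptr;
int gai = getaddrinfo(host.c_str(), port_str.c_str(), &hints, &result);
if (gai != 0) {
    // result is not valid on failure; bail out before dereferencing it.
    // Assumes the enclosing function returns a socket or -1.
    std::cerr << "getaddrinfo: " << gai_strerror(gai) << std::endl;
    return -1;
}

// Use the first entry in the list.
int remote_fd = socket(result->ai_family, result->ai_socktype, result->ai_protocol);
connect(remote_fd, result->ai_addr, result->ai_addrlen);
freeaddrinfo(result);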

getaddrinfo is one of those POSIX functions I’d forgotten I knew. It takes a host string and a port string, does the DNS work, and hands you a linked list of addrinfo structs you can hand straight to socket() and connect(). The matching freeaddrinfo is the only thing standing between you and a leak.
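One way to make the leak harder to write is to hand the list to a std::unique_ptr with a custom deleter, then walk ai_next. The sketch below does that and returns the resolved addresses as text. The names resolve_ipv4 and AddrInfoDeleter are hypothetical, and the function handles IPv4 only, to match the hints used above.

```cpp
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <memory>
#include <string>
#include <vector>

// Hypothetical deleter so unique_ptr releases the list with freeaddrinfo.
struct AddrInfoDeleter {
    void operator()(addrinfo* p) const { freeaddrinfo(p); }
};

// Hypothetical helper: resolve `host` to printable IPv4 addresses.
// Returns an empty vector if resolution fails.
std::vector<std::string> resolve_ipv4(const std::string& host, const std::string& port) {
    addrinfo hints{};
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_STREAM;

    addrinfo* raw = nullptr;
    if (getaddrinfo(host.c_str(), port.c_str(), &hints, &raw) != 0) return {};

    // From here on, the list is freed on every exit path.
    std::unique_ptr<addrinfo, AddrInfoDeleter> list(raw);

    std::vector<std::string> out;
    for (const addrinfo* ai = list.get(); ai != nullptr; ai = ai->ai_next) {
        const auto* sin = reinterpret_cast<const sockaddr_in*>(ai->ai_addr);
        char text[INET_ADDRSTRLEN];
        if (inet_ntop(AF_INET, &sin->sin_addr, text, sizeof(text)) != nullptr) {
            out.emplace_back(text);
        }
    }
    return out;
}
```

A numeric host such as "127.0.0.1" resolves without touching DNS, which makes the function easy to try offline.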

Then comes the relay. The proxy now holds two sockets: one to the client, one to the remote server. It needs to copy bytes between them in both directions, simultaneously, until one side closes.

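void relay(int from_fd, int to_fd) {
    char buffer[4096];

    while (true) {
        ssize_t n = recv(from_fd, buffer, sizeof(buffer), 0);
        if (n <= 0) break;  // 0 means the peer closed, negative means error

        // send() may accept fewer bytes than asked; loop until all are written.
        ssize_t sent = 0;
        while (sent < n) {
            ssize_t m = send(to_fd, buffer + sent, n - sent, 0);
            if (m <= 0) break;
            sent += m;
        }
        if (sent < n) break;  // the write side failed; stop relaying
    }

    shutdown(to_fd, SHUT_WR);
}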

// ...

std::thread t1(relay, client_fd, remote_fd);
std::thread t2(relay, remote_fd, client_fd);
t1.join();
t2.join();

Two threads. One per direction. Each reads from one socket and writes to the other until the read side closes. shutdown(to_fd, SHUT_WR) is what signals “I’m done sending” to the peer. The peer then finishes and closes its own side, so the other thread eventually sees recv return zero and exits cleanly.

std::thread is shockingly pleasant for someone coming from C. You pass the function, you pass the arguments, the constructor spawns the thread. join() waits for it. RAII isn’t quite there (a thread that goes out of scope without being joined will std::terminate you), but it’s a long way from pthread_create.
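C++20 added std::jthread, which joins in its destructor and closes that gap where the standard library ships it. The same idea can be sketched by hand in a few lines. JoiningThread is a hypothetical name, and this is a bare-bones illustration, not a replacement for std::jthread.

```cpp
#include <thread>
#include <utility>

// Hypothetical wrapper: a thread that joins itself when it leaves scope.
class JoiningThread {
public:
    template <typename Fn, typename... Args>
    explicit JoiningThread(Fn&& fn, Args&&... args)
        : thread_(std::forward<Fn>(fn), std::forward<Args>(args)...) {}

    // Not copyable: exactly one owner is responsible for the join.
    JoiningThread(const JoiningThread&) = delete;
    JoiningThread& operator=(const JoiningThread&) = delete;

    ~JoiningThread() {
        if (thread_.joinable()) thread_.join();  // wait instead of std::terminate
    }

private:
    std::thread thread_;
};
```

With a wrapper like this, the two relay threads could be declared one after the other and the explicit join() calls dropped: leaving the scope waits for both.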

The first time I ran the whole thing and the Node.js client printed back the IP address from api.ipify.org, I had to read the output twice to convince myself it was real. A C++ binary I wrote in a weekend was carrying real HTTPS traffic, in both directions, between two sockets I’d built by hand.

That feeling never gets old.

What C++ Felt Like in 2026

A few things surprised me about coming back.

Modern C++ is genuinely nicer. Aggregate initialisation, std::string constructors that take pointer-and-length, std::thread, and static_cast everywhere instead of C-style casts. The language has accumulated good ergonomics. You can write recognisably modern code without dragging in templates or metaprogramming.

The POSIX socket API hasn’t moved. Everything below std::thread is the same API I used ten years ago. That’s a feature. The kernel didn’t change. The syscalls didn’t change. The skill transferred straight across.

RFCs are a great way to slow down. Binary protocols force you to read carefully. Every byte has a meaning. Every offset matters. There’s no library hiding the wire format from you. The work is the wire format.

What I’d Tell Past-Me

If I’d known ten years ago that I’d come back to C++ this way, I think I would’ve been less precious about the languages I picked up in between.

Every language I learned along the way made this one easier. Go taught me to think about concurrency in terms of independent goroutines, which mapped neatly onto two relay threads. TypeScript taught me that binary parsing is its own discipline and deserves named functions. Python taught me to keep the test harness in a different language so I couldn’t cheat. C#, ten years ago, taught me what a state machine looks like when you draw it out before writing code.

None of that was wasted. The language is the surface. The thinking is the work. Pick a problem you know. Pick a language you don’t. Let the friction teach you.

This one’s done. I’m already thinking about which language is next.