A deep dive into execution time, heap allocation, and memory usage when printing 2,000,000 numbers to the console.

If you have ever argued with a fellow developer about which language is "fastest," you know how quickly that conversation turns into religion. Everyone has an opinion. Rust fans swear by zero-cost abstractions. Python developers point to productivity. Node.js engineers talk about its non-blocking I/O model. So I decided to settle it — at least for one very specific, very revealing workload — with actual numbers.
I wrote a program in all three languages that does one simple thing: count from 1 to 2,000,000 and print every single number to the console. Then I measured execution time, heap/data memory used, and total Resident Set Size (RSS — the actual RAM consumed by the process). The results tell a story that goes far beyond raw speed.
🔗 All source code is available on GitHub: github.com/lordscoba/rust-node-python-performance
Why This Benchmark Matters
Counting and printing is not a glamorous task. It involves no machine learning, no database queries, no HTTP servers. But that is exactly what makes it useful. This benchmark isolates one of the most fundamental bottlenecks in software: standard output (I/O). Every language eventually has to talk to your operating system and terminal, and how it does so reveals deep architectural truths about its design.
The Results at a Glance
| Metric | Node.js (Streaming) | Python (Mega-String) | Rust (Buffered) |
|---|---|---|---|
| Total Time | 8376.00 ms | 3486.86 ms | 3060.21 ms |
| Heap / Data Used | 15.00 MB | 14.20 MB | 9.66 KB |
| Peak RSS (Total RAM) | 71.42 MB | 164.72 MB | ~5.00 MB |
Those numbers deserve a careful read. Rust finished in just over 3 seconds using less than 10 kilobytes of heap memory. Python came surprisingly close in speed but consumed over 164 MB of RAM. Node.js was the slowest at over 8 seconds — and the reason why is one of the most instructive parts of this whole experiment.
Rust: The Efficiency King
Rust won on every meaningful metric. It was the fastest, and it used so little memory that the heap number looks like a typo — 9.66 KB while the other two were measuring in megabytes.
The secret is BufWriter. Instead of writing each number to the terminal one at a time, Rust fills an internal 8 KB buffer and flushes it to the OS in one system call. It never holds more than a handful of numbers in RAM at any moment. The result is a process that is both blindingly efficient and extraordinarily lean.
This is the essence of Rust's philosophy: you get exactly what you ask for, nothing more. No garbage collector is running in the background, no virtual machine warming up, no runtime overhead. The compiled binary speaks directly to the operating system.
```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::io::{self, Write, BufWriter};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::time::Instant;

// --- MEMORY TRACKER START ---
struct TrackingAllocator;

static PEAK_ALLOCATED: AtomicUsize = AtomicUsize::new(0);
static CURRENT_ALLOCATED: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for TrackingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = System.alloc(layout);
        if !ptr.is_null() {
            let size = layout.size();
            let current = CURRENT_ALLOCATED.fetch_add(size, Ordering::SeqCst) + size;
            let _ = PEAK_ALLOCATED.fetch_max(current, Ordering::SeqCst);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout);
        CURRENT_ALLOCATED.fetch_sub(layout.size(), Ordering::SeqCst);
    }
}

#[global_allocator]
static GLOBAL: TrackingAllocator = TrackingAllocator;
// --- MEMORY TRACKER END ---

fn main() -> io::Result<()> {
    let start = Instant::now();
    let stdout = io::stdout();
    let lock = stdout.lock();
    // BufWriter has a default capacity (usually 8 KB)
    let mut writer = BufWriter::new(lock);

    for i in 1..=2000000 {
        writeln!(writer, "{}", i)?;
    }
    writer.flush()?;

    let duration = start.elapsed();
    let peak_mem = PEAK_ALLOCATED.load(Ordering::SeqCst);

    println!("\n--- Execution Stats Rust ---");
    println!("Total time: {:.4} ms", duration.as_secs_f64() * 1000.0);
    println!("Peak Memory: {:.2} KB", peak_mem as f64 / 1024.0);
    Ok(())
}
```
Key insight: Compile with cargo build --release for full compiler optimizations. The difference between debug and release builds in Rust is dramatic.
Verdict: If you need maximum performance with minimal resource consumption — systems programming, embedded environments, high-throughput servers — Rust is the undisputed champion.
Python: The Memory-for-Speed Trade-off
Python's result is the most counterintuitive of the three. It nearly matched Rust in speed (3.4 seconds vs 3.0 seconds), which might make you think Python is nearly as fast as Rust. It is not. What Python did here was a clever trick — and it came with a high cost.
The strategy was to pre-calculate every number as a string, store all 2,000,000 of them in a list, join them into one enormous string using "".join(lines), and hand that single giant string to the OS in one shot. From the OS's perspective, this looks like one big write operation — fast and efficient.
But look at the RSS: 164.72 MB. Python had to hold the entire output in memory before writing a single byte to the terminal. This "Mega-String" strategy trades RAM for speed. On a machine with plenty of memory, it works. In a resource-constrained environment — a Docker container with a memory limit, a Raspberry Pi, a shared server — this approach could become a serious problem.
```python
import time
import sys
import os

try:
    import resource
except ImportError:
    resource = None

def get_peak_rss_mb():
    if not resource:
        return None
    usage = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux = KB, macOS = bytes
    if sys.platform == "darwin":
        return usage / (1024 * 1024)
    else:
        return usage / 1024

start_time = time.perf_counter()

# Creates a list of strings in memory
lines = [f"{i}\n" for i in range(1, 2_000_001)]

# join is O(n) and much faster than repeated string concatenation
output = "".join(lines)

# Bypass the slower print() overhead
sys.stdout.write(output)

end_time = time.perf_counter()

print("\n--- Python Execution Stats ---")
print(f"Total time: {(end_time - start_time) * 1000:.4f} ms")
print(f"Final String Size: {sys.getsizeof(output) / 1024 / 1024:.2f} MB")

rss = get_peak_rss_mb()
if rss:
    print(f"Peak RSS: {rss:.2f} MB")
else:
    print("RSS Stats: Not available (Windows requires psutil)")
```
Key insight: Using sys.stdout.write instead of print() eliminates per-call overhead. The "".join() pattern is also significantly faster than building strings with += in a loop.
Verdict: Python's speed here is real, but it is borrowed from your RAM. If your server has 512 MB and you are running multiple Python workers doing similar things, you can run out of memory fast.
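If memory is the constraint, there is a middle ground between the mega-string and per-line writes: build the output in fixed-size chunks and flush each chunk as it fills. The sketch below is not from the benchmark repo; the function name and the 100,000-line chunk size are arbitrary choices for illustration. Peak memory stays proportional to one chunk instead of the entire output.

```python
import sys

def print_range_chunked(n, chunk_size=100_000, out=sys.stdout):
    """Write 1..n, one number per line, flushing in fixed-size chunks.

    Peak memory is bounded by one chunk of short strings instead of
    the full 2,000,000-line mega-string.
    """
    chunk = []
    for i in range(1, n + 1):
        chunk.append(f"{i}\n")
        if len(chunk) == chunk_size:
            out.write("".join(chunk))  # one large write per chunk
            chunk.clear()
    if chunk:                          # flush the final partial chunk
        out.write("".join(chunk))

# Small demo: write 1..10 into an in-memory buffer in chunks of 3
import io
buf = io.StringIO()
print_range_chunked(10, chunk_size=3, out=buf)
```

This is essentially a hand-rolled version of what Rust's BufWriter does automatically, except counted in lines rather than bytes.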
Node.js: The Streaming Paradox
Node.js took the longest — 8.3 seconds — and that number deserves an explanation rather than a dismissal. Node.js was not failing here. It was doing exactly what it was designed to do, and that design choice reveals something fundamental about the Node.js runtime.
When you call stream.write() in Node.js, it does not blindly dump data into the terminal as fast as possible. It participates in a protocol called backpressure. Before each write, Node.js checks whether the receiving end (the terminal, a file, a network socket) is ready to accept more data. If the terminal is busy rendering the previous batch of numbers, Node.js waits. This prevents your application from building up a massive, unbounded buffer in memory while the terminal struggles to keep up.
This is not a bug. It is the safety valve that makes Node.js suitable for production workloads — real-time data streams, log pipelines, file processors — where blasting data faster than the receiver can handle it would cause crashes or memory exhaustion.
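Backpressure is not specific to Node.js; the underlying idea is just a bounded buffer between a fast producer and a slow consumer. A minimal language-agnostic sketch (in Python, using the standard library's queue.Queue, not Node's actual stream internals) looks like this: when the buffer is full, the producer blocks instead of allocating more memory.

```python
import queue
import threading

# A bounded queue models backpressure: when the consumer (the "terminal")
# falls behind, the producer blocks on put() instead of buffering unboundedly.
buf = queue.Queue(maxsize=64)   # small fixed capacity, like a write buffer
received = []

def consumer():
    while True:
        item = buf.get()
        if item is None:        # sentinel: producer is done
            break
        received.append(item)

t = threading.Thread(target=consumer)
t.start()

for i in range(1, 1001):
    buf.put(f"{i}\n")           # blocks whenever the queue is full
buf.put(None)
t.join()
```

The producer's memory footprint is capped at 64 queued lines no matter how slow the consumer is, which is exactly the guarantee Node.js streams provide.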
The tradeoff is visible in the numbers: 8376 ms, but only 71.42 MB RSS — far leaner than Python's 164 MB.
```javascript
const start = performance.now();

// 1. Capture initial memory state
const initialMem = process.memoryUsage();

// Use process.stdout directly as our stream
const stream = process.stdout;

for (let i = 1; i <= 2000000; i++) {
  // .write returns 'false' if the buffer is full — this is "backpressure"
  stream.write(i + "\n");
}

// 2. Capture memory state immediately after the loop
const peakMem = process.memoryUsage();
const end = performance.now();

// Helper to convert bytes to Megabytes
const toMB = (bytes) => (bytes / 1024 / 1024).toFixed(2);

console.log(`\n--- Node.js Streaming Stats ---`);
console.log(`Total time: ${(end - start).toFixed(4)} ms`);
console.log(`Heap Increase: ${toMB(peakMem.heapUsed - initialMem.heapUsed)} MB`);
console.log(`Total (RSS): ${toMB(peakMem.rss)} MB`);
```
Key insight: Use process.stdout.write instead of console.log. The latter adds newline logic and string formatting overhead on every single call — across 2 million iterations, that adds up.
Verdict: Node.js is the right tool for long-running streaming processes where memory safety matters more than raw throughput. Its slowness here is a feature in disguise.
Three Key Takeaways
1. The Real Bottleneck Is the Terminal, Not Your Code
In all three languages, the CPU finished all the arithmetic in milliseconds. The remaining 3 to 8 seconds were spent waiting for the terminal to render text. This is called the I/O bottleneck, and it is the dominant constraint in this kind of workload. Your choice of language matters far less than your I/O strategy.
2. Streaming vs. Batching Is an Architectural Decision
Batching (Python): Fast execution, but forces huge memory allocation upfront. Works great when you have RAM to spare.
Streaming (Node.js): Slower wall-clock time, but memory stays predictable and bounded. Essential for long-running services.
Buffered I/O (Rust): Gets the best of both worlds — a small, fixed-size buffer means low memory, while batched system calls mean speed.
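The three strategies can be sketched side by side in one language for comparison. This is an illustrative sketch, not the benchmark code: the helper names are invented, the count is reduced to 100,000, and the output goes to in-memory buffers rather than a real terminal. All three produce identical output; only their memory and write-call profiles differ.

```python
import io

N = 100_000  # smaller than the benchmark's 2,000,000, for a quick demo

def batching(out):            # Python's mega-string: all RAM up front, one write
    out.write("".join(f"{i}\n" for i in range(1, N + 1)))

def streaming(out):           # Node-style: one write per line, memory stays flat
    for i in range(1, N + 1):
        out.write(f"{i}\n")

def buffered(out, cap=8192):  # Rust's BufWriter idea: fixed buffer, flushed when full
    buf, size = [], 0
    for i in range(1, N + 1):
        s = f"{i}\n"
        buf.append(s)
        size += len(s)
        if size >= cap:
            out.write("".join(buf))
            buf.clear()
            size = 0
    if buf:
        out.write("".join(buf))

# Run all three against in-memory sinks to compare their output
sinks = [io.StringIO() for _ in range(3)]
batching(sinks[0])
streaming(sinks[1])
buffered(sinks[2])
```

Against an in-memory sink the three are close in speed; against a real terminal or socket, the number of underlying write calls (1 vs 100,000 vs roughly 85) is what separates them.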
3. Heap Usage vs. RSS: They Are Not the Same Thing
One of the most common misconceptions in performance profiling is equating heap usage with total memory consumption. The RSS numbers in this benchmark show why that is wrong. Node.js reports 15 MB of heap increase — but the total RSS is 71 MB. Where did the other 56 MB go? The V8 JavaScript engine, the event loop, the compiled bytecode, the garbage collector — all of that lives in RAM too, even if your code never touched it. Python's interpreter is similarly "expensive." Rust, with no runtime overhead, shows just how lean a language can be when it compiles directly to machine code.
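You can see the gap between "your" heap data and total RSS directly. The sketch below (Unix-only for the RSS part, since the resource module is unavailable on Windows) allocates roughly 1 MB of string data and compares its object size with the process's peak RSS, which includes the interpreter itself.

```python
import sys

# sys.getsizeof reports only the object's own heap footprint; the
# interpreter's baseline (bytecode, GC structures, C runtime) is
# invisible to it but still counts toward RSS.
data = "x" * 1_000_000          # ~1 MB of "your" data on the heap
heap_mb = sys.getsizeof(data) / (1024 * 1024)

try:
    import resource
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is KB on Linux, bytes on macOS
    rss_mb = peak / (1024 * 1024) if sys.platform == "darwin" else peak / 1024
except ImportError:             # e.g. Windows without psutil
    rss_mb = None
```

On a typical Linux box, heap_mb comes out just under 1 MB while rss_mb is several times larger: the difference is the interpreter's own residency, the same gap the benchmark exposed in Node.js and Python.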
How to Replicate This Benchmark
Rust: Clone the repo, then run cargo build --release followed by ./target/release/your_binary. Always use the release flag — the difference is massive.
Node.js: Run node benchmark.js. Make sure you are using process.stdout.write and not console.log.
Python: Run python3 benchmark.py. On Linux, the resource module is available natively. On Windows, install psutil for RSS tracking.
All three scripts are in the repository: github.com/lordscoba/rust-node-python-performance
Final Thoughts: Which Language Should You Choose?
The honest answer is that this benchmark does not tell you which language to build your next project in. It tells you something more useful: every runtime makes tradeoffs, and understanding those tradeoffs makes you a better engineer.
Rust's performance is real, but it comes with a steep learning curve and a slower development cycle. Python's speed trick here was borrowed from RAM and would not scale in a memory-constrained environment. Node.js's "slowness" is actually a safety guarantee that makes it well-suited for production streaming workloads.
The best language is the one whose tradeoffs align with your specific constraints — hardware, team expertise, deployment environment, and workload characteristics.
And if you want to see what happens when the CPU actually has to work — say, computing prime numbers instead of just printing integers — that is a benchmark worth running next. Because when you take the terminal out of the equation and make the CPU sweat, the gap between these three languages widens considerably.
Have you run similar benchmarks? What were your results? Drop a comment or open an issue on the GitHub repo — I'd love to see your numbers.


