Mastering AWS Security Groups vs NACLs the Right Way

Arjun, a new Cloud Security Engineer, just joined a fast-growing fintech startup. His job? To secure their AWS infrastructure.

One morning, Arjun’s manager dropped a bomb.

“We’re going live with our EC2-powered web app next week. Set up the VPC-level security. I want airtight protection — no open doors!”

Arjun smiled. “Time to meet the real gatekeepers of the AWS network – Security Groups and NACLs.”

Let’s walk through his journey — and while we’re at it, let’s understand how these two security layers actually work.


🛡️ Scene 1: The Two Types of Guards

Imagine your EC2 instance is a VIP in a secure mansion.

  • Security Group (SG) is the personal bodyguard at the VIP’s door.
  • NACL (Network Access Control List) is the fence guard who controls traffic entering the mansion itself (the subnet).

🧠 Quick Definitions:

| Feature | Security Group | NACL |
| --- | --- | --- |
| Level | EC2 Instance | Subnet |
| Rules | Allow only | Allow & Deny |
| Stateful? | ✅ Yes | ❌ No |
| Applies to | Only attached resources | All resources in subnet |
| Rule Evaluation | All rules evaluated | First match wins (rule order matters) |

🧭 Scene 2: The Incoming Guest

Arjun sets up a custom VPC, deploys a web server EC2 in a public subnet, and waits.

A user sends a request to the server.

🛣️ Traffic path:

  1. 🚧 Hits the NACL first (like a checkpoint at the mansion gate).
  2. ✅ If allowed → reaches the subnet.
  3. 🕵️ Then hits the Security Group attached to EC2 (like the bodyguard at the door).
  4. ✅ If allowed → request reaches the server.

Now comes the twist.

✅ Security Group is Stateful

This means: If incoming traffic is allowed, the outgoing response is auto-allowed — no need to add extra outbound rules.

Arjun smiled, “Cool! I don’t have to worry about the response — SG remembers the incoming request!”

❌ NACL is Stateless

This means: For the response to leave, outbound rules in the NACL must also allow it. NACL doesn’t “remember” the initial request.

“Ah! NACL is like that guard who needs a fresh permission every time — even for return traffic.”
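The difference between the two guards can be sketched as a toy model. This is only an illustration of the stateful/stateless idea, not how AWS implements it; the class names, IP addresses, and port numbers below are made up for the example, and the Security Group model deliberately ignores real outbound rules to keep the focus on connection tracking.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Packet:
    src: str
    src_port: int
    dst: str
    dst_port: int

    def reply(self) -> "Packet":
        """The response packet: source and destination swapped."""
        return Packet(self.dst, self.dst_port, self.src, self.src_port)


@dataclass
class SecurityGroupModel:
    """Stateful: remembers allowed inbound requests (hypothetical toy model)."""
    inbound_ports: set
    tracked: set = field(default_factory=set)

    def allow_inbound(self, p: Packet) -> bool:
        if p.dst_port in self.inbound_ports:
            self.tracked.add(p)  # remember the connection
            return True
        return False

    def allow_outbound(self, p: Packet) -> bool:
        # A reply to a tracked request passes without any outbound rule.
        return p.reply() in self.tracked


@dataclass
class NaclModel:
    """Stateless: every packet is judged only against the rules for its direction."""
    inbound_ports: set
    outbound_ports: set

    def allow_inbound(self, p: Packet) -> bool:
        return p.dst_port in self.inbound_ports

    def allow_outbound(self, p: Packet) -> bool:
        return p.dst_port in self.outbound_ports


request = Packet("203.0.113.10", 50123, "10.0.1.5", 443)
response = request.reply()

sg = SecurityGroupModel(inbound_ports={443})
nacl = NaclModel(inbound_ports={443}, outbound_ports=set())

sg_in, sg_out = sg.allow_inbound(request), sg.allow_outbound(response)
nacl_in, nacl_out = nacl.allow_inbound(request), nacl.allow_outbound(response)
print("SG:  ", sg_in, sg_out)      # request allowed, reply allowed automatically
print("NACL:", nacl_in, nacl_out)  # request allowed, reply blocked

# Adding an outbound rule for the client's port range lets the reply leave.
nacl.outbound_ports = set(range(1024, 65536))
nacl_out_fixed = nacl.allow_outbound(response)
print("NACL after fix:", nacl_out_fixed)
```

In this sketch the NACL drops the reply until an outbound rule covers the client's port, while the Security Group lets it through with no outbound rule at all.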


🌍 Scene 3: The Outgoing Request

Now, the EC2 instance tries to connect to www.google.com.

Here’s how Arjun explains the flow:

  1. ✅ Security Group checks outbound rule – “Can this EC2 go out to the internet?”
  2. ✅ If allowed → NACL outbound rule is checked – “Can the subnet allow this?”
  3. 🌐 Request hits Google.
  4. ✅ NACL inbound rule – Must allow the response back in.
  5. ✅ Security Group inbound rule is not checked for the response, because the SG is stateful and remembers the outbound request.

Boom. Request goes out and comes back successfully.
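The five steps above can be traced with a small function. It is a simplified sketch: the function name and the use of `range` objects as "allowed ports" are assumptions made for illustration, and real rules also match on protocol and CIDR.

```python
def outbound_round_trip(dst_port, ephemeral_port, sg_outbound, nacl_outbound, nacl_inbound):
    """Trace the checks for a request the instance initiates, plus its response.

    Each *_outbound / *_inbound argument is a container of allowed ports.
    Returns (success, [(check_name, allowed), ...]).
    """
    steps = []

    def check(name, allowed):
        steps.append((name, allowed))
        return allowed

    ok = (
        check("SG outbound", dst_port in sg_outbound)
        and check("NACL outbound", dst_port in nacl_outbound)
        # The response returns to the client's ephemeral port.
        and check("NACL inbound (response)", ephemeral_port in nacl_inbound)
        # No SG inbound check here: the SG tracks the connection.
    )
    return ok, steps


ALL_PORTS = range(0, 65536)

# NACL inbound allows the high port range: the round trip succeeds.
ok, steps = outbound_round_trip(443, 50123, ALL_PORTS, ALL_PORTS, range(1024, 65536))
print(ok, steps)

# NACL inbound allows only 443: the response to port 50123 is dropped.
blocked, blocked_steps = outbound_round_trip(443, 50123, ALL_PORTS, ALL_PORTS, range(443, 444))
print(blocked, blocked_steps)
```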


🔥 Scene 4: The Hidden Gate — Ephemeral Ports and NACL Confusion

During a VPC traffic inspection, Arjun hit a strange issue. His web server in the private subnet could send traffic to the RDS database, but responses weren’t coming back.

He checked the Security Group — ✅ all good.
He checked the NACL — ❌ something was off.


🧩 What’s Going On?

The traffic to the database on port 3306 (MySQL) was working fine. But the return traffic from RDS to the EC2 instance was getting blocked by the Network ACL.

That’s when it clicked:

“Oh! The response from the DB is coming back to a random high-numbered port on the EC2 instance. I forgot to allow that!”


🧠 What Are Ephemeral Ports?

Ephemeral ports are temporary, short-lived ports automatically assigned by a client’s operating system when it initiates a connection to a server.

They are used for outbound communication, and the server replies to that specific ephemeral port, not to a fixed one.


📌 Here’s What Happens Under the Hood:

  1. The client (EC2 web server) initiates a connection to a server (RDS).
  2. It randomly picks an ephemeral port, say TCP 50123, to open the connection.
  3. The server (RDS) responds to this port 50123.
  4. If your NACL doesn’t allow inbound traffic to that port, the response will be blocked.

Ephemeral ports are client-side ports. You never define them manually, but your firewall (or NACL) must allow them for the return traffic to work.
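You can watch the operating system hand out an ephemeral port on your own machine. The sketch below uses only Python's standard `socket` module and a throwaway server on localhost; nothing in it touches AWS, and the port numbers you see will differ on every run.

```python
import socket

# A throwaway TCP server on localhost. Binding to port 0 asks the OS for any
# free port, which keeps the example from clashing with real services.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
server_port = server.getsockname()[1]

# The client never chooses its own source port; the OS assigns one.
client = socket.create_connection(("127.0.0.1", server_port))
client_ip, client_port = client.getsockname()

# The server sees that same port as the peer address it must reply to.
conn, peer = server.accept()

print(f"server listening on port {server_port}")
print(f"client connected from ephemeral port {client_port}")
print(f"server will reply to {peer}")

conn.close()
client.close()
server.close()
```

The port printed for the client is the one a stateless NACL has to let back in.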


🎯 Why Do We Need to Allow Ephemeral Ports in NACL?

Because NACLs are stateless.

Unlike Security Groups, which remember the connection and allow return traffic automatically, NACLs treat each direction (inbound/outbound) as separate rules.

So, you must:

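  • Allow outbound traffic from client to server (e.g., port 3306).
  • AND allow inbound traffic from server to ephemeral ports (e.g., 1024–65535, which covers every common OS range).

If the database sits in a different subnet with its own NACL, that NACL needs the mirror image: inbound 3306 from the app subnet, and outbound to the app subnet’s ephemeral ports.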

🔎 OS-Specific Ephemeral Port Ranges

| Operating System | Default Ephemeral Port Range |
| --- | --- |
| Linux | 32768–60999 (can be changed) |
| Windows | 49152–65535 |
| macOS | 49152–65535 |
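Because the ranges differ per OS, a NACL rule that looks reasonable can still miss some clients. Here is a minimal sketch that checks a candidate port range against the defaults in the table above; the helper name `uncovered` is made up for this example.

```python
# Default ephemeral ranges from the table above (inclusive bounds).
OS_EPHEMERAL = {
    "Linux": (32768, 60999),
    "Windows": (49152, 65535),
    "macOS": (49152, 65535),
}


def uncovered(nacl_range):
    """Return the OS names whose default ephemeral range is not fully inside nacl_range."""
    low, high = nacl_range
    return [
        name
        for name, (os_low, os_high) in OS_EPHEMERAL.items()
        if os_low < low or os_high > high
    ]


print(uncovered((49152, 65535)))  # ['Linux']: Linux clients may pick ports below 49152
print(uncovered((32768, 65535)))  # []
print(uncovered((1024, 65535)))   # []
```

A rule for 49152–65535 alone would intermittently drop responses to Linux clients, which is why a wider range such as 1024–65535 is the common choice.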

🧱 Scene 5 (Deep Dive): Default vs Custom NACLs – Arjun’s Realization

After setting up a secure EC2 environment, Arjun turned his attention to subnet-level security. His company had multiple subnets:

  • public-subnet-1a – for Load Balancers
  • private-app-subnet-1a – for EC2 app servers
  • private-db-subnet-1a – for RDS and backend systems

He ran this command in the terminal:

aws ec2 describe-network-acls --region ap-south-1

And noticed something odd.

The default NACL was attached to all subnets — and it had these rules:

🟢 Inbound Rules:

| Rule # | Protocol | Port Range | Source | Allow/Deny |
| --- | --- | --- | --- | --- |
| 100 | ALL | ALL | 0.0.0.0/0 | ALLOW |

🟢 Outbound Rules:

| Rule # | Protocol | Port Range | Destination | Allow/Deny |
| --- | --- | --- | --- | --- |
| 100 | ALL | ALL | 0.0.0.0/0 | ALLOW |
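Spotting these allow-all entries by eye gets tedious with many NACLs, so a small script can flag them. The sketch below parses JSON in the shape returned by `aws ec2 describe-network-acls --output json`. The sample data, including the ACL and subnet IDs, is invented for illustration; in practice you would load the real command output instead.

```python
import json

# Hypothetical sample in the shape of `describe-network-acls` JSON output.
# Protocol "-1" means all protocols; rule 32767 is how the API reports the "*" rule.
SAMPLE_OUTPUT = """
{"NetworkAcls": [{
  "NetworkAclId": "acl-0example",
  "IsDefault": true,
  "Associations": [{"SubnetId": "subnet-0aaa"}, {"SubnetId": "subnet-0bbb"}],
  "Entries": [
    {"RuleNumber": 100, "Protocol": "-1", "RuleAction": "allow", "Egress": false, "CidrBlock": "0.0.0.0/0"},
    {"RuleNumber": 32767, "Protocol": "-1", "RuleAction": "deny", "Egress": false, "CidrBlock": "0.0.0.0/0"},
    {"RuleNumber": 100, "Protocol": "-1", "RuleAction": "allow", "Egress": true, "CidrBlock": "0.0.0.0/0"},
    {"RuleNumber": 32767, "Protocol": "-1", "RuleAction": "deny", "Egress": true, "CidrBlock": "0.0.0.0/0"}
  ]
}]}
"""


def wide_open_entries(acl):
    """Entries that allow every protocol from or to every IPv4 address."""
    return [
        entry
        for entry in acl["Entries"]
        if entry["RuleAction"] == "allow"
        and entry["Protocol"] == "-1"
        and entry.get("CidrBlock") == "0.0.0.0/0"
    ]


findings = []
for acl in json.loads(SAMPLE_OUTPUT)["NetworkAcls"]:
    for entry in wide_open_entries(acl):
        direction = "outbound" if entry["Egress"] else "inbound"
        findings.append((acl["NetworkAclId"], direction, entry["RuleNumber"]))

for acl_id, direction, rule_number in findings:
    print(f"{acl_id}: rule {rule_number} allows ALL {direction} traffic")
```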

😲 Why Does AWS Allow Everything by Default?

AWS is developer-friendly by design. The default NACL:

  • Is meant for ease of testing.
  • Ensures new subnets don’t get accidentally blocked.
  • Simplifies setup for beginners using the console.

Arjun realized: “This is great for learning, but a disaster in production!”


❌ The Hidden Risks of Default NACL in Production

Arjun created a checklist of why relying on the default NACL is a security anti-pattern:

| Risk | Description |
| --- | --- |
| 🔓 Everything Allowed | Every port, every IP, every protocol is open. |
| 🕳️ No IP filtering | You can’t restrict malicious IP ranges. |
| 🧱 No granular control | Can’t separate rules by function or environment. |
| 😵 No visibility | Hard to track what traffic is intended vs attack. |
| 🧪 Test contamination | Default NACL may allow unintentional traffic during testing. |

🛠️ Arjun’s Plan: Create Custom NACLs for Each Subnet

Why Create Custom NACLs?

| Purpose | Custom NACL Strategy |
| --- | --- |
| App Tier | Allow only HTTP/HTTPS from ELB subnet |
| DB Tier | Allow MySQL from App subnet only |
| Logging Tier | Allow only specific NAT/SSM traffic |
| Auditability | Easily trace which rule does what |
| Least Privilege | Every subnet gets just enough access |

“Subnet equals a zone. Each zone gets a gatekeeper tailored to its purpose.”


🔢 Rule Numbering – Arjun’s Golden Rules

Unlike Security Groups, NACLs evaluate rules in order, starting from lowest number.

So Arjun used this pattern:

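| Rule Number | Purpose | Example |
| --- | --- | --- |
| 10–90 | Explicit Deny | Block suspicious CIDRs (must be numbered lower than the allow rules, or the allow matches first) |
| 100 | Allow Critical | HTTP/HTTPS, SSH |
| 200 | Allow App Traffic | RDS, Redis, etc. |
| 300 | Ephemeral Port Range | Allow return traffic |
| 400 | Internal Services | NAT, SSM, etc. |
| * | Deny All | Implicit at end |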
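Because the first matching rule wins, where a deny rule sits in the numbering decides whether it ever fires. The following is a minimal sketch of that evaluation order; the tuple layout and the `evaluate` helper are assumptions for illustration and ignore protocol and direction.

```python
import ipaddress


def evaluate(rules, src_ip, port):
    """Return (rule_number, action) of the first rule matching src_ip and port.

    rules: list of (rule_number, action, cidr, allowed_ports) tuples.
    Falls through to the implicit "*" deny when nothing matches.
    """
    address = ipaddress.ip_address(src_ip)
    for number, action, cidr, ports in sorted(rules, key=lambda rule: rule[0]):
        if address in ipaddress.ip_network(cidr) and port in ports:
            return number, action
    return "*", "deny"


HTTPS = range(443, 444)
ALL_PORTS = range(0, 65536)

# Deny numbered AFTER the broad allow: the allow matches first, the deny never fires.
deny_too_late = [
    (100, "allow", "0.0.0.0/0", HTTPS),
    (900, "deny", "5.188.62.113/32", ALL_PORTS),
]
print(evaluate(deny_too_late, "5.188.62.113", 443))  # (100, 'allow')

# Deny numbered BEFORE the allow: the attacker is dropped, everyone else still gets in.
deny_first = [
    (10, "deny", "5.188.62.113/32", ALL_PORTS),
    (100, "allow", "0.0.0.0/0", HTTPS),
]
print(evaluate(deny_first, "5.188.62.113", 443))  # (10, 'deny')
print(evaluate(deny_first, "198.51.100.7", 443))  # (100, 'allow')
print(evaluate(deny_first, "198.51.100.7", 22))   # ('*', 'deny')
```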

🧪 Case Study: Blocking a Malicious IP Using NACL

During an internal audit, Arjun discovered brute-force login attempts from 5.188.62.113.

With Security Groups, he couldn’t explicitly deny this IP.

So, he added a Deny rule in the NACL:

Rule # : 10  
Type   : ALL  
Protocol: ALL  
Source : 5.188.62.113/32  
Action : DENY

What happened?

  • All requests from that IP got dropped.
  • Even if SG allowed it, NACL dropped it first (since NACL is evaluated before SG).

He added logging via VPC Flow Logs to monitor similar threats.
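For anyone scripting this, the same deny rule could be expressed as arguments for boto3's `create_network_acl_entry` call. The sketch below only builds and validates the arguments so that it runs without AWS credentials. The NACL ID is a placeholder, the region is taken from the earlier CLI command, and the helper name is made up for this example.

```python
import ipaddress


def deny_entry_kwargs(nacl_id, rule_number, cidr):
    """Build keyword arguments for an inbound deny-all-protocols NACL entry."""
    ipaddress.ip_network(cidr)  # raises ValueError if the CIDR is malformed
    return {
        "NetworkAclId": nacl_id,
        "RuleNumber": rule_number,  # low number so it is evaluated before the allows
        "Protocol": "-1",           # -1 means all protocols
        "RuleAction": "deny",
        "Egress": False,            # False = inbound rule
        "CidrBlock": cidr,
    }


# "acl-0123456789abcdef0" is a placeholder, not a real NACL ID.
kwargs = deny_entry_kwargs("acl-0123456789abcdef0", 10, "5.188.62.113/32")
print(kwargs)

# With boto3 installed and credentials configured, the rule would be created with:
#   import boto3
#   boto3.client("ec2", region_name="ap-south-1").create_network_acl_entry(**kwargs)
```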


🎯 Best Practices for NACLs – Arjun’s Production Checklist

| Practice | Description |
| --- | --- |
| ✅ Create custom NACLs | One per subnet or environment |
| ✅ Use allow + deny rules | Control access precisely |
| ✅ Follow least privilege | Start with deny-all, then allow what’s needed |
| ✅ Number wisely | Use 100, 200… avoid clutter |
| ✅ Tag everything | So audits and reviews are easy |
| ✅ Don’t mix prod & dev in one NACL | Separate environment = separate controls |
| ✅ Open ephemeral ports | Allow 1024–65535 inbound for responses |
| ✅ Don’t forget NAT/SSM ports | Allow NAT traffic (like 443, 80 outbound) and SSM (443 to region endpoint) |

🧠 Memory Hook (For AWS SAA Exam):

“SGs are forgiving. NACLs are strict.
SGs remember your previous action. NACLs ask every time.”


🏁 Conclusion: Arjun’s Takeaway

At the end of the week, Arjun had this posted on his Slack channel:

“Default NACLs are for learners.
Custom NACLs are for leaders.
If you’re running prod traffic through an open fence, you’re asking for trouble.”


Jay Tillu