Building a Payment Gateway: Technical Deep Dive

How we built a secure, scalable payment processing system handling thousands of transactions daily. Architecture, security, and lessons learned.

Mustafa Hasırcıoğlu
Software Engineer & Founder
January 5, 2025

Payment systems are critical infrastructure. One bug can cost millions. Here's how we built hsrcpay.com to handle transactions securely at scale.

The Challenge

Build a payment gateway that:

  • Processes 10K+ transactions/day
  • Handles multiple payment providers
  • Maintains 99.99% uptime
  • Complies with PCI-DSS standards
  • Provides instant reconciliation

Architecture Overview

High-Level Design

Client → API Gateway → Payment Service → Provider (Stripe/PayPal)
                    ↓
              Database (PostgreSQL)
                    ↓
              Message Queue (RabbitMQ)
                    ↓
        Background Workers (Reconciliation)

Key Technical Decisions

1. Idempotency Keys

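Problem: Network failures can cause duplicate charges. A client that times out and retries may submit the same payment twice.

Solution: Every payment request carries a unique idempotency key. A retry with the same key returns the original result instead of charging again.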

interface PaymentRequest {
  idempotencyKey: string; // UUID
  amount: number;
  currency: string;
  userId: string;
  // ...
}

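async function processPayment(req: PaymentRequest) {
  // Look up any earlier attempt that used the same key
  const existing = await db.transactions.findByIdempotencyKey(
    req.idempotencyKey
  );
  
  if (existing) {
    // Duplicate request: return the stored result, do not charge again
    return existing;
  }
  
  // Process new payment.
  // Caveat: two concurrent requests with the same key can both reach this
  // point. To rule out a double charge, the key must be reserved atomically
  // before charging (for example by inserting a PENDING row under a unique
  // constraint).
  const result = await paymentProvider.charge(req);
  await db.transactions.save({
    ...result,
    idempotencyKey: req.idempotencyKey
  });
  
  return result;
}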
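
One way to close the window between the lookup and the charge is to reserve the key before any asynchronous work starts. The sketch below is a simplification under stated assumptions: a single-process in-memory `Map` stands in for shared storage (in a multi-instance deployment that role would fall to something like a unique constraint in PostgreSQL), and `ChargeRequest`, `ChargeResult`, `ChargeFn`, and `processPaymentOnce` are hypothetical names used only for illustration.

```typescript
// Hypothetical types for illustration; not the gateway's real schema.
interface ChargeRequest {
  idempotencyKey: string;
  amount: number; // assumed to be in minor units (e.g. cents)
  currency: string;
}

interface ChargeResult {
  transactionId: string;
  status: "COMPLETED" | "FAILED";
}

type ChargeFn = (req: ChargeRequest) => Promise<ChargeResult>;

// Maps each idempotency key to its in-flight or finished charge.
const chargesByKey = new Map<string, Promise<ChargeResult>>();

function processPaymentOnce(req: ChargeRequest, charge: ChargeFn): Promise<ChargeResult> {
  const existing = chargesByKey.get(req.idempotencyKey);
  if (existing) {
    return existing; // duplicate: share the outcome of the first request
  }

  // There is no await between the lookup and the set, so the key is
  // reserved before any other request gets a chance to run.
  const pending = charge(req);
  chargesByKey.set(req.idempotencyKey, pending);

  // Assumption: a rejected charge releases the key so the client may retry.
  pending.catch(() => {
    chargesByKey.delete(req.idempotencyKey);
  });

  return pending;
}
```

Two calls with the same key share one promise, so the provider is charged once even when the second call arrives while the first is still in flight.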

2. State Machine for Transaction Status

We use a finite state machine to track transaction states:

PENDING → PROCESSING → COMPLETED
                    ↓
                 FAILED → REFUNDING → REFUNDED

This prevents invalid state transitions and makes the system predictable.
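
A minimal sketch of enforcing those transitions with a lookup table, reading the diagram as PROCESSING branching to either COMPLETED or FAILED. The names `TransactionStatus`, `ALLOWED_TRANSITIONS`, `canTransition`, and `transition` are illustrative assumptions, not the gateway's actual code.

```typescript
type TransactionStatus =
  | "PENDING"
  | "PROCESSING"
  | "COMPLETED"
  | "FAILED"
  | "REFUNDING"
  | "REFUNDED";

// Each state lists the states it may move to, as read from the diagram.
const ALLOWED_TRANSITIONS: Record<TransactionStatus, readonly TransactionStatus[]> = {
  PENDING: ["PROCESSING"],
  PROCESSING: ["COMPLETED", "FAILED"],
  COMPLETED: [],
  FAILED: ["REFUNDING"],
  REFUNDING: ["REFUNDED"],
  REFUNDED: [],
};

function canTransition(from: TransactionStatus, to: TransactionStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}

// Returns the new status, or throws if the move is not in the table.
function transition(current: TransactionStatus, next: TransactionStatus): TransactionStatus {
  if (!canTransition(current, next)) {
    throw new Error(`Invalid transition: ${current} -> ${next}`);
  }
  return next;
}
```

Keeping the table in one place means a new state or edge is a one-line change, and every status update goes through the same check.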

3. Webhook Handling

Payment providers send webhooks for async events:

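// Webhook handler
async function handleWebhook(event: WebhookEvent) {
  // 1. Verify signature
  const isValid = verifySignature(event);
  if (!isValid) throw new Error('Invalid signature');
  
  // 2. Claim the event atomically. SET with NX succeeds only for the first
  //    delivery, so two concurrent deliveries cannot both pass this check
  //    (argument order as in ioredis; adjust for other clients).
  const key = `webhook:${event.id}`;
  const claimed = await redis.set(key, 'true', 'EX', 86400, 'NX');
  if (claimed !== 'OK') return { status: 'already_processed' };
  
  // 3. Process event; release the claim on failure so the provider's
  //    retry is not mistaken for a duplicate
  try {
    await processEvent(event);
  } catch (err) {
    await redis.del(key);
    throw err;
  }
  
  return { status: 'success' };
}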

4. Security Measures

Data at Rest:

  • Encrypted database
  • PCI-compliant tokenization
  • Never store full card numbers

Data in Transit:

  • TLS 1.3 only
  • Certificate pinning for mobile apps

Access Control:

  • JWT with short expiry
  • Rate limiting (100 req/min per user)
  • IP whitelisting for webhooks
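
The per-user rate limit can be sketched as a fixed-window counter. This is an assumption about the approach, since the post does not say which algorithm is used, and it keeps state in process memory for brevity where a shared store such as Redis would be needed across instances. `FixedWindowRateLimiter` and `RateWindow` are hypothetical names.

```typescript
interface RateWindow {
  windowStart: number; // ms timestamp when the current window opened
  count: number;       // requests seen in the current window
}

class FixedWindowRateLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, RateWindow>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed. The clock is passed in so the
  // behaviour is deterministic in tests.
  allow(userId: string, nowMs: number): boolean {
    const current = this.windows.get(userId);
    if (!current || nowMs - current.windowStart >= this.windowMs) {
      // First request, or the previous window has expired: open a new one.
      this.windows.set(userId, { windowStart: nowMs, count: 1 });
      return true;
    }
    if (current.count >= this.limit) {
      return false;
    }
    current.count += 1;
    return true;
  }
}

// 100 requests per minute per user, matching the limit listed above.
const paymentApiLimiter = new FixedWindowRateLimiter(100, 60_000);
```

A fixed window is simple but lets a burst straddle a window boundary; a sliding window or token bucket smooths that out at the cost of more bookkeeping.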

5. Reconciliation System

Daily automated reconciliation:

async function reconcileTransactions(date: Date) {
  // 1. Fetch from our DB
  const ourTransactions = await db.transactions.forDate(date);
  
  // 2. Fetch from provider
  const providerTransactions = await provider.listTransactions(date);
  
  // 3. Compare and flag discrepancies
  const mismatches = findMismatches(ourTransactions, providerTransactions);
  
  if (mismatches.length > 0) {
    await notifyTeam(mismatches);
    await createReconciliationTickets(mismatches);
  }
}
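
`findMismatches` is not shown in the post. One plausible shape for it, assuming both sides expose a shared transaction ID plus an amount and a status; the `TxRecord` and `Mismatch` types are hypothetical.

```typescript
interface TxRecord {
  id: string;
  amountCents: number;
  status: string;
}

type Mismatch =
  | { kind: "missing_at_provider"; id: string }
  | { kind: "missing_in_db"; id: string }
  | { kind: "field_mismatch"; id: string; ours: TxRecord; theirs: TxRecord };

function findMismatches(ours: TxRecord[], theirs: TxRecord[]): Mismatch[] {
  // Index the provider's records by ID for constant-time lookups.
  const providerById = new Map<string, TxRecord>();
  for (const t of theirs) {
    providerById.set(t.id, t);
  }

  const mismatches: Mismatch[] = [];
  const seenIds = new Set<string>();

  for (const o of ours) {
    seenIds.add(o.id);
    const p = providerById.get(o.id);
    if (!p) {
      mismatches.push({ kind: "missing_at_provider", id: o.id });
    } else if (p.amountCents !== o.amountCents || p.status !== o.status) {
      mismatches.push({ kind: "field_mismatch", id: o.id, ours: o, theirs: p });
    }
  }

  // Anything the provider reports that we never recorded.
  for (const t of theirs) {
    if (!seenIds.has(t.id)) {
      mismatches.push({ kind: "missing_in_db", id: t.id });
    }
  }

  return mismatches;
}
```

Tagging each mismatch with a `kind` lets the ticketing step route the three cases differently, since a charge missing from the database is usually more urgent than a status that has not caught up yet.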

Performance Optimizations

1. Database Indexing

Critical indexes for payment queries:

CREATE INDEX idx_transactions_user_created 
ON transactions(user_id, created_at DESC);

CREATE INDEX idx_transactions_status_created 
ON transactions(status, created_at DESC);

2. Caching Strategy

  • User limits: Redis (5 min TTL)
  • Provider status: Redis (30 sec TTL)
  • Transaction results: Redis (24 hour TTL)

3. Queue-Based Processing

Non-critical operations run asynchronously through the message queue:

  • Email receipts
  • Webhook retries
  • Analytics updates
  • Reporting

Monitoring & Alerting

We track:

  • Success rate: Must be >99.5%
  • Response time: P95 <500ms
  • Failed payments: Alert if >10 in 5 min
  • Provider downtime: Auto-failover to backup
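
The first three thresholds can be checked mechanically over a window of recent requests. A sketch, assuming in-memory samples and a nearest-rank P95; `RequestSample`, `percentile`, and `evaluateHealth` are illustrative names, and a production setup would more likely express the same rules in its metrics system.

```typescript
interface RequestSample {
  timestampMs: number;
  latencyMs: number;
  succeeded: boolean;
}

// Nearest-rank percentile: the smallest value with at least p% of the
// samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("percentile of empty sample set");
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Returns the names of the rules that are currently violated.
function evaluateHealth(samples: RequestSample[], nowMs: number): string[] {
  const alerts: string[] = [];
  if (samples.length === 0) return alerts;

  // Success rate must stay above 99.5%.
  const successes = samples.filter((s) => s.succeeded).length;
  if (successes / samples.length <= 0.995) alerts.push("success_rate");

  // P95 response time must stay below 500 ms.
  const p95 = percentile(samples.map((s) => s.latencyMs), 95);
  if (p95 >= 500) alerts.push("p95_latency");

  // More than 10 failed payments in the last 5 minutes.
  const fiveMinutesMs = 5 * 60 * 1000;
  const recentFailures = samples.filter(
    (s) => !s.succeeded && nowMs - s.timestampMs <= fiveMinutesMs
  ).length;
  if (recentFailures > 10) alerts.push("failed_payments");

  return alerts;
}
```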

Lessons Learned

1. Always Have a Backup Provider

When Stripe went down for 2 hours, our fallback to PayPal saved the day.
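
A sketch of that kind of fallback, trying providers in priority order. The `PaymentProvider` interface, `FailoverRequest`, `FailoverResult`, and `chargeWithFailover` are hypothetical. The sketch also assumes a thrown error means the charge definitely did not go through; an ambiguous timeout would need a status check before moving on, or the customer could end up charged by both providers.

```typescript
interface FailoverRequest {
  idempotencyKey: string;
  amount: number;
  currency: string;
}

interface FailoverResult {
  provider: string;
  transactionId: string;
}

interface PaymentProvider {
  name: string;
  charge(req: FailoverRequest): Promise<FailoverResult>;
}

// Tries each provider in order and returns the first successful result.
async function chargeWithFailover(
  providers: PaymentProvider[],
  req: FailoverRequest
): Promise<FailoverResult> {
  const failures: string[] = [];
  for (const provider of providers) {
    try {
      return await provider.charge(req);
    } catch (err) {
      // Record the failure and fall through to the next provider.
      failures.push(`${provider.name}: ${String(err)}`);
    }
  }
  throw new Error(`All providers failed: ${failures.join("; ")}`);
}
```

Collecting every failure message before throwing keeps the final error useful for debugging when both the primary and the backup are down.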

2. Test Failure Scenarios

We simulate:

  • Provider timeouts
  • Database connection loss
  • Network partitions
  • Race conditions

3. Observability Is Key

Without proper logging, debugging payment issues is impossible. We log:

  • Every request/response (with card numbers masked, since full card numbers are never stored)
  • All state transitions
  • Provider API calls
  • Error stack traces

4. Never Trust External APIs

Wrap provider SDKs with circuit breakers and retries:

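// Option names follow the opossum library's CircuitBreaker (an assumption;
// the library is not named in this post)
const circuitBreaker = new CircuitBreaker(
  (req: PaymentRequest) => provider.charge(req), // arrow keeps `this` bound to provider
  {
    timeout: 5000,                // fail a call that takes longer than 5 s
    errorThresholdPercentage: 50, // open the circuit once 50% of calls fail
    resetTimeout: 30000           // after 30 s, let a trial call through
  }
);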
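
The breaker covers the circuit-breaking half; the retry half could look like the sketch below, which uses exponential backoff. It assumes the wrapped call is safe to repeat, which for a charge means sending the same idempotency key on every attempt. `withRetry`, `RetryOptions`, and the injectable `sleep` are hypothetical helpers, not part of any provider SDK.

```typescript
interface RetryOptions {
  maxAttempts: number;
  baseDelayMs: number;
  // Injectable so tests can skip real waiting; defaults to setTimeout.
  sleep?: (ms: number) => Promise<void>;
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function withRetry<T>(
  operation: () => Promise<T>,
  options: RetryOptions
): Promise<T> {
  const sleep = options.sleep ?? defaultSleep;
  let lastError: unknown = new Error("maxAttempts must be at least 1");

  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      lastError = err;
      if (attempt < options.maxAttempts) {
        // Delay doubles on each retry: base, 2x base, 4x base, ...
        await sleep(options.baseDelayMs * Math.pow(2, attempt - 1));
      }
    }
  }

  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

A call site might wrap the breaker rather than the raw SDK call, so that retries stop as soon as the circuit opens instead of piling more load onto a failing provider.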

Results

After 1 year in production:

  • 99.98% uptime
  • <300ms P95 latency
  • Zero data breaches
  • 10K+ daily transactions
  • $2M+ processed monthly

Conclusion

Building payment systems is hard but rewarding. Key takeaways:

  1. Idempotency is non-negotiable
  2. Security must be paranoid
  3. Monitor everything
  4. Plan for provider failures
  5. Automate reconciliation

Questions about payment systems? Hit me up!


Next post: How we reduced payment processing costs by 40% with smart routing.
